[SERVER-21460] WiredTiger primaries hold on to deleted collection data and index files Created: 13/Nov/15  Updated: 09/Dec/15  Resolved: 24/Nov/15

Status: Closed
Project: Core Server
Component/s: WiredTiger
Affects Version/s: 3.0.6
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Dai Shi Assignee: Kelsey Schubert
Resolution: Cannot Reproduce Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

We are running v3.0.6 mongod binaries. In one of our clusters, we dropped a large collection last week. I noticed today that all the primaries in the cluster had significantly higher disk utilization than the secondaries, so I tried stepping down one primary. Upon stepping down, its disk utilization immediately dropped to the level of the secondaries. The output of du also shows that after the stepdown, a collection file and several index files were deleted. Note that the mongod process was not restarted at all.
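
Roughly, the sequence from the mongo shell looked like this (a sketch; the database and collection names below are placeholders, not the real ones from our cluster):

// Run in the mongo shell against the primary; "mydb" and "big_collection" are placeholder names.
var coll = db.getSiblingDB("mydb").getCollection("big_collection");
coll.drop();       // the drop we issued last week

// Today, after noticing the elevated disk utilization on this primary:
rs.stepDown(60);   // step down for 60 seconds; the files for the dropped collection were released right after this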

Before:

du -sm * | sort -n
1	collection-0-8259577013043377846.wt
1	collection-11-8259577013043377846.wt
1	collection-13-8259577013043377846.wt
1	collection-17-8259577013043377846.wt
1	collection-19-8259577013043377846.wt
1	collection-2-8259577013043377846.wt
1	collection-4-8259577013043377846.wt
1	collection-7-8259577013043377846.wt
1	index-12-8259577013043377846.wt
1	index-14-8259577013043377846.wt
1	index-1-8259577013043377846.wt
1	index-18-8259577013043377846.wt
1	index-20-8259577013043377846.wt
1	index-29-8259577013043377846.wt
1	index-3-8259577013043377846.wt
1	index-5-8259577013043377846.wt
1	index-8-8259577013043377846.wt
1	_mdb_catalog.wt
1	mongod.lock
1	sizeStorer.wt
1	storage.bson
1	_tmp
1	WiredTiger
1	WiredTiger.basecfg
1	WiredTiger.lock
1	WiredTiger.turtle
1	WiredTiger.wt
6	index-35-8259577013043377846.wt
7	index-37-8259577013043377846.wt
8	index-36-8259577013043377846.wt
10	index-26-8259577013043377846.wt
38	collection-25-8259577013043377846.wt
67	index-24-8259577013043377846.wt
76	collection-23-8259577013043377846.wt
77	index-16-8259577013043377846.wt
78	index-31-8259577013043377846.wt
94	index-33-8259577013043377846.wt
166	index-32-8259577013043377846.wt
187	index-30-8259577013043377846.wt
201	journal
205	index-34-8259577013043377846.wt
246	moveChunk
256	index-22-8259577013043377846.wt
616	collection-21-8259577013043377846.wt
2814	collection-6-8259577013043377846.wt
3732	index-10-8259577013043377846.wt
4137	index-28-8259577013043377846.wt
5641	index-27-8259577013043377846.wt
10368	collection-9-8259577013043377846.wt
12011	collection-15-8259577013043377846.wt

After:

sudo du -sm * | sort -n
1	collection-0-8259577013043377846.wt
1	collection-11-8259577013043377846.wt
1	collection-13-8259577013043377846.wt
1	collection-17-8259577013043377846.wt
1	collection-19-8259577013043377846.wt
1	collection-2-8259577013043377846.wt
1	collection-4-8259577013043377846.wt
1	collection-7-8259577013043377846.wt
1	index-12-8259577013043377846.wt
1	index-14-8259577013043377846.wt
1	index-1-8259577013043377846.wt
1	index-18-8259577013043377846.wt
1	index-20-8259577013043377846.wt
1	index-29-8259577013043377846.wt
1	index-3-8259577013043377846.wt
1	index-5-8259577013043377846.wt
1	index-8-8259577013043377846.wt
1	_mdb_catalog.wt
1	mongod.lock
1	sizeStorer.wt
1	storage.bson
1	_tmp
1	WiredTiger
1	WiredTiger.basecfg
1	WiredTiger.lock
1	WiredTiger.turtle
1	WiredTiger.wt
6	index-35-8259577013043377846.wt
7	index-37-8259577013043377846.wt
8	index-36-8259577013043377846.wt
10	index-26-8259577013043377846.wt
38	collection-25-8259577013043377846.wt
67	index-24-8259577013043377846.wt
76	collection-23-8259577013043377846.wt
77	index-16-8259577013043377846.wt
78	index-31-8259577013043377846.wt
94	index-33-8259577013043377846.wt
166	index-32-8259577013043377846.wt
187	index-30-8259577013043377846.wt
201	journal
205	index-34-8259577013043377846.wt
246	moveChunk
256	index-22-8259577013043377846.wt
616	collection-21-8259577013043377846.wt
2814	collection-6-8259577013043377846.wt
12011	collection-15-8259577013043377846.wt

Is this expected behavior? Do we need to step down all primaries after dropping collections or indexes to reclaim disk space?
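
(For what it's worth, here is a rough way to cross-check what mongod itself reports against what du shows; a sketch with placeholder names, sizes scaled to MB:)

// In the mongo shell; "mydb" and "big_collection" are placeholder names, 1024 * 1024 scales output to MB.
db.getSiblingDB("mydb").getCollection("big_collection").stats(1024 * 1024).storageSize;
db.getSiblingDB("mydb").stats(1024 * 1024);   // per-database totals: dataSize, storageSize, indexSize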



 Comments   
Comment by Ramon Fernandez Marina [ 24/Nov/15 ]

Understood – thanks dai@foursquare.com. We'll close this ticket for now. If the issue reappears, please post here so we can reopen it, or feel free to open a new ticket.

Regards,
Ramón.

Comment by Dai Shi [ 23/Nov/15 ]

Apologies for the delay; I had to get our staging cluster onto the same version with the right data. Unfortunately, I was unable to reproduce the behavior in that environment: the collection and index files were deleted as soon as db.collection.drop() was run, and the disk space became free.

I'm not sure why the primaries did not drop the files in production, but since neither of us can reproduce it, we can close this for now. I will try to remember to pay attention to this the next time we drop anything in production.

Comment by Kelsey Schubert [ 23/Nov/15 ]

Hi dai@foursquare.com,

Thank you for the additional details about this behavior. It would be great if you could reproduce this on your staging cluster, preferably with log level 1. Please let us know whether you are able to reproduce it successfully so we can continue to investigate.
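
For example, a minimal sketch of raising the verbosity from the shell before running the repro (assuming the default log component settings):

// Raise the global log verbosity to 1 on the node under test.
db.setLogLevel(1);
// Equivalent form via setParameter:
db.adminCommand({ setParameter: 1, logLevel: 1 });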

Comment by Dai Shi [ 18/Nov/15 ]

Hi Thomas,

Unfortunately the logs have already been rotated out so I no longer have them. I can try reproducing it on our staging cluster.

For what it's worth, I saw this behavior on every single primary in this cluster (8 of them). Immediately after stepping them down, the collection and index files associated with the dropped collection were deleted, and disk space became free.

Comment by Kelsey Schubert [ 18/Nov/15 ]

Hi dai@foursquare.com,

I haven't been able to reproduce this behavior. This issue may be a duplicate of SERVER-17397. Can you provide the logs from the 5 minutes around the collection drop and the 5 minutes around the stepdown, to help rule out other possibilities?

Thank you,
Thomas
