[SERVER-29745] Range deletion after moving away a chunk must wait for metadata update to finish before proceeding
Created: 20/Jun/17  Updated: 30/Oct/23  Resolved: 13/Jul/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 3.5.11

Type: Task
Priority: Major - P3
Reporter: Dianna Hohensee (Inactive)
Assignee: Dianna Hohensee (Inactive)
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-30083 Schedule orphan range deletions soone... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2017-07-10, Sharding 2017-07-31
Participants:
Linked BF Score: 0

 Description   

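After a chunk is moved away, range deletion and the metadata update are both performed asynchronously, with no ordering enforced between them. If the data deletion were to propagate to a secondary before the metadata update did, the secondary's state would be incorrect, so range deletion must wait for the metadata update to finish.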
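
The ordering requirement can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `TaskQueue` and `finishMigration` are hypothetical names, and the single-threaded queue only stands in for the server's real asynchronous machinery. The point is that the deletion is posted from the completion of the metadata update, so it cannot be queued, let alone run, before the update has finished.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical single-threaded executor; NOT a MongoDB class.
class TaskQueue {
public:
    void post(std::function<void()> task) { _tasks.push_back(std::move(task)); }

    // Runs queued tasks in FIFO order; a running task may post further tasks.
    void runAll() {
        while (!_tasks.empty()) {
            auto task = std::move(_tasks.front());
            _tasks.pop_front();
            task();
        }
    }

private:
    std::deque<std::function<void()>> _tasks;
};

// Hypothetical helper: returns the order in which the two steps completed.
std::vector<std::string> finishMigration() {
    TaskQueue queue;
    std::vector<std::string> events;

    // Step 1: the metadata update. Step 2 (range deletion) is posted only
    // from inside step 1, after the update has been recorded.
    queue.post([&] {
        events.push_back("metadata updated");
        queue.post([&] {
            assert(events.size() == 1);  // the update is already recorded
            events.push_back("range deleted");
        });
    });

    queue.runAll();
    return events;
}
```

Posting both tasks independently would leave their relative order to the executor, which is the situation the description warns about; chaining makes the order explicit.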



 Comments   
Comment by Dianna Hohensee (Inactive) [ 13/Jul/17 ]

Finally determined that the hang was caused by holding a ScopedCollectionMetadata object while scheduling and waiting on range deletion. The solution was to stop holding it: it wasn't necessary, so it's probably better not to hold on to it anyway. However, holding that scoped object of the latest metadata should not have held up range deletion of an unused range from an old metadata version; it's a bug that cleanup was never scheduled. I suspect the error is related either to this not evaluating to true for some reason when cleanup is first requested, or to the ScopedCollectionMetadata destructor code that should schedule cleanup when old metadata is released.
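
The general pattern behind the hang can be sketched as follows. This is a simplified model under stated assumptions, not the server's code: `MetadataTracker`, `Scoped`, `cleanUpRange`, and `deletionDoneBeforeWait` are hypothetical names, and the model only captures the idea that cleanup is deferred until the last scoped reference is released. It does not reproduce the actual bug, in which cleanup of an unused range from an old metadata version was never scheduled.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical model; NOT the server's ScopedCollectionMetadata machinery.
class MetadataTracker {
private:
    int _users = 0;                  // number of live scoped handles
    std::function<void()> _pending;  // cleanup deferred until _users hits 0

public:
    // RAII handle; releasing the last one runs any deferred cleanup.
    class Scoped {
    public:
        explicit Scoped(MetadataTracker* tracker) : _tracker(tracker) { ++_tracker->_users; }
        Scoped(const Scoped&) = delete;
        Scoped& operator=(const Scoped&) = delete;
        ~Scoped() {
            if (--_tracker->_users == 0 && _tracker->_pending) {
                auto cleanup = std::move(_tracker->_pending);
                _tracker->_pending = nullptr;
                cleanup();
            }
        }

    private:
        MetadataTracker* _tracker;
    };

    // Runs cleanup immediately if nobody holds the metadata, else defers it.
    void cleanUpRange(std::function<void()> cleanup) {
        if (_users == 0) {
            cleanup();
        } else {
            _pending = std::move(cleanup);
        }
    }
};

// Returns whether deletion had completed at the point where the caller
// would start a blocking wait on it.
bool deletionDoneBeforeWait(bool releaseHandleFirst) {
    MetadataTracker tracker;
    bool deleted = false;
    auto handle = std::make_unique<MetadataTracker::Scoped>(&tracker);
    if (releaseHandleFirst) {
        handle.reset();  // the fix: stop holding the scoped object
    }
    tracker.cleanUpRange([&] { deleted = true; });
    bool readyAtWait = deleted;  // a blocking wait here would hang if false
    handle.reset();              // without the fix, this release comes too late
    return readyAtWait;
}
```

In this model, `deletionDoneBeforeWait(false)` reports that the deletion is still pending when the wait would begin, because the waiter itself holds the handle that defers it; `deletionDoneBeforeWait(true)` reports that the deletion has already run.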

Comment by Githook User [ 13/Jul/17 ]

Author: Dianna Hohensee (DiannaHohensee) <dianna.hohensee@10gen.com>

Message: SERVER-29745 after a successful migration, ensure the metadata update is persisted before range deletion is schedule
Branch: master
https://github.com/mongodb/mongo/commit/4f070aef1c4fe27948db5db93729fdb757f487e5

Comment by Githook User [ 11/Jul/17 ]

Author: Dianna Hohensee (DiannaHohensee) <dianna.hohensee@10gen.com>

Message: Revert "SERVER-29745 after a successful migration, ensure the metadata update is persisted before range deletion is schedule"

This reverts commit 3b1554c77ce9c80b30044654ff2cab3aff7070d4.
Branch: master
https://github.com/mongodb/mongo/commit/d25bce8b0954abb003a97c1140c856532dcfb7db

Comment by Githook User [ 11/Jul/17 ]

Author: Dianna Hohensee (DiannaHohensee) <dianna.hohensee@10gen.com>

Message: SERVER-29745 after a successful migration, ensure the metadata update is persisted before range deletion is schedule
Branch: master
https://github.com/mongodb/mongo/commit/3b1554c77ce9c80b30044654ff2cab3aff7070d4

Comment by Dianna Hohensee (Inactive) [ 08/Jul/17 ]

Reverted the commit. It appears to be causing a hang, e.g. https://evergreen.mongodb.com/task/mongodb_mongo_master_enterprise_rhel_62_64_bit_slow1_344bf6e257e1427bc594bacac3f5983c2bdeaacf_17_07_07_12_44_23

The hang is in the CollectionRangeDeleter code. There's no corresponding "Finished deleting mr_during_migrate.coll range ...." message after the donor finishes the migration and starts waiting, and one of the thread dumps has CollectionRangeDeleter::DeleteNotification::waitStatus in it. I have not diagnosed the range deletion problem, merely identified that it is the problem and that the commit needed to be reverted.

The CollectionRangeDeleter functions called in moveChunk were changed in this commit. It seems to have unwittingly surfaced a bug.

Comment by Githook User [ 08/Jul/17 ]

Author: Dianna Hohensee (DiannaHohensee) <dianna.hohensee@10gen.com>

Message: Revert "SERVER-29745 after a successful migration, ensure the metadata update is persisted before range deletion is schedule"

This reverts commit 344bf6e257e1427bc594bacac3f5983c2bdeaacf.
Branch: master
https://github.com/mongodb/mongo/commit/096951f7bc0b7f9cc8d3b6e3334fc74c101fb9c1

Comment by Githook User [ 07/Jul/17 ]

Author: Dianna Hohensee (DiannaHohensee) <dianna.hohensee@10gen.com>

Message: SERVER-29745 after a successful migration, ensure the metadata update is persisted before range deletion is schedule
Branch: master
https://github.com/mongodb/mongo/commit/344bf6e257e1427bc594bacac3f5983c2bdeaacf
