[SERVER-42718] dropDatabase commands can be run concurrently, leading to an invalid state Created: 08/Aug/19 Updated: 29/Oct/23 Resolved: 21/Aug/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | 4.2.0-rc8 |
| Fix Version/s: | 4.2.1, 4.3.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Daniel Gottlieb (Inactive) | Assignee: | Daniel Gottlieb (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.2 |
| Sprint: | Execution Team 2019-08-12, Execution Team 2019-08-26 |
| Participants: | |
| Description |
|
dropDatabase runs in three steps. First, it drops all collections. Then it waits for those drops to replicate, releasing its locks while it waits. Lastly, it reacquires locks and finalizes the dropDatabase, which, among other things, writes the dropDatabase oplog entry. To manage concurrency while the locks are released, the command sets a drop-pending flag. This flag prevents collections from being created until the dropDatabase completes. However, the flag does not prevent a second, concurrent dropDatabase from being accepted. The second dropDatabase has no collections to drop, so it fast-paths into writing a dropDatabase oplog entry and deleting the database object. One manifestation of this bug: when a third client creates a collection, an illegal update chain (an untimestamped write) can be created.
Results in the oplog entries:
|
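The interleaving described above can be sketched with a toy, single-threaded model. This is only an illustration under stated assumptions, not the server's catalog code or the committed patch: `ToyCatalog`, `begin_drop`, `finish_drop`, `interleave`, `reject_concurrent_drop` and `DatabaseDropPending` are hypothetical names, and the model assumes that the fast-pathed second dropDatabase clears the drop-pending flag when it deletes the database object. The optional guard mirrors the "uassert if a dropDatabase is in effect" idea from the comments.

```python
class DatabaseDropPending(Exception):
    """Raised when an operation conflicts with an in-progress dropDatabase."""


class ToyCatalog:
    """Hypothetical single-database model; not MongoDB's actual catalog code."""

    def __init__(self, collections, reject_concurrent_drop):
        self.collections = set(collections)
        self.drop_pending = False
        self.oplog = []
        # When True, a second dropDatabase is rejected while one is pending.
        self.reject_concurrent_drop = reject_concurrent_drop

    def create_collection(self, name):
        # The drop-pending flag already blocks collection creation.
        if self.drop_pending:
            raise DatabaseDropPending("createCollection rejected")
        self.collections.add(name)
        self.oplog.append(("create", name))

    def begin_drop(self):
        """First step, under the lock. Returns True if any collection was dropped."""
        if self.reject_concurrent_drop and self.drop_pending:
            raise DatabaseDropPending("dropDatabase already in progress")
        had_collections = bool(self.collections)
        self.drop_pending = True
        for name in sorted(self.collections):
            self.oplog.append(("drop", name))
        self.collections.clear()
        return had_collections

    def finish_drop(self):
        """Last step: write the dropDatabase oplog entry and reset the flag.

        Assumption: deleting the database object also discards the flag.
        """
        self.oplog.append(("dropDatabase",))
        self.drop_pending = False


def interleave(reject_concurrent_drop):
    """Replay the three-client interleaving deterministically."""
    db = ToyCatalog(["a"], reject_concurrent_drop)
    db.begin_drop()  # client 1 drops collections, then releases locks
    try:
        # Client 2 finds nothing to drop and fast-paths to the final step.
        if not db.begin_drop():
            db.finish_drop()
    except DatabaseDropPending:
        pass
    created = True
    try:
        db.create_collection("b")  # client 3
    except DatabaseDropPending:
        created = False
    db.finish_drop()  # client 1 reacquires locks and finalizes
    return created, db.oplog
```

Without the guard, the toy oplog ends up with two dropDatabase entries and a collection created between them. With the guard, the second dropDatabase and the createCollection are both rejected until the first drop finishes.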
| Comments |
| Comment by Githook User [ 29/Aug/19 ] |
|
Author: {'name': 'Daniel Gottlieb', 'username': 'dgottlieb', 'email': 'daniel.gottlieb@mongodb.com'} Message: (cherry picked from commit 561169684e8160d2f738ba94b404f14c9115dcd1) |
| Comment by Githook User [ 21/Aug/19 ] |
|
Author: {'username': 'dgottlieb', 'email': 'daniel.gottlieb@mongodb.com', 'name': 'Daniel Gottlieb'} Message: |
| Comment by Eric Milkie [ 14/Aug/19 ] |
|
Joining concurrent operations for dropDatabase isn't a bad idea but I think the work for it should go in another ticket for Improvement; the bug fix here should be simple since we are backporting it. |
| Comment by Esha Maharishi (Inactive) [ 12/Aug/19 ] |
|
Hmm. It might still be worth considering having the requests join rather than later ones uassert, since the config server retries commands against shards on network errors. |
| Comment by Daniel Gottlieb (Inactive) [ 12/Aug/19 ] |
|
Prior to two-phase drop (when lock releasing was added), concurrent dropDatabase calls would be processed in series. In this case, I intended to just uassert if a dropDatabase was in effect. That's the current behavior for concurrent dropDatabase and createCollection calls. |
| Comment by Esha Maharishi (Inactive) [ 12/Aug/19 ] |
|
Nice find... We've had similar problems in sharding and generally solved them by making identical requests join each other. |