[SERVER-31732] Recipient shards of migrations hold DBLock in X mode while creating indexes Created: 26/Oct/17 Updated: 27/Oct/23 Resolved: 06/Nov/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.6.0-rc1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Esha Maharishi (Inactive) | Assignee: | [DO NOT USE] Backlog - Sharding Team |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Assigned Teams: |
Sharding
|
||||||||
| Operating System: | ALL | ||||||||
| Participants: | |||||||||
| Description |
|
Recipient shards take the DBLock twice, during the first two steps in the MigrationDestinationManager: Step 0 ("copy system.namespaces entry if collection doesn't already exist"): takes the DBLock in mode IX if it has the collection, else in mode X so the collection can be created Step 1 ("copy indexes"): takes the DBLock in mode X It may be unnecessarily heavy-handed to take the DBLock in mode X in Step 1 (although, createIndexes command itself does take the DBLock in mode X in case it needs to create the collection, but only relocks it with mode IX if the index was asked to be built in the background. createIndexes does relock the DBLock in mode X regardless to complete the index build). Note that CC kaloian.manassiev, kevin.pulo |
| Comments |
| Comment by Kaloian Manassiev [ 06/Nov/17 ] |
|
Creating indexes requires holding an X-lock on the database, because the index catalog is per-database. This is something, which needs to be fixed at the storage engine level. |
| Comment by Esha Maharishi (Inactive) [ 31/Oct/17 ] |
|
renctan yes, I realized that when started testing - it looks like it's more of an "auto-healing" behavior, where missing indexes are created on chunk migration. Thanks for mentioning it |
| Comment by Randolph Tan [ 31/Oct/17 ] |
|
esha.maharishi I think the removeExistingIndexes doesn't actually remove the indexes in the catalog, but filters out the indexes in the argument that has been already created. |
| Comment by Eric Milkie [ 26/Oct/17 ] |
|
Removing all existing indexes sounds like a bandaid from back in the day that we could improve. |
| Comment by Esha Maharishi (Inactive) [ 26/Oct/17 ] |
|
You're right, the index should already be there I'm hoping to change this so that the recipient shard only builds the indexes if it's accepting its first chunk for the collection. If it's any later chunk, it will fail the migration if its indexes differ from what's on the donor. |
| Comment by Eric Milkie [ 26/Oct/17 ] |
|
My first thought is: why is there any data in the collection at all when these indexes are built? Shouldn't the collection be empty, and the index build time very short? |
| Comment by Eric Milkie [ 26/Oct/17 ] |
|
Yes, migration manager is using the MultiIndexBlock builder, so it needs to follow the same locking rules that create_indexes.cpp uses. You could switch it to use the background index locking scheme, but that would have repercussions (it would make chunk migration even slower, and would result in larger, less efficient initial B-trees for those indexes). |
| Comment by Esha Maharishi (Inactive) [ 26/Oct/17 ] |
|
milkie, got it. So we should always hold a DB X lock while creating indexes for the reasons you mentioned, including in the migration code? |
| Comment by Eric Milkie [ 26/Oct/17 ] |
|
Minor technicality: The createIndexes command takes a DB X lock because it needs to update the catalog metadata for the database to add the new index entry, regardless of whether the collection exists already. It also reacquires the DB X lock at the end of a background index build, as it needs to update the catalog metadata again to mark the index as "ready". |