[SERVER-31902] Reduce contention on ShardingState::_mutex for insert workloads Created: 10/Nov/17 Updated: 11/Dec/17 Resolved: 17/Nov/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 3.6.0-rc3 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kaloian Manassiev | Assignee: | Kaloian Manassiev |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Operating System: | ALL |
| Participants: | |
| Description |
|
Currently, each write that is part of a batch performs two separate shard version checks: once at the beginning of the batch, and a second time when the write has been written to the oplog. Each check acquires a hot mutex in order to retrieve the sharding state for the collection, so the first check should ideally be removed. |
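To make the contention concrete, here is a minimal C++ sketch of the pattern the description refers to: a registry guarded by a single mutex that every shard version check must go through, so a batch that checks twice per document also acquires the mutex twice per document. The names here (`ShardingStateRegistry`, `CollectionEntry`) are illustrative stand-ins, not the server's actual types.

```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Stand-in for the per-collection sharding metadata.
struct CollectionEntry {
    long long shardVersion = 0;
};

class ShardingStateRegistry {
public:
    // Called once when the batch starts and again from the op observer when
    // the write reaches the oplog -- two acquisitions of _mutex per document.
    std::shared_ptr<CollectionEntry> getCollectionState(const std::string& ns) {
        std::lock_guard<std::mutex> lk(_mutex);  // the hot mutex
        auto& entry = _collections[ns];
        if (!entry)
            entry = std::make_shared<CollectionEntry>();
        return entry;
    }

private:
    std::mutex _mutex;
    std::map<std::string, std::shared_ptr<CollectionEntry>> _collections;
};
```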
| Comments |
| Comment by Kaloian Manassiev [ 17/Nov/17 ] |
|
Unfortunately, given the current state of the code, it is really difficult to reduce the number of collection metadata retrievals on the write path below two. The more appropriate solution is to make CollectionShardingState a decoration on the Collection object, which has already been retrieved under an IX lock; this would eliminate the mutex acquisition altogether. Closing as duplicate of |
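A minimal sketch of the decoration idea mentioned above, using simplified stand-in types rather than the server's real `Collection` and `CollectionShardingState`: because the sharding state hangs directly off the Collection object that the write path has already locked, checking the shard version no longer needs to touch a shared mutex.

```cpp
#include <atomic>
#include <memory>

// Stand-in for the per-collection sharding metadata.
struct CollectionShardingState {
    std::atomic<long long> shardVersion{0};
};

class Collection {
public:
    // The sharding state is owned by (decorates) the Collection itself, which
    // the write path already holds under an IX lock, so reading it requires
    // no lookup through a mutex-protected registry.
    CollectionShardingState* getShardingState() {
        return _shardingState.get();
    }

private:
    std::unique_ptr<CollectionShardingState> _shardingState =
        std::make_unique<CollectionShardingState>();
};

// On the write path the check becomes a plain member access, e.g.:
//   if (coll->getShardingState()->shardVersion.load() != expectedVersion)
//       /* raise a hypothetical stale-shard-version error */;
```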
| Comment by Eric Milkie [ 13/Nov/17 ] |
|
I changed the title to reflect the problem to solve rather than suggesting a solution. |
| Comment by Eric Milkie [ 13/Nov/17 ] |
|
I am adding the dependency back in because regaining the performance depends on the work here, not because this ticket identifies a cause of the regression from a previous version of MongoDB. |
| Comment by Kaloian Manassiev [ 13/Nov/17 ] |
This will not work, precisely because of the yielding problem: if the query part of an update yields and the shard version changes, and we don't check inside onInsertOp, we will lose that write. In either case, these two shard version checks have always been around, so they should not have contributed to the performance regression in
I am pretty sure the query system doesn't take locks itself (outside of re-acquisition after yield); it relies on being called with locks already held. Also, AutoGetCollectionForReadCommand is only intended to be used for read commands.
I think we did a code inspection at some point and deemed the check in the OpObserver sufficient, but on second review it is not, because of the scenario described above. |
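A runnable toy timeline of the yield scenario described above, with hypothetical names and version numbers: the batch-start check passes, the query phase yields, a migration bumps the shard version, and only the second (op observer) check catches the staleness.

```cpp
#include <atomic>
#include <iostream>
#include <stdexcept>

// Globally visible shard version, bumped by a (simulated) migration commit.
std::atomic<long long> currentShardVersion{5};

void checkShardVersion(long long expected) {
    if (currentShardVersion.load() != expected)
        throw std::runtime_error("stale shard version");
}

int main() {
    const long long versionAtBatchStart = 5;
    checkShardVersion(versionAtBatchStart);  // batch-start check: passes

    // The query phase of the update yields its locks here; while yielded, a
    // chunk migration commits and bumps the collection's shard version.
    currentShardVersion.fetch_add(1);

    try {
        // Without this second check (the one in onInsertOp/onUpdateOp), the
        // write would be applied even though the shard version has moved on.
        checkShardVersion(versionAtBatchStart);
    } catch (const std::exception& e) {
        std::cout << "op observer check rejects the write: " << e.what() << "\n";
    }
    return 0;
}
```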
| Comment by Eric Milkie [ 13/Nov/17 ] |
|
AutoGetCollectionForReadCommand checks the shard version, but I'm not sure whether that is what the query system uses for the query phase of an update. |
| Comment by Eric Milkie [ 13/Nov/17 ] |
|
It sounds like we can solve this by removing the shard version check in CollectionShardingState::onInsertOp(), since the version should already have been checked when the database lock was acquired. |
| Comment by Kaloian Manassiev [ 13/Nov/17 ] |
|
Unfortunately, in order to cover the case where a write operation does not produce an oplog entry, it is not possible to remove the check performed at the beginning of the write batch; otherwise, writes that result in a no-op would never discover that the shard version has changed.
Along the same lines, I noticed that restoring from a yield does not verify that the shard version has been preserved. If a chunk moves off a shard while the query for a write operation is yielded, an update may no longer match its predicate, turn into a no-op, and report success, even though the document still exists.
milkie, david.storch: do we need to add a shard version check after returning from a yield as well? |
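A minimal sketch, under the same illustrative assumptions as the earlier snippets, of the no-op case that keeps the batch-start check alive: only a write that actually modifies a document produces an oplog entry and reaches the op observer's check, so a no-op update would otherwise skip shard version validation entirely.

```cpp
#include <iostream>

// Returns true iff a document was modified (and hence an oplog entry written).
bool applyUpdate(bool predicateMatched) {
    if (!predicateMatched) {
        // No document modified -> no oplog entry -> onUpdateOp is never
        // called, so the only remaining shard version check is the one at
        // the start of the batch.
        return false;
    }
    // A document was modified: an oplog entry is written and the op observer
    // performs its shard version check.
    return true;
}

int main() {
    // A chunk migrated away while the query was yielded: the predicate no
    // longer matches on this shard, the update becomes a no-op, and without
    // the batch-start check the client would simply be told it succeeded.
    bool wrote = applyUpdate(/* predicateMatched = */ false);
    std::cout << (wrote ? "update applied"
                        : "no-op: success reported, op observer check skipped")
              << "\n";
    return 0;
}
```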