[SERVER-22002] Do not retry findAndModify operations on MMAPv1 Created: 23/Dec/15 Updated: 18/Nov/16 Resolved: 21/Jan/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 3.2.0-rc3 |
| Fix Version/s: | 3.2.3, 3.3.1 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Charlie Swanson | Assignee: | Charlie Swanson |
| Resolution: | Done | Votes: | 0 |
| Labels: | code-and-test | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||
| Backport Completed: | |||||||||||||||||||||
| Sprint: | QuInt E (01/11/16), Query F (02/01/16) | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
If a findAndModify command requests a sort, we pass a limit of 1 to that sort so that it will only keep track of the top document, reducing it's memory footprint. This can lead to lots of retrying when there are many findAndModify operations which all specify a sort and are trying to update the same document, which is a common pattern for some sort of work queue. What happens is that each findAndModify operation will chug along doing it's sort (I think this is only a problem if it's an in-memory sort, not an index-provided sort), each one periodically yielding, and one will eventually finish first and update the document. Each other operation will eventually find that the matching document has already been modified, so cannot update it themselves. Since there was a limit of 1, the sort stage does not have the next matching document, and the operations have to restart, and re-sort everything. This is particularly bad on MMAPv1, since the updates have to take an exclusive lock on the collection. The logic to retry operations on MMAPv1 was only introduced in As part of this work, it was discovered that there are cases where we do not retry the operation, even on WiredTiger. We should fix those places as well. |
| Comments |
| Comment by Githook User [ 05/Aug/16 ] |
|
Author: {u'username': u'cswanson310', u'name': u'Charlie Swanson', u'email': u'charlie.swanson@mongodb.com'}Message: Revert " This reverts commit 6f3142b97b00464efebe26ef35cba73d47cf98a4. |
| Comment by Githook User [ 05/Aug/16 ] |
|
Author: {u'username': u'cswanson310', u'name': u'Charlie Swanson', u'email': u'charlie.swanson@mongodb.com'}Message: |
| Comment by Nick Judson [ 26/Mar/16 ] |
|
...and just because no one has mentioned it before, doesn't mean it isn't a bug. Just means it's an edge case. |
| Comment by Nick Judson [ 26/Mar/16 ] |
|
I don't personally have any problem with this as I don't use MMAP, and I can understand the reasons you've given. My point was simply that I would not expect functional changes when switching between storage engines. Keep up the good work! |
| Comment by Charlie Swanson [ 25/Mar/16 ] |
|
nick@innsenroute.com, yes. The retrying behavior differs based on the storage engine. We made this decision for a number of reasons:
|
| Comment by Nick Judson [ 08/Feb/16 ] |
|
So does this mean different behavior for different storage engines? I would be extremely careful making an assumption on "letting a client retry". This would break any typical queue implementation. |
| Comment by Githook User [ 21/Jan/16 ] |
|
Author: {u'username': u'cswanson310', u'name': u'Charlie Swanson', u'email': u'charlie.swanson@mongodb.com'}Message: |
| Comment by Githook User [ 21/Jan/16 ] |
|
Author: {u'username': u'cswanson310', u'name': u'Charlie Swanson', u'email': u'charlie.swanson@mongodb.com'}Message: |
| Comment by Githook User [ 13/Jan/16 ] |
|
Author: {u'username': u'cswanson310', u'name': u'Charlie Swanson', u'email': u'charlie.swanson@mongodb.com'}Message: Revert " This reverts commit 6f3142b97b00464efebe26ef35cba73d47cf98a4. |
| Comment by Githook User [ 13/Jan/16 ] |
|
Author: {u'username': u'cswanson310', u'name': u'Charlie Swanson', u'email': u'charlie.swanson@mongodb.com'}Message: |
| Comment by Charlie Swanson [ 11/Jan/16 ] |
|
Updated the description once again to reflect our new solution. |
| Comment by Charlie Swanson [ 05/Jan/16 ] |
|
Updated description to reflect the fix we went with. We decided not to make the size of the limit configurable, since the 32MB sort limit already is configurable. |