[SERVER-21403] $snapshot can return duplicates on 3.2 in the case of an MMAPv1 document move Created: 11/Nov/15 Updated: 14/Dec/15 Resolved: 17/Nov/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Querying |
| Affects Version/s: | 3.2.0-rc2 |
| Fix Version/s: | 3.2.0-rc4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Gustavo Niemeyer | Assignee: | David Storch |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | QuInt C (11/23/15) |
| Description |
|
The $snapshot option doesn't seem to be working in 3.2. The following Go test case works in every other release, but in 3.2 it breaks with "Error: seen duplicated key: 3". The test exercises a worst-case scenario: while a forward iteration is in progress, documents are resized (grown) beyond their padding, in reverse order, so that each one is moved forward.
The test was performed using the MMAPv1 storage engine. |
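The scenario described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the reporter's test and not MongoDB code: `naiveScan`, `snapshotScan`, `firstDuplicate`, and the tombstone representation are all hypothetical names invented for the illustration. Storage is modeled as a slice of ids, and "growing beyond the padding" is modeled as moving a document to the end of that slice.

```go
package main

import "fmt"

// tombstone marks a storage slot whose document has moved away (toy model).
const tombstone = -1

// naiveScan walks storage in slot order. After returning the first document,
// every document is "grown" in reverse order and therefore moved to the end
// of storage, so the scan can meet already-returned documents again.
func naiveScan(n int) []int {
	storage := make([]int, n)
	for i := range storage {
		storage[i] = i
	}
	var out []int
	moved := false
	for pos := 0; pos < len(storage); pos++ {
		id := storage[pos]
		if id == tombstone {
			continue
		}
		out = append(out, id)
		if !moved {
			for i := n - 1; i >= 0; i-- {
				storage[i] = tombstone       // old location is vacated
				storage = append(storage, i) // document lands at the end
			}
			moved = true
		}
	}
	return out
}

// snapshotScan walks ids in key order instead of slot order, which is the
// guarantee the test expects: each id once, wherever the document now lives.
func snapshotScan(n int) []int {
	out := make([]int, 0, n)
	for id := 0; id < n; id++ {
		out = append(out, id)
	}
	return out
}

// firstDuplicate reports the first id that appears twice, if any.
func firstDuplicate(ids []int) (int, bool) {
	seen := make(map[int]bool)
	for _, id := range ids {
		if seen[id] {
			return id, true
		}
		seen[id] = true
	}
	return 0, false
}

func main() {
	fmt.Println(naiveScan(3))    // [0 2 1 0]
	fmt.Println(snapshotScan(3)) // [0 1 2]
	if id, dup := firstDuplicate(naiveScan(3)); dup {
		fmt.Println("seen duplicated key:", id) // seen duplicated key: 0
	}
}
```

The duplicate check in `firstDuplicate` mirrors the kind of assertion that produces the "seen duplicated key" error in the report. The slot-order scan only shows why document moves are dangerous for an unprotected iteration; it does not reproduce the 3.2 bug itself.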
| Comments |
| Comment by Githook User [ 17/Nov/15 ] | |||||
|
Author: David Storch (dstorch) <david.storch@10gen.com> Message: | |||||
| Comment by Githook User [ 17/Nov/15 ] | |||||
|
Author: David Storch (dstorch) <david.storch@10gen.com> Message: Includes similar plumbing for the ephemeralForTest storage | |||||
| Comment by David Storch [ 12/Nov/15 ] | |||||
|
The problem has to do with the way in which an MMAPv1 IndexCursor restores its position after a yield. In particular, it may restore incorrectly when an MMAPv1 document's RecordId changes due to a move. This was introduced in the 3.1 dev cycle during the rewrite of the Storage Engine API's index cursor interface. | |||||
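To make the failure mode concrete, here is a toy model of one way a restore can go wrong when a RecordId changes. It is an assumption-laden sketch, not the server's IndexCursor code: the `entry` type, the `resume` function, and the `byKeyOnly` switch are hypothetical, and comparing by key alone is shown only as a plausible remedy for a unique index, not as a description of the actual fix.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a toy index entry: a key plus the record location it points at.
type entry struct {
	key int
	rid int
}

// less orders entries by key, then by record location.
func less(a, b entry) bool {
	if a.key != b.key {
		return a.key < b.key
	}
	return a.rid < b.rid
}

// resume returns the keys a scan would still return after a yield. index is
// the sorted index content after the yield; saved is the last entry returned
// before it. With byKeyOnly false, the saved position only matches an entry
// with the same key AND the same rid. With byKeyOnly true (assumes unique
// keys), a matching key alone counts as "already returned".
func resume(index []entry, saved entry, byKeyOnly bool) []int {
	// Seek to the first entry that is not before the saved position.
	i := sort.Search(len(index), func(j int) bool { return !less(index[j], saved) })
	if i < len(index) {
		same := index[i] == saved
		if byKeyOnly {
			same = index[i].key == saved.key
		}
		if same {
			i++ // the entry under the cursor was already returned
		}
	}
	var keys []int
	for ; i < len(index); i++ {
		keys = append(keys, index[i].key)
	}
	return keys
}

func main() {
	// The document with key 3 was returned at rid 10, then moved to rid 50
	// while the cursor was yielded.
	index := []entry{{1, 1}, {2, 2}, {3, 50}, {4, 4}}
	saved := entry{key: 3, rid: 10}

	fmt.Println(resume(index, saved, false)) // [3 4]: key 3 comes back again
	fmt.Println(resume(index, saved, true))  // [4]
}
```

In this model the duplicate arises because the moved entry no longer equals the saved position, so the cursor treats it as an entry it has not returned yet.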
| Comment by Scott Hernandez (Inactive) [ 12/Nov/15 ] | |||||
|
I've updated the test to use a large 1MB string, and that seems to be enough to cause the failure on master. | |||||
| Comment by Gustavo Niemeyer [ 12/Nov/15 ] | |||||
|
Unlike the Go test case, the attached JavaScript test uses a very small array, which means an update could fit into the padding. | |||||
| Comment by Scott Hernandez (Inactive) [ 12/Nov/15 ] | |||||
|
Thomas, I updated your jstest to assert instead of printing, so that we can run it in resmoke and it fits into the regression tests in the server code base. Unfortunately, it doesn't seem to have a problem: it passes with both WiredTiger and MMAPv1 on master for me locally. Please confirm that my changes still error on your system, and then we can figure out what the differences are. | |||||
| Comment by Scott Hernandez (Inactive) [ 11/Nov/15 ] | |||||
|
Can this be reproduced with another driver/language, such as the shell? In mgo, which protocol is being used with the server: the find command or OP_QUERY on the wire protocol? Please include the server logs at verbosity level 2, and the explain output as well, to help us understand what is being sent. | |||||
| Comment by Kelsey Schubert [ 11/Nov/15 ] | |||||
|
This issue was introduced in 3.1.2. | |||||
| Comment by Kelsey Schubert [ 11/Nov/15 ] | |||||
|
I have verified the issue. I have attached a version of this test case as a program that can be executed outside of the test framework and its fixtures. The issue is only present with MMAPv1 and does not occur with WiredTiger. |