[SERVER-34326] Global snapshot reads fail with SnapshotTooOld error Created: 04/Apr/18 Updated: 29/Oct/23 Resolved: 13/Apr/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 3.7.4 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Misha Tyulenev | Assignee: | Misha Tyulenev |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Operating System: | ALL | ||||
| Sprint: | Sharding 2018-04-23 | ||||
| Participants: | |||||
| Linked BF Score: | 62 | ||||
| Description |
|
Global snapshot find and aggregate commands intermittently fail with a SnapshotTooOld error, even with retries. This indicates that the snapshot window is too short.
This is a test case where cmd is any mongos find or aggregate command with readConcern: {snapshot: true} |
| Comments |
| Comment by Githook User [ 13/Apr/18 ] |
|
Author: {'email': 'misha@mongodb.com', 'name': 'Misha Tyulenev', 'username': 'mikety'}Message: |
| Comment by Githook User [ 12/Apr/18 ] |
|
Author: {'email': 'misha@mongodb.com', 'name': 'Misha Tyulenev', 'username': 'mikety'}Message: |
| Comment by Misha Tyulenev [ 10/Apr/18 ] |
|
After an offline discussion, I will change the logic in the SnapshotUnavailable retries to use the latest known cluster time. |
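
The retry approach described in this thread can be illustrated with a minimal sketch. It is not the mongos implementation: `Timestamp`, `SnapshotError`, `next_at_cluster_time`, and `read_with_retries` are hypothetical names, and taking the maximum of the latest known cluster time and the rejected lastCommittedOpTime is an assumption about how "use the latest known cluster time" might be applied.

```python
from typing import NamedTuple


class Timestamp(NamedTuple):
    """Simplified (seconds, increment) cluster time; compares lexicographically."""
    secs: int
    inc: int


class SnapshotError(Exception):
    """Hypothetical stand-in for a SnapshotTooOld / SnapshotUnavailable rejection."""

    def __init__(self, last_committed: Timestamp):
        super().__init__("snapshot rejected")
        # lastCommittedOpTime reported by the node that rejected the read.
        self.last_committed = last_committed


def next_at_cluster_time(latest_known: Timestamp, rejected_with: Timestamp) -> Timestamp:
    """Choose the timestamp for the next attempt.

    Assumption: the retry reads at the latest cluster time the router knows,
    so it is never behind the lastCommittedOpTime returned with the rejection.
    """
    return max(latest_known, rejected_with)


def read_with_retries(run_read, latest_cluster_time, max_attempts=3):
    """Run a snapshot read, moving atClusterTime forward after each rejection.

    run_read(at) performs the read at the given timestamp (hypothetical callback).
    latest_cluster_time() returns the newest cluster time seen so far.
    """
    at = latest_cluster_time()
    for attempt in range(max_attempts):
        try:
            return run_read(at)
        except SnapshotError as err:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            at = next_at_cluster_time(latest_cluster_time(), err.last_committed)
```

In this sketch the timestamp is re-read from the router's view of cluster time on every retry, so a noop write that advances the snapshot between attempts is taken into account rather than leaving the retry behind.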
| Comment by Eric Milkie [ 05/Apr/18 ] |
|
You can't use the actual lastCommittedOpTime returned with a rejection for the next atClusterTime attempt. You have to adjust the timestamp further into the future. |
| Comment by Misha Tyulenev [ 05/Apr/18 ] |
|
I think so. The atClusterTime in a request generated by mongos is set from the lastCommittedOpTime returned with a rejection, but it's always behind. It looks like a noop write happens that moves the snapshot forward. |
| Comment by Eric Milkie [ 05/Apr/18 ] |
|
How are you selecting a new timestamp to read at when you retry? Are you pushing it far enough into the future? |