[SERVER-40024] Rename collection on a secondary can set the collection minimumVisibleSnapshot timestamp backwards in time after a background index build's ghost commit cluster time timestamp Created: 07/Mar/19 Updated: 29/Oct/23 Resolved: 19/Mar/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Storage |
| Affects Version/s: | None |
| Fix Version/s: | 4.0.8, 4.1.10 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Dianna Hohensee (Inactive) | Assignee: | Dianna Hohensee (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||
| Backport Requested: |
v4.0
|
||||||||||||||||||||
| Sprint: | Storage NYC 2019-03-25 | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Linked BF Score: | 63 | ||||||||||||||||||||
| Description |
|
AutoGetCollection will throw if a caller tries to access the collection at a point-in-time earlier than the min visible timestamp. Because renameCollection sets the min visible timestamp back in time, callers can now access the collection at an earlier time than an index build completed. This can hit the DEV (debug only) block in IndexCatalog::numIndexesReady and invariant; or allow callers that expect an index build to be complete to find no ready index because we filter out ready indexes based on commit timestamp, which can fail our testing that primaries and secondaries have the same number of indexes. |
| Comments |
| Comment by Dianna Hohensee (Inactive) [ 10/May/19 ] |
|
So I realized I wasn't quite right about no versioning in the index catalog code. If you call IndexCatalogImpl::findIndexByName, then it will get down to this code, which means that findIndexByName will only look through the indexes that were ready at the opCtx's snapshot time for matches. In case that's relevant. I'd guess you'd have to change the opCtx's PIT read timestamp, or recovery unit, or change opCtxs, for it to be relevant. But the view can change under a lock. james.wahlin |
| Comment by Githook User [ 27/Mar/19 ] |
|
Author: {'name': 'Dianna Hohensee', 'username': 'DiannaHohensee', 'email': 'dianna.hohensee@10gen.com'}Message: (cherry picked from commit 751ac6e4692ebe5a5026d96596ca4cffd40f8ffa) |
| Comment by Githook User [ 19/Mar/19 ] |
|
Author: {'name': 'Dianna Hohensee', 'username': 'DiannaHohensee', 'email': 'dianna.hohensee@10gen.com'}Message: |
| Comment by Daniel Gottlieb (Inactive) [ 07/Mar/19 ] |
|
I think having setMinimumVisibleSnapshot only advance is the right call. |
| Comment by Dianna Hohensee (Inactive) [ 07/Mar/19 ] |
|
I think if we made the setMinimumVisibleSnapshot() functions only ever increase the timestamp, i.e. skip setting any timestamps that are less than the current, then a lot of the test failures would likely stop. At least if dbhash's collStats cmd, and whatever else, retries on ErrorCodes::SnapshotUnavailable errors (what AutoGetCollection appears to throw). I don't know what would happen if a user established an index using query, say, after or at the index commit timestamp X, but then a dropIndexes gets replicated that commits back in time at timestamp Y, where Y < X. daniel.gottlieb, milkie, you both worked on / reviewed the ghost timestamping code, and so are more familiar with what's going on. Any ideas? |