[SERVER-32988] Oplog application, foreground index builds pin an unbounded amount of data in WiredTiger Created: 30/Jan/18 Updated: 27/Oct/23 Resolved: 22/Mar/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 3.7.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Daniel Gottlieb (Inactive) | Assignee: | Vesselina Ratcheva (Inactive) |
| Resolution: | Works as Designed | Votes: | 0 |
| Labels: | rollback-non-functional | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Operating System: | ALL |
| Sprint: | Repl 2018-02-26, Repl 2018-03-12, Repl 2018-03-26 |
| Participants: |
| Description |
|
Note, this only applies to the 3.7 development branch.
The index build uses one WriteUnitOfWork per document returned by the collection scan. Each of these writes picks up the "commit timestamp" set on the recovery unit. Foreground index builds use the recovery unit that is in context from the TimestampBlock, so their data writes are timestamped. Background index builds use their own OperationContext and therefore do not timestamp their data writes. Moreover, foreground index builds block replication. While replication is not progressing, the oldest_timestamp does not advance, and as long as it does not advance, all of the data writes that are part of the index build stay pinned. This can unnecessarily activate lookaside. |
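To illustrate the pinning described above, here is a minimal toy model in Python. It is not the WiredTiger or MongoDB API: `ToyTimestampedStore`, `foreground_index_build`, and the timestamp values are hypothetical names and numbers chosen for illustration. The only assumption it encodes is the one stated in the description: a timestamped write stays pinned while its commit timestamp is newer than the oldest_timestamp.

```python
class ToyTimestampedStore:
    """Toy model only; this is not the WiredTiger API."""

    def __init__(self, oldest_timestamp):
        self.oldest_timestamp = oldest_timestamp
        self.writes = []  # list of (key, commit_timestamp)

    def write(self, key, commit_timestamp):
        self.writes.append((key, commit_timestamp))

    def pinned_writes(self):
        # A write stays pinned while its timestamp is newer than oldest_timestamp.
        return [key for key, ts in self.writes if ts > self.oldest_timestamp]


def foreground_index_build(store, num_docs, commit_timestamp):
    # One write per scanned document, each stamped with the timestamp in context.
    for doc_id in range(num_docs):
        store.write(("index_key", doc_id), commit_timestamp)


# Replication is blocked during the build, so oldest_timestamp stays at 10.
store = ToyTimestampedStore(oldest_timestamp=10)
foreground_index_build(store, num_docs=1000, commit_timestamp=11)
pinned_during_build = len(store.pinned_writes())

# Replication resumes after the build and oldest_timestamp moves forward.
store.oldest_timestamp = 11
pinned_after_build = len(store.pinned_writes())

print(pinned_during_build, pinned_after_build)
```

In this sketch the number of pinned writes grows with the size of the collection being indexed, which is the "unbounded" part of the ticket title.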
| Comments |
| Comment by Daniel Gottlieb (Inactive) [ 22/Mar/18 ] |
|
It turns out that bulk index builds on WiredTiger (whether the "bulk" option succeeds or not) are, subtly, done outside a begin/commit transaction. The constructor opens a cursor that is used to perform all inserts into the index. These inserts are self-contained "autocommit" transactions that never have a timestamp applied. Even though the call is inside a committed WUOW, the session never became "active" (never had a transaction start), and likewise commit is never called. The side effect of not doing this ticket is that other storage engines that obey the timestamping contract (of which there are none, other than those that extend from WT itself) may pin index builds in memory. However, completing this ticket in a way that includes proof that it was done correctly would require changes to WTRecoveryUnits and/or WT index builds. |
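The behavior described in this comment can be sketched with a toy model. This is an assumption-laden illustration, not the WTRecoveryUnit or WiredTiger cursor API: `ToyRecoveryUnit` and `ToyBulkCursor` are hypothetical classes. The sketch assumes a recovery unit that starts its transaction lazily on first use and a bulk cursor whose inserts bypass that transaction and autocommit without a timestamp.

```python
class ToyRecoveryUnit:
    """Toy model only; the transaction starts lazily on first insert."""

    def __init__(self, committed):
        self.committed = committed  # shared list of (key, timestamp or None)
        self.active = False
        self.commit_timestamp = None
        self.pending = []

    def set_commit_timestamp(self, ts):
        self.commit_timestamp = ts

    def insert(self, key):
        self.active = True  # first use begins the transaction
        self.pending.append(key)

    def commit(self):
        if not self.active:
            return  # no transaction was ever started, so nothing is stamped
        self.committed.extend((key, self.commit_timestamp) for key in self.pending)
        self.pending.clear()
        self.active = False


class ToyBulkCursor:
    """Toy model only; every insert is its own untimestamped autocommit."""

    def __init__(self, committed):
        self.committed = committed

    def insert(self, key):
        self.committed.append((key, None))


committed = []
ru = ToyRecoveryUnit(committed)
ru.set_commit_timestamp(11)  # the timestamp in context for the index build

cursor = ToyBulkCursor(committed)  # opened by the bulk builder's constructor
for doc_id in range(3):
    cursor.insert(("index_key", doc_id))

was_active = ru.active  # the recovery unit was never used
ru.commit()             # the WUOW "commits", but there is nothing to stamp

timestamps = [ts for _, ts in committed]
print(was_active, timestamps)
```

Under these assumptions none of the index inserts carry the commit timestamp, so none of them are pinned by a stalled oldest_timestamp.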
| Comment by Ian Whalen (Inactive) [ 02/Feb/18 ] |
|
Just bumping this to the repl team to make sure they see it. Will also move Kyle's work over to the repl team ASAP. |