[SERVER-41529] To prevent dangling index records, CollectionBulkLoaderImpl should not call _addDocumentToIndexBlocks in a writeConflictRetry block. Created: 05/Jun/19 Updated: 29/Oct/23 Resolved: 25/Jun/19 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 4.2.0-rc2, 4.3.1 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Suganthi Mani | Assignee: | Allison Easton |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Backport Requested: | v4.2, v4.0 |
| Sprint: | Repl 2019-06-17, Repl 2019-07-01 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
For uncapped collections, CollectionBulkLoaderImpl::insertDocuments inserts documents by calling CollectionImpl::insertDocumentForBulkLoader in a writeConflictRetry block. Independently of the new batching design, this can leave dangling index entries. Suppose insertDocumentForBulkLoader() throws WriteConflictException. The operations in that block are then retried.
Because inserting an <index key, RecordId> pair into the external sorter is not part of the storage transaction, a previously failed storage transaction attempt leaves dangling index entries that point to invalid RecordIds. The fix is to not wrap _addDocumentToIndexBlocks() in a writeConflictRetry block. Instead, _addDocumentToIndexBlocks() (the insertion into the external sorter) should be called only after the batch of records has been successfully committed to storage. |
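The failure mode described above can be reproduced in a small, self-contained sketch. This is not MongoDB code: ToyStore, Sorter, insertAttempt, insertWithRetry and countDangling are hypothetical stand-ins, and the sketch assumes that a RecordId allocated during a rolled-back attempt is not reused by the retry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a write conflict; not a MongoDB type.
struct WriteConflict : std::runtime_error {
    WriteConflict() : std::runtime_error("write conflict") {}
};

// Toy record store: only committed records are visible.
struct ToyStore {
    std::map<std::int64_t, std::string> committed;  // RecordId -> document
    std::int64_t nextId = 1;    // assumption: ids are never reused after a rollback
    int conflictsToInject = 0;  // number of attempts that should fail
};

// Toy external sorter: a list of <index key, RecordId> pairs kept outside
// the storage transaction, so a rollback does not undo additions to it.
using Sorter = std::vector<std::pair<std::string, std::int64_t>>;

// One attempt that mixes transactional and non-transactional work.
void insertAttempt(ToyStore& store, Sorter& sorter, const std::string& doc) {
    const std::int64_t id = store.nextId++;
    sorter.emplace_back(doc, id);  // survives even if the attempt fails
    if (store.conflictsToInject > 0) {
        --store.conflictsToInject;
        throw WriteConflict();  // the record is never committed
    }
    store.committed[id] = doc;
}

// The problematic shape: the retry loop wraps the sorter insertion too.
void insertWithRetry(ToyStore& store, Sorter& sorter, const std::string& doc) {
    for (;;) {
        try {
            insertAttempt(store, sorter, doc);
            return;
        } catch (const WriteConflict&) {
            // retry the whole attempt, including the sorter insertion
        }
    }
}

// Counts sorter entries whose RecordId has no committed record.
std::size_t countDangling(const ToyStore& store, const Sorter& sorter) {
    std::size_t dangling = 0;
    for (const auto& entry : sorter) {
        if (store.committed.count(entry.second) == 0) {
            ++dangling;
        }
    }
    return dangling;
}
```

With one injected conflict, a single document ends up with two sorter entries, one of which references a RecordId that was never committed.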
| Comments |
| Comment by Githook User [ 06/Nov/19 ] |
|
Author: {'name': 'Suganthi Mani', 'username': 'smani87', 'email': 'suganthi.mani@mongodb.com'} Message: (cherry picked from commit fc4c06660da1e121c817add86a56bbee1ef05f16) |
| Comment by Githook User [ 03/Jul/19 ] |
|
Author: {'name': 'Allison Easton', 'email': 'allison.easton@mongodb.com'} Message: (cherry picked from commit 54ca8a7112746c7637a295b6d57b6f2c3b4df9b7) |
| Comment by Githook User [ 25/Jun/19 ] |
|
Author: {'name': 'Allison Easton', 'email': 'allison.easton@mongodb.com'}Message: |
| Comment by Suganthi Mani [ 05/Jun/19 ] |
|
The contract is that _addDocumentToIndexBlocks() won't throw WriteConflictException, and that is safe to assume because it does only two things.
Note that we can't retry step 2, because that can cause dangling index entries when the indexes are committed at the end of collection cloning. However, my understanding is that the step WiredTigerRecordStore::insertRecord can throw WriteConflictException. WriteConflictException is thrown not only for conflicting writes but also for various other reasons, such as WiredTiger memory issues. So the solution I am suggesting is to wrap only WiredTigerRecordStore::insertRecord() in a writeConflictRetry block and, once the record insertion (writeUnitOfWork) has been committed successfully, call _addDocumentToIndexBlocks() with the list of <doc, RecordId> pairs. At that point we know the records are inserted and committed, so there is no way to produce duplicate index entries (the same index keys with different RecordIds). |
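A minimal sketch of the ordering suggested in this comment, under stated assumptions: ToyRecordStore, ToyWriteConflict, retryOnConflict and loadBatch are hypothetical names, not the server's API, and the toy store commits a batch all-or-nothing. Only the record insertion sits inside the retry loop. The sorter is fed once, from the list of <doc, RecordId> pairs returned by the committed insertion.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a write conflict; not a MongoDB type.
struct ToyWriteConflict : std::runtime_error {
    ToyWriteConflict() : std::runtime_error("toy write conflict") {}
};

using RecordId = std::int64_t;
using DocAndId = std::pair<std::string, RecordId>;

struct ToyRecordStore {
    std::map<RecordId, std::string> committed;  // visible records only
    RecordId nextId = 1;                        // assumption: ids are not reused
    int conflictsToInject = 0;                  // attempts that should fail

    // Commits the whole batch or throws and leaves `committed` untouched.
    std::vector<DocAndId> insertBatch(const std::vector<std::string>& docs) {
        std::vector<DocAndId> staged;
        for (const auto& doc : docs) {
            staged.emplace_back(doc, nextId++);
        }
        if (conflictsToInject > 0) {
            --conflictsToInject;
            throw ToyWriteConflict();
        }
        for (const auto& s : staged) {
            committed[s.second] = s.first;
        }
        return staged;
    }
};

// Retries `op` until it completes without a toy write conflict.
template <typename F>
auto retryOnConflict(F op) -> decltype(op()) {
    for (;;) {
        try {
            return op();
        } catch (const ToyWriteConflict&) {
            // only the transactional step is retried
        }
    }
}

// Step 1 (retried): insert and commit the records.
// Step 2 (runs once): hand the committed <doc, RecordId> pairs to the sorter.
std::vector<DocAndId> loadBatch(ToyRecordStore& store,
                                std::vector<DocAndId>& sorter,
                                const std::vector<std::string>& docs) {
    auto inserted = retryOnConflict([&] { return store.insertBatch(docs); });
    for (const auto& entry : inserted) {
        assert(store.committed.count(entry.second) == 1);  // never dangling
        sorter.push_back(entry);
    }
    return inserted;
}
```

Because the sorter only ever sees RecordIds from the committed attempt, injected conflicts change which ids are used but never add extra sorter entries.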
| Comment by Judah Schvimer [ 05/Jun/19 ] |
|
What should occur if it throws a WriteConflictException? We do not want to have to restart the entire initial sync (though that's better than corrupted indexes). |
| Comment by Suganthi Mani [ 05/Jun/19 ] |
|
As a part of this ticket, we should also write a js test to validate the fix. |
| Comment by Suganthi Mani [ 05/Jun/19 ] |
|
This should be backported to 4.2 and 4.0. |