[SERVER-16675] getMore looks up doc with invalid RecordId, fails with "Didn't find RecordId in WiredTigerRecordStore" Created: 27/Dec/14  Updated: 21/Jan/15  Resolved: 07/Jan/15

Status: Closed
Project: Core Server
Component/s: Index Maintenance, Storage
Affects Version/s: 2.8.0-rc4
Fix Version/s: 2.8.0-rc5

Type: Bug Priority: Major - P3
Reporter: Kamran K. Assignee: David Storch
Resolution: Done Votes: 0
Labels: 28qa, wiredtiger
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-16750 Document never matching query predica... Closed
Tested
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

db.foo.drop();
db.foo.ensureIndex({a: 'text'});
 
db.foo.insert({a: 'bar'});
db.foo.insert({a: 'bar'});
db.foo.insert({a: 'bar'});
 
var cursor = db.foo.find({$text: {$search: 'bar'}});
cursor.batchSize(2);
cursor.next();
cursor.next();
 
db.foo.remove({$text: {$search: 'bar'}});
 
// this call triggers a 'Didn't find RecordId in WiredTigerRecordStore' error
cursor.itcount();

Participants:
Linked BF Score: 0

 Description   

The included script reliably triggers a 'Didn't find RecordId in WiredTigerRecordStore' error. I wasn't able to trigger the error by using a non-text index and non-text queries in the script.

This issue might be related to SERVER-16336 (and, by extension, SERVER-16351).


Version: b0014456



 Comments   
Comment by Githook User [ 07/Jan/15 ]

Author: David Storch (dstorch) <david.storch@10gen.com>

Message: SERVER-16675 force fetch RecordIds buffered by the query system on saveState()

This fixes an issue with WiredTiger query isolation.
Branch: master
https://github.com/mongodb/mongo/commit/c11002f5f414b2b9f18b8abc69b4c69efc82f1fd
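
One way to read "force fetch RecordIds buffered by the query system on saveState()" is sketched below in plain JavaScript. This is a toy model, not the code from the commit: `ToyStore`, `FetchOnSaveStage`, and their methods are hypothetical names invented for illustration. It assumes that "force fetch" means copying each buffered document before the snapshot is dropped, so later results no longer depend on the RecordId still existing.

```javascript
// Hypothetical in-memory stand-in for a record store (not a MongoDB API).
class ToyStore {
  constructor() { this.records = new Map(); this.nextId = 1; }
  insert(doc) { const id = this.nextId++; this.records.set(id, doc); return id; }
  remove(id) { this.records.delete(id); }
  get(id) {
    if (!this.records.has(id)) throw new Error("Didn't find RecordId " + id);
    return this.records.get(id);
  }
}

// Hypothetical stage that buffers RecordIds and fetches them on saveState().
class FetchOnSaveStage {
  constructor(store, ids) {
    this.store = store;
    this.buffer = ids.map(id => ({ id: id, doc: null }));
  }
  // Assumed to run before the snapshot is dropped: copy every buffered document.
  saveState() {
    for (const entry of this.buffer) {
      if (entry.doc === null) entry.doc = Object.assign({}, this.store.get(entry.id));
    }
  }
  next() {
    const entry = this.buffer.shift();
    if (entry === undefined) return null;
    // Prefer the copy taken in saveState(); otherwise look the record up.
    return entry.doc !== null ? entry.doc : this.store.get(entry.id);
  }
}

const fixStore = new ToyStore();
const fixIds = [1, 2, 3].map(() => fixStore.insert({ a: 'bar' }));
const fixStage = new FetchOnSaveStage(fixStore, fixIds);
fixStage.next();
fixStage.next();              // first batch of two
fixStage.saveState();         // end of batch: fetch what is still buffered
fixStore.remove(fixIds[2]);   // concurrent delete between batches
const fixResult = fixStage.next(); // served from the fetched copy, no lookup
```

In this model the third call to next() returns the copy taken at saveState() time instead of throwing, at the cost of possibly returning a document that has since been deleted.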

Comment by J Rassi [ 28/Dec/14 ]

This report exposes a design issue related to query isolation when using WiredTiger (and no, this isn't related to SERVER-16636 or text search / text indexes).

In mmapv1, a document update or delete generates a broadcast call to PlanExecutor::invalidate() on all registered PlanExecutor objects. invalidate() instructs the PlanExecutor to wipe any saved state it has associated with the given RecordId, as the record may no longer exist or match the original predicate.
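
The broadcast described above can be modelled roughly as follows. This is a plain JavaScript sketch, not server code: `ToyExecutor` and `ToyRegistry` are hypothetical stand-ins for the registered PlanExecutor objects and whatever keeps track of them, and the RecordIds are plain integers.

```javascript
// Hypothetical stand-in for a PlanExecutor holding buffered RecordIds.
class ToyExecutor {
  constructor(bufferedIds) { this.bufferedIds = new Set(bufferedIds); }
  // Wipe any saved state associated with the given RecordId.
  invalidate(recordId) { this.bufferedIds.delete(recordId); }
}

// Hypothetical registry that broadcasts invalidations to every executor.
class ToyRegistry {
  constructor() { this.executors = new Set(); }
  register(executor) { this.executors.add(executor); }
  // Called on every update or delete in this model.
  invalidateDocument(recordId) {
    for (const executor of this.executors) executor.invalidate(recordId);
  }
}

const registry = new ToyRegistry();
const execA = new ToyExecutor([1, 2, 3]);
const execB = new ToyExecutor([3, 4]);
registry.register(execA);
registry.register(execB);

// Deleting the document with RecordId 3 notifies both executors.
registry.invalidateDocument(3);
```

After the broadcast neither executor still holds RecordId 3, so neither can try to look it up later.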

In WiredTiger, reads have snapshot isolation, so document updates/deletes do not affect other operations' active reads. Thus, for WiredTiger, the server never makes any calls to PlanExecutor::invalidate(). The supporting comment in CursorManager::invalidateDocument() reads: "If a storage engine supports doc locking, then we do not need to invalidate. The transactional boundaries of the operation protect us."

The issue here is that queries drop their snapshot between calls to getMore; this violates the assumption that the operation is protected from other writes by its transactional boundaries. Specifically: query stages are allowed to save references to RecordIds that they encounter, and the query subsystem guarantees that each RecordId will continue to refer to the same exact document until it is invalidated. In this case, the document is deleted, and the query's new snapshot reflects that the document has been deleted, but the stage was never notified of the deletion. Before a query starts operating on a new snapshot, it needs to be delivered every invalidate() notification that's been generated since the creation of its previous snapshot.
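
The failure mode can be illustrated with a small in-memory model in plain JavaScript. Everything here is hypothetical (`ToyRecordStore`, `BufferingStage`, and the shortened error text are not MongoDB APIs); the only assumption is that a stage buffers RecordIds and looks them up again after a delete it was never told about.

```javascript
// Hypothetical in-memory record store keyed by integer RecordId.
class ToyRecordStore {
  constructor() { this.records = new Map(); this.nextId = 1; }
  insert(doc) { const id = this.nextId++; this.records.set(id, doc); return id; }
  remove(id) { this.records.delete(id); }
  // Mimics the failing lookup: throws when the RecordId no longer exists.
  dataFor(id) {
    if (!this.records.has(id)) throw new Error("Didn't find RecordId " + id);
    return this.records.get(id);
  }
}

// Hypothetical stage that buffers every RecordId up front and has no invalidate() hook.
class BufferingStage {
  constructor(store) {
    this.store = store;
    this.buffered = Array.from(store.records.keys());
  }
  next() {
    const id = this.buffered.shift();
    return id === undefined ? null : this.store.dataFor(id);
  }
}

const store = new ToyRecordStore();
const ids = [1, 2, 3].map(() => store.insert({ a: 'bar' }));
const stage = new BufferingStage(store);

stage.next();
stage.next();          // first batch of two
store.remove(ids[2]);  // delete lands between batches; the stage is not notified

let staleError = null;
try {
  stage.next();        // looks up a RecordId that no longer exists
} catch (e) {
  staleError = e.message;
}
```

The third lookup fails because the buffered RecordId outlived the document it referred to, which mirrors the sequence in the reported script.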

This issue is particularly easy to reproduce with the TEXT stage, which buffers the entire RecordId result set before returning any documents to the user. Many other stages buffer RecordIds as well, though. See the following repro, which uses SORT_MERGE (SORT_MERGE buffers the RecordId of the upcoming document from each child scan):

db.foo.drop();
db.foo.ensureIndex({a:1,b:1});
db.foo.insert({a:1,b:1});
db.foo.insert({a:1,b:2});
db.foo.insert({a:2,b:3});
db.foo.insert({a:2,b:4});
var cursor = db.foo.find({a:{$in:[1,2]}}).sort({b:1});
cursor.batchSize(2);
cursor.next();
cursor.next();
db.foo.remove({a:2,b:3});
cursor.next(); // "Didn't find RecordId in WiredTigerRecordStore"

Marking 2.8.0-rc5. cc eliot, schwerin, redbeard0531, david.storch.

Thanks for the report, kamran.khan.
