Core Server / SERVER-21037

Initial sync can miss documents if concurrent update results in error (mmapv1 only)

    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: QuInt B (11/02/15)

      Under certain circumstances, an initial sync collection clone can skip over documents that are concurrently updated. When this happens, the initial sync reports success, but the newly-synced member is silently missing those documents.

      The following conditions are required to trigger this scenario:

      • The sync source must be running with the mmapv1 storage engine.
      • When the collection scan query issued by the initial sync is yielding locks, an update must be issued against the document pointed to by the query's record cursor. This update must meet both of the following criteria:
        • The update must increase the size of the document, such that a document move is required.
        • The update must fail to generate an oplog entry (e.g. if the update fails with a duplicate key error; see the example below).
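
      For instance, with the unique index on {a: 1} and the existing document {_id: 2, a: 2} from the repro script below, the following update grows the target document enough to force a document move, but fails with a duplicate key error and therefore writes nothing to the oplog:

      db.foo.update({_id: 1}, {$set: {x: new Array(1024).join("x"), a: 2}});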

      With mmapv1, an update of a document generates an invalidation for all active cursors pointing to that document (as a result, those cursors are advanced). Documents that are updated in this manner during an initial sync are copied to the sync target during the "oplog replay" initial sync phase. However, the copy is not performed if the update does not generate an oplog entry, which causes the synced collection to be missing the document.
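
      One way to observe this (a quick check, reusing the primary connection and the test.foo collection from the repro script below) is to confirm that the failing updates never produce an oplog entry:

      // Expect 0: the duplicate-key updates against test.foo write nothing to the oplog.
      primary.getDB("local").oplog.rs.find({ns: "test.foo", op: "u"}).itcount();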

      This is a regression introduced in the 3.0.x series of the server. In the 2.6.x series and prior, invalidations are not issued if the update would generate an error; this logic was removed with the introduction of the storage API in the 3.0.x series.

      This issue can be reproduced with the following script:

      var rst = new ReplSetTest({nodes: 2,
                                 nodeOptions: {storageEngine: "mmapv1",
                                               setParameter: "internalQueryExecYieldIterations=2"}});
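      // internalQueryExecYieldIterations=2 (set above) makes the collection scan on the
      // sync source yield its locks very frequently, widening the race window.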
      rst.startSet();
      rst.initiate();
      var primary = rst.getPrimary();
      var secondary = rst.getSecondary();
      assert.writeOK(primary.getDB("test").foo.insert([{_id: 0, a: 0}, {_id: 1, a: 1}, {_id: 2, a: 2}]));
      assert.commandWorked(primary.getDB("test").foo.ensureIndex({a: 1}, {unique: true}));
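      // The unique index on {a: 1} is what makes the parallel update below fail with a
      // duplicate key error, so that it never generates an oplog entry.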
      rst.awaitReplication();
      rst.stop(secondary);
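      // Repeatedly issue an update that grows {_id: 1} (forcing a document move under
      // mmapv1) but conflicts with {_id: 2, a: 2} on the unique index, so it always fails.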
      startParallelShell(
          'while (true) { \
               db.foo.update({_id: 1}, {$set: {x: new Array(1024).join("x"), a: 2}}); \
               sleep(1000); \
           }', primary.port);
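      // Bring the secondary back so that it performs an initial sync while the failing
      // updates race with the collection clone.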
      rst.start(secondary);
      rst.waitForState(secondary, rst.SECONDARY, 60 * 1000);
      load("jstests/replsets/rslib.js");  // provides reconnect()
      reconnect(secondary.getDB("test"));
      assert.eq(3, secondary.getDB("test").foo.count());
      

      The assertion on the last line trips with the message "3 != 2", as the newly-synced member is missing the document {_id: 1, a: 1}.
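
      Listing the secondary's copy of the collection (reusing the connections from the script above) shows which document was dropped:

      secondary.getDB("test").foo.find().sort({_id: 1}).toArray();
      // [ { "_id" : 0, "a" : 0 }, { "_id" : 2, "a" : 2 } ]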

      The following patch to the server greatly increases reproducibility, by lengthening the window during which the collection scan has yielded its locks so that the concurrent update reliably lands mid-yield:

      diff --git a/src/mongo/db/query/query_yield.cpp b/src/mongo/db/query/query_yield.cpp
      index 4e0d463..7edde6e 100644
      --- a/src/mongo/db/query/query_yield.cpp
      +++ b/src/mongo/db/query/query_yield.cpp
      @@ -62,6 +62,10 @@ void QueryYield::yieldAllLocks(OperationContext* txn, RecordFetcher* fetcher) {
           // locks). If we are yielding, we are at a safe place to do so.
           txn->recoveryUnit()->abandonSnapshot();
      
      +    if (txn->getNS() == "test.foo") {
      +        sleepmillis(2000);
      +    }
      +
           // Track the number of yields in CurOp.
           CurOp::get(txn)->yielded();
      

      Reproduced with master (07168e08) and 3.0.7.

            Assignee: Geert Bosch (geert.bosch@mongodb.com)
            Reporter: J Rassi (rassi)