[SERVER-32085] $changeStream reports incorrect documentKey for unsharded collections that become sharded Created: 26/Nov/17  Updated: 30/Oct/23  Resolved: 06/Dec/17

Status: Closed
Project: Core Server
Component/s: Aggregation Framework, Replication, Sharding
Affects Version/s: 3.6.0-rc5
Fix Version/s: 3.6.1, 3.7.1

Type: Bug Priority: Major - P3
Reporter: Bernard Gorman Assignee: Bernard Gorman
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
is related to SERVER-34090 Allow resuming change stream when res... Closed
is related to SERVER-30834 Make mongos reload the shard registry... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v3.6
Sprint: Query 2017-12-04, Query 2017-12-18
Participants:

 Description   

Following SERVER-30834, an active $changeStream on an unsharded collection will detect when that collection becomes sharded, and will open additional cursors on the new shards as chunks migrate to them. However, we never re-establish the cursor on the original shard, and that cursor has already cached the unsharded documentKey (_id alone). As a result, every insert performed on the primary shard post-shardCollection continues to produce $changeStream entries whose documentKey contains only _id and omits the shard key.
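
For illustration, a rough mongo shell sketch of the behaviour (database, collection and field names here are purely illustrative; it assumes a sharded cluster where sharding has not yet been enabled for this collection):

    // Sketch only; names are illustrative.
    const testDB = db.getSiblingDB("test");
    const coll = testDB.becomesSharded;
    coll.createIndex({x: 1});

    // Open the stream while the collection is unsharded and consume one event,
    // so that the cursor on the (future) primary shard caches {_id} as the documentKey.
    const stream = coll.aggregate([{$changeStream: {}}]);
    assert.writeOK(coll.insert({_id: 0, x: 0}));
    printjson(stream.next().documentKey);  // {_id: 0} - correct while unsharded

    // Shard the collection on {x: 1}.
    assert.commandWorked(testDB.adminCommand({enableSharding: testDB.getName()}));
    assert.commandWorked(
        testDB.adminCommand({shardCollection: coll.getFullName(), key: {x: 1}}));

    // The cursor on the original (primary) shard is never re-established, so an
    // insert routed to that shard still reports only _id in its documentKey.
    assert.writeOK(coll.insert({_id: 1, x: 1}));
    const postShardEvent = stream.next();
    printjson(postShardEvent.documentKey);  // actual:   {_id: 1}
                                            // expected: {x: 1, _id: 1}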

The primary shard therefore becomes a 'poison pill' for $changeStream from this point onwards:

  • All resume tokens taken from insert operations on the primary shard - including all inserts that occur post-sharding - are unusable (see the sketch after this list). This is because the token's documentKey field won't match the documentKey returned for that oplog entry by the resumed $changeStream, which does include the shard key. DocumentSourceEnsureResumeTokenPresent consequently rejects the resume attempt and uasserts.
  • Because the sort key for sharded $changeStream is <ts:1,uuid:1,documentKey:1>, this bug produces an undefined sort order for operations from different shards that share the same timestamp (i.e. those which are the Nth operation on their respective shards within the same second).
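
Continuing the rough sketch above, the resume failure would look roughly like this (the exact error surfaced by DocumentSourceEnsureResumeTokenPresent may differ):

    // postShardEvent comes from the sketch above: an insert on the primary shard
    // after shardCollection, reported with the stale documentKey {_id: 1}.
    // A change event's _id field is its resume token.
    const badToken = postShardEvent._id;

    // On resume, the shard regenerates the event with the full documentKey
    // {x: 1, _id: 1}; the token no longer matches, so the resume attempt uasserts.
    const resumed = coll.aggregate([{$changeStream: {resumeAfter: badToken}}]);
    resumed.hasNext();  // expected to fail with a "resume token not found"-style error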

The existing change_streams_unsharded_becomes_sharded.js test does not exercise this behaviour because it (a) shards the collection on _id, so the documentKey is the same pre- and post-sharding; and (b) does not retrieve any documents from the $changeStream pre-sharding, so the stream is effectively dormant and the documentKey is not cached. The following patch demonstrates the bug:

diff --git a/jstests/sharding/change_streams_unsharded_becomes_sharded.js b/jstests/sharding/change_streams_unsharded_becomes_sharded.js
index e172b28f46..5db3b86319 100644
--- a/jstests/sharding/change_streams_unsharded_becomes_sharded.js
+++ b/jstests/sharding/change_streams_unsharded_becomes_sharded.js
@@ -29,6 +29,7 @@
     const mongosColl = mongosDB[testName];
 
     mongosDB.createCollection(testName);
+    mongosColl.createIndex({x: 1});
 
     st.ensurePrimaryShard(mongosDB.getName(), st.rs0.getURL());
 
@@ -38,44 +39,57 @@
         cst.startWatchingChanges({pipeline: [{$changeStream: {}}], collection: mongosColl});
     assert.eq(0, cursor.firstBatch.length, "Cursor had changes: " + tojson(cursor));
 
+    // Verify that the cursor picks up documents inserted while the collection is unsharded.
+    assert.writeOK(mongosColl.insert({_id: 0, x: 0}));
+    cst.assertNextChangesEqual({
+        cursor: cursor,
+        expectedChanges: [{
+            documentKey: {_id: 0},
+            fullDocument: {_id: 0, x: 0},
+            ns: {db: mongosDB.getName(), coll: mongosColl.getName()},
+            operationType: "insert",
+        }]
+    });
+
     // Enable sharding on the previously unsharded collection.
     assert.commandWorked(mongosDB.adminCommand({enableSharding: mongosDB.getName()}));
 
-    // Shard the collection on _id.
+    // Shard the collection on x.
     assert.commandWorked(
-        mongosDB.adminCommand({shardCollection: mongosColl.getFullName(), key: {_id: 1}}));
+        mongosDB.adminCommand({shardCollection: mongosColl.getFullName(), key: {x: 1}}));
 
     // Split the collection into 2 chunks: [MinKey, 0), [0, MaxKey).
-    assert.commandWorked(
-        mongosDB.adminCommand({split: mongosColl.getFullName(), middle: {_id: 0}}));
+    assert.commandWorked(mongosDB.adminCommand({split: mongosColl.getFullName(), middle: {x: 0}}));
 
-    // Verify that the cursor is still valid and picks up the inserted document.
-    assert.writeOK(mongosColl.insert({_id: 1}));
+    // Move the [minKey, 0) chunk to shard1.
+    assert.commandWorked(mongosDB.adminCommand({
+        moveChunk: mongosColl.getFullName(),
+        find: {x: -1},
+        to: st.rs1.getURL(),
+        _waitForDelete: true
+    }));
+
+    // Make sure the change stream cursor sees a document inserted on the new shard.
+    assert.writeOK(mongosColl.insert({_id: -1, x: -1}));
     cst.assertNextChangesEqual({
         cursor: cursor,
         expectedChanges: [{
-            documentKey: {_id: 1},
-            fullDocument: {_id: 1},
+            documentKey: {x: -1, _id: -1},
+            fullDocument: {_id: -1, x: -1},
             ns: {db: mongosDB.getName(), coll: mongosColl.getName()},
             operationType: "insert",
         }]
     });
 
-    // Move the [minKey, 0) chunk to shard1 and write a document to it.
-    assert.commandWorked(mongosDB.adminCommand({
-        moveChunk: mongosColl.getFullName(),
-        find: {_id: -1},
-        to: st.rs1.getURL(),
-        _waitForDelete: true
-    }));
-    assert.writeOK(mongosColl.insert({_id: -1}));
-
-    // Make sure the change stream cursor sees the inserted document even after the moveChunk.
+    // Verify that the cursor on the original shard is still valid and sees new inserted documents.
+    // TODO: SERVER-32085 The 'documentKey' should now include the shard key in addition to _id.
+    // This test fails at present.
+    assert.writeOK(mongosColl.insert({_id: 1, x: 1}));
     cst.assertNextChangesEqual({
         cursor: cursor,
         expectedChanges: [{
-            documentKey: {_id: -1},
-            fullDocument: {_id: -1},
+            documentKey: {x: 1, _id: 1},
+            fullDocument: {_id: 1, x: 1},
             ns: {db: mongosDB.getName(), coll: mongosColl.getName()},
             operationType: "insert",
         }]



 Comments   
Comment by Bernard Gorman [ 06/Apr/18 ]

As of SERVER-34090, $changeStream can be resumed on a sharded collection using a token generated while it was unsharded.

Comment by Asya Kamsky [ 02/Jan/18 ]

Agreed that this should be documented clearly.

Comment by Ravind Kumar (Inactive) [ 26/Dec/17 ]

alyson.cabral I'd like to open a DOCS ticket against this so we can update the resume token docs to make a note here. Something like: "If you have a change stream cursor against an unsharded collection that is then sharded, you must open a new change stream cursor. Resume tokens generated by the original cursor are no longer valid post-sharding."

Or we can just generally advise to re-open change stream cursors established on an unsharded collection after sharding that collection.
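
Roughly, that advice would boil down to something like the following sketch (illustrative names only; it assumes a change stream was opened on test.becomesSharded before the collection was sharded):

    // Sketch of the suggested guidance; names are illustrative.
    const coll = db.getSiblingDB("test").becomesSharded;

    // Do not resumeAfter a token generated by the pre-sharding cursor; instead,
    // discard that cursor and open a new change stream.
    const newStream = coll.aggregate([{$changeStream: {}}]);
    while (newStream.hasNext()) {
        // Events from the new cursor carry the full documentKey (shard key + _id).
        printjson(newStream.next().documentKey);
    }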

Comment by Githook User [ 06/Dec/17 ]

Author:

{'name': 'Bernard Gorman', 'username': 'gormanb', 'email': 'bernard.gorman@gmail.com'}

Message: SERVER-32085 $changeStream reports incorrect documentKey for unsharded collections that become sharded

(cherry picked from commit a18859168f73428522d4338fee982329d9d431ed)
Branch: v3.6
https://github.com/mongodb/mongo/commit/992422efd955ad00b49f4fa6b08733e80e81047b

Comment by Githook User [ 06/Dec/17 ]

Author:

{'name': 'Bernard Gorman', 'username': 'gormanb', 'email': 'bernard.gorman@gmail.com'}

Message: SERVER-32085 $changeStream reports incorrect documentKey for unsharded collections that become sharded
Branch: master
https://github.com/mongodb/mongo/commit/a18859168f73428522d4338fee982329d9d431ed

Comment by Andy Schwerin [ 26/Nov/17 ]

I think the behavior is acceptable for now. We'll need to make more enhancements to the sharded catalog before we can do better.

Comment by Bernard Gorman [ 26/Nov/17 ]

On a related note: even when this bug is fixed, it will continue to be the case that resume tokens from operations that were retrieved pre-sharding will be unusable post-sharding. Is this the intended behaviour?
