As of SERVER-30834, an active $changeStream on an unsharded collection will detect when that collection becomes sharded, and will open additional cursors on the new shards as chunks migrate to them. However, the cursor on the original shard is never re-established, and that shard has already cached the unsharded documentKey as _id. As a result, all insert operations performed on the primary shard post-shardCollection will continue to produce $changeStream entries whose documentKey is _id and which omit the shard key.
The primary shard therefore becomes a 'poison pill' for $changeStream from this point onwards:
- All resume tokens taken from insert operations on the primary shard - including all inserts that occur post-sharding - are unusable. This is because the token's documentKey field won't match the documentKey returned for that oplog entry by the resumed $changeStream, which does include the shard key. DocumentSourceEnsureResumeTokenPresent consequently rejects the resume attempt and uasserts.
- Because the sort key for sharded $changeStream is <ts:1,uuid:1,documentKey:1>, and the primary shard's documentKey has a different shape from that of the other shards, this bug will produce an undefined sort order for operations from different shards which have the same timestamp (i.e. those which are the Nth operation on their respective shards within the same second).
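The resume failure described in the first bullet can be illustrated with a small standalone model in plain JavaScript. This is only a sketch and not the server's implementation: the helper names (documentKeyFor, tokenMatches), the shard key field name shardKey, the timestamp value, and the assumption that a sharded documentKey lists the shard key fields followed by _id are all hypothetical choices made for illustration.

```javascript
// Simplified model of the resume-token mismatch. All names are hypothetical
// and this is NOT how the server implements the check.

// Build the documentKey a shard would report for an inserted document,
// given the key fields that shard currently believes apply.
function documentKeyFor(doc, keyFields) {
    const key = {};
    for (const field of keyFields) {
        key[field] = doc[field];
    }
    return key;
}

// Stand-in for the check performed on resume: the token must match the
// (ts, documentKey) of the corresponding event in the resumed stream.
// JSON.stringify is order-sensitive, which is sufficient for this sketch.
function tokenMatches(token, event) {
    return token.ts === event.ts &&
        JSON.stringify(token.documentKey) === JSON.stringify(event.documentKey);
}

// An insert performed on the primary shard after shardCollection.
const insertedDoc = {_id: 1, shardKey: "a"};

// The original stream cached the unsharded key pattern (assumed: _id only).
const staleKeyFields = ["_id"];
// A newly opened (resumed) stream sees the sharded key pattern
// (assumed: shard key fields followed by _id).
const freshKeyFields = ["shardKey", "_id"];

// Token handed to the client by the original stream.
const token = {ts: 100, documentKey: documentKeyFor(insertedDoc, staleKeyFields)};
// Event the resumed stream produces for the same oplog entry.
const resumedEvent = {ts: 100, documentKey: documentKeyFor(insertedDoc, freshKeyFields)};

const resumable = tokenMatches(token, resumedEvent);
console.log(resumable); // false
```

In this model the token only matches when both streams agree on the documentKey shape, which is why sharding on _id hides the problem.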
The existing change_streams_unsharded_becomes_sharded.js test does not exercise this behaviour because it (a) shards the collection on _id, so the documentKey is the same pre- and post-sharding; and (b) does not retrieve any documents from the $changeStream pre-sharding, so the stream is effectively dormant and the documentKey is not cached. The following patch demonstrates the bug: