[SERVER-31691] Change streams UUID mismatch uasserts with error code that causes invariant Created: 24/Oct/17  Updated: 30/Oct/23  Resolved: 07/Nov/17

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 3.6.0-rc4

Type: Bug Priority: Major - P3
Reporter: Nicholas Zolnierz Assignee: Bernard Gorman
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-31392 Test changeStreams when an unsharded ... Closed
is depended on by SERVER-31394 Create passthroughs of existing chang... Closed
Duplicate
is duplicated by SERVER-31746 Invariant failure on sharded $changeS... Closed
Related
related to SERVER-31847 Allow sharded $changeStream to contin... Closed
is related to SERVER-31707 Test changeStreams on a sharded colle... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Query 2017-11-13
Participants:

 Description   

Found while writing the tests for SERVER-31392: the DocumentSourceChangeStream hits a uassert when the UUID in the oplog entry does not match the UUID in the scoped collection metadata. Returning ErrorCodes::StaleConfig results in a dynamic_cast a few layers up, and eventually an invariant failure. I am not sure we should hit this uassert at all (a separate discussion), but it certainly doesn't seem like we should be returning this status.



 Comments   
Comment by Sam Johnson [ 23/Jul/18 ]

Awesome, that fixed the problem. Thank you.

Comment by Bernard Gorman [ 22/Jul/18 ]

sjohnson540, many thanks for providing that additional information. After conferring with some of my colleagues, we believe that you are likely encountering the issue described in SERVER-36154. MongoDB 3.6 introduced per-collection UUIDs, and when upgrading from 3.4 to 3.6 every existing collection is assigned a new UUID when the featureCompatibilityVersion is set to 3.6. However, while the UUIDs for each collection will be persisted to disk, it is possible that some nodes may fail to update their in-memory cache of the collection state to reflect this. When a $changeStream is opened on a collection whose cache does not contain the new UUID, it results in the error you described above.

To resolve the issue, please try the following:

1. Connect directly to each shard's Primary, and issue the command below to refresh the Primary's in-memory collection state cache:

db.adminCommand({forceRoutingTableRefresh: "databaseName.collectionName"})

2. Unfortunately, this will only refresh the Primary's cache, and forceRoutingTableRefresh can only be run on the Primary. To refresh the Secondaries' caches, please restart each Secondary one by one. This should fully resolve the problem without causing any shard downtime.

Once this is done, you should not encounter the error again at any point in the future, since all mongod processes will have an up-to-date cache including the UUIDs generated for each collection during the upgrade, and UUIDs will automatically be cached for any future collections created in 3.6 featureCompatibilityVersion.
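As a purely illustrative model of why step 1 fixes the Primary and step 2 fixes the Secondaries (plain JavaScript, not server code; the `CacheNode` class and its method names are invented for this sketch): the new UUIDs are persisted on disk, but each node's in-memory cache must be reloaded separately.

```javascript
// Toy model of the persisted-vs-cached collection state described above.
// All names here (CacheNode, refresh, restart) are illustrative, not server APIs.
class CacheNode {
  constructor(disk) {
    this.disk = disk;   // persisted collection state (UUIDs survive restarts)
    this.cache = {};    // in-memory state; may lag behind disk after the upgrade
  }
  refresh(ns) {         // analogous to forceRoutingTableRefresh (Primary only)
    this.cache[ns] = this.disk[ns];
  }
  restart() {           // a restarted Secondary reloads its cache from disk
    this.cache = { ...this.disk };
  }
  cachedUuid(ns) {
    return this.cache[ns];
  }
}

const disk = { 'databaseName.collectionName': 'uuid-assigned-at-fcv-3.6' };
const primary = new CacheNode(disk);
const secondary = new CacheNode(disk);

primary.refresh('databaseName.collectionName'); // step 1
secondary.restart();                            // step 2

console.log(
  primary.cachedUuid('databaseName.collectionName') === 'uuid-assigned-at-fcv-3.6' &&
  secondary.cachedUuid('databaseName.collectionName') === 'uuid-assigned-at-fcv-3.6'
); // true
```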

Best regards,
Bernard

Comment by Sam Johnson [ 16/Jul/18 ]

@bernard.gorman I am not Basit, but I have run into what seems to be the exact same issue, so I can answer the questions:

  • Yes, I am seeing this every time I open a changestream
  • I never receive successful notifications
  • It has NOT been dropped/recreated
  • The shard key is the id
  • Output (mongos and the primary return the same output):

[ { "name" : "actions", "type" : "collection", "options" : { "flags" : 3 }, "info" : { "readOnly" : false, "uuid" : UUID("4bfddfea-811e-4a59-935d-0f2fce3fdc0d") }, "idIndex" : { "v" : 1, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "staging.actions" } } ]

I see this issue is closed. I am on version 3.6.5 and still encountering this issue.

Comment by Bernard Gorman [ 06/Apr/18 ]

Hi basitmustafa!

Thank you for bringing this to our attention. You're correct that the fix for this issue was included in the GA release of 3.6.0, but that fix related only to the behaviour of the server when this scenario was encountered. Before this ticket, a UUID mismatch resulted in an invariant failure, which always causes the server to shut down as a protective measure. The fix implemented here was to change this to an assertion, which simply aborts the current operation and returns an error to the client.

I'm surprised to hear that you're seeing this exception on every change from one particular collection. The assertion is thrown here in 3.6.3, in the collectionDocumentKeyFields method. This is invoked the first time we encounter a change on the target collection; having extracted the UUID from the change's oplog entry, we then consult our cached collection metadata to determine whether the collection is sharded and, if so, what the shard key is.

The only scenario in which I've been able to force our tests to hit this assertion is if (1) a $changeStream is opened on a collection, (2) some writes are performed but no changes have yet been retrieved by the stream, (3) the collection is dropped, recreated (causing it to acquire a new UUID) and sharded, and (4) a getMore is issued on the $changeStream. This pulls the first of the changes, which still has the old UUID, through the pipeline; we attempt to look up the shard key, realise that the UUIDs no longer match, and throw the exception.
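The sequence just described can be sketched as follows (hedged: this is illustrative JavaScript mirroring the described logic, not the server's C++; `makeChangeStream` and its shape are invented for the sketch). The shard-key lookup and UUID comparison happen only on the first change pulled through the stream, so a buffered change carrying the old UUID collides with the recreated collection's new one:

```javascript
// Minimal sketch (not server code) of the first-change UUID check described above.
function makeChangeStream(getCollectionMetadata) {
  let documentKeyFields = null; // cached shard key fields, set on the first change
  return {
    processChange(oplogEntry) {
      if (documentKeyFields === null) {
        const metadata = getCollectionMetadata();
        if (metadata.uuid !== oplogEntry.uuid) {
          // Post-fix this is a uassert (the operation fails); before this
          // ticket it surfaced as an invariant that shut the server down.
          throw new Error('Collection UUID differs from UUID on change stream operations');
        }
        documentKeyFields = metadata.shardKeyFields;
      }
      return { documentKey: documentKeyFields, op: oplogEntry.op };
    },
  };
}

// Drop/recreate scenario: the buffered change still carries the old UUID,
// while the recreated, sharded collection's metadata has a new one.
const stream = makeChangeStream(() => ({ uuid: 'new-uuid', shardKeyFields: ['_id'] }));
let failed = false;
try {
  stream.processChange({ uuid: 'old-uuid', op: 'insert' });
} catch (e) {
  failed = true;
}
console.log(failed); // true
```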

However, it seems unlikely that such an artificial sequence of events is happening in your cluster. Can you let me know the following:

  • Are you seeing this every time you try to open a new $changeStream? Or when you try to resume a stream?
  • Do you ever receive any successful notifications before the error?
  • Has this collection been dropped and recreated at any point?
  • What's the collection's shard key?
  • Is this collection distributed across all shards in the cluster, or zoned to a subset of shards?
  • What is the output of running the following commands on mongos (in particular, what is the value of the info.uuid field)?

    use vtDataWarehouse
    db.getCollectionInfos({name: "VTDataObject"})
    

  • If you connect directly to the Primary of each shard and run the same commands as above, do the UUIDs on all of them match the UUID you observed via mongoS?
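The check in the last two bullets can be sketched as a simple client-side comparison (illustrative JavaScript; the `uuidsMatch` helper and the input shape are assumptions, with `info.uuid` taken from the `getCollectionInfos` output on each node):

```javascript
// Given getCollectionInfos output gathered from mongos and from each shard's
// Primary (as plain objects keyed by node), check that every node reports the
// same info.uuid for the collection. Purely illustrative client-side logic.
function uuidsMatch(collectionInfosByNode) {
  const uuids = Object.entries(collectionInfosByNode).map(
    ([node, infos]) => ({ node, uuid: infos[0].info.uuid })
  );
  const reference = uuids[0].uuid;
  return uuids.every(({ uuid }) => uuid === reference);
}

console.log(uuidsMatch({
  mongos: [{ info: { uuid: '4bfddfea-811e-4a59-935d-0f2fce3fdc0d' } }],
  shard0: [{ info: { uuid: '4bfddfea-811e-4a59-935d-0f2fce3fdc0d' } }],
})); // true
```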

Best regards,
Bernard

Comment by Basit Mustafa [ 27/Mar/18 ]

I think we are seeing this behavior in the wild in a production release (although I am not intimately familiar with the code base; we are simply a production user, not yet contributors).

On a sharded collection (vtDataWarehouse.VTDataObject), any change events result in an error, but only on this specific collection. Other collections, both sharded and unsharded, work fine on the same system, even on the same connection.

Any advice on a workaround before the planned 3.7 change around the uassert behavior (although this was supposedly fixed for 3.6, if I am understanding correctly)?

error: MongoError: Collection vtDataWarehouse.VTDataObject UUID differs from UUID on change stream operations
at /Users/basitmustafa/GitHub/voltaire-api-v5/node_modules/mongodb-core/lib/connection/pool.js:595:61
at authenticateStragglers (/Users/basitmustafa/GitHub/voltaire-api-v5/node_modules/mongodb-core/lib/connection/pool.js:513:16)
at Connection.messageHandler (/Users/basitmustafa/GitHub/voltaire-api-v5/node_modules/mongodb-core/lib/connection/pool.js:549:5)
at emitMessageHandler (/Users/basitmustafa/GitHub/voltaire-api-v5/node_modules/mongodb-core/lib/connection/connection.js:309:10)
at Socket.<anonymous> (/Users/basitmustafa/GitHub/voltaire-api-v5/node_modules/mongodb-core/lib/connection/connection.js:452:17)
at Socket.emit (events.js:180:13)
at addChunk (_stream_readable.js:269:12)
at readableAddChunk (_stream_readable.js:256:11)
at Socket.Readable.push (_stream_readable.js:213:10)
at TCP.onread (net.js:578:20)

Server is 3.6.3, gitVersion 9586e557d54ef70f9ca4b43c26892cd55257e1a5, on Ubuntu 16.04 x86_64

Comment by Bernard Gorman [ 07/Nov/17 ]

spencer: having looked into this, I'm not convinced that uasserting here is correct or desirable behaviour. I've pushed a fix to avoid the invariant failure and have opened SERVER-31847 to track followup work.

Comment by Githook User [ 07/Nov/17 ]

Author:

{'name': 'Bernard Gorman', 'username': 'gormanb', 'email': 'bernard.gorman@gmail.com'}

Message: SERVER-31691 Change streams UUID mismatch uasserts with error code that causes invariant
Branch: master
https://github.com/mongodb/mongo/commit/9d82100f7a0a4df9e797eef8f0fb3f6c66938771

Comment by Spencer Brody (Inactive) [ 27/Oct/17 ]

At this point I think we shouldn't worry about trying to properly trigger a metadata refresh. The only case where it can possibly be an issue is after an unsharded collection becomes sharded, which has other known issues right now anyway, and it's very easy to work around the issue (just do any versioned read or write against the shard, which will kick off a metadata refresh). So I think just using a normal uassert code should be fine for now.

Comment by Spencer Brody (Inactive) [ 27/Oct/17 ]

Yes, that is a known issue and should be fine. If you haven't returned any results yet, there should really be no problem in just re-establishing a brand-new cursor. So maybe the test you were working on just needs to be updated to make sure it has processed at least one result before dropping the collection? We also need to stop the uassert from triggering an invariant, but I think we're uasserting at the right time at least.

Comment by Bernard Gorman [ 27/Oct/17 ]

spencer, I think that makes sense. From the code, we do check whether we've already obtained and stored the shard key fields in _documentKeyFields before consulting the metadata.

However, that check only happens the first time we pull documents through the transform stage. So if the UUID changes between the time that the cursor is established on the shard and the point where it begins returning results, then I can see how we would hit this uassert.

Comment by Spencer Brody (Inactive) [ 27/Oct/17 ]

I think using the StaleConfig code is wrong, and we probably don't have the right information available to throw StaleConfigException properly.

What I want to understand is if we've already saved the shard key onto the cursor and checked the UUID, why are we hitting this uassert at all on the getMore? The idea behind how we extract the shard key for insert entries was that for a sharded changestream, when the mongod processes its very first notification it will check the collection metadata, extract the shard key, confirm the UUID of the collection metadata matches the UUID of the notification, then store the shard key on the cursor. From that point on, for the lifetime of the cursor, the mongod shouldn't have to be checking the CollectionMetadata at all since the cursor already knows the shard key, so even if the collection is dropped it should be able to continue processing entries until the cursor catches up to the drop entry and returns an "invalidate".

The only case that should fail is if the cursor is lost and the stream has to be resumed with a ResumeToken, making a new cursor. The new cursor will try to load the CollectionMetadata but won't find any information for the collection with that name. In that case we should return a sane error back to the user. It is a known issue that we can't resume after a collection drop, but so long as we're using the same cursor it should work fine.
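The intended lifecycle described in the last two paragraphs can be sketched like this (illustrative JavaScript, not server code; `openCursor` and the metadata store are invented for the sketch): the shard key is cached on the cursor at its first notification, so the same cursor keeps working after a drop, while a resumed cursor must reload metadata and fails if the collection is gone.

```javascript
// Sketch of the cursor-lifetime caching and resume behaviour described above.
function openCursor(metadataStore, collName) {
  let cachedShardKey = null; // populated once, at the first notification
  return {
    next(entry) {
      if (cachedShardKey === null) {
        const md = metadataStore.get(collName);
        if (!md) throw new Error('cannot resume: no metadata for ' + collName);
        cachedShardKey = md.shardKey;
      }
      return { shardKey: cachedShardKey, op: entry.op };
    },
  };
}

const metadata = new Map([['test.coll', { shardKey: ['_id'] }]]);
const cursor = openCursor(metadata, 'test.coll');
cursor.next({ op: 'insert' });          // first notification caches the shard key
metadata.delete('test.coll');           // collection dropped
const stillWorks = cursor.next({ op: 'insert' }).shardKey.length === 1;

// Resuming builds a new cursor, which must reload metadata and cannot.
let resumeFailed = false;
try {
  openCursor(metadata, 'test.coll').next({ op: 'insert' });
} catch (e) {
  resumeFailed = true;
}
console.log(stillWorks, resumeFailed); // true true
```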

Comment by Bernard Gorman [ 27/Oct/17 ]

See dupe ticket SERVER-31746 for a slightly different manifestation of this issue. If we open a $changeStream on a sharded collection and that collection is dropped before we have iterated all oplog entries, this uassert prevents us from ever seeing those changes.

Comment by Nicholas Zolnierz [ 24/Oct/17 ]

As far as I can tell, the collection metadata's UUID is valid at the time of issuing the getNext(), which would be after the collection is sharded. If I create a new change stream cursor after sharding the collection, everything works fine and the collection's UUID is unchanged.

The issue seems to be that the UUID associated with the change stream cursor created before sharding the collection is incorrect, stale, or missing. At first glance, perhaps this check was meant to catch it?

Comment by Spencer Brody (Inactive) [ 24/Oct/17 ]

I had assumed that it was due to there being no UUID in the CollectionMetadata, since the config servers don't track any information about unsharded collections, but I'll leave it to nicholas.zolnierz to confirm.

Comment by Esha Maharishi (Inactive) [ 24/Oct/17 ]

I'm confused why Nick's case causes a UUID mismatch. The UUID shouldn't change when the collection becomes sharded? That sounds like a more serious issue...

Comment by Spencer Brody (Inactive) [ 24/Oct/17 ]

Actually, on further consideration this may not be as big a problem in practice as I thought, since I don't think we look for the CollectionMetadata until we process the first actual changeStream notification. By the time there are any actual notifications to return, there must have been at least one insert to the collection. Since inserts are versioned, that must have caused the shard to refresh its CollectionMetadata at that point.

So a UUID mismatch between a real collection UUID in the CollectionMetadata and the UUID in the oplog entry probably can't happen, and the only case that's an issue is the one Nick describes, after an unsharded collection becomes sharded. In that case we will still need to kick off a refresh of the CollectionMetadata somehow.

Comment by Spencer Brody (Inactive) [ 24/Oct/17 ]

I believe the intent of that code was to handle the case where the shard has outdated CollectionMetadata - corresponding to a previous version of the collection - when the changeStream is established. Since changeStreams are unversioned, if the first operation you perform on the newly sharded collection is opening a change stream the metadata won't be refreshed and you'll get a UUID mismatch between the sharding CollectionMetadata and the on-disk catalog. If that happens we'd like to refresh the CollectionMetadata and if the UUIDs now match continue without erroring to the user.

Comment by Nicholas Zolnierz [ 24/Oct/17 ]

esha.maharishi, yeah I changed to error code InvalidUUID for now to unblock the testing, but StaleEpoch works as well. To give some more context on what's causing this, these are the steps:

1. Open change stream cursor on unsharded collection
2. Shard the collection
3. Insert 2 documents, one for each shard
4. Assert that getNext() on the change stream cursor is able to see the insert

I'm not sure if this is expected to work or not, but it seems like a perfectly valid use case to me.

Comment by Esha Maharishi (Inactive) [ 24/Oct/17 ]

Ah yes, we should never uassert with a StaleConfig error code. In general, we should instead throw a StaleConfigException, so that it can include the version wanted and version received.

However, I think this particular spot should uassert a different error code, probably StaleEpoch, since a UUID mismatch is equivalent to what an epoch mismatch meant before we had UUIDs. Note, however, that throwing a StaleEpoch will not cause the shard to refresh. What is supposed to happen if the change stream cursor's UUID does not match what's on the CollectionMetadata?

Generated at Thu Feb 08 04:27:53 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.