[SERVER-31191] Store the collection UUIDs in the CatalogCache
Created: 20/Sep/17  Updated: 30/Oct/23  Resolved: 17/Oct/17

Status: Closed
Project: Core Server
Component/s: Replication, Sharding
Affects Version/s: None
Fix Version/s: 3.6.0-rc1

Type: Task
Priority: Major - P3
Reporter: Tess Avitabile (Inactive)
Assignee: Nathan Myers
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-30548 Remove ShardCollectionType::uuid field Closed
is depended on by SERVER-31192 Get shard key into documentKey and re... Closed
Duplicate
is duplicated by SERVER-31027 store UUID in routing table cache ent... Closed
Related
related to SERVER-31540 Change uses of boost::optional of UUI... Closed
related to SERVER-31605 Un-blacklist cross-version UUID checks Closed
Backwards Compatibility: Fully Compatible
Sprint: Repl 2017-10-02, Repl 2017-10-23
Participants:

 Description   

When you call CatalogCache::getCollectionRoutingInfo(), the result should include the collection UUID.

This is needed for the change streams project, so that when parsing the shard key from an insert entry, we can be sure we have the correct shard key for the collection according to its UUID. We want to avoid the error where a collection is dropped and recreated with a different shard key, and we then extract the wrong shard key from the insert entry.
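The check described above could look roughly like the following. This is a minimal sketch, not the server's code: `RoutingInfo`, `InsertEntry`, and `shardKeyFor` are hypothetical stand-ins, the UUID is modeled as a plain string, and it assumes the insert entry carries the UUID of the collection it was written to.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in: the real server has a dedicated UUID type.
using UUID = std::string;

// Hypothetical, simplified view of what the routing cache would return.
struct RoutingInfo {
    std::optional<UUID> uuid;                 // may be absent, e.g. for older metadata
    std::vector<std::string> shardKeyFields;  // field names of the shard key
};

// Hypothetical insert entry; assumes it records the collection's UUID.
struct InsertEntry {
    UUID collectionUuid;
};

// Returns the shard key fields only when the cached routing info describes the
// same collection incarnation as the insert entry. Returns nullopt on a missing
// or mismatched UUID, so the caller can refresh the cache instead of using a
// shard key that belongs to a dropped-and-recreated collection.
std::optional<std::vector<std::string>> shardKeyFor(const InsertEntry& entry,
                                                    const RoutingInfo& cached) {
    if (!cached.uuid || *cached.uuid != entry.collectionUuid) {
        return std::nullopt;
    }
    return cached.shardKeyFields;
}
```

Returning an empty optional rather than the stale key makes the mismatch explicit to the caller, which is the failure mode this ticket aims to rule out.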

This can be accomplished by including the UUID in the CollectionAndChangedChunks and copying it into the ChunkManager in refreshCollectionRoutingInfo().
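The plumbing proposed here can be sketched as follows. The structs reuse the names `CollectionAndChangedChunks` and `ChunkManager` from the description, but their members are heavily simplified assumptions, and `makeChunkManager` is a hypothetical helper standing in for the relevant step of refreshCollectionRoutingInfo().

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in: the real server has a dedicated UUID type.
using UUID = std::string;

// Simplified stand-in for the loader's result; the changed chunks are omitted.
struct CollectionAndChangedChunks {
    std::optional<UUID> uuid;  // new: carried up from the refresh logic
    std::string epoch;
};

// Simplified stand-in for the in-memory routing table entry.
class ChunkManager {
public:
    ChunkManager(std::string nss, std::optional<UUID> uuid)
        : _nss(std::move(nss)), _uuid(std::move(uuid)) {}

    const std::string& getNss() const { return _nss; }
    const std::optional<UUID>& getUUID() const { return _uuid; }

private:
    std::string _nss;
    std::optional<UUID> _uuid;
};

// Hypothetical helper: copies the UUID (if any) from the loaded metadata into
// the ChunkManager that the cache hands out to callers.
ChunkManager makeChunkManager(const std::string& nss,
                              const CollectionAndChangedChunks& loaded) {
    return ChunkManager(nss, loaded.uuid);
}
```

Keeping the UUID optional at both levels lets a cache entry be built even when the source metadata has no UUID.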

On a primary, we populate CollectionAndChangedChunks in getChangedChunks(). We can copy the UUID from the CollectionType into the CollectionAndChangedChunks in that function.

On a secondary, we populate the CollectionAndChangedChunks from the shard server's config.collections collection in getPersistedMetadataSinceVersion(). Unfortunately, the shard server's config.collections collection does not include the UUID. In 3.8, we intend to have the _id be the UUID. However, for 3.6, we have approval from kaloian.manassiev and dianna.hohensee to add an extra optional field for the UUID.
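Reading such an optional field on a secondary might look like this. It is a sketch under stated assumptions: the persisted document is modeled as a string map rather than BSON, `PersistedCollectionEntry` and `parsePersistedEntry` are hypothetical names, and the `_id`, `epoch`, and `uuid` field names are assumed for illustration.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a BSON document read from config.collections.
using Document = std::map<std::string, std::string>;

// Hypothetical, simplified parsed form of a persisted collection entry.
struct PersistedCollectionEntry {
    std::string nss;                  // assumed to be stored under "_id"
    std::string epoch;                // assumed to be stored under "epoch"
    std::optional<std::string> uuid;  // extra optional field; absent in older entries
};

// Parses a persisted entry. The required fields must be present; the UUID is
// read only if the document has it, so entries written before the field was
// introduced still parse successfully.
std::optional<PersistedCollectionEntry> parsePersistedEntry(const Document& doc) {
    auto id = doc.find("_id");
    auto epoch = doc.find("epoch");
    if (id == doc.end() || epoch == doc.end()) {
        return std::nullopt;
    }
    PersistedCollectionEntry entry{id->second, epoch->second, std::nullopt};
    if (auto uuid = doc.find("uuid"); uuid != doc.end()) {
        entry.uuid = uuid->second;
    }
    return entry;
}
```

Because the field is optional, documents that lack it are still accepted, and the missing UUID simply propagates upward as an empty optional.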



 Comments   
Comment by Githook User [ 17/Oct/17 ]

Author: Nathan Myers <nathan.myers@10gen.com> (nathan-myers-mongo)

Message: SERVER-31191 Plumb Collection UUIDs through catalog cache
Branch: master
https://github.com/mongodb/mongo/commit/aeabbf96ff3c2990f553ba0a5e6e1d18ebddab2f

Comment by Dianna Hohensee (Inactive) [ 22/Sep/17 ]

Ah, whoops, you're talking about adding a new field, not using _id. Well, as it currently works, metadata is persisted to config.collections on the shard before fcv 3.6 is set, so the _id field cannot be the UUID, because UUIDs do not exist yet at that point. The only choice is to add a new 'uuid' field. Sounds good. I can't think of any issues: we don't really have any detailed plans for how to use uuid in sharding. We will probably have to update the schema in the v3.6 -> v3.8 upgrade regardless.

Comment by Dianna Hohensee (Inactive) [ 22/Sep/17 ]

I think adding the UUID should be pretty straightforward as well: there's already a placeholder waiting in ShardCollectionType. (We actually have SERVER-30548 to remove it.) It may just be a matter of swapping out the redundant string for the uuid, and switching any existing ShardCollectionType::getUUID() callers to ShardCollectionType::getNss(); then adding UUID to CollectionAndChangedChunks to carry the value up from the refresh logic to the in-memory cache.

So the idea is that you have a uuid, then you read something new later and want to refresh to make sure the uuid still matches?

Comment by Kaloian Manassiev [ 22/Sep/17 ]

Yes. I don't see a harm in storing the UUID there, but double-check with dianna.hohensee.

Comment by Tess Avitabile (Inactive) [ 22/Sep/17 ]

Like add an optional UUID field in the shard server's config.collections that is distinct from the _id? If that is acceptable to you, that would be great. I just don't want to mess up the schema for you.

Comment by Kaloian Manassiev [ 22/Sep/17 ]

It shouldn't be difficult to add the UUID to config.collections using the same logic as that in CatalogCache. In 3.8 we wanted to start naming the collections using their UUID, but perhaps if that would help ChangeStreams, we could just include the UUID along with the epoch?

Comment by Tess Avitabile (Inactive) [ 22/Sep/17 ]

spencer, I'm currently blocked on this issue, since on secondaries, we populate the CatalogCache from the shard server's config.collections, and the shard server's config.collections does not contain the collection UUID. I believe we intend to add the UUID to the shard server's config.collections in 3.8 (is that correct, dianna.hohensee?).

If we need sharded change streams to be supported on secondaries in 3.6, then we may wish to add the UUID to the shard server's config.collections in 3.6 when it exists. Alternatively, the mongos could include the shard key in the aggregation command it sends to the shards (though it would also need to send the UUID, to avoid a mismatch).
