[SERVER-44914] A shard receiving its first chunk should locally drop any indexes not on the donor Created: 02/Dec/19  Updated: 29/Oct/23  Resolved: 03/Jan/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.3.3

Type: Task Priority: Major - P3
Reporter: Jack Mulrow Assignee: Mihai Andrei
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-44918 Concurrency workload with index opera... Closed
Problem/Incident
causes SERVER-80703 Avoid traversing routing table in Mig... Closed
Backwards Compatibility: Fully Compatible
Sprint: Sharding 2019-12-16, Sharding 2019-12-30, Sharding 2020-01-13
Participants:

 Description   

From the design document:

When a shard moves away its last chunk, it does not drop the collection or any indexes locally, creating an "orphaned" collection. Mongos will only target the shards that own chunks according to its routing table, so an index on an orphaned collection may become inconsistent with the indexes or collection options on chunk owning shards.

To handle this, when a non-primary shard receives its first chunk for a collection, it will drop the collection if it exists locally before copying the donor's indexes.

The primary shard for a collection should always have correct indexes so it must not drop the collection when it receives its first chunk.

The recipient shard should verify its DatabaseShardingState is up to date before checking if it is the primary shard.



 Comments   
Comment by Githook User [ 20/Dec/19 ]

Author:

{'name': 'Mihai Andrei', 'email': 'mihai.andrei@mongodb.com'}

Message: SERVER-44914 A shard receiving its first chunk should locally drop any indexes not on the donor
Branch: master
https://github.com/mongodb/mongo/commit/4dc5811a2c427f35ae9dfe5cd110dcdf318bcf9d

Comment by Jack Mulrow [ 19/Dec/19 ]

To keep the primary shard the authoritative source for the collections on a database (unsharded and sharded) we changed this ticket so the recipient will only drop its indexes when receiving its first chunk, instead of dropping the collection.

Comment by Jack Mulrow [ 12/Dec/19 ]

After further discussion, we're dropping the requirement that the primary shard always has correct indexes when it has no chunks, so step 1) from my first comment is unnecessary.

Comment by Jack Mulrow [ 12/Dec/19 ]

Spoke offline with esha.maharishi and she pointed out that the recipient shard in a migration already refreshes its metadata before cloning documents, so we can simplify the implementation from my comment if we move the refresh after the recipient registers the migration. With that change, the recipient should be able to always use its latest CollectionMetadata to check if it owns chunks without needing another refresh.

She also pointed out that getting the primary shard id from the CatalogCache is unsafe even after a database refresh. Instead we can start storing the primary shard id in the DatabaseShardingState and use that, but since we're reconsidering the requirement that the primary shard has authoritative indexes, we might be able to throw out the primary check entirely.

Comment by Jack Mulrow [ 11/Dec/19 ]

esha.maharishi, can you look over the following proposed implementation?

The main idea is that because a shard can only be the recipient in one migration at a time, if it refreshes during the receiving logic, it should know definitively if it owns any chunks. I figure adding a refresh to each migration isn't great though, so I added some optimizations to avoid refreshes in most cases.

  1. Right before a recipient shard clones collection options and indexes from the donor shard, make it check if it is the primary shard for the migration namespace. If it is the primary, continue the migration as normal, otherwise proceed to step 2.
    1. Done by first checking if its DatabaseShardingState has a valid version for the migration database, refreshing from the config server through forceDatabaseRefresh() if it doesn't, and then checking the primaryId from the CachedDatabaseInfo returned by CatalogCache::getDatabase() (called by accessing the CatalogCache through the Grid service context decoration).
      1. This was based on what we talked about last week, but I'm a little iffy on database version, so let me know if I misunderstood.
  2. Make the recipient get the latest filtering metadata for the migration namespace, and only if there is valid metadata (i.e. not boost::none) and it says the recipient owns at least one chunk, continue with the migration as normal, otherwise proceed to step 3.
    1. The filtering metadata will be retrieved by calling getCurrentMetadataIfKnown() on the CollectionShardingRuntime service context decoration after taking an IS lock on the migration namespace
    2. Checking for chunk ownership will be done by either by checking the size of CollectionMetadata::getChunks() on the filtering metadata, or through a new method that calls through into the filtering metadata's ChunkManager
    3. This relies on the assumption that a shard can't move away its last chunk without either correctly updating its metadata or invalidating it (which is why I think SERVER-44598 shouldn't be a problem)
    4. Note that if the recipient's metadata says it owns no chunks, it still needs to refresh its metadata to handle the case where it received a chunk but hasn't refreshed yet
  3. Make the recipient refresh its metadata for the migration namespace from the config server
    1. Done by calling forceShardFilteringMetadataRefresh() after dropping the IS lock taken earlier
  4. Check if the recipient owns at least one chunk using the refreshed metadata. If the shard still does not own a chunk, make the recipient drop the migration namespace locally before continuing with the migration

Technically it's possible after refreshing in step 3 the recipient moves away its last chunk and incorrectly decides it doesn't need to drop the collection in step 4, but I don't think this is a problem because mongos shard versions index operations. If there's a concurrent index operation it must target the shard donating a chunk to this recipient, and the operation will either execute on that donor before this migration reaches its critical section, which will abort the migration, or it executes after and the command will fail with StaleConfig and be retried, and the retry should correctly target the recipient.

Generated at Thu Feb 08 05:07:21 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.