[SERVER-47372] config.cache collections can remain even after collection has been dropped Created: 06/Apr/20  Updated: 29/Oct/23  Resolved: 01/Jul/21

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 5.0.2, 5.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: Antonio Fuschetto
Resolution: Fixed Votes: 0
Labels: sharding-causes-bfs-hard
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File test.js    
Issue Links:
Backports
Depends
is depended on by SERVER-34632 config.chunks change to config.cache.... Backlog
Related
related to SERVER-17397 Dropping a Database or Collection in ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0
Steps To Reproduce:

see attached test.js

Sprint: Sharding EMEA 2021-05-31, Sharding EMEA 2021-06-14, Sharding EMEA 2021-06-28
Participants:
Case:
Linked BF Score: 20

 Description   

Stale config.cache collections can lead secondaries to believe that the dropped collection is still sharded.

Setup:

  • Collection test.user is sharded.
  • There is one shard; its current primary is nodeA.
  • The shard's nodeB has never heard about test.user, so it has never had any catalog cache entries for it.

1. _configsvrDrop deletes all config.chunks and config.collections entries for test.user.
2. nodeA steps down and nodeB becomes the new primary.
3. _configsvrDrop sends setShardVersion (0,0) to all shards. Since nodeB never had any entries, the setShardVersion is a no-op.
4. If a secondary read with a shard version arrives at nodeA, nodeA asks nodeB (the primary) to refresh via _flushRoutingTableCacheUpdates.
5. nodeB ends up calling getDatabase and loading all sharded collections under that database, but since test.user has already been dropped, it is skipped.
6. nodeB therefore returns early without asking the CatalogCacheLoader to reload. Because the catalog cache loader never performs the reload, the config.cache collections for test.user remain untouched.
7. nodeA gets an ok response from _flushRoutingTableCacheUpdates and then checks the version by reading config.cache.chunks. It finds that documents are still there and erroneously believes that the collection is still sharded.
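
The sequence above can be illustrated with a small toy model. This is only a sketch written for this ticket, not MongoDB code: the classes ConfigServer, Node and Shard and all of their methods are hypothetical names, and the behaviour of setShardVersion on a primary that does have entries (clearing the persisted cache) is an assumption made so that the no-stepdown case has something to compare against.

```python
"""Toy model of the drop/stepdown race described above; not MongoDB code."""
from dataclasses import dataclass, field

NS = "test.user"


@dataclass
class ConfigServer:
    # Authoritative metadata: namespace -> chunk documents.
    collections: dict = field(default_factory=dict)


@dataclass
class Node:
    name: str
    # In-memory catalog cache: namespaces this node has entries for.
    in_memory: set = field(default_factory=set)


@dataclass
class Shard:
    primary: Node
    secondary: Node
    # Stand-in for the persisted config.cache data, shared by all nodes.
    persisted_cache: dict = field(default_factory=dict)

    def step_down(self):
        self.primary, self.secondary = self.secondary, self.primary

    def set_shard_version_dropped(self, ns):
        # No-op when the primary never had an entry (step 3).
        # Assumption: otherwise the persisted cache would be cleared.
        if ns in self.primary.in_memory:
            self.primary.in_memory.discard(ns)
            self.persisted_cache.pop(ns, None)

    def flush_routing_table_cache_updates(self, config, ns):
        # A dropped namespace is skipped, so the loader never runs and
        # the persisted cache is left untouched (steps 5 and 6).
        if ns not in config.collections:
            return "ok"
        self.persisted_cache[ns] = list(config.collections[ns])
        self.primary.in_memory.add(ns)
        return "ok"

    def secondary_believes_sharded(self, config, ns):
        # The secondary asks the primary to refresh, then reads the
        # persisted cache to check the version (steps 4 and 7).
        self.flush_routing_table_cache_updates(config, ns)
        return bool(self.persisted_cache.get(ns))


def run_scenario(step_down=True):
    chunks = [{"min": 0, "max": 100}]
    config = ConfigServer({NS: list(chunks)})
    shard = Shard(Node("nodeA", {NS}), Node("nodeB"), {NS: list(chunks)})

    del config.collections[NS]            # step 1: config metadata deleted
    if step_down:
        shard.step_down()                 # step 2: nodeB becomes primary
    shard.set_shard_version_dropped(NS)   # step 3
    return shard.secondary_believes_sharded(config, NS)  # steps 4 to 7


if __name__ == "__main__":
    print("stale after stepdown:", run_scenario(step_down=True))
    print("stale without stepdown:", run_scenario(step_down=False))
```

In this model the stale state appears only when the stepdown lands between the metadata deletion and the setShardVersion, which matches the ordering of steps 1 to 3.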



 Comments   
Comment by Githook User [ 20/Jul/21 ]

Author:

{'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'}

Message: SERVER-47372 config.cache collections can remain even after collection has been dropped
Branch: v5.0
https://github.com/mongodb/mongo/commit/7bf25928a275bfbd22e85d6dcb8856f9cac4f62b

Comment by Githook User [ 30/Jun/21 ]

Author:

{'name': 'Antonio Fuschetto', 'email': 'antonio.fuschetto@mongodb.com', 'username': 'afuschetto'}

Message: SERVER-47372 config.cache collections can remain even after collection has been dropped
Branch: master
https://github.com/mongodb/mongo/commit/d635f733ad873fd469cf3e35e27452c45f1597c9
