[SERVER-77911] Shard merge importing collection without timestamp can trigger invariant failure in HistoricalCatalogIdTracker. Created: 08/Jun/23  Updated: 29/Oct/23  Resolved: 14/Aug/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.1.0-rc0

Type: Bug Priority: Major - P3
Reporter: Suganthi Mani Assignee: Suganthi Mani
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Assigned Teams:
Serverless
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Server Serverless 2023-07-10, Server Serverless 2023-07-24, Server Serverless 2023-08-07, Server Serverless 2023-08-21
Participants:
Linked BF Score: 105

 Description   

Shard merge performs non-timestamped catalog writes when importing collections (i.e., collections created during the physical cloning phase), but performs timestamped catalog writes when dropping those collections. This mixed-mode timestamp usage can trigger an invariant failure in HistoricalCatalogIdTracker, which was introduced to support point-in-time catalog lookups (PM-2218).

 

Noting the problematic sequence here.

  1. Fixture starts migrationId1.
  2. Recipient imports the donor collection, say <tenantId1>_db.coll, untimestamped.
    a) A <catalogRecordID<X>, boost::none TS> entry gets inserted into HistoricalCatalogIdTracker for this nss.
  3. migrationId1 gets committed.
  4. Fixture drops the collection <tenantId1>_db.coll, but with timestamped writes, with dropTS as TS(100).
    a) A <boost::none, TS(100)> entry gets inserted into HistoricalCatalogIdTracker for this nss.
  5. Fixture starts another migration, migrationId2.
  6. Recipient imports the donor collection, say <tenantId1>_db.coll, untimestamped.
    a) This skips inserting an entry into HistoricalCatalogIdTracker because the HistoricalCatalogId list is non-empty for this nss.
  7. When the oldest timestamp is no longer < TS(100), TimestampMonitor deletes the expired 2-a) and 4-a) entries from HistoricalCatalogIdTracker.
  8. migrationId2 gets committed.
  9. Fixture drops the collection <tenantId1>_db.coll, again with timestamped writes.
    a) This triggers the invariant failure because the HistoricalCatalogId list is empty for this nss.
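
The sequence above can be replayed against a small toy model. This is only a sketch to make the ordering problem concrete, not the server's implementation: ToyTracker, InvariantFailure, replay, the catalog IDs 1 and 2, and the second drop timestamp 200 are all hypothetical, and the cleanup rule is a simplified assumption based on step 7.

```python
from typing import Dict, List, NamedTuple, Optional


class InvariantFailure(Exception):
    """Stands in for the server invariant; hypothetical name."""


class Entry(NamedTuple):
    catalog_id: Optional[int]  # None models a drop (boost::none catalog id)
    ts: Optional[int]          # None models an untimestamped write


class ToyTracker:
    """Toy stand-in for HistoricalCatalogIdTracker (assumed behaviour only)."""

    def __init__(self) -> None:
        self.history: Dict[str, List[Entry]] = {}

    def create(self, nss: str, catalog_id: int, ts: Optional[int] = None) -> None:
        entries = self.history.setdefault(nss, [])
        if ts is None and entries:
            # Step 6-a: an untimestamped create is skipped when history exists.
            return
        entries.append(Entry(catalog_id, ts))

    def drop(self, nss: str, ts: int) -> None:
        entries = self.history.get(nss)
        if not entries:
            # Step 9-a: a timestamped drop expects existing history.
            raise InvariantFailure(f"invariant failure: no history for {nss} at drop TS({ts})")
        entries.append(Entry(None, ts))

    def cleanup(self, oldest_ts: int) -> None:
        # Step 7 (simplified): entries at or before oldest_ts expire; the
        # latest expired entry is kept only if it is a create.
        for nss in list(self.history):
            entries = self.history[nss]
            expired = [e for e in entries if e.ts is None or e.ts <= oldest_ts]
            live = [e for e in entries if e.ts is not None and e.ts > oldest_ts]
            if expired and expired[-1].catalog_id is not None:
                live.insert(0, expired[-1])
            if live:
                self.history[nss] = live
            else:
                del self.history[nss]


def replay() -> str:
    tracker = ToyTracker()
    nss = "tenantId1_db.coll"
    tracker.create(nss, catalog_id=1)   # step 2: untimestamped import
    tracker.drop(nss, ts=100)           # step 4: timestamped drop
    tracker.create(nss, catalog_id=2)   # step 6: second import, skipped
    tracker.cleanup(oldest_ts=100)      # step 7: both entries expire
    try:
        tracker.drop(nss, ts=200)       # step 9: history is now empty
    except InvariantFailure as exc:
        return str(exc)
    return "no failure"


if __name__ == "__main__":
    print(replay())
```

In this model the second import leaves no trace in the history, so once cleanup removes the entries from the first migration, the second timestamped drop finds an empty list and fails.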

 



 Comments   
Comment by Githook User [ 14/Aug/23 ]

Author: Suganthi Mani <suganthi.mani@mongodb.com> (smani87)

Message: SERVER-77911 HistoricalCatalogIdTracker handles mixed mode catalog writes for shard merge.
Branch: master
https://github.com/mongodb/mongo/commit/b3653dc5d0bc4940971bc13591341341e0b4d4fc

Comment by Suganthi Mani [ 14/Aug/23 ]

Just for the ticket watchers, filed SERVER-79982 for the data inconsistency bug.
