[SERVER-81028] Incorrect $listCatalog behavior in presence of a concurrent collection rename in v7.0 Created: 13/Sep/23  Updated: 11/Dec/23  Resolved: 16/Oct/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 7.0.0
Fix Version/s: 7.1.1, 7.2.0-rc0, 7.0.4, 6.0.13

Type: Bug Priority: Major - P3
Reporter: Craven Huynh Assignee: Jordi Olivares Provencio
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
causes SERVER-83108 $listCatalog doesn't respect readConcern Closed
Assigned Teams:
Storage Execution EMEA
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.1, v7.0, v6.0, v5.0, v4.4
Steps To Reproduce:

To reproduce outside of Mongosync:

  • Launch and connect to a v7.0 replica set (I used mlaunch)

m 7.0.0
mlaunch --replicaset --nodes 3
mongosh mongodb://localhost:27017,localhost:27018,localhost:27019

Note: you will need two mongosh sessions for this repro.

  • Create collection and index

use test
db.createCollection("coll")
db.coll.createIndex({x:1})
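
The aggregate command in the next step needs the UUID of the collection just created. A minimal sketch of how it could be looked up in mongosh follows. The helper name getCollectionUuid is hypothetical and not part of the original report; it assumes the standard db.getCollectionInfos() shell helper, whose entries carry the UUID under info.uuid.

```javascript
// Hypothetical helper (not from the original report): returns the UUID of a
// collection so it can be passed as "collectionUUID" in the aggregate command.
function getCollectionUuid(database, collName) {
  // getCollectionInfos wraps the listCollections command; each returned entry
  // describes one collection and stores its UUID under info.uuid.
  const infos = database.getCollectionInfos({ name: collName });
  if (infos.length === 0) {
    throw new Error("collection not found: " + collName);
  }
  return infos[0].info.uuid;
}

// In mongosh, after `use test`:
//   const collUuid = getCollectionUuid(db, "coll");
```

The returned value can then replace the hard-coded new UUID("...") in the commands below.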

  • On mongosh 1, run the aggregation pipeline to retrieve the index in an infinite loop (replace collectionUUID with actual value)

stages1 = [{ "$group": { "_id": null, "catalogEntry": { "$push": "$$ROOT" }, "allShards": { "$addToSet": "$shard" } } }, { "$unwind": { "path": "$catalogEntry" } }, { "$unwind": { "path": "$catalogEntry.md.indexes" } }, { "$unset": "catalogEntry.md.indexes.spec.ns" }, { "$match": { "catalogEntry.md.indexes.ready": true } }, { "$set": { "catalogEntry.md.indexes.spec.sparse": { "$cond": { "if": { "$eq": ["missing", { "$type": "$catalogEntry.md.indexes.spec.sparse" }] }, "then": "$$REMOVE", "else": { "$toBool": "$catalogEntry.md.indexes.spec.sparse" } } }, "catalogEntry.md.indexes.spec.expireAfterSeconds": { "$cond": { "if": { "$eq": ["missing", { "$type": "$catalogEntry.md.indexes.spec.expireAfterSeconds" }] }, "then": "$$REMOVE", "else": { "$toInt": "$catalogEntry.md.indexes.spec.expireAfterSeconds" } } }, "catalogEntry.md.indexes.spec.bits": { "$cond": { "if": { "$eq": ["missing", { "$type": "$catalogEntry.md.indexes.spec.bits" }] }, "then": "$$REMOVE", "else": { "$toInt": "$catalogEntry.md.indexes.spec.bits" } } } } }, { "$group": { "_id": "$catalogEntry.md.indexes.spec.name", "shards": { "$push": "$catalogEntry.shard" }, "specs": { "$push": { "$objectToArray": { "$ifNull": [ "$catalogEntry.md.indexes.spec", {}] } } }, "allShards": { "$first": "$allShards" } } }]

stages2 = { "$project": { "numShards":{ "$size": "$allShards" }, "missingFromShards": { "$setDifference": [ "$allShards", "$shards"] }, "spec": { "$arrayToObject":{ "$first": "$specs" }}, "inconsistentOptions": { "$setDifference": [ { "$reduce": { "input": "$specs", "initialValue":{ "$arrayElemAt": [ "$specs", 0] }, "in": { "$setUnion": [ "$$value", "$$this"] } } }, { "$reduce": { "input": "$specs", "initialValue":{ "$arrayElemAt": [ "$specs", 0] }, "in": { "$setIntersection": [ "$$value", "$$this"] } } }] } } }

res = db.runCommand({"aggregate": "coll", "collectionUUID": new UUID("85c8ab7b-d107-4768-913e-19f0b107eb2a"), "readConcern": {"level": "majority", "afterClusterTime": Timestamp({ t: 1693905823, i: 20 })}, cursor: {}, pipeline: [{"$listCatalog": {}}].concat(stages1).concat([{"$match": {_id: "x_1"} }]).concat([stages2])}).cursor.firstBatch.length

while (res) { res = db.runCommand({"aggregate": "coll", "collectionUUID": new UUID("85c8ab7b-d107-4768-913e-19f0b107eb2a"), "readConcern": {"level": "majority", "afterClusterTime": Timestamp({ t: 1693905823, i: 20 })}, cursor: {}, pipeline: [{"$listCatalog": {}}].concat(stages1).concat([{"$match": {_id: "x_1"} }]).concat([stages2])}).cursor.firstBatch.length }

  • On mongosh 2, rename the collection

use test
db.coll.renameCollection("coll2")

The loop in mongosh 1 will exit. In v6.0.8, the loop exits with a CollectionUUIDMismatch error, as expected, but in v7.0.0 it exits because the server returns an empty list of indexes.
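
To make it explicit which of the two outcomes ended the loop, each attempt could be wrapped in a small classifier. This is only a sketch: classifyAttempt and the outcome labels are hypothetical names, and it assumes the shell either throws an error carrying a codeName field or returns a reply with ok and codeName fields.

```javascript
// Hypothetical helper (not from the original report): runs one attempt and
// reports how it ended. `runOnce` is any function that returns the raw reply
// of the aggregate command, e.g. () => db.runCommand({...}).
function classifyAttempt(runOnce) {
  let reply;
  try {
    reply = runOnce();
  } catch (err) {
    // Assumption: a thrown server error exposes the error name as codeName.
    return err.codeName === "CollectionUUIDMismatch" ? "uuid-mismatch" : "other-error";
  }
  if (reply.ok !== 1) {
    // Assumption: a non-throwing shell returns the error document instead.
    return reply.codeName === "CollectionUUIDMismatch" ? "uuid-mismatch" : "other-error";
  }
  // A successful reply with no documents is the problematic v7.0.0 outcome.
  return reply.cursor.firstBatch.length === 0 ? "empty-result" : "index-found";
}
```

Looping while the result is "index-found" and printing the final label would show "uuid-mismatch" for the expected behavior and "empty-result" for the behavior reported here.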

 

Sprint: Execution EMEA Team 2023-10-16, Execution EMEA Team 2023-10-30
Participants:

 Description   

Mongosync uses $listCatalog to list indexes belonging to collections as follows:

stages1 = [{ "$group": { "_id": null, "catalogEntry": { "$push": "$$ROOT" }, "allShards": { "$addToSet": "$shard" } } }, { "$unwind": { "path": "$catalogEntry" } }, { "$unwind": { "path": "$catalogEntry.md.indexes" } }, { "$unset": "catalogEntry.md.indexes.spec.ns" }, { "$match": { "catalogEntry.md.indexes.ready": true } }, { "$set": { "catalogEntry.md.indexes.spec.sparse": { "$cond": { "if": { "$eq": ["missing", { "$type": "$catalogEntry.md.indexes.spec.sparse" }] }, "then": "$$REMOVE", "else": { "$toBool": "$catalogEntry.md.indexes.spec.sparse" } } }, "catalogEntry.md.indexes.spec.expireAfterSeconds": { "$cond": { "if": { "$eq": ["missing", { "$type": "$catalogEntry.md.indexes.spec.expireAfterSeconds" }] }, "then": "$$REMOVE", "else": { "$toInt": "$catalogEntry.md.indexes.spec.expireAfterSeconds" } } }, "catalogEntry.md.indexes.spec.bits": { "$cond": { "if": { "$eq": ["missing", { "$type": "$catalogEntry.md.indexes.spec.bits" }] }, "then": "$$REMOVE", "else": { "$toInt": "$catalogEntry.md.indexes.spec.bits" } } } } }, { "$group": { "_id": "$catalogEntry.md.indexes.spec.name", "shards": { "$push": "$catalogEntry.shard" }, "specs": { "$push": { "$objectToArray": { "$ifNull": [ "$catalogEntry.md.indexes.spec", {}] } } }, "allShards": { "$first": "$allShards" } } }]

stages2 = { "$project": { "numShards": { "$size": "$allShards" }, "missingFromShards": { "$setDifference": [ "$allShards", "$shards"] }, "spec": { "$arrayToObject": { "$first": "$specs" } }, "inconsistentOptions": { "$setDifference": [ { "$reduce": { "input": "$specs", "initialValue": { "$arrayElemAt": [ "$specs", 0] }, "in": { "$setUnion": [ "$$value", "$$this"] } } }, { "$reduce": { "input": "$specs", "initialValue": { "$arrayElemAt": [ "$specs", 0] }, "in": { "$setIntersection": [ "$$value", "$$this"] } } }] } } }

res = db.runCommand({"aggregate": "coll", "collectionUUID": new UUID("85c8ab7b-d107-4768-913e-19f0b107eb2a"), "readConcern": {"level": "majority", "afterClusterTime": Timestamp({ t: 1693905823, i: 20 })}, cursor: {}, pipeline: [{"$listCatalog": {}}].concat(stages1).concat([{"$match": {_id: "x_1"} }]).concat([stages2])})

If the collection gets renamed while the aggregation pipeline is running, Mongosync expects to get a CollectionUUIDMismatch error from the server. That is the behavior in v6.0.8.

In v7.0, the result of the aggregation pipeline is actually empty, which causes Mongosync to incorrectly believe that there is no index on the collection.



 Comments   
Comment by Githook User [ 30/Nov/23 ]

Author:

{'name': 'Jordi Olivares Provencio', 'email': 'jordi.olivares-provencio@mongodb.com', 'username': 'jordiolivares'}

Message: SERVER-81028 Fix $listCatalog behavior during concurrent collection renames
Branch: v6.0
https://github.com/mongodb/mongo/commit/4975f8a8df467f2afced12cac74567994e711158

Comment by Jordi Olivares Provencio [ 29/Nov/23 ]

Note that $listCatalog is only present on 6.0+. Cancelling backport to 5.0 and 4.4 as this cannot be used there.

Comment by Jordi Olivares Provencio [ 14/Nov/23 ]

Requesting backports to 6.0 and older since the query can return invalid results in the presence of concurrent renames.

Comment by Huan Li [ 09/Nov/23 ]

Please be aware that, since this change, Mongosync has been experiencing many test failures with the following error message from the server: "failed to apply create in a causally consistent session: (CollectionUUIDMismatch) PlanExecutor error during aggregation :: caused by :: Collection UUID does not match that specified"

I have linked one of the Mongosync BF tickets, REP-3580, above.

I think we should take another look at this fix.  craven.huynh@mongodb.com is out this week, please let us know if you want help with the investigation.

Comment by Githook User [ 08/Nov/23 ]

Author:

{'name': 'Jordi Olivares Provencio', 'email': 'jordi.olivares-provencio@mongodb.com', 'username': 'jordiolivares'}

Message: SERVER-81028 Fix $listCatalog behavior during concurrent collection renames
Branch: v7.0
https://github.com/mongodb/mongo/commit/0df10b4a3806b8a249129fa082df91d2705c1598

Comment by Githook User [ 06/Nov/23 ]

Author:

{'name': 'Jordi Olivares Provencio', 'email': 'jordi.olivares-provencio@mongodb.com', 'username': 'jordiolivares'}

Message: SERVER-81028 Fix $listCatalog behavior during concurrent collection renames
Branch: v7.1
https://github.com/mongodb/mongo/commit/0cd223cafc689be45f85b17057a323cc2768f956

Comment by Githook User [ 16/Oct/23 ]

Author:

{'name': 'Jordi Olivares Provencio', 'email': 'jordi.olivares-provencio@mongodb.com', 'username': 'jordiolivares'}

Message: SERVER-81028 Fix $listCatalog behavior during concurrent collection renames
Branch: master
https://github.com/mongodb/mongo/commit/d607a3e9a5eb9a8980dd2fc2dec6e6069b1dba64

Comment by Jordi Olivares Provencio [ 13/Oct/23 ]

Requesting backports to 7.0 and 7.1 since it affects those versions.

Comment by Jordi Olivares Provencio [ 11/Oct/23 ]

After running bisect on this, I traced the issue to the point-in-time (PIT) catalog lookup project (SERVER-55505). I believe the issue stems from how the $listCatalog stage is implemented. It currently reads the durable catalog without going through the PIT catalog, so we can end up in a case where the UUID mismatch checks have passed and the DurableCatalog is then accessed at a different point in time, showing no entries for the collection.
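
The suspected race can be illustrated with a toy model. This is not the server's code: the two plain objects stand in for the catalog before and after the rename, and every name below (beforeRename, afterRename, checkUuid, listIndexes, racyRead, consistentRead) is hypothetical. It only shows why checking the UUID against one view and reading indexes from a later view can yield an empty result instead of an error.

```javascript
// Toy model only: a "catalog view" maps a namespace to its entry.
const beforeRename = { "test.coll": { uuid: "U1", indexes: ["_id_", "x_1"] } };
const afterRename = { "test.coll2": { uuid: "U1", indexes: ["_id_", "x_1"] } };

// Throws if the namespace is missing from the view or has a different UUID.
function checkUuid(view, ns, uuid) {
  const entry = view[ns];
  if (!entry || entry.uuid !== uuid) {
    const err = new Error("Collection UUID does not match that specified");
    err.codeName = "CollectionUUIDMismatch";
    throw err;
  }
}

// Returns the indexes recorded for the namespace, or nothing if it is absent.
function listIndexes(view, ns) {
  const entry = view[ns];
  return entry ? entry.indexes : [];
}

// Racy variant: the check sees the old view, the read sees the new one.
function racyRead(ns, uuid) {
  checkUuid(beforeRename, ns, uuid);
  return listIndexes(afterRename, ns);
}

// Consistent variant: the check and the read use the same view.
function consistentRead(view, ns, uuid) {
  checkUuid(view, ns, uuid);
  return listIndexes(view, ns);
}
```

In this model, racyRead("test.coll", "U1") passes the check and returns an empty list, while consistentRead either returns both indexes (old view) or throws CollectionUUIDMismatch (new view).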

Generated at Thu Feb 08 06:45:16 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.