[SERVER-51687] Suffix generation for idents in the durable catalog can conflict Created: 16/Oct/20  Updated: 06/Dec/22

Status: Backlog
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Gregory Wlodarek Assignee: Backlog - Storage Execution Team
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Storage Execution
Operating System: ALL
Steps To Reproduce:

Additional logging used for the example:

diff --git a/src/mongo/db/storage/durable_catalog_impl.cpp b/src/mongo/db/storage/durable_catalog_impl.cpp
index e49dcb106d..edb95cfea6 100644
--- a/src/mongo/db/storage/durable_catalog_impl.cpp
+++ b/src/mongo/db/storage/durable_catalog_impl.cpp
@@ -403,7 +403,9 @@ DurableCatalogImpl::DurableCatalogImpl(RecordStore* rs,
       _directoryPerDb(directoryPerDb),
       _directoryForIndexes(directoryForIndexes),
       _rand(_newRand()),
-      _engine(engine) {}
+      _engine(engine) {
+          logd("+++ DurableCatalogImpl::DurableCatalogImpl _rand is " + _rand);
+      }
 
 DurableCatalogImpl::~DurableCatalogImpl() {
     _rs = nullptr;
@@ -415,7 +417,9 @@ std::string DurableCatalogImpl::_newRand() {
 
 bool DurableCatalogImpl::_hasEntryCollidingWithRand() const {
     // Only called from init() so don't need to lock.
+    logd("+++ DurableCatalogImpl::_hasEntryCollidingWithRand");
     for (auto it = _catalogIdToEntryMap.begin(); it != _catalogIdToEntryMap.end(); ++it) {
+        logd("+++ Checking if _rand conflicts with " + it->second.ident);
         if (StringData(it->second.ident).endsWith(_rand))
             return true;
     }

Participants:

 Description   

We use DurableCatalogImpl::_rand as the suffix when generating new idents in the server. The '_rand' is generated at startup and remains constant for the uptime of the server. During the initialization of the durable catalog, we check whether an existing entry collides with the '_rand' we've generated. If this '_rand' is already in use by an existing ident, we generate a new one. However, we only check the contents of the '_catalogIdToEntryMap' for existing idents, and that map only contains the idents starting with "collection-", not the idents starting with "index-". It isn't guaranteed that all the indexes belonging to a collection share the same ident suffix as the collection, as demonstrated below.
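
To make the gap concrete, here is a minimal, self-contained sketch of a collision check that also looks at index idents. It is not server code: CatalogEntrySketch, endsWith() and hasIdentCollidingWithRand() are hypothetical names invented for illustration, and how the real catalog would expose a collection's index idents is left open.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a catalog entry: one collection ident plus the
// idents of the indexes that belong to it. Not a MongoDB type.
struct CatalogEntrySketch {
    std::string collectionIdent;
    std::vector<std::string> indexIdents;
};

// Returns true if 's' ends with 'suffix'.
bool endsWith(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
        s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns true if 'rand' is already used as a suffix by any collection ident
// OR any index ident, rather than by collection idents only.
bool hasIdentCollidingWithRand(const std::vector<CatalogEntrySketch>& entries,
                               const std::string& rand) {
    for (const auto& entry : entries) {
        if (endsWith(entry.collectionIdent, rand))
            return true;
        for (const auto& indexIdent : entry.indexIdents) {
            if (endsWith(indexIdent, rand))
                return true;
        }
    }
    return false;
}
```

With a check shaped like this, a suffix that survives only on an index ident would still be detected at startup.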
 
Start up mongod with an empty /data/db directory.

...
2020-10-16T09:45:51.632-04:00 I  STORAGE  [initandlisten] Opening WiredTiger {"config":"create,cache_size=31663M,session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000,close_scan_interval=10,close_handle_minimum=250),statistics_log=(wait=0),verbose=[recovery_progress,checkpoint_progress,compact_progress],debug_mode=(table_logging=true,),"}
...
2020-10-16T09:45:52.362-04:00 I  -        [initandlisten] +++ DurableCatalogImpl::DurableCatalogImpl _rand is -4816864174458775216
2020-10-16T09:45:52.363-04:00 I  -        [initandlisten] +++ DurableCatalogImpl::_hasEntryCollidingWithRand
...
2020-10-16T09:45:52.387-04:00 I  STORAGE  [initandlisten] createCollection {"namespace":"admin.system.version","uuidDisposition":"provided","uuid":{"uuid":{"$uuid":"af53ac04-602d-4739-98f2-aebb186f85a4"}},"options":{"uuid":{"$uuid":"af53ac04-602d-4739-98f2-aebb186f85a4"}}}
2020-10-16T09:45:52.428-04:00 I  INDEX    [initandlisten] Index build: done building {"buildUUID":null,"namespace":"admin.system.version","index":"_id_","commitTimestamp":null}
2020-10-16T09:45:52.428-04:00 I  REPL     [initandlisten] Setting featureCompatibilityVersion {"newVersion":"4.9"}
...

The '_rand' used is -4816864174458775216 and the contents of /data/db are the following:

-rw------- 1 gregory  20K Oct 16 09:46 collection-0--4816864174458775216.wt
-rw------- 1 gregory  20K Oct 16 09:46 collection-2--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:45 collection-4--4816864174458775216.wt
drwx------ 2 gregory 4.0K Oct 16 09:47 diagnostic.data/
-rw------- 1 gregory  20K Oct 16 09:46 index-1--4816864174458775216.wt
-rw------- 1 gregory  20K Oct 16 09:46 index-3--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:45 index-5--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:45 index-6--4816864174458775216.wt
drwx------ 2 gregory 4.0K Oct 16 09:45 journal/
-rw------- 1 gregory  20K Oct 16 09:46 _mdb_catalog.wt
-rw------- 1 gregory    6 Oct 16 09:45 mongod.lock
-rw------- 1 gregory 4.0K Oct 16 09:45 sizeStorer.wt
-rw------- 1 gregory  114 Oct 16 09:45 storage.bson
-rw------- 1 gregory   47 Oct 16 09:45 WiredTiger
-rw------- 1 gregory 4.0K Oct 16 09:45 WiredTigerHS.wt
-rw------- 1 gregory   21 Oct 16 09:45 WiredTiger.lock
-rw------- 1 gregory 1.3K Oct 16 09:46 WiredTiger.turtle
-rw------- 1 gregory  32K Oct 16 09:46 WiredTiger.wt

After creating the collection test.a, the contents of /data/db are the following:

-rw------- 1 gregory  20K Oct 16 09:46 collection-0--4816864174458775216.wt
-rw------- 1 gregory  20K Oct 16 09:46 collection-2--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:45 collection-4--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:47 collection-7--4816864174458775216.wt
drwx------ 2 gregory 4.0K Oct 16 09:47 diagnostic.data/
-rw------- 1 gregory  20K Oct 16 09:46 index-1--4816864174458775216.wt
-rw------- 1 gregory  20K Oct 16 09:46 index-3--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:45 index-5--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:45 index-6--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:47 index-8--4816864174458775216.wt
drwx------ 2 gregory 4.0K Oct 16 09:45 journal/
-rw------- 1 gregory  20K Oct 16 09:46 _mdb_catalog.wt
-rw------- 1 gregory    6 Oct 16 09:45 mongod.lock
-rw------- 1 gregory 4.0K Oct 16 09:45 sizeStorer.wt
-rw------- 1 gregory  114 Oct 16 09:45 storage.bson
-rw------- 1 gregory   47 Oct 16 09:45 WiredTiger
-rw------- 1 gregory 4.0K Oct 16 09:45 WiredTigerHS.wt
-rw------- 1 gregory   21 Oct 16 09:45 WiredTiger.lock
-rw------- 1 gregory 1.3K Oct 16 09:46 WiredTiger.turtle
-rw------- 1 gregory  32K Oct 16 09:46 WiredTiger.wt

From this we can deduce that the idents belonging to collection test.a are:

  • collection-7--4816864174458775216.wt
  • index-8--4816864174458775216.wt (_id index)

Now we restart the server to generate a new '_rand'.

...
2020-10-16T09:48:24.053-04:00 I  STORAGE  [initandlisten] Opening WiredTiger {"config":"create,cache_size=31663M,session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000,close_scan_interval=10,close_handle_minimum=250),statistics_log=(wait=0),verbose=[recovery_progress,checkpoint_progress,compact_progress],debug_mode=(table_logging=true,),"}
...
2020-10-16T09:48:25.023-04:00 I  STORAGE  [initandlisten] WiredTiger opened {"durationMillis":970}
...
2020-10-16T09:48:25.027-04:00 I  -        [initandlisten] +++ DurableCatalogImpl::DurableCatalogImpl _rand is 3852150022159100276
2020-10-16T09:48:25.029-04:00 I  -        [initandlisten] +++ DurableCatalogImpl::_hasEntryCollidingWithRand
2020-10-16T09:48:25.029-04:00 I  -        [initandlisten] +++ Checking if _rand conflicts with collection-0--4816864174458775216
2020-10-16T09:48:25.029-04:00 I  -        [initandlisten] +++ Checking if _rand conflicts with collection-2--4816864174458775216
2020-10-16T09:48:25.030-04:00 I  -        [initandlisten] +++ Checking if _rand conflicts with collection-4--4816864174458775216
2020-10-16T09:48:25.030-04:00 I  -        [initandlisten] +++ Checking if _rand conflicts with collection-7--4816864174458775216
...
2020-10-16T09:48:25.078-04:00 I  NETWORK  [listener] Waiting for connections {"port":27017,"ssl":"off"}

As we can see, the DurableCatalogImpl::_hasEntryCollidingWithRand() function only checks for conflicts against the idents starting with "collection-". The new '_rand' is now 3852150022159100276.

After creating an additional index on collection test.a, the /data/db has the following contents:

-rw------- 1 gregory  20K Oct 16 09:48 collection-0--4816864174458775216.wt
-rw------- 1 gregory  36K Oct 16 09:49 collection-2--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:48 collection-4--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:49 collection-7--4816864174458775216.wt
drwx------ 2 gregory 4.0K Oct 16 09:49 diagnostic.data/
-rw------- 1 gregory 4.0K Oct 16 09:49 index-0-3852150022159100276.wt
-rw------- 1 gregory  20K Oct 16 09:48 index-1--4816864174458775216.wt
-rw------- 1 gregory  36K Oct 16 09:49 index-3--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:48 index-5--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:49 index-6--4816864174458775216.wt
-rw------- 1 gregory 4.0K Oct 16 09:48 index-8--4816864174458775216.wt
drwx------ 2 gregory 4.0K Oct 16 09:48 journal/
-rw------- 1 gregory  36K Oct 16 09:49 _mdb_catalog.wt
-rw------- 1 gregory    6 Oct 16 09:48 mongod.lock
-rw------- 1 gregory  36K Oct 16 09:48 sizeStorer.wt
-rw------- 1 gregory  114 Oct 16 09:45 storage.bson
-rw------- 1 gregory   47 Oct 16 09:45 WiredTiger
-rw------- 1 gregory 4.0K Oct 16 09:48 WiredTigerHS.wt
-rw------- 1 gregory   21 Oct 16 09:45 WiredTiger.lock
-rw------- 1 gregory 1.3K Oct 16 09:49 WiredTiger.turtle
-rw------- 1 gregory  68K Oct 16 09:49 WiredTiger.wt

So now, collection test.a owns the following idents:

  • collection-7--4816864174458775216.wt
  • index-8--4816864174458775216.wt (_id index)
  • index-0-3852150022159100276.wt (newly created index)

Upon restarting the server one last time, we see that DurableCatalogImpl::_hasEntryCollidingWithRand() only checks against the idents starting with "collection-":

...
2020-10-16T09:49:52.625-04:00 I  STORAGE  [initandlisten] WiredTiger opened {"durationMillis":1110}
...
2020-10-16T09:49:52.630-04:00 I  -        [initandlisten] +++ DurableCatalogImpl::DurableCatalogImpl _rand is -8673726600649296817
2020-10-16T09:49:52.632-04:00 I  -        [initandlisten] +++ DurableCatalogImpl::_hasEntryCollidingWithRand
2020-10-16T09:49:52.632-04:00 I  -        [initandlisten] +++ Checking if _rand conflicts with collection-0--4816864174458775216
2020-10-16T09:49:52.633-04:00 I  -        [initandlisten] +++ Checking if _rand conflicts with collection-2--4816864174458775216
2020-10-16T09:49:52.633-04:00 I  -        [initandlisten] +++ Checking if _rand conflicts with collection-4--4816864174458775216
2020-10-16T09:49:52.633-04:00 I  -        [initandlisten] +++ Checking if _rand conflicts with collection-7--4816864174458775216
...
2020-10-16T09:49:52.694-04:00 I  NETWORK  [listener] Waiting for connections {"port":27017,"ssl":"off"}

Because we only check for conflicts against idents starting with "collection-", nothing prevents a later restart from generating '_rand' 3852150022159100276 again, even though index-0-3852150022159100276 already uses it. Statistically speaking, this has a very low probability of happening.
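
The collision and its likelihood can be sketched as follows. Everything here is illustrative: makeIdent(), firstIdentCollides() and reuseProbabilityPerRestart() are hypothetical helpers, the "<kind>-<counter>-<rand>" shape and the counter starting again at 0 after a restart are inferred from the directory listings above, and the probability assumes '_rand' is drawn uniformly from 64 bits, as the logged values suggest.

```cpp
#include <cmath>
#include <set>
#include <string>

// Hypothetical helper: builds an ident in the "<kind>-<counter>-<rand>" shape
// seen in the listings above, e.g. "index-0-3852150022159100276".
std::string makeIdent(const std::string& kind, long long counter, const std::string& rand) {
    return kind + "-" + std::to_string(counter) + "-" + rand;
}

// True if the first ident of 'kind' created after a restart (counter 0, an
// assumption based on the listings) already exists among 'existing' idents.
bool firstIdentCollides(const std::set<std::string>& existing,
                        const std::string& kind,
                        const std::string& rand) {
    return existing.count(makeIdent(kind, 0, rand)) > 0;
}

// Chance that a single fresh draw equals one of 'uncheckedSuffixes' distinct
// suffixes that only appear on index idents, assuming a uniform 64-bit draw.
double reuseProbabilityPerRestart(unsigned uncheckedSuffixes) {
    return uncheckedSuffixes / std::ldexp(1.0, 64);
}
```

Under these assumptions, a restart that happened to draw 3852150022159100276 again and then created an index first would try to reuse the name index-0-3852150022159100276. With a single unchecked suffix, the chance of that draw is 1 in 2^64 per restart, which is consistent with the "very low probability" assessment.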



 Comments   
Comment by Connie Chen [ 22/Oct/20 ]

We believe the probability of someone hitting this is extremely low.
