The solution of adding an option to skip temporary collections in dbhash and enabling it in the shard split passthroughs was approved following a meeting with dave.gotlieb@pearson.com and christopher.caplinger@mongodb.com. This decision was based on:
- Commands that use temporary collections are not necessarily incompatible with shard split. In general, shard split returns a TenantMigrationCommitted error code to the caller, which can then re-run the command on the recipient set. Therefore we want our test suites to cover such commands to prevent regressions.
- Any temporary collection copied over to the recipient set will be deleted when a node of the newly-created replica set is elected primary (temporary collections are dropped during stepUp). Meanwhile, the donor replica set is frozen by the access blockers. This guarantees that the dbhash of the recipient primary and the dbhash of the donor primary will differ for temporary collections. This has no impact on the user, as commands using these temporary collections should be retried.
- If there are operations or commands that would be impacted by the deletion, these should be individually prevented from running at the same time as a shard split (we don't have any example of such a command at this time).
- It is not possible to know whether a collection is temporary from the mongo shell; this is not exposed to the user. Adding an option to dbhash has a smaller footprint than exposing precisely which collections are temporary (which could be reused for unintended purposes).
- Enabling the option in the shard split passthroughs is reasonably clean and easy to track.
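For illustration, the dbhash hook in the passthrough might invoke the new option roughly like this. This is a sketch only: the option name skipTempCollections is a placeholder I'm using here, not a decided name, and the final spelling may differ.

```javascript
// Sketch: hypothetical dbHash option to exclude temporary collections.
// The option name "skipTempCollections" is a placeholder, not final.
const donorHash = donorDB.runCommand({dbHash: 1, skipTempCollections: true});
const recipientHash = recipientDB.runCommand({dbHash: 1, skipTempCollections: true});
assert.commandWorked(donorHash);
assert.commandWorked(recipientHash);
// With temporary collections excluded on both sides, the md5 fields
// can be compared even though stepUp dropped temp collections on the recipient.
assert.eq(donorHash.md5, recipientHash.md5);
```

This keeps the change contained in the dbhash check itself, rather than teaching every consistency hook which collections happen to be temporary.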
I have tried reproducing the issue (to test a fix) by creating a temporary collection from the ShardSplitDonorService before inserting the state document. I am adding the following code:
    .then([this, executor, primaryToken] {
        return AsyncTry([this] {
                   auto opCtxHolder = _cancelableOpCtxFactory->makeOperationContext(&cc());
                   auto opCtx = opCtxHolder.get();

                   auto dummyCol = NamespaceString{"mydb.tmp_coll1"};
                   AutoGetDb curDb(opCtx, dummyCol.dbName(), MODE_IX);

                   CollectionOptions options;
                   options.temp = true;

                   AutoGetCollection autoColl(opCtx, dummyCol, MODE_IX);

                   writeConflictRetry(opCtx, "InsertTmpCollection", dummyCol.ns(), [&]() {
                       WriteUnitOfWork wuow(opCtx);
                       invariant(curDb.ensureDbExists(opCtx)->createCollection(opCtx, dummyCol, options));
                       wuow.commit();
                   });

                   return repl::ReplClientInfo::forClient(opCtx->getClient()).getLastOp();
               })
            .until([](StatusWith<repl::OpTime> swOpTime) { return swOpTime.getStatus().isOK(); })
            .withBackoffBetweenIterations(kExponentialBackoff)
            .on(**executor, primaryToken)
            .then([](StatusWith<repl::OpTime> swOpTime) { return swOpTime.getStatus(); });
    })
This creates the tmp collection, and I can see the log line "Deferring table drop for collection" with "attr":{"namespace":"mydb.tmp_coll1","ident":"collection-40-7080495107915326535","commitTimestamp":{"$timestamp":{"t":1660940519,"i":9}}}. However, the collection is not actually deleted (maybe the window is too short, possibly because of the stable timestamp?). My plan would be:
- Assert through basic_shard_split.js that the tmp collection is deleted.
- Run the passthrough using the modified shard split service and assert that it fails.
- Modify the dbhash check (most likely by adding an option to skip temporary collections and using it) to make the passthrough pass.
- Merge the dbhash modification.
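For the first step, the assertion in basic_shard_split.js could look roughly like this. This is a sketch: the recipientPrimary handle stands in for however the test obtains the recipient's elected primary, and the namespace matches the mydb.tmp_coll1 collection created in the reproduction code above.

```javascript
// Sketch: after the split commits and the recipient elects a primary,
// the temp collection created by the modified donor service should be
// gone on the recipient (temp collections are dropped during stepUp).
// "recipientPrimary" is assumed to come from the test fixture.
const res = recipientPrimary.getDB("mydb")
                .runCommand({listCollections: 1, filter: {name: "tmp_coll1"}});
assert.commandWorked(res);
assert.eq(0,
          res.cursor.firstBatch.length,
          "temporary collection should have been dropped on recipient stepUp");
```

Using listCollections with a name filter avoids relying on any temp-collection introspection that, as noted above, is not exposed to the shell.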