  1. Core Server
  2. SERVER-59694

Resharding Prohibited Commands Incorrectly Assumes Consistency In Config.Cache.Collections Collection

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 5.0.4, 5.1.0-rc0
    • Affects Version/s: 5.0.0
    • Component/s: Sharding
    • None
    • Fully Compatible
    • ALL
    • v5.0
    • Steps to reproduce:
      1. Apply the git diff below
      2. Run the test with the following command
      buildscripts/resmoke.py run jstests/sharding/resharding_prohibited_commands.js
      
      diff --git a/src/mongo/db/s/shard_server_catalog_cache_loader.cpp b/src/mongo/db/s/shard_server_catalog_cache_loader.cpp
      index 07f713c866..76024fe7bf 100644
      --- a/src/mongo/db/s/shard_server_catalog_cache_loader.cpp
      +++ b/src/mongo/db/s/shard_server_catalog_cache_loader.cpp
      @@ -27,6 +27,7 @@
        *    it in the license file.
        */
       
      +#include "mongo/util/time_support.h"
       #define MONGO_LOGV2_DEFAULT_COMPONENT ::mongo::logv2::LogComponent::kSharding
       
       #define LOGV2_FOR_CATALOG_REFRESH(ID, DLEVEL, MESSAGE, ...) \
      @@ -118,7 +119,7 @@ Status persistCollectionAndChangedChunks(OperationContext* opCtx,
       
           // Mark the chunk metadata as refreshing, so that secondaries are aware of refresh.
           update.setRefreshing(true);
      -
      +    sleepsecs(6);
           Status status =
               updateShardCollectionsEntry(opCtx,
                                           BSON(ShardCollectionType::kNssFieldName << nss.ns()),
      
    • Sharding 2021-09-06
    • 151
    • 1

      Background & Context
      The JS test resharding_prohibited_commands.js queries the `config.cache.collections` collection to verify that the committing decision has been relayed to the recipient.

      The test assumes that every query will find the cached collection document, with `reshardingFields.state` set either to 'committing' or to one of the other state values.

      However, unlike other collections, the internal `config.cache.collections` collection offers no such consistency guarantee. A query that runs after the old document is deleted but before the new one is inserted will find nothing.

      In the ShardServerCatalogCacheLoader, the function that refreshes the `config.cache.collections` collection first deletes the existing document and then inserts a new one when the epoch changes.

      Since resharding uses epoch changes to invalidate the shard's cache of the collection information, there is a window between the delete and the insert during which the test can read an invalid state (no collection document).

      The test is therefore assuming a level of consistency that the `config.cache.collections` collection does not provide.
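
      The window described above can be illustrated with a small in-memory model. This is only a sketch that runs under Node.js, not MongoDB code: `FakeCacheCollection`, `refreshOnEpochChange`, the namespace "test.coll", and the state value "some-earlier-state" are all hypothetical names chosen for illustration. The hook stands in for a concurrent reader that happens to query between the delete and the insert.

```javascript
// Hypothetical in-memory stand-in for config.cache.collections (not MongoDB code).
class FakeCacheCollection {
    constructor() {
        this.docs = new Map();
    }
    // Returns the document for the namespace, or null when none exists.
    findOne(ns) {
        return this.docs.has(ns) ? this.docs.get(ns) : null;
    }
    deleteOne(ns) {
        this.docs.delete(ns);
    }
    insertOne(doc) {
        this.docs.set(doc._id, doc);
    }
}

// Models a refresh after an epoch change as two separate steps:
// delete the old entry, then insert the new one. The optional hook runs
// between the two steps, where a concurrent reader could land.
function refreshOnEpochChange(coll, newDoc, betweenStepsHook) {
    coll.deleteOne(newDoc._id);
    if (betweenStepsHook) {
        betweenStepsHook();
    }
    coll.insertOne(newDoc);
}

const cache = new FakeCacheCollection();
cache.insertOne({
    _id: "test.coll",
    epoch: 1,
    reshardingFields: {state: "some-earlier-state"},  // placeholder state name
});

let observedDuringRefresh;
refreshOnEpochChange(
    cache,
    {_id: "test.coll", epoch: 2, reshardingFields: {state: "committing"}},
    () => {
        // A reader querying at this point finds no document at all.
        observedDuringRefresh = cache.findOne("test.coll");
    });
const observedAfterRefresh = cache.findOne("test.coll");

console.log(observedDuringRefresh);                        // null
console.log(observedAfterRefresh.reshardingFields.state);  // committing
```

      A reader that dereferences `reshardingFields` on the mid-refresh result would throw a TypeError, which is the failure mode the test can hit.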

      Proposed Solution
      A simple solution would be to update the find queries against the `config.cache.collections` collection in this test (3 in total) to first check whether the response is null (because no document matched the query) before using the value.

      Such as:

      return res && res.reshardingFields.state === "committing";
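
      As a slightly fuller sketch of the same idea, the null check could be factored into a predicate shared by all three queries. The helper name `isCommitting` and the sample documents are assumptions made for illustration, and the field checked is the `reshardingFields.state` property named in the background section. The example runs under Node.js; in the actual test the predicate would presumably be evaluated on the result of the find query inside whatever retry loop the test already uses.

```javascript
// Hypothetical helper: returns true only when a cache document was found and
// its reshardingFields.state is "committing". A null result (no document
// matched) or a document without reshardingFields yields false instead of
// throwing.
function isCommitting(res) {
    return Boolean(res && res.reshardingFields &&
                   res.reshardingFields.state === "committing");
}

// Sample inputs covering the cases a reader of the cache might see.
const samples = [
    null,                                                    // mid-refresh: nothing found
    {_id: "test.coll"},                                      // no reshardingFields
    {_id: "test.coll", reshardingFields: {state: "some-earlier-state"}},
    {_id: "test.coll", reshardingFields: {state: "committing"}},
];

const results = samples.map(isCommitting);
console.log(results);  // [ false, false, false, true ]
```

      Wrapping the expression in `Boolean(...)` is a design choice so that callers always get true or false rather than null or undefined.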
      

            Assignee:
            luis.osta@mongodb.com Luis Osta (Inactive)
            Reporter:
            luis.osta@mongodb.com Luis Osta (Inactive)
            Votes:
            0
            Watchers:
            3

              Created:
              Updated:
              Resolved: