Core Server / SERVER-70373

Invariant failure in case resharding metrics are not restored

    • Fully Compatible
    • ALL
    • v5.0
    • Sharding 2022-10-17, Sharding NYC 2022-10-31, Sharding NYC 2022-11-14
    • 55
    • 3

      Refer to logs below:

      [cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.122Z I  TEST     5551107 [main] "Running case","attr":{"test":"DropsTemporaryReshardingCollectionOnAbort","isAlsoDonor":true}
      [cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.122Z I  STORAGE  20320   [main] "createCollection","attr":{"namespace":"sourcedb.sourcecollection","uuidDisposition":"provided","uuid":{"uuid":{"$uuid":"36ff9f19-59f3-4f94-bf9d-c06b781bf9db"}},"options":{"uuid":{"$uuid":"36ff9f19-59f3-4f94-bf9d-c06b781bf9db"}}}
      [cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.143Z I  INDEX    20345   [main] "Index build: done building","attr":{"buildUUID":null,"collectionUUID":{"uuid":{"$uuid":"36ff9f19-59f3-4f94-bf9d-c06b781bf9db"}},"namespace":"sourcedb.sourcecollection","index":"_id_","ident":"index-4-5654127791289670109","collectionIdent":"collection-3-5654127791289670109","commitTimestamp":{"$timestamp":{"t":2,"i":6}}}
      [cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.144Z I  RESHARD  5279506 [ReshardingRecipientService-2] "Transitioned resharding recipient state","attr":{"newState":"creating-collection","oldState":"awaiting-fetch-timestamp","namespace":"sourcedb.sourcecollection","collectionUUID":{"uuid":{"$uuid":"36ff9f19-59f3-4f94-bf9d-c06b781bf9db"}},"reshardingUUID":{"uuid":{"$uuid":"13016f1c-a1ee-4e5e-a6a4-8698776869f0"}}}
      [cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.146Z I  REPL     5123007 [main] "Interrupting PrimaryOnlyService due to stepDown","attr":{"service":"ReshardingRecipientService","numInstances":1,"numOperationContexts":1}
      [cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.150Z I  REPL     5123005 [ReshardingRecipientService-0] "Rebuilding PrimaryOnlyService due to stepUp","attr":{"service":"ReshardingRecipientService"}
      [cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.158Z F  ASSERT   23081   [ReshardingRecipientService-0] "Invariant failure","attr":{"expr":"_currentOp","msg":"No operation is in progress","file":"src/mongo/db/s/resharding/resharding_metrics.cpp","line":541}
      [cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.158Z F  ASSERT   23082   [ReshardingRecipientService-0] "\n\n***aborting after invariant() failure\n\n"
      [cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.158Z F  CONTROL  6384300 [ReshardingRecipientService-0] "Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}
      
      

      In the unit test DropsTemporaryReshardingCollectionOnAbort, the invariant on _currentOp ("No operation is in progress", resharding_metrics.cpp line 541) can fail if the restoration of the resharding metrics (via _startMetrics > _restoreMetricsWithRetry > _restoreMetrics) did not complete because of the second abort call.
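
The failure mode described above can be illustrated with a minimal, standalone sketch. It is not the server code: MetricsSketch, OperationMetrics, startMetrics, onStateTransition and tryOnStateTransition are hypothetical names invented for illustration, and an exception stands in for the server's invariant() so that the condition can be observed without aborting the process. The sketch assumes only what the log shows, namely that a later step expects an in-progress operation that is set up during metrics restoration.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for the per-operation metrics state.
struct OperationMetrics {
    long long documentsCopied = 0;
};

// Hypothetical, simplified metrics holder (not the real ReshardingMetrics class).
class MetricsSketch {
public:
    // Models the restore step: records that an operation is in progress.
    void restore(long long documentsCopied) {
        _currentOp = OperationMetrics{documentsCopied};
    }

    bool hasOperation() const { return _currentOp.has_value(); }
    int transitions() const { return _transitions; }

    // Models the unguarded check: throws where the server would hit invariant().
    void onStateTransition() {
        if (!_currentOp) {
            throw std::logic_error("No operation is in progress");
        }
        ++_transitions;
    }

    // One possible guarded variant: reports failure instead of asserting.
    bool tryOnStateTransition() {
        if (!_currentOp) {
            return false;
        }
        ++_transitions;
        return true;
    }

private:
    std::optional<OperationMetrics> _currentOp;
    int _transitions = 0;
};

// Models a restore that never runs because an abort was requested first.
// Returns true only if the metrics were actually restored.
bool startMetrics(MetricsSketch& metrics, bool abortRequested) {
    if (abortRequested) {
        return false;  // restoration skipped; no operation is recorded
    }
    metrics.restore(0);
    assert(metrics.hasOperation());  // postcondition of a completed restore
    return true;
}
```

With this sketch, calling onStateTransition() after startMetrics(metrics, true) throws, which mirrors the reported invariant failure, while tryOnStateTransition() returns false. Whether the actual fix should guard the access, or instead guarantee that restoration completes before the metrics are used, is not something this ticket text settles.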

            Assignee:
            nandini.bhartiya@mongodb.com Nandini Bhartiya
            Reporter:
            nandini.bhartiya@mongodb.com Nandini Bhartiya
            Votes:
            0
            Watchers:
            3

              Created:
              Updated:
              Resolved: