[SERVER-70373] Invariant failure in case resharding metrics are not restored Created: 07/Oct/22  Updated: 29/Oct/23  Resolved: 03/Nov/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 5.0.0, 6.0.0
Fix Version/s: 5.0.14, 6.0.3

Type: Bug Priority: Major - P3
Reporter: Nandini Bhartiya Assignee: Nandini Bhartiya
Resolution: Fixed Votes: 0
Labels: sharding-nyc-subteam1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Problem/Incident
causes SERVER-71112 Fix count of resharding errors in Res... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.0
Sprint: Sharding 2022-10-17, Sharding NYC 2022-10-31, Sharding NYC 2022-11-14
Participants:
Linked BF Score: 55
Story Points: 3

 Description   

Refer to logs below:

[cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.122Z I  TEST     5551107 [main] "Running case","attr":{"test":"DropsTemporaryReshardingCollectionOnAbort","isAlsoDonor":true}
[cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.122Z I  STORAGE  20320   [main] "createCollection","attr":{"namespace":"sourcedb.sourcecollection","uuidDisposition":"provided","uuid":{"uuid":{"$uuid":"36ff9f19-59f3-4f94-bf9d-c06b781bf9db"}},"options":{"uuid":{"$uuid":"36ff9f19-59f3-4f94-bf9d-c06b781bf9db"}}}
[cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.143Z I  INDEX    20345   [main] "Index build: done building","attr":{"buildUUID":null,"collectionUUID":{"uuid":{"$uuid":"36ff9f19-59f3-4f94-bf9d-c06b781bf9db"}},"namespace":"sourcedb.sourcecollection","index":"_id_","ident":"index-4-5654127791289670109","collectionIdent":"collection-3-5654127791289670109","commitTimestamp":{"$timestamp":{"t":2,"i":6}}}
[cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.144Z I  RESHARD  5279506 [ReshardingRecipientService-2] "Transitioned resharding recipient state","attr":{"newState":"creating-collection","oldState":"awaiting-fetch-timestamp","namespace":"sourcedb.sourcecollection","collectionUUID":{"uuid":{"$uuid":"36ff9f19-59f3-4f94-bf9d-c06b781bf9db"}},"reshardingUUID":{"uuid":{"$uuid":"13016f1c-a1ee-4e5e-a6a4-8698776869f0"}}}
[cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.146Z I  REPL     5123007 [main] "Interrupting PrimaryOnlyService due to stepDown","attr":{"service":"ReshardingRecipientService","numInstances":1,"numOperationContexts":1}
[cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.150Z I  REPL     5123005 [ReshardingRecipientService-0] "Rebuilding PrimaryOnlyService due to stepUp","attr":{"service":"ReshardingRecipientService"}
[cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.158Z F  ASSERT   23081   [ReshardingRecipientService-0] "Invariant failure","attr":{"expr":"_currentOp","msg":"No operation is in progress","file":"src/mongo/db/s/resharding/resharding_metrics.cpp","line":541}
[cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.158Z F  ASSERT   23082   [ReshardingRecipientService-0] "\n\n***aborting after invariant() failure\n\n"
[cpp_unit_test:db_s_shard_server_test] | 2022-09-20T05:34:05.158Z F  CONTROL  6384300 [ReshardingRecipientService-0] "Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}

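In the unit test DropsTemporaryReshardingCollectionOnAbort, the invariant can fail if the resharding metrics restoration (via _startMetrics > _restoreMetricsWithRetry > _restoreMetrics) did not complete because of the second abort call. In that case _currentOp is not set, and the invariant in src/mongo/db/s/resharding/resharding_metrics.cpp (line 541) fails with "No operation is in progress".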
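
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the server code and not the actual patch: SketchMetrics, restore, onCompletionUnguarded, onCompletionGuarded and abortRecipient are hypothetical names, and a thrown exception stands in for the server's invariant(). It only shows one way an abort path can tolerate metrics that were never restored.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the resharding metrics object (assumption:
// a simple flag models whether _currentOp has been set).
class SketchMetrics {
public:
    // Models metrics restoration finishing after step-up.
    void restore() { _hasCurrentOp = true; }

    bool hasOperationInProgress() const { return _hasCurrentOp; }

    // Unguarded path: throws where the server would hit invariant().
    void onCompletionUnguarded() {
        if (!_hasCurrentOp) {
            throw std::logic_error("No operation is in progress");
        }
        _hasCurrentOp = false;
        ++_completed;
    }

    // Guarded path: reports false instead of failing when nothing was restored.
    bool onCompletionGuarded() {
        if (!_hasCurrentOp) {
            return false;
        }
        onCompletionUnguarded();
        return true;
    }

    int completedCount() const { return _completed; }

private:
    bool _hasCurrentOp = false;
    int _completed = 0;
};

// Models an abort that arrives either before or after restoration finished.
// Returns true only if a completion was actually recorded.
bool abortRecipient(SketchMetrics& metrics, bool restoreFinishedFirst) {
    if (restoreFinishedFirst) {
        metrics.restore();
    }
    return metrics.onCompletionGuarded();
}
```

With the guard, an abort that races ahead of restoration records nothing and returns false, while the unguarded call in the same state throws, which corresponds to the abort seen in the log above.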



 Comments   
Comment by Githook User [ 09/Nov/22 ]

Author:

{'name': 'nandinibhartiyaMDB', 'email': 'nandini.bhartiya@mongodb.com', 'username': 'nandinibhartiyaMDB'}

Message: SERVER-70373: Avoid invariant failure while aborting a resharding operation

(cherry picked from commit fed6f5b5c41b8f9ba5451192f5bf72eb7742c1d5)

SERVER-71112: Fix resharding failures count in unittest

(cherry picked from commit d8329fbe00da54b5f8b0f60889364e7920968756)
Branch: v5.0
https://github.com/mongodb/mongo/commit/ec65af7e2185c7e266bc6a04ea7777eea713fa58

Comment by Githook User [ 03/Nov/22 ]

Author:

{'name': 'nandinibhartiyaMDB', 'email': 'nandini.bhartiya@mongodb.com', 'username': 'nandinibhartiyaMDB'}

Message: SERVER-70373: Avoid invariant failure while aborting a resharding operation
Branch: v6.0
https://github.com/mongodb/mongo/commit/fed6f5b5c41b8f9ba5451192f5bf72eb7742c1d5

Generated at Thu Feb 08 06:16:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.