  Core Server / SERVER-68495

Resharding a collection with a very large number of zones configured may stall indefinitely on the config server primary

    • Fully Compatible
    • ALL
    • v6.0, v5.0
    • Sharding 2022-08-08
    • 2

      The changes from b15ce76 as part of SERVER-58433 didn't remove the multi-update to the config.tags collection from the replica set transaction that the config server primary runs when transitioning to CoordinatorStateEnum::kCommitting. When the multi-update affects a very large number of documents, WiredTiger cache eviction can abort the storage transaction, so the replica set transaction may repeatedly fail with a WriteConflict error.
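As a rough illustration of why taking the multi-update out of the transaction helps, the sketch below rewrites the `ns` field in bounded batches, so no single write has to touch every zone document at once. It is not the change made for this ticket. The names `rename_zone_namespace` and `chunked` are hypothetical, and `tags` is assumed to be any object that exposes PyMongo's `Collection.find` and `Collection.update_many`.

```python
from typing import Any, Iterable, Iterator, List


def chunked(ids: Iterable[Any], batch_size: int) -> Iterator[List[Any]]:
    """Yield successive lists holding at most batch_size ids."""
    batch: List[Any] = []
    for doc_id in ids:
        batch.append(doc_id)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def rename_zone_namespace(tags: Any, temp_ns: str, final_ns: str,
                          batch_size: int = 1000) -> int:
    """Rewrite `ns` on zone documents in small batches (hypothetical helper).

    Assumption: `tags` behaves like a PyMongo Collection for config.tags and
    the calls run outside any multi-document transaction.
    """
    # Collect the ids up front so the loop does not re-scan documents it
    # has already rewritten.
    ids = [doc["_id"] for doc in tags.find({"ns": temp_ns}, {"_id": 1})]

    modified = 0
    for batch in chunked(ids, batch_size):
        # Keeping `ns: temp_ns` in the filter makes a retried batch a no-op.
        result = tags.update_many(
            {"_id": {"$in": batch}, "ns": temp_ns},
            {"$set": {"ns": final_ns}},
        )
        modified += result.modified_count
    return modified
```

Each `update_many` call dirties at most `batch_size` documents, which keeps the amount of uncommitted data small compared with the roughly 268,000 index keys rewritten per attempt in the logs below. The trade-off is that the rename is no longer atomic with the coordinator state change, so readers may briefly see a mix of the temporary and final namespaces.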

      [js_test:resharding_large_number_of_initial_chunks] c20042| 2022-08-01T16:55:28.987+00:00 I  WRITE    51803   [ReshardingCoordinatorService-2] "Slow query","attr":{"type":"update","ns":"config.tags","command":{"q":{"ns":"db.system.resharding.efb50f47-9623-4443-9f82-6ef1d7fe4c58"},"u":{"$set":{"ns":"db.foo"}},"hint":{"ns":1,"min":1},"multi":true,"upsert":false},"planSummary":"IXSCAN { ns: 1, min: 1 }","keysInserted":268146,"keysDeleted":268146,"numYields":0,"queryHash":"8B227D4B","ok":0,"errMsg":"-31800: oldest pinned transaction ID rolled back for eviction :: caused by :: WriteConflict error: this operation conflicted with another operation. Please retry your operation or multi-document transaction.","errName":"WriteConflict","errCode":112,"locks":{"FeatureCompatibilityVersion":{"acquireCount":{"w":4}},"ReplicationStateTransition":{"acquireCount":{"w":6}},"Global":{"acquireCount":{"w":4}},"Database":{"acquireCount":{"w":3}},"Collection":{"acquireCount":{"w":5}}},"flowControl":{"acquireCount":1,"timeAcquiringMicros":3},"storage":{"data":{"bytesRead":15401422,"timeReadingMicros":7467},"timeWaitingMicros":{"cache":846972}},"durationMillis":17347}
      ...
      [js_test:resharding_large_number_of_initial_chunks] c20042| 2022-08-01T17:04:57.892+00:00 I  WRITE    51803   [ReshardingCoordinatorService-2] "Slow query","attr":{"type":"update","ns":"config.tags","command":{"q":{"ns":"db.system.resharding.efb50f47-9623-4443-9f82-6ef1d7fe4c58"},"u":{"$set":{"ns":"db.foo"}},"hint":{"ns":1,"min":1},"multi":true,"upsert":false},"planSummary":"IXSCAN { ns: 1, min: 1 }","keysInserted":262668,"keysDeleted":262668,"numYields":0,"queryHash":"8B227D4B","ok":0,"errMsg":"-31800: oldest pinned transaction ID rolled back for eviction :: caused by :: WriteConflict error: this operation conflicted with another operation. Please retry your operation or multi-document transaction.","errName":"WriteConflict","errCode":112,"locks":{"FeatureCompatibilityVersion":{"acquireCount":{"w":9}},"ReplicationStateTransition":{"acquireCount":{"w":21}},"Global":{"acquireCount":{"w":9}},"Database":{"acquireCount":{"w":8}},"Collection":{"acquireCount":{"w":20}}},"flowControl":{"acquireCount":6,"timeAcquiringMicros":22},"storage":{"data":{"bytesRead":15692190,"timeReadingMicros":14707},"timeWaitingMicros":{"cache":514930}},"durationMillis":114614}
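For triage, the fields that matter in these "Slow query" lines (error code, keys rewritten, cache wait, duration) can be pulled out with a few lines of Python. This is a minimal sketch that assumes the line layout shown above, where the JSON object following `"attr":` runs to the end of the line. The function names `parse_slow_query_attr` and `summarize` are made up for this example.

```python
import json
from typing import Any, Dict


def parse_slow_query_attr(line: str) -> Dict[str, Any]:
    """Return the JSON object that follows '"attr":' in a log line."""
    marker = '"attr":'
    start = line.index(marker) + len(marker)
    # raw_decode does not skip leading whitespace, so do it by hand.
    while start < len(line) and line[start].isspace():
        start += 1
    attr, _end = json.JSONDecoder().raw_decode(line, start)
    return attr


def summarize(line: str) -> Dict[str, Any]:
    """Pick out the fields that show the repeated WriteConflict failure."""
    attr = parse_slow_query_attr(line)
    cache_wait = attr.get("storage", {}).get("timeWaitingMicros", {}).get("cache")
    return {
        "ns": attr.get("ns"),
        "errName": attr.get("errName"),
        "errCode": attr.get("errCode"),
        "keysInserted": attr.get("keysInserted"),
        "cacheWaitMicros": cache_wait,
        "durationMillis": attr.get("durationMillis"),
    }
```

Running `summarize` over every matching line of the test log makes the pattern easy to see: the same update on `config.tags` fails with `errCode` 112 each time while `durationMillis` varies between attempts.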
      

      https://evergreen.mongodb.com/lobster/build/4053c7c842f00af9f14444235c8bf1ab/test/62e80502c2ab681a664a7e36#bookmarks=0%2C4928&f~=100~oldest%20pinned%20transaction%20ID%20rolled%20back%20for%20eviction&l=1

            Assignee:
            max.hirschhorn@mongodb.com Max Hirschhorn
            Reporter:
            max.hirschhorn@mongodb.com Max Hirschhorn
