Core Server / SERVER-46733

Consider appending TransientTransactionError labels to ConflictingOperationInProgress errors

    • Sharding NYC

      As part of my work for SERVER-44409, I ran into many ConflictingOperationInProgress errors, e.g.:

      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         Foreground jstests/concurrency/fsm_workloads/CRUD_and_commands.js
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         Error: command failed: {
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         	"ok" : 0,
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         	"errmsg" : "unable to initialize targeter for write op for collection test18_fsmdb0.fsmcoll0 :: caused by :: No chunks were found for the collection",
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         	"code" : 117,
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         	"codeName" : "ConflictingOperationInProgress",
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         	"operationTime" : Timestamp(1583777991, 80),
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         	"$clusterTime" : {
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         		"clusterTime" : Timestamp(1583777991, 80),
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         		"signature" : {
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         			"keyId" : NumberLong(0)
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         		}
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         	}
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         }

      I encountered this error both inside and outside of transactions. Per a discussion with jack.mulrow, an aggressive concurrency workload that runs dropCollection in parallel with CRUD ops in the sharding suites can hit this kind of error even though CRUD ops and dropCollection take conflicting locks.

      Can we consider adding a TransientTransactionError label when we encounter this error, to facilitate retrying? Conceptually, this seems like a similar case to the existing TransientTransactionError cases.
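
      To make the request concrete, here is a minimal sketch of how a caller could retry once the label is present. It is not the server's or any driver's implementation: the helper names (hasTransientLabel, withProposedLabel, runWithRetry) and the plain response objects are hypothetical, and only the label name and error code 117 come from the report above.

```javascript
// Values taken from the ticket; everything else here is illustrative.
const TRANSIENT_LABEL = "TransientTransactionError";
const CONFLICTING_OPERATION_IN_PROGRESS = 117;

// True when a command response carries the transient transaction label.
function hasTransientLabel(res) {
    return Array.isArray(res.errorLabels) && res.errorLabels.includes(TRANSIENT_LABEL);
}

// Hypothetical model of the proposed behavior: a failed command inside a
// transaction with code 117 gets the label appended. Returns a copy.
function withProposedLabel(res, inTransaction) {
    const qualifies = res.ok === 0 && inTransaction &&
        res.code === CONFLICTING_OPERATION_IN_PROGRESS && !hasTransientLabel(res);
    if (!qualifies) {
        return res;
    }
    const labels = (res.errorLabels || []).concat([TRANSIENT_LABEL]);
    return Object.assign({}, res, {errorLabels: labels});
}

// Re-run the whole transaction while the response is labeled transient,
// up to maxAttempts. runTxn is a caller-supplied function (an assumption)
// that returns a command response object.
function runWithRetry(runTxn, maxAttempts) {
    let res;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        res = runTxn(attempt);
        if (res.ok === 1 || !hasTransientLabel(res)) {
            return res;
        }
    }
    return res;
}
```

      With the label attached, a workload can retry the transaction generically instead of special-casing code 117. Without it, runWithRetry returns the failure after the first attempt.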

            Assignee:
            backlog-server-sharding-nyc [DO NOT USE] Backlog - Sharding NYC
            Reporter:
            maria.vankeulen@mongodb.com Maria van Keulen