Core Server / SERVER-46733

Consider appending TransientTransactionError labels to ConflictingOperationInProgress errors

Details

    • Sharding NYC

    Description

      As part of my work for SERVER-44409, I ran into many ConflictingOperationInProgress errors, e.g.:

      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         Foreground jstests/concurrency/fsm_workloads/CRUD_and_commands.js
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         Error: command failed: {
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         	"ok" : 0,
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         	"errmsg" : "unable to initialize targeter for write op for collection test18_fsmdb0.fsmcoll0 :: caused by :: No chunks were found for the collection",
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         	"code" : 117,
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         	"codeName" : "ConflictingOperationInProgress",
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.516+0000         	"operationTime" : Timestamp(1583777991, 80),
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         	"$clusterTime" : {
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         		"clusterTime" : Timestamp(1583777991, 80),
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         		"signature" : {
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         			"keyId" : NumberLong(0)
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         		}
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         	}
      [fsm_workload_test:CRUD_and_commands] 2020-03-09T18:19:52.517+0000         }

      I encountered this error both inside and outside of transactions. Per a discussion with jack.mulrow, an aggressive concurrency workload that runs dropCollection in parallel with CRUD ops in the sharding suites can run into this kind of error even though CRUD ops and dropCollection take conflicting locks.

      Can we consider adding a TransientTransactionError label when we encounter this error, to facilitate retrying? Conceptually, this seems like a similar case to the existing TransientTransactionError cases.
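
      To illustrate what the label would buy a caller, here is a minimal sketch of a retry loop, written as plain JavaScript with no server dependency. The helper names (hasTransientLabel, shouldRetryTransaction, runWithRetry) and the attempt limit are hypothetical; only the error code 117 and the label name come from this ticket. It assumes the error object exposes `code` and `errorLabels` fields the way command responses do.

```javascript
// Error code and label name are taken from the ticket; everything else is illustrative.
const CONFLICTING_OPERATION_IN_PROGRESS = 117;
const TRANSIENT_LABEL = "TransientTransactionError";

// True if the error (or command response) carries the transient label.
function hasTransientLabel(err) {
    return Array.isArray(err.errorLabels) && err.errorLabels.includes(TRANSIENT_LABEL);
}

// Without the proposed label, a caller has to special-case code 117.
// If the label were attached, the second clause would become unnecessary.
function shouldRetryTransaction(err) {
    return hasTransientLabel(err) || err.code === CONFLICTING_OPERATION_IN_PROGRESS;
}

// Runs txnBody, retrying on retryable errors until maxAttempts is reached.
// txnBody is a hypothetical callback that runs one full transaction attempt.
function runWithRetry(txnBody, maxAttempts = 3) {
    for (let attempt = 1; ; attempt++) {
        try {
            return txnBody(attempt);
        } catch (err) {
            if (attempt >= maxAttempts || !shouldRetryTransaction(err)) {
                throw err;
            }
        }
    }
}
```

      With the label in place, a workload could rely on the generic label check alone instead of enumerating error codes such as 117.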

          People

            backlog-server-sharding-nyc Backlog - Sharding NYC
            maria.vankeulen@mongodb.com Maria van Keulen