  1. Core Server
  2. SERVER-36921

Transaction lock timeout errors when running transactions concurrently with Logical Session Cache refreshes

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.1.4
    • Component/s: None
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL

      Description

      TransientTransactionErrors surfaced in multiple tests when running a custom jsCore passthrough suite with the Logical Session Cache refresh interval set to 100ms.

      The cleanest example of this error is in multi_statement_transaction, but that test produces no log lines to investigate.

      assert: command failed: {
      "errorLabels" : [
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.039+0000 		"TransientTransactionError"
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.039+0000 	],
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.039+0000 	"operationTime" : Timestamp(1535488481, 2),
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.039+0000 	"ok" : 0,
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.043+0000 	"errmsg" : "Unable to acquire lock '{6237343057549539649: Database, 1625657039122151745, config}' within a max lock request timeout of '5ms' milliseconds.",
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.043+0000 	"code" : 24,
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.043+0000 	"codeName" : "LockTimeout",
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.050+0000 	"$clusterTime" : {
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.051+0000 		"clusterTime" : Timestamp(1535488481, 2),
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.051+0000 		"signature" : {
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.052+0000 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.052+0000 			"keyId" : NumberLong(0)
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.052+0000 		}
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.052+0000 	}
      [js_test:multi_statement_transaction] 2018-08-28T20:34:41.053+0000 }
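
The response above carries the TransientTransactionError label, which marks a failure that a caller may retry. A minimal sketch of how a caller could act on that label is shown here. It works on plain dictionaries shaped like the response above; the names has_transient_label and run_with_retry, and the run_transaction callback, are hypothetical and not part of any MongoDB driver or of the fix for this ticket.

```python
def has_transient_label(response):
    """True if the response carries the TransientTransactionError label."""
    return "TransientTransactionError" in response.get("errorLabels", [])


def run_with_retry(run_transaction, max_attempts=3):
    """Call run_transaction() until it succeeds or fails non-transiently.

    run_transaction is a hypothetical callback that returns a dict shaped
    like the command response in this ticket ("ok", "errorLabels", ...).
    """
    last = None
    for _ in range(max_attempts):
        last = run_transaction()
        if last.get("ok") == 1:
            return last
        if not has_transient_label(last):
            break  # a non-transient failure is not retried
    raise RuntimeError(last.get("errmsg", "command failed"))
```

With the 100ms refresh interval described in this ticket, a retry may hit the same lock conflict again, so a bounded attempt count is used in the sketch.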
      

      The error also occurs in the collation and transactions_profiling_with_drops tests, which have plenty of logs to look at.

      After a cursory investigation of collation.js, it seems that a connection takes an exclusive write lock on config.system.sessions for an attempted update; the LogicalSessionCache thread then tries to take the same lock and fails. Check the highlighted lines on this Lobster log.
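
The contention pattern described above can be illustrated with a small, self-contained sketch. This is only an analogy using Python's threading.Lock, not the server's lock manager: one thread holds an exclusive lock while a second requester gives up after a short timeout (5ms here, mirroring the timeout in the error message). The function name simulate_contention and its parameters are hypothetical.

```python
import threading


def simulate_contention(hold_s=0.2, timeout_s=0.005):
    """Return True if the second requester got the lock within timeout_s."""
    lock = threading.Lock()              # stand-in for the exclusive lock
    holder_has_lock = threading.Event()
    release = threading.Event()

    def holder():
        # First party: takes the lock and holds it for up to hold_s seconds.
        with lock:
            holder_has_lock.set()
            release.wait(hold_s)

    thread = threading.Thread(target=holder)
    thread.start()
    holder_has_lock.wait()               # make sure the lock is already held

    # Second party: a timed lock request, analogous to a max lock timeout.
    acquired = lock.acquire(timeout=timeout_s)
    if acquired:
        lock.release()

    release.set()                        # let the holder finish early
    thread.join()
    return acquired
```

Calling simulate_contention() with the defaults models the failing case, where the hold time far exceeds the requester's timeout.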

              People

              Assignee:
              backlog-server-sharding Backlog - Sharding Team
              Reporter:
              blake.oler Blake Oler