[CDRIVER-4252] "commitTransaction retry fails on new mongos" test fails to read from socket Created: 29/Dec/21  Updated: 22/Mar/22  Resolved: 22/Mar/22

Status: Closed
Project: C Driver
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Unknown
Reporter: Kevin Albertson Assignee: Unassigned
Resolution: Duplicate Votes: 0
Labels: flaky-tests
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates CDRIVER-4268 /transactions/legacy/mongos-recovery-... Backlog

 Description   

Example failure with relevant logs:

[2021/12/28 23:35:08.455]   - commitTransaction explicit retries include recoveryToken
[2021/12/28 23:35:08.455] 2021/12/28 23:35:08.0454: [27794]:    DEBUG:       mongoc:      running operation startTransaction : { "name" : "startTransaction", "object" : "session0" }
[2021/12/28 23:35:08.455] 2021/12/28 23:35:08.0455: [27794]:    DEBUG:       mongoc:      running operation insertOne : { "name" : "insertOne", "object" : "collection", "arguments" : { "session" : "session0", "document" : { "_id" : 1 } }, "result" : { "insertedId" : 1 } }
[2021/12/28 23:35:08.455] { "insert" : "test", "ordered" : true, "$db" : "transaction-tests", "lsid" : { "id" : { "$binary" : { "base64" : "zfjAuwXAQsO46TNVu5xWtw==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734507, "i" : 38 } }, "signature" : { "hash" : { "$binary" : { "base64" : "ZxdmkHCPfPWS/w+0ILJylDCn35o=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "startTransaction" : true, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false, "documents" : [ { "_id" : { "$numberInt" : "1" } } ] }
[2021/12/28 23:35:08.457] 2021/12/28 23:35:08.0457: [27794]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "n" : { "$numberInt" : "1" }, "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1640734508, "i" : 6 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734508, "i" : 6 } }, "signature" : { "hash" : { "$binary" : { "base64" : "MfoWJcZJ02xaElgfHbu9YJZY250=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2021/12/28 23:35:08.457] 2021/12/28 23:35:08.0457: [27794]:    DEBUG:       mongoc:      running operation commitTransaction : { "name" : "commitTransaction", "object" : "session0" }
[2021/12/28 23:35:08.457] { "commitTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "zfjAuwXAQsO46TNVu5xWtw==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734508, "i" : 6 } }, "signature" : { "hash" : { "$binary" : { "base64" : "MfoWJcZJ02xaElgfHbu9YJZY250=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2021/12/28 23:35:08.459] 2021/12/28 23:35:08.0459: [27794]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1640734508, "i" : 7 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734508, "i" : 7 } }, "signature" : { "hash" : { "$binary" : { "base64" : "MfoWJcZJ02xaElgfHbu9YJZY250=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2021/12/28 23:35:08.459] 2021/12/28 23:35:08.0459: [27794]:    DEBUG:       mongoc:      running operation commitTransaction : { "name" : "commitTransaction", "object" : "session0" }
[2021/12/28 23:35:08.459] { "commitTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "zfjAuwXAQsO46TNVu5xWtw==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734508, "i" : 7 } }, "signature" : { "hash" : { "$binary" : { "base64" : "MfoWJcZJ02xaElgfHbu9YJZY250=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "writeConcern" : { "w" : "majority", "wtimeout" : { "$numberLong" : "10000" } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2021/12/28 23:35:08.463] 2021/12/28 23:35:08.0462: [27794]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1640734508, "i" : 7 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734508, "i" : 7 } }, "signature" : { "hash" : { "$binary" : { "base64" : "MfoWJcZJ02xaElgfHbu9YJZY250=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2021/12/28 23:35:08.463] 2021/12/28 23:35:08.0462: [27794]:    DEBUG:       mongoc:      running operation commitTransaction : { "name" : "commitTransaction", "object" : "session0" }
[2021/12/28 23:35:08.463] { "commitTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "zfjAuwXAQsO46TNVu5xWtw==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734508, "i" : 7 } }, "signature" : { "hash" : { "$binary" : { "base64" : "MfoWJcZJ02xaElgfHbu9YJZY250=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "writeConcern" : { "w" : "majority", "wtimeout" : { "$numberLong" : "10000" } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2021/12/28 23:35:09.118] 2021/12/28 23:35:08.0464: [27794]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1640734508, "i" : 7 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734508, "i" : 7 } }, "signature" : { "hash" : { "$binary" : { "base64" : "MfoWJcZJ02xaElgfHbu9YJZY250=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2021/12/28 23:35:09.118]   - commitTransaction retry succeeds on new mongos
[2021/12/28 23:35:09.369] 2021/12/28 23:35:09.0369: [27794]:    DEBUG:       mongoc:      running operation startTransaction : { "name" : "startTransaction", "object" : "session0", "arguments" : { "options" : { "writeConcern" : { "w" : "majority" } } } }
[2021/12/28 23:35:09.369] 2021/12/28 23:35:09.0369: [27794]:    DEBUG:       mongoc:      running operation insertOne : { "name" : "insertOne", "object" : "collection", "arguments" : { "session" : "session0", "document" : { "_id" : 1 } }, "result" : { "insertedId" : 1 } }
[2021/12/28 23:35:09.369] { "insert" : "test", "ordered" : true, "$db" : "transaction-tests", "lsid" : { "id" : { "$binary" : { "base64" : "Wlqozol6QxuWLkK8Fu4xOA==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734508, "i" : 47 } }, "signature" : { "hash" : { "$binary" : { "base64" : "MfoWJcZJ02xaElgfHbu9YJZY250=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "startTransaction" : true, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false, "documents" : [ { "_id" : { "$numberInt" : "1" } } ] }
[2021/12/28 23:35:09.453] 2021/12/28 23:35:09.0371: [27794]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "n" : { "$numberInt" : "1" }, "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1640734509, "i" : 6 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734509, "i" : 6 } }, "signature" : { "hash" : { "$binary" : { "base64" : "5FeJ6BzOz8fOivFNDz9kNnInwOE=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2021/12/28 23:35:09.453] 2021/12/28 23:35:09.0371: [27794]:    DEBUG:       mongoc:      running operation targetedFailPoint : { "name" : "targetedFailPoint", "object" : "testRunner", "arguments" : { "session" : "session0", "failPoint" : { "configureFailPoint" : "failCommand", "mode" : { "times" : 1 }, "data" : { "failCommands" : [ "commitTransaction" ], "writeConcernError" : { "code" : 91, "errmsg" : "Replication is being shut down", "errorLabels" : [ "RetryableWriteError" ] } } } } }
[2021/12/28 23:35:09.453] 2021/12/28 23:35:09.0452: [27794]:    DEBUG:       mongoc:      running operation commitTransaction : { "name" : "commitTransaction", "object" : "session0" }
[2021/12/28 23:35:09.453] { "commitTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "Wlqozol6QxuWLkK8Fu4xOA==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734509, "i" : 6 } }, "signature" : { "hash" : { "$binary" : { "base64" : "5FeJ6BzOz8fOivFNDz9kNnInwOE=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "writeConcern" : { "w" : "majority" }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2021/12/28 23:35:09.543] 2021/12/28 23:35:09.0456: [27794]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "ok" : { "$numberDouble" : "1.0" }, "writeConcernError" : { "code" : { "$numberInt" : "91" }, "errmsg" : "Replication is being shut down", "errorLabels" : [ "RetryableWriteError" ] }, "operationTime" : { "$timestamp" : { "t" : 1640734509, "i" : 7 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734509, "i" : 7 } }, "signature" : { "hash" : { "$binary" : { "base64" : "5FeJ6BzOz8fOivFNDz9kNnInwOE=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2021/12/28 23:35:09.543] { "commitTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "Wlqozol6QxuWLkK8Fu4xOA==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734509, "i" : 7 } }, "signature" : { "hash" : { "$binary" : { "base64" : "5FeJ6BzOz8fOivFNDz9kNnInwOE=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "writeConcern" : { "w" : "majority", "wtimeout" : { "$numberLong" : "10000" } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2021/12/28 23:35:10.385] 2021/12/28 23:35:09.0574: [27794]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1640734509, "i" : 7 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734509, "i" : 7 } }, "signature" : { "hash" : { "$binary" : { "base64" : "5FeJ6BzOz8fOivFNDz9kNnInwOE=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "recoveryToken" : {  } }
[2021/12/28 23:35:10.386]   - commitTransaction retry fails on new mongos
[2021/12/28 23:35:10.679] 2021/12/28 23:35:10.0679: [27794]:    DEBUG:       mongoc:      running operation startTransaction : { "name" : "startTransaction", "object" : "session0" }
[2021/12/28 23:35:10.679] 2021/12/28 23:35:10.0679: [27794]:    DEBUG:       mongoc:      running operation insertOne : { "name" : "insertOne", "object" : "collection", "arguments" : { "session" : "session0", "document" : { "_id" : 1 } }, "result" : { "insertedId" : 1 } }
[2021/12/28 23:35:10.680] { "insert" : "test", "ordered" : true, "$db" : "transaction-tests", "lsid" : { "id" : { "$binary" : { "base64" : "DA7+yPRVRKOnb4ZwKszGdQ==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734510, "i" : 6 } }, "signature" : { "hash" : { "$binary" : { "base64" : "aqf5zRfBgKzNqU/+dO3v9Jn8rUc=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "startTransaction" : true, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false, "documents" : [ { "_id" : { "$numberInt" : "1" } } ] }
[2021/12/28 23:35:10.853] 2021/12/28 23:35:10.0681: [27794]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "n" : { "$numberInt" : "1" }, "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1640734510, "i" : 12 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734510, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "aqf5zRfBgKzNqU/+dO3v9Jn8rUc=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2021/12/28 23:35:10.853] 2021/12/28 23:35:10.0681: [27794]:    DEBUG:       mongoc:      running operation targetedFailPoint : { "name" : "targetedFailPoint", "object" : "testRunner", "arguments" : { "session" : "session0", "failPoint" : { "configureFailPoint" : "failCommand", "mode" : { "times" : 7 }, "data" : { "failCommands" : [ "commitTransaction", "isMaster", "hello" ], "closeConnection" : true } } } }
[2021/12/28 23:35:10.853] 2021/12/28 23:35:10.0852: [27794]:    DEBUG:       mongoc:      running operation commitTransaction : { "name" : "commitTransaction", "object" : "session0", "result" : { "errorLabelsContain" : [ "TransientTransactionError" ], "errorLabelsOmit" : [ "UnknownTransactionCommitResult" ], "errorCodeName" : "NoSuchTransaction" } }
[2021/12/28 23:35:10.853] { "commitTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "DA7+yPRVRKOnb4ZwKszGdQ==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734510, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "aqf5zRfBgKzNqU/+dO3v9Jn8rUc=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2021/12/28 23:35:10.854] 2021/12/28 23:35:10.0854: [27794]:    DEBUG:       mongoc: <-- commitTransaction COMMAND FAILED: Failed to send "commitTransaction" command with database "admin": Failed to read 4 bytes: socket error or timeout
[2021/12/28 23:35:10.854] REPLY: { }
[2021/12/28 23:35:10.854] { "commitTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "DA7+yPRVRKOnb4ZwKszGdQ==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1640734510, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "aqf5zRfBgKzNqU/+dO3v9Jn8rUc=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7046897338131939350" } } }, "writeConcern" : { "w" : "majority", "wtimeout" : { "$numberLong" : "10000" } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2021/12/28 23:35:29.948] heartbeat failed:
[2021/12/28 23:35:29.948] error calling legacy hello: 'No suitable servers found: `serverselectiontimeoutms` timed out: [Failed to receive length header from server. calling hello on 'localhost:27017']'
[2021/12/28 23:35:29.948] URI = mongodb://bob:pwd123@localhost:27017/?ssl=true
[2021/12/28 23:35:29.990] Aborted
[2021/12/28 23:35:29.990] Command failed: command encountered problem: error waiting on process '5409f9ba-adf8-4795-b326-9f0623e4649a': exit status 134
[2021/12/28 23:35:29.990] Task completed - FAILURE.



 Comments   
Comment by Jeremy Mikola [ 29/Dec/21 ]

Revising the title as the test in question appears to be "commitTransaction retry fails on new mongos". That is the last test name printed in the logs before the unexpected error (see: another task failure).

Generated at Wed Feb 07 21:20:24 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.