[CDRIVER-4351] /transactions/legacy/pin-mongos failure Created: 11/Apr/22  Updated: 10/Oct/23

Status: Backlog
Project: C Driver
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Roberto Sanchez Assignee: Unassigned
Resolution: Unresolved Votes: 0
Labels: failing-on-waterfall, flaky-tests
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Epic Link: Stabilize Evergreen

 Description   

The /transactions/legacy/pin-mongos/ JSON spec test currently fails and/or is flaky. The specific failure is in the "unpin after transient error within a transaction and commit" sub-test. The test failure is a result of server selection timeout.

The test is currently being skipped, so to work on it, it will first be necessary to remove it from .evergreen/skip-tests.txt and run a patch build to determine the current failure mode(s).



 Comments   
Comment by Kevin Albertson [ 10/Oct/23 ]

Test was unskipped and run in a patch build with all variants and tasks with names matching .*sharded.*: https://spruce.mongodb.com/version/6525492157e85af25577b156

tsan-sasl-cyrus-openssl-ubuntu1804-clang-test-4.4-sharded-auth failed with logs:

[2023/10/10 12:58:12.406]   - remain pinned after non-transient error on commit
[2023/10/10 12:58:12.633] 2023/10/10 12:58:12.0632: [ 3622]:    DEBUG:       mongoc:      running operation startTransaction : { "name" : "startTransaction", "object" : "session0" }
[2023/10/10 12:58:12.633] 2023/10/10 12:58:12.0632: [ 3622]:    DEBUG:       mongoc:      running operation insertMany : { "name" : "insertMany", "object" : "collection", "arguments" : { "documents" : [ { "_id" : 3 }, { "_id" : 4 } ], "session" : "session0" }, "result" : { "insertedIds" : { "0" : 3, "1" : 4 } } }
[2023/10/10 12:58:12.633] { "insert" : "test", "ordered" : true, "$db" : "transaction-tests", "lsid" : { "id" : { "$binary" : { "base64" : "OOq9OPUZTYSazLaoykJ6nQ==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942691, "i" : 14 } }, "signature" : { "hash" : { "$binary" : { "base64" : "ZV1ZLqZbllV8ZnSHFZfRc2MQH78=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "startTransaction" : true, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false, "documents" : [ { "_id" : { "$numberInt" : "3" } }, { "_id" : { "$numberInt" : "4" } } ] }
[2023/10/10 12:58:12.715] 2023/10/10 12:58:12.0634: [ 3622]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "n" : { "$numberInt" : "2" }, "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1696942692, "i" : 12 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942692, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "OTh7hEfA8clhqFpZwoRofnE94k0=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2023/10/10 12:58:12.715] 2023/10/10 12:58:12.0635: [ 3622]:    DEBUG:       mongoc:      running operation assertSessionPinned : { "name" : "assertSessionPinned", "object" : "testRunner", "arguments" : { "session" : "session0" } }
[2023/10/10 12:58:12.715] 2023/10/10 12:58:12.0635: [ 3622]:    DEBUG:       mongoc:      running operation targetedFailPoint : { "name" : "targetedFailPoint", "object" : "testRunner", "arguments" : { "session" : "session0", "failPoint" : { "configureFailPoint" : "failCommand", "mode" : { "times" : 1 }, "data" : { "failCommands" : [ "commitTransaction" ], "errorCode" : 51 } } } }
[2023/10/10 12:58:12.715] 2023/10/10 12:58:12.0715: [ 3622]:    DEBUG:       mongoc:      running operation commitTransaction : { "name" : "commitTransaction", "object" : "session0", "result" : { "errorLabelsOmit" : [ "TransientTransactionError" ], "errorCode" : 51 } }
[2023/10/10 12:58:12.715] { "commitTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "OOq9OPUZTYSazLaoykJ6nQ==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942692, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "OTh7hEfA8clhqFpZwoRofnE94k0=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2023/10/10 12:58:12.717] 2023/10/10 12:58:12.0716: [ 3622]:    DEBUG:       mongoc: <-- commitTransaction COMMAND FAILED: Failing command due to 'failCommand' failpoint
[2023/10/10 12:58:12.717] REPLY: { "ok" : { "$numberDouble" : "0.0" }, "errmsg" : "Failing command due to 'failCommand' failpoint", "code" : { "$numberInt" : "51" }, "codeName" : "ManualInterventionRequired", "operationTime" : { "$timestamp" : { "t" : 1696942692, "i" : 12 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942692, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "OTh7hEfA8clhqFpZwoRofnE94k0=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } } }
[2023/10/10 12:58:12.717] 2023/10/10 12:58:12.0716: [ 3622]:    DEBUG:       mongoc:      running operation assertSessionPinned : { "name" : "assertSessionPinned", "object" : "testRunner", "arguments" : { "session" : "session0" } }
[2023/10/10 12:58:12.717] 2023/10/10 12:58:12.0716: [ 3622]:    DEBUG:       mongoc:      running operation commitTransaction : { "name" : "commitTransaction", "object" : "session0" }
[2023/10/10 12:58:12.717] { "commitTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "OOq9OPUZTYSazLaoykJ6nQ==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942692, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "OTh7hEfA8clhqFpZwoRofnE94k0=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "writeConcern" : { "w" : "majority", "wtimeout" : { "$numberLong" : "10000" } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2023/10/10 12:58:13.480] 2023/10/10 12:58:12.0719: [ 3622]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1696942692, "i" : 13 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942692, "i" : 14 } }, "signature" : { "hash" : { "$binary" : { "base64" : "OTh7hEfA8clhqFpZwoRofnE94k0=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2023/10/10 12:58:13.480] 2023/10/10 12:58:12.0720: [ 3622]:    DEBUG:       mongoc:      running operation assertSessionPinned : { "name" : "assertSessionPinned", "object" : "testRunner", "arguments" : { "session" : "session0" } }
[2023/10/10 12:58:13.480]   - unpin after transient error within a transaction
[2023/10/10 12:58:13.707] 2023/10/10 12:58:13.0707: [ 3622]:    DEBUG:       mongoc:      running operation startTransaction : { "name" : "startTransaction", "object" : "session0" }
[2023/10/10 12:58:13.707] 2023/10/10 12:58:13.0707: [ 3622]:    DEBUG:       mongoc:      running operation insertOne : { "name" : "insertOne", "object" : "collection", "arguments" : { "session" : "session0", "document" : { "_id" : 3 } }, "result" : { "insertedId" : 3 } }
[2023/10/10 12:58:13.707] { "insert" : "test", "ordered" : true, "$db" : "transaction-tests", "lsid" : { "id" : { "$binary" : { "base64" : "QsjwfuheSFKGmXHmFi9XEg==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942692, "i" : 14 } }, "signature" : { "hash" : { "$binary" : { "base64" : "OTh7hEfA8clhqFpZwoRofnE94k0=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "startTransaction" : true, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false, "documents" : [ { "_id" : { "$numberInt" : "3" } } ] }
[2023/10/10 12:58:13.856] 2023/10/10 12:58:13.0709: [ 3622]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "n" : { "$numberInt" : "1" }, "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1696942693, "i" : 12 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942693, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "ILAwEaapufVSx1t6gLI2Su41eKs=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2023/10/10 12:58:13.856] 2023/10/10 12:58:13.0709: [ 3622]:    DEBUG:       mongoc:      running operation targetedFailPoint : { "name" : "targetedFailPoint", "object" : "testRunner", "arguments" : { "session" : "session0", "failPoint" : { "configureFailPoint" : "failCommand", "mode" : { "times" : 1 }, "data" : { "failCommands" : [ "insert" ], "closeConnection" : true } } } }
[2023/10/10 12:58:13.856] 2023/10/10 12:58:13.0855: [ 3622]:    DEBUG:       mongoc:      running operation insertOne : { "name" : "insertOne", "object" : "collection", "arguments" : { "session" : "session0", "document" : { "_id" : 4 } }, "result" : { "errorLabelsContain" : [ "TransientTransactionError" ], "errorLabelsOmit" : [ "UnknownTransactionCommitResult" ] } }
[2023/10/10 12:58:13.856] { "insert" : "test", "ordered" : true, "$db" : "transaction-tests", "lsid" : { "id" : { "$binary" : { "base64" : "QsjwfuheSFKGmXHmFi9XEg==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942693, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "ILAwEaapufVSx1t6gLI2Su41eKs=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false, "documents" : [ { "_id" : { "$numberInt" : "4" } } ] }
[2023/10/10 12:58:13.857] 2023/10/10 12:58:13.0856: [ 3622]:    DEBUG:      cluster: could not read message length, stream probably closed or timed out
[2023/10/10 12:58:13.857] 2023/10/10 12:58:13.0857: [ 3622]:    DEBUG:       mongoc: <-- insert COMMAND FAILED: Failed to send "insert" command with database "transaction-tests": Failed to read 4 bytes: socket error or timeout
[2023/10/10 12:58:13.857] REPLY: { "errorLabels" : [ "TransientTransactionError" ] }
[2023/10/10 12:58:13.857] 2023/10/10 12:58:13.0857: [ 3622]:    DEBUG:       mongoc:      running operation abortTransaction : { "name" : "abortTransaction", "object" : "session0" }
[2023/10/10 12:58:13.857] { "abortTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "QsjwfuheSFKGmXHmFi9XEg==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942693, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "ILAwEaapufVSx1t6gLI2Su41eKs=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2023/10/10 12:58:14.561] 2023/10/10 12:58:13.0858: [ 3622]:    DEBUG:       mongoc: <-- abortTransaction COMMAND FAILED: cannot continue txnId -1 for session 42c8f07e-e85e-4852-8699-71e6162f5712 - Il1bctjvfYRA0JpLUY4Xle2E6bw+KM1YpMr4x52qGgE= with txnId 1
[2023/10/10 12:58:14.561] REPLY: { "ok" : { "$numberDouble" : "0.0" }, "errmsg" : "cannot continue txnId -1 for session 42c8f07e-e85e-4852-8699-71e6162f5712 - Il1bctjvfYRA0JpLUY4Xle2E6bw+KM1YpMr4x52qGgE= with txnId 1", "code" : { "$numberInt" : "251" }, "codeName" : "NoSuchTransaction", "operationTime" : { "$timestamp" : { "t" : 1696942693, "i" : 12 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942693, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "ILAwEaapufVSx1t6gLI2Su41eKs=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "errorLabels" : [ "TransientTransactionError" ] }
[2023/10/10 12:58:14.561]   - unpin after transient error within a transaction and commit
[2023/10/10 12:58:14.855] 2023/10/10 12:58:14.0787: [ 3622]:    DEBUG:       mongoc:      running operation startTransaction : { "name" : "startTransaction", "object" : "session0" }
[2023/10/10 12:58:14.855] 2023/10/10 12:58:14.0787: [ 3622]:    DEBUG:       mongoc:      running operation insertOne : { "name" : "insertOne", "object" : "collection", "arguments" : { "session" : "session0", "document" : { "_id" : 3 } }, "result" : { "insertedId" : 3 } }
[2023/10/10 12:58:14.855] { "insert" : "test", "ordered" : true, "$db" : "transaction-tests", "lsid" : { "id" : { "$binary" : { "base64" : "AjKZsNb8SCi1nO8bJAHxAg==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942693, "i" : 12 } }, "signature" : { "hash" : { "$binary" : { "base64" : "ILAwEaapufVSx1t6gLI2Su41eKs=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "startTransaction" : true, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false, "documents" : [ { "_id" : { "$numberInt" : "3" } } ] }
[2023/10/10 12:58:14.937] 2023/10/10 12:58:14.0856: [ 3622]:    DEBUG:       mongoc: <-- COMMAND SUCCEEDED: { "n" : { "$numberInt" : "1" }, "ok" : { "$numberDouble" : "1.0" }, "operationTime" : { "$timestamp" : { "t" : 1696942694, "i" : 15 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942694, "i" : 15 } }, "signature" : { "hash" : { "$binary" : { "base64" : "DfOhszV6syGGXVMx7TRn4VQb5Y4=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "recoveryToken" : { "recoveryShardId" : "sh01" } }
[2023/10/10 12:58:14.937] 2023/10/10 12:58:14.0857: [ 3622]:    DEBUG:       mongoc:      running operation targetedFailPoint : { "name" : "targetedFailPoint", "object" : "testRunner", "arguments" : { "session" : "session0", "failPoint" : { "configureFailPoint" : "failCommand", "mode" : { "times" : 7 }, "data" : { "failCommands" : [ "insert", "isMaster", "hello" ], "closeConnection" : true } } } }
[2023/10/10 12:58:14.937] 2023/10/10 12:58:14.0936: [ 3622]:    DEBUG:       mongoc:      running operation insertOne : { "name" : "insertOne", "object" : "collection", "arguments" : { "session" : "session0", "document" : { "_id" : 4 } }, "result" : { "errorLabelsContain" : [ "TransientTransactionError" ], "errorLabelsOmit" : [ "UnknownTransactionCommitResult" ] } }
[2023/10/10 12:58:14.937] { "insert" : "test", "ordered" : true, "$db" : "transaction-tests", "lsid" : { "id" : { "$binary" : { "base64" : "AjKZsNb8SCi1nO8bJAHxAg==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942694, "i" : 15 } }, "signature" : { "hash" : { "$binary" : { "base64" : "DfOhszV6syGGXVMx7TRn4VQb5Y4=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false, "documents" : [ { "_id" : { "$numberInt" : "4" } } ] }
[2023/10/10 12:58:14.938] 2023/10/10 12:58:14.0937: [ 3622]:    DEBUG:      cluster: could not read message length, stream probably closed or timed out
[2023/10/10 12:58:14.938] 2023/10/10 12:58:14.0938: [ 3622]:    DEBUG:       mongoc: <-- insert COMMAND FAILED: Failed to send "insert" command with database "transaction-tests": Failed to read 4 bytes: socket error or timeout
[2023/10/10 12:58:14.938] REPLY: { "errorLabels" : [ "TransientTransactionError" ] }
[2023/10/10 12:58:14.938] 2023/10/10 12:58:14.0938: [ 3622]:    DEBUG:       mongoc:      running operation commitTransaction : { "name" : "commitTransaction", "object" : "session0", "result" : { "errorLabelsContain" : [ "TransientTransactionError" ], "errorLabelsOmit" : [ "UnknownTransactionCommitResult" ], "errorCodeName" : "NoSuchTransaction" } }
[2023/10/10 12:58:14.938] { "commitTransaction" : { "$numberInt" : "1" }, "recoveryToken" : { "recoveryShardId" : "sh01" }, "$db" : "admin", "lsid" : { "id" : { "$binary" : { "base64" : "AjKZsNb8SCi1nO8bJAHxAg==", "subType" : "04" } } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942694, "i" : 15 } }, "signature" : { "hash" : { "$binary" : { "base64" : "DfOhszV6syGGXVMx7TRn4VQb5Y4=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "txnNumber" : { "$numberLong" : "1" }, "autocommit" : false }
[2023/10/10 12:58:34.014] 2023/10/10 12:58:14.0940: [ 3622]:    DEBUG:       mongoc: <-- commitTransaction COMMAND FAILED: Recovering the transaction's outcome found the transaction aborted
[2023/10/10 12:58:34.014] REPLY: { "errorLabels" : [ "TransientTransactionError" ], "ok" : { "$numberDouble" : "0.0" }, "errmsg" : "Recovering the transaction's outcome found the transaction aborted", "code" : { "$numberInt" : "251" }, "codeName" : "NoSuchTransaction", "operationTime" : { "$timestamp" : { "t" : 1696942694, "i" : 16 } }, "$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" : 1696942694, "i" : 16 } }, "signature" : { "hash" : { "$binary" : { "base64" : "DfOhszV6syGGXVMx7TRn4VQb5Y4=", "subType" : "00" } }, "keyId" : { "$numberLong" : "7288313219297312791" } } }, "recoveryToken" : {  } }
[2023/10/10 12:58:34.014] test error in: /data/mci/b9cb4729387f9d7ecca03494c0c112d5/mongoc/src/libmongoc/tests/test-libmongoc.c 924:call_hello_with_host_and_port()
[2023/10/10 12:58:34.014] error calling legacy hello: 'No suitable servers found: `serverselectiontimeoutms` timed out: [Failed to receive length header from server. calling hello on 'localhost:27017']'

Leaving in "Backlog".

Generated at Wed Feb 07 21:20:40 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.