[SERVER-39187] Rerunning commitTransaction on a new mongos blocks forever Created: 25/Jan/19  Updated: 29/Oct/23  Resolved: 30/Jan/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 4.1.7
Fix Version/s: 4.1.8

Type: Bug Priority: Major - P3
Reporter: Shane Harvey Assignee: Matthew Saltz (Inactive)
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File reproHangingCommit.js    
Issue Links:
Depends
Related
related to SERVER-37344 Implement recovery token for retrying... Closed
is related to SERVER-39349 Recovering the state of a completed s... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2019-02-11
Participants:

 Description   

Rerunning commitTransaction, with the recoveryToken added in SERVER-37344, on a new mongos blocks forever. It also seems to get the cluster into a state where it cannot accept any writes (even to other databases) but the shard still reports itself as the primary. Also, both the shard server and config server do not shutdown normally and need to be killed with SIGKILL.

To reproduce start a sharded cluster with at least two mongoses (my cluster a one config server and a one node shard). Run the repro script: reproHangingCommit.js

$ mongo reproHangingCommit.js
MongoDB shell version v4.0.1
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.1.7
WARNING: shell and server versions do not match
Starting transaction on mongos #1: {
	"insert" : "test",
	"documents" : [
		{
			"_id" : ObjectId("5c4a55e0542fbbcc137ad1cd")
		}
	],
	"lsid" : {
		"id" : UUID("6f579bae-6919-4e07-ac80-fe056861b2b9")
	},
	"txnNumber" : NumberLong(1),
	"autocommit" : false,
	"startTransaction" : true
}
Commit transaction on mongos #1: {
	"commitTransaction" : 1,
	"lsid" : {
		"id" : UUID("6f579bae-6919-4e07-ac80-fe056861b2b9")
	},
	"txnNumber" : NumberLong(1),
	"autocommit" : false,
	"recoveryToken" : {
		"shardId" : "demo-set-0"
	}
}
Commit transaction on mongos #2: {
	"commitTransaction" : 1,
	"lsid" : {
		"id" : UUID("6f579bae-6919-4e07-ac80-fe056861b2b9")
	},
	"txnNumber" : NumberLong(1),
	"autocommit" : false,
	"recoveryToken" : {
		"shardId" : "demo-set-0"
	}
}
// Hangs forever waiting for the commit on mongos #2

db.currentOp() reports an ongoing coordinateCommitTransaction command that never ends. I've attached an example currentOp output at the bottom of the repro script.



 Comments   
Comment by Matthew Saltz (Inactive) [ 30/Jan/19 ]

shane.harvey Note that even with the fix, if the transaction does not go through two-phase commit, the recovery mongos will have to wait for the coordinator to expire, which means that it will still take 60 seconds by default (the value of the transactionLifetimeLimitSeconds parameter) before recovering the commit decision.

Comment by Githook User [ 30/Jan/19 ]

Author:

{'username': 'saltzm', 'email': 'matthew.saltz@mongodb.com', 'name': 'Matthew Saltz'}

Message: SERVER-39187 Signal coordinator's decision promise when the coordinator is canceled before beginning commit
Branch: master
https://github.com/mongodb/mongo/commit/d1771e696b6df883af70eedaaa0733548c573fec

Comment by Esha Maharishi (Inactive) [ 29/Jan/19 ]

shane.harvey,

Is this a consequence of the behavior described in this comment:

yes, this is correct.

Comment by Shane Harvey [ 29/Jan/19 ]

I think the hang occurs because the TransactionCoordinator's _finalDecisionPromise is not set when the coordinator times out waiting to hear the participant list

Why does the coordinator time out at all? The transaction has already committed on the first attempt. Shouldn't the second commit immediately return success? Is this a consequence of the behavior described in this comment:

single shard transactions don't send coordinateCommit. And the current behavior is that attempting to recover a single shard transaction commit will make it wait for the coordinator to timeout and then check for the commit state of the transaction locally.

Comment by Matthew Saltz (Inactive) [ 29/Jan/19 ]

I also reproduced this in a unit test. I'm not sure why BrokenPromise isn't thrown when the coordinator is destroyed after get() is called, but we aren't supposed to rely on that behavior anyway so I'll fix it to set an error when cancelIfCommitNotYetStarted is called.

Comment by Esha Maharishi (Inactive) [ 29/Jan/19 ]

I think the hang occurs because the TransactionCoordinator's _finalDecisionPromise is not set when the coordinator times out waiting to hear the participant list, because the _finalDecisionPromise is only set when a decision is made or in _handleCompletionStatus when coordinating a commit or continuing to coordinate a commit fails with an error.

This causes the coordinateCommitTransaction command to hang, because coordinateCommitTransaction is waiting for _finalDecisionPromise to be signaled.

Running the repro with shards verbosity at 3, I see the following log lines under the TXN component:

c20001| 2019-01-29T10:06:30.811-0500 I TXN      [TransactionCoordinator-StepUp] Need to resume coordinating commit for 0 transactions
s20004| 2019-01-29T10:06:54.452-0500 I TXN      [conn12] f7097417-d05b-49dc-bfa8-9ec1f090baa0:1 New transaction started
s20004| 2019-01-29T10:06:54.452-0500 I TXN      [conn12] f7097417-d05b-49dc-bfa8-9ec1f090baa0:1 Setting global snapshot timestamp to { ts: Timestamp(1548774414, 6) } on statement 0
d20000| 2019-01-29T10:06:54.459-0500 D TXN      [conn23] Inserting coordinator for transaction 1 on session { id: UUID("f7097417-d05b-49dc-bfa8-9ec1f090baa0"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) } into in-memory catalog
s20004| 2019-01-29T10:06:54.477-0500 I TXN      [conn12] f7097417-d05b-49dc-bfa8-9ec1f090baa0:1 Committing single shard transaction, single participant: __unknown_name__-rs0
d20000| 2019-01-29T10:06:54.483-0500 D TXN      [conn24] Received commitTransaction for transaction with txnNumber 1 on session { id: UUID("f7097417-d05b-49dc-bfa8-9ec1f090baa0"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }
s20005| 2019-01-29T10:06:54.506-0500 I TXN      [conn12] f7097417-d05b-49dc-bfa8-9ec1f090baa0:1 New transaction started
d20000| 2019-01-29T10:06:54.525-0500 D TXN      [conn26] Coordinator shard received request to recover commit decision for transaction 1 on session { id: UUID("f7097417-d05b-49dc-bfa8-9ec1f090baa0"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }
d20000| 2019-01-29T10:07:54.454-0500 D TXN      [Sharding-Fixed-0] Removing coordinator for transaction 1 on session { id: UUID("f7097417-d05b-49dc-bfa8-9ec1f090baa0"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) } from in-memory catalog
d20000| 2019-01-29T10:07:54.454-0500 D TXN      [Sharding-Fixed-0] Signaling last active coordinator removed

and the coordinator shard primary (d20000) hanging here:

#0  0x00007f2b3ccd3360 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f2b4900095c in __gthread_cond_wait (__mutex=<optimized out>, __cond=<optimized out>)
    at /data/mci/cbe0dcd7144cecba8abf51bbc964b13c/toolchain-builder/build-gcc-v2.sh-szx/x86_64-mongodb-linux/libstdc++-v3/include/x86_64-mongodb-linux/bits/gthr-default.h:864
#2  std::condition_variable::wait (this=<optimized out>, __lock=...) at ../../../../../gcc-5.4.0/libstdc++-v3/src/c++11/condition_variable.cc:53
#3  0x00007f2b3fa5918e in (anonymous namespace)::Waitable::wait (waitable=0x0, clkSource=0x562b9ca0dd60, cv=..., lk=...) at src/mongo/util/waitable.h:64
#4  0x00007f2b3fa53f49 in operator() (this=0x7f2b240a35f8) at src/mongo/db/operation_context.cpp:288
#5  0x00007f2b3fa53ba2 in (anonymous namespace)::OperationContext::waitForConditionOrInterruptNoAssertUntil (this=0x7f29d0002aa0, cv=..., m=..., deadline=...) at src/mongo/db/operation_context.cpp:286
#6  0x00007f2b49710851 in (anonymous namespace)::Interruptible::waitForConditionOrInterruptNoAssert (this=0x7f29d0002aa0, cv=..., m=...) at src/mongo/util/interruptible.h:230
#7  0x00007f2b49710711 in (anonymous namespace)::Interruptible::waitForConditionOrInterrupt (this=0x7f29d0002aa0, cv=..., m=...) at src/mongo/util/interruptible.h:207
---Type <return> to continue, or q <return> to quit---
#8  0x00007f2b49710356 in (anonymous namespace)::Interruptible::waitForConditionOrInterrupt<(lambda at src/mongo/util/future.h:334:61)>((anonymous namespace)::(anonymous namespace)::condition_variable &, (anonymous namespace)::unique_lock<std::mutex> &, class {...}) (this=0x7f29d0002aa0, cv=..., m=..., pred=...) at src/mongo/util/interruptible.h:220
#9  0x00007f2b497101b2 in (anonymous namespace)::(anonymous namespace)::SharedStateBase::wait (this=0x7f29d0003370, interruptible=0x7f29d0002aa0) at src/mongo/util/future.h:334
#10 0x00007f2b48adb172 in (anonymous namespace)::(anonymous namespace)::Future<mongo::txn::CommitDecision>::getImpl (this=0x7f2b240a42e8, interruptible=0x7f29d0002aa0) at src/mongo/util/future.h:1213
#11 0x00007f2b48ad913d in (anonymous namespace)::(anonymous namespace)::Future<mongo::txn::CommitDecision>::get (this=0x7f2b240a42e8, interruptible=0x7f29d0002aa0) at src/mongo/util/future.h:746
#12 0x00007f2b48ad49bd in (anonymous namespace)::(anonymous namespace)::CoordinateCommitTransactionCmd::Invocation::typedRun (this=0x7f29d0002c80, opCtx=0x7f29d0002aa0) at src/mongo/db/s/txn_two_phase_commit_cmds.cpp:245
#13 0x00007f2b48ad3cf8 in (anonymous namespace)::TypedCommand<mongo::(anonymous namespace)::CoordinateCommitTransactionCmd>::InvocationBase::_callTypedRun (this=0x7f29d0002c80, opCtx=0x7f29d0002aa0) at src/mongo/db/commands.h:775
#14 0x00007f2b48ad3cc1 in (anonymous namespace)::TypedCommand<mongo::(anonymous namespace)::CoordinateCommitTransactionCmd>::InvocationBase::_runImpl (this=0x7f29d0002c80, opCtx=0x7f29d0002aa0) at src/mongo/db/commands.h:778
#15 0x00007f2b48ad39a5 in (anonymous namespace)::TypedCommand<mongo::(anonymous namespace)::CoordinateCommitTransactionCmd>::InvocationBase::run (this=0x7f29d0002c80, opCtx=0x7f29d0002aa0, reply=0x7f29d00030c0)
    at src/mongo/db/commands.h:786
#16 0x00007f2b3c2d306c in (anonymous namespace)::(anonymous namespace)::runCommandImpl (opCtx=0x7f29d0002aa0, invocation=0x7f29d0002c80, request=..., replyBuilder=0x7f29d00030c0, startOperationTime=..., behaviors=..., 
<snipped>

I think one fix would be to call _handleCompletionStatus in cancelCommitIfNotYetStarted, where currently only _transitionToDone is called.

However, I remember matthew.saltz had wanted to do a small refactor to simplify _handleCompletionStatus and _transitionToDone.

Matthew, do you want to do that refactor under this ticket while also fixing this bug?

Generated at Thu Feb 08 04:51:17 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.