[SERVER-35829]  Make router TransactionParticipant attach txnNumber to requests if not already there Created: 26/Jun/18  Updated: 29/Oct/23  Resolved: 25/Jul/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.1.2

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: Jack Mulrow
Resolution: Fixed Votes: 0
Labels: ShardedTxn:RouterSupport
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-36260 Mongos should reject commands that ar... Closed
is related to SERVER-36277 Allow mongos to forward txnNumbers th... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2018-07-30
Participants:

 Description   

For example, sending the command like this to mongos:

{ killCursors: "kill_cursors_in_transaction", cursors: [ 8889987431650893989 ], lsid: { id: UUID("016700de-5430-4f62-9659-6c09e1e46bf6") }, txnNumber: 1, autocommit: false, stmtId: 0, startTransaction: true, $db: "test" }                 

Will result in txnNumber field missing when mongos sends killCursors to shard.



 Comments   
Comment by Githook User [ 25/Jul/18 ]

Author:

{'name': 'Jack Mulrow', 'email': 'jack.mulrow@mongodb.com', 'username': 'jsmulrow'}

Message: SERVER-35829 Make router TransactionParticipant attach txnNumber to requests if not already there
Branch: master
https://github.com/mongodb/mongo/commit/a3cde2d51b6ed790b0cbb1c6949e91f6e45b878e

Comment by Kaloian Manassiev [ 25/Jul/18 ]

Ah ok, that makes sense. I am fine with going the static analysis approach first then in order to make progress.

However, please file a ticket though to consider fixing these problems, which make it difficult to just use transactions and let mongod deal with it. We may not be able to fix it this release, but we should at least be aware of it.

Comment by Jack Mulrow [ 25/Jul/18 ]

The way I understand your proposal Jack Mulrow is that we will share the whitelist between mongos and mongod and will fix the mongos commands, which currently are not "transaction aware" to make mongos match what is supported by mongod, right?

kaloian.manassiev, yeah that's pretty much what I'm proposing. We could move a lot of the static checks from mongod into functions that mongos and mongod could share.

Since we attach transaction information at a very low level, in the ShardingTaskExecutor, where everything funnels, would it be easy to just keep passing transactions through from mongos, if txnNumber is included and just letting the distributed transaction machinery and the mongod deal with it?

I do think it would be nicest to just attach the txnNumber in the sharding task executor if there is one on the operation context and let mongod decide what to do, but when I tried that, I realized mongos can re-use the operation context to perform metadata operations (like here when the ChunkManagerTargeter might create a collection) and it would attach the txnNumber to those requests too, which isn't allowed.

So I was thinking we can either change all metadata refreshes / operations to use a different operation context (or somehow disable the txnNumber), which might be a lot of work, or fix the few commands on mongos that actually support transactions to not drop txnNumbers. I'm fine doing either, but since the other commands already have been fixed to forward txnNumbers, I figured it would be less work to just modify killCursors. We may need to do something about refreshes eventually though, because I'm worried our current machinery will consider the config server a participant if an operation requires a metadata refresh.

Comment by Kaloian Manassiev [ 25/Jul/18 ]

The way I understand your proposal jack.mulrow is that we will share the whitelist between mongos and mongod and will fix the mongos commands, which currently are not "transaction aware" to make mongos match what is supported by mongod, right?

Since we attach transaction information at a very low level, in the ShardingTaskExecutor, where everything funnels, would it be easy to just keep passing transactions through from mongos, if txnNumber is included and just letting the distributed transaction machinery and the mongod deal with it?

Here is my thinking:

On one hand I was thinking that this is a little bit difficult to reason about, because for example what happens if in a newer mongod version we support more (or less) transactional commands than what the mongos from the previous version supports. It will be the case that either mongos sends transactions for something mongod will reject or otherwise will reject something, which mongod would have accepted.

On the other hand, I realize that mongos will need to know whether transactions are supported anyways in order to involve the distributed commit machinery, but why not involve it unconditionally if someone is sending it requests with txnNumbers? If mongod doesn't support something, it will reject it and the transaction will get rolled back. We will also get more test coverage from this.

Comment by Randolph Tan [ 24/Jul/18 ]

jack.mulrow Sounds good to me.

Comment by Jack Mulrow [ 24/Jul/18 ]

I think every command that can run in a transaction already does forward the txnNumber (based on this list in session.cpp), except killCursors. I don't think it makes sense to make mongos support forwarding transaction numbers for commands that can't actually run in a transaction, so I would propose instead making this ticket track adding txnNumber support for ClusterKillCursors and adding a new ticket to make mongos reject commands that don't support txnNumbers if they have one, since we can't rely on mongos not losing the txnNumber when forwarding to a shard.

renctan, kaloian.manassiev, does that sound good?

Generated at Thu Feb 08 04:41:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.