[SERVER-37458] Mongos does not apply commitTransaction's writeConcern Created: 03/Oct/18  Updated: 29/Oct/23  Resolved: 14/Dec/18

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.1.7

Type: Bug Priority: Major - P3
Reporter: Shane Harvey Assignee: Esha Maharishi (Inactive)
Resolution: Fixed Votes: 0
Labels: ShardedTxn:DistributedCommit
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-37925 Transaction coordinator shard should ... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

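from pymongo import MongoClient, WriteConcern
client = MongoClient()  # Assumes a mongos is listening on the default localhost:27017
client.t.t.insert_one({}) # Create collection
with client.start_session() as s, s.start_transaction(write_concern=WriteConcern(w="foo")):
    client.t.t.insert_one({}, session=s)
# Leaving the with block sends commitTransaction with writeConcern {w: "foo"}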

Sprint: Sharding 2018-12-17
Participants:

Description

Mongos seems to ignore the user's writeConcern on commitTransaction. For example:

01 Oct 18 15:00 -0700 (Connection: 1:890886553)  op_msg commitTransaction admin Request:{"sections":[{"payload":{"$clusterTime":{"clusterTime":{"$timestamp":{"t":1538431201,"i":1}},"signature":{"hash":{"$binary":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","$type":"00"},"keyId":{"$numberLong":"0"}}},"$db":"admin","$readPreference":{"mode":"primary"},"autocommit":false,"commitTransaction":1,"lsid":{"id":{"$binary":"aK08elgGS0yy+LqHWjEHQQ==","$type":"04"}},"txnNumber":{"$numberLong":"4"},"writeConcern":{"w":"foo"}},"payloadType":0}]}
 
 
01 Oct 18 15:00 -0700 (Connection: 1:890886553) +955µs op_msg reply   Response:{"sections":[{"payload":{"$clusterTime":{"clusterTime":{"$timestamp":{"t":1538431201,"i":2}},"signature":{"hash":{"$binary":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","$type":"00"},"keyId":{"$numberLong":"0"}}},"ok":1.0,"operationTime":{"$timestamp":{"t":1538431201,"i":2}}},"payloadType":0}]}

Notice that the request carries writeConcern: {w: "foo"}, which should cause an unknown write concern error, yet the reply is ok: 1.

Tested on https://github.com/mongodb/mongo/commit/860b392d9d3c006090a4c7fc3c6f3fa5460e5c5c:

mongodb-macos-x86_64-4.1.3-233-g860b392/bin/mongos --version
mongos version v4.1.3-233-g860b392
git version: 860b392d9d3c006090a4c7fc3c6f3fa5460e5c5c
allocator: system
modules: none
build environment:
    distarch: x86_64
    target_arch: x86_64



Comments
Comment by Esha Maharishi (Inactive) [ 14/Dec/18 ]

Note: "2) The coordinator shard needs to set its Client's last OpTime to the system last OpTime" was committed under SERVER-36853, because that ticket added testing where the config servers are the coordinator, and config servers upconvert writeConcern to majority for internal connections (e.g., connections from a router).

Comment by Githook User [ 14/Dec/18 ]

Author:

{'username': 'EshaMaharishi', 'email': 'esha.maharishi@mongodb.com', 'name': 'Esha Maharishi'}

Message: SERVER-37458 Mongos does not apply commitTransaction's writeConcern
Branch: master
https://github.com/mongodb/mongo/commit/8e403c9cce000c5bc505e03c1a94c3f441512720

Comment by Githook User [ 09/Dec/18 ]

Author:

{'name': 'Esha Maharishi', 'email': 'esha.maharishi@mongodb.com', 'username': 'EshaMaharishi'}

Message: SERVER-37458 Allow coordinateCommitTransaction command to accept writeConcern
Branch: master
https://github.com/mongodb/mongo/commit/a1442e88e77fff49dd20a11953a6012be66d4b79

Comment by Esha Maharishi (Inactive) [ 04/Dec/18 ]

Note: a bug in mongos's propagation of the client's writeConcern on coordinateCommitTransaction means that mongos currently does not actually propagate the client's writeConcern.

The bug is that coordinateCommitCmd.toBSON(opCtx->getWriteConcern(), false) should instead be written coordinateCommitCmd.toBSON(BSON("writeConcern" << opCtx->getWriteConcern()), false).
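The difference between the two calls can be illustrated with a small Python analogy. This is only a sketch: the to_bson helper below is hypothetical and assumes that passthrough fields are appended at the top level of the serialized command. It is not the server's generated C++ code, and the command fields are placeholders.

```python
def to_bson(command_fields, passthrough_fields):
    """Hypothetical stand-in for the serializer: passthrough fields are
    appended at the top level of the command document (an assumption)."""
    doc = dict(command_fields)
    doc.update(passthrough_fields)
    return doc


# Placeholder command body, for illustration only.
command = {"coordinateCommitTransaction": 1}
write_concern = {"w": "foo"}

# Buggy form: the write concern's own fields are spliced into the command,
# so the receiver sees a stray top-level "w" and no "writeConcern" field.
buggy = to_bson(command, write_concern)

# Fixed form: the write concern is nested under the "writeConcern" key.
fixed = to_bson(command, {"writeConcern": write_concern})

print(buggy)
print(fixed)
```

Under that assumption, only the second form produces a command in which the shard can find the client's writeConcern.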

When I fixed the bug and ran a test of passing an invalid writeConcern to commitTransaction through mongos, I found that the shard currently returns this error:

[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.948-0500 assert: command failed: {
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.948-0500 	"ok" : 0,
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.948-0500 	"errmsg" : "writeConcern is not allowed within a multi-statement transaction",
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.948-0500 	"code" : 72,
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.948-0500 	"codeName" : "InvalidOptions",
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.949-0500 	"operationTime" : Timestamp(1543938786, 21),
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.949-0500 	"$clusterTime" : {
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.949-0500 		"clusterTime" : Timestamp(1543938786, 27),
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.949-0500 		"signature" : {
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.949-0500 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.949-0500 			"keyId" : NumberLong(0)
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.949-0500 		}
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.949-0500 	}
[js_test:txn_basic_two_phase_commit] 2018-12-04T15:53:06.949-0500 }

This means that, as part of this ticket:

1) The shard needs to be made to accept writeConcern on coordinateCommitTransaction.

2) The coordinator shard needs to set its Client's last OpTime to the system last OpTime, to actually wait for writeConcern of the writes done in the background thread.
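Point 2 can be illustrated with a toy model. It assumes, as a simplification, that waiting for writeConcern means waiting until the requesting client's last OpTime has replicated. All class and function names are hypothetical, OpTimes are reduced to integers, and none of this is server code.

```python
class ToyReplicaSet:
    """Toy model of a primary; OpTimes are plain integers."""

    def __init__(self):
        self.system_last_optime = 0  # newest write applied on the primary
        self.replicated_optime = 0   # newest write acknowledged by secondaries

    def write(self):
        self.system_last_optime += 1
        return self.system_last_optime


class ToyClient:
    def __init__(self):
        self.last_optime = 0  # newest write performed on this client's behalf


def must_wait_for_write_concern(client, repl):
    """True if the client's last OpTime has not replicated yet."""
    return client.last_optime > repl.replicated_optime


repl = ToyReplicaSet()
request_client = ToyClient()

# The commit decision is written by a background thread, so the request's
# own client never records the write and its last OpTime stays stale.
repl.write()
waits_when_stale = must_wait_for_write_concern(request_client, repl)

# Advancing the client's last OpTime to the system's last OpTime makes the
# writeConcern wait cover the background write.
request_client.last_optime = repl.system_last_optime
waits_when_advanced = must_wait_for_write_concern(request_client, repl)

print(waits_when_stale, waits_when_advanced)
```

In this model the stale client has nothing to wait for, which is how a writeConcern wait could return without covering the write made in the background.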

Comment by Esha Maharishi (Inactive) [ 13/Nov/18 ]

Note: We might have already fixed this in SERVER-37882 by making the "drive coordinator" logic run inside a background thread that internally waits for majority writeConcern. The client's thread blocks until this background thread notifies it that the decision has been made. So, the client thread properly waits for the client's writeConcern as the client's request returns.

However, until SERVER-37364 ("coordinateCommit should return to client as soon as commit decision is persisted"), the client will end up waiting for the decision to be majority-committed anyway because the "drive coordinator" background thread will only signal the client thread after the decision has been majority-committed.
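The ordering described in these two paragraphs can be sketched with Python's threading module. This is an analogy only: the function names are hypothetical, and the persistence and majority-wait steps are simulated by appending to a list rather than by real replication.

```python
import threading

decision_ready = threading.Event()
events = []
result = {}


def drive_coordinator():
    # Background thread: persist the decision, wait for it to be
    # majority-committed (simulated), and only then signal the client thread.
    events.append("decision written")
    events.append("decision majority-committed")
    result["decision"] = "commit"
    decision_ready.set()


def handle_commit_request():
    # Client thread: start the background work, then block until notified.
    worker = threading.Thread(target=drive_coordinator)
    worker.start()
    decision_ready.wait(timeout=5)
    events.append("client thread unblocked")
    worker.join()
    return result["decision"]


outcome = handle_commit_request()
print(outcome, events)
```

Because the signal is sent only after the simulated majority commit, the client thread cannot return earlier, whatever writeConcern it would wait for afterwards.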

Comment by Esha Maharishi (Inactive) [ 13/Nov/18 ]

The fix is to make the coordinator shard respect the client's writeConcern; the router already forwards the client's writeConcern to the coordinator shard.

Comment by Shane Harvey [ 04/Oct/18 ]

Can you expand on why the writeConcern is not configurable? Are there plans to make it configurable in the future?

The main reason I'm concerned is that writeConcern is already supported by replica sets in 4.0 and is part of the driver API. So I think it would be confusing for writeConcern to be supported on a replica set but not on mongos. And what about wtimeout? Is that ignored as well?

Comment by Randolph Tan [ 04/Oct/18 ]

Note that writeConcern for mongos commitTransaction is currently not configurable (commit is internally done with w: majority). shane.harvey, would it be sufficient for your case that mongos validates that it should be majority if provided?
