[SERVER-66471] 5.0 mongos write hangs on PSA shard after second is shutdown Created: 16/May/22 Updated: 07/Nov/22 Resolved: 07/Nov/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | jing xu | Assignee: | Ali Mir |
| Resolution: | Won't Fix | Votes: | 1 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Operating System: | ALL |
| Sprint: | Repl 2022-11-14 |
| Participants: |
| Description |
|
Hello,
if [ (#arbiters > 0) AND (#non-arbiters <= majority(#voting-nodes)) ] else
To reproduce on a PSA replica set:
To reproduce using mongos on a single PSA shard, it does not work:
mongos> db.testWriteConcern.insert({_id:10,name:"xiaoxu"}, {w:1})
It never times out when no timeout is set.
anon@127.0.0.1:31002:PRIMARY:[db]test> rs.status(); |
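The rule quoted at the start of the description can be sketched as a small function. This is only a sketch, not server code: the two branch results ({ w: 1 } and { w: "majority" }) are taken from the MongoDB 5.0 documentation on the implicit default write concern rather than from this ticket, the function names are hypothetical, and every member is assumed to be a voting member.

```javascript
// Hypothetical helpers for illustration; not part of the server or the shell.

// Majority of the voting nodes: 1 + floor(n / 2).
function majority(votingNodes) {
  return Math.floor(votingNodes / 2) + 1;
}

// Assumption: every member votes, so #voting-nodes = arbiters + nonArbiters.
function implicitDefaultWriteConcern(arbiters, nonArbiters) {
  const votingNodes = arbiters + nonArbiters;
  if (arbiters > 0 && nonArbiters <= majority(votingNodes)) {
    return { w: 1 };
  }
  return { w: "majority" };
}

// PSA: 1 arbiter and 2 data-bearing members; majority(3) = 2, so 2 <= 2 holds.
console.log(implicitDefaultWriteConcern(1, 2));
// PSS: no arbiter, so the first condition fails.
console.log(implicitDefaultWriteConcern(0, 3));
```

Under these assumptions a PSA set evaluates to { w: 1 } and a PSS set to { w: "majority" }.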
| Comments |
| Comment by Ali Mir [ 07/Nov/22 ] | |||
|
Hey there 601290552@qq.com! Thanks for this ticket. I'm on the replication team here at MongoDB, and we worked on updating the default write concern to w: "majority" in 5.0. Please note that this bug around sharded clusters and PSA sets has been fixed in later versions of MongoDB. If you upgrade to 6.0, you will not see this issue. In later versions, if you attempt to start a sharded cluster with any shard that is a PSA set, you'll receive an error on startup. To avoid the error, you'll need to set a cluster-wide write concern (CWWC) via the setDefaultRWConcern command (as chris.kelly@mongodb.com mentioned). To get around this issue on 5.0.2, please follow the steps outlined by Chris. Namely, you should use setDefaultRWConcern to set a CWWC default of w:1 for the cluster. I'm going to close out this ticket, but feel free to reply with any additional questions. Thanks! | |||
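Setting the CWWC to w:1 can be sketched as follows. This assumes the setDefaultRWConcern admin command as documented for MongoDB 5.0 and a shell connected to mongos. The helper name buildDefaultWriteConcernCommand is hypothetical and only builds the command document, so it can be inspected before anything is sent to the cluster.

```javascript
// Hypothetical helper: builds the command document for setDefaultRWConcern.
// The command name must be the first key of the document.
function buildDefaultWriteConcernCommand(w) {
  return { setDefaultRWConcern: 1, defaultWriteConcern: { w: w } };
}

const cmd = buildDefaultWriteConcernCommand(1);
console.log(JSON.stringify(cmd));

// In a shell connected to mongos, the command would be sent with:
//   db.adminCommand(cmd)
// and the resulting defaults could be checked with:
//   db.adminCommand({ getDefaultRWConcern: 1 })
```

Building the document separately keeps the example runnable without a cluster; the two adminCommand calls in the comments are the part that changes and reads the cluster-wide default.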
| Comment by Chris Kelly [ 03/Jun/22 ] | |||
|
Hi Jing, Thank you for your report! I replicated your situation by creating a two-shard cluster with a primary, secondary, and arbiter in each shard. I then shut down the secondary on shard #2 and attempted your query on mongos.
If you run the query before shutting down the secondary on shard #2, it works. After the secondary on shard #2 is stopped, the query hangs. If you instead run the query on mongos specifying writeConcern: 1, it works.
Alternatively, if you run setDefaultRWConcern on mongos, you can set the default to 1 yourself so that inserts succeed again without specifying the write concern on each query.
I will follow up on whether the default writeConcern is supposed to be different on mongos, regardless of whether arbiters exist on the shards, but you can use this to remediate the hanging for now. You should also avoid using arbiters if at all possible. Regards.
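The per-query workaround can be sketched like this. The collection name and document come from the description; the helper name insertOptions and the 5000 ms wtimeout are illustrative assumptions. In the shell's insert helper the write concern is passed under a writeConcern key in the options document, and a wtimeout on a majority write is expected to make the write return a write concern error instead of waiting indefinitely.

```javascript
// Hypothetical helper: options document carrying an explicit write concern.
// wtimeoutMs is optional; when given, it is the wait limit in milliseconds.
function insertOptions(w, wtimeoutMs) {
  const writeConcern = { w: w };
  if (wtimeoutMs !== undefined) {
    writeConcern.wtimeout = wtimeoutMs;
  }
  return { writeConcern: writeConcern };
}

// Acknowledged by the primary only (the workaround described in the comment).
const ackByPrimary = insertOptions(1);
// Majority write with a bounded wait; 5000 ms is an arbitrary example value.
const boundedMajority = insertOptions("majority", 5000);

console.log(JSON.stringify(ackByPrimary));
console.log(JSON.stringify(boundedMajority));

// In a shell connected to mongos:
//   db.testWriteConcern.insert({ _id: 10, name: "xiaoxu" }, ackByPrimary)
```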
|