[SERVER-48600] RefineCollectionShardKey does not check for transaction write concern errors Created: 04/Jun/20 Updated: 29/Oct/23 Resolved: 24/Aug/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication, Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.7.0, 4.4.2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Judah Schvimer | Assignee: | Jack Mulrow |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | sharding-wfbf-day |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.4 |
| Sprint: | Sharding 2020-09-07 |
| Participants: |
| Description |
|
RefineCollectionShardKey checks for top-level errors here when committing its internal transaction but never checks for write concern errors. Thus the command can report success and its writes can then roll back. Here is an example of checking for both types of errors. |
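As a rough illustration of what checking for both types of errors means, here is a minimal Python sketch that operates on a command reply represented as a plain dictionary. It is not the server's C++ code: the `check_reply` function and `CommandError` class are hypothetical names. The only assumption about the reply is the standard shape in which a failed command has `ok: 0`, while a write concern failure is reported in a `writeConcernError` field that can accompany `ok: 1`.

```python
from typing import Any, Mapping


class CommandError(Exception):
    """Hypothetical error type for a reply that carries any kind of failure."""


def check_reply(reply: Mapping[str, Any]) -> None:
    """Raise if the reply has a top-level error or a write concern error."""
    # Top-level failure: the command itself did not succeed.
    if not reply.get("ok"):
        raise CommandError(
            f"command failed: {reply.get('errmsg', 'unknown error')}"
        )
    # The command succeeded, but its writes may not have met the requested
    # write concern, so they could still roll back.
    wce = reply.get("writeConcernError")
    if wce is not None:
        raise CommandError(
            f"write concern error: {wce.get('errmsg', 'unknown error')}"
        )
```

Checking only the first condition treats a reply such as `{"ok": 1, "writeConcernError": {...}}` as a success, which is the gap the description points at.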
| Comments |
| Comment by Githook User [ 17/Sep/20 ] |
|
Author: {'name': 'Jack Mulrow', 'email': 'jack.mulrow@mongodb.com', 'username': 'jsmulrow'} Message: (cherry picked from commit 574976df29afa984cae7e28c5772e71bb0906ec9) |
| Comment by Githook User [ 21/Aug/20 ] |
|
Author: {'name': 'Jack Mulrow', 'email': 'jack.mulrow@mongodb.com', 'username': 'jsmulrow'} Message: |
| Comment by Jack Mulrow [ 04/Jun/20 ] |
|
Actually, the internal transaction runs on a different operation context, from a different client, than the one servicing the _configsvrRefineCollectionShardKey command. The writes from the transaction therefore won't be considered when waitForWriteConcern() checks whether the command's last op increased to decide whether to wait for write concern after the command executes. This isn't a problem in practice: after the transaction completes, the command's original operation context is used to write to the config.changelog collection, which advances the client's last op and triggers waiting for majority write concern on the logging write, whose op time should be greater than the op time of the transaction. This is pretty fragile, though, so in addition to checking for write concern errors where the ticket description linked, we may want to set the last op on the command's client to the last op from the client used for the internal transaction after it commits. |
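The last-op reasoning in this comment can be sketched with a toy Python model. Everything here is hypothetical and simplified: `Client`, `should_wait_for_write_concern`, and `advance_last_op` are illustrative names rather than server APIs, and op times are plain integers, whereas real op times pair a timestamp with a term.

```python
from dataclasses import dataclass


@dataclass
class Client:
    """Hypothetical stand-in for a client that tracks its own last op time."""
    last_op: int = 0


def should_wait_for_write_concern(client: Client, last_op_before: int) -> bool:
    # Toy version of the check: wait only if this client's last op advanced
    # while the command was executing.
    return client.last_op > last_op_before


def advance_last_op(target: Client, source: Client) -> None:
    # Proposed hardening: carry the other client's last op forward, never
    # moving the target's last op backwards.
    target.last_op = max(target.last_op, source.last_op)


def demo() -> tuple:
    command_client = Client(last_op=10)
    txn_client = Client(last_op=10)
    before = command_client.last_op

    # The internal transaction commits on the other client.
    txn_client.last_op = 15
    without_fix = should_wait_for_write_concern(command_client, before)

    # After copying the last op over, the command's client would wait.
    advance_last_op(command_client, txn_client)
    with_fix = should_wait_for_write_concern(command_client, before)
    return without_fix, with_fix


print(demo())  # (False, True)
```

In this model, the write to config.changelog plays the same role as `advance_last_op`: it happens to move the command client's last op past the transaction's op time, which is why the current behavior works but depends on that later write.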
| Comment by Jack Mulrow [ 04/Jun/20 ] |
|
Mongos sends _configsvrRefineCollectionShardKey to the config server with majority write concern. So even though that command doesn't check for a write concern error when committing the refine shard key transaction, I don't think a refine can succeed if there was a write concern error, because waiting for write concern after the command finishes executing should fail. It's probably still worth fixing this, though, since the config server will trigger routing table refreshes on each shard with a chunk for the refined namespace after committing the internal transaction, which is wasted work if the transaction does roll back. |