[SERVER-53457] Handle multi-updates correctly in tenant migrations Created: 18/Dec/20 Updated: 17/Nov/23 Resolved: 23/Feb/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 5.0.0 |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Jason Zhang | Assignee: | Andrew Shuvalov (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | pm-1791_milestone-A |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Backwards Compatibility: | Fully Compatible |
| Sprint: | Sharding 2021-01-25, Sharding 2021-02-22, Sharding 2021-03-08 |
| Participants: |
| Linked BF Score: | 9 |
| Description |
|
If a tenant migration starts blocking writes partway through a multi:true update that matches multiple documents, the proxy retries the entire update, which can apply it a second time to documents that were already updated in the first attempt. |
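To make the failure mode concrete, here is a minimal in-memory sketch in Python. It does not use the server or any driver; `MigrationBlocked`, `multi_update_inc`, and `naive_proxy` are hypothetical names invented for illustration, and the update is assumed to be a non-idempotent increment.

```python
class MigrationBlocked(Exception):
    """Hypothetical stand-in for the error raised once a migration blocks writes."""


def multi_update_inc(docs, field, amount, block_after=None):
    """Increment `field` on every document, one write at a time.

    If `block_after` is set, raise MigrationBlocked after that many writes.
    Earlier writes stay applied, since a multi-update is not atomic across
    documents.
    """
    for written, doc in enumerate(docs):
        if block_after is not None and written == block_after:
            raise MigrationBlocked()
        doc[field] += amount


def naive_proxy(docs, field, amount):
    """Hypothetical proxy that retries the whole update after a blocking error."""
    try:
        # Assumption: blocking begins after two of the four writes.
        multi_update_inc(docs, field, amount, block_after=2)
    except MigrationBlocked:
        # Full retry: documents updated in the first attempt are updated again.
        multi_update_inc(docs, field, amount)


docs = [{"_id": i, "n": 0} for i in range(4)]
naive_proxy(docs, "n", 1)
print([d["n"] for d in docs])  # [2, 2, 1, 1]
```

The first two documents end up incremented twice, while the last two are incremented once, which is the double application described above.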
| Comments |
| Comment by Githook User [ 23/Feb/21 ] |
|
Author: {'name': 'Andrew Shuvalov', 'email': 'andrew.shuvalov@mongodb.com', 'username': 'shuvalov-mdb'} Message: |
| Comment by Githook User [ 20/Feb/21 ] |
|
Author: {'name': 'Andrew Shuvalov', 'email': 'andrew.shuvalov@mongodb.com', 'username': 'shuvalov-mdb'} Message: |
| Comment by Githook User [ 20/Feb/21 ] |
|
Author: {'name': 'Kaloian Manassiev', 'email': 'kaloian.manassiev@mongodb.com', 'username': 'kaloianm'} Message: Revert " This reverts commit c5e2fb25365a7ef113607eb59e0ec37e9c14412b. |
| Comment by Githook User [ 20/Feb/21 ] |
|
Author: {'name': 'Andrew Shuvalov', 'email': 'andrew.shuvalov@mongodb.com', 'username': 'shuvalov-mdb'} Message: |
| Comment by Jack Mulrow [ 22/Dec/20 ] |
|
One way to fix this is to somehow label these errors as not retryable, which matches the behavior for a multi:true update that is interrupted for a different reason, like a failover. This could be done by changing the error code to something other than TenantMigrationAborted / TenantMigrationCommitted. |
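The suggestion above could be sketched roughly as follows. This is only an illustration of the idea, not the change that was committed: the two retryable code names come from this ticket, while `HypotheticalNonRetryableMigrationError`, `code_to_report`, and `proxy_should_retry` are made-up names, and the proxy's retry rule is an assumption.

```python
# Error code names taken from the ticket; assumed to be what the proxy retries on.
PROXY_RETRYABLE = {"TenantMigrationAborted", "TenantMigrationCommitted"}

# Hypothetical replacement code name, invented for this sketch only.
NON_RETRYABLE_CODE = "HypotheticalNonRetryableMigrationError"


def code_to_report(code_name: str, is_multi_update: bool) -> str:
    """Pick the error code name an interrupted write should surface to the proxy."""
    if is_multi_update and code_name in PROXY_RETRYABLE:
        # Some documents may already be modified, so a full retry is unsafe.
        return NON_RETRYABLE_CODE
    return code_name


def proxy_should_retry(code_name: str) -> bool:
    """Hypothetical proxy rule: retry only on the migration codes above."""
    return code_name in PROXY_RETRYABLE
```

With this mapping, `proxy_should_retry(code_to_report("TenantMigrationAborted", True))` is false, so an interrupted multi:true update surfaces an error to the client instead of being retried, while single-document writes keep their retryable code.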