[SERVER-74473] Abort movePrimary operation on BSONObjectTooLarge error Created: 01/Mar/23 Updated: 27/Mar/23 Resolved: 27/Mar/23 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Antonio Fuschetto | Assignee: | Antonio Fuschetto |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | sharding-wfbf-day | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: | |
| Assigned Teams: | Sharding EMEA |
| Sprint: | Sharding EMEA 2023-03-20, Sharding EMEA 2023-04-03 | ||||||||
| Participants: | |||||||||
| Description |
|
The cloning phase of the movePrimary operation writes to the coordinator document the list of collections belonging to the database to be cloned. This information is serialized to a BSON object (i.e., the collectionsToClone field), and its size could potentially exceed the maximum limit. This would trigger a BSONObjectTooLarge error, which is considered retryable by the resilient cloning procedure of movePrimary. The goal of this ticket is to handle this error by causing the operation to fail. |
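The behavior requested above can be illustrated with a minimal Python sketch. It is not the server implementation: every name in it (`persist_collections_to_clone`, `run_clone_phase`, `TransientCloneError`, and the local `BSONObjectTooLarge` class) is hypothetical, the size check only sums the bytes of the collection names rather than performing real BSON encoding, and the 16 MiB limit is assumed to be the one that applies to the coordinator document.

```python
# Hypothetical sketch: none of these names come from the MongoDB code base.

MAX_BSON_SIZE = 16 * 1024 * 1024  # assumed 16 MiB BSON document limit


class TransientCloneError(Exception):
    """Stand-in for an error that is worth retrying (e.g. a network blip)."""


class BSONObjectTooLarge(Exception):
    """Stand-in for the error raised when a document exceeds the size limit."""


def persist_collections_to_clone(collections):
    """Pretend to write the collection list on the coordinator document.

    The size is a rough lower bound (name bytes only), not real BSON encoding.
    """
    approx_size = sum(len(name.encode("utf-8")) for name in collections)
    if approx_size > MAX_BSON_SIZE:
        raise BSONObjectTooLarge(f"collectionsToClone needs ~{approx_size} bytes")
    return approx_size


def run_clone_phase(step, max_attempts=3):
    """Retry `step` on transient errors, but abort at once on BSONObjectTooLarge."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return "done", attempts, step()
        except BSONObjectTooLarge:
            # Retrying cannot shrink the document, so fail the operation.
            return "aborted", attempts, None
        except TransientCloneError:
            if attempts >= max_attempts:
                raise
```

The point of the design is that the size error is deterministic: the same list produces the same oversized document on every attempt, so treating it as retryable would only loop until the retry budget runs out.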
| Comments |
| Comment by Antonio Fuschetto [ 27/Mar/23 ] |
|
At present, the cloning phase of the movePrimary command stores the list of collections actually cloned on the coordinator document. That list is then used in the cleaning phase, where the donor shard drops all the local collections (which have been moved to the recipient shard). However, in the context of the Online movePrimary project, the DonorService POS takes care of deleting the cloned collection (kAborted state). Refer to the Technical Design document, specifically to this comment. |