[SERVER-74473] Abort movePrimary operation on BSONObjectTooLarge error
Created: 01/Mar/23  Updated: 27/Mar/23  Resolved: 27/Mar/23

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Antonio Fuschetto
Assignee: Antonio Fuschetto
Resolution: Won't Fix
Votes: 0
Labels: sharding-wfbf-day
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-74185 Support for cleanup in Recoverable Sh... Closed
Assigned Teams:
Sharding EMEA
Sprint: Sharding EMEA 2023-03-20, Sharding EMEA 2023-04-03
Participants:

 Description   

The cloning phase of the movePrimary operation writes to the coordinator document the list of collections, belonging to the database, that are to be cloned. This information is serialized into a BSON object (i.e., the collectionsToClone field), and its size could potentially exceed the maximum BSON object size. This would trigger a BSONObjectTooLarge error, which the resilient cloning procedure of movePrimary considers retryable (see SERVER-74185).

The goal of this ticket is to handle this error by causing the operation to fail instead of being retried.
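The intended behavior can be illustrated with a minimal sketch. It is not the server's implementation (which is C++); the exception classes and the run_with_retries helper are hypothetical names introduced only for illustration. The sketch assumes that retrying cannot make the coordinator document smaller, so BSONObjectTooLarge is re-raised immediately while other, transient errors are retried up to a limit.

```python
class BSONObjectTooLarge(Exception):
    """Hypothetical stand-in for the server's BSONObjectTooLarge error."""


class TransientCloneError(Exception):
    """Hypothetical stand-in for an error that is safe to retry."""


def run_with_retries(step, max_attempts=3):
    """Run `step`, retrying transient failures but aborting on BSONObjectTooLarge."""
    for attempt in range(1, max_attempts + 1):
        try:
            return step()
        except BSONObjectTooLarge:
            # Assumption: a retry would build the same oversized document,
            # so the operation is failed right away.
            raise
        except TransientCloneError:
            if attempt == max_attempts:
                raise
            # Otherwise loop around and try the step again.
```

With this shape, a step that raises BSONObjectTooLarge is invoked exactly once, whereas a step that fails transiently is invoked again until it succeeds or max_attempts is reached.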



 Comments   
Comment by Antonio Fuschetto [ 27/Mar/23 ]

At present, the cloning phase of the movePrimary command stores the list of collections actually cloned in the coordinator document. That list is then used in the cleaning phase, where the donor shard drops its local copies of the collections that were moved to the recipient shard.

However, in the context of the Online movePrimary project, the DonorService POS takes care of deleting the cloned collection (kAborted state). Refer to the Technical Design document, specifically to this comment.
