[SERVER-58622] DDL coordinator handles write concern error incorrectly when removing coordinator document Created: 16/Jul/21  Updated: 29/Oct/23  Resolved: 13/Jan/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 5.3.0, 5.0.6, 5.2.1

Type: Bug Priority: Major - P3
Reporter: Marcos José Grillo Ramirez Assignee: Marcos José Grillo Ramirez
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File bf_21905_reproducible.patch     File reproducible.js    
Issue Links:
Backports
Depends
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v5.2, v5.0
Sprint: Sharding EMEA 2021-11-01, Sharding EMEA 2021-11-15, Sharding EMEA 2021-11-29, Sharding EMEA 2021-12-13, Sharding EMEA 2021-12-27, Sharding EMEA 2022-01-10, Sharding EMEA 2022-01-24
Participants:
Linked BF Score: 17

 Description   

If a write concern error occurs while removing the coordinator document, the primary-only service cleanup is not executed. This leaves the shard with a DDL coordinator service instance that always returns an error, even though the operation could be retried and the document removed successfully later on. This may be happening because the code does not account for the possibility that the _removeDocument function throws.

A reproducer is attached.
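
The failure mode described above can be illustrated with a small, language-neutral sketch. This is not the server code: the Coordinator class, WriteConcernError, and finish_buggy are hypothetical stand-ins that only model "removal throws, so the cleanup that follows never runs".

```python
class WriteConcernError(Exception):
    """Hypothetical stand-in for a write concern failure on the remove."""


class Coordinator:
    """Toy model of a coordinator with a persisted document (assumption)."""

    def __init__(self, failures):
        self.failures = failures        # how many removals fail before one succeeds
        self.document_present = True    # the coordinator document still exists
        self.cleaned_up = False         # whether the service cleanup has run

    def remove_document(self):
        # Models a removal that may throw, like the _removeDocument call in the ticket.
        if self.failures > 0:
            self.failures -= 1
            raise WriteConcernError("write concern not satisfied")
        self.document_present = False

    def cleanup(self):
        self.cleaned_up = True


def finish_buggy(coord):
    # If remove_document() raises, cleanup() is never reached, so the
    # instance stays registered in a permanently failed state.
    coord.remove_document()
    coord.cleanup()
```

With a single transient failure, finish_buggy raises, the document is still present, and the cleanup has not run, even though a second attempt at the removal would have succeeded.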



 Comments   
Comment by Githook User [ 20/Jan/22 ]

Author:

{'name': 'Marcos José Grillo Ramirez', 'email': 'marcos.grillo@mongodb.com', 'username': 'm4nti5'}

Message: SERVER-58622 Retry removing the DDL coordinator document after the DDL is finished unless there is a stepdown

(cherry picked from commit 5fa70b4e4d6b4252fd505ab12cea771b197d2cf0)
Branch: v5.2
https://github.com/mongodb/mongo/commit/87ad76bdf85dd77050041d78c6c6eb4c8e04e8f8

Comment by Githook User [ 17/Jan/22 ]

Author:

{'name': 'Marcos José Grillo Ramirez', 'email': 'marcos.grillo@mongodb.com', 'username': 'm4nti5'}

Message: SERVER-58622 Retry removing the DDL coordinator document after the DDL is finished unless there is a stepdown

(cherry picked from commit 5fa70b4e4d6b4252fd505ab12cea771b197d2cf0)
Branch: v5.0
https://github.com/mongodb/mongo/commit/c7db7debed3584e36f8439ee195a489d58412383

Comment by Githook User [ 13/Jan/22 ]

Author:

{'name': 'Marcos José Grillo Ramirez', 'email': 'marcos.grillo@mongodb.com', 'username': 'm4nti5'}

Message: SERVER-58622 Retry removing the DDL coordinator document after the DDL is finished unless there is a stepdown
Branch: master
https://github.com/mongodb/mongo/commit/5fa70b4e4d6b4252fd505ab12cea771b197d2cf0

Comment by Marcos José Grillo Ramirez [ 08/Nov/21 ]

After talking with tommaso.tocci, we could either retry the remove indefinitely or find a way to retry only the remove part. This should only happen if there are no step-downs.
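
A minimal sketch of the "retry the remove unless there is a step-down" idea, under stated assumptions. The names remove_with_retry, is_primary, on_removed, WriteConcernError, and SteppedDown are hypothetical and do not correspond to the server's API. The actual fix is in the linked commits.

```python
class WriteConcernError(Exception):
    """Hypothetical stand-in for a write concern failure on the remove."""


class SteppedDown(Exception):
    """Hypothetical signal that this node is no longer primary."""


def remove_with_retry(remove, is_primary, on_removed):
    """Retry `remove` until it succeeds, giving up only on a step-down.

    remove:     callable that deletes the coordinator document and may
                raise WriteConcernError (assumption)
    is_primary: callable returning False once the node has stepped down
    on_removed: cleanup to run once the document is gone
    Returns the number of removal attempts made.
    """
    attempts = 0
    while True:
        # Stop retrying as soon as a step-down is observed; the new
        # primary is expected to take over from the persisted document.
        if not is_primary():
            raise SteppedDown("node stepped down before the document was removed")
        attempts += 1
        try:
            remove()
        except WriteConcernError:
            continue  # treated as transient: try the removal again
        on_removed()  # runs only after a successful removal
        return attempts
```

The sketch retries immediately and without a limit to keep it short. A real implementation would presumably add backoff between attempts.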

Generated at Thu Feb 08 05:45:00 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.