[SERVER-63722] Rename collection participants get stuck upon errors different from stepdown/shutdown
Created: 16/Feb/22  Updated: 29/Oct/23  Resolved: 18/Feb/22

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 5.2.0, 5.3.0-rc0, 5.0.5
Fix Version/s: 6.0.0-rc0, 5.0.7, 5.3.0-rc2, 5.2.2

Type: Bug
Priority: Major - P3
Reporter: Pierlauro Sciarelli
Assignee: Pierlauro Sciarelli
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v5.3, v5.2, v5.0
Sprint: Sharding EMEA 2022-02-21
Participants:

 Description   

By design, rename collection participants were assumed to fail only on stepdown or shutdown errors. The rationale was that promises would be invalidated before the PrimaryOnlyService (POS) instances were released, and that new primaries would eventually resume those instances with a clean state.

However, some scenarios can produce recoverable errors that are unrelated to stepdown. In those cases the promises get invalidated but the POS participant instances are not released. As a consequence, every retry follows the same flow: get the existing POS instance, check its promises, and fail again.

This can happen, for example, when an index build runs concurrently on a collection being renamed: participants get stuck with a BackgroundOperationInProgressForNamespace error.
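
The retry loop described above can be illustrated with a small toy model. This is only a sketch and not the server code: `Promise`, `RenameParticipant`, `ParticipantService`, and `simulate` are hypothetical names, and the `release_on_error` switch is an assumption added to contrast the stuck behavior with one conceivable way of avoiding it. It is not claimed to be what the actual fix does.

```python
class Promise:
    """Hypothetical one-shot promise: once it holds an error, it stays failed."""

    def __init__(self):
        self.error = None

    def set_error(self, error):
        if self.error is None:
            self.error = error

    def check(self):
        if self.error is not None:
            raise RuntimeError(self.error)


class RenameParticipant:
    """Toy stand-in for a rename participant instance."""

    def __init__(self):
        self.promise = Promise()


class ParticipantService:
    """Toy registry standing in for a primary-only service (assumption)."""

    def __init__(self, release_on_error):
        self.release_on_error = release_on_error
        self.instances = {}

    def run_rename(self, key, local_rename):
        # Get the existing instance, or create one on the first attempt.
        participant = self.instances.setdefault(key, RenameParticipant())
        # Fails immediately if an earlier attempt invalidated the promise.
        participant.promise.check()
        try:
            local_rename()
        except RuntimeError as exc:
            participant.promise.set_error(str(exc))
            if self.release_on_error:
                del self.instances[key]
            raise
        del self.instances[key]  # success: release the instance
        return "renamed"


def simulate(release_on_error):
    """Two rename attempts; the index build ends after the first one."""
    service = ParticipantService(release_on_error)
    index_build_running = [True]

    def local_rename():
        if index_build_running[0]:
            raise RuntimeError("BackgroundOperationInProgressForNamespace")

    results = []
    for _ in range(2):
        try:
            results.append(service.run_rename("db.coll", local_rename))
        except RuntimeError as exc:
            results.append(str(exc))
        index_build_running[0] = False
    return results
```

In this model, `simulate(False)` fails on both attempts even though the index build has finished by the second one, because the retry finds the same instance with its invalidated promise. `simulate(True)` fails once and then succeeds.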

Workaround for users who hit this bug: trigger an election on every shard that has a stuck rename participant.
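
One way to trigger such an election is to ask the current primary of each affected shard to step down with the `replSetStepDown` command. The helper below is a minimal sketch under stated assumptions: `step_down_primaries` is a hypothetical name, the caller is assumed to know which shards are affected, and each client is assumed to be connected directly to that shard's primary (for example a `pymongo.MongoClient`, whose `admin.command` call is used here).

```python
def step_down_primaries(shard_primaries, step_down_secs=60):
    """Ask each given shard primary to step down, which triggers an election.

    shard_primaries maps a shard name to a client connected directly to
    that shard's current primary. Any object exposing
    client.admin.command(name, value) works, e.g. pymongo.MongoClient.
    """
    results = {}
    for shard, client in shard_primaries.items():
        try:
            # Sends {replSetStepDown: <seconds>} to the admin database.
            client.admin.command("replSetStepDown", step_down_secs)
            results[shard] = "stepped down"
        except Exception as exc:
            # Record the failure and continue with the remaining shards.
            results[shard] = f"error: {exc}"
    return results
```

A call might look like `step_down_primaries({"shard0": client0, "shard1": client1})`, where the shard names and clients are placeholders. The same effect can be obtained interactively by running `rs.stepDown()` in a shell connected to each affected primary.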



 Comments   
Comment by Githook User [ 24/Feb/22 ]

Author: Pierlauro Sciarelli (pierlauro) <pierlauro.sciarelli@mongodb.com>

Message: SERVER-63722 Rename collection participants resilient to errors different from stepdown/shutdown
Branch: v5.3
https://github.com/mongodb/mongo/commit/3080405bc62db70605ba63e90929efe3b1d5b052

Comment by Githook User [ 24/Feb/22 ]

Author: Pierlauro Sciarelli (pierlauro) <pierlauro.sciarelli@mongodb.com>

Message: SERVER-63722 Rename collection participants resilient to errors different from stepdown/shutdown
Branch: v5.2
https://github.com/mongodb/mongo/commit/deabbd397ee55c94c88cfbc3ab1f8edbbe76aa6e

Comment by Githook User [ 24/Feb/22 ]

Author: Pierlauro Sciarelli (pierlauro) <pierlauro.sciarelli@mongodb.com>

Message: SERVER-63722 Rename collection participants resilient to errors different from stepdown/shutdown
Branch: v5.0
https://github.com/mongodb/mongo/commit/e8fd6294503dfdab828fe5656607e88fb2628b0a

Comment by Githook User [ 18/Feb/22 ]

Author: Pierlauro Sciarelli (pierlauro) <pierlauro.sciarelli@mongodb.com>

Message: SERVER-63722 Rename collection participants resilient to errors different from stepdown/shutdown
Branch: master
https://github.com/mongodb/mongo/commit/fa7787c9455dd4fee5c7cbb54bec5c2847353d93
