[SERVER-37880] Add in backoff for retrying sending commit and abort messages
Created: 01/Nov/18  Updated: 29/Oct/23  Resolved: 16/Jan/19

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 4.1.7

Type: Bug
Priority: Major - P3
Reporter: Matthew Saltz (Inactive)
Assignee: Kaloian Manassiev
Resolution: Fixed
Votes: 0
Labels: ShardedTxn:DistributedCommit, todo_in_code, transaction-coordinator-management
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Duplicate
is duplicated by SERVER-38798 shard registry busy-loops in commitTr... Closed
is duplicated by SERVER-38444 Coordinating a transaction should run... Closed
is duplicated by SERVER-38795 Add test that if coordinator fails to... Closed
is duplicated by SERVER-38881 Use ThreadClient rather than onCreate... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: Sharding 2019-01-14, Sharding 2019-01-28
Participants:
Linked BF Score: 0

Description

Right now, if a commit, abort, or prepare message to a transaction participant fails, we retry immediately. It might be worth adding some form of backoff so that retries slow down while they keep failing. The same applies to targeting a shard in the coordinator code.
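
To illustrate the kind of backoff being proposed, here is a minimal sketch of a capped exponential backoff wrapped around a retry loop. Every name in it (`ExponentialBackoff`, `sendWithBackoff`, `sleepFn`) is hypothetical and made up for this example; it is not the server's `Backoff` class or the coordinator's actual retry code. It is also synchronous for simplicity, whereas the commits on this ticket mention a non-blocking `Backoff` and an `AsyncWorkScheduler`.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>

using Millis = std::chrono::milliseconds;

// Hypothetical helper (not the server's Backoff class): yields the delay to
// wait before each retry, doubling every time up to a fixed cap.
class ExponentialBackoff {
public:
    ExponentialBackoff(Millis initial, Millis max) : _next(initial), _max(max) {
        assert(initial.count() > 0 && max >= initial);
    }

    // Returns the delay for the upcoming retry and advances the schedule.
    Millis nextDelay() {
        Millis current = _next;
        _next = std::min(_next * 2, _max);
        return current;
    }

private:
    Millis _next;
    Millis _max;
};

// Calls `send` until it reports success or `maxAttempts` is exhausted.
// `sleepFn` is injected so that callers decide how to wait (a real sleep, a
// timer on an executor, or a no-op in tests). Returns true on success.
bool sendWithBackoff(const std::function<bool()>& send,
                     const std::function<void(Millis)>& sleepFn,
                     int maxAttempts,
                     ExponentialBackoff backoff) {
    assert(maxAttempts > 0);
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (send()) {
            return true;
        }
        // Wait only if another attempt will follow.
        if (attempt < maxAttempts) {
            sleepFn(backoff.nextDelay());
        }
    }
    return false;
}
```

With an initial delay of 100 ms and a cap of 500 ms, successive failures would wait 100, 200, 400, and then 500 ms. A production version would presumably also distinguish retryable from non-retryable errors rather than treating every failure the same way.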



Comments
Comment by Githook User [ 16/Jan/19 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-37880 Treat non-retryable errors during prepare as implied abort
Branch: master
https://github.com/mongodb/mongo/commit/fe187bdf22a6c67d5ff6f035d51a308870255e10

Comment by Githook User [ 16/Jan/19 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-37880 Introduce backoff for retrying commit and abort messages
Branch: master
https://github.com/mongodb/mongo/commit/cc2fba8d8dfae009369ac9084375c0fc513793d4

Comment by Githook User [ 15/Jan/19 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-37880 Implement an AsyncWorkScheduler without cancellation
Branch: master
https://github.com/mongodb/mongo/commit/b29905578c6a537a2e94c9c934601aff1c02fd9b

Comment by Githook User [ 14/Jan/19 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-37880 Make time_support's Backoff class non-blocking
Branch: master
https://github.com/mongodb/mongo/commit/0cb50f1023c394ba3e51e70a5daa2aea417cea5d

Comment by Githook User [ 10/Jan/19 ]

Author: Kaloian Manassiev (kaloianm) <kaloian.manassiev@mongodb.com>

Message: SERVER-37880 Merge the barrier functionality to be part of the unittests library
Branch: master
https://github.com/mongodb/mongo/commit/c1bc93bc4e903fd1ea5eb035023d5054c6c86497
