[SERVER-43060] CheckReplDBHashInBackground should retry the "dbHash" command on WriteConflicts
Created: 27/Aug/19  Updated: 29/Oct/23  Resolved: 09/Jul/20

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 4.4.0-rc14, 4.7.0

Type: Bug Priority: Major - P3
Reporter: Louis Williams Assignee: Louis Williams
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
is related to SERVER-44128 Wrap dbHash in writeConflictRetry loop Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4
Sprint: Execution Team 2020-07-13
Participants:
Linked BF Score: 18

 Description   

The "dbHash" command performs a long-running snapshot read with $_internalReadAtClusterTime, which tends to induce cache pressure.

When application threads are performing eviction and the WiredTiger cache gets stuck, WiredTiger will abort the oldest transaction in order to make progress. WiredTiger reports this as a WT_ROLLBACK error code, which MongoDB converts to a WriteConflictException.

For that reason, dbHash can be expected to hit WriteConflicts occasionally, and we should have logic to retry or ignore failures in this case.
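
As a rough illustration of the "retry" option, here is a minimal Python sketch. It is not the code from the actual fix: CommandError and retry_on_write_conflict are hypothetical names introduced only for this example, and the caller is assumed to supply a function that runs dbHash and raises an error carrying the server's error code. The one server detail it relies on is that WriteConflict is error code 112.

```python
import time

WRITE_CONFLICT_CODE = 112  # MongoDB's ErrorCodes.WriteConflict


class CommandError(Exception):
    """Hypothetical stand-in for a driver's command-failure exception."""

    def __init__(self, code, message=""):
        super().__init__(message)
        self.code = code


def retry_on_write_conflict(run_command, max_attempts=5, backoff_secs=0.0):
    """Call run_command() until it succeeds or max_attempts is exhausted.

    Only WriteConflict errors are retried; any other error propagates at once.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_command()
        except CommandError as err:
            # Give up on non-retryable errors or once the last attempt fails.
            if err.code != WRITE_CONFLICT_CODE or attempt == max_attempts:
                raise
            # Linear backoff gives cache eviction some time to catch up.
            time.sleep(backoff_secs * attempt)
```

Bounding the number of attempts keeps a persistently stuck cache from hiding behind an endless retry loop; the final WriteConflict is re-raised so the caller can still decide to ignore it.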



 Comments   
Comment by Githook User [ 17/Jul/20 ]

Author: Louis Williams <louis.williams@mongodb.com> (louiswilliams)

Message: SERVER-43060 CheckReplDBHashInBackground should retry the dbHash command on WriteConflicts in debug builds

(cherry picked from commit ab4d803f1f2f57cf9dbec89175f8eb52cb4761f2)
Branch: v4.4
https://github.com/mongodb/mongo/commit/97cd266f1b5df7f3af81d487e6dab2ca1060935a

Comment by Githook User [ 09/Jul/20 ]

Author: Louis Williams <louis.williams@mongodb.com> (louiswilliams)

Message: SERVER-43060 CheckReplDBHashInBackground should retry the dbHash command on WriteConflicts in debug builds
Branch: master
https://github.com/mongodb/mongo/commit/ab4d803f1f2f57cf9dbec89175f8eb52cb4761f2

Comment by Louis Williams [ 15/Jun/20 ]

Reassigning to Execution because we have the most context on how to fix this.

Comment by Brooke Miller [ 15/Jun/20 ]

louis.williams, is this something that the Execution team could do, since you have a better understanding of the best retry strategy for avoiding WriteConflict exceptions?
