[SERVER-47681] Background validation uses the kNoOverlap read source instead of kAllDurableSnapshot to prevent us from having to take the PBWM lock on secondaries
Created: 21/Apr/20  Updated: 29/Oct/23  Resolved: 25/Sep/20

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 4.8.0, 4.4.2

Type: Bug
Priority: Major - P3
Reporter: Gregory Wlodarek
Assignee: Eric Milkie
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
depends on WT-6503 Cache stuck: eviction couldn't able t... Closed
depends on WT-6349 Don't truncate history store updates ... Closed
depends on WT-6490 Acquire snapshot for eviction threads Closed
Duplicate
is duplicated by SERVER-50398 appliedThrough document write and bac... Closed
is duplicated by SERVER-49012 Background validation needs to use ig... Closed
is duplicated by SERVER-50353 Background validation should periodic... Closed
Problem/Incident
Related
related to SERVER-51302 Override read timestamp check for ref... Closed
is related to SERVER-50398 appliedThrough document write and bac... Closed
is related to WT-6767 Adding a new read timestamp config th... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v4.4
Sprint: Execution Team 2020-05-04, Execution Team 2020-07-13, Execution Team 2020-07-27, Execution Team 2020-08-24, Execution Team 2020-10-05
Participants:
Linked BF Score: 37

 Description   

If the PBWM lock is held during the entire background collection validation, then on secondaries it could stall replication, which is undesirable. After opening all of the collection and index cursors, we can release the PBWM lock.
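The lock scoping described above can be illustrated with a toy sketch. It is not MongoDB code: `FakePbwmLock`, `ScopedPbwm`, `Cursor`, and `validateInBackground` are hypothetical names invented for this illustration, and collections are modeled as plain vectors. It only shows the general idea of holding a lock while cursors are opened and releasing it before the long scan. Note that the ticket title describes a different approach, switching the read source to kNoOverlap so that the PBWM lock need not be taken at all.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the PBWM lock: it only counts holders so the
// sketch can observe when the lock is held. It is not MongoDB's lock manager.
class FakePbwmLock {
public:
    void acquire() { ++holders_; }
    void release() {
        assert(holders_ > 0);  // releasing an unheld lock is a logic error
        --holders_;
    }
    bool held() const { return holders_ > 0; }

private:
    int holders_ = 0;
};

// RAII guard: holds the lock for exactly the lifetime of the guard object.
class ScopedPbwm {
public:
    explicit ScopedPbwm(FakePbwmLock& lock) : lock_(lock) { lock_.acquire(); }
    ~ScopedPbwm() { lock_.release(); }
    ScopedPbwm(const ScopedPbwm&) = delete;
    ScopedPbwm& operator=(const ScopedPbwm&) = delete;

private:
    FakePbwmLock& lock_;
};

// Hypothetical cursor over one collection or index, modeled as a vector.
struct Cursor {
    const std::vector<int>* records;
    std::size_t position;
};

struct ValidationSummary {
    std::size_t recordsScanned;
    bool lockHeldDuringScan;
};

ValidationSummary validateInBackground(FakePbwmLock& pbwm,
                                       const std::vector<std::vector<int>>& tables) {
    std::vector<Cursor> cursors;
    {
        // Hold the lock only while the cursors are being opened.
        ScopedPbwm guard(pbwm);
        for (const auto& table : tables) {
            cursors.push_back(Cursor{&table, 0});
        }
    }  // guard destroyed here: the lock is released before the long scan

    ValidationSummary summary{0, false};
    for (auto& cursor : cursors) {
        while (cursor.position < cursor.records->size()) {
            // Record whether the scan ever ran with the lock held.
            summary.lockHeldDuringScan = summary.lockHeldDuringScan || pbwm.held();
            ++cursor.position;
            ++summary.recordsScanned;
        }
    }
    return summary;
}
```

Calling `validateInBackground` with a fresh `FakePbwmLock` and a few vectors returns the number of records scanned and reports whether the lock was ever held during the scan. In this toy model a writer that needs the lock exclusively (for example, oplog application on a secondary) would be blocked only for the short cursor-opening window.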



 Comments   
Comment by Githook User [ 29/Sep/20 ]

Author:

{'name': 'Eric Milkie', 'email': 'milkie@10gen.com', 'username': 'milkie'}

Message: SERVER-47681 do not ignore prepare conflicts on background validations SERVER-50586 Collection validation should append the collection's namespace to the output before any exceptions can be thrown

(cherry picked from commit 86f5e6928d6213a30dd85362eb5d157976483495)
Branch: v4.4
https://github.com/mongodb/mongo/commit/f31ce895f1475e1184bafbf74f3c2032b9f1801f

Comment by Githook User [ 25/Sep/20 ]

Author:

{'name': 'Eric Milkie', 'email': 'milkie@10gen.com', 'username': 'milkie'}

Message: SERVER-47681 do not ignore prepare conflicts on background validations
Branch: master
https://github.com/mongodb/mongo/commit/86f5e6928d6213a30dd85362eb5d157976483495

Comment by Githook User [ 12/May/20 ]

Author:

{'name': 'Gregory Wlodarek', 'email': 'gregory.wlodarek@mongodb.com', 'username': 'GWlodarek'}

Message: Revert "SERVER-47681 Background validation uses the kNoOverlap read source instead of kAllDurableSnapshot to prevent us from having to take the PBWM lock on secondaries"

This reverts commit c68a391f8f1ce0390ee019997625c06eb43aea6b.
Branch: v4.4
https://github.com/mongodb/mongo/commit/c48086529bef73468c7840fc4f12d6680b5243f3

Comment by Daniel Gottlieb (Inactive) [ 12/May/20 ]

Apologies for the double-notification.
We're reverting this change because letting secondaries run validation while processing oplog entries is causing a stall in eviction that's leading to a large number of uninteresting BFs. The cache pressure stall is a bug that's being investigated and will be patched, but there aren't enough resources to diagnose and prioritize a fix at the moment.

We intend to re-enable this for the 4.4 release (leaving the code in the reverted state is actually a bug).

cc pasette kelsey.schubert

Comment by Gregory Wlodarek [ 12/May/20 ]

I'm reverting this for now after discussing with daniel.gottlieb as this is causing too much BF noise. Once the related BF is closed, we should push these changes back into master and v4.4 again. I've marked this ticket with 4.4.0. Without this change, secondary nodes running validation in the background will stall replication.

 

(The backport is marked as released, I don't know how to reset its state. Don't be fooled!)

Comment by Githook User [ 24/Apr/20 ]

Author:

{'name': 'Gregory Wlodarek', 'email': 'gregory.wlodarek@mongodb.com', 'username': 'GWlodarek'}

Message: SERVER-47681 Background validation uses the kNoOverlap read source instead of kAllDurableSnapshot to prevent us from having to take the PBWM lock on secondaries

(cherry picked from commit 1f6db03c2b428a96215d407030fa7c1650456263)
Branch: v4.4
https://github.com/mongodb/mongo/commit/c68a391f8f1ce0390ee019997625c06eb43aea6b

Comment by Githook User [ 23/Apr/20 ]

Author:

{'name': 'Gregory Wlodarek', 'email': 'gregory.wlodarek@mongodb.com', 'username': 'GWlodarek'}

Message: SERVER-47681 Background validation uses the kNoOverlap read source instead of kAllDurableSnapshot to prevent us from having to take the PBWM lock on secondaries
Branch: master
https://github.com/mongodb/mongo/commit/1f6db03c2b428a96215d407030fa7c1650456263
