[SERVER-79955] Need a more complete mechanism for internal readers to avoid fassert when crossing member state PRIMARY to SECONDARY transition Created: 12/Aug/23  Updated: 29/Oct/23  Resolved: 18/Sep/23

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: 7.2.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Max Hirschhorn Assignee: Gregory Noma
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-80280 Consider introducing concept of drain... Closed
Related
is related to SERVER-59775 ReshardingDonorOplogIterator triggers... Closed
is related to SERVER-79802 Allow resharding donor oplog iterator... Closed
is related to SERVER-48452 Internal readers should default to re... Closed
Assigned Teams:
Storage Execution NAMER
Backwards Compatibility: Fully Compatible
Sprint: Execution NAMR Team 2023-09-18, Execution NAMR Team 2023-10-02
Participants:
Linked BF Score: 150

 Description   

The checkInvariantsForReadOptions() function out of an abundance of caution will fassert() unless a reader is known to be prepared to handle not reading from a monotonic view containing earlier writes. For external Clients (are associated with a network connection), the consistency model of the server allows non-snapshot readers to see earlier versions of documents and so switching from ReadSource::kNoTimestamp to ReadSource::kLastApplied is acceptable. However, at the same time, internal Clients (not associated with a network connection) which aren't running their operation using DBDirectClient return false from canReadAtLastApplied() because they aren't known to be prepared to switch to reading from a staler version of the data. This leads to those internal Client readers triggering the fassert() in db_raii.cpp.

With the advent of PrimaryOnlyServices, the codebase contains more internal Clients which aren't using DBDirectClient and are also running operations also not immediately halted on stepdown. These operations therefore are very likely to cross the member state PRIMARY to SECONDARY transition. Interruption for primary-only service Instances is delivered synchronously but quiescing is intentionally deferred until the node would step back up as primary. So far we have seen this come up in two places in resharding components yet it seems probable other PrimaryOnlyServices would be affected by a similar pattern of wanting to read data to write other data. Some PrimaryOnlyServices may have unknowingly avoided this problem by using AutoGetCollection directly to opt-out of lock-free reads and therefore continue to synchronize around the RSTL lock.

We should improve the behavior of checkInvariantsForReadOptions() such that we can be confident internal Client readers are getting the consistency level they depend on but also aren't requiring a stepdown to be triggered in testing to have successfully identified all components which can accept a relaxed consistency level.



 Comments   
Comment by Gregory Noma [ 18/Sep/23 ]

The change we ended up going with is having all internal readers (regardless of whether they are from a direct client or not) change their read source to last applied, aside from those who explicitly opt out by setting OperationContext::setEnforceConstraints to false.

Comment by Githook User [ 18/Sep/23 ]

Author:

{'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}

Message: SERVER-79955 Default internal secondary readers to last applied
Branch: master
https://github.com/mongodb/mongo/commit/7e2fcb1487144405d69750c42e606746178b67ea

Comment by Max Hirschhorn [ 12/Aug/23 ]

Internal operations use DBDirectClient as a loopback to perform local operations, and they expect the same level of consistency guarantees as any user operation. For that reason, DBDirectClient should be able to change the owning operation's ReadSource in order to serve consistent data.

I'm not sure I agree with the justification made for DBDirectClient in canReadAtLastApplied(). It seems entirely arbitrary whether C++ code run by internal Clients uses DBDirectClient or invokes ServiceEntryPoint::handleRequest() directly or accesses the data through CollectionPtr directly. We don't allow DBDirectClient to change the ReadConcernArgs of the OperationContext and so it is unclear to me why we permit it to change the ReadSource of the RecoveryUnit.

Generated at Thu Feb 08 06:42:20 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.