[SERVER-79360] Avoid accessing `OpDebug` from other threads Created: 26/Jul/23  Updated: 26/Jan/24  Resolved: 06/Sep/23

Status: Closed
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: 7.2.0-rc0, 7.0.2, 7.1.0-rc3

Type: Bug Priority: Major - P3
Reporter: Amirsaman Memaripour Assignee: Erwin Pe
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Assigned Teams:
Server Security
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.1, v7.0
Sprint: Security 2023-09-04, Security 2023-09-18
Participants:
Linked BF Score: 5

 Description   

OpDebug is a non-synchronized structure, designed to hold debug information for an operation. This structure is not intended to be visible to any thread other than the one that is running the operation:

/**
 * Returns a structure containing data used for profiling, accessed only by a thread
 * currently executing the operation context associated with this CurOp.
 */
OpDebug& debug() {
    return _debug;
}

SERVER-74962 and SERVER-75265 introduced a non-thread-safe usage of OpDebug by having the operation thread write a state that is later accessed/read by any thread that calls into CurOp::reportState:

StatusWith<CursorResponse> ClusterFind::runGetMore(OperationContext* opCtx, ...) {
    ...
    {
        CurOp::get(opCtx)->debug().nShards = pinnedCursor.getValue()->getNumRemotes();
        CurOp::get(opCtx)->debug().cursorid = cursorId;
        CurOp::get(opCtx)->debug().shouldOmitDiagnosticInformation =
            pinnedCursor.getValue()->shouldOmitDiagnosticInformation();
        stdx::lock_guard<Client> lk(*opCtx->getClient());
        CurOp::get(opCtx)->setOriginatingCommand_inlock(
            pinnedCursor.getValue()->getOriginatingCommand());
        CurOp::get(opCtx)->setGenericCursor_inlock(pinnedCursor.getValue().toGenericCursor());
    }
    ...
}
 
void CurOp::reportState(BSONObjBuilder* builder, ...) {
    ...
    bool omitAndRedactInformation = CurOp::get(opCtx)->debug().shouldOmitDiagnosticInformation;
    builder->append("redacted", omitAndRedactInformation);
    ...
}

The above is identified by TSAN as a read-after-write data race. A possible fix is to move any shared data into CurOp, currently protected by the Client lock.



 Comments   
Comment by Githook User [ 13/Sep/23 ]

Author:

{'name': 'Erwin Pe', 'email': 'erwin.pe@mongodb.com', 'username': 'erwee'}

Message: SERVER-79360 Fix data race on OpDebug::shouldOmitDiagnosticInformation

(cherry picked from commit 348baacb41655fd083aec822c1fbbdad23f8c6b0)
Branch: v7.0
https://github.com/mongodb/mongo/commit/483ea821302ca73ad5fc67635fe71d9473fa4d2d

Comment by Githook User [ 13/Sep/23 ]

Author:

{'name': 'Erwin Pe', 'email': 'erwin.pe@mongodb.com', 'username': 'erwee'}

Message: SERVER-79360 Fix data race on OpDebug::shouldOmitDiagnosticInformation

(cherry picked from commit 348baacb41655fd083aec822c1fbbdad23f8c6b0)
Branch: v7.1
https://github.com/mongodb/mongo/commit/cb663cb4acea834697251b263c9a675ed5260f82

Comment by Githook User [ 06/Sep/23 ]

Author:

{'name': 'Erwin Pe', 'email': 'erwin.pe@mongodb.com', 'username': 'erwee'}

Message: SERVER-79360 Fix data race on OpDebug::shouldOmitDiagnosticInformation
Branch: master
https://github.com/mongodb/mongo/commit/348baacb41655fd083aec822c1fbbdad23f8c6b0

Generated at Thu Feb 08 06:40:46 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.