[SERVER-43672] Invariant failure session->cursorOut() in wiredtiger_session_cache.cpp
Created: 27/Sep/19  Updated: 08/Jan/24  Resolved: 17/Dec/19

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.6.7
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Jiri Sula
Assignee: Backlog - Query Team (Inactive)
Resolution: Incomplete
Votes: 0
Labels: qexec-team
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Mongo 3.6.7, git version: 2628472127e9f1826e02c665c1d93880a204075e
Debian Stretch x86_64, 4.4.112 kernel, xfs file system


Attachments: backtrace.txt, errors.txt, mongod.27023.log-20190926.bz2
Issue Links:
Related
is related to SERVER-43674 mongoDB crash with "Got signal: 11 (S... Closed
Assigned Teams:
Query
Operating System: ALL
Sprint: Query 2019-11-18, Query 2019-12-02, Query 2019-12-16
Participants:

 Description   

Good day!

We experienced a server crash. Attaching an error log fragment and a backtrace.

Thank you for any activity tracing this!



 Comments   
Comment by Craig Homa [ 17/Dec/19 ]

Closing as incomplete as the team has spent a few days investigating without progress and would need more information to pursue further.

Comment by Martin Neupauer [ 18/Nov/19 ]

A cursory check of the source code did not find anything.
Given that the code uses RAII, this is most likely some kind of race between a thrown exception and cleanup from another thread. It will be challenging to catch.

Comment by David Storch [ 31/Oct/19 ]

martin.neupauer, could you investigate the cause of this invariant failure?

Comment by Maria van Keulen [ 24/Oct/19 ]

david.storch, this looks to be a misuse of cursors, so assigning to Query.

Comment by Louis Williams [ 08/Oct/19 ]

The invariant says: a session was released into the MongoDB session cache but there was at least one WT cursor open on that session. This indicates there was a missed call to releaseCursor. This call is part of the destructor for our wrapper, WiredTigerCursor. We should explore where there are other callers of WiredTigerSession::getCursor that may not properly release their cursors.
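
To illustrate the pattern described above, here is a minimal sketch, not the actual MongoDB code. The names Session, ScopedCursor, and safeToCache are hypothetical stand-ins for WiredTigerSession, WiredTigerCursor, and the check made when a session is released into the cache. It assumes the session simply counts the cursors it has handed out and that the wrapper's destructor is the only place the count is decremented.

```cpp
#include <cassert>

// Hypothetical stand-in for a storage-engine session that tracks
// how many cursors it has handed out and not yet received back.
class Session {
public:
    int getCursor() {
        ++_cursorsOut;
        return _nextId++;
    }
    void releaseCursor(int /*cursorId*/) {
        --_cursorsOut;
    }
    int cursorsOut() const {
        return _cursorsOut;
    }

private:
    int _cursorsOut = 0;
    int _nextId = 1;
};

// Hypothetical RAII wrapper: acquires a cursor in the constructor and
// releases it in the destructor, so a scope exit (including one caused
// by an exception) always returns the cursor to the session.
class ScopedCursor {
public:
    explicit ScopedCursor(Session& session)
        : _session(session), _id(session.getCursor()) {}
    ~ScopedCursor() {
        _session.releaseCursor(_id);
    }
    ScopedCursor(const ScopedCursor&) = delete;
    ScopedCursor& operator=(const ScopedCursor&) = delete;

private:
    Session& _session;
    int _id;
};

// Hypothetical form of the condition the invariant enforces: a session
// may only go back into the cache when no cursors are open on it.
bool safeToCache(const Session& session) {
    return session.cursorsOut() == 0;
}
```

In this sketch, a caller that goes through ScopedCursor leaves the session safe to cache once the wrapper goes out of scope. A caller that invokes getCursor directly and never calls releaseCursor leaves cursorsOut() above zero, which is the state the invariant would reject.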

Comment by Vaclav Bilek [ 03/Oct/19 ]

It's on different physical HW; these are primary replicas of a sharded cluster.

Comment by Danny Hatcher (Inactive) [ 30/Sep/19 ]

jiri.sula@livesport.eu, is this the same mongod that crashed in SERVER-43674 or are they different? Is this a different process on the same underlying hardware? Please upload the full mongod logs from start (or at least a few days before) up through the crash for each ticket so we can take a closer look.

Generated at Thu Feb 08 05:03:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.