[SERVER-59350] Invariant failure exception Created: 16/Aug/21  Updated: 24/Sep/21  Resolved: 24/Sep/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Basil Markov
Assignee: Eric Sedor
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-53566 Investigate and reproduce "opCtx != n... Closed
Operating System: ALL
Participants:

 Description   

We run a 10-shard cluster in our production environment. Each shard is a 3-member MongoDB replica set (primary-secondary-secondary), for 30 physical MongoDB hosts in total, plus 3 config server hosts (one replica set) and 3 mongos instances.

All of the MongoDB services run version 4.4.4.

On 15.08 we found that half of our 30 MongoDB services had shut down with an unexpected exception:

{"t":{"$date":"2021-08-15T10:21:01.620+03:00"},"s":"F",  "c":"-",        "id":23079,   "ctx":"waitForMajority","msg":"Invariant failure","attr":{"expr":"opCtx != nullptr && _opCtx == nullptr","file":"src/mongo/db/client.cpp","line":126}}
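
For checking which hosts hit the same failure, a minimal sketch along these lines could scan a log for entries like the one above. It assumes one JSON document per line and matches on the "msg" text from that single entry; the function name and the returned field names are hypothetical, not part of any MongoDB tooling.

```python
import json

# Message text taken from the log entry in this ticket; matching on it
# (rather than on the numeric id) is an assumption of this sketch.
INVARIANT_MSG = "Invariant failure"


def find_invariant_failures(lines):
    """Return a summary dict for every invariant-failure entry in `lines`."""
    hits = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict) or entry.get("msg") != INVARIANT_MSG:
            continue
        attr = entry.get("attr", {})
        hits.append({
            "time": entry.get("t", {}).get("$date"),
            "ctx": entry.get("ctx"),
            "expr": attr.get("expr"),
            "file": attr.get("file"),
            "line": attr.get("line"),
        })
    return hits


# The entry from this ticket, used as sample input.
sample = (
    '{"t":{"$date":"2021-08-15T10:21:01.620+03:00"},"s":"F","c":"-",'
    '"id":23079,"ctx":"waitForMajority","msg":"Invariant failure",'
    '"attr":{"expr":"opCtx != nullptr && _opCtx == nullptr",'
    '"file":"src/mongo/db/client.cpp","line":126}}'
)

for hit in find_invariant_failures([sample, "not a json line"]):
    print(hit["time"], hit["ctx"], hit["file"], hit["line"])
```

To run it against a real log, pass an open file object instead of the sample list; lines that fail to parse are skipped rather than raising.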

I investigated and found a similar case: https://jira.mongodb.org/browse/SERVER-52735

I also found an official recommendation against using MongoDB 4.4.5 in https://docs.mongodb.com/manual/release-notes/4.4-changelog/:

MongoDB version 4.4.5 is not recommended for production use due to a critical issue, WT-7426. The issue is fixed in version 4.4.6.

https://jira.mongodb.org/browse/WT-7426, in turn, has some similar cases linked.

So what is your final recommendation for avoiding these invariant failures?
Which version of MongoDB should we run in production right now?

If you would like to see additional logs, diagnostic data, or the shard configuration, I can upload them for your investigation.



 Comments   
Comment by Eric Sedor [ 24/Sep/21 ]

Thanks Basil; please write in again if you see other issues.

While I am replying, I thought I'd suggest moving from 4.4.8 to 4.4.9 if you haven't already. 4.4.9 has critical fixes for issues reported at https://www.mongodb.com/alerts.

Comment by Basil Markov [ 16/Aug/21 ]

Thanks for your reply; we will carry out all of the MongoDB upgrades.

Comment by Eric Sedor [ 16/Aug/21 ]

Hi haltandcatchfire91@gmail.com, I believe you have correctly identified the set of related tickets around the opCtx issue. SERVER-53566 tracks the fix in 4.4.5.

Our recommendation is to move to MongoDB 4.4.8, the latest release in the 4.4 series.
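
As a rough illustration of applying this across many hosts, the sketch below classifies a reported version string against the two facts stated in this ticket: 4.4.5 is not recommended (WT-7426), and 4.4.8 is the recommended target. The function names and the result labels are hypothetical, and the sketch assumes plain "X.Y.Z" version strings.

```python
# Versions flagged as not recommended in the 4.4 changelog note quoted above.
NOT_RECOMMENDED = {(4, 4, 5)}
# Target release named in this comment.
RECOMMENDED_MINIMUM = (4, 4, 8)


def parse_version(text):
    """Turn 'X.Y.Z' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in text.strip().split(".")[:3])


def assess(version_text):
    """Classify a version as 'not recommended', 'upgrade', or 'ok'."""
    version = parse_version(version_text)
    if version in NOT_RECOMMENDED:
        return "not recommended"
    if version < RECOMMENDED_MINIMUM:
        return "upgrade"
    return "ok"


# Hypothetical input: version strings as reported by each host.
for reported in ["4.4.4", "4.4.5", "4.4.8"]:
    print(reported, assess(reported))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as "4.4.10" sorting before "4.4.8".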

Generated at Thu Feb 08 05:47:01 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.