[SERVER-38101] Secondary Crashing with "aborting after fassert() failure" error Created: 13/Nov/18  Updated: 16/Nov/21  Resolved: 13/Nov/18

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Sree Himakunthala Assignee: Kelsey Schubert
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates SERVER-36332 CursorNotFound error in GetMore on a ... Closed
Operating System: ALL
Steps To Reproduce:

N/A

Participants:

 Description   

I upgraded this test MongoDB replica set from 3.4.x to 3.6.7 about two weeks ago.

 

Today, a secondary mongod crashed with an "aborting after fassert() failure" error.

Mongod Log extracts:

 

2018-11-12T17:16:13.664-0800 I NETWORK [thread28] Successfully connected to 10.10.190.6:27017 (49250 connections now open to xxx with a 0 second timeout)
2018-11-12T17:16:13.664-0800 I NETWORK [thread28] scoped connection to xxxx not being returned to the pool
2018-11-12T17:16:41.838-0800 I NETWORK [listener] connection accepted from xxx #49461 (15 connections now open)
2018-11-12T17:16:41.939-0800 I NETWORK [conn49461] received client metadata from xxxx:42908 conn49461: { driver: { name: "mongo-csharp-driver", version: "2.7.0.0" }, os: { type: "Linux", name: "Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC 2016", architecture: "x86_64", version: "4.2.0-42-generic" }, platform: ".NET Core 4.6.26328.01" }
2018-11-12T17:17:09.062-0800 F REPL [repl writer worker 13] writer worker caught exception: NamespaceNotFound: Failed to apply operation due to missing collection (6e46dffd-9a3c-496c-81b9-3a7cfa4f1ac6): { ts: Timestamp(1542071829, 1), t: 27, h: -7111336552031876369, v: 2, op: "i", ns: "config.system.sessions", ui: UUID("6e46dffd-9a3c-496c-81b9-3a7cfa4f1ac6"), wall: new Date(1542071829042), o: { _id: { id: UUID("85cc8499-8b2c-4b4f-a871-5b4fc8358968"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, lastUse: new Date(1542071829043) } } on: { ts: Timestamp(1542071829, 1), t: 27, h: -7111336552031876369, v: 2, op: "i", ns: "config.system.sessions", ui: UUID("6e46dffd-9a3c-496c-81b9-3a7cfa4f1ac6"), wall: new Date(1542071829042), o: { _id: { id: UUID("85cc8499-8b2c-4b4f-a871-5b4fc8358968"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, lastUse: new Date(1542071829043) } }
2018-11-12T17:17:09.062-0800 F - [repl writer worker 13] Fatal assertion 16359 NamespaceNotFound: Failed to apply operation due to missing collection (6e46dffd-9a3c-496c-81b9-3a7cfa4f1ac6): { ts: Timestamp(1542071829, 1), t: 27, h: -7111336552031876369, v: 2, op: "i", ns: "config.system.sessions", ui: UUID("6e46dffd-9a3c-496c-81b9-3a7cfa4f1ac6"), wall: new Date(1542071829042), o: { _id: { id: UUID("85cc8499-8b2c-4b4f-a871-5b4fc8358968"), uid: BinData(0, E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855) }, lastUse: new Date(1542071829043) } } at src/mongo/db/repl/sync_tail.cpp 1209
2018-11-12T17:17:09.062-0800 F - [repl writer worker 13]

***aborting after fassert() failure
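
For triage, the fields that matter in the fatal line (assertion number, error name, operation type, namespace, and collection UUID) can be pulled out with a short script. The following is a minimal sketch and not part of the original log output: the function name `parse_fassert` and the regular expression are illustrative assumptions based only on the line layout shown in the extract above, and they may not match other mongod versions.

```python
import re

# Assumed layout, taken from the fatal line in this ticket:
#   ... Fatal assertion <code> <ErrorName>: ... op: "<op>", ns: "<ns>", ui: UUID("<uuid>") ...
FASSERT_RE = re.compile(
    r'Fatal assertion (?P<code>\d+) (?P<error>\w+): .*?'
    r'op: "(?P<op>\w+)", ns: "(?P<ns>[^"]+)", ui: UUID\("(?P<uuid>[0-9a-f-]+)"\)'
)


def parse_fassert(line):
    """Return a dict of the key fields from a fatal assertion line, or None if it does not match."""
    match = FASSERT_RE.search(line)
    return match.groupdict() if match else None


# The fatal line from this ticket, cut off after the "ui" field for brevity.
SAMPLE = (
    '2018-11-12T17:17:09.062-0800 F - [repl writer worker 13] Fatal assertion 16359 '
    'NamespaceNotFound: Failed to apply operation due to missing collection '
    '(6e46dffd-9a3c-496c-81b9-3a7cfa4f1ac6): { ts: Timestamp(1542071829, 1), t: 27, '
    'h: -7111336552031876369, v: 2, op: "i", ns: "config.system.sessions", '
    'ui: UUID("6e46dffd-9a3c-496c-81b9-3a7cfa4f1ac6")'
)

if __name__ == "__main__":
    print(parse_fassert(SAMPLE))
```

Here the parsed fields show that the failing operation is an insert (`op: "i"`) into `config.system.sessions`, addressed by a collection UUID that the secondary does not have.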

 

I tried restarting mongod, and it crashes again with the same error.

Also, after the upgrade, I set the featureCompatibilityVersion to 3.6:

{ "_id" : "featureCompatibilityVersion", "version" : "3.6" }
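
The same value can also be read from a running node with the `getParameter` admin command. The snippet below is a hedged sketch rather than anything taken from this report: `read_fcv` is a hypothetical helper, and it assumes a PyMongo-style client object whose `admin.command(...)` method forwards the command to the server. Because the reply shape can differ between server versions (a sub-document with a `version` field, or a plain string), the helper accepts both.

```python
def read_fcv(client):
    """Return the featureCompatibilityVersion reported by the server, as a string.

    `client` is assumed to behave like pymongo.MongoClient, i.e. it exposes
    client.admin.command(name, value, **kwargs).
    """
    reply = client.admin.command("getParameter", 1, featureCompatibilityVersion=1)
    value = reply["featureCompatibilityVersion"]
    # Some server versions reply with a sub-document such as {"version": "3.6"},
    # others with a plain string; handle both shapes.
    if isinstance(value, dict):
        return value.get("version")
    return value
```

With PyMongo installed, this could be called as `read_fcv(pymongo.MongoClient(uri))`, where `uri` is a placeholder for the connection string of the node being checked.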

 Comments   
Comment by Kelsey Schubert [ 13/Nov/18 ]

Hi sreek95051,

Thanks for your report. Would you please upgrade to the latest version of MongoDB 3.6 to take advantage of SERVER-36332? This work significantly improves how the config.system.sessions collection behaves on replicated nodes. If the issue persists after upgrading, I would suggest performing an initial sync to guarantee that the secondary is back in a consistent state. Please let us know if you encounter any similar issues after performing these steps.

Thank you,
Kelsey
