[SERVER-10826] Assert in the secondary Created: 19/Sep/13  Updated: 27/Sep/13  Resolved: 27/Sep/13

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Dharshan Rangegowda Assignee: Joanna Cheng
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

AWS, M1.Large replica set


Attachments: Text File mongodassert.log    
Issue Links:
Related
related to DOCS-1986 Elaborate on case sensitivity in data... Closed
Operating System: ALL
Participants:

 Description   

Assert on the secondary of a brand new cluster. The user claims to have typed only "use <dbname>", which caused an assert on the secondary. The full log is attached.

Tue Sep 17 19:36:19.657 [conn10402] authenticate db: local { authenticate: 1, nonce: "420a32d3262a1deb", user: "__system", key: "332123ac38f870a9457839370b92c152" }

Tue Sep 17 19:36:27.128 [conn10401] end connection 10.190.206.142:39617 (7 connections now open)
Tue Sep 17 19:36:27.133 [initandlisten] connection accepted from 10.190.206.142:39620 #10403 (8 connections now open)
Tue Sep 17 19:36:27.135 [conn10403] authenticate db: local { authenticate: 1, nonce: "1a53ac883fe2352d", user: "__system", key: "31e02825a3634003c304d7266e7a54ed" }

Tue Sep 17 19:36:29.083 [repl prefetch worker] warning database /mongodb_data edspringPROD could not be opened
Tue Sep 17 19:36:29.083 [repl prefetch worker] DBException 13297: db already exists with different case other: [edSpringPROD] me [edspringPROD]
Tue Sep 17 19:36:29.083 [repl writer worker 1] warning database /mongodb_data edspringPROD could not be opened
Tue Sep 17 19:36:29.083 [repl writer worker 1] DBException 13297: db already exists with different case other: [edSpringPROD] me [edspringPROD]
Tue Sep 17 19:36:29.083 [repl writer worker 1] ERROR: writer worker caught exception: db already exists with different case other: [edSpringPROD] me [edspringPROD] on: { ts: Timestamp 1379446589000|1, h: -4967349403741369875, v: 2, op: "c", ns: "edspringPROD.$cmd", o: { dropDatabase: 1.0 } }
Tue Sep 17 19:36:29.083 [repl writer worker 1] Fatal Assertion 16360
0xdc7f71 0xd87cf3 0xc1b29c 0xd95821 0xe10879 0x7f4004897851 0x7f4003c3a11d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdc7f71]
/usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xa3) [0xd87cf3]
/usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc1b29c]
/usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xd95821]
/usr/bin/mongod() [0xe10879]
/lib64/libpthread.so.0(+0x7851) [0x7f4004897851]
/lib64/libc.so.6(clone+0x6d) [0x7f4003c3a11d]
Tue Sep 17 19:36:29.086 [repl writer worker 1]

***aborting after fassert() failure

Tue Sep 17 19:36:29.087 Got signal: 6 (Aborted).

Tue Sep 17 19:36:29.090 Backtrace:
0xdc7f71 0x6ce459 0x7f4003b84920 0x7f4003b848a5 0x7f4003b86085 0xd87d2e 0xc1b29c 0xd95821 0xe10879 0x7f4004897851 0x7f4003c3a11d
/usr/bin/mongod(_ZN5mongo15printStackTraceERSo+0x21) [0xdc7f71]
/usr/bin/mongod(_ZN5mongo10abruptQuitEi+0x399) [0x6ce459]
/lib64/libc.so.6(+0x32920) [0x7f4003b84920]
/lib64/libc.so.6(gsignal+0x35) [0x7f4003b848a5]
/lib64/libc.so.6(abort+0x175) [0x7f4003b86085]
/usr/bin/mongod(_ZN5mongo13fassertFailedEi+0xde) [0xd87d2e]
/usr/bin/mongod(_ZN5mongo7replset14multiSyncApplyERKSt6vectorINS_7BSONObjESaIS2_EEPNS0_8SyncTailE+0x12c) [0xc1b29c]
/usr/bin/mongod(_ZN5mongo10threadpool6Worker4loopEv+0x281) [0xd95821]
/usr/bin/mongod() [0xe10879]
/lib64/libpthread.so.0(+0x7851) [0x7f4004897851]
/lib64/libc.so.6(clone+0x6d) [0x7f4003c3a11d]



 Comments   
Comment by Joanna Cheng [ 27/Sep/13 ]

Hi Dharshan,

I'm going to close this ticket as I've not heard back from you, and I believe MongoDB is working as designed here.

Thanks for your feedback, and have a great weekend!

Kind regards,
Joanna

Comment by Joanna Cheng [ 24/Sep/13 ]

Hi Dharshan,

Thanks for the reply and details.

From mongod's perspective, the primary and secondary are now in inconsistent states (specifically, with respect to database names). At this stage the secondary cannot continue applying the oplog, because it is no longer consistent with the primary. To preserve the integrity of the replica set, the inconsistent secondary is taken down, as it cannot keep up with the rest of the replica set.
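
The stop-at-first-failure behavior can be pictured with a toy model. This is a minimal sketch under stated assumptions, not mongod's implementation: `apply_oplog`, `CaseConflictError`, and the dict-based oplog entries are hypothetical names chosen for illustration. The only rule modeled is the case-conflict check reported in the log, and state changes made by the operations themselves are not modeled.

```python
class CaseConflictError(Exception):
    """Hypothetical stand-in for DBException 13297 in this toy model."""


def apply_oplog(existing_dbs, entries):
    """Walk oplog entries in order and stop at the first case conflict.

    existing_dbs: set of database names already present on the secondary.
    entries: list of dicts with an "ns" key such as "edspringPROD.$cmd".
    Returns the number of entries processed if none conflicts;
    raises CaseConflictError at the first entry that does.
    """
    applied = 0
    for entry in entries:
        # The database name is the part of the namespace before the first dot.
        db_name = entry["ns"].split(".", 1)[0]
        for other in existing_dbs:
            # Same name when case is ignored, but not identical as written.
            if other != db_name and other.lower() == db_name.lower():
                raise CaseConflictError(
                    "db already exists with different case "
                    f"other: [{other}] me [{db_name}]"
                )
        applied += 1
    return applied
```

In this model, nothing after the conflicting entry is processed, which mirrors the point that the secondary cannot skip an operation and still be consistent with the primary.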

Kind regards,
Joanna

Comment by Dharshan Rangegowda [ 23/Sep/13 ]

Hi - the user was experimenting in the console with the "use dbname" command, so they might have created two dbs and then deleted them. I am not sure. However, it would be good if this could be handled without a crash.

Comment by Joanna Cheng [ 23/Sep/13 ]

Hi Dharshan,

Looking at the log snippet you provided:

Tue Sep 17 19:36:29.083 [repl prefetch worker] warning database /mongodb_data edspringPROD could not be opened
Tue Sep 17 19:36:29.083 [repl prefetch worker] DBException 13297: db already exists with different case other: [edSpringPROD] me [edspringPROD]
Tue Sep 17 19:36:29.083 [repl writer worker 1] warning database /mongodb_data edspringPROD could not be opened
Tue Sep 17 19:36:29.083 [repl writer worker 1] DBException 13297: db already exists with different case other: [edSpringPROD] me [edspringPROD]
Tue Sep 17 19:36:29.083 [repl writer worker 1] ERROR: writer worker caught exception: db already exists with different case other: [edSpringPROD] me [edspringPROD] on: { ts: Timestamp 1379446589000|1, h: -4967349403741369875, v: 2, op: "c", ns: "edspringPROD.$cmd", o: { dropDatabase: 1.0 } }

It shows the replication thread trying to replicate a command on the secondary, to drop the database edspringPROD (the namespace of the operation is "edspringPROD.$cmd").
However, this database does not appear on the secondary; a different database, edSpringPROD (note the upper case 'S'), exists.
As this command cannot be replicated, the secondary is left in an inconsistent state relative to the primary. In this situation we bring mongod down.
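
The two names involved can be read from the log line itself. As a minimal sketch, assuming the message format shown in the snippet above (the function `describe_conflict` and both regular expressions are hypothetical helpers written for this log format, not part of any MongoDB tool):

```python
import re

# Matches the exception text seen in the log snippet.
CASE_CONFLICT = re.compile(
    r"db already exists with different case "
    r"other: \[(?P<other>[^\]]+)\] me \[(?P<me>[^\]]+)\]"
)
# Captures the database part of the namespace, e.g. ns: "edspringPROD.$cmd".
NAMESPACE = re.compile(r'ns: "(?P<db>[^."]+)\.')


def describe_conflict(line):
    """Return the names from a case-conflict log line, or None if absent."""
    match = CASE_CONFLICT.search(line)
    if match is None:
        return None
    ns = NAMESPACE.search(line)
    return {
        "other": match.group("other"),
        "me": match.group("me"),
        # Database targeted by the replicated operation, if the line shows it.
        "op_db": ns.group("db") if ns else None,
    }
```

Applied to the "writer worker caught exception" line, this gives other = edSpringPROD, me = edspringPROD, and op_db = edspringPROD. Since op_db matches "me", the log suggests that "me" is the name the operation asked for and "other" is the name already present on the secondary.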

Could you shed any light as to how the inconsistency in database names could have happened?

This case has highlighted that our documentation could be clearer about how case in database names is handled; I will raise this with our documentation team.
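
A quick way to spot this kind of mismatch is to look for database names that differ only by case. The following is a minimal sketch, assuming that a simple lowercase comparison is close enough to the server's own check; `find_case_collisions` is a hypothetical helper, and the list of names could come from any source (for example, PyMongo's `MongoClient.list_database_names()` run against each member).

```python
from collections import defaultdict


def find_case_collisions(names):
    """Group names that are equal when case is ignored.

    Returns a list of sorted groups, one per set of colliding names.
    """
    groups = defaultdict(set)
    for name in names:
        # Names that differ only by case share the same lowercase key.
        groups[name.lower()].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```

For the names in this ticket, `find_case_collisions(["local", "edSpringPROD", "edspringPROD"])` returns one group containing both spellings.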

Kind regards,
Joanna
