[SERVER-20866] Race condition in oplog insert transaction rollback Created: 11/Oct/15  Updated: 08/Feb/16  Resolved: 23/Dec/15

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: 3.0.2
Fix Version/s: 3.0.9

Type: Bug Priority: Critical - P2
Reporter: Yoni Douek Assignee: Geert Bosch
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File another_crash     HTML File crash_308_debug     Zip Archive monogo_crash.zip     HTML File perfect_example    
Issue Links:
Depends
Duplicate
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: QuInt E (01/11/16)
Participants:

 Description   
Issue Status as of Dec 03, 2015

ISSUE SUMMARY
When using the MMAPv1 storage engine, a race condition during the transaction rollback of oplog insertions may result in a crash.

USER IMPACT
MongoDB may crash and corrupt the oplog. A primary re-election may not occur. Heavy insert loads increase the likelihood that this race condition impacts the system.

This behavior has been observed when balancing a collection that has large objects and when the chunk migration "source" mongod is under heavy load.

WORKAROUNDS
There are no workarounds for this issue.

Following a crash, execute

db.shutdownServer({force:true})

to ensure a re-election occurs.
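
A minimal sketch of this recovery step, assuming a standard replica set reachable from the mongo shell (the helpers below are standard shell calls; the member names printed will of course differ per deployment):

// On the crashed/stuck primary: force the process down so the remaining
// members can hold an election. This is the same helper quoted above.
db.getSiblingDB("admin").shutdownServer({force: true})

// Then, from a healthy secondary, confirm that a new primary was elected:
rs.status().members.forEach(function (m) { print(m.name + " : " + m.stateStr); })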

AFFECTED VERSIONS
MongoDB 3.0.0 through 3.0.8

FIX VERSION
The fix is included in the 3.0.9 release.

Original Description
After we added a shard, we got this crash 3 times. The node becomes useless, and we have to manually elect a new primary and restore the data.

After investigation we found out that the oplog collection is corrupt (validate() returns an error).
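For reference, a minimal sketch of this check, run from the mongo shell on the affected node (for a replica-set member the oplog lives in the local database as oplog.rs):

// Full validation of the oplog; on the corrupted node this reports valid: false.
var res = db.getSiblingDB("local").getCollection("oplog.rs").validate(true);
printjson(res);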

This is the crash:

mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xf6b4a9]
 mongod(_ZN5mongo10logContextEPKc+0xE1) [0xf0bca1]
 mongod(_ZN5mongo11msgassertedEiPKc+0xAF) [0xef0eff]
 mongod(+0xAF0FAC) [0xef0fac]
 mongod(_ZNK5mongo7BSONObj14_assertInvalidEv+0x3DF) [0x84feff]
 mongod(_ZN5mongo10Collection19aboutToDeleteCappedEPNS_16OperationContextERKNS_8RecordIdENS_10RecordDataE+0xB8) [0x92de38]
 mongod(_ZN5mongo19CappedRecordStoreV111allocRecordEPNS_16OperationContextEib+0x518) [0xd621b8]
 mongod(_ZN5mongo17RecordStoreV1Base12insertRecordEPNS_16OperationContextEPKNS_9DocWriterEb+0x76) [0xd5b0f6]
 mongod(_ZN5mongo10Collection14insertDocumentEPNS_16OperationContextEPKNS_9DocWriterEb+0x5C) [0x92e8dc]
 mongod(+0x8550A6) [0xc550a6]
 mongod(_ZN5mongo4repl5logOpEPNS_16OperationContextEPKcS4_RKNS_7BSONObjEPS5_Pbb+0x161) [0xc537b1]
 mongod(_ZN5mongo11UpdateStage18transformAndUpdateERKNS_11SnapshottedINS_7BSONObjEEERNS_8RecordIdE+0x6AC) [0xa567fc]
 mongod(_ZN5mongo11UpdateStage4workEPm+0x3E5) [0xa56fb5]
 mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0xA4) [0xbed174]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x34) [0xbed524]
 mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x3D) [0xbedb5d]
 mongod(_ZN5mongo18WriteBatchExecutor10execUpdateERKNS_12BatchItemRefEPNS_7BSONObjEPPNS_16WriteErrorDetailE+0x71D) [0x9ce03d]
 mongod(_ZN5mongo18WriteBatchExecutor11bulkExecuteERKNS_21BatchedCommandRequestERKNS_19WriteConcernOptionsEPSt6vectorIPNS_19BatchedUpsertDetailESaIS9_EEPS7_IPNS_16WriteErrorDetailESaISE_EE+0x23C) [0x9cf82c]
 mongod(_ZN5mongo18WriteBatchExecutor12executeBatchERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE+0x37B) [0x9cfd6b]
 mongod(_ZN5mongo8WriteCmd3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x15D) [0x9d270d]
 mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x34) [0x9f5e34]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xC1D) [0x9f6dbd]
 mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x28B) [0x9f7acb]
 mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERKNS_15NamespaceStringERNS_5CurOpES3_+0x746) [0xbbb836]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xB10) [0xad2190]
 mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xDD) [0x829cad]
 mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x34B) [0xf1f09b]
 libpthread.so.0(+0x7DF3) [0x7f36287e9df3]
 libc.so.6(clone+0x6D) [0x7f362729d1bd]
-----  END BACKTRACE  -----



 Comments   
Comment by Daniel Pasette (Inactive) [ 30/Dec/15 ]

jon@appboy.com, I've added a synopsis of the issue in the description. I can also add that this is a relatively rare issue (two known reports in a release that has been out almost 1 year).

Comment by Jon Hyman [ 29/Dec/15 ]

Is there more of a synopsis that someone can provide here? This only happens on collections with "large objects" but how large is that? We're planning on upgrading from 2.6.11 to 3.0.8 next week but I am now reluctant and wondering if I should wait for 3.0.9 given that we add shards all the time. Our max document sizes are around 500kb, but average is around 1kb.

Comment by Daniel Pasette (Inactive) [ 28/Dec/15 ]

We're planning to make a release candidate for v3.0.9 which will include this fix in the first couple weeks of January.

Comment by Yoni Douek [ 28/Dec/15 ]

We can confirm that this fix works, thanks for this!

We noticed that you're working in ~2-month release cycles, and we can't afford to wait for 3.0.9.
Is there a chance that we can get a "hotfix" binary for Amazon Linux (3.0.8 + this commit, or anything stable that we can use in production)?

Thanks!

Comment by Daniel Pasette (Inactive) [ 24/Dec/15 ]

That's a good build. There were a few test fixes checked in after the fix you're after, and the latest now contains those fixes too. You can check the commit that the latest build corresponds to in the version string. In this case, the git hash prefix is 7049b7f. The commit stream for the v3.0 branch is here:
https://github.com/mongodb/mongo/commits/v3.0

Let us know how this works for you.

Comment by Yoni Douek [ 24/Dec/15 ]

Thanks Dan, none of the nightlies in https://www.mongodb.org/downloads#development return this value; the latest we could find is "v3.0.8-11-g7049bf7" - can you point me to the nightly build's address?

Comment by Daniel Pasette (Inactive) [ 23/Dec/15 ]

It should be done. If you run mongod --version you should see the mongo version 3.0.9-pre- and git version: 5d07580128edf5bb00f175db054f89c5fefc8b13
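
For anyone checking this from an already-connected shell rather than the command line, a small sketch using standard shell helpers (the expected values are the ones quoted above):

// Print the server version and the git commit the binary was built from.
print(db.version());                        // expect "3.0.9-pre-"
print(db.serverBuildInfo().gitVersion);     // expect "5d07580128edf5bb00f175db054f89c5fefc8b13"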

Comment by Yoni Douek [ 23/Dec/15 ]

BTW - we were constantly seeing this message (every few seconds) - even hours before the crash -

assertion 13388 [MyDomain.Sessions] shard version not ok: version mismatch detected for MyDomain.Sessions, stored major version 503 does not match received 502 ( ns : MyDomain.Sessions, received : 502|1||563ba74d869758a4542b9075, wanted : 503|0||563ba74d869758a4542b9075, send ) ( ns : MyDomain.Sessions, received : 502|1||563ba74d869758a4542b9075, wanted : 503|0||563ba74d869758a4542b9075, send ) ns:MyDomain.$cmd query:{ findAndModify: "Sessions", query: { a: ObjectId('52610428645cff0f78c98742'), _id: ObjectId('566f3d8205eb770ed2cf36f8'), fn: { $elemMatch: { fi: ObjectId('56445dc8d5539e10c472482a'), s: 1 } } }, new: true, update: { $set: { fn.$.n: new Date(1450157662438) }, $currentDate: { lm: true } }, upsert: false }

Is this related? Any idea why this might happen?

Comment by Yoni Douek [ 23/Dec/15 ]

Amazon Linux

Comment by Daniel Pasette (Inactive) [ 23/Dec/15 ]

The nightly binaries are building now and should be available later today. Which variant are you using?

Comment by Yoni Douek [ 23/Dec/15 ]

Awesome thanks for this! We'll test the nightly build (assuming this is included) and update.

Comment by Githook User [ 23/Dec/15 ]

Author: Geert Bosch (GeertBosch) <geert@mongodb.com>

Message: SERVER-20866: Fix race in rollback of oplog insertions
Branch: v3.0
https://github.com/mongodb/mongo/commit/5d07580128edf5bb00f175db054f89c5fefc8b13

Comment by Ryan Farisco [ 23/Dec/15 ]

Why am I being copied into this ticket still?

Comment by Geert Bosch [ 23/Dec/15 ]

Yes, that is correct, 3.2 is not affected.

Comment by Yoni Douek [ 23/Dec/15 ]

Dan, this is interesting info. If we upgrade to 3.2, will this issue be resolved? Thanks

Comment by Daniel Pasette (Inactive) [ 23/Dec/15 ]

Hi ripusingla, we're currently working on a fix for the 3.0 branch. This issue does not impact the 3.2 branch.

Comment by Ripu Singla [ 23/Dec/15 ]

Hi @Geert Bosch, @Dan Pasette, can you help me find the issue and how to resolve it? Really appreciate any pointers. Thanks.

Comment by Ripu Singla [ 23/Dec/15 ]

Hi @Ryan Farisco, sorry for the late reply. I didn't add you; it looks like you were copied accidentally. Apologies.

Comment by Ryan Farisco [ 18/Dec/15 ]

Hi Ripu, have we been copied in accidentally?

Comment by Ripu Singla [ 18/Dec/15 ]

I see, thanks for the info.

Comment by Yoni Douek [ 18/Dec/15 ]

We're restoring the old primary from 12-hour-old disk backups (AWS snapshots), so the restored node only needs to sync the last 12 hours of changes (which takes a few hours).

Comment by Ripu Singla [ 18/Dec/15 ]

Hi @Yoni Douek, thanks for the heads up. What I do is stop the primary using "sudo service mongod stop", and the secondary automatically takes over the primary role. But the hard part is re-syncing the original primary with the new primary; it takes around 2 days as we have roughly ~1TB of data on this replica. Any workaround for this? How do you restore data on the corrupted mongo instance?

Comment by Yoni Douek [ 18/Dec/15 ]

Ripu, happy to see we're not alone in this! The annoying part of this nasty bug is that the primary doesn't step down. My tip for you: when you see this, manually stop it using

db.shutdownServer({force:true})

so that the secondary will take over.

Comment by Ripu Singla [ 18/Dec/15 ]

Hi @Geert Bosch, @Yoni Douek, we are facing the exact same issue; our production server crashed for the third time last night. Can you let us know the root cause or a fix?

We are using 3 replica sets (primary + secondary + arbiter), 3 config servers, and 1 query server, a total of 13 machines. We are using Mongo version 3.0.5.
In our case too, the crash happens only when the balancer is on; otherwise the servers run without any problem for days. After a crash, the oplog of one of the primaries gets corrupted while the secondary stays healthy. We are also using LVM with 2TB of storage on each of the 3 replicas.

Thanks.
Any help much appreciated.
Ripu Singla.

Comment by Yoni Douek [ 16/Dec/15 ]

Thanks for the update Geert, looking forward to this! :)

Comment by Geert Bosch [ 16/Dec/15 ]

Hi Yoni,

We believe we've found the root cause of this issue and are working on a fix.

-Geert

Comment by Yoni Douek [ 16/Dec/15 ]

Hey,

Please find attached a log of the crash from 3.0.8-rc0 with debug symbols.

Ryan - I didn't add you, maybe someone from mongodb did.

Comment by Ryan Farisco [ 15/Dec/15 ]

Hi Yoni - I believe we have been copied into the incorrect ticket.

?

Comment by Yoni Douek [ 15/Dec/15 ]

Perfect. We're on it.

Comment by Ramon Fernandez Marina [ 15/Dec/15 ]

yonido, you can download a 3.0.8-rc0 with debug symbols here (Linux64; other platforms and versions are also available).

We can wait until you use these binaries to collect further information, but since it seems you're able to reproduce this issue easily, it may help if you send us a recipe so we can reproduce it on our end, if you have one.

Thanks,
Ramón.

Comment by Daniel Pasette (Inactive) [ 15/Dec/15 ]

Reopening. We'll take a look Yoni.

Comment by Yoni Douek [ 15/Dec/15 ]

Hey,

We've been investigating this issue very deeply over the last several weeks. We are now 100% sure that this is a bug in MongoDB and not anything environmental.
1. We checked the storage layer, as Ramon suggested, and found no errors in either the LVM or XFS logs.
2. We then reproduced the issue on a flat volume, i.e. a single ext4 drive holding /data, without LVM.
3. The oplog and all other collections pass validate() correctly before the crash. After running the scenario described below, only the oplog is corrupted. If this were something in the storage layer, the corrupted collection would be random. The chunk-move process is the cause of the oplog corruption.
4. This happens only when the balancer is running.
5. We are now in a state where we can quickly reproduce it in our test environment (it has happened >10 times).

To repeat: after running the balancer, when balancing a collection that has large objects (and only in this type of collection), and when the chunk migration "source" mongod is under stress (heavy load), we get the following:
1. Many of these:

assertion 13388 [MyDomain.Sessions] shard version not ok: version mismatch detected for MyDomain.Sessions, stored major version 503 does not match received 502 ( ns : MyDomain.Sessions, received : 502|1||563ba74d869758a4542b9075, wanted : 503|0||563ba74d869758a4542b9075, send ) ( ns : MyDomain.Sessions, received : 502|1||563ba74d869758a4542b9075, wanted : 503|0||563ba74d869758a4542b9075, send ) ns:MyDomain.$cmd query:{ findAndModify: "Sessions", query: { a: ObjectId('52610428645cff0f78c98742'), _id: ObjectId('566f3d8205eb770ed2cf36f8'), fn: { $elemMatch: { fi: ObjectId('56445dc8d5539e10c472482a'), s: 1 } } }, new: true, update: { $set: { fn.$.n: new Date(1450157662438) }, $currentDate: { lm: true } }, upsert: false }

2. Either

Assertion failure isOk() src/mongo/db/storage/mmap_v1/extent.h 81

or

Assertion: 10334:BSONObj size: -286331154 (0xEEEEEEEE) is invalid. Size must be between 0 and 16793600(16MB) First element: ts: Timestamp 1448283231000|537

3. Then either

 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0xf9cfb2]
 mongod(_ZN5mongo10logContextEPKc+0xE9) [0xf3a239]
 mongod(_ZN5mongo11msgassertedEiPKc+0xAE) [0xf1eade]
 mongod(+0xB1EB8C) [0xf1eb8c]
 mongod(_ZNK5mongo7BSONObj14_assertInvalidEv+0x3BB) [0x85349b]
 mongod(_ZN5mongo10Collection19aboutToDeleteCappedEPNS_16OperationContextERKNS_8RecordIdENS_10RecordDataE+0xBF) [0x9327ef]
 mongod(_ZN5mongo19CappedRecordStoreV111allocRecordEPNS_16OperationContextEib+0x534) [0xd8a5f4]
 mongod(_ZN5mongo17RecordStoreV1Base12insertRecordEPNS_16OperationContextEPKNS_9DocWriterEb+0x79) [0xd83329]
 mongod(_ZN5mongo10Collection14insertDocumentEPNS_16OperationContextEPKNS_9DocWriterEb+0x62) [0x9332a2]
 mongod(+0x86837B) [0xc6837b]
 mongod(_ZN5mongo4repl5logOpEPNS_16OperationContextEPKcS4_RKNS_7BSONObjEPS5_Pbb+0x117) [0xc66ad7]
 mongod(_ZN5mongo11UpdateStage18transformAndUpdateERKNS_11SnapshottedINS_7BSONObjEEERNS_8RecordIdE+0x6D1) [0xa60631]
 mongod(_ZN5mongo11UpdateStage4workEPm+0x3F5) [0xa60dd5]
 mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x142) [0xbfe0f2]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x3B) [0xbfe40b]
 mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x3D) [0xbfeaed]
 mongod(_ZN5mongo18WriteBatchExecutor10execUpdateERKNS_12BatchItemRefEPNS_7BSONObjEPPNS_16WriteErrorDetailE+0x719) [0x9d5b09]
 mongod(_ZN5mongo18WriteBatchExecutor11bulkExecuteERKNS_21BatchedCommandRequestERKNS_19WriteConcernOptionsEPSt6vectorIPNS_19BatchedUpsertDetailESaIS9_EEPS7_IPNS_16WriteErrorDetailESaISE_EE+0x245) [0x9d7365]
 mongod(_ZN5mongo18WriteBatchExecutor12executeBatchERKNS_21BatchedCommandRequestEPNS_22BatchedCommandResponseE+0x465) [0x9d7995]
 mongod(_ZN5mongo8WriteCmd3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x17C) [0x9da59c]
 mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x31) [0x9fc431]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xC7C) [0x9fd44c]
 mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x28B) [0x9fe11b]
 mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERKNS_15NamespaceStringERNS_5CurOpES3_+0x73B) [0xbcba5b]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xB72) [0xaded22]
 mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xF5) [0x82c425]
 mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x339) [0xf4e509]
 libpthread.so.0(+0x7DF3) [0x7f4f696aadf3]
 libc.so.6(clone+0x6D) [0x7f4f6815e1bd]

or

 mongod(_ZN5mongo15printStackTraceERSo+0x32) [0xf9cfb2]
 mongod(_ZN5mongo10logContextEPKc+0xE9) [0xf3a239]
 mongod(_ZN5mongo12verifyFailedEPKcS1_j+0xDA) [0xf1edca]
 mongod(_ZNK5mongo19MmapV1ExtentManager9getExtentERKNS_7DiskLocEb+0x70) [0xd7cac0]
 mongod(_ZN5mongo17RecordStoreV1Base27_addRecordToRecListInExtentEPNS_16OperationContextEPNS_6RecordENS_7DiskLocE+0x3A) [0xd82c0a]
 mongod(_ZN5mongo17RecordStoreV1Base12insertRecordEPNS_16OperationContextEPKNS_9DocWriterEb+0x1BE) [0xd8346e]
 mongod(_ZN5mongo10Collection14insertDocumentEPNS_16OperationContextEPKNS_9DocWriterEb+0x62) [0x9332a2]
 mongod(+0x86837B) [0xc6837b]
 mongod(_ZN5mongo4repl5logOpEPNS_16OperationContextEPKcS4_RKNS_7BSONObjEPS5_Pbb+0x117) [0xc66ad7]
 mongod(_ZN5mongo11UpdateStage18transformAndUpdateERKNS_11SnapshottedINS_7BSONObjEEERNS_8RecordIdE+0x6D1) [0xa60631]
 mongod(_ZN5mongo11UpdateStage4workEPm+0x3F5) [0xa60dd5]
 mongod(_ZN5mongo12PlanExecutor18getNextSnapshottedEPNS_11SnapshottedINS_7BSONObjEEEPNS_8RecordIdE+0x142) [0xbfe0f2]
 mongod(_ZN5mongo12PlanExecutor7getNextEPNS_7BSONObjEPNS_8RecordIdE+0x3B) [0xbfe40b]
 mongod(_ZN5mongo12PlanExecutor11executePlanEv+0x3D) [0xbfeaed]
 mongod(_ZN5mongo6updateEPNS_16OperationContextEPNS_8DatabaseERKNS_13UpdateRequestEPNS_7OpDebugE+0x12B) [0xb4bfeb]
 mongod(_ZN5mongo16CmdFindAndModify7runImplEPNS_16OperationContextERKSsS4_RKNS_7BSONObjES7_S7_S7_bbbRNS_14BSONObjBuilderERSs+0x972) [0x983502]
 mongod(_ZN5mongo16CmdFindAndModify3runEPNS_16OperationContextERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x332) [0x984df2]
 mongod(_ZN5mongo12_execCommandEPNS_16OperationContextEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb+0x31) [0x9fc431]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_iPKcRNS_7BSONObjERNS_14BSONObjBuilderEb+0xC7C) [0x9fd44c]
 mongod(_ZN5mongo12_runCommandsEPNS_16OperationContextEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi+0x28B) [0x9fe11b]
 mongod(_ZN5mongo8runQueryEPNS_16OperationContextERNS_7MessageERNS_12QueryMessageERKNS_15NamespaceStringERNS_5CurOpES3_+0x73B) [0xbcba5b]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0xB72) [0xaded22]
 mongod(_ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0xF5) [0x82c425]
 mongod(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x339) [0xf4e509]
 libpthread.so.0(+0x7DF3) [0x7f4f696aadf3]
 libc.so.6(clone+0x6D) [0x7f4f6815e1bd]

or both - endlessly.

We are using mmapv1.

I've also attached another example - this one is better than the ones previously attached - called "perfect_example".
It also happened in mongo 3.0.8-rc0 (we thought that https://jira.mongodb.org/browse/SERVER-21646 might somehow be related, because this feels like a concurrency/threading issue on writing to the oplog).

Please re-open the ticket and check this. Since we can now reproduce this quite easily, we'll be happy to help you debug it if you send us a build that contains debugging information. Our cluster crashes every other day, and unfortunately we'll have to move to another database if this is not solved.
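
Purely as an illustration of the conditions described in this comment (balancer on, large documents, heavy write load on the chunk-migration source), a rough, hypothetical load sketch for the mongo shell. This is not the reporter's actual recipe: the payload field and the assumption that the collection is sharded on _id are made up for the example; only the MyDomain.Sessions namespace comes from the logs above.

// Hypothetical workload sketch - run against a mongos of a sharded cluster.
sh.startBalancer();                          // chunk migrations must be running

var big = new Array(512 * 1024).join("x");   // ~512KB filler, i.e. a "large object"
var coll = db.getSiblingDB("MyDomain").getCollection("Sessions");
for (var i = 0; i < 1000000; i++) {
    // Sustained stream of large upserts, spread across chunks by random _id.
    coll.update(
        { _id: ObjectId() },
        { $set: { payload: big, lm: new Date() } },
        { upsert: true }
    );
}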

Comment by Ramon Fernandez Marina [ 17/Nov/15 ]

Hi yonido, apologies for the delay in getting back to you. Some of the messages in the logs you uploaded point to corrupt BSON:

2015-09-30T16:00:56.753+0000 I -        [conn120121] Assertion failure isOk() src/mongo/db/storage/mmap_v1/extent.h 77
2015-09-30T16:00:56.753+0000 I -        [conn1] Assertion: 10320:BSONElement: bad type -104
...
2015-10-08T17:43:37.913+0000 I -        [rsBackgroundSync] Assertion: 10320:BSONElement: bad type 105
...
2015-10-21T09:52:40.982+0000 I -        [conn195100] Assertion: 10334:BSONObj size: 1127114085 (0x432E6565) is invalid. Size must be between 0 and 16793600(16MB) First element: stomEventPropertyReports: ?type=117

The most common reasons for these errors are failures at the storage layer. If a node is finding corrupt data then you may want to resync it from a healthy node, or restore its data from a backup. The failure to successfully validate() the oplog points in this direction as well.

I'd recommend you check the health of the storage layer of the affected node, as well as search its logs for storage-related errors, before you restore this node. If storage in this node is flaky then this problem is likely to reoccur.

Since I haven't found evidence of a bug in the server, and the SERVER project is reserved for reporting bugs (or feature suggestions) for the MongoDB server, I'm going to close this ticket. For MongoDB-related support discussion you can post on the mongodb-user group or Stack Overflow with the mongodb tag, where your question will reach a larger audience. See also our Technical Support page for additional support resources.

Regards,
Ramón.

Comment by Yoni Douek [ 21/Oct/15 ]

Hey,

Attaching another crash on which this happened. Note the following:

  • It seems to happen after a chunk move finishes
  • Just before that, we get "shard version not ok: version mismatch detected for MyDomain.HeatmapReports, stored major version 3328 does not match received 3327"
  • And then: "Assertion: 10334:BSONObj size: 1127114085 (0x432E6565) is invalid. Size must be between 0 and 16793600(16MB) First element: stomEventPropertyReports: ?type=117" ==> corrupt

Comment by Yoni Douek [ 20/Oct/15 ]

Bumping this, @ramon.fernandez - our cluster is crashing every other day, please help

Comment by Yoni Douek [ 12/Oct/15 ]

Tara, I'm not sure I understand what you're trying to say ;)

Comment by Tara Gokey [ 12/Oct/15 ]

warning: problem while initially checking shard versions on shard0000 :: caused by :: 10276 DBClientBase::findN: transport error: mongoshard01.vixletinternal.com:27017 ns: admin.$cmd query: { setShardVersion: "", init: true, configdb: "mongocfg01.vixletinternal.com:27019,mongocfg02.vixletinternal.com:27019,mongocfg03.vixletinternal.com:27019", serverID: ObjectId('561b565a55b9ea01b3761314'), authoritative: true }

Mon Oct 12 07:46:09.878 [conn3499] warning: db exception when initializing on shard0000:mongoshard01.vixletinternal.com:27017, current connection state is { state: { conn: "", vinfo: "shard0000:mongoshard01.vixletinternal.com:27017", cursor: "(none)", count: 0, done: false }, retryNext: false, init: false, finish: false, errored: false } :: caused by :: 10276 DBClientBase::findN: transport error: mongoshard01.vixletinternal.com:27017 ns: admin.$cmd query: { setShardVersion: "", init: true, configdb: "mongocfg01.vixletinternal.com:27019,mongocfg02.vixletinternal.com:27019,mongocfg03.vixletinternal.com:27019", serverID: ObjectId('561b565a55b9ea01b3761314'), authoritative: true }

Comment by Yoni Douek [ 11/Oct/15 ]

5. In this situation, like the crash above, the oplog was also corrupt (validate() returned an error), whereas all other collections were fine.

Comment by Yoni Douek [ 11/Oct/15 ]

Attached are the logs from the primary and the secondary at the time of the crash. Notes:

1. The setup is primary+secondary+arbiter.

2. This log is from a scenario which is a bit different from the crashlog above. This one shows "Assertion failure isOk() src/mongo/db/storage/mmap_v1/extent.h 77"

In our current setup we have seen different manifestations of this crash, already 3 times, every day or two. The one attached is one of them; the other is the one above, which shows the crash trace repeated many times for every write (unfortunately I don't have full logs for that one).

3. When the balancer is stopped, this doesn't happen (see the sketch after this comment).

4. When it happens, the process is still alive but all writes fail; maybe this is why the other nodes don't see it as a dead node. But you can see the log of the secondary.

Please help, our production servers are very unstable now.
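
Since stopping the balancer is the mitigation the reporters observed (not an official workaround for this bug), a minimal mongo shell sketch, run against a mongos, of disabling and later re-enabling it:

// Disable the balancer; any in-flight chunk migration completes before
// balancing actually stops.
sh.stopBalancer();
sh.getBalancerState();     // should now return false

// After upgrading to a build that contains the fix, re-enable balancing:
sh.startBalancer();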

Comment by Ramon Fernandez Marina [ 11/Oct/15 ]

yonido, can you please send the full logs for the affected node? We'll also need to see the logs of the other members of this replica set to understand why a new primary was not elected.

Thanks,
Ramón.
