[SERVER-19148]  local.oplog.rs Assertion failure n >= 0 && n < static_cast<int>(_files.size()) src/mongo/db/storage/extent_manager.cpp 109 Created: 26/Jun/15  Updated: 02/Jul/15  Resolved: 02/Jul/15

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: xiaoli wang Assignee: Sam Kleinman (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:

version: 2.6.8

3-node replica set

Participants:

 Description   

2015-05-16T00:11:19.146+0800 [rsSync] uh oh: 7063832
2015-05-16T00:11:19.146+0800 [rsSync] local.oplog.rs Assertion failure n >= 0 && n < static_cast<int>(_files.size()) src/mongo/db/storage/extent_manager.cpp 109
2015-05-16T00:11:19.156+0800 [rsSync] local.oplog.rs 0xffec79 0xfa7ea5 0xf90bd6 0xdb211c 0xdb32cd 0xdd9f76 0xdda34b 0xddb151 0xde045f 0xde04d8 0xde99d2 0xde95be 0x8dfaa4 0x8e037d 0xd3b8d5 0xd7ce6a 0xd82089 0xd826b0 0xd827a2 0xd8365e 
 mongod(_ZN5mongo15printStackTraceERSo+0x29) [0xffec79]
 mongod(_ZN5mongo10logContextEPKc+0x1f5) [0xfa7ea5]
 mongod(_ZN5mongo12verifyFailedEPKcS1_j+0x136) [0xf90bd6]
 mongod(_ZNK5mongo13ExtentManager12_getOpenFileEi+0xcc) [0xdb211c]
 mongod(_ZNK5mongo13ExtentManager9recordForERKNS_7DiskLocE+0x1d) [0xdb32cd]
 mongod(_ZNK5mongo16NamespaceDetails11inCapExtentERKNS_7DiskLocE+0x26) [0xdd9f76]
 mongod(_ZN5mongo16NamespaceDetails10__capAllocEi+0x16b) [0xdda34b]
 mongod(_ZN5mongo16NamespaceDetails11cappedAllocEPNS_10CollectionERKNS_10StringDataEi+0x1c1) [0xddb151]
 mongod(_ZN5mongo16NamespaceDetails6_allocEPNS_10CollectionERKNS_10StringDataEi+0x1f) [0xde045f]
 mongod(_ZN5mongo16NamespaceDetails5allocEPNS_10CollectionERKNS_10StringDataEi+0x58) [0xde04d8]
 mongod(_ZN5mongo19CappedRecordStoreV111allocRecordEii+0x32) [0xde99d2]
 mongod(_ZN5mongo17RecordStoreV1Base12insertRecordEPKcii+0x5e) [0xde95be]
 mongod(_ZN5mongo10Collection15_insertDocumentERKNS_7BSONObjEbPKNS_16PregeneratedKeysE+0x84) [0x8dfaa4]
 mongod(_ZN5mongo10Collection14insertDocumentERKNS_7BSONObjEbPKNS_16PregeneratedKeysE+0x1ad) [0x8e037d]
 mongod(_ZN5mongo11_logOpObjRSERKNS_7BSONObjE+0x435) [0xd3b8d5]
 mongod(_ZN5mongo7replset8SyncTail15applyOpsToOplogEPSt5dequeINS_7BSONObjESaIS3_EE+0xba) [0xd7ce6a]
 mongod(_ZN5mongo7replset8SyncTail16oplogApplicationEv+0x5e9) [0xd82089]
 mongod(_ZN5mongo11ReplSetImpl11_syncThreadEv+0x110) [0xd826b0]
 mongod(_ZN5mongo11ReplSetImpl10syncThreadEv+0x92) [0xd827a2]
 mongod(_ZN5mongo15startSyncThreadEv+0x9e) [0xd8365e]
2015-05-16T00:11:19.156+0800 [rsSync] replSet syncThread: 0 assertion src/mongo/db/storage/extent_manager.cpp:109
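
For triage, the assertion line in the log above can be split into its parts (failed expression, source file, line number). The following is a minimal sketch, assuming the line layout seen in this 2.6.8 log; the pattern and the name `parse_assertion` are hypothetical and are not part of MongoDB or its tooling.

```python
import re

# Pattern for the 2.6-style line shown in this report (an assumption about
# the layout, not an official log grammar):
#   <timestamp> [<thread>] <namespace> Assertion failure <expr> <file> <line>
ASSERTION_RE = re.compile(
    r"^(?P<timestamp>\S+) \[(?P<thread>[^\]]+)\] (?P<namespace>\S+) "
    r"Assertion failure (?P<expr>.+) (?P<file>\S+) (?P<line>\d+)$"
)


def parse_assertion(log_line):
    """Return the fields of an assertion-failure line, or None if no match."""
    match = ASSERTION_RE.match(log_line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])  # source line number as an integer
    return fields
```

Here the failed expression is a bounds check on a data-file number `n` against the number of files the extent manager has open, which is consistent with a record location in `local.oplog.rs` pointing at a file that does not exist.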



 Comments   
Comment by Sam Kleinman (Inactive) [ 02/Jul/15 ]

I'm going to go ahead and close this issue. MongoDB relies on the system's storage layer for normal operation, and we cannot validate these issues without the ability to reliably reproduce the error and control for variability in storage engines and configurations.

Sorry again for this inconvenience, and I hope that you've been able to successfully restore your replica set to normal operation.

Regards,
sam

Comment by Sam Kleinman (Inactive) [ 30/Jun/15 ]

The best way to recover a system in this state is to resync the member. Stopping the instance, removing all data files, and then starting the member is one way to initiate resync.

Regards,
Sam
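
The "remove all data files" step from the comment above could be sketched as follows. This is an illustration only, under several assumptions: the mongod process for this member is already stopped, the directory passed in is that member's dbpath and holds nothing else, and a healthy member exists to sync from. The function name `clear_dbpath` and its `dry_run` parameter are hypothetical. Stopping and starting mongod is left out because it depends on how the instance is deployed.

```python
import os
import shutil


def clear_dbpath(dbpath, dry_run=True):
    """List (and, if dry_run is False, delete) every entry under dbpath.

    The mongod process must already be stopped; this function does not check.
    """
    removed = []
    for name in sorted(os.listdir(dbpath)):
        path = os.path.join(dbpath, name)
        removed.append(path)
        if dry_run:
            continue  # only report what would be deleted
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)  # e.g. a journal subdirectory
        else:
            os.remove(path)  # data files, lock file, symlinks
    return removed
```

The dry run is the default so that the list of paths can be reviewed before anything is deleted. Note that this removes everything under the dbpath, not only the `local.*` files, in line with the advice in the comment above.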

Comment by xiaoli wang [ 30/Jun/15 ]

Hi, can I recover my system by removing all local.* files and restarting my mongod?

Comment by xiaoli wang [ 27/Jun/15 ]

>> 1. Are you using the directoryperdb option?
No
>> 2. Are you manipulating the files in the database path (dbpath) outside of the MongoDB interface, such as for backups or migrations?
No
>> 3. Do you have journaling disabled?
No
>> 4. What kind of underlying storage device or devices are you using, including connectivity (e.g. local/network), drive type (e.g. SSD/HDD), and array configuration (e.g. RAID level, RAID controller)?
Local disk with LVM
>> 5. Do you keep backups of your data, and if so, what method do you use to capture backups? Have you ever restored from backup?
We have not done a restore.

Thank you very much

Regards
esala

Comment by Sam Kleinman (Inactive) [ 26/Jun/15 ]

Hello,

Sorry that you've encountered this error again. This looks like another, likely related, instance of data corruption in the oplog encountered during replication. Can you answer the following questions about your deployment:

  1. Are you using the directoryperdb option?
  2. Are you manipulating the files in the database path (dbpath) outside of the MongoDB interface, such as for backups or migrations?
  3. Do you have journaling disabled?
  4. What kind of underlying storage device or devices are you using, including connectivity (e.g. local/network), drive type (e.g. SSD/HDD), and array configuration (e.g. RAID level, RAID controller)?
  5. Do you keep backups of your data, and if so, what method do you use to capture backups? Have you ever restored from backup?

Thanks for your help, and sorry again for the inconvenience.

Regards,
sam
