[SERVER-17083] Background thread for oplog clean-up can time out during repair (invariant failure) Created: 28/Jan/15  Updated: 18/Sep/15  Resolved: 28/Jan/15

Status: Closed
Project: Core Server
Component/s: Replication, WiredTiger
Affects Version/s: 3.0.0-rc6
Fix Version/s: 3.0.0-rc7

Type: Bug
Priority: Major - P3
Reporter: Kamran K.
Assignee: Eliot Horowitz (Inactive)
Resolution: Done
Votes: 0
Labels: 28qa, wiredtiger
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-16921 WT oplog bottleneck on secondary Closed
is related to SERVER-17094 repairing local db with WiredTiger wi... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

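When a WiredTiger record store is destroyed, it waits for the oplog clean-up background thread (SERVER-16921) with a fixed timeout of 10 seconds. During a repair the thread can take longer than that to stop, and because the wait is wrapped in an invariant, the server aborts: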

Invariant failure _backgroundThread->wait(10000) src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp 262
 
#0  0x00007f88633fd20b in raise (sig=5) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1  0x00000000018c196f in mongo::breakpoint () at src/mongo/util/debugger.cpp:63
#2  0x00000000018b6d9e in mongo::invariantFailed (expr=0x201aa30 "_backgroundThread->wait(10000)", file=0x201a960 "src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp", line=262)
    at src/mongo/util/assert_util.cpp:147
#3  0x000000000173b1f5 in mongo::WiredTigerRecordStore::~WiredTigerRecordStore (this=0x4fddf00, __in_chrg=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:262
#4  0x000000000173b3b0 in mongo::WiredTigerRecordStore::~WiredTigerRecordStore (this=0x4fddf00, __in_chrg=<optimized out>) at src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp:270
#5  0x00000000016ad536 in boost::checked_delete<mongo::RecordStore> (x=0x4fddf00) at src/third_party/boost/boost/checked_delete.hpp:39
#6  0x00000000016ad4d9 in boost::scoped_ptr<mongo::RecordStore>::~scoped_ptr (this=0x50ef2b0, __in_chrg=<optimized out>) at src/third_party/boost/boost/smart_ptr/scoped_ptr.hpp:80
#7  0x00000000016ac5c8 in mongo::KVCollectionCatalogEntry::~KVCollectionCatalogEntry (this=0x50ef280, __in_chrg=<optimized out>) at src/mongo/db/storage/kv/kv_collection_catalog_entry.cpp:95
#8  0x00000000016ac614 in mongo::KVCollectionCatalogEntry::~KVCollectionCatalogEntry (this=0x50ef280, __in_chrg=<optimized out>) at src/mongo/db/storage/kv/kv_collection_catalog_entry.cpp:96
#9  0x00000000016ae96a in mongo::KVDatabaseCatalogEntry::reinitCollectionAfterRepair (this=0x4fee320, opCtx=0x7f88597137d0, ns=...) at src/mongo/db/storage/kv/kv_database_catalog_entry.cpp:267
#10 0x00000000016b3e60 in mongo::KVStorageEngine::repairRecordStore (this=0x5004b50, txn=0x7f88597137d0, ns=...) at src/mongo/db/storage/kv/kv_storage_engine.cpp:275
#11 0x00000000015ab6de in mongo::repairDatabase (txn=0x7f88597137d0, engine=0x5004b50, dbName=..., preserveClonedFilesOnFailure=false, backupOriginalFiles=false) at src/mongo/db/repair_database.cpp:235
#12 0x00000000013350ae in mongo::CmdRepairDatabase::run (this=0x2874420 <mongo::cmdRepairDatabase>, txn=0x7f88597137d0, dbname=..., cmdObj=..., errmsg=..., result=..., fromRepl=false)
    at src/mongo/db/dbcommands.cpp:297
#13 0x0000000001331ed5 in mongo::_execCommand (txn=0x7f88597137d0, c=0x2874420 <mongo::cmdRepairDatabase>, dbname=..., cmdObj=..., queryOptions=0, errmsg=..., result=..., fromRepl=false)
    at src/mongo/db/dbcommands.cpp:1273
#14 0x0000000001332e52 in mongo::Command::execCommand (txn=0x7f88597137d0, c=0x2874420 <mongo::cmdRepairDatabase>, queryOptions=0, cmdns=0x4fe8c14 "local.$cmd", cmdObj=..., result=..., fromRepl=false)
    at src/mongo/db/dbcommands.cpp:1489
#15 0x0000000001333731 in mongo::_runCommands (txn=0x7f88597137d0, ns=0x4fe8c14 "local.$cmd", _cmdobj=..., b=..., anObjBuilder=..., fromRepl=false, queryOptions=0) at src/mongo/db/dbcommands.cpp:1561
#16 0x00000000015357d2 in mongo::runCommands (txn=0x7f88597137d0, ns=0x4fe8c14 "local.$cmd", jsobj=..., curop=..., b=..., anObjBuilder=..., fromRepl=false, queryOptions=0) at src/mongo/db/query/find.cpp:137
#17 0x00000000015377fa in mongo::runQuery (txn=0x7f88597137d0, m=..., q=..., nss=..., curop=..., result=..., fromDBDirectClient=false) at src/mongo/db/query/find.cpp:606
#18 0x000000000143d9a2 in mongo::receivedQuery (txn=0x7f88597137d0, c=..., dbresponse=..., m=..., fromDBDirectClient=false) at src/mongo/db/instance.cpp:220
#19 0x000000000143eb4c in mongo::assembleResponse (txn=0x7f88597137d0, m=..., dbresponse=..., remote=..., fromDBDirectClient=false) at src/mongo/db/instance.cpp:403
#20 0x000000000113f0be in mongo::MyMessageHandler::process (this=0x4cae180, m=..., port=0x4fef040, le=0x4ff09e0) at src/mongo/db/db.cpp:206
#21 0x00000000018dfbd0 in mongo::PortMessageServer::handleIncomingMsg (arg=0x4fef040) at src/mongo/util/net/message_server_port.cpp:229
#22 0x00007f88633f5182 in start_thread (arg=0x7f8859714700) at pthread_create.c:312
#23 0x00007f88624f600d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
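
The failing check in the trace above can be illustrated with a small stand-alone sketch. This is not MongoDB code: `BackgroundWorker`, `wait`, and `shutdownOrAbort` are hypothetical names chosen for illustration, and `assert` stands in for `invariant`. It only assumes what the trace shows, namely that a timed wait on a background thread returns false when the thread has not finished, and that the caller treats false as fatal.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the oplog clean-up thread (illustrative only).
class BackgroundWorker {
public:
    explicit BackgroundWorker(std::chrono::milliseconds workDuration)
        : _thread([this, workDuration] {
              // Simulates a clean-up pass that takes workDuration to finish.
              std::this_thread::sleep_for(workDuration);
              std::lock_guard<std::mutex> lk(_mutex);
              _done = true;
              _cv.notify_all();
          }) {}

    ~BackgroundWorker() { _thread.join(); }

    // Returns true if the worker finished within the timeout, false otherwise.
    bool wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lk(_mutex);
        return _cv.wait_for(lk, timeout, [this] { return _done; });
    }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _done = false;
    std::thread _thread;  // declared last so the members above exist before it starts
};

// Same shape as the failing check: a timed wait whose failure is fatal.
void shutdownOrAbort(BackgroundWorker& worker, std::chrono::milliseconds timeout) {
    const bool stopped = worker.wait(timeout);
    assert(stopped && "background worker did not stop within the timeout");
    (void)stopped;  // avoids an unused-variable warning when NDEBUG is defined
}
```

With a worker that needs longer than the timeout, `wait` returns false and `shutdownOrAbort` aborts. That is the situation the description reports for repair, where the fixed 10 second limit is too short.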


Version: 82daa0725d7f2



 Comments   
Comment by Eric Milkie [ 28/Jan/15 ]

Eliot's commit fixes this for --repair; SERVER-17094 has been filed to fix repair of the local db in general.

Comment by Githook User [ 28/Jan/15 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: SERVER-17083: don't start background capped thread in repair

(cherry picked from commit ac9ee2fb80f2afc2737a0d9f346cff8117a82af2)
Branch: v3.0
https://github.com/mongodb/mongo/commit/52066525e913a9c95989f42d650dedb86c683b74

Comment by Githook User [ 28/Jan/15 ]

Author: Eliot Horowitz (erh) <eliot@10gen.com>

Message: SERVER-17083: don't start background capped thread in repair
Branch: master
https://github.com/mongodb/mongo/commit/ac9ee2fb80f2afc2737a0d9f346cff8117a82af2
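
The commit message ("don't start background capped thread in repair") suggests a guard at construction time. The sketch below shows one way such a guard could look. It is an assumption for illustration and not the contents of the linked commit: `SketchRecordStore`, `isOplog`, `inRepairMode`, and `hasBackgroundThread` are all hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <thread>

// Hypothetical record store (illustrative only, not MongoDB's class).
class SketchRecordStore {
public:
    SketchRecordStore(bool isOplog, bool inRepairMode) {
        // Only start the clean-up thread for the oplog, and never during repair.
        if (isOplog && !inRepairMode) {
            _cleaner = std::make_unique<std::thread>([this] {
                while (!_stop.load()) {
                    // Placeholder for periodic clean-up work.
                    std::this_thread::sleep_for(std::chrono::milliseconds(1));
                }
            });
        }
    }

    ~SketchRecordStore() {
        // With no thread started there is nothing to wait for, so no timeout can trip.
        if (_cleaner) {
            _stop.store(true);
            _cleaner->join();
        }
    }

    bool hasBackgroundThread() const { return _cleaner != nullptr; }

private:
    std::atomic<bool> _stop{false};
    std::unique_ptr<std::thread> _cleaner;
};
```

Under this assumption, a store created during repair never owns a background thread, so its destructor has no timed wait that could fail.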
