-
Type:
Bug
-
Resolution: Incomplete
-
Priority:
Major - P3
-
None
-
Affects Version/s: 3.0.1
-
Component/s: Replication, WiredTiger
-
None
-
ALL
-
-
None
-
None
-
None
-
None
-
None
-
None
-
None
I've upgraded my UAT environment to Mongo 3.0.1 with the WiredTiger storage engine & am repeatedly seeing my primary replicaset fallback to RECOVERY mode with the below stacktrace. The "prism-pubsub.intergate" collection mentioned in the stack trace is a capped collection which we're opening a tailable cursor on using the (just forked) mubsub library here: https://github.com/evanlh/mubsub.
2015-04-02T15:52:21.286-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\util\stacktrace_win.cpp(175) mongo::printStackTrace+0x43
2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\util\log.cpp(135) mongo::logContext+0x97
2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\util\assert_util.cpp(214) mongo::msgasserted+0xd7
2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\storage\wiredtiger\wiredtiger_record_store.cpp(330) mongo::WiredTigerRecordStore::dataFor+0xc7
2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\catalog\collection.cpp(157) mongo::Collection::docFor+0x52
2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\exec\idhack.cpp(153) mongo::IDHackStage::work+0x176
2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\query\plan_executor.cpp(328) mongo::PlanExecutor::getNextSnapshotted+0xdd
2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\query\plan_executor.cpp(286) mongo::PlanExecutor::getNext+0x41
2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\dbhelpers.cpp(151) mongo::Helpers::findOne+0x227
2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\dbhelpers.cpp(116) mongo::Helpers::findOne+0x37
2015-04-02T15:52:21.287-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\rs_rollback.cpp(583) mongo::repl::`anonymous namespace'::syncFixUp+0x18e7
2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\rs_rollback.cpp(774) mongo::repl::`anonymous namespace'::_syncRollback+0x395
2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\rs_rollback.cpp(827) mongo::repl::syncRollback+0x1f5
2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\bgsync.cpp(428) mongo::repl::BackgroundSync::_rollbackIfNeeded+0x590
2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\bgsync.cpp(262) mongo::repl::BackgroundSync::produce+0x406
2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\bgsync.cpp(200) mongo::repl::BackgroundSync::_producerThread+0x9a
2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\bgsync.cpp(153) mongo::repl::BackgroundSync::producerThread+0x86
2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] mongod.exe ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(185) boost::`anonymous namespace'::thread_start_function+0x21
2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] MSVCR120.dll beginthreadex+0x107
2015-04-02T15:52:21.288-0400 I CONTROL [rsBackgroundSync] MSVCR120.dll endthreadex+0x192
2015-04-02T15:52:21.289-0400 I CONTROL [rsBackgroundSync] kernel32.dll BaseThreadInitThunk+0xd
2015-04-02T15:52:21.289-0400 I CONTROL [rsBackgroundSync]
2015-04-02T15:52:21.486-0400 I REPL [rsBackgroundSync] replSet exception in rollback ns:prism-pubsub.intergate { _id: ObjectId('551d9db163449994281a07b4') } 28556 Didn't find RecordId in WiredTigerRecordStore ndeletes:9
2015-04-02T15:52:21.486-0400 I REPL [rsBackgroundSync] replSet 9 delete and 20 update operations processed out of 40 total operations
2015-04-02T15:52:21.765-0400 I REPL [ReplicationExecutor] replSetElect voting yea for algo-rh01-sdny.itgssi.com:27017 (1)
2015-04-02T15:52:22.379-0400 I REPL [ReplicationExecutor] Member algo-rh01-sdny.itgssi.com:27017 is now in state PRIMARY
2015-04-02T15:52:22.934-0400 I REPL [rsBackgroundSync] rollback 5 d:9 u:30
2015-04-02T15:52:22.934-0400 I REPL [rsBackgroundSync] rollback 6
2015-04-02T15:52:23.075-0400 W REPL [rsBackgroundSync] issues during syncRollback, see log
2015-04-02T15:52:23.075-0400 I REPL [ReplicationExecutor] transition to RECOVERING
2015-04-02T15:52:23.076-0400 I REPL [rsBackgroundSync] rollback finished
2015-04-02T15:52:23.091-0400 I REPL [ReplicationExecutor] syncing from: algo-rh01-sdny.itgssi.com:27017
- is related to
-
SERVER-18006 Shard replicas crash after 3.0.2 upgrade with "Didn't find RecordId in WiredTigerRecordStore"
-
- Closed
-