  1. Core Server
  2. SERVER-17903

When corruption detected, server continues to run and sync secondaries


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 2.6.4
    • Fix Version/s: None
    • Component/s: Stability, Storage
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL

      Description

      The server hit the following corruption assertion several times but continued to run. Secondaries later synced from this server, several failovers occurred, and the replica set ended up in an inconsistent state in which the primary contained fewer documents than one of the secondaries.

      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\util\stacktrace.cpp(169)                           mongo::printStackTrace+0x43
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\util\log.cpp(127)                                  mongo::logContext+0x97
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\util\assert_util.cpp(183)                          mongo::msgasserted+0xf7
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\util\assert_util.cpp(174)                          mongo::msgasserted+0x13
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\bson\bson-inl.h(219)                               mongo::BSONObj::_assertInvalid+0x46b
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\db\exec\fetch.cpp(111)                             mongo::FetchStage::work+0x1a2
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\db\query\plan_executor.cpp(91)                     mongo::PlanExecutor::getNext+0x15f
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\db\query\cached_plan_runner.cpp(71)                mongo::CachedPlanRunner::getNext+0x53
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\db\query\new_find.cpp(561)                         mongo::newRunQuery+0xb80
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\db\instance.cpp(269)                               mongo::receivedQuery+0x406
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\db\instance.cpp(437)                               mongo::assembleResponse+0x2f9
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\db\db.cpp(202)                                     mongo::MyMessageHandler::process+0x10c
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\mongo\util\net\message_server_port.cpp(210)              mongo::PortMessageServer::handleIncomingMsg+0x67f
      2015-02-03T15:00:43.054+0000 [conn427] mongod.exe    ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(185)  boost::`anonymous namespace'::thread_start_function+0x21
      2015-02-03T15:00:43.054+0000 [conn427] MSVCR100.dll                                                                   endthreadex+0x43
      2015-02-03T15:00:43.054+0000 [conn427] MSVCR100.dll                                                                   endthreadex+0xdf
      2015-02-03T15:00:43.054+0000 [conn427] kernel32.dll                                                                   BaseThreadInitThunk+0xd
      2015-02-03T15:00:43.054+0000 [conn427] DDD.CCC*
      2015-02-03T15:00:43.226+0000 [conn427] assertion 10334 BSONObj size: 0 (0x0) is invalid. Size must be between 0 and 16793600(16MB) First element: EOO ns:DDD.CCC* query:{ $query: { BucketId: "default", StreamId: "143655635", StreamRevisionTo: { $gte: 0 }, StreamRevisionFrom: { $lte: 1 } }, $orderby: { StreamRevisionFrom: 1 } }
      2015-02-03T15:00:43.226+0000 [conn437] serverStatus was very slow: { after basic: 0, after asserts: 0, after backgroundFlushing: 0, after connections: 0, after cursors: 0, after dur: 0, after extra_info: 0, after globalLock: 0, after indexCounters: 0, after locks: 0, after network: 0, after opcounters: 0, after opcountersRepl: 0, after recordStats: 23848, after repl: 23848, at end: 23848 }
      2015-02-03T15:00:43.398+0000 [conn504] Assertion: 10334:BSONObj size: 0 (0x0) is invalid. Size must be between 0 and 16793600(16MB) First element: EOO
      2015-02-03T15:00:43.460+0000 [rsHealthPoll] warning: Failed to connect to NNN.NNN.NNN.NNN*:27017, reason: errno:10061 No connection could be made because the target machine actively refused it.
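
      For readers unfamiliar with assertion 10334, the following is a minimal sketch of the kind of length-prefix check that rejects such a record. It is not the server's code: the function name `check_bson_header` and the sample buffers are hypothetical, the upper bound is taken from the log message above, and the 5-byte lower bound comes from the BSON specification (a 4-byte little-endian int32 length plus a trailing 0x00). A zero-filled buffer is used only as an assumption consistent with "size: 0" and "First element: EOO"; the actual on-disk bytes are not shown in this report.

```python
import struct

# Upper bound quoted in the log message above.
MAX_INTERNAL_SIZE = 16793600
# Per the BSON spec, the smallest document is 5 bytes:
# a 4-byte little-endian int32 length plus a trailing 0x00 terminator.
MIN_DOCUMENT_SIZE = 5


def check_bson_header(buf: bytes) -> int:
    """Return the declared document size, or raise ValueError if it is implausible."""
    if len(buf) < 4:
        raise ValueError("buffer too short to hold a BSON length prefix")
    # The first four bytes hold the total document length, little-endian.
    (size,) = struct.unpack_from("<i", buf, 0)
    if size < MIN_DOCUMENT_SIZE or size > MAX_INTERNAL_SIZE:
        raise ValueError(f"BSONObj size: {size} (0x{size & 0xFFFFFFFF:x}) is invalid")
    if size > len(buf):
        raise ValueError("declared size exceeds available bytes")
    if buf[size - 1] != 0:
        raise ValueError("document is not terminated by 0x00")
    return size


# Hypothetical sample buffers for illustration only.
empty_doc = b"\x05\x00\x00\x00\x00"  # a valid empty document {}
zeroed_record = b"\x00" * 16         # zero-filled bytes: declared size 0
```

      Calling `check_bson_header(zeroed_record)` raises a `ValueError`, while `check_bson_header(empty_doc)` returns the declared size. A check like this only detects the bad record when it is read; it does not stop the node from serving other data or from being used as a sync source, which is the behaviour this ticket reports.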
      

              People

              Assignee:
              pasette Daniel Pasette
              Reporter:
              ger.hartnett Ger Hartnett
              Votes:
              0
              Watchers:
              12
