Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-10042

Replica fails BSONElement: bad type -53

    • Type: Icon: Bug Bug
    • Resolution: Done
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 2.4.4
    • Component/s: Replication
    • Environment:
    • Windows
    • Hide

      We have a replica set with two nodes and one arbiter and mapreduce operations.

      Show
      We have a replica set with two nodes and one arbiter and mapreduce operations.

      We have a replica set with two nodes and one arbiter.

      {
      "_id" : "rsstuff",
      "version" : 6,
      "members" : [

      { "_id" : 0, "host" : "LOG-MNG13:27017", "priority" : 10 }

      ,

      { "_id" : 1, "host" : "LOG-MNG23:27017" }

      ,

      { "_id" : 2, "host" : "LOG-MNG21:27018", "arbiterOnly" : true }

      ]
      }

      Last night, the replica have started to fail. I have reviewing the log (I'm going to attach it later), I could see the following:

      1) Communication error (tiemout) from secondary to primary

      Thu Jun 27 02:03:41.524 [initandlisten] connection accepted from 192.168.3.23:61211 #1879438 (200 connections now open)
      Thu Jun 27 02:03:41.883 [rsSyncNotifier] Socket recv() timeout 172.29.106.76:27017
      Thu Jun 27 02:03:41.883 [rsSyncNotifier] SocketException: remote: 172.29.106.76:27017 error: 9001 socket exception [3] server [172.29.106.76:27017]
      Thu Jun 27 02:03:41.883 [rsSyncNotifier] replset tracking exception: exception: 10278 dbclient error communicating with server: log-mng12:27017

      2) Try to reconnect

      Thu Jun 27 02:04:00.680 [rsSyncNotifier] replset setting oplog notifier to log-mng12:27017
      ...
      Thu Jun 27 02:04:04.939 [rsBackgroundSync] replSet syncing to: log-mng12:27017

      3) Then there was an error about size ("Size must be between 0 and 16793600(16MB)"). The aggregate.tmp.mr.xxx are temporary collection that they are created during mapreduce operations.

      Thu Jun 27 02:04:05.126 [repl writer worker 2] build index aggregate.tmp.mr.Statistics_HotelsMeansEntryAvailability_52339_inc

      { _id: 1 }

      Thu Jun 27 02:04:05.126 [repl writer worker 2] build index done. scanned 0 total records. 0.001 secs
      Thu Jun 27 02:04:05.126 [repl writer worker 2] info: creating collection aggregate.tmp.mr.Statistics_HotelsMeansEntryAvailability_52339_inc on add index
      Thu Jun 27 02:04:05.126 [repl writer worker 2] build index aggregate.tmp.mr.Statistics_HotelsMeansEntryAvailability_52339_inc

      { 0: 1 }

      Thu Jun 27 02:04:05.126 [repl writer worker 2] build index done. scanned 0 total records. 0 secs
      Thu Jun 27 02:04:05.126 [repl writer worker 1] build index aggregate.tmp.mr.Statistics_HotelsMeansEntryAvailability_52339

      { _id: 1 }

      Thu Jun 27 02:04:05.126 [repl writer worker 1] build index done. scanned 0 total records. 0.001 secs
      Thu Jun 27 02:04:05.126 [repl writer worker 1] info: indexing in foreground on this replica; was a background index build on the primary
      Thu Jun 27 02:04:05.126 [repl writer worker 1] build index aggregate.tmp.mr.Statistics_HotelsMeansEntryAvailability_52339

      { _id.f: 1.0, _id.dc: 1.0, _id.dz: 1.0, _id.dh: 1.0, _id.arr: 1.0, _id.nights: 1.0, _id.na: 1.0, _id.nc: 1.0 }

      Thu Jun 27 02:04:05.142 [repl writer worker 1] build index done. scanned 0 total records. 0 secs
      Thu Jun 27 02:04:05.142 [repl writer worker 1] build index aggregate.tmp.mr.Statistics_HotelsBookingEngine_52338_inc

      { _id: 1 }

      Thu Jun 27 02:04:05.142 [rsBackgroundSync] Assertion: 10334:BSONObj size: 1845624932 (0x6400026E) is invalid. Size must be between 0 and 16793600(16MB) First element: : ?type=115
      Thu Jun 27 02:04:05.142 [repl writer worker 1] build index done. scanned 0 total records. 0.001 secs
      Thu Jun 27 02:04:05.142 [repl writer worker 1] info: creating collection aggregate.tmp.mr.Statistics_HotelsBookingEngine_52338_inc on add index
      Thu Jun 27 02:04:05.142 [repl writer worker 1] build index aggregate.tmp.mr.Statistics_HotelsBookingEngine_52338_inc

      { 0: 1 }

      Thu Jun 27 02:04:05.142 [repl writer worker 1] build index done. scanned 0 total records. 0 secs
      Thu Jun 27 02:04:05.158 [repl writer worker 7] build index aggregate.tmp.mr.Statistics_HotelsBookingEngine_52338

      { _id: 1 }

      4) Then the log shows a dump

      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe ...\src\mongo\util\stacktrace.cpp(189) mongo::printStackTrace+0x3e
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe ...\src\mongo\util\assert_util.cpp(159) mongo::msgasserted+0xc1
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe ...\src\mongo\bson\bson-inl.h(219) mongo::BSONObj::_assertInvalid+0x3e6
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe ...\src\mongo\client\dbclientcursor.cpp(238) mongo::DBClientCursor::next+0x153
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe ...\src\mongo\client\dbclientcursor.h(78) mongo::DBClientCursor::nextSafe+0x43
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe ...\src\mongo\db\oplogreader.h(127) mongo::OplogReader::nextSafe+0x44
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\bgsync.cpp(351) mongo::replset::BackgroundSync::produce+0x457
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe ...\src\mongo\db\repl\bgsync.cpp(241) mongo::replset::BackgroundSync::producerThread+0x98
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(180) boost::`anonymous namespace'::thread_start_function+0x21
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(314) _callthreadstartex+0x17
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] mongod.exe f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(292) _threadstartex+0x7f
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] KERNEL32.DLL BaseThreadInitThunk+0x1a
      Thu Jun 27 02:04:05.672 [rsBackgroundSync] replSet db exception in producer: 10334 BSONObj size: 1845624932 (0x6400026E) is invalid. Size must be between 0 and 16793600(16MB) First element: : ?type=115

      5) Then, I could see "BSONElement: bad type -53"

      Thu Jun 27 02:04:09.931 [initandlisten] connection accepted from 172.29.106.46:51733 #1879639 (201 connections now open)
      Thu Jun 27 02:04:10.150 [rsSync] Assertion: 10320:BSONElement: bad type -53
      Thu Jun 27 02:04:10.259 [conn1879163] end connection 172.29.106.52:16844 (200 connections now open)
      Thu Jun 27 02:04:10.259 [initandlisten] connection accepted from 172.29.106.52:19619 #1879640 (201 connections now open)
      Thu Jun 27 02:04:10.337 [initandlisten] connection accepted from 172.29.106.51:5754 #1879641 (202 connections now open)
      Thu Jun 27 02:04:10.337 [conn1879641] end connection 172.29.106.51:5754 (201 connections now open)
      Thu Jun 27 02:04:10.462 [initandlisten] connection accepted from 172.29.106.51:5756 #1879642 (202 connections now open)
      Thu Jun 27 02:04:10.462 [conn1879642] end connection 172.29.106.51:5756 (201 connections now open)
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe ...\src\mongo\util\stacktrace.cpp(189) mongo::printStackTrace+0x3e
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe ...\src\mongo\util\assert_util.cpp(159) mongo::msgasserted+0xc1
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe ...\src\mongo\bson\bson-inl.h(673) mongo::BSONElement::size+0x1b5
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe ...\src\mongo\db\repl\rs_sync.cpp(484) mongo::replset::SyncTail::tryPopAndWaitForMore+0x161
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe ...\src\mongo\db\repl\rs_sync.cpp(380) mongo::replset::SyncTail::oplogApplication+0x449
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe ...\src\mongo\db\repl\rs_sync.cpp(723) mongo::ReplSetImpl::_syncThread+0x9c
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe ...\src\mongo\db\repl\rs_sync.cpp(743) mongo::ReplSetImpl::syncThread+0xdc
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe ...\src\mongo\db\repl\rs_sync.cpp(769) mongo::startSyncThread+0x8a
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(180) boost::`anonymous namespace'::thread_start_function+0x21
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(314) _callthreadstartex+0x17
      Thu Jun 27 02:04:10.680 [rsSync] mongod.exe f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(292) _threadstartex+0x7f
      Thu Jun 27 02:04:10.680 [rsSync] KERNEL32.DLL BaseThreadInitThunk+0x1a
      Thu Jun 27 02:04:10.680 [rsSync] replSet syncThread: 10320 BSONElement: bad type -53

            Assignee:
            david.hows David Hows
            Reporter:
            david.verdejo@logitravel.com David Verdejo
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: