Core Server / SERVER-5927

During a full resync, when a connection error occurs, the sync thread hits an assertion and the full resync restarts from the beginning. Is this expected behavior?

    • Type: Question
    • Resolution: Incomplete
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 2.0.2
    • Component/s: Replication
    • Environment:
      Oracle Linux 5.7, iSCSI (not shared with network)
      4 shards, 5 members per shard

      We tried a full resync (removed all files in the dbpath, then restarted mongod); a sketch of the procedure is given after this description.

      After a while, we found the messages below, and mongod started the full sync over again from the beginning.

      Is this expected behavior?
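
      A minimal sketch of that resync procedure, reusing the dbpath and mongod binary path that appear in the log; the replica-set name and port are illustrative placeholders, not values from this report:

      # Sketch of the full-resync procedure described above: stop the member,
      # empty its dbpath, then restart mongod so it re-enters initial sync.
      import shutil
      import subprocess
      from pathlib import Path

      DBPATH = Path("/log/data/repl01_2")            # dbpath seen in the log
      MONGOD = "/home/logadmin/mongodb/bin/mongod"   # binary path from the trace

      def full_resync(replset: str, port: int) -> None:
          # 1. Remove everything under the dbpath (mongod must be stopped first).
          for entry in DBPATH.iterdir():
              if entry.is_dir():
                  shutil.rmtree(entry)
              else:
                  entry.unlink()
          # 2. Restart mongod; an empty dbpath makes the member perform a full
          #    initial sync (clone of all databases) from another member.
          subprocess.Popen([MONGOD, "--replSet", replset,
                            "--dbpath", str(DBPATH), "--port", str(port)])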

      =======================================================================================
      Fri May 25 01:26:12 [rsSync] 8120473 objects cloned so far from collection mylog.fs.chunks
      Fri May 25 01:26:18 [FileAllocator] allocating new datafile /log/data/repl01_2/mylog.108, filling with zeroes...
      Fri May 25 01:26:32 [FileAllocator] done allocating datafile /log/data/repl01_2/mylog.108, size: 2047MB, took 14.517 secs
      Fri May 25 01:26:33 [rsSync] clone mylog.fs.chunks 8142975
      ...... # some connection messages
      Fri May 25 01:26:56 [rsSync] Socket recv() errno:104 Connection reset by peer fc1301:35000
      Fri May 25 01:26:56 [rsSync] SocketException: remote: fc1301:35000 error: 9001 socket exception [1] server [fc1301:35000]
      Fri May 25 01:26:56 [rsSync] Assertion: 13273:single data buffer expected
      0x584722 0x5df960 0x5e18c8 0x5bf339 0x84cb77 0x84eb15 0x850409 0x82bfbe 0x82dbd3 0x826ee1 0x826f9a 0x827420 0xaa80b0 0x37c900673d 0x37c84d3d1d
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo11msgassertedEiPKc+0x112) [0x584722]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo14DBClientCursor12dataReceivedERbRSs+0x180) [0x5df960]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo14DBClientCursor18exhaustReceiveMoreEv+0x158) [0x5e18c8]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo18DBClientConnection5queryEN5boost8functionIFvRNS_27DBClientCursorBatchIteratorEEEERKSsNS_5QueryEPKNS_7BSONObjEi+0x209) [0x5bf339]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo6Cloner4copyEPKcS2_bbbbbbNS_5QueryE+0x3a7) [0x84cb77]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo6Cloner2goEPKcRSsRKSsbbbbbbPi+0x1665) [0x84eb15]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo9cloneFromEPKcRSsRKSsbbbbbbPi+0x59) [0x850409]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo11ReplSetImpl18_syncDoInitialSyncEv+0xe5e) [0x82bfbe]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo11ReplSetImpl17syncDoInitialSyncEv+0x23) [0x82dbd3]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo11ReplSetImpl11_syncThreadEv+0x61) [0x826ee1]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo11ReplSetImpl10syncThreadEv+0x4a) [0x826f9a]
      /home/logadmin/mongodb/bin/mongod(_ZN5mongo15startSyncThreadEv+0xa0) [0x827420]
      /home/logadmin/mongodb/bin/mongod(thread_proxy+0x80) [0xaa80b0]
      /lib64/libpthread.so.0 [0x37c900673d]
      /lib64/libc.so.6(clone+0x6d) [0x37c84d3d1d]
      Fri May 25 01:26:57 [rsSync] Socket flush send() errno:9 Bad file descriptor fc1301:35000
      Fri May 25 01:26:57 [rsSync] mylog caught exception (socket exception) in destructor (~PiggyBackData)
      Fri May 25 01:26:57 [rsSync] replSet initial sync exception 13273 single data buffer expected
      ...... # some writebacklisten messages
      Fri May 25 01:27:27 [rsSync] replSet initial sync pending
      Fri May 25 01:27:27 [rsSync] replSet syncing to: fc1301:35000
      Fri May 25 01:27:27 [rsSync] replSet initial sync drop all databases
      Fri May 25 01:27:27 [rsSync] dropAllDatabasesExceptLocal 2
      Fri May 25 01:27:27 [rsSync] removeJournalFiles
      =======================================================================================
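
      What the log shows is that the [rsSync] thread treats the clone failure as fatal for that initial-sync attempt: roughly 30 seconds after the exception (01:26:57 to 01:27:27) it logs "initial sync pending", drops all databases, removes the journal files, and starts the clone over from the beginning rather than resuming where it stopped. A rough behavioral model of that loop, matching the functions in the stack trace (ReplSetImpl::_syncThread calling syncDoInitialSync) but not the actual mongod source:

      import time

      RETRY_DELAY_SECS = 30  # matches the 01:26:57 -> 01:27:27 gap in the log

      def sync_thread(clone_all_databases, drop_all_databases, remove_journal_files):
          # Any exception during the clone aborts the whole attempt; the retry
          # starts from an empty data set instead of resuming the clone.
          while True:
              try:
                  clone_all_databases()        # Cloner::go / Cloner::copy above
                  return                       # initial sync completed
              except Exception as exc:         # e.g. 13273 "single data buffer expected"
                  print("replSet initial sync exception:", exc)
                  time.sleep(RETRY_DELAY_SECS) # "replSet initial sync pending"
                  drop_all_databases()         # "initial sync drop all databases"
                  remove_journal_files()       # "removeJournalFiles"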

            Assignee:
            Mathias Stearn (mathias@mongodb.com)
            Reporter:
            Kihyun Kim (k2hyun)
            Votes:
            0
            Watchers:
            3
