Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-11743

Add debugging to socket pending data check (was: auth.js and authConnectionHook.js intermittently failing linux32 debug)

    XMLWordPrintableJSON

Details

    • Icon: Bug Bug
    • Resolution: Duplicate
    • Icon: Major - P3 Major - P3
    • None
    • None
    • Build
    • None
    • ALL

    Description

      There are a couple more examples, but this is the earliest I saw.

      https://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_debug_103da34ddbc32d8345655471fc46afc6352a6624_13_11_14_14_28_21_sharding_linux_32

      https://mci.10gen.com/ui/task/mongodb_mongo_master_linux_32_debug_58ed9775c96676a8c6dab2a83ea4ee55f1dcaa38_13_11_14_15_55_13_sharding_linux_32

       m31100| 2013-11-16T01:09:57.803-0500 [initandlisten] connection accepted from 10.116.42.47:49186 #58 (23 connections now open)
       m31100| 2013-11-16T01:09:57.811-0500 [conn58]  authenticate db: local { authenticate: 1, nonce: "xxx", user: "__system", key: "xxx" }
       m31100| 2013-11-16T01:09:59.034-0500 [ReplicaSetMonitorWatcher] Socket recv() timeout  10.116.42.47:31100
       m31100| 2013-11-16T01:09:59.034-0500 [ReplicaSetMonitorWatcher] SocketException: remote: 10.116.42.47:31100 error: 9001 socket exception [RECV_TIMEOUT] server [10.116.42.47:31100] 
       m31100| 2013-11-16T01:09:59.034-0500 [ReplicaSetMonitorWatcher] DBClientCursor::init call() failed
       m31100| 2013-11-16T01:09:59.034-0500 [ReplicaSetMonitorWatcher] dev: lastError==0 won't report:DBClientBase::findN: transport error: ip-10-116-42-47:31100 ns: admin.$cmd query: { ismaster: 1 }
       
       ==> m31100| 2013-11-16T01:10:00.092-0500 [ReplicaSetMonitorWatcher] ERROR: Socket found pending data during connectivity check (idle 7 secs, remote host 10.116.42.47:31100)
       
       m31100| 2013-11-16T01:10:00.092-0500 [ReplicaSetMonitorWatcher] Fatal Assertion 16199
       m31100| 2013-11-16T01:10:00.101-0500 [ReplicaSetMonitorWatcher] 0x8e1d41d 0x8dc0c26 0x8d9ee51 0x8ddd2f0 0x83b919b 0x83c2a72 0x83d011f 0x83d381f 0x83d3a6d 0x83e840b 0x8da32c0 0x8da5d50 0x8e6782e 0x97c542 0x1e5b6e 
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo15printStackTraceERSo+0x2d) [0x8e1d41d]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo10logContextEPKc+0x166) [0x8dc0c26]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo13fassertFailedEi+0xf1) [0x8d9ee51]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo6Socket16isStillConnectedEv+0x390) [0x8ddd2f0]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo18DBClientConnection16isStillConnectedEv+0x2b) [0x83b919b]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo19_getConnWithRefreshERNS_17ReplicaSetMonitor4NodeE+0x32) [0x83c2a72]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo17ReplicaSetMonitor6_checkEv+0xbf) [0x83d011f]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo17ReplicaSetMonitor5checkEv+0x1df) [0x83d381f]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo17ReplicaSetMonitor8checkAllEv+0x1ed) [0x83d3a6d]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo24ReplicaSetMonitorWatcher3runEv+0xeb) [0x83e840b]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5mongo13BackgroundJob7jobBodyEv+0xe0) [0x8da32c0]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvNS_4_mfi3mf0IvN5mongo13BackgroundJobEEENS2_5list1INS2_5valueIPS7_EEEEEEE3runEv+0x30) [0x8da5d50]
       m31100|  /data/mci/git@github.commongodb/mongo.git/master/mongod [0x8e6782e]
       m31100|  /lib/i686/nosegneg/libpthread.so.0 [0x97c542]
       m31100|  /lib/i686/nosegneg/libc.so.6(clone+0x5e) [0x1e5b6e]
       m31100| 2013-11-16T01:10:00.101-0500 [ReplicaSetMonitorWatcher] 
       m31100| 
       m31100| ***aborting after fassert() failure
       m31100| 
       m31100| 
       m31100| Got signal: 6 (Aborted).

      Attachments

        Activity

          People

            milkie@mongodb.com Eric Milkie
            dan@mongodb.com Daniel Pasette (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: