  1. Core Server
  2. SERVER-10420

replmonitor_bad_seed.js fails with auth because it tries to read all user data while a shard is down

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 2.5.5
    • Affects Version/s: None
    • Labels:
      None
    • Environment:
      OS X 10.5 64-bit DUR OFF
      Solaris-SmartOS 64-bit
      Linux 64-bit
      (probably others)
    • Operating System: ALL

      This issue just appeared on master for three separate builders:

      Linux 64-bit Build #5684 Aug 2 rev f736febe5a0
      http://buildlogs.mongodb.org/Linux%2064-bit/builds/5684/test/sharding/replmonitor_bad_seed.js

      Solaris-SmartOS 64-bit Build #1192 Aug 2 rev 07faf6eef1
      http://buildlogs.mongodb.org/Solaris-SmartOS%2064-bit/builds/1192/test/sharding/replmonitor_bad_seed.js

      OS X 10.5 64-bit DUR OFF Build #2527 Aug 2 rev f736febe5a0
      http://buildlogs.mongodb.org/OS%20X%2010.5%2064-bit%20DUR%20OFF/builds/2527/test/sharding/replmonitor_bad_seed.js

      ReplSetTest stopSet *** Shut down repl set - test worked ****
      2013-08-02 20:45:24 EDT	
       m30999| Sat Aug  3 00:45:23.360 [mongosMain] dbexit: received signal 15 rc:0 received signal 15
       m29000| Sat Aug  3 00:45:23.362 [conn3] end connection 10.29.160.141:39516 (4 connections now open)
       m29000| Sat Aug  3 00:45:23.362 [conn4] end connection 10.29.160.141:39517 (3 connections now open)
       m29000| Sat Aug  3 00:45:23.362 [conn5] end connection 10.29.160.141:39518 (2 connections now open)
      Sat Aug  3 00:45:24.361 shell: stopped mongo program on port 30999
      Sat Aug  3 00:45:24.367 shell: started program /data/buildslaves/Linux_64bit/mongo/mongos --port 30999 --configdb gcov1:29000 --chunkSize 50 --setParameter enableTestCommands=1 --setParameter enableTestCommands=1
      2013-08-02 20:45:26 EDT	
       m30999| Sat Aug  3 00:45:24.385 warning: running with 1 config server should be done only for testing purposes and is not recommended for production
       m30999| Sat Aug  3 00:45:24.387 [mongosMain] MongoS version 2.5.2-pre- starting: pid=4588 port=30999 64-bit host=gcov1 (--help for usage)
       m30999| Sat Aug  3 00:45:24.387 [mongosMain] git version: f736febe5a0d6d5a197b012eebad0243161830b6
       m30999| Sat Aug  3 00:45:24.387 [mongosMain] build info: Linux gcov1 3.2.20-1.29.6.amzn1.x86_64 #1 SMP Tue Jun 12 01:19:28 UTC 2012 x86_64 BOOST_LIB_VERSION=1_49
       m30999| Sat Aug  3 00:45:24.387 [mongosMain] options: { chunkSize: 50, configdb: "gcov1:29000", port: 30999, setParameter: [ "enableTestCommands=1", "enableTestCommands=1" ] }
       m29000| Sat Aug  3 00:45:24.388 [initandlisten] connection accepted from 10.29.160.141:39545 #7 (3 connections now open)
       m29000| Sat Aug  3 00:45:24.404 [initandlisten] connection accepted from 10.29.160.141:39546 #8 (4 connections now open)
       m29000| Sat Aug  3 00:45:24.409 [initandlisten] connection accepted from 10.29.160.141:39548 #9 (5 connections now open)
       m30999| Sat Aug  3 00:45:24.410 [mongosMain] ChunkManager: time to load chunks for test.user: 0ms sequenceNumber: 2 version: 1|0||51fc52a0c9a1d525c77b20bb based on: (empty)
       m30999| Sat Aug  3 00:45:24.410 [mongosMain] starting new replica set monitor for replica set test-rs0 with seed of gcov1:31100,gcov1:31101,gcov1:31102
       m30999| Sat Aug  3 00:45:24.410 [mongosMain] error connecting to seed gcov1:31100, err: couldn't connect to server gcov1:31100
       m30999| Sat Aug  3 00:45:24.410 [mongosMain] error connecting to seed gcov1:31101, err: couldn't connect to server gcov1:31101
       m30999| Sat Aug  3 00:45:24.411 [mongosMain] error connecting to seed gcov1:31102, err: couldn't connect to server gcov1:31102
       m30999| Sat Aug  3 00:45:26.411 [mongosMain] warning: No primary detected for set test-rs0
       m30999| Sat Aug  3 00:45:26.411 [mongosMain] All nodes for set test-rs0 are down. This has happened for 1 checks in a row. Polling will stop after 29 more failed checks
       m30999| Sat Aug  3 00:45:26.411 [mongosMain] replica set monitor for replica set test-rs0 started, address is test-rs0/
       m30999| Sat Aug  3 00:45:26.411 [ReplicaSetMonitorWatcher] starting
       m30999| Sat Aug  3 00:45:26.411 [mongosMain] Initializing user data failed: Unknown error code 11002 socket exception [CONNECT_ERROR] server [test-rs0/gcov1:31100,gcov1:31101,gcov1:31102] mongos connectionpool error: connect failed to replica set test-rs0/gcov1:31100,gcov1:31101,gcov1:31102
       m29000| Sat Aug  3 00:45:26.412 [conn8] end connection 10.29.160.141:39546 (4 connections now open)
       m29000| Sat Aug  3 00:45:26.412 [conn7] end connection 10.29.160.141:39545 (4 connections now open)
       m29000| Sat Aug  3 00:45:26.413 [conn9] end connection 10.29.160.141:39548 (2 connections now open)
      Could not start mongo program at 30999, process ended
      Sat Aug  3 00:45:26.575 TypeError: Cannot call method 'getDB' of null at /data/buildslaves/Linux_64bit/mongo/jstests/sharding/replmonitor_bad_seed.js:35
      failed to load: /data/buildslaves/Linux_64bit/mongo/jstests/sharding/replmonitor_bad_seed.js
      

      I can easily reproduce this failure on Linux 64-bit debug. Kicking off a git-bisect run now.
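The `TypeError` at line 35 of the test looks like a secondary symptom: the log shows mongos exiting during startup ("Could not start mongo program at 30999, process ended"), after which the test calls `getDB` on a null connection. A minimal sketch of that failure mode follows, using plain JavaScript. `startMongosStub` and `getDbOrFail` are hypothetical names invented for illustration and are not part of the mongo shell or of the test itself; the assumption that the launcher yields null on a failed start is inferred from the error message only.

```javascript
// Hypothetical stand-in for the helper that launches mongos.
// Assumption: as in the log, it yields null when the process
// exits during startup instead of throwing.
function startMongosStub(processSurvivesStartup) {
    if (!processSurvivesStartup) {
        return null;
    }
    return {
        getDB: function (name) {
            return { name: name };
        }
    };
}

// Guarded access: report the real cause instead of a bare TypeError.
function getDbOrFail(conn, dbName) {
    if (conn === null) {
        throw new Error("mongos did not start; cannot open db '" + dbName + "'");
    }
    return conn.getDB(dbName);
}

// Unguarded call: calling a method on null throws a TypeError,
// the same class of error reported by the test.
var unguardedError = null;
try {
    startMongosStub(false).getDB("test");
} catch (e) {
    unguardedError = e;
}

// Guarded call: the error message points at the failed startup.
var guardedError = null;
try {
    getDbOrFail(startMongosStub(false), "test");
} catch (e) {
    guardedError = e;
}
```

A guard like this would only make the test output easier to read. It does not address why mongos exits, which the log attributes to "Initializing user data failed" while all nodes of test-rs0 are down.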

            Assignee: spencer@mongodb.com Spencer Brody (Inactive)
            Reporter: matt.kangas Matt Kangas