Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-21100

InitialSync finishes before indexes finish building

    • Type: Icon: Bug Bug
    • Resolution: Cannot Reproduce
    • Priority: Icon: Major - P3 Major - P3
    • None
    • Affects Version/s: 3.0.7
    • Component/s: Replication
    • None
    • ALL

      I recently added a new node to our replica set. It caused errors in our application, because it finished the initialSync and switched to a valid SECONDARY before its indexes finished building. Since we send some of the queries to our secondary, that resulted with a) really slow queries, and b) errors (since we rely on hints in some scenarios)

      Below is a short excerpt from our logs showing the issue. Let me know if you need the whole log.

      2015-10-23T05:23:45.533+0000 I -        [repl index builder 9]   Index Build (background): 2787300/32715281 8%
      2015-10-23T05:23:47.133+0000 I REPL     [rsSync] initial sync finishing up
      2015-10-23T05:23:47.133+0000 I REPL     [rsSync] replSet set minValid=5629c44b:6
      2015-10-23T05:23:47.133+0000 I REPL     [rsSync] initial sync done
      2015-10-23T05:23:47.153+0000 I REPL     [ReplicationExecutor] transition to RECOVERING
      2015-10-23T05:23:47.160+0000 I REPL     [ReplicationExecutor] transition to SECONDARY
      2015-10-23T05:23:47.754+0000 I REPL     [ReplicationExecutor] could not find member to sync from
      2015-10-23T05:23:48.008+0000 I -        [repl index builder 11]   Index Build (background): 2792400/32715290 8%(...)
      (...)
      2015-10-23T05:48:19.019+0000 I -        [repl index builder 10]   Index Build (background): 5575800/32715628 17%
      2015-10-23T05:48:19.020+0000 I -        [repl index builder 16]   Index Build (background): 5575800/32715628 17%
      2015-10-23T05:48:19.885+0000 I NETWORK  [initandlisten] connection accepted from 10.21.16.41:43272 #3573 (40 connections now open)
      2015-10-23T05:48:19.898+0000 I ACCESS   [conn3573] Successfully authenticated as principal __system on local
      2015-10-23T05:48:19.908+0000 I QUERY    [conn3573] assertion 17007 Unable to execute query: error processing query: ns=closeio.activity limit=50 skip=0
      Tree: $and
          _cls == "Activity.Call"
          direction == "inbound"
          organization == { $ref: "organization", $id: "orga_alksjdfalksjdfklajc" }
          phone == "+6503334444"
          plivo_number $in [ { $ref: "plivo_number", $id: ObjectId('aklsjdfljsdfklajsdf') } ]
      Sort: { date_created: -1 }
      Proj: {}
       planner returned error: bad hint ns:closeio.activity query:{ $query: { plivo_number: { $in: [ { $ref: "plivo_number", $id: ObjectId('aklsjdfljsdfklajsdf') } ] }, organization: { $ref: "organization", $id: "orga_alksjdfalksjdfklajc" }, direction: "inbound", _cls: "Activity.Call", phone: "+6503334444" }, $readPreference: { mode: "secondary" }, $hint: { _cls: 1, organization: 1, phone: 1, lead: 1 }, $orderby: { date_created: -1 } }
      2015-10-23T05:48:19.908+0000 I QUERY    [conn3573]  ntoskip:0 ntoreturn:50
      2015-10-23T05:48:20.007+0000 I -        [repl index builder 4]   Index Build (background): 5577900/32715629 17%
      2015-10-23T05:48:20.009+0000 I -        [repl index builder 17]   Index Build (background): 5577900/32715629 17%
      2015-10-23T05:48:20.011+0000 I -        [repl index builder 12]   Index Build (background): 5577900/32715629 17%
      (...)
      2015-10-23T12:04:08.042+0000 I -        [repl index builder 13]   Index Build (background): 32749400/32752875 99%
      2015-10-23T12:04:08.046+0000 I -        [repl index builder 16]   Index Build (background): 32749400/32752875 99%
      2015-10-23T12:04:08.965+0000 I INDEX    [repl index builder 4] build index done.  scanned 32752154 total records. 34423 secs
      2015-10-23T12:04:08.966+0000 I INDEX    [repl index builder 3] build index done.  scanned 32752154 total records. 34423 secs
      2015-10-23T12:04:08.967+0000 I INDEX    [repl index builder 8] build index done.  scanned 32752154 total records. 25852 secs
      2015-10-23T12:04:08.969+0000 I INDEX    [repl index builder 12] build index done.  scanned 32752154 total records. 25851 secs
      2015-10-23T12:04:08.969+0000 I INDEX    [repl index builder 14] build index done.  scanned 32752154 total records. 25851 secs
      2015-10-23T12:04:08.970+0000 I INDEX    [repl index builder 5] build index done.  scanned 32752154 total records. 34423 secs
      2015-10-23T12:04:08.974+0000 I INDEX    [repl index builder 13] build index done.  scanned 32752154 total records. 25851 secs
      2015-10-23T12:04:08.975+0000 I INDEX    [repl index builder 16] build index done.  scanned 32752154 total records. 25851 secs
      2015-10-23T12:04:08.976+0000 I INDEX    [repl index builder 10] build index done.  scanned 32752154 total records. 25852 secs
      2015-10-23T12:04:08.977+0000 I INDEX    [repl index builder 15] build index done.  scanned 32752154 total records. 25851 secs
      2015-10-23T12:04:08.978+0000 I INDEX    [repl index builder 17] build index done.  scanned 32752154 total records. 25851 secs
      2015-10-23T12:04:08.978+0000 I INDEX    [repl index builder 9] build index done.  scanned 32752154 total records. 25852 secs
      2015-10-23T12:04:08.979+0000 I INDEX    [repl index builder 11] build index done.  scanned 32752154 total records. 25851 secs
      

        1. ciodb17_2015_10_23.log
          6.63 MB
          Stefan Wojcik
        2. ciodb1a_2015_10_23.log
          17.39 MB
          Stefan Wojcik
        3. shard_1_conf.log
          0.8 kB
          Stefan Wojcik

            Assignee:
            wan.bachtiar@mongodb.com Wan Bachtiar
            Reporter:
            stefan@close.io Stefan Wojcik
            Votes:
            3 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated:
              Resolved: