SERVER-17114 (Core Server)

WiredTiger capped collection cursor needs to go DEAD when restoreState to a deleted doc

    • Type: Bug
    • Resolution: Done
    • Priority: Critical - P2
    • Fix Version/s: 3.0.0-rc8
    • Affects Version/s: 3.0.0-rc7
    • Component/s: Replication
    • Labels:
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL

      3 member replica set, c3.2xlarge instances, 8 cpu, 15g

      Socialite load workload: java -jar target/socialite-0.0.1-SNAPSHOT.jar load --users 10000000 --maxfollows 1000 --messages 2000 --threads 32 sample-config.yml

      While running a heavy insert workload on a three-node replica set, the secondaries fell behind. One quit with Fatal Assertion 18750; the other stayed in RECOVERING.

      It seems that after going into RECOVERING, the secondary tried syncing from the other secondary, which had also gone into RECOVERING, and then asserted (EDIT: the secondary that asserted went into RECOVERING before the other secondary):

      2015-01-29T11:30:54.612+0000 I REPL     [ReplicationExecutor] syncing from: shard2-01.testdomain.com:27017
      2015-01-29T11:30:54.615+0000 W REPL     [rsBackgroundSync] we are too stale to use shard2-01.testdomain.com:27017 as a sync source
      2015-01-29T11:30:54.615+0000 I REPL     [ReplicationExecutor] could not find member to sync from 
      2015-01-29T11:30:54.615+0000 I REPL     [rsBackgroundSync] replSet error RS102 too stale to catch up
      2015-01-29T11:30:54.617+0000 I REPL     [rsBackgroundSync] replSet our last optime : Jan 29 09:56:58 54ca03ea:1c1f
      2015-01-29T11:30:54.617+0000 I REPL     [rsBackgroundSync] replSet oldest available is Jan 29 10:03:57 54ca058d:1803
      2015-01-29T11:30:54.617+0000 I REPL     [rsBackgroundSync] replSet See http://dochub.mongodb.org/core/resyncingaverystalereplicasetmember
      2015-01-29T11:30:54.737+0000 I REPL     [ReplicationExecutor] transition to RECOVERING
      2015-01-29T11:30:54.925+0000 I NETWORK  [conn3286] end connection 10.93.30.140:59321 (5 connections now open)
      2015-01-29T11:30:54.932+0000 I NETWORK  [initandlisten] connection accepted from 10.93.30.140:59323 #3288 (6 connections now open)
      2015-01-29T11:31:17.471+0000 I QUERY    [conn1] assertion 13436 not master or secondary; cannot currently read from this replSet member ns:config.settings query:{}
      2015-01-29T11:31:24.407+0000 I NETWORK  [conn3287] end connection 10.93.30.139:60915 (5 connections now open)
      2015-01-29T11:31:24.408+0000 I NETWORK  [initandlisten] connection accepted from 10.93.30.139:60917 #3289 (6 connections now open)
      2015-01-29T11:31:24.942+0000 I NETWORK  [conn3288] end connection 10.93.30.140:59323 (5 connections now open)
      2015-01-29T11:31:24.943+0000 I NETWORK  [initandlisten] connection accepted from 10.93.30.140:59325 #3290 (6 connections now open)
      2015-01-29T11:31:35.744+0000 I REPL     [ReplicationExecutor] syncing from: shard2-03.testdomain.com:27017
      2015-01-29T11:31:36.077+0000 I REPL     [SyncSourceFeedback] replset setting syncSourceFeedback to shard2-03.testdomain.com:27017
      2015-01-29T11:31:36.360+0000 I REPL     [rsBackgroundSync] replSet our last op time fetched: Jan 29 09:56:58:1c1f
      2015-01-29T11:31:36.360+0000 I REPL     [rsBackgroundSync] replset source's GTE: Jan 29 09:57:02:5a0
      2015-01-29T11:31:36.361+0000 F REPL     [rsBackgroundSync] replSet need to rollback, but in inconsistent state
      2015-01-29T11:31:36.361+0000 I REPL     [rsBackgroundSync] minvalid: 54ca058d:1803 our last optime: 54ca03ea:1c1f
      2015-01-29T11:31:36.362+0000 I -        [rsBackgroundSync] Fatal Assertion 18750
      2015-01-29T11:31:36.362+0000 I -        [rsBackgroundSync] 
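
The optime comparisons in the log above can be reproduced with a short sketch. It assumes the `hhhhhhhh:hhhh` strings are a hexadecimal seconds-since-epoch value followed by a hexadecimal increment, which matches the wall-clock times printed next to them in the log. The names `OpTime`, `parse_optime`, and `is_too_stale` are hypothetical helpers for illustration, not MongoDB's actual implementation.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class OpTime(NamedTuple):
    """Illustrative optime: (seconds since epoch, increment within that second)."""
    seconds: int
    increment: int


def parse_optime(text: str) -> OpTime:
    """Parse a log-style optime such as '54ca03ea:1c1f' (both parts assumed hex)."""
    secs_hex, inc_hex = text.split(":")
    return OpTime(int(secs_hex, 16), int(inc_hex, 16))


def is_too_stale(our_last: OpTime, source_oldest: OpTime) -> bool:
    """True when our last applied op is older than the oldest op the source still has."""
    return our_last < source_oldest  # tuple ordering: seconds first, then increment


def as_utc(optime: OpTime) -> str:
    """Render the seconds part the way the log prints it, e.g. 'Jan 29 09:56:58'."""
    return datetime.fromtimestamp(optime.seconds, tz=timezone.utc).strftime("%b %d %H:%M:%S")


our_last = parse_optime("54ca03ea:1c1f")       # "our last optime" in the log
source_oldest = parse_optime("54ca058d:1803")  # "oldest available" / "minvalid" in the log

print(as_utc(our_last), as_utc(source_oldest))
print("too stale:", is_too_stale(our_last, source_oldest))
print("gap in seconds:", source_oldest.seconds - our_last.seconds)
```

With the values from this log, the node's last optime is about seven minutes behind both the oldest entry available on the first sync source and the reported minvalid, which is consistent with the RS102 message and the later "inconsistent state" assertion.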
      

        1. shard2-02.log.gz
          91 kB
        2. shard2-03.log.gz
          354 kB

            Assignee: Eric Milkie (milkie@mongodb.com)
            Reporter: Michael Grundy (michael.grundy)
            Votes: 0
            Watchers: 7

              Created:
              Updated:
              Resolved: