Core Server / SERVER-19472

count() incorrect after recovery with WiredTiger

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: 3.0.3, 3.0.4, 3.1.5
    • Component/s: WiredTiger
    • None
    • ALL

      Insert documents on standalone/primary:

      for (var i = 0; i < 10000000; i++) { db.abc.insert({a: i, name: "abc"}); }
      

      Wait a while (until roughly 100k documents have been inserted), then 'kill -9' the mongod.

      Restart the process and check the stats:

      replset:SECONDARY> db.abc.count()
      77248
      replset:SECONDARY> db.abc.find({}).toArray().length
      145350
      replset:SECONDARY> db.abc.validate(true)
      {
      	"ns" : "test.abc",
      	"nrecords" : 145350,
      	"nIndexes" : 1,
      	"keysPerIndex" : {
      		"test.abc.$_id_" : 145350
      	},
      	"indexDetails" : {
      		"test.abc.$_id_" : {
      			"valid" : true
      		}
      	},
      	"valid" : true,
      	"errors" : [ ],
      	"ok" : 1
      }
      replset:SECONDARY> db.abc.count()
      145350
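
      The check in the transcript above can be sketched as a small shell helper. This is only an illustration: checkCount is a hypothetical name, and it assumes the argument behaves like a mongo shell collection object (count() and find().itcount()).

```javascript
// Compare the fast count with a count obtained by iterating every document.
// `coll` is assumed to behave like a mongo shell collection (e.g. db.abc).
function checkCount(coll) {
  var fast = coll.count();               // count() without a query, as in the report
  var scanned = coll.find({}).itcount(); // iterates the cursor and counts documents
  return { fast: fast, scanned: scanned, consistent: fast === scanned };
}

// Only runs inside the mongo shell, where `db` and `printjson` are defined.
if (typeof db !== "undefined" && typeof printjson === "function") {
  printjson(checkCount(db.abc));
}
```

      With the values reported above, the helper would return fast 77248, scanned 145350 and consistent false until the count is corrected.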
      

      When mongod is restarted after a hard crash (and a successful recovery), the values returned by 'db.stats().objects', 'db.<coll>.stats().count' and 'db.<coll>.count()' are incorrect.
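
      A minimal sketch for collecting the three affected values in one place, so they can be compared before and after a crash. readCounts is a hypothetical helper name; it assumes `database` behaves like the shell's `db` object.

```javascript
// Gather the three statistics named in this report for one collection.
// `database` is assumed to behave like the mongo shell `db` object.
function readCounts(database, collName) {
  var coll = database.getCollection(collName);
  return {
    dbObjects: database.stats().objects, // object count across the whole database
    collStatsCount: coll.stats().count,  // count reported by collection stats
    collCount: coll.count()              // count() without a query
  };
}

// Only runs inside the mongo shell, where `db` and `printjson` are defined.
if (typeof db !== "undefined" && typeof printjson === "function") {
  printjson(readCounts(db, "abc"));
}
```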

      Note that this is not the count issue seen in sharded clusters: it also applies to standalone hosts and replica sets (though only when using WiredTiger).

      The count can apparently be reset to the correct value by running, for example, 'db.<coll>.validate(true)'.
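
      If the workaround holds, it could be applied to every collection in a database along these lines. This is a sketch only: revalidateAll is a hypothetical name, `database` is assumed to behave like the shell's `db` object, and a full validate scans the data and indexes, so it may be slow on large collections.

```javascript
// Run a full validate on each collection and report what validate saw.
// `database` is assumed to behave like the mongo shell `db` object.
function revalidateAll(database) {
  var results = {};
  database.getCollectionNames().forEach(function (name) {
    var res = database.getCollection(name).validate(true);
    // `valid` and `nrecords` are the fields shown in the validate output above.
    results[name] = { valid: res.valid, nrecords: res.nrecords };
  });
  return results;
}

// Only runs inside the mongo shell, where `db` and `printjson` are defined.
if (typeof db !== "undefined" && typeof printjson === "function") {
  printjson(revalidateAll(db));
}
```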

      The problem appears to involve the recovery phase when the log/journal is replayed on top of the data from the last successful checkpoint.

      Note: This is not an issue with data integrity. The data is recovered successfully; only the statistics reported by 'db.stats()' and related commands are incorrect following a hard crash/kill.

            Assignee: Max Hirschhorn (max.hirschhorn@mongodb.com)
            Reporter: Ronan Bohan (ronan.bohan@mongodb.com)
            Votes: 0
            Watchers: 6

              Created:
              Updated:
              Resolved: