Core Server / SERVER-23794

Multiple documents with same _id after 3.2 upgrade

    • Type: Bug
    • Resolution: Incomplete
    • Priority: Major - P3
    • Affects Version/s: 3.2.3
    • Component/s: WiredTiger

      It is hard to be precise, but I suspect this is connected with the upgrade from 3.0 on MMAPv1 to 3.2 on WiredTiger.

      We upgraded a replica set from 2.6 to 3.0 then 3.2, each time adding new servers, letting them replicate then removing the old servers.

      On the day that the first 3.2 server was added as a 4th (hidden) node in the 3.0 cluster to start the data sync, we appear to have encountered some data corruption.

      To be precise, we now have multiple documents in a collection with the same _id. An unbounded find on the collection shows them (showing only the records around the problem one for brevity, and anonymising the collection name):

      db['COLLECTION'].find()
      { "_id" : "2016-03-15", "percent" : 7.317073170731707 }
      { "_id" : "2016-03-16", "percent" : 7.4074074074074066 }
      { "_id" : "2016-03-17", "percent" : 6.666666666666667 }
      { "_id" : "2016-03-18", "percent" : 6.944444444444445 }
      { "_id" : "2016-03-18", "percent" : 7.792207792207792 }
      { "_id" : "2016-03-19", "percent" : 7.6923076923076925 }
      { "_id" : "2016-03-20", "percent" : 7.6923076923076925 }
      { "_id" : "2016-03-21", "percent" : 6.756756756756757 }
      { "_id" : "2016-03-22", "percent" : 6.944444444444445 }
      { "_id" : "2016-03-23", "percent" : 7.142857142857142 }
      

      Note that "_id" : "2016-03-18" is there twice.
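
Spotting a repeated _id by eye only works for short listings. The following is a minimal sketch, in plain JavaScript, of how the documents returned by an unbounded find could be checked for repeated _id values. The helper name findDuplicateIds and the sample array are illustrative assumptions, not part of the mongo shell or of the original report; the sample reuses four of the rows shown above.

```javascript
// Hypothetical helper (not a mongo shell built-in): counts how often each
// _id occurs in an array of documents and returns the ids seen more than once.
function findDuplicateIds(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // JSON.stringify gives a usable map key for simple ids such as the
    // date strings in this report; other _id types may need a different key.
    const key = JSON.stringify(doc._id);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const duplicates = [];
  for (const [key, count] of counts) {
    if (count > 1) {
      duplicates.push({ _id: JSON.parse(key), count: count });
    }
  }
  return duplicates;
}

// Sample data copied from the find() output in the report.
const sample = [
  { _id: "2016-03-17", percent: 6.666666666666667 },
  { _id: "2016-03-18", percent: 6.944444444444445 },
  { _id: "2016-03-18", percent: 7.792207792207792 },
  { _id: "2016-03-19", percent: 7.6923076923076925 },
];

const duplicateIds = findDuplicateIds(sample);
console.log(JSON.stringify(duplicateIds));
// prints [{"_id":"2016-03-18","count":2}]
```

Against a live collection, the array could presumably come from db['COLLECTION'].find().toArray(), assuming the collection is small enough to hold in memory.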

      If I try to query directly for this record, only one document is returned:

      db['COLLECTION'].find({ "_id" : "2016-03-18" })
      { "_id" : "2016-03-18", "percent" : 7.792207792207792 }
      
      db['COLLECTION'].find({ "_id" : {$gt: "2016-03-17", $lt: "2016-03-19"} })
      { "_id" : "2016-03-18", "percent" : 7.792207792207792 }
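
The mismatch described here, where a full scan returns two documents for an _id but a direct query returns one, can be checked for every _id at once. The sketch below is plain JavaScript under stated assumptions: findLookupMismatches is a hypothetical helper, and lookupById is an assumed callback standing in for a direct query such as db['COLLECTION'].find({ _id: id }).toArray(). The stub used here simply mimics the behaviour reported above.

```javascript
// Hypothetical helper: compares how many documents a full scan returned for
// each _id with how many a direct lookup returns for that same _id.
function findLookupMismatches(scannedDocs, lookupById) {
  const scanned = new Map();
  for (const doc of scannedDocs) {
    // Simple string ids are assumed, as in the report.
    const key = JSON.stringify(doc._id);
    if (!scanned.has(key)) {
      scanned.set(key, { id: doc._id, count: 0 });
    }
    scanned.get(key).count += 1;
  }
  const mismatches = [];
  for (const entry of scanned.values()) {
    const found = lookupById(entry.id).length;
    if (found !== entry.count) {
      mismatches.push({ _id: entry.id, scanned: entry.count, lookedUp: found });
    }
  }
  return mismatches;
}

// Documents as returned by the unbounded find() in the report.
const scanSample = [
  { _id: "2016-03-17", percent: 6.666666666666667 },
  { _id: "2016-03-18", percent: 6.944444444444445 },
  { _id: "2016-03-18", percent: 7.792207792207792 },
  { _id: "2016-03-19", percent: 7.6923076923076925 },
];

// Stub lookup that mimics the reported behaviour: a direct query by _id
// returns a single document (the last one seen for that id).
function stubLookupById(id) {
  const matches = scanSample.filter((doc) => doc._id === id);
  return matches.slice(-1);
}

const mismatches = findLookupMismatches(scanSample, stubLookupById);
console.log(JSON.stringify(mismatches));
// prints [{"_id":"2016-03-18","scanned":2,"lookedUp":1}]
```

Any _id reported this way is one where the scan and the direct lookup disagree, which is the symptom shown in the two queries above.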
      

      Would a copy of the WiredTiger datafiles for this collection and its indexes help with analysing this issue?

            Assignee: Eric Milkie (milkie@mongodb.com)
            Reporter: Greg Murphy (gregmurphy)
            Votes: 1
            Watchers: 22
