Core Server / SERVER-63667

.find() returning multiple instances of the same document

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Minor - P4
    • None
    • Affects Version/s: 5.0.5
    • Component/s: None
    • Labels:
    • Environment:
      archlinux , 128 GB RAM server
    • Storage Execution

      MongoDB appears to return duplicate documents for a single query, i.e. the cursor returns more documents than there are unique _id values among them:
      lobby-brain> count_iterated = 0; ids = {}
      {}
      lobby-brain> db.the_collection.find({
        'a_boolean_key': true
      }).forEach((el) => {
        count_iterated += 1;
        ids[el._id] = (ids[el._id]||0) + 1;
      })
      lobby-brain> count_iterated
      278
      lobby-brain> Object.keys(ids).length
      251
      That is, the cursor returned 278 documents, but they contain only 251 unique _id values.
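      The same tally can be written as a small reusable function, which also lists the _id values that came back more than once. This is only a sketch: tallyIds and the sample array are hypothetical names and data, not part of mongosh or of the original report. It assumes it is handed an array of documents, for example the result of calling .toArray() on the cursor.

```javascript
// Hypothetical helper (not part of mongosh): tallies how often each _id
// appears in an array of documents.
function tallyIds(docs) {
  const counts = new Map();
  let iterated = 0;
  for (const doc of docs) {
    iterated += 1;
    const key = String(doc._id); // same string key the original tally uses
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Collect every _id that was seen more than once
  const repeated = [...counts].filter(([, n]) => n > 1).map(([id]) => id);
  return { iterated, unique: counts.size, repeated };
}

// Made-up sample data standing in for the cursor output
const sample = [{ _id: "a" }, { _id: "b" }, { _id: "a" }];
const result = tallyIds(sample);
console.log(result);
```

      If iterated is larger than unique, the entries in repeated are the documents the cursor handed out more than once.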

      Investigating further:
      lobby-brain> ids
      {
      '60cb8cb92c909a974a96a430': 1,
      '61114dea1a13c86146729f21': 1,
      '6111513a1a13c861467d3dcf': 1,
      ...
      '61114c491a13c861466d39cf': 2,
      '61114bcc1a13c861466b9f8e': 2,
      ...
      }
      lobby-brain> db.the_collection.find({
        _id: ObjectId("61114c491a13c861466d39cf")
      }).forEach((el) => print("foo"));
      foo


      That is, the collection does not actually contain duplicate documents with the same _id; the duplicates only show up in the results of the .find() on 'a_boolean_key'.

      I tried restarting the database, and rebuilding an index involving 'a_boolean_key', with the same results.

      I've never seen this before and this seems impossible... 

      Version info:
      Using MongoDB: 5.0.5
      Using Mongosh: 1.0.4

      It is a stand-alone database, no replica set or sharding or anything like that.

      Further Info

      One thing to note: there is a compound index with a_boolean_key as its first key and a datetime field as its second. The boolean key is rarely updated (~once/day), but the datetime field is updated frequently.

      Maybe these updates are causing the duplicate return values?
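
      A toy model shows how that could happen. If a scan walks the compound index in key order and a concurrent update raises the datetime of a document that was already returned, the document's index entry moves ahead of the scan position and is reached a second time. The sketch below is not MongoDB code: scanIndex, the entry layout, and the sample values are all made up for illustration, and it assumes the scan resumes strictly after the last key it handed out.

```javascript
// Toy model of a scan over an index on (a_boolean_key, datetime).
// scanIndex and the { id, flag, when } layout are hypothetical.
function scanIndex(index, onStep) {
  const returned = [];
  let last = -Infinity; // datetime of the last entry handed out
  for (;;) {
    // Resume at the first matching entry strictly after the last position
    const next = index
      .filter((e) => e.flag === true && e.when > last)
      .sort((a, b) => a.when - b.when)[0];
    if (!next) break;
    returned.push(next.id);
    last = next.when;
    onStep(returned.length, index); // concurrent writes happen here
  }
  return returned;
}

const index = [
  { id: "A", flag: true, when: 10 },
  { id: "B", flag: true, when: 20 },
  { id: "C", flag: true, when: 30 },
];

const seen = scanIndex(index, (step, idx) => {
  // After the first document is returned, a concurrent update bumps its
  // datetime, moving its index entry ahead of the scan position.
  if (step === 1) idx.find((e) => e.id === "A").when = 40;
});
console.log(seen);
```

      In this model document A is returned twice even though it exists only once, which matches the symptom above: more documents returned than unique _id values. Whether the server behaves this way for this particular query is not established by the model.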

            Assignee:
            backlog-server-execution [DO NOT USE] Backlog - Storage Execution Team
            Reporter:
            csaftoiu@gmail.com Claudiu Saftoiu
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: