Clustered collections do not correctly filter out results with $lt on _id

XMLWordPrintableJSON

    • Query Optimization
    • Fully Compatible
    • ALL
    • v8.3, v8.2, v8.0
    • 200
    • None
    • None
    • None
    • None
    • None
    • None
    • None

      Issue Status as of June 17th, 2026

      SUMMARY
      On MongoDB Server v8.0.0+, queries against clustered collections can return incorrect results when they are executed as backward (descending) bounded clustered scans over the clustered key and the predicate on _id uses mismatched inclusive/exclusive bounds.

      ISSUE DESCRIPTION AND IMPACT
      The issue occurs exactly for predicates on the _id field of the form {$lt: X}, {$gt: X}, {$gte: A, $lt: B}, {$gt: A, $lte: B}. combined with sort({_id: -1}).

      In these cases, queries such as find({_id: {$lt: 3}).sort({_id: -1}) or find({_id: {$gt: 1, $lte: 3}).sort({_id: -1}) and their aggregation equivalents ($match + $sort), can behave as though $lt were $lte and $gt were $gte.

      For example,

      const db = db.getSiblingDB("test");
      db.dropDatabase();
      db.createCollection("coll", {clusteredIndex: {key: {_id: 1}, unique: true}});
      
      db.coll.insertMany([{_id: 1}, {_id: 2}, {_id: 3}, {_id: 4}, {_id: 5}]);
      
      
      printjson(db.coll.find({_id: {$lt: 3}}).sort({_id: -1}).toArray());
      
      

      Expected output:

      [{_id: 1}, {_id: 2}]
      
      

      Actual (incorrect) output:

      [{_id: 1}, {_id: 2}, {_id: 3}]
      
      

      This issue can only manifest if all of the following are true:

      The collection being queried is a clustered collection
      The query on _id uses either:
      a single comparison operator that is either inclusive or exclusive
      e.g. {_id: {$lt: A}} or {_id: {$gte: B}}
      multiple comparison operators of different types (one inclusive and one exclusive)
      e.g. {_id: {$gte: A, $lt: B}} or {_id: {$gt: A, $lte: B}}
      The query applies a descending sort on _id, e.g. .sort({_id: -1}) or aggregation $sort stage with {_id: -1}

      When these conditions are satisfied, the query may return incorrect results, where the boundary document at either end of the descending result range is wrongly included or excluded.

      DIAGNOSIS AND AFFECTED VERSIONS
      Users running queries as described above on or prior to MongoDB 8.0.23, 8.2.9, 8.3.2 may have been affected.
      To confirm if the issue currently impacts a particular query, users can validate that their query matches all the criteria described above.

      REMEDIATION AND WORKAROUNDS
      Affected customers may rewrite vulnerable queries to use a forward (ascending) scan and then reorder results on the client side. This workaround has significant tradeoffs:

      Pagination correctness
      It does not preserve offset-based pagination semantics by itself. Rewriting find(<vulnerable query>).sort({_id: -1}).skip(5).limit(5) to find(<vulnerable query>).sort({_id: 1}).skip(5).limit(5) and reversing client-side will return the first 5 documents in reverse order rather than the expected last 5 documents.

      A countDocuments(<filter>)-based strategy may reduce this problem by first calling countDocument(<filter>) and computing an offset from the end but that introduces an extra race condition window where collection modifications may result in skips or repeats of documents.

      Performance
      There is no server-side impact on performance of flipping the sort order since it is a clustered index scan but having to do the descending sort on the client could result in increased client memory usage since each result set needs to be fully materialized before doing a descending sort. This need to materialize full result sets also effectively disables streaming patterns.

      Remediation
      This is strictly a query correctness issue and thus, there is no remediation necessary since there is no direct persistent impact on the user’s data.

      However, if the user’s application used the query results and then performed a write, then there may have been write loss due to the missing results. There is no remediation possible in this case.

      —-----------------------------------------------------

      Original description

      In some cases, clustered collections incorrectly return results that should be filtered out by the $lt operator. Specifically, this query:

      db.clustered_repro.find({ _id: { $lt: 3 } }).sort({ _id: -1 })
      

      May return a document with { _id: 3 }.

      The same happens when using an aggregation pipeline. Removing the sort condition, using ascending sort, or changing to $lte, resolves the issue. But I could find no workaround without changing the query or using a plain collection.

      NodeJS script reproducing the issue:

      import * as mongo from 'mongodb';
      
      /**
       * Run with:
       *   export MONGO_TEST_URL='mongodb://localhost:27017/clustered_test'
       *   node script.mjs
       */
      const MONGO_URL = process.env.MONGO_TEST_URL ?? 'mongodb://localhost:27017/clustered_test';
      
      const client = new mongo.MongoClient(MONGO_URL);
      
      const db = client.db();
      const clusteredName = `clustered_repro`;
      const plainName = `plain_repro`;
      
      await client.connect();
      // Ensure the client is closed when the script finishes
      await using _ = { [Symbol.asyncDispose]: async () => await client.close() };
      
      const buildInfo = await db.command({ buildInfo: 1 });
      console.log(`MongoDB version: ${buildInfo.version}`);
      console.log(`Database: ${MONGO_URL}`);
      
      // Make sure the collections don't exist before running the test
      await db
        .collection(clusteredName)
        .drop()
        .catch(() => {});
      await db
        .collection(plainName)
        .drop()
        .catch(() => {});
      
      await db.createCollection(clusteredName, {
        clusteredIndex: { name: '_id', unique: true, key: { _id: 1 } }
      });
      await db.createCollection(plainName);
      
      const clustered = db.collection(clusteredName);
      const plain = db.collection(plainName);
      
      const docs = [{ _id: 1 }, { _id: 2 }, { _id: 3 }, { _id: 4 }];
      
      await clustered.insertMany(docs);
      await plain.insertMany(docs);
      
      const plainResults1 = await plain
        .find(
          {
            _id: {
              $lt: 3
            }
          },
          { sort: { _id: -1 } }
        )
        .toArray();
      
      const plainResults2 = await plain
        .aggregate([
          {
            $match: {
              _id: {
                $lt: 3
              }
            }
          },
          { $sort: { _id: -1 } }
        ])
        .toArray();
      
      // This one fails - includes { _id: 3 }
      const clusteredResults1 = await clustered
        .find(
          {
            _id: {
              $lt: 3
            }
          },
          { sort: { _id: -1 } }
        )
        .toArray();
      
      // This one also fails - includes { _id: 3 }
      const clusteredResults2 = await clustered
        .aggregate([
          {
            $match: {
              _id: {
                $lt: 3
              }
            }
          },
          { $sort: { _id: -1 } }
        ])
        .toArray();
      
      const clusteredResults3 = await clustered
        .aggregate([
          {
            $match: {
              _id: {
                $lt: 3
              }
            }
          },
          { $sort: { _id: 1 } }
        ])
        .toArray();
      
      const clusteredResults4 = await clustered
        .find(
          {
            _id: {
              $lte: 2
            }
          },
          { sort: { _id: -1 } }
        )
        .toArray();
      
      const clusteredResults5 = await clustered
        .aggregate([
          {
            $match: {
              _id: {
                $lte: 2
              }
            }
          },
          { $sort: { _id: -1 } }
        ])
        .toArray();
      
      console.log('These should all filter out { _id: 3 }');
      
      console.log('Plain 1:', plainResults1, '# find');
      console.log('Plain 2:', plainResults2, '# aggregate');
      console.log('Clustered 1:', clusteredResults1, '# find');
      console.log('Clustered 2:', clusteredResults2, '# aggregate, sort _id: -1');
      console.log('Clustered 3:', clusteredResults3, '# aggregate, sort _id: 1');
      console.log('Clustered 4:', clusteredResults4, '# find, sort _id: -1, $lte 2');
      console.log('Clustered 5:', clusteredResults5, '# aggregate, sort _id: -1, $lte 2');
      
      

       
      Tested on 8.0.19 and 8.2.6-rc0.

      Results:

      MongoDB version: 8.2.6-rc0
      Database: mongodb://localhost:27017/clustered_test
      These should all filter out { _id: 3 }
      Plain 1: [ { _id: 2 }, { _id: 1 } ] # find
      Plain 2: [ { _id: 2 }, { _id: 1 } ] # aggregate
      Clustered 1: [ { _id: 3 }, { _id: 2 }, { _id: 1 } ] # find
      Clustered 2: [ { _id: 3 }, { _id: 2 }, { _id: 1 } ] # aggregate, sort _id: -1
      Clustered 3: [ { _id: 1 }, { _id: 2 } ] # aggregate, sort _id: 1
      Clustered 4: [ { _id: 2 }, { _id: 1 } ] # find, sort _id: -1, $lte 2
      Clustered 5: [ { _id: 2 }, { _id: 1 } ] # aggregate, sort _id: -1, $lte 2
      

       
       

            Assignee:
            Naafiyan Ahmed
            Reporter:
            Ralf Kistner (EXT)
            Votes:
            0 Vote for this issue
            Watchers:
            11 Start watching this issue

              Created:
              Updated:
              Resolved: