Validate _id field $type queries to reject array, regex, and undefined types

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • ALL

        1. Background & Impact

      Users sometimes mistakenly query the `_id` field for a type it can never hold (array, regex, undefined), for example:
      db.collection.find({_id: {$type: 'array'}})
      db.collection.find({_id: {$type: 4}})

      *This can cause serious performance issues:*

      • The query planner cannot use the `_id` index effectively because it cannot determine valid index bounds for types that can never exist
      • This may result in a *collection scan (COLLSCAN)* instead of an index scan (IXSCAN)
      • On large collections, this leads to *significant performance degradation* and increased resource consumption

      Since the `_id` field can never contain Array, RegEx, or Undefined values (enforced at write time by `storage_validation.cpp`), such queries are meaningless: they always return empty results while potentially triggering a full collection scan.

        1. Solution

      Add parse-time validation to reject `$type` queries on the `_id` field when the requested type is invalid. This:

      • Fails fast with a clear error message: `"The '_id' field cannot be queried by type <typename>"`
      • Prevents users from accidentally running expensive queries
      • Maintains consistency with write-time validation in `storage_validation.cpp::storageValidIdField()`

        1. Changes

      • Added validation in `expression_parser.cpp::parseType()` to reject invalid types (Array, RegEx, Undefined) for `_id` field
      • Returns `InvalidIdField` error code
      • Added comprehensive unit tests in `expression_parser_test.cpp`
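
      The check described above could be sketched roughly as follows. This is a standalone illustration, not the actual patch: `TypeSpec`, `invalidIdTypeName`, and `validateIdTypeQuery` are hypothetical names, the real `parseType()` works on BSON elements rather than these simplified types, and the sketch returns a message string instead of an `InvalidIdField` status. It assumes C++17 and uses the BSON type codes 4 (array), 6 (undefined), and 11 (regex).

```cpp
#include <optional>
#include <string>
#include <variant>
#include <vector>

// A $type operand is either a numeric BSON type code or a string alias.
// (Hypothetical simplification; the server parses these from BSON.)
using TypeSpec = std::variant<int, std::string>;

// Returns the type name if `spec` names a type that `_id` can never hold,
// otherwise std::nullopt. BSON codes: 4 = array, 6 = undefined, 11 = regex.
std::optional<std::string> invalidIdTypeName(const TypeSpec& spec) {
    if (const int* code = std::get_if<int>(&spec)) {
        switch (*code) {
            case 4:
                return std::string("array");
            case 6:
                return std::string("undefined");
            case 11:
                return std::string("regex");
            default:
                return std::nullopt;
        }
    }
    const std::string& alias = std::get<std::string>(spec);
    if (alias == "array" || alias == "undefined" || alias == "regex") {
        return alias;
    }
    return std::nullopt;
}

// Hypothetical stand-in for the check added to parseType(). Returns an error
// message when the query should be rejected, or std::nullopt when it is allowed.
std::optional<std::string> validateIdTypeQuery(const std::string& path,
                                               const std::vector<TypeSpec>& types) {
    // Only the top-level `_id` path is restricted; `_id.a` and `a._id` pass.
    if (path != "_id") {
        return std::nullopt;
    }
    // A single invalid entry rejects the whole query, even in a mixed list.
    for (const TypeSpec& spec : types) {
        if (auto name = invalidIdTypeName(spec)) {
            return "The '_id' field cannot be queried by type " + *name;
        }
    }
    return std::nullopt;
}
```

      In this sketch, `validateIdTypeQuery("_id", {TypeSpec{4}})` yields an error message, while the same type list on `"a"`, `"_id.a"`, or `"a._id"` yields `std::nullopt`. Comparing the path against the exact string `"_id"` is one simple way to keep nested and non-top-level paths unaffected; the actual patch may do this differently.
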
        1. Test Coverage

      *Error cases:*

      • `{_id: {$type: 'array'}}` / `{_id: {$type: 4}}`
      • `{_id: {$type: 'regex'}}` / `{_id: {$type: 11}}`
      • `{_id: {$type: 'undefined'}}` / `{_id: {$type: 6}}`
      • `{_id: {$type: ['array', 'objectId']}}` (mixed valid/invalid)

      *Success cases:*

      • `{_id: {$type: 'string'}}` / `{_id: {$type: 'objectId'}}` / `{_id: {$type: 'number'}}`
      • `{a: {$type: 'array'}}` (non-_id field)
      • `{'_id.a': {$type: 'array'}}` (nested path within _id)
      • `{'a._id': {$type: 'array'}}` (not top-level _id)

            Assignee:
            Unassigned
            Reporter:
            yunfa liu
            Votes:
            0
            Watchers:
            1

              Created:
              Updated: