Refactor MongoProcessInterface::getCollectionOptions to use listCollections internally instead of hand-crafting BSON

    • Catalog and Routing
    • 🟥 DDL

      Problem

      MongoProcessInterface::getCollectionOptions has divergent implementations that can silently produce inconsistent output between replica set and sharded cluster deployments.

      On standalone/replica set (CommonMongodProcessInterface):
      getCollectionOptions delegates to getCollectionOptionsLocally, which reads directly from the local catalog via acquireCollectionOrViewMaybeLockFree and then manually crafts a BSON object to mimic the format of listCollections output. For views it hand-builds viewOn, pipeline, and collation fields. For regular collections it calls CollectionOptions::toBSON().

      On shard servers (ShardServerProcessInterface):
      getCollectionOptions delegates to _getCollectionOptions, which runs a real listCollections command on the cluster and extracts the "options" field from the response.

      This means:

      • The local path (getCollectionOptionsLocally) implicitly assumes its hand-crafted BSON matches the format that listCollections would return. If the listCollections output format ever changes, the two paths can silently diverge.
      • The local path has special-case logic for views and timeseries that must be kept in sync with the listCollections command implementation — a maintenance burden with no compile-time or test-time safety net.
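
      The divergence risk described above can be illustrated with a small, self-contained sketch. Everything in it is a hypothetical stand-in: OptionsDoc, ViewEntry, formatLikeListCollections, handCrafted, and the made-up "newField" key are not MongoDB server code. It only shows how a second, hand-maintained copy of a format can fall behind the authoritative one, and what a parity check could look like.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a BSON options document; not a MongoDB type.
using OptionsDoc = std::map<std::string, std::string>;

// Hypothetical catalog entry for a view.
struct ViewEntry {
    std::string viewOn;
    std::string pipeline;
    std::string collation;
};

// Plays the role of the listCollections formatter (the authoritative format).
// Assumption for illustration: a later change adds a made-up "newField" key.
OptionsDoc formatLikeListCollections(const ViewEntry& v, bool formatChanged) {
    OptionsDoc doc{{"viewOn", v.viewOn}, {"pipeline", v.pipeline}, {"collation", v.collation}};
    if (formatChanged) {
        doc["newField"] = "x";
    }
    return doc;
}

// Plays the role of getCollectionOptionsLocally: a second copy of the format
// that nobody remembered to update.
OptionsDoc handCrafted(const ViewEntry& v) {
    return {{"viewOn", v.viewOn}, {"pipeline", v.pipeline}, {"collation", v.collation}};
}

// Returns the keys present in `expected` but absent from `actual`.
std::vector<std::string> missingKeys(const OptionsDoc& expected, const OptionsDoc& actual) {
    std::vector<std::string> out;
    for (const auto& kv : expected) {
        if (actual.count(kv.first) == 0) {
            out.push_back(kv.first);
        }
    }
    return out;
}

// A test-time parity check: aborts (in debug builds) if the copies diverge.
void assertParity(const OptionsDoc& expected, const OptionsDoc& actual) {
    assert(missingKeys(expected, actual).empty());
}
```

      With formatChanged set to false the two documents have the same keys. With it set to true, missingKeys reports "newField", which is the kind of drift that currently goes unnoticed.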

      There are only 2 external call sites, both in out_stage.cpp (lines 180 and 426).

      Proposal

      • Step 1: Introduce getCollectionInfoLocally, a new method on CommonMongodProcessInterface that uses DBDirectClient::getCollectionInfos to run listCollections locally and returns a ListCollectionsReplyItem. This is the same pattern already used by getCollectionInfoFromPrimary (which also uses DBDirectClient), but without the primary-only read preference. It guarantees format parity because both the local path and the primary path go through the real listCollections code.
      • Step 2: Rewrite getCollectionOptions on top of getCollectionInfo. In CommonMongodProcessInterface, getCollectionOptions calls getCollectionInfoLocally and extracts the options. In ShardServerProcessInterface, it calls getCollectionInfoFromPrimary and extracts the options. This eliminates the hand-crafted BSON.
      • Step 3 (optional follow-up): Remove getCollectionOptions entirely. Migrate the 2 call sites in out_stage.cpp to use getCollectionInfo directly and delete the virtual method, simplifying the MongoProcessInterface API surface.

      After these changes, getCollectionOptionsLocally can be deleted — the single source of truth for collection info format becomes the listCollections command path.
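
      The shape of the proposal can be sketched with simplified stand-ins. None of the types below are the real server classes: Options stands in for a BSON options document, CollectionInfo for ListCollectionsReplyItem, FakeListCollections for the listCollections command path, and LocalInterface / ShardInterface for CommonMongodProcessInterface / ShardServerProcessInterface. Returning empty options for a missing collection is an assumption made for the sketch, not something stated in this ticket.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a BSON "options" document.
using Options = std::map<std::string, std::string>;

// Hypothetical stand-in for ListCollectionsReplyItem.
struct CollectionInfo {
    std::string name;
    std::string type;  // e.g. "collection", "view", "timeseries"
    Options options;
};

// Stand-in for the listCollections command: the only place that decides
// how a catalog entry is formatted.
class FakeListCollections {
public:
    void add(CollectionInfo info) {
        assert(!info.name.empty());
        std::string key = info.name;
        _entries.insert_or_assign(std::move(key), std::move(info));
    }

    std::optional<CollectionInfo> run(const std::string& name) const {
        auto it = _entries.find(name);
        if (it == _entries.end()) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    std::map<std::string, CollectionInfo> _entries;
};

class ProcessInterface {
public:
    virtual ~ProcessInterface() = default;

    // Each deployment decides only where listCollections runs.
    virtual std::optional<CollectionInfo> getCollectionInfo(const std::string& name) const = 0;

    // Step 2: options are always extracted from the listCollections reply.
    // Assumption: a missing collection yields empty options.
    Options getCollectionOptions(const std::string& name) const {
        auto info = getCollectionInfo(name);
        return info ? info->options : Options{};
    }
};

// Plays the role of CommonMongodProcessInterface (getCollectionInfoLocally).
class LocalInterface : public ProcessInterface {
public:
    explicit LocalInterface(const FakeListCollections& cmd) : _cmd(cmd) {}
    std::optional<CollectionInfo> getCollectionInfo(const std::string& name) const override {
        return _cmd.run(name);
    }

private:
    const FakeListCollections& _cmd;
};

// Plays the role of ShardServerProcessInterface (getCollectionInfoFromPrimary).
class ShardInterface : public ProcessInterface {
public:
    explicit ShardInterface(const FakeListCollections& cmd) : _cmd(cmd) {}
    std::optional<CollectionInfo> getCollectionInfo(const std::string& name) const override {
        return _cmd.run(name);
    }

private:
    const FakeListCollections& _cmd;
};
```

      Because getCollectionOptions is no longer overridden per deployment in this sketch, the two interfaces cannot format options differently; any change to the reply format is made once, in the command path. Step 3 would then amount to having callers use getCollectionInfo(name) and read the options field themselves.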

            Assignee:
            Unassigned
            Reporter:
            Tommaso Tocci