  1. Core Server
  2. SERVER-80203

Normalization of time-series meta field can break insert targeting

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 7.1.1, 7.2.0-rc0, 5.0.22, 7.0.3, 6.0.12
    • Affects Version/s: None
    • Component/s: None
    • Labels:
      None
    • Storage Execution NAMER
    • Fully Compatible
    • ALL
    • v7.1, v7.0, v6.0, v5.0
    • Execution NAMR Team 2023-09-18, Execution NAMR Team 2023-10-02

      Issue Summary
      This is an operation routing issue in sharded clusters with time series collections that can result in metadata inconsistencies. Documents affected by this issue may be written to the wrong shard, where they may not be returned by queries and may be subject to later deletion.

      This affects sharded time series collections starting in MongoDB version 5.0.6, through versions 5.0.21, 6.0.11, and 7.0.2, and Rapid Release version 7.1.1.

      Issue Description and Impact
      Documents inserted into a sharded time series collection may be routed to an incorrect shard and become unowned by any shard if all of the following are true:

      • The document's time series metaField contains an embedded document/object composed of multiple fields and the shard key of the collection includes that object. Examples include:
        • A metaField value of { "a" : 1, "b" : 1 } when the shard key is the metaField.
        • A metaField value of { "a" : 1, "b" : {"c": 1, "d": 1} } when the shard key includes metaField.b.
      • At insert time, the fields in the embedded document or object are not provided in alphabetic (lexicographic) order. Importantly, application-supplied key order within documents is not guaranteed by all drivers.
      • The same shard does not own both of the chunks that contain:
        • The alphabetically (lexicographically) ordered version of the embedded document.
        • The provided version of the embedded document.
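
As a rough illustration of the second condition, the sketch below checks whether every embedded object in a metaField value already has its keys in ascending order. The helper names `isPlainObject` and `hasSortedKeys` are hypothetical and not part of any MongoDB API. The sketch assumes plain JavaScript string comparison is close enough to the server's ordering for ASCII field names, and it does not look inside arrays.

```javascript
// Hypothetical helper: true only for plain { ... } objects, not arrays or dates.
function isPlainObject(value) {
  return (
    value !== null &&
    typeof value === "object" &&
    Object.getPrototypeOf(value) === Object.prototype
  );
}

// Hypothetical helper: returns true when every embedded object in `value`
// already lists its keys in ascending order. Scalars and arrays are treated
// as leaves in this sketch and always count as sorted.
function hasSortedKeys(value) {
  if (!isPlainObject(value)) {
    return true;
  }
  const keys = Object.keys(value);
  for (let i = 1; i < keys.length; i++) {
    if (keys[i - 1] > keys[i]) {
      return false;
    }
  }
  return keys.every((key) => hasSortedKeys(value[key]));
}

// A metaField value that satisfies the "not in alphabetic order" condition.
console.log(hasSortedKeys({ b: 1, a: 1 }));
// A metaField value that is already ordered and so is not at risk.
console.log(hasSortedKeys({ a: 1, b: { c: 1, d: 1 } }));
```

Note that JavaScript engines reorder integer-like keys such as "1" or "2" on their own, so a check like this is only indicative for such field names.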

      This occurs because:

      • mongos routers incorrectly route documents to shards using the provided metaField value. For example, { "b" : 1, "a" : 1 } is routed to the shard that owns the chunk range for { "b" : 1, "a" : 1 }.
      • At insert time, mongod nodes normalize to alphabetic/lexicographic order the metaField values that are embedded documents. For example, { "b" : 1, "a" : 1 } becomes { "a" : 1, "b" : 1 }.
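
The mismatch between the two steps can be sketched as follows. This is a minimal illustration, not the server's implementation: `normalizeMeta` is a hypothetical function that sorts the keys of embedded objects recursively, and it leaves arrays and non-plain values untouched as a simplifying assumption.

```javascript
// Hypothetical sketch of key-order normalization for a metaField value.
// Plain objects get their keys sorted recursively; everything else
// (scalars, arrays, dates) is returned unchanged in this sketch.
function normalizeMeta(value) {
  const isPlain =
    value !== null &&
    typeof value === "object" &&
    Object.getPrototypeOf(value) === Object.prototype;
  if (!isPlain) {
    return value;
  }
  const normalized = {};
  for (const key of Object.keys(value).sort()) {
    normalized[key] = normalizeMeta(value[key]);
  }
  return normalized;
}

// The value the router would use for targeting (as provided by the application).
const provided = { b: 1, a: 1 };
// The value that would be persisted after normalization.
const stored = normalizeMeta(provided);

console.log(JSON.stringify(provided)); // {"b":1,"a":1}
console.log(JSON.stringify(stored));   // {"a":1,"b":1}
```

Because routing uses `provided` while storage uses `stored`, the two values can map to different chunk ranges.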

      When the shard that receives a vulnerable document does not own the chunk range for the normalized form of the shard key / metaField value, the document is orphaned and effectively lost. For example, a shard that owns the chunk range from { "b" : 1 } to { "$maxKey" : 1 } could receive the document with metaField { "b" : 1, "a" : 1 } even though the document is persisted with the metaField { "a" : 1, "b" : 1 }.
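
The chunk range example above can be checked with a simplified comparison. The sketch below is an assumption-laden approximation, not the server's comparison logic: `compareDocs` is a hypothetical comparator that handles only flat documents with numeric values, comparing field by field in order (field name first, then value), with the shorter document sorting first on a tie. `MAX_KEY` and `inChunk` are likewise illustrative names.

```javascript
// Hypothetical, simplified ordering for flat documents with numeric values.
// Fields are compared pairwise in document order: name first, then value.
function compareDocs(left, right) {
  const l = Object.entries(left);
  const r = Object.entries(right);
  const shared = Math.min(l.length, r.length);
  for (let i = 0; i < shared; i++) {
    const [leftKey, leftValue] = l[i];
    const [rightKey, rightValue] = r[i];
    if (leftKey !== rightKey) return leftKey < rightKey ? -1 : 1;
    if (leftValue !== rightValue) return leftValue < rightValue ? -1 : 1;
  }
  // All shared fields are equal: the document with fewer fields sorts first.
  return l.length - r.length;
}

// Stand-in for an upper bound that is greater than every document.
const MAX_KEY = Symbol("maxKey");

// True when min <= doc < max under the simplified ordering.
function inChunk(doc, min, max) {
  const atOrAboveMin = compareDocs(doc, min) >= 0;
  const belowMax = max === MAX_KEY || compareDocs(doc, max) < 0;
  return atOrAboveMin && belowMax;
}

const chunk = { min: { b: 1 }, max: MAX_KEY };

// Provided form: falls inside the chunk, so the document is routed here.
console.log(inChunk({ b: 1, a: 1 }, chunk.min, chunk.max)); // true
// Normalized form: falls outside the chunk, so the stored document is orphaned.
console.log(inChunk({ a: 1, b: 1 }, chunk.min, chunk.max)); // false
```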

      Note that when the correct chunk range and "incorrect" chunk range are owned by the same shard, this issue is self-healing.

      Documents orphaned in this way:

      • Will not be returned by queries issued through a mongos.
      • May be deleted under the following circumstances:
        • Orphaned documents are normalized in such a way that they fall into a chunk range with a pending chunk deletion task.
        • A chunk migration to a destination shard containing orphans in the given range is aborted, resulting in the creation of a chunk deletion task over the chunk range in which the orphaned documents exist.

      Workaround

      Upgrading to MongoDB versions 5.0.22, 6.0.12, or 7.0.3 prevents the issue from occurring, but remediation is still required. See the Remediation section below for further guidance.

      Please reach out to MongoDB Support if you are unable to upgrade to a version containing a fix for this issue.

      Remediation

      We recommend taking the following actions, in order, to identify and preserve orphaned documents for later recovery. Please review all steps carefully before proceeding.

      1. Disable the balancer using sh.stopBalancer().
      2. Upgrade the cluster to the latest maintenance release (see the Release Notes for information on the latest versions).
      3. Follow the guidance at MongoDB Support Tools - Sharded Time Series Orphan Check to identify and recover orphaned time series documents.

      Please reach out to MongoDB Support if you have any questions or issues with performing the steps above.

            Assignee:
            gregory.noma@mongodb.com Gregory Noma
            Reporter:
            arun.banala@mongodb.com Arun Banala
            Votes:
            0
            Watchers:
            29