  1. Core Server
  2. SERVER-82860

Local data access for aggregations should not keep retrying in case of StaleConfig

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: 5.0.0, 6.0.0, 7.0.0, 7.1.0
    • Component/s: None
    • Labels:
    • Catalog and Routing
    • ALL
    • CAR Team 2023-12-25, CAR Team 2024-02-05, CAR Team 2024-02-19, CAR Team 2024-03-04, CAR Team 2024-03-18, CAR Team 2024-04-01, CAR Team 2024-04-15, CAR Team 2024-04-29, CAR Team 2024-05-13
    • 3

      In all versions prior to 7.2, when an aggregation with $lookup reads user data located on the local shard, we simply run a router loop that attempts to run the aggregation locally up to 10 times, hoping at least one attempt will succeed.

      The local access triggers a check of the local filtering metadata; if the metadata are not installed yet, the collection access returns StaleConfig. Usually it is fine to retry, since this is just a transient error that requires a refresh on the shard side. However, because the access is local, the filtering metadata are not refreshed until the error is propagated back to the entry point, which then performs the refresh and obtains the filtering metadata.

      The refresh therefore only happens after all 10 attempts have failed, whereas we could simply fail on the first attempt in case of StaleConfig. In 7.2 this issue was unintentionally fixed by SERVER-74816: https://github.com/10gen/mongo/blob/ba27121ae83e40362e418f7f4b0f88ef79977765/src/mongo/db/pipeline/sharded_agg_helpers.cpp#L1822-L1862
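
      The difference between the two behaviors can be illustrated with a minimal, self-contained sketch. This is not the code in sharded_agg_helpers.cpp: StaleConfigError, runLocalWithRetries, failFastOnStaleConfig, kMaxAttempts and attemptsBeforeFailure are hypothetical names chosen for illustration, and the local read is modeled as a plain callback that throws while the filtering metadata are missing.

```cpp
#include <functional>
#include <stdexcept>

// Hypothetical stand-in for the server's StaleConfig error (not the real type).
struct StaleConfigError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Retry budget of the router loop, as described in the ticket.
constexpr int kMaxAttempts = 10;

// Runs `localRead` in a retry loop and returns the number of attempts used.
// With failFastOnStaleConfig == false (modeling the pre-7.2 behavior), every
// StaleConfig is retried until the budget is exhausted, although nothing
// refreshes the metadata between attempts. With failFastOnStaleConfig == true,
// the first StaleConfig is rethrown immediately so that the caller (the entry
// point) can perform the refresh.
int runLocalWithRetries(const std::function<void()>& localRead,
                        bool failFastOnStaleConfig) {
    for (int attempt = 1;; ++attempt) {
        try {
            localRead();
            return attempt;
        } catch (const StaleConfigError&) {
            if (failFastOnStaleConfig || attempt == kMaxAttempts) {
                throw;  // propagate to the entry point
            }
            // Otherwise loop again; the local read will hit the same error.
        }
    }
}

// Counts how many times a permanently stale local read is attempted before
// the error escapes the loop.
int attemptsBeforeFailure(bool failFastOnStaleConfig) {
    int calls = 0;
    auto alwaysStale = [&calls] {
        ++calls;
        throw StaleConfigError("filtering metadata not installed");
    };
    try {
        runLocalWithRetries(alwaysStale, failFastOnStaleConfig);
    } catch (const StaleConfigError&) {
        // Expected: the entry point would refresh the metadata here.
    }
    return calls;
}
```

      In this model, attemptsBeforeFailure(false) performs 10 local reads before the error reaches the caller, while attemptsBeforeFailure(true) performs only one.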

      The goal of this ticket is to backport that specific change to the affected versions, as far back as 5.0.

       

            Assignee:
            jordi.serra-torrens@mongodb.com Jordi Serra Torrens
            Reporter:
            enrico.golfieri@mongodb.com Enrico Golfieri
            Votes:
            0
            Watchers:
            8
