  1. Core Server
  2. SERVER-78321

MongoDB 6.0: Adding a new shard renders all preceding resume tokens invalid

    • Type: Bug
    • Resolution: Works as Designed
    • Priority: Major - P3
    • Affects Version/s: 6.0.6
    • Component/s: None
    • Labels:
    • Query Execution
    • ALL
    • QE 2023-09-04, QE 2023-09-18, QE 2023-10-02, QE 2023-10-16, QE 2023-10-30, QE 2023-11-13

      Very similar to what was observed here:

      https://jira.mongodb.org/browse/SERVER-42232

      Seeing the same issue in MongoDB 6.0.

      More info:

      We are using MongoDB 6.0.6.
      The DB has sharded collections.
      It has a few shards, and every shard is a P-S-A replica set (two data-bearing nodes and an arbiter).
      My application uses change streams to gather statistics. It constantly reads operations and persists the clusterTime timestamp of the last handled event, so that if the app crashes we can continue from the last handled point.
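The checkpointing pattern described above can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual code: `consume_with_checkpoint`, `open_stream`, `load_checkpoint`, `save_checkpoint` and `handle_event` are hypothetical names, and the stream is passed in as a plain callable so the sketch does not depend on a driver.

```python
from typing import Any, Callable, Iterable, Optional


def consume_with_checkpoint(
    open_stream: Callable[[Optional[Any]], Iterable[dict]],
    load_checkpoint: Callable[[], Optional[Any]],
    save_checkpoint: Callable[[Any], None],
    handle_event: Callable[[dict], None],
) -> int:
    """Read events starting at the saved clusterTime; checkpoint after each one.

    All four callables are hypothetical hooks supplied by the application.
    Returns the number of events handled in this run.
    """
    # None means no checkpoint exists yet, so the stream starts from "now".
    start_time = load_checkpoint()
    handled = 0
    for event in open_stream(start_time):
        handle_event(event)
        # Save only after the event is handled: a crash then replays the
        # last event on restart (at-least-once) instead of skipping it.
        save_checkpoint(event["clusterTime"])
        handled += 1
    return handled
```

With PyMongo, `open_stream` could plausibly be `lambda ts: db.watch(start_at_operation_time=ts)`, where `db` is a database handle obtained through the mongos router. Because the stream resumes at the saved timestamp itself, the last handled event may be delivered again, so `handle_event` should tolerate duplicates.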

      We maintain the storage for the shards ourselves and add a shard from time to time.
      We have noticed that when a shard is added, the following happens:

      1. We added a shard at ~9:15 AM; the shard came up and started rebalancing data with the other shards.
      2. A few hours later, at ~12:30 PM, "Resume of change stream was not possible" errors started showing up, coming from the port of the newly added shard:
        {"t":{"$date":"2023-08-02T06:32:40.283+00:00"},"s":"W",  "c":"QUERY",    "id":20478,   "ctx":"conn188","msg":"getMore command executor error","attr":{"error":{"code":286,"codeName":"ChangeStreamHistoryLost","errmsg":"Resume of change stream was not possible, as the resume point may no longer be in the oplog."},"stats":{},"cmd":{"getMore":2383202557420931747,"collection":"$cmd.aggregate","maxTimeMS":1000}}}
      3. Also, the first available event on the new shard is heavily delayed. We added the shard at ~9:30 AM, but its first available event was only at ~2:30 PM, many hours later, while on the other shards the first event is earlier than that. We are not sure how this makes sense, because this is a totally new shard.
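The log entry in step 2 is structured JSON, so the time at which a shard started reporting the error can be pulled out of the logs mechanically. The following is a small sketch under that assumption; `history_lost_time` is a hypothetical helper name, and the field names (`t.$date`, `attr.error.code`) are taken from the log line above.

```python
import json
from typing import Optional

# Error code that appears in the log line above (ChangeStreamHistoryLost).
CHANGE_STREAM_HISTORY_LOST = 286


def history_lost_time(log_line: str) -> Optional[str]:
    """Return the log timestamp if the line reports error 286, else None."""
    try:
        entry = json.loads(log_line)
    except json.JSONDecodeError:
        return None  # not a structured log line
    if not isinstance(entry, dict):
        return None
    attr = entry.get("attr")
    error = attr.get("error") if isinstance(attr, dict) else None
    if not isinstance(error, dict):
        return None
    if error.get("code") != CHANGE_STREAM_HISTORY_LOST:
        return None
    stamp = entry.get("t")
    return stamp.get("$date") if isinstance(stamp, dict) else None
```

Running this over each shard's log and taking the earliest non-None result gives the point at which resume attempts started failing on that shard.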

      In this case the client, which is connected to the mongos router, is unable to proceed unless we move the start_at_operation_time pointer to ~2:30 PM so that it can continue reading. This also makes us lose all the updates from ~12:30 PM to ~2:30 PM, which is not acceptable.

      • Why is this happening? Isn’t the change stream supposed to continue normally and add the new shard’s updates once it is ready? Failing like this does not look like normal behavior.
      • Is there a safe way to add a shard and keep reading incoming updates for the other shards through the mongos router without being blocked by this un-synced shard?
      • Is this happening because of the P-S-A configuration, and would it not occur with P-S-S? If so, why?

      This follows a community thread where the same question was raised a few days ago, but with no answer:

      https://www.mongodb.com/community/forums/t/change-stream-resume-point-lost-after-adding-a-new-shard/232005

            Assignee:
            romans.kasperovics@mongodb.com Romans Kasperovics
            Reporter:
            oraiches@zadarastorage.com Oded Raiches
            Votes:
            0
            Watchers:
            7