Change stream getMore fails with code 280 (ChangeStreamFatalError, errorLabel NonResumableChangeStreamError) on a healthy replica set: resume token references a point that is well inside the oplog

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: 8.0.18
    • Component/s: Change streams
    • Query Execution
    • ALL

      1. Summary

      A long-running change stream against an 8.0.18 Enterprise shard suddenly failed mid-stream with:
       code 280 (ChangeStreamFatalError)
       errorLabels: ["NonResumableChangeStreamError"]
       errmsg: "PlanExecutor error during aggregation :: caused by :: cannot resume stream;
                the resume token was not found. {_data: "82..."}"
      The defining facts that we would like help explaining:

      1. The first failure was a getMore command failure, not a client resumeAfter. For a getMore, the client supplies only the cursor id; the resume token that "was not found" is the one the server itself is holding/advancing for that cursor. The application never supplied or constructed it.
      2. The replica set was continuously healthy at the time: no election, no step-down, no rollback on the primary for the entire day.
      3. The failing token's clusterTime decodes to 2026-05-20 15:58:05. The oplog (200 GB; window 2026-03-05 → 2026-06-05, ~92 days) places the resume point ~76 days inside the window, and the exact oplog entry is still present today (see §4.3/§7). It has not aged off.
      4. The watched collection was not dropped/recreated: its UUID is stable across the whole day.

      We have been unable to reproduce this on a clean 8.0.18 deployment; a legitimately-issued resume token always resumes in our tests. We would like MongoDB's help determining why a server-issued change-stream resume token, referencing a point that is inside the oplog on a healthy node, becomes non-resumable.


      2. Environment

      Item                        Value
      MongoDB version             8.0.18 Enterprise (gitVersion 5d1557a9ac6f1cd358cef85fd11c0fabfcfc4e2e, rhel88, x86_64)
      Topology                    Sharded cluster, single shard; shard replica set shDMP_LPT_0 (3 voting members)
      Shard members               dmpmgovmctst11a:28168 (id 0), dmpmgovmctst91c:28168 (id 2), dmpmgovmctst71b:28168 (id 3)
      mongos                      dmpmgovmctst11a:28368
      FCV                         7.0 (note: binaries are 8.0.18 but featureCompatibilityVersion has not been raised to 8.0; featureCompatibilityVersion: { version: '7.0' })
      Affected DB / collections   FDM.FDM_HPI_cpi_patient (UUID 5d0d4ad5-992b-4e20-83af-166656096da6), FDM.FDM_HPI_cpi_patient_hospital_data (UUID 2a115cd6-354d-4cb8-a393-fab9d8e8299f); both are unsharded collections residing on the primary shard
      Client driver               mongo-java-driver (sync) 4.9.1 (log: "driver":{"name":"mongo-java-driver|sync","version":"4.9.1"})
      Connection method           Driver connects directly to the shard replica set via mongodb://...11a:28168,71b:28168,91c:28168/FDM?replicaSet=shDMP_LPT_0&authSource=admin

      Note on topology: the change streams are opened directly against the shard (the getMore/aggregate commands target local.oplog.rs on the shard mongod, with the client IP as remote). We are aware of the recommendation to open change streams via mongos; we call this out for completeness. However, the token-integrity question below is independent of that (the failing token is server-generated on a getMore), which is the point we need help with.

      Change stream configuration

      DB-level watch on FDM. Both fullDocument and fullDocumentBeforeChange were explicitly set — confirmed present in the server "Slow query" log (their values appear as ### there only because of server-side client-data log redaction; the connector specifies updateLookup / whenAvailable). The salient fact — the stream uses pre-images (fullDocumentBeforeChange) — is independently confirmed by the collMod below.
       db.watch(
         [ { $match: { "ns.coll": { $in: [ "FDM_HPI_cpi_patient", ... ] } } } ],
         {
           fullDocument: "updateLookup",
           fullDocumentBeforeChange: "whenAvailable",   // pre/post images enabled via collMod
           readConcern: { level: "majority" }
         }
       )
      changeStreamPreAndPostImages is enabled on the watched collections (via collMod).


      3. The error (verbatim)

      First failure — a getMore failure (mongod log, msg id 20478):
       2026-05-20T15:58:05.597+08:00 W QUERY id=20478 ctx=conn627986
        "getMore command executor error"
        error: { code: 280, codeName: "ChangeStreamFatalError",
                  errmsg: "cannot resume stream; the resume token was not found.
                          {_data: \"826A0D698D000000072B042C0100296E5A10045D0D4AD5992B4E2083AF166656096DA6
                                  463C6F7065726174696F6E54797065003C64656C6574650046646F63756D656E744B6579
                                  0046645F6964006469A90134B4574833104559F6000004\"}" }
        (cursor was reading local.oplog.rs; remote = application IP)
      Decoded resume token (_data):

      Field             Value
      clusterTime       Timestamp(1779263885, 7) = 2026-05-20 15:58:05 (UTC+8), ordinal 7
      collection UUID   5d0d4ad5-992b-4e20-83af-166656096da6 (= FDM.FDM_HPI_cpi_patient)
      operationType     delete
      documentKey       { _id: ... }
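      The clusterTime row above can be cross-checked from the token prefix alone. The following is a minimal sketch, not an official decoder: it assumes the _data string is a hex-encoded KeyString whose first byte (0x82) tags a Timestamp, followed by a 4-byte big-endian seconds value and a 4-byte big-endian ordinal. The function name decodeClusterTime is ours, for illustration only.

```javascript
// Sketch: decode the leading clusterTime from a resume token's _data hex string.
// Assumption: byte 0 is the type tag for Timestamp (0x82), followed by
// 4 bytes of seconds and 4 bytes of ordinal, both big-endian.
function decodeClusterTime(dataHex) {
  if (!/^82[0-9A-Fa-f]{16}/.test(dataHex)) {
    throw new Error("unexpected token prefix; expected 0x82 plus 8 timestamp bytes");
  }
  const t = parseInt(dataHex.slice(2, 10), 16);   // seconds since the Unix epoch
  const i = parseInt(dataHex.slice(10, 18), 16);  // ordinal within that second
  return { t, i, utc: new Date(t * 1000).toISOString() };
}

// First 9 bytes of the failing token quoted in §3.
const failingPrefix = "826A0D698D00000007";
const decoded = decodeClusterTime(failingPrefix);
console.log(decoded);
```

      Under those assumptions the prefix decodes to t = 1779263885, i = 7, which is 07:58:05 UTC (15:58:05 at UTC+8) on 2026-05-20 and agrees with the table.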

      Full server response (from the client-side capture):
       { "errorLabels": ["NonResumableChangeStreamError"], "ok": 0.0,
         "errmsg": "PlanExecutor error during aggregation :: caused by :: cannot resume stream; the resume token was not found. {_data: \"826A0D698D00000007...\"}",
         "code": 280, "codeName": "ChangeStreamFatalError",
         "$clusterTime": { "clusterTime": { "$timestamp": { "t": 1779266194, "i": 2 } }, ... },
         "operationTime": { "$timestamp": { "t": 1779266194, "i": 2 } } }
      The same failure recurred at 2026-05-20 18:51:36 on a different collection (MDM_HKPMI_patient_api, UUID 3b1ac50f-1922-4006-a3a9-50c841b066cc), also first observed as a getMore failure.


      4. Evidence that this is not a client/driver token-handling problem

      4.1 The failing token is server-generated (the decisive point)

      The first occurrence (and the trigger of everything that followed) is a getMore command executor error (mongod msg id 20478), not an aggregate/resumeAfter. On a getMore the client sends only the cursor id; the change stream's resume position is maintained by the server. Therefore the resume token that "was not found" is one MongoDB itself generated and was advancing for its own cursor — the application could not have corrupted it.

      (The application uses the standard driver change-stream API and persists/replays the token bytes verbatim; but for the getMore case above, no client token is involved at all.)

      4.2 No election / step-down / rollback

      The primary dmpmgovmctst11a was continuously PRIMARY for the entire day. The mongod log contains zero REPL state transitions / rollback entries (Transition to PRIMARY/SECONDARY, stepping down, Starting rollback, transition to ROLLBACK) across the whole file. So the resume point was not invalidated by a failover or rollback of un-replicated writes.

      4.3 The resume point is deep inside the oplog window — and still is today

      rs.printReplicationInfo() on the shard primary (run 2026-06-05):
       actual / configured oplog size : 200000 MB (~200 GB)
       log length start to end       : 7943945 secs (2206.65 hrs ≈ 92 days)
       oplog first event time         : 2026-03-05 12:58:51 GMT+0800
       oplog last event time         : 2026-06-05 11:37:56 GMT+0800
       now                           : 2026-06-05 11:37:57 GMT+0800

      • Failing token clusterTime: 2026-05-20 15:58:05, ~76 days inside the window (window starts 2026-03-05).
      • The window still covers 2026-05-20 as of 2026-06-05, and the exact oplog entry is still retrievable today (§7's findOne returned it).

      So the resume point was never anywhere near aging off, and oplog truncation is ruled out. (We also observed that 8.0.18 returns code 280 — not 286 ChangeStreamHistoryLost — even for points that are before the oplog start; so the 280 code by itself does not imply in-window. Here the point is deep in-window, independently confirmed by rs.printReplicationInfo().) This also means MongoDB can investigate on the live cluster right now — the failing event is still in the oplog.
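      The window arithmetic in this section can be reproduced with a few lines of JavaScript. This is a sketch using only the timestamps quoted in this ticket; the variable names are ours.

```javascript
// Sketch: check that the failing token's clusterTime lies inside the oplog window.
// All inputs are values quoted in this ticket (oplog times are GMT+0800).
const oplogFirst = Date.parse("2026-03-05T12:58:51+08:00");
const oplogLast  = Date.parse("2026-06-05T11:37:56+08:00");
const tokenTime  = 1779263885 * 1000; // Timestamp(1779263885, 7), seconds to ms

const DAY_MS = 86400 * 1000;
const windowSecs     = (oplogLast - oplogFirst) / 1000;
const daysAfterStart = (tokenTime - oplogFirst) / DAY_MS;
const daysBeforeEnd  = (oplogLast - tokenTime) / DAY_MS;
const insideWindow   = tokenTime >= oplogFirst && tokenTime <= oplogLast;

console.log({ windowSecs, daysAfterStart, daysBeforeEnd, insideWindow });
```

      The computed window length equals the 7943945 secs reported by rs.printReplicationInfo(), and the token sits a little over 76 days after the oldest entry and a little under 16 days before the newest one.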

      4.4 The collection was not dropped/recreated

      The collection UUID 5d0d4ad5 is stable across the day: collMod oplog entries at 15:58:04 and 16:30:11 both reference the same UUID. No drop/create/renameCollection for these collections appears in the mongod or mongos logs.

      4.5 Application/driver log corroboration — the application only ever replays server-issued token bytes, verbatim

      The application (driver) log shows the standard change-stream resume flow and, critically, that the token bytes the application replays are byte-for-byte identical to the token MongoDB itself produced at the original live getMore failure.

      Timeline on restart (mongo-java-driver 4.9.1, standard change-stream API):
       2026-05-20 16:20:55.033 task MDM_CPI_CASE_MODEL scheduled / restarted on this node
       2026-05-20 16:21:00.7xx "Found exists breakpoint, will decode batch/stream offset"
                                "[FDM_HPI_cpi_patient] Use existing stream offset: {...}"   (persisted resume token loaded)
       2026-05-20 16:21:02.269 SOURCE_STREAM_READ -> code 280 ChangeStreamFatalError,
                                errorLabels:["NonResumableChangeStreamError"],
                                "...the resume token was not found. {_data:"826A0D698D00000007...5D0D4AD5...delete..."}"
                                on server dmpmgovmctst11a:28168
      Byte-for-byte identity (server log ↔ application log). The three failing resume tokens appear identically in both the mongod log (where MongoDB generated/rejected them) and the application log (where the driver replayed them):

      Collection                                     token _data (prefix)                    occurrences in mongod log   occurrences in app log
      FDM_HPI_cpi_patient (5d0d4ad5)                 826A0D698D00000007…5D0D4AD5…delete…     54                          68
      FDM_HPI_cpi_patient_hospital_data (2a115cd6)   826A0D698D0000000A…2A115CD6…delete…     54                          74
      MDM_HKPMI_patient_api (3b1ac50f)               826A0D9238…3B1AC50F…delete…             433                         731

      This demonstrates the application's only role with respect to the token is to store and replay the exact bytes MongoDB delivered (as a change-event resume token / cursor position). It does not parse, rewrite, or synthesize tokens. Combined with §4.1 (the first failure was a getMore, where no client token is sent at all), the token that "was not found" is in every case a token MongoDB created — the application cannot be the source of any token corruption.

      Full disclosure: on receiving the 280 the application initially retried the operation (treating it as transient), which only prolonged the symptom (a 60-second retry loop for ~6 hours) and contributed to a connection build-up on the engine node. It did not alter the token. This retry-on-NonResumableChangeStreamError behavior has since been corrected on our side to honor the error label and fail fast. It is unrelated to why MongoDB cannot resolve its own token, which is the subject of this ticket.
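      For clarity, the corrected fail-fast behavior amounts to checking the error labels before retrying. The sketch below is illustrative only and is not the connector's actual code: shouldRetryChangeStream is a hypothetical helper, and the error object is assumed to have the shape of the server response in §3.

```javascript
// Sketch: decide whether an application-level retry is appropriate for a
// change-stream error. `err` is assumed to carry `errorLabels` as in §3.
function shouldRetryChangeStream(err) {
  const labels = Array.isArray(err.errorLabels) ? err.errorLabels : [];
  // Honor the server's label: a non-resumable error must not be retried.
  if (labels.includes("NonResumableChangeStreamError")) return false;
  // Retry only when the server explicitly marks the error as resumable.
  return labels.includes("ResumableChangeStreamError");
}

// Shape of the failure reported in this ticket.
const reported = {
  code: 280,
  codeName: "ChangeStreamFatalError",
  errorLabels: ["NonResumableChangeStreamError"],
};
console.log(shouldRetryChangeStream(reported));
```

      With this check the 280 reported here is never retried, which avoids the 60-second retry loop described above.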

      4.6 We cannot reproduce it with a legitimate token

      On a clean 8.0.18 deployment we ran extensive tests (details in §6). A legitimate, server-issued change-stream token always resumes successfully, including delete-event tokens (plain, batched-delete applyOps, and transactional applyOps). We could only produce code 280 by deliberately tampering with the token's locator fields (clusterTime / ordinal / UUID / operationType) so they no longer match any oplog event, which is exactly what we are not doing in production.


      5. Correlated server-side activity at the moment of failure

      The breaks coincide to the second with sharding-catalog refresh activity on the same collections. These messages appear only at the two failure moments (15:58:05 and 18:51:36) in the entire day's log:
       15:58:05.578 I SHARDING id=505070 ctx=RecoverRefreshThread
                    "Namespace not found, collection may have been dropped" ns=FDM.FDM_HPI_cpi_patient
       15:58:05.578 I SHARDING id=7917801 ctx=RecoverRefreshThread
                    "Marking collection as untracked" ns=FDM.FDM_HPI_cpi_patient
       15:58:05.597 W QUERY   id=20478   ctx=conn627986 getMore ... code 280   <-- 19 ms later
      (Identical pattern at 18:51:36 for MDM.MDM_HKPMI_patient_api.)

      At about the same time, another application connection issued collMod { changeStreamPreAndPostImages: { enabled: true } } on these collections (a downstream task restart re-enabling pre-images). The collection was not actually dropped (UUID unchanged); the "Namespace not found" / "Marking collection as untracked" messages appear to come from a routing/filtering-metadata refresh.

      Our working hypothesis (please confirm or correct): on a change stream that is open directly against the shard, a routing/filtering-metadata refresh that transiently marks an (unsharded) collection as untracked invalidates the cursor's resume point, surfacing as code 280 NonResumableChangeStreamError — even though the underlying oplog entry still exists and the node never changed state. We could not reproduce the untracked-while-streaming transition locally (our single-shard test cluster never produced the tracked→untracked transition for an unsharded collection), so we cannot confirm this ourselves.


      6. Reproduction attempts (all on MongoDB 8.0.18, single-shard sharded cluster)

      We tried hard to reproduce and could not trigger 280 on a healthy stream with any of:

      • flushRouterConfig; shardCollection (other & watched collection); collMod changeStreamPreAndPostImages (direct-to-shard and via mongos)
      • drop / drop+recreate / renameCollection of the watched collection while streaming
      • resuming from real delete tokens: plain deletes, batched-delete applyOps (deleteMany), transactional applyOps
      • a stress workload: long-running cursor + continuous insert/update/delete (~45k events) + pre-images + concurrent collMod burst + simulated failover resume + retry loop

      The only way we produced code 280 was by falsifying a token's clusterTime / ordinal / UUID / operationType (so it points to a non-existent event). Tampering with only the documentKey produced Location50811; malformed tokens produced Location50796; a future clusterTime resumed OK. This is what makes the production case anomalous: the production token's locator fields all look legitimate (real collection UUID, op=delete, clusterTime inside the oplog), yet resume fails with 280.


      7. oplog lookup for the failing token (confirmed)

      To confirm whether the token's exact position corresponds to a real, resumable oplog entry, we queried local.oplog.rs on the shard primary at the token's exact Timestamp(t, i) (decoded from the failing token):
       shDMP_LPT_0 [primary]> db.getSiblingDB('local').oplog.rs.findOne({ ts: Timestamp({t:1779263885, i:7}) })
       {
         lsid: { id: UUID('565c8cde-3a58-4618-8aac-cff8b1523eaa'),
                 uid: Binary.createFromBase64('7KOXdUXQWxhvJ/qb9qauqH7y1NoQDVDuOu623ZzzaZ0=', 0) },
         txnNumber: 1,
         op: 'd',
         ns: 'FDM.FDM_HPI_cpi_patient',
         ui: UUID('5d0d4ad5-992b-4e20-83af-166656096da6'),
         o: { _id: ObjectId('69a90134b4574833104559f6') },
         stmtId: 0,
         ts: Timestamp({ t: 1779263885, i: 7 }),
         t: 2774,
         v: 2,
         wall: ISODate('2026-05-20T07:58:05.582Z'),
         prevOpTime: { ts: Timestamp({ t: 0, i: 0 }), t: -1 }
       }
      The oplog entry matches the resume token field-for-field:

      Resume token field   Token value                            Oplog entry                                        Match
      clusterTime / ts     Timestamp(1779263885, 7)               ts: Timestamp(1779263885, 7)                       yes
      operationType        delete                                 op: 'd'                                            yes
      collection UUID      5d0d4ad5-992b-4e20-83af-166656096da6   ui: UUID('5d0d4ad5-…')                             yes
      documentKey _id      ObjectId('69a90134b4574833104559f6')   o: { _id: ObjectId('69a90134b4574833104559f6') }   yes
      namespace            FDM.FDM_HPI_cpi_patient                ns: 'FDM.FDM_HPI_cpi_patient'                      yes

      Surrounding oplog — both failing positions sit inside a same-second collMod burst

      Listing every entry in that oplog second (t = 1779263885, 2026-05-20 15:58:05) shows the two failing delete positions (i:7, i:10) embedded in a dense burst of collMod { changeStreamPreAndPostImages: { enabled: true } } DDL on the same database, each delete immediately preceded by an internal op:'n' noop ensureMajorityPrimaryAndScheduleDbTask:
       i:1   u MDM._tapdata_heartbeat_table
       i:2   c collMod FDM_HKPMI_district_area           { changeStreamPreAndPostImages: { enabled: true } }
       i:3   c collMod FDM_HKPMI_patient_type           { ... }
       i:4   c collMod FDM_HKPMI_address_detail         { ... }
       i:5   c collMod FDM_HKPMI_patient                 { ... }
       i:6   n noop "ensureMajorityPrimaryAndScheduleDbTask"
       i:7   d FDM.FDM_HPI_cpi_patient                   o:{_id: ObjectId('69a90134b4574833104559f6')}   <-- FAILING TOKEN #1
       i:8   c collMod FDM_HKPMI_patient_hospital_data   { ... }
       i:9   n noop "ensureMajorityPrimaryAndScheduleDbTask"
       i:10 d FDM.FDM_HPI_cpi_patient_hospital_data     o:{_id: ObjectId('69a90064b4574833100f2b45')}   <-- FAILING TOKEN #2
       i:11 c collMod FDM_HKPMI_address_detail2         { ... }
       i:12 c collMod FDM_HKPMI_district               { ... }
       i:13 c collMod FDM_HKPMI_elderly_home_table     { ... }
       i:14 c collMod FDM_HKPMI_document_type           { ... }
       i:15 c collMod FDM_HKPMI_hkpmi_patient_info_log { ... }
       i:16 c collMod FDM_HKPMI_pmi_case               { ... }
       i:17 u FDM._tapdata_heartbeat_table
      Both failing tokens (i:7 cpi_patient and i:10 hospital_data) are confirmed to be real delete entries whose _id matches the token's documentKey. The watched change stream uses a db-level $match (it surfaces only the FDM_HPI* deletes; the FDM_HKPMI_* collMod events are filtered out of the output, but the cursor still advances its internal resume position across all of these entries).

      Conclusion (confirmed: Outcome A)

      The oplog does contain the exact delete event the resume token names — same ts, same op (d), same collection UUID, same documentKey._id — on the current primary, deep inside the oplog window (200 GB / ~92 days, oldest entry 2026-03-05; entry still present as of 2026-06-05), with no rollback / no election all day. The resume token is therefore a valid, server-issued reference to a real, majority-committed oplog event, yet resumeAfter/getMore reports it as "resume token was not found" (code 280, NonResumableChangeStreamError).

      This rules out a client/driver token-integrity problem (the bytes resolve to a genuine oplog entry) and rules out the usual non-resumable causes (oplog truncation, rollback, drop). We believe this is a server-side defect (or an undocumented behavior) in change-stream resume resolution, and need MongoDB to root-cause it.

      Notable detail — the delete is a retryable write / session-associated operation

      The matching oplog entry carries lsid, txnNumber: 1, stmtId: 0, and prevOpTime: { ts: Timestamp(0,0), t: -1 }; that is, the delete was performed as a retryable write (session/transaction-number tagged), not a plain delete. Combined with the surrounding listing above, the failing resume positions share three traits: (a) they are retryable-write deletes, (b) each is immediately preceded by an ensureMajorityPrimaryAndScheduleDbTask noop, and (c) they are embedded in a same-second collMod { changeStreamPreAndPostImages } DDL burst on the same DB. We flag this combination as the likely-relevant context and ask MongoDB to confirm whether resume-token resolution for such entries (under this DDL/noop interleaving) has a known issue on 8.0.18.
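      Trait (b) and the density of the collMod burst can be checked mechanically against the §7 listing. The sketch below encodes only the op types of ordinals i:1 to i:17 as they appear in that listing; deletesPrecededByNoop is a hypothetical helper written for this ticket, not a MongoDB API.

```javascript
// Op types for ordinals i:1..i:17 of oplog second t = 1779263885, copied from
// the listing in §7: u = update, c = command (collMod), n = noop, d = delete.
const opsInSecond = "uccccndcndccccccu";

// Hypothetical helper: 1-based ordinals of deletes immediately preceded by a noop.
function deletesPrecededByNoop(ops) {
  const hits = [];
  for (let idx = 1; idx < ops.length; idx++) {
    if (ops[idx] === "d" && ops[idx - 1] === "n") hits.push(idx + 1);
  }
  return hits;
}

const flagged = deletesPrecededByNoop(opsInSecond);
const collModCount = [...opsInSecond].filter((op) => op === "c").length;
console.log({ flagged, collModCount });
```

      The flagged ordinals are 7 and 10, the two failing positions, and 11 of the 17 entries in that second are collMod commands.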


      8. Questions for MongoDB

      1. On 8.0.18, what conditions cause a live getMore on a change stream to fail with 280 NonResumableChangeStreamError when (a) the primary never changed state, (b) the resume point is well inside the oplog, and (c) the collection was not dropped?
      2. Can a routing/filtering-metadata refresh that logs "Marking collection as untracked" (msg id 7917801) / "Namespace not found, collection may have been dropped" (msg id 505070) invalidate an open change stream cursor that was opened directly against the shard? Is this expected for unsharded collections?
      3. Does concurrent collMod { changeStreamPreAndPostImages: { enabled: true } } on a watched collection interact with an in-flight change stream that uses fullDocumentBeforeChange, in a way that can yield 280?
      4. Is opening a change stream directly against a shard (bypassing mongos) expected to be subject to these metadata-refresh invalidations, and is the recommended remedy strictly to connect via mongos?
      5. Why does 8.0.18 return 280 ChangeStreamFatalError rather than 286 ChangeStreamHistoryLost for a resume point that is before the oplog start? (Diagnostic relevance: the error code alone does not distinguish "aged off" from "not found in-window".)
      6. The matching oplog entry is a retryable write (lsid / txnNumber / stmtId present, §7). Is there a known issue on 8.0.18 where a resume token generated for a retryable-write / session-associated delete cannot be resolved by resumeAfter/getMore even though the entry exists in the oplog? Does the resume-token's encoding of txnOpIndex/stmtId for such entries interact with a concurrent metadata refresh (§5)?
      7. The cluster runs 8.0.18 binaries with FCV 7.0 (upgrade not finalized). Could this mixed state affect change-stream resume-token resolution and/or the collection-tracking refresh behavior observed in §5? Is finalizing FCV to 8.0 expected to change this behavior?

      9. Attachments

      • [available] mongod log of dmpmgovmctst11a:28168 covering 2026-05-20 — contains the 280s, the collMod burst, the RecoverRefreshThread untracked messages, and zero REPL state transitions / rollback for the whole day (this log is the authoritative proof of §4.2 "no election/rollback"; a separate rs.status() history is therefore not required).
      • [available] mongos log dmpmgovmctst11a:28368 — contains no change-stream activity / no 280, confirming change streams went direct-to-shard.
      • [available] Application / driver log (tapdata-agent on 11.167.x, 2026-05-20) — the restart at 16:20:55, the persisted-token load, the resumeAfter → 280 at 16:21:02, and the resume-token bytes, which are byte-for-byte identical to the tokens in the mongod log (driver: mongo-java-driver sync 4.9.1). Corroborates §4.5.
      • [available] Live rs.printReplicationInfo() and the oplog.rs lookups (the failing entry + the full 15:58:05 second) — inlined in §4.3 and §7; the failing oplog entry is still present on the live cluster and can be re-checked on request.

       

        1. 06-oplog-evidence.txt (5 kB)
        2. 07-key-excerpts.txt (5 kB)
        3. 05-app-tapdata-agent-11.167.x-2026-05-20.log.gz (333 kB)
        4. 03-mongos-dmpmgovmctst11a-28368-2026-05-20.log.gz (3.45 MB)
        5. 04-mongos-dmpmgovmctst11a-28368-2026-05-21.log.gz (3.44 MB)
        6. 01-mongod-shard-dmpmgovmctst11a-28168-2026-05-20.log.gz (5.36 MB)
        7. 02-mongod-shard-dmpmgovmctst11a-28168-2026-05-21.log.gz (6.15 MB)

            Assignee:
            [DO NOT USE] Backlog - Query Execution
            Reporter:
            Mason More (EXT)