  1. Core Server
  2. SERVER-48526

Lengthy oplog scans may cause difficulty resuming change streams in a sharded cluster

    • Query Execution

      When a change stream is resumed in a sharded cluster, every shard that has no oplog entries for the relevant namespace (either because it never held any data for the collection, or because it has not taken any writes recently) will scan through its oplog until reaching EOF before returning its first getMore batch to mongoS. If the oplogs are large and the requested resume point is early, this scan may take so long that the cursors on the shards which do have oplog entries time out, effectively making the stream unresumable. This problem could still manifest after SERVER-30784 is complete, although it will be significantly less likely.

      Addressing this may require changing the semantics of tailable-awaitData getMores on mongoD. Currently, if there are no matching entries, a getMore will scan to the end of the oplog regardless of how long that takes. We could instead have it proactively break and return once the specified awaitData period expires, much like a non-throwing maxTimeMS deadline. This would also be consistent with mongoS's current behaviour.
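
      The proposed behaviour can be illustrated with a rough Python sketch. This is a toy model, not the server's implementation: the oplog is a plain list of dicts, and the names `scan_with_soft_deadline`, `ticking_clock`, and the `ns` field are hypothetical and chosen only for illustration. It shows a scan that stops without raising once the awaitData period has elapsed, and reports the position from which the next getMore could continue.

```python
import time
from typing import Callable, List, Tuple


def scan_with_soft_deadline(
    oplog: List[dict],
    start: int,
    namespace: str,
    await_data_secs: float,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[List[dict], int, bool]:
    """Scan oplog[start:] for entries whose "ns" equals `namespace`.

    Returns (batch, next_position, reached_eof). If the awaitData period
    expires before EOF, the scan stops WITHOUT raising and returns the
    (possibly empty) batch, so the caller can issue another getMore that
    picks up at next_position.
    """
    deadline = clock() + await_data_secs
    batch: List[dict] = []
    pos = start
    while pos < len(oplog):
        if clock() >= deadline:
            # Soft, non-throwing deadline: hand back what we have so far.
            return batch, pos, False
        entry = oplog[pos]
        pos += 1
        if entry["ns"] == namespace:
            batch.append(entry)
    return batch, pos, True


def ticking_clock() -> Callable[[], float]:
    """Hypothetical fake clock for demos: returns 0.0, 1.0, 2.0, ... per call."""
    state = {"t": -1.0}

    def clock() -> float:
        state["t"] += 1.0
        return state["t"]

    return clock


if __name__ == "__main__":
    # A shard whose oplog holds no entries for the watched namespace.
    oplog = [{"ns": "other.coll", "ts": i} for i in range(10)]
    batch, pos, eof = scan_with_soft_deadline(
        oplog, 0, "test.coll", await_data_secs=3.0, clock=ticking_clock()
    )
    print(batch, pos, eof)
```

      With the fake clock, the first call gives up early and returns an empty batch plus a position to continue from, rather than holding the response until EOF. A follow-up call starting at that position continues the scan.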

            Assignee:
            backlog-query-execution [DO NOT USE] Backlog - Query Execution
            Reporter:
            bernard.gorman@mongodb.com Bernard Gorman
            Votes:
            0
            Watchers:
            13