  1. Core Server
  2. SERVER-92236

Chunk migrations should use short-lived cancellation sources

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: None
    • Cluster Scalability
    • ALL
    • Cluster Scalability Priorities

      Recipients in a chunk migration use a singleton class that reuses one CancellationSource for every chunk receipt and, as of SERVER-65947, uses it to schedule tasks that run if the migration is interrupted. This includes creating CancelableOperationContexts for cloning data, for waiting for write concern, and, most notably, for copying session history, which creates one CancelableOperationContext per oplog entry received from the donor. Each CancelableOperationContext creates a future that is kept alive until the CancellationSource of its given token is destructed. The migration recipient code only resets its token when stepping up as primary, so cancellation futures created during chunk receipts stay in memory until the node steps down and back up, or restarts. If a large number of session entries are migrated, or there are a large number of migrations, this can use significant memory.
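
      The retention pattern described above can be illustrated with a toy model. This is only a sketch: ToyCancellationSource, ToyCancelableOpCtx, and retainedAfterMigrations are hypothetical names invented for illustration and are not the server's CancellationSource or CancelableOperationContext. The model assumes only what the description states, namely that each context leaves one piece of state behind that lives as long as the source.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cancellation source: every registered callback
// is retained until cancel() is called or the source is destroyed.
class ToyCancellationSource {
public:
    void onCancel(std::function<void()> callback) {
        _callbacks.push_back(std::move(callback));
    }
    void cancel() {
        for (auto& callback : _callbacks) callback();
        _callbacks.clear();
    }
    std::size_t retained() const { return _callbacks.size(); }

private:
    std::vector<std::function<void()>> _callbacks;
};

// Hypothetical stand-in for a cancelable operation context. Constructing one
// registers a callback on the source; destroying the context does NOT remove
// that callback, which is the behavior being modeled.
class ToyCancelableOpCtx {
public:
    explicit ToyCancelableOpCtx(ToyCancellationSource& source)
        : _interrupted(std::make_shared<bool>(false)) {
        auto flag = _interrupted;  // shared so the callback can outlive *this
        source.onCancel([flag] { *flag = true; });
    }
    bool interrupted() const { return *_interrupted; }

private:
    std::shared_ptr<bool> _interrupted;
};

// Simulates a recipient that reuses ONE source for all migrations and creates
// one context per oplog entry. Returns how many callbacks are still retained.
std::size_t retainedAfterMigrations(std::size_t migrations,
                                    std::size_t entriesPerMigration) {
    ToyCancellationSource longLived;  // never reset between migrations
    for (std::size_t m = 0; m < migrations; ++m) {
        for (std::size_t e = 0; e < entriesPerMigration; ++e) {
            ToyCancelableOpCtx opCtx(longLived);
            assert(!opCtx.interrupted());  // nothing cancelled the source
        }  // opCtx is destroyed here, but its callback stays on the source
    }
    return longLived.retained();
}
```

      In this model the retained count grows with the total number of entries processed, even though every context has already gone out of scope.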

      Instead, we should at least reset the CancellationSource at the end of each migration to avoid indefinite memory growth. We could also create a fresh sub-CancellationSource for just the duration of each processed batch of oplog entries (i.e., a new one for each iteration of this loop), so that copying a large number of sessions during one migration won't accumulate too much memory.
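
      A minimal sketch of the two proposed changes, again using a toy model rather than the server's classes. ToyCancellationSource, Retained, and simulateWithShortLivedSources are hypothetical names. The sketch assumes that linking a per-batch source to the migration's source leaves one callback on the parent per batch; whether the real sub-source behaves that way is an assumption of this model.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cancellation source: callbacks are retained
// until cancel() is called or the source is destroyed.
class ToyCancellationSource {
public:
    void onCancel(std::function<void()> callback) {
        _callbacks.push_back(std::move(callback));
    }
    void cancel() {
        for (auto& callback : _callbacks) callback();
        _callbacks.clear();
    }
    std::size_t retained() const { return _callbacks.size(); }

private:
    std::vector<std::function<void()>> _callbacks;
};

struct Retained {
    std::size_t peak;   // most callbacks alive at any one time
    std::size_t atEnd;  // callbacks still alive after the last migration
};

Retained simulateWithShortLivedSources(std::size_t migrations,
                                       std::size_t batchesPerMigration,
                                       std::size_t entriesPerBatch) {
    std::size_t peak = 0;
    std::unique_ptr<ToyCancellationSource> migrationSource;
    for (std::size_t m = 0; m < migrations; ++m) {
        // Change 1: a fresh source per migration.
        migrationSource = std::make_unique<ToyCancellationSource>();
        for (std::size_t b = 0; b < batchesPerMigration; ++b) {
            // Change 2: a sub-source that lives for one batch only.
            auto batchSource = std::make_shared<ToyCancellationSource>();
            std::weak_ptr<ToyCancellationSource> weak = batchSource;
            // Cancelling the migration still cancels an in-flight batch.
            migrationSource->onCancel([weak] {
                if (auto live = weak.lock()) live->cancel();
            });
            for (std::size_t e = 0; e < entriesPerBatch; ++e) {
                batchSource->onCancel([] {});  // one callback per oplog entry
            }
            assert(batchSource->retained() == entriesPerBatch);
            peak = std::max(peak, migrationSource->retained() +
                                      batchSource->retained());
        }  // batchSource is destroyed here, freeing its per-entry callbacks
        migrationSource.reset();  // end of migration: drop per-batch links
    }
    std::size_t atEnd = migrationSource ? migrationSource->retained() : 0;
    return Retained{peak, atEnd};
}
```

      In this model, peak retention is bounded by one batch of entries plus one link per batch in the current migration, and nothing is retained once the migration ends.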

            Assignee:
            Unassigned
            Reporter:
            Jack Mulrow (jack.mulrow@mongodb.com)
            Votes:
            5
            Watchers:
            11

              Created:
              Updated: