Index build collection scan acquires Client mutex per document for progress display

    • Type: Improvement
    • Resolution: Unresolved
    • Priority: Minor - P4
    • Affects Version/s: None
    • Component/s: None
    • Storage Execution
    • Storage Execution 2026-06-22, Storage Execution 2026-07-06

      During the collection scan phase of an index build, the scan loop in _doCollectionScan acquires the Client mutex twice per document scanned: once to call setTotalWhileRunning(numRecords()) and once to call progress->hit(). This applies to both PDIB and hybrid index builds, which share the same code path.

      Both setTotalWhileRunning() and hit() serve only progress display: they update the values read by db.currentOp() and written to the server log. numRecords() itself is an atomic load that returns an approximate count, and the resulting progress values are only meaningful at human-observable granularity (roughly once per second). Neither call needs to run once per document.

      Proposal

      Rate-limit the progress meter updates to at most once per second of wall-clock time. A lightweight timer can be checked every N documents (to amortize the clock_gettime() cost); only when the interval has elapsed should the mutex be acquired and the accumulated document count flushed via a single hit(n) call. This keeps currentOp() values about one second behind actual progress without changing the update semantics visible to operators, and reduces mutex acquisitions from O(docs/sec) to at most 2 per second regardless of scan speed.
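
      A minimal, standalone sketch of the rate-limiting idea follows. It is not the server code: FakeProgressMeter, RateLimitedProgress, onDocument(), finish() and the injectable NowFn clock are hypothetical names invented for illustration, and the std::mutex merely stands in for the Client mutex. The sketch also folds the total update and the hit(n) flush into one lock acquisition, which is an assumption rather than something the proposal specifies.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>

using Clock = std::chrono::steady_clock;
using NowFn = std::function<Clock::time_point()>;

// Hypothetical stand-in for the progress meter. The mutex plays the role of
// the Client mutex; lockAcquisitions exists only to make the cost observable.
struct FakeProgressMeter {
    std::mutex mutex;
    std::uint64_t total = 0;
    std::uint64_t done = 0;
    std::uint64_t lockAcquisitions = 0;

    // Updates the total and adds n processed documents under one lock.
    void flush(std::uint64_t newTotal, std::uint64_t n) {
        std::lock_guard<std::mutex> lk(mutex);
        ++lockAcquisitions;
        total = newTotal;
        done += n;
    }
};

// Accumulates per-document hits locally and flushes them to the meter at
// most once per `interval`, reading the clock only every `checkEvery` docs.
class RateLimitedProgress {
public:
    RateLimitedProgress(FakeProgressMeter& meter,
                        std::uint64_t checkEvery,
                        Clock::duration interval,
                        NowFn now = [] { return Clock::now(); })
        : _meter(meter),
          _checkEvery(checkEvery),
          _interval(interval),
          _now(std::move(now)),
          _lastFlush(_now()) {
        assert(checkEvery > 0);
    }

    // Called once per scanned document; takes no lock on the common path.
    void onDocument(std::uint64_t approxTotal) {
        ++_pending;
        if (++_sinceCheck < _checkEvery) {
            return;  // Too soon to even look at the clock.
        }
        _sinceCheck = 0;
        const auto now = _now();
        if (now - _lastFlush < _interval) {
            return;  // Interval not yet elapsed; keep accumulating.
        }
        _lastFlush = now;
        _flush(approxTotal);
    }

    // Called once when the scan ends so the final count is not lost.
    void finish(std::uint64_t approxTotal) {
        _flush(approxTotal);
    }

private:
    void _flush(std::uint64_t approxTotal) {
        _meter.flush(approxTotal, _pending);
        _pending = 0;
    }

    FakeProgressMeter& _meter;
    std::uint64_t _checkEvery;
    Clock::duration _interval;
    NowFn _now;
    Clock::time_point _lastFlush;
    std::uint64_t _pending = 0;
    std::uint64_t _sinceCheck = 0;
};
```

      In this sketch the scan loop would call onDocument(numRecords()) for every document and finish(numRecords()) once after the loop. The injectable clock is there only so the behavior can be exercised deterministically in a test; a real change would presumably use whatever clock source the server already provides.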

            Assignee:
            Ernesto Rodriguez Reina
            Reporter:
            Ernesto Rodriguez Reina
