WiredTiger / WT-13310

WT random cursor continues to return duplicate records due to poor interaction with MongoDB layer's query yielding

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: 7.0.9, 6.0.16
    • Component/s: None
    • Storage Engines
    • StorEng - Defined Pipeline

      WT-11532 and WT-12225 addressed issues where the RNG state could be reinitialized across WT_SESSIONs such that the WiredTiger random cursor would make identical coin-flip decisions while traversing down the Btree, and would therefore return the same records. Those changes added the thread ID and the current tick of the system clock (accurate to within 1 millisecond on Linux systems) as entropy sources, so that a fresh WT_SESSION would not start with the same RNG state as a previous one. However, these changes aren't sufficient, because the same thread can release a WT_SESSION and acquire another one within the same millisecond. This quick release and reacquisition can happen because the default query yield policy in MongoDB 7.0 releases storage engine resources after 1000 calls to its internal work() function, which mostly equates to 1000 calls to WT_CURSOR::next(). MongoDB 8.0 is likely not impacted because SERVER-87163 changed the query yield policy to be exclusively time-based by default, and it continues to use the default interval of 10 milliseconds.
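
      The collision described above can be illustrated with a minimal sketch. This is not WiredTiger's code: seed_from(), random_descent(), the mixing constant, and the use of Python's random module are all assumptions made for illustration. It only shows that a seed derived from (thread ID, millisecond clock) repeats when the same thread reseeds within the same millisecond, so the sequence of coin flips repeats too.

```python
import random

MASK64 = 0xFFFFFFFFFFFFFFFF


def seed_from(thread_id, clock_ms):
    """Hypothetical seed mixer: combines only the thread ID and a millisecond clock."""
    return ((thread_id * 0x9E3779B97F4A7C15) ^ clock_ms) & MASK64


def random_descent(seed, depth=64):
    """Stand-in for a random cursor: one left/right coin flip per tree level."""
    rng = random.Random(seed)
    return [rng.randrange(2) for _ in range(depth)]


thread_id = 1234             # same thread before and after the yield
same_ms = 1_700_000_000_000  # both sessions are opened within this millisecond

# Session released and reacquired within the same millisecond: identical seed.
first = random_descent(seed_from(thread_id, same_ms))
second = random_descent(seed_from(thread_id, same_ms))
print(first == second)  # identical coin flips, hence the same record

# One millisecond later the seed differs, so the descent is very likely different.
print(seed_from(thread_id, same_ms) != seed_from(thread_id, same_ms + 1))
```

      In this sketch the two sessions opened in the same millisecond walk the same path, which corresponds to the duplicate records reported here.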

      We should further increase the entropy pool for WT_SESSIONs so that this common pattern, MongoDB $sample aggregations with a sample size greater than 1000 records, no longer errors.
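
      As one possible direction, and only as a sketch, a process-wide counter could be mixed into the seed so that two sessions opened by the same thread in the same millisecond still differ. The ticket does not say which entropy source should be added; seed_with_counter(), the counter, and the mixing constants below are hypothetical and are not part of WiredTiger.

```python
import itertools

MASK64 = 0xFFFFFFFFFFFFFFFF

# Hypothetical process-wide counter, advanced once per session seed.
_session_counter = itertools.count()


def seed_with_counter(thread_id, clock_ms):
    """Hypothetical seed mixer: thread ID, millisecond clock, and a counter."""
    n = next(_session_counter)
    mixed = (thread_id * 0x9E3779B97F4A7C15) ^ clock_ms ^ (n * 0xC2B2AE3D27D4EB4F)
    return mixed & MASK64


# Same thread, same millisecond: the counter alone separates the two seeds.
a = seed_with_counter(1234, 1_700_000_000_000)
b = seed_with_counter(1234, 1_700_000_000_000)
print(a != b)
```

      The counter multiplier is odd, so distinct counter values map to distinct 64-bit contributions; with the thread ID and clock held fixed, consecutive seeds cannot coincide in this sketch.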

            Assignee:
            jie.chen@mongodb.com Jie Chen
            Reporter:
            max.hirschhorn@mongodb.com Max Hirschhorn
            Votes:
            0
            Watchers:
            17
