Core Server / SERVER-27090

sharding/cursor_timeout.js sets cursorTimeoutMillis incorrectly


    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.4.0-rc5, 3.5.1
    • Component/s: None
    • Labels:
      None
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Backport Completed:
    • Steps To Reproduce:

      Run the jstests/sharding/cursor_timeout.js test with --repeat. I was able to reproduce in under 200 iterations on my Mac.

    • Sprint:
      Query 2016-12-12
    • Linked BF Score:
      0

      Description

      The sharding/cursor_timeout.js test sets cursorTimeoutMillis to the same interval as clientCursorMonitorFrequencySecs. As a result, a cursor can time out the first time the clientcursormon thread wakes up from its clientCursorMonitorFrequencySecs sleep: the elapsed time passed to the timeout check comes from a clientcursormon-level timer rather than from the cursor's own age, so it is never less than clientCursorMonitorFrequencySecs.

      In light of the above, the sequence of events that causes this test failure is:

      1. A find() is run that returns a subset of the result set and leaves an open cursor.
      2. Just after (on the order of <1 ms), the clientcursormon thread wakes up from its 1-second sleep and attempts to kill expired cursors.
      3. The clientcursormon thread passes its timer value to the kill method as the elapsed time. In my testing this was roughly 1004 ms.
      4. The open cursor is killed after only being open for a few milliseconds.
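
The sequence above can be sketched as a small standalone model. This is an illustration only, not the server's implementation: the function name, its parameters, and the `>=` comparison are assumptions made for the sketch. The one behavior taken from the description is that the monitor credits the cursor with the monitor's own elapsed time rather than the cursor's real age.

```javascript
// Hypothetical model of a single clientcursormon pass (not server code).
// cursorTimeoutMillis:   configured timeout
// cursorAgeAtWakeMillis: how long the cursor has really been open
// monitorElapsedMillis:  the monitor's own timer value since its last pass
function simulateMonitorPass(cursorTimeoutMillis, cursorAgeAtWakeMillis, monitorElapsedMillis) {
    // The monitor's elapsed time is used as the cursor's idle time,
    // regardless of how long the cursor has actually existed.
    const creditedIdleMillis = monitorElapsedMillis;
    return {
        actualAgeMillis: cursorAgeAtWakeMillis,
        creditedIdleMillis: creditedIdleMillis,
        // Assumed comparison; the real check may differ in detail.
        killed: creditedIdleMillis >= cursorTimeoutMillis,
    };
}

// Values from the description: cursor opened ~1 ms before the monitor
// wakes, monitor timer reads ~1004 ms.
const withTimeout1000 = simulateMonitorPass(1000, 1, 1004);
const withTimeout2000 = simulateMonitorPass(2000, 1, 1004);
console.log(withTimeout1000.killed, withTimeout2000.killed);
```

In this model a timeout of 1000 kills a cursor that is only about 1 ms old, while a timeout of 2000 lets it survive the first pass.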

      A quick fix for the test would be to increase cursorTimeoutMillis to 2000. This should give what I expect was the intended behavior: the cursor is killed only after at least 1 second has passed (in practice, somewhere between 1 and 2 seconds).
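
The "between 1 and 2 seconds" figure can be checked with a little arithmetic. The sketch below is an idealized model, not server code: it assumes the monitor wakes exactly every period, credits one full period per pass, and kills once the credited time reaches the timeout. The function name and return fields are hypothetical.

```javascript
// Idealized bounds on how long a cursor really lives before being killed.
function cursorLifetimeBoundsMillis(cursorTimeoutMillis, monitorPeriodMillis) {
    // Number of monitor passes before the credited idle time reaches the timeout.
    const passesToKill = Math.ceil(cursorTimeoutMillis / monitorPeriodMillis);
    return {
        passesToKill: passesToKill,
        // Cursor opened just before a pass: it gets almost no real time
        // during the first period, so this bound is approached, not reached.
        minLifetimeMillis: (passesToKill - 1) * monitorPeriodMillis,
        // Cursor opened just after a pass: it lives through every full period.
        maxLifetimeMillis: passesToKill * monitorPeriodMillis,
    };
}

const current = cursorLifetimeBoundsMillis(1000, 1000);
const proposed = cursorLifetimeBoundsMillis(2000, 1000);
console.log(current, proposed);
```

Under these assumptions the current setting allows a real lifetime anywhere between roughly 0 and 1000 ms, and the proposed setting moves that window to between 1000 and 2000 ms.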

      We may also want to consider failing startup when 0 < cursorTimeoutMillis <= (clientCursorMonitorFrequencySecs * 1000), which is how this test is configured, or at a minimum audit the other tests to make sure none of them set up cursorTimeoutMillis in this manner.
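
A minimal sketch of the proposed check follows, written as a standalone function so it can also serve as an audit helper for test configurations. The function name, return shape, and message text are hypothetical; only the rejected range is taken from the proposal above. A real startup check would live in the server itself.

```javascript
// Hypothetical validation of the two settings (not an existing server API).
function validateCursorTimeout(cursorTimeoutMillis, clientCursorMonitorFrequencySecs) {
    const monitorPeriodMillis = clientCursorMonitorFrequencySecs * 1000;
    // Rejected range from the proposal: 0 < timeout <= monitor period.
    if (cursorTimeoutMillis > 0 && cursorTimeoutMillis <= monitorPeriodMillis) {
        return {
            ok: false,
            reason: "cursorTimeoutMillis (" + cursorTimeoutMillis +
                ") must be greater than clientCursorMonitorFrequencySecs * 1000 (" +
                monitorPeriodMillis + ")",
        };
    }
    return { ok: true };
}

console.log(validateCursorTimeout(1000, 1)); // the test's current configuration
console.log(validateCursorTimeout(2000, 1)); // the proposed quick fix
```

The first call is rejected and the second is accepted. A value of 0 falls outside the rejected range, so this sketch accepts it without interpreting it.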

            People

            Assignee:
            james.wahlin James Wahlin
            Reporter:
            james.wahlin James Wahlin
            Votes:
            0
            Watchers:
            2
