Core Server / SERVER-18190

Secondary reads may block replication

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 3.0.2
    • Fix Version/s: 3.0.4, 3.1.3
    • Component/s: Concurrency, Querying
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Sprint: Quint Iteration 3

      Description

      Issue Status as of Jun 09, 2015

      ISSUE SUMMARY
      Read operations on secondary nodes in a replica set may block the application of replicated write operations, because long-running reads may not yield appropriately.

      USER IMPACT
      High-volume read operations on secondary nodes may cause those nodes to experience increased replication lag, which in turn may cause read operations to return out-of-date data.

      In extreme cases, the affected node may become "stale". Stale nodes need to be resynchronized. If enough nodes in a replica set become stale, availability may be impacted.
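      To make "replication lag" concrete, the following is a minimal sketch that computes it from a document shaped like the reply of replSetGetStatus (a "members" list whose entries carry "name", "stateStr" and "optimeDate"). The function name, host names and sample values are hypothetical, and the input is a plain dict rather than a live server reply.

```python
from datetime import datetime, timedelta


def replication_lag_seconds(status):
    """Return {member name: lag in seconds} for every SECONDARY member.

    Lag is measured as the primary's optimeDate minus the secondary's.
    """
    members = status["members"]
    primary = next((m for m in members if m["stateStr"] == "PRIMARY"), None)
    if primary is None:
        raise ValueError("no PRIMARY member in status document")
    return {
        m["name"]: (primary["optimeDate"] - m["optimeDate"]).total_seconds()
        for m in members
        if m["stateStr"] == "SECONDARY"
    }


# Hypothetical sample: one secondary is 8 seconds behind the primary.
t0 = datetime(2015, 4, 23, 12, 0, 0)
sample_status = {
    "members": [
        {"name": "host1:27017", "stateStr": "PRIMARY", "optimeDate": t0},
        {"name": "host2:27017", "stateStr": "SECONDARY",
         "optimeDate": t0 - timedelta(seconds=8)},
        {"name": "host3:27017", "stateStr": "SECONDARY", "optimeDate": t0},
    ]
}
lag = replication_lag_seconds(sample_status)
```

      A lag that keeps growing while reads run on a secondary, and recovers only when they stop, would be consistent with the behavior described in this issue.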

      WORKAROUNDS
      The preferred workaround is to suspend all read operations on secondary nodes.
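      One way an application might apply this workaround is to force the primary read preference in its connection string. The sketch below rewrites the readPreference option of a MongoDB URI using only the Python standard library; the helper name and the example hosts are hypothetical, and it assumes the application configures its read preference through the URI rather than in code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def force_primary_reads(uri):
    """Return `uri` with its read preference set to "primary".

    Existing readPreference and readPreferenceTags options are dropped,
    since tag sets do not apply to the primary mode.
    """
    parts = urlsplit(uri)
    dropped = {"readpreference", "readpreferencetags"}
    # URI option names are case-insensitive, so compare in lower case.
    options = [(k, v) for k, v in parse_qsl(parts.query)
               if k.lower() not in dropped]
    options.append(("readPreference", "primary"))
    return urlunsplit(parts._replace(query=urlencode(options)))


# Hypothetical connection string that currently allows secondary reads.
before = ("mongodb://host1:27017,host2:27017/app"
          "?replicaSet=rs0&readPreference=secondaryPreferred")
after = force_primary_reads(before)
```

      Read preferences set per operation or per collection in application code would need to be changed separately.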

      Alternatively, the oplog size can be increased on secondary nodes. This is only a suitable workaround if the nodes undergo periods of no reads so replication can catch up.
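      As a rough illustration of why a larger oplog only buys time, the sketch below treats the oplog as a fixed-size buffer filled at a constant rate. That is a simplifying assumption, and the function names, the safety factor and the example figures are hypothetical rather than taken from this issue.

```python
import math


def oplog_window_seconds(oplog_bytes, write_bytes_per_second):
    """Time span of operations the oplog can hold at a steady write rate."""
    return oplog_bytes / write_bytes_per_second


def min_oplog_bytes(stall_seconds, write_bytes_per_second, safety_factor=2.0):
    """Smallest oplog that still covers a replication stall of the given
    length, padded by a safety factor for bursts in write traffic."""
    return math.ceil(stall_seconds * write_bytes_per_second * safety_factor)


GIB = 1024 ** 3
MIB = 1024 ** 2

# A 10 GiB oplog written at 1 MiB/s covers 10240 seconds (about 2.8 hours).
window = oplog_window_seconds(10 * GIB, 1 * MIB)

# Covering a one-hour stall at 2 MiB/s with a 2x margin needs about 14.1 GiB.
needed = min_oplog_bytes(3600, 2 * MIB)
```

      Under this model, a secondary that is blocked by reads for longer than the window can no longer catch up from the oplog, which is why the workaround depends on periods without reads.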

      AFFECTED VERSIONS
      MongoDB 3.0.0 through 3.0.3.

      FIX VERSION
      The fix is included in the 3.0.4 production release.

      Original description

      • Three table scans, each taking 5-10 seconds (and returning no results), were run on the secondary against a collection of about 12M documents; they are marked A-B, C-D, and E-F in the graph. At the same time, documents were inserted into the same collection on the primary, driving replication traffic.
      • During the table scans, the replication rate falls to 0 and replication lag builds.
      • The graphs show straight lines between the beginning and end of each stall, indicating that the serverStatus command that the data collection depends on was blocked as well.
      • The primary is not similarly affected by the same table scan.
      • The problem reproduces on both WiredTiger and MMAPv1.
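
      The straight-line segments mentioned above correspond to gaps between monitoring samples. A minimal sketch of detecting such gaps from sample timestamps follows; the function name, the tolerance parameter and the sample series are hypothetical and are not taken from the tooling used in this report.

```python
def find_stalls(timestamps, expected_interval, tolerance=2.0):
    """Return (start, end) pairs where consecutive samples are further
    apart than tolerance * expected_interval.

    `timestamps` are sample times in seconds, in ascending order.
    """
    threshold = tolerance * expected_interval
    return [
        (earlier, later)
        for earlier, later in zip(timestamps, timestamps[1:])
        if later - earlier > threshold
    ]


# Samples are expected once per second; nothing arrives between t=3 and t=10.
samples = [0, 1, 2, 3, 10, 11, 12]
stalls = find_stalls(samples, expected_interval=1)
```

      Here the single detected gap, from t=3 to t=10, would show up on a graph as one straight line spanning the stall.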

        Attachments

        1. gdbmon.html (1.20 MB)
        2. secondary_reads.png (106 kB)

              People

              • Votes: 2
              • Watchers: 30