Core Server / SERVER-32284

awaitReplication can hang when the optime to wait for does not match the minSnapshot.

    Details

      Description

      ReplicationCoordinatorImpl::_awaitReplication_inlock waits for both an opTime and a minSnapshot. The method registers itself on a waiter list to receive a condition variable notification, and returns successfully once _doneWaitingForReplication_inlock returns true.

      In order for the predicate to return true, a valid snapshot must exist at the minSnapshot time.

      However, the condition variable is notified when _doneWaitingForReplication_inlock succeeds with a trivially true minSnapshot value. Note also that notifying a waiter removes it from the list of waiters that are notified when optimes advance.

      In this case, the predicate for _awaitReplication_inlock is stronger than the condition for being notified, and because notification happens at most once, a client can hang waiting for a follow-up notification that will never come.
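
      The sequence above can be illustrated with a minimal, single-threaded sketch. It is not the actual ReplicationCoordinatorImpl code: the types Waiter, Coordinator and Outcome, the integer timestamps, and the functions shouldNotify, advanceOpTime, advanceSnapshot and runScenario are all hypothetical names chosen for illustration. The sketch only models the two facts stated in the description: the notify-side check treats minSnapshot as trivially satisfied, and a notified waiter is removed from the list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins; these are not MongoDB's real types.
struct Waiter {
    std::uint64_t opTime;       // optime the caller wants replicated
    std::uint64_t minSnapshot;  // snapshot the caller needs to exist
    int notifications;          // how many times this waiter was woken
};

struct Coordinator {
    std::uint64_t replicatedOpTime = 0;
    std::uint64_t committedSnapshot = 0;
    std::vector<Waiter*> waiters;

    // Full predicate the waiting side uses before it may return.
    bool doneWaiting(const Waiter& w) const {
        return replicatedOpTime >= w.opTime && committedSnapshot >= w.minSnapshot;
    }

    // Weaker notify-side check: minSnapshot is treated as trivially true.
    bool shouldNotify(const Waiter& w) const {
        return replicatedOpTime >= w.opTime;
    }

    // Wake every waiter that passes the weaker check and drop it from the list.
    void wakeReadyWaiters() {
        std::vector<Waiter*> remaining;
        for (std::size_t i = 0; i < waiters.size(); ++i) {
            Waiter* w = waiters[i];
            if (shouldNotify(*w)) {
                ++w->notifications;  // one-shot: the waiter is not re-registered
            } else {
                remaining.push_back(w);
            }
        }
        waiters.swap(remaining);
    }

    void advanceOpTime(std::uint64_t t) {
        replicatedOpTime = t;
        wakeReadyWaiters();
    }

    void advanceSnapshot(std::uint64_t t) {
        committedSnapshot = t;
        wakeReadyWaiters();
    }
};

struct Outcome {
    int notifications;
    bool predicateSatisfied;
    bool stillRegistered;
};

Outcome runScenario() {
    Coordinator coord;
    Waiter w{5, 10, 0};
    coord.waiters.push_back(&w);

    // The optime is reached first: the waiter is notified and removed,
    // but its own predicate is still false because no snapshot exists yet.
    coord.advanceOpTime(5);
    assert(!coord.doneWaiting(w));

    // The snapshot arrives later, but nobody is left on the list to wake.
    coord.advanceSnapshot(10);
    return {w.notifications, coord.doneWaiting(w), !coord.waiters.empty()};
}
```

      In this model, runScenario ends with the waiter's predicate satisfied but only one notification delivered, and that one arrived while the predicate was still false. A real thread blocked on the condition variable would be in the same position: it wakes once, sees the predicate fail, goes back to waiting, and is never signalled again.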

              People

              Assignee:
              benety.goh Benety Goh
              Reporter:
              daniel.gottlieb Daniel Gottlieb