  1. Core Server
  2. SERVER-27005

Write error revalidate logic needs to wait for lastVisibleOpTime to be committed

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Won't Fix
    • Affects Version/s: 3.2.10
    • Fix Version/s: None
    • Component/s: Sharding
    • Labels: None
    • Operating System: ALL
    • Sprint: Sharding 2017-01-02
    • Linked BF Score: 15

      Description

      A concrete example of this is the handling of an applyOps failure:

      https://github.com/mongodb/mongo/blob/r3.4.0-rc3/src/mongo/s/catalog/sharding_catalog_client_impl.cpp#L1325-L1335

      The issue is that when applyOps fails, for example because the current primary steps down, we don't know how far the write was applied. If the write was replicated, a retry of the applyOps will fail because its precondition no longer holds. However, if we then inspect the document to find out, we may not see the post-write state because it is not yet in the committed snapshot.
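The ambiguity described above can be reproduced in a toy model. This is only a sketch, not MongoDB code: `ToyConfigServer`, `apply_ops`, `majority_read` and `advance_commit_point` are hypothetical names, and it assumes the precondition is checked against the latest applied state while reads are served from the committed snapshot.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConfigServer:
    """Toy model (hypothetical): one document, tracked in two views."""
    # Latest state applied on the node accepting the write.
    applied: dict = field(default_factory=lambda: {"version": 1})
    # State visible to reads from the committed snapshot.
    committed: dict = field(default_factory=lambda: {"version": 1})

    def apply_ops(self, expected_version, new_version):
        # Assumption: the precondition is evaluated against the applied state.
        if self.applied["version"] != expected_version:
            raise RuntimeError("precondition failed")
        self.applied = {"version": new_version}

    def majority_read(self):
        # Reads only see what is in the committed snapshot.
        return dict(self.committed)

    def advance_commit_point(self):
        # Stands in for the write becoming majority committed.
        self.committed = dict(self.applied)


server = ToyConfigServer()

# The write is applied, but suppose the reply is lost (e.g. a stepdown).
server.apply_ops(expected_version=1, new_version=2)

# The retry fails on the precondition, because the first attempt did land.
try:
    server.apply_ops(expected_version=1, new_version=2)
    retry_failed = False
except RuntimeError:
    retry_failed = True

# Inspecting the document still shows the pre-write state...
seen_before_commit = server.majority_read()["version"]

# ...until the commit point catches up with the write.
server.advance_commit_point()
seen_after_commit = server.majority_read()["version"]
```

In this model the retry fails while the read still returns the old version, so a revalidation done at that moment would wrongly conclude that the write never happened.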

      A previous fix (SERVER-20487) attempted to always advance the lastVisibleOpTime whenever a write is attempted, so that the client can use it for readAfterOpTime. This, however, becomes a problem when the set loses its primary and the opTime never advances (SERVER-24630).

      One proposed solution is a hybrid: instead of advancing the global config opTime, the current request either passes the last returned visibleOpTime as the readAfterOpTime of its next query, or waits for replication by calling getLastError with the returned visibleOpTime.
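A minimal sketch of the per-request idea, again as a toy model rather than the actual catalog client code. `ToyReplicaSet`, `wait_until_committed`, `read` and `revalidate` are hypothetical names; the integer opTime and the step-based wait are simplifications standing in for real opTimes and real time passing.

```python
class ToyReplicaSet:
    """Toy model (hypothetical): an oplog plus a commit point."""

    def __init__(self):
        # Each entry is (opTime, document state after that op).
        self.oplog = [(1, {"version": 1})]
        self.commit_index = 0

    def write(self, new_version):
        optime = self.oplog[-1][0] + 1
        self.oplog.append((optime, {"version": new_version}))
        # Plays the role of the visibleOpTime returned to the request.
        return optime

    def committed_optime(self):
        return self.oplog[self.commit_index][0]

    def replicate_one(self):
        # Stands in for replication progress while a caller waits.
        if self.commit_index < len(self.oplog) - 1:
            self.commit_index += 1

    def wait_until_committed(self, optime, max_steps=10):
        # Second option from the text: wait for replication of a given opTime.
        for _ in range(max_steps):
            if self.committed_optime() >= optime:
                return
            self.replicate_one()
        if self.committed_optime() < optime:
            raise TimeoutError("commit point never reached the requested opTime")

    def read(self, after_optime=None):
        # First option from the text: a read that waits for after_optime.
        if after_optime is not None:
            self.wait_until_committed(after_optime)
        return dict(self.oplog[self.commit_index][1])


def revalidate(rs, visible_optime, expected_version):
    # The opTime is scoped to this request, not to a global config opTime.
    doc = rs.read(after_optime=visible_optime)
    return doc["version"] == expected_version


rs = ToyReplicaSet()
visible_optime = rs.write(new_version=2)

# A plain read may still see the pre-write state.
plain_version = rs.read()["version"]

# A read bounded by the request's own opTime sees the post-write state.
write_confirmed = revalidate(rs, visible_optime, expected_version=2)
```

Because the opTime travels with the request, other requests are not forced to wait on it. The wait is also bounded here, so an opTime that never becomes committed surfaces as an error instead of blocking indefinitely.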
