Core Server / SERVER-28810

RangeDeleter appears to abort delete due to 112 WriteConflict

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: 3.4.5, 3.5.8
    • Affects Version/s: 3.2.13, 3.4.4, 3.5.7
    • Component/s: Sharding
    • Labels: None
    • Backwards Compatibility: Fully Compatible
    • Operating System: ALL
    • Backport Requested: v3.4
    • Steps To Reproduce:
      Don't know how to reproduce, and unfortunately I can't share my log files. We're using version 3.2.11.
    • Sprint: Sharding 2017-05-29

Description:

      We have a sharded cluster. One of our primaries had several RangeDeleter tasks queued up for chunks that had been migrated off to other shards. Typically the log shows the following sequence when deleting a chunk after its migration to a new primary (a rough client-side equivalent is sketched after the list):

      1. Deleter starting delete for: <namespace> from {<begin-range-of-chunk>} -> {<end-range-of-chunk>}, with opId: xxxxxxxx
      2. Some time later...Helpers::removeRangeUnlocked time spent waiting for replication: x ms
      3. rangeDeleter deleted n documents for <namespace> from {<begin-range-of-chunk>} -> {<end-range-of-chunk>}
      
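      For reference, the happy path amounts to deleting every document in the chunk's range and then waiting for the delete to replicate, which is what the "time spent waiting for replication" line measures. A rough client-side equivalent follows; this is an illustrative pymongo sketch with a placeholder host, namespace, shard key field, and bounds, not what the server runs internally:

      from pymongo import MongoClient, WriteConcern

      # Placeholders: substitute your shard primary, namespace, and chunk bounds.
      shard = MongoClient("shard0-primary.example.net", 27017)
      coll = shard["mydb"].get_collection(
          "mycoll",
          # w="majority" makes the delete block until it has replicated,
          # analogous to the rangeDeleter's replication wait.
          write_concern=WriteConcern(w="majority"),
      )
      result = coll.delete_many({"shardKeyField": {"$gte": 1000, "$lt": 2000}})
      print("deleted", result.deleted_count, "documents in range")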

      However, occasionally we see the following instead (see the note on WriteConflict handling after the list):

      1. Deleter starting delete for: ... (normal log statement as above)
      2. Some time later... Error encountered while trying to delete range: Error encountered while deleting range: ns<namespace> from {<begin-range-of-chunk>} -> {<end-range-of-chunk>}, cause by:  :: caused by :: 112 WriteConflict
      3. No further log statements by the RangeDeleter for the specified chunk range that experienced a write conflict.
      
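      For context, error code 112 (WriteConflict) is the storage engine's signal that an operation lost an optimistic-concurrency race with another writer; it is normally transient, and the expected handling is to retry the unit of work rather than abort. The server does this internally in C++, but the pattern looks roughly like the following (a minimal, self-contained Python sketch, not the actual rangeDeleter code):

      import random

      class WriteConflict(Exception):
          """Stand-in for the server's WriteConflictException (code 112)."""

      def delete_batch():
          # Simulate a storage-engine operation that occasionally loses a
          # concurrency race with another writer.
          if random.random() < 0.5:
              raise WriteConflict("112 WriteConflict")
          return 100  # documents deleted in this batch

      def delete_with_retry(max_attempts=10):
          # Expected pattern: a WriteConflict aborts only the current attempt;
          # the caller retries the batch until it succeeds.
          for attempt in range(1, max_attempts + 1):
              try:
                  return delete_batch()
              except WriteConflict:
                  print("attempt", attempt, "hit WriteConflict, retrying")
          raise RuntimeError("gave up after repeated write conflicts")

      print("deleted", delete_with_retry(), "documents")

      The logs above suggest the deleter instead treated the conflict as fatal and abandoned the queued range.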

      I can only assume that the WriteConflict was not handled properly and that the documents were never successfully deleted. One way to check for (and clean up) leftover documents is sketched below.
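      To verify, connect directly to the donor shard's primary (bypassing mongos) and count documents still inside the migrated chunk range; if orphans remain, the cleanupOrphaned admin command can remove them. A hedged pymongo sketch, where the host, namespace, shard key field, and bounds are placeholders:

      from pymongo import MongoClient

      # Placeholders: substitute your shard primary, namespace, and chunk bounds.
      shard = MongoClient("shard0-primary.example.net", 27017)  # not mongos
      begin, end = 1000, 2000  # the {begin-range} -> {end-range} from the logs

      # Count documents still sitting in the migrated range on the donor shard.
      leftover = shard["mydb"]["mycoll"].count_documents(
          {"shardKeyField": {"$gte": begin, "$lt": end}}
      )
      print("possible orphans:", leftover)

      # cleanupOrphaned runs against the shard primary's admin database and
      # deletes one contiguous range of unowned documents per call; repeat
      # from the returned stoppedAtKey until no ranges remain.
      result = shard.admin.command("cleanupOrphaned", "mydb.mycoll")
      print(result)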

      Assignee: Nathan Myers
      Reporter: James Reitz
      Votes: 0
      Watchers: 10
