Core Server: SERVER-16911

errorMsg: moveChunk cannot enter critical section before all data is cloned


Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • 2.8.0-rc3, 2.8.0-rc4
    • Sharding
    • None
    • Fully Compatible
    • ALL

    Description

      Related to SERVER-16763.

      I found the following entry in the server log during a longevity test; it eventually led to a server crash:

      2015-01-05T17:47:22.602+0000 E SHARDING [conn60] moveChunk cannot enter critical section before all data is cloned, 81584 locs were not transferred but to-shard reported { active: true, ns: "sbtest.sbtest1", from: "rs2/172.31.32.214:27017,ip-172-31-35-229:27017", min: { _id: -7816322693657637576 }, max: { _id: -7672769179660119751 }, shardKeyPattern: { _id: "hashed" }, state: "clone", counts: { cloned: 1480, clonedBytes: 321160, catchup: 0, steady: 0 }, ok: 1.0 }

      SERVER-16763 addressed an issue where system clock drift could cause lock timeouts.

      The moveChunk message could be a separate issue that still needs to be fixed.

      I looked up the message "moveChunk cannot enter critical section before all data is cloned, 81584 locs were not transferred but to-shard reported", which is the last message before the thread's long wait and the eventual crash. It points to https://github.com/mongodb/mongo/blob/master/src/mongo/s/d_migrate.cpp#L1372-L1380, where the comment says:

      // Should never happen, but safe to abort before critical section

      mongod then crashes after a while, waiting at https://github.com/mongodb/mongo/blob/master/src/mongo/s/d_migrate.cpp#L307 (which should be fixed by SERVER-16763).

      I am not sure what condition could cause migrateFromStatus.cloneLocsRemaining() to be non-zero here, since we believe this condition should never happen.
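
      To make the check in question concrete, here is a minimal sketch of the kind of guard described above: refuse to enter the critical section while any record locations remain untransferred. This is not the actual d_migrate.cpp code. DonorMigrationState and canEnterCriticalSection are hypothetical names used only for illustration; only cloneLocsRemaining() and the wording of the error message come from this report.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the donor-side migration state; this is NOT the
// real migrateFromStatus object from d_migrate.cpp.
struct DonorMigrationState {
    std::size_t locsRemaining;  // record locations not yet sent to the to-shard

    std::size_t cloneLocsRemaining() const { return locsRemaining; }
};

// Returns true if it is safe to enter the critical section. Otherwise fills
// errmsg with a message modeled on the log line quoted in this report and
// returns false, so the caller can abort before the critical section.
bool canEnterCriticalSection(const DonorMigrationState& state, std::string& errmsg) {
    const std::size_t remaining = state.cloneLocsRemaining();
    if (remaining != 0) {
        errmsg = "moveChunk cannot enter critical section before all data is cloned, " +
                 std::to_string(remaining) + " locs were not transferred";
        return false;
    }
    return true;
}
```

      In this sketch the guard only reports the inconsistency. It does not explain how the remaining count could be non-zero at this point, which is the open question here.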

          People

            randolph@mongodb.com Randolph Tan
            rui.zhang Rui Zhang (Inactive)