SERVER-21463: SyncClusterConnection should gracefully handle fsyncLock


Details

    • Project: Core Server
    • Type: Improvement
    • Resolution: Won't Fix
    • Priority: Major - P3
    • Component/s: Sharding

    Description

      Metadata writes are still attempted when a config server is fsyncLocked, which can cause consistency problems: most likely distlocks getting stuck, but if the fsyncLocked config server cannot be unlocked, actual metadata inconsistencies when it is forcibly shut down.

      This is different from the behaviour when a config server is down or unreachable, in which case the metadata write is not attempted on any config server. This is problematic because users often think "I need to stop writes to the config servers, but taking one down is scary, so I'll just fsyncLock it instead", i.e. fsyncLock is often perceived as a good way of stopping writes without stopping reads. This is not true for the SyncClusterConnection config protocol, and can do more harm than good.

      It would be good if, for the purposes of writes, SCCC considered an fsyncLocked config server to be unavailable, the same as if it were actually down.

      The simplest way to do this would probably be for getLastError to return an fsyncLocked bool, which the SCCC protocol could inspect alongside its existing ok: 1 check inside ConfigCoordinator::executeBatch().
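      A minimal sketch of the idea from the shell, assuming the proposed fsyncLocked field (it does not exist in getLastError today); ConfigCoordinator::executeBatch() would make the equivalent server-side check where it currently inspects ok: 1:

        // Hypothetical client-side version of the proposed check; 'conn' is a
        // connection (new Mongo("host:port")) to a single config server.
        function configServerWritable(conn) {
            var res = conn.getDB("admin").runCommand({ getlasterror: 1, fsync: 1 });
            // Proposed: getLastError would also return fsyncLocked: true while
            // the server is fsyncLocked, letting SCCC treat it as down for writes.
            return res.ok === 1 && !res.fsyncLocked;
        }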

      Repro:

      1. Start sharded cluster with 3 config servers
      2. Shard a collection
      3. Take a config server down
      4. An attempted split fails, as expected:

        mongos> db.test.splitAt({_id:1})
        {
                "ok" : 0,
                "errmsg" : "not all config servers are up: Not all config servers config:genique:11114,genique:11115,genique:11116 are reachable"
        }
        
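      For reference, steps 2 and 6 map to standard shell commands (the database/collection names match this repro; the choice of config server is arbitrary):

        // step 2, on mongos:
        sh.enableSharding("test")
        sh.shardCollection("test.test", { _id: 1 })

        // step 6, in a shell connected to one config server (e.g. genique:11115):
        db.fsyncLock()    // shell helper for { fsync: 1, lock: true }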

      5. Bring the config server back up.
      6. fsyncLock a config server.
      7. Attempt a split, which will block.
      8. Wait ~30 seconds (the socket receive timeout; note the RECV_TIMEOUT socket exception in the logs below).
      9. The split will fail.

        mongos> db.test.splitAt({_id:2})
        {
                "cause" : "SplitFailed splitChunk failed - cmd: { splitChunk: \"test.test\", keyPattern: { _id: 1.0 }, min: { _id: 1.0 }, max: { _id: MaxKey }, from: \"shard01\", splitKeys: [ { _id: 2.0 } ], shardId: \"test.test-_id_1.0\", configdb: \"genique:11114,genique:11115,genique:11116\", epoch: ObjectId('5647c30eab64f42ca072e6ca') } result: { ok: 0.0, errmsg: \"could not acquire collection lock for test.test to split chunk [{ : MinKey },{ : MaxKey }) :: caused by :: exception creating distributed lock test.te...\" }",
                "ok" : 0,
                "errmsg" : "split failed"
        }
        

        2015-11-15T10:28:25.874+1100 I COMMAND  [conn5] splitting chunk [{ _id: 1.0 },{ _id: MaxKey }) in collection test.test on shard shard01
        2015-11-15T10:28:55.956+1100 W SHARDING [conn5] splitChunk failed - cmd: { splitChunk: "test.test", keyPattern: { _id: 1.0 }, min: { _id: 1.0 }, max: { _id: MaxKey }, from: "shard01", splitKeys: [ { _id: 2.0 } ], shardId: "test.test-_id_1.0", configdb: "genique:11114,genique:11115,genique:11116", epoch: ObjectId('5647c30eab64f42ca072e6ca') } result: { ok: 0.0, errmsg: "could not acquire collection lock for test.test to split chunk [{ : MinKey },{ : MaxKey }) :: caused by :: exception creating distributed lock test.te..." }
        

        2015-11-15T10:28:25.874+1100 I SHARDING [conn2] received splitChunk request: { splitChunk: "test.test", keyPattern: { _id: 1.0 }, min: { _id: 1.0 }, max: { _id: MaxKey }, from: "shard01", splitKeys: [ { _id: 2.0 } ], shardId: "test.test-_id_1.0", configdb: "genique:11114,genique:11115,genique:11116", epoch: ObjectId('5647c30eab64f42ca072e6ca') }
        2015-11-15T10:28:50.767+1100 I NETWORK  [LockPinger] SyncClusterConnection connecting to [genique:11114]
        2015-11-15T10:28:50.768+1100 I NETWORK  [LockPinger] SyncClusterConnection connecting to [genique:11115]
        2015-11-15T10:28:50.768+1100 I NETWORK  [LockPinger] SyncClusterConnection connecting to [genique:11116]
        2015-11-15T10:28:55.950+1100 I NETWORK  [conn2] Socket recv() timeout  127.0.1.1:11115
        2015-11-15T10:28:55.950+1100 I NETWORK  [conn2] SocketException: remote: 127.0.1.1:11115 error: 9001 socket exception [RECV_TIMEOUT] server [127.0.1.1:11115]
        2015-11-15T10:28:55.950+1100 I NETWORK  [conn2] DBClientCursor::init call() failed
        2015-11-15T10:28:55.956+1100 W SHARDING [conn2] could not acquire collection lock for test.test to split chunk [{ : MinKey },{ : MaxKey }) :: caused by :: exception creating distributed lock test.test/genique:11112:1447543610:1011412462 :: caused by :: SyncClusterConnection write op failed: genique:11115 (127.0.1.1) failed: {} DBClientBase::findN: transport error: genique:11115 ns: admin.$cmd query: { getlasterror: 1, fsync: 1 }
        2015-11-15T10:28:55.956+1100 I COMMAND  [conn2] command admin.$cmd command: splitChunk { splitChunk: "test.test", keyPattern: { _id: 1.0 }, min: { _id: 1.0 }, max: { _id: MaxKey }, from: "shard01", splitKeys: [ { _id: 2.0 } ], shardId: "test.test-_id_1.0", configdb: "genique:11114,genique:11115,genique:11116", epoch: ObjectId('5647c30eab64f42ca072e6ca') } ntoreturn:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:434 locks:{} 30081ms
        
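      Step 10 below uses the standard shell helper, run against the fsyncLocked config server:

        db.fsyncUnlock()    // releases the fsync lock taken in step 6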

      10. fsyncUnlock the config server.
      11. The collection lock is still stuck, and subsequent metadata operations continue to fail accordingly:

        mongos> db.test.splitAt({_id:4})
        {
                "cause" : "SplitFailed splitChunk failed - cmd: { splitChunk: \"test.test\", keyPattern: { _id: 1.0 }, min: { _id: 1.0 }, max: { _id: MaxKey }, from: \"shard01\", splitKeys: [ { _id: 4.0 } ], shardId: \"test.test-_id_1.0\", configdb: \"genique:11114,genique:11115,genique:11116\", epoch: ObjectId('5647c30eab64f42ca072e6ca') } result: { ok: 0.0, errmsg: \"could not acquire collection lock for test.test to split chunk [{ : MinKey },{ : MaxKey }) :: caused by :: Lock for splitting chunk [{ : MinKey }, { :...\" }",
                "ok" : 0,
                "errmsg" : "split failed"
        }
        

        2015-11-15T10:29:21.018+1100 I COMMAND  [conn5] splitting chunk [{ _id: 1.0 },{ _id: MaxKey }) in collection test.test on shard shard01
        2015-11-15T10:29:21.019+1100 W SHARDING [conn5] splitChunk failed - cmd: { splitChunk: "test.test", keyPattern: { _id: 1.0 }, min: { _id: 1.0 }, max: { _id: MaxKey }, from: "shard01", splitKeys: [ { _id: 3.0 } ], shardId: "test.test-_id_1.0", configdb: "genique:11114,genique:11115,genique:11116", epoch: ObjectId('5647c30eab64f42ca072e6ca') } result: { ok: 0.0, errmsg: "could not acquire collection lock for test.test to split chunk [{ : MinKey },{ : MaxKey }) :: caused by :: Lock for splitting chunk [{ : MinKey }, { :..." }
        

        2015-11-15T10:29:21.018+1100 I SHARDING [conn2] received splitChunk request: { splitChunk: "test.test", keyPattern: { _id: 1.0 }, min: { _id: 1.0 }, max: { _id: MaxKey }, from: "shard01", splitKeys: [ { _id: 3.0 } ], shardId: "test.test-_id_1.0", configdb: "genique:11114,genique:11115,genique:11116", epoch: ObjectId('5647c30eab64f42ca072e6ca') }
        2015-11-15T10:29:21.018+1100 W SHARDING [conn2] could not acquire collection lock for test.test to split chunk [{ : MinKey },{ : MaxKey }) :: caused by :: Lock for splitting chunk [{ : MinKey }, { : MaxKey }) in test.test is taken.
        
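      While the lock is stuck, the lock document can be inspected through mongos (standard config.locks schema for this era; state 0 means unlocked, nonzero means held or being acquired):

        mongos> db.getSiblingDB("config").locks.findOne({ _id: "test.test" })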

      12. After 15 minutes (the distributed lock takeover time of 900000 ms, visible in the log below), the lock will be forced as usual.

        mongos> db.test.splitAt({_id:4})
        { "ok" : 1 }
        

        2015-11-15T10:51:26.698+1100 I COMMAND  [conn5] splitting chunk [{ _id: 1.0 },{ _id: MaxKey }) in collection test.test on shard shard01
        2015-11-15T10:51:27.312+1100 I SHARDING [conn5] ChunkManager: time to load chunks for test.test: 0ms sequenceNumber: 6 version: 2|3||5647c30eab64f42ca072e6ca based on: 2|1||5647c30eab64f42ca072e6ca
        

        2015-11-15T10:51:26.698+1100 I SHARDING [conn6] received splitChunk request: { splitChunk: "test.test", keyPattern: { _id: 1.0 }, min: { _id: 1.0 }, max: { _id: MaxKey }, from: "shard01", splitKeys: [ { _id: 4.0 } ], shardId: "test.test-_id_1.0", configdb: "genique:11114,genique:11115,genique:11116", epoch: ObjectId('5647c30eab64f42ca072e6ca') }
        2015-11-15T10:51:26.699+1100 I SHARDING [conn6] forcing lock 'test.test/genique:11112:1447543610:1011412462' because elapsed time 1325681 > takeover time 900000
        2015-11-15T10:51:26.824+1100 I SHARDING [conn6] lock 'test.test/genique:11112:1447543610:1011412462' successfully forced
        2015-11-15T10:51:27.028+1100 I SHARDING [conn6] distributed lock 'test.test/genique:11112:1447543610:1011412462' acquired, ts : 5647c8fe87cb2b46d7bc360a
        2015-11-15T10:51:27.028+1100 I SHARDING [conn6] remotely refreshing metadata for test.test based on current shard version 2|0||5647c30eab64f42ca072e6ca, current metadata version is 2|0||5647c30eab64f42ca072e6ca
        2015-11-15T10:51:27.028+1100 I SHARDING [conn6] updating metadata for test.test from shard version 2|0||5647c30eab64f42ca072e6ca to shard version 2|1||5647c30eab64f42ca072e6ca
        2015-11-15T10:51:27.028+1100 I SHARDING [conn6] collection version was loaded at version 2|1||5647c30eab64f42ca072e6ca, took 0ms
        2015-11-15T10:51:27.028+1100 I SHARDING [conn6] splitChunk accepted at version 2|1||5647c30eab64f42ca072e6ca
        2015-11-15T10:51:27.164+1100 I SHARDING [conn6] about to log metadata event: { _id: "genique-2015-11-14T23:51:27-5647c8ff87cb2b46d7bc360b", server: "genique", clientAddr: "127.0.0.1:44034", time: new Date(1447545087164), what: "split", ns: "test.test", details: { before: { min: { _id: 1.0 }, max: { _id: MaxKey } }, left: { min: { _id: 1.0 }, max: { _id: 4.0 }, lastmod: Timestamp 2000|2, lastmodEpoch: ObjectId('5647c30eab64f42ca072e6ca') }, right: { min: { _id: 4.0 }, max: { _id: MaxKey }, lastmod: Timestamp 2000|3, lastmodEpoch: ObjectId('5647c30eab64f42ca072e6ca') } } }
        2015-11-15T10:51:27.311+1100 I SHARDING [conn6] distributed lock 'test.test/genique:11112:1447543610:1011412462' unlocked.
        2015-11-15T10:51:27.311+1100 I COMMAND  [conn6] command admin.$cmd command: splitChunk { splitChunk: "test.test", keyPattern: { _id: 1.0 }, min: { _id: 1.0 }, max: { _id: MaxKey }, from: "shard01", splitKeys: [ { _id: 4.0 } ], shardId: "test.test-_id_1.0", configdb: "genique:11114,genique:11115,genique:11116", epoch: ObjectId('5647c30eab64f42ca072e6ca') } ntoreturn:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:95 locks:{ Global: { acquireCount: { r: 4, w: 2 } }, MMAPV1Journal: { acquireCount: { r: 1, w: 2 } }, Database: { acquireCount: { r: 1, w: 2 } }, Collection: { acquireCount: { R: 1, W: 2 } } } 613ms
        
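      The 15-minute figure is the distributed lock takeover timeout (the 900000 ms in the "forcing lock" line above). Assuming the standard config.locks/config.lockpings schema, the elapsed time since the lock holder last pinged can be checked by hand:

        mongos> var l = db.getSiblingDB("config").locks.findOne({ _id: "test.test" })
        mongos> var p = db.getSiblingDB("config").lockpings.findOne({ _id: l.process })
        mongos> new Date() - p.ping    // ms since last ping; takeover is allowed past 900000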

People

    • Assignee: backlog-server-sharding [DO NOT USE] (Backlog - Sharding Team)
    • Reporter: Kevin Pulo (kevin.pulo@mongodb.com)
    • Votes: 0
    • Watchers: 7