[SERVER-15691] acquiring balancer lock may fail and get stuck with concurrent write traffic Created: 16/Oct/14  Updated: 25/Apr/16  Resolved: 22/Jan/15

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.4.12, 2.6.5, 2.7.8
Fix Version/s: 3.0.0-rc6

Type: Bug Priority: Critical - P2
Reporter: Rui Zhang (Inactive) Assignee: Randolph Tan
Resolution: Done Votes: 0
Labels: 28qa
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File config-db-dump-1.tar.gz     File config-db-dump-2.tar.gz     File config-db-dump-3.tar.gz     File config-dump.tar.gz     PNG File imbalanced_chunks.png     File mongod-primary-shard.log.gz    
Issue Links:
Related
related to SERVER-16825 Use LocksType to parse config.locks d... Closed
related to SERVER-12548 add indexes to config.locks and lockp... Closed
is related to SERVER-16996 DistributedLock assumes that if a loc... Closed
is related to SERVER-17000 Drop the { ts: 1 } index during --upg... Closed
Tested
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:
  • Set up a sharded cluster with 3 shards / 3 config servers / 1 mongos
  • Start concurrent insert traffic (more than 10 connections)
  • Check chunk distribution regularly; there is roughly a 50% chance that chunks are not balanced
Participants:

 Description   

During concurrent insert testing, chunks were observed to stay on the primary shard without being balanced, with an assertion error in the log file.



 Comments   
Comment by Randolph Tan [ 22/Jan/15 ]

Background:

Whenever a new lock document is created, it is initialized with a ts of OID(0).

Issue:

When multiple threads try to create a new lock document, some of them will fail to insert it because the config.locks collection has a unique index over the { ts: 1 } field. Depending on the interleaving of the threads, it is possible that the unique index violation will be triggered on the 2nd or 3rd config server. In that case, the lock will be left in a state where it can never be acquired without manual intervention. Note that lock documents are never deleted by the system; they are set to the unlocked state after they are released, so this particular issue can only happen when a lock is about to be used for the first time.
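To illustrate the failure mode, a minimal sketch in the mongo shell (document fields and names here are illustrative, not taken from an actual lock document). The multi-config-server interleaving is what leaves the lock unrecoverable, but the underlying duplicate-key failure can be reproduced on a single node:

// Pre-fix state: config.locks has a unique index on { ts: 1 }.
use config
db.locks.ensureIndex({ ts: 1 }, { unique: true })

// Thread A creates a brand-new lock document, initialized with ts = OID(0):
db.locks.insert({ _id: "balancer", state: 0, ts: ObjectId("000000000000000000000000") })
// => succeeds

// Thread B races to create a different new lock document, also initialized with ts = OID(0):
db.locks.insert({ _id: "sbtest.sbtest1", state: 0, ts: ObjectId("000000000000000000000000") })
// => fails with an E11000 duplicate key error on the { ts: 1 } unique index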

Fix:

Make the index non-unique. Existing clusters need to manually drop their { ts: 1 } index and rebuild it with { unique: false }.
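A sketch of that manual step in the mongo shell, assuming it is run against the config database on each config server (the exact upgrade procedure is tracked separately, see SERVER-17000 in the issue links):

use config
db.locks.dropIndex({ ts: 1 })
db.locks.ensureIndex({ ts: 1 })   // recreate without { unique: true }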

Comment by Daniel Pasette (Inactive) [ 22/Jan/15 ]

randolph is making a follow-on ticket for the work involved in having upgrading users drop their existing { ts: 1 } index in config.locks.

Comment by Githook User [ 22/Jan/15 ]

Author: Randolph Tan (renctan) <randolph@10gen.com>

Message: SERVER-15691 acquiring balancer lock may fail and get stuck with concurrent write traffic

Make config.locks { ts: 1 } not unique
Branch: master
https://github.com/mongodb/mongo/commit/78c3e1803f96d045e91bb2efb9163eee9f43d237

Comment by Randolph Tan [ 14/Jan/15 ]

Based on the latest config dump, the lock document is missing on the third config server. I still haven't figured out why the upsert was not sent to the third config server. The primary shard logs in the attachment appear to be from a different instance than the one in the config dump, as the processID for the lock doesn't match.

Comment by Rui Zhang (Inactive) [ 13/Jan/15 ]

renctan here are three dumps from a new test.

These are the errors from this run in the primary shard log:

[SHARDS-2 ~]$ cat /data/logs/mongod.log | grep -i error | grep expected
2015-01-13T19:30:03.086+0000 W SHARDING [conn29] could not acquire collection lock for sbtest.sbtest9 to split chunk [{ : MinKey },{ : MaxKey }) :: caused by :: distributed lock sbtest.sbtest9/ip-172-31-35-229:27017:1421177402:1559071435 had errors communicating with individual server 172.31.39.168:27019 :: caused by :: field not found, expected type 7
2015-01-13T19:30:03.118+0000 W SHARDING [conn19] could not acquire collection lock for sbtest.sbtest6 to split chunk [{ : MinKey },{ : MaxKey }) :: caused by :: distributed lock sbtest.sbtest6/ip-172-31-35-229:27017:1421177402:1559071435 had errors communicating with individual server 172.31.39.168:27019 :: caused by :: field not found, expected type 7
2015-01-13T19:30:03.238+0000 W SHARDING [conn22] could not acquire collection lock for sbtest.sbtest1 to split chunk [{ : MinKey },{ : MaxKey }) :: caused by :: distributed lock sbtest.sbtest1/ip-172-31-35-229:27017:1421177402:1559071435 had errors communicating with individual server 172.31.39.168:27019 :: caused by :: field not found, expected type 7

Comment by Julian Wissmann [ 15/Dec/14 ]

I can confirm this also happens with fewer than 10 concurrent connections. I've seen it with 5 concurrent worker threads inserting from Java. I open one MongoClient and have each thread hold its own DBCollection. Is it advisable to move to each thread holding its own client instance?

Comment by Randolph Tan [ 20/Oct/14 ]

rui.zhang Would you be able to also provide the test script that can reproduce this if you have one?

Thanks!

Comment by Rui Zhang (Inactive) [ 16/Oct/14 ]

This is one chart of chunk distribution with 2.7.8-pre- (attached as imbalanced_chunks.png).

Sometimes the primary shard could hold all ~60 chunks.

The test was run with 2.7.8-pre- and 2.6.5; the issue can be seen with both.

From the log:

1) 1-2 seconds after I start insert traffic, I get a splitChunk failure:

2014-10-14T23:19:03.851+0000 I NETWORK  [mongosMain] connection accepted from 127.0.0.1:45155 #14 (12 connections now open)
2014-10-14T23:19:04.132+0000 W SHARDING [conn10] splitChunk failed - cmd: { splitChunk: "htest1.htest1", keyPattern: { _id: "hashed" }, min: { _id: MinKey }, max: { _id: MaxKey }, from: "shard0001", splitKeys: [ { _id: 848411777775835583 }, { _id: 5728680133961294070 } ], shardId: "htest1.htest1-_id_MinKey", configdb: "172.31.38.156:27019,172.31.39.168:27019,172.31.35.248:27019", epoch: ObjectId('543daf66384288cc1bd7b599') } result: { ok: 0.0, errmsg: "could not acquire collection lock for htest1.htest1 to split chunk [{ : MinKey },{ : MaxKey }) :: caused by :: Lock for splitting chunk [{ : MinKey },..." }

2) Later, saw an assertion backtrace like this:

2014-10-14T23:24:52.560+0000 I SHARDING [LockPinger] cluster 172.31.38.156:27019,172.31.39.168:27019,172.31.35.248:27019 pinged successfully at Tue Oct 14 23:24:52 2014 by distributed lock pinger '172.31.38.156:27019,172.31.39.168:27019,172.31.35.248:27019/ip-172-31-37-78:27017:1413328215:1804289383', sleeping for 30000ms
2014-10-14T23:25:16.909+0000 I SHARDING [Balancer] forcing lock 'balancer/ip-172-31-37-78:27017:1413324528:1804289383' because elapsed time 900458 > takeover time 900000
2014-10-14T23:25:17.073+0000 W SHARDING [Balancer] lock forcing balancer/ip-172-31-37-78:27017:1413324528:1804289383 inconsistent
2014-10-14T23:25:17.073+0000 I SHARDING [Balancer] lock 'balancer/ip-172-31-37-78:27017:1413324528:1804289383' successfully forced
2014-10-14T23:25:17.209+0000 W SHARDING [Balancer] distributed lock 'balancer/ip-172-31-37-78:27017:1413328215:1804289383 did not propagate properly. :: caused by :: 8017 update not consistent  ns: config.locks query: { _id: "balancer", state: 0 } update: { $set: { state: 1, who: "ip-172-31-37-78:27017:1413328215:1804289383:Balancer:1681692777", process: "ip-172-31-37-78:27017:1413328215:1804289383", when: new Date(1413329117073), why: "doing balance round", ts: ObjectId('543db0dd384288cc1bd7b59b') } } gle1: { lastOp: Timestamp 1413329117000|2, connectionId: 16, updatedExisting: true, n: 1, syncMillis: 14, writtenTo: null, err: null, waited: 14, ok: 1.0 } gle2: { lastOp: Timestamp 1413328216000|6, connectionId: 12, updatedExisting: false, n: 0, syncMillis: 23, writtenTo: null, err: null, waited: 23, ok: 1.0 }
2014-10-14T23:25:17.210+0000 I -        [Balancer] Assertion: 13111:field not found, expected type 7
2014-10-14T23:25:17.213+0000 I -        [Balancer]
 0xbc4fb9 0xb76d81 0xb5d48f 0xb5d53c 0x9d0392 0xa92ee0 0x9f43fd 0xb5fa3c 0xc11ae4 0x7f5a3b090f18 0x7f5a3a1a2b9d
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"400000","o":"7C4FB9"},{"b":"400000","o":"776D81"},{"b":"400000","o":"75D48F"},{"b":"400000","o":"75D53C"},{"b":"400000","o":"5D0392"},{"b":"400000","o":"692EE0"},{"b":"400000","o":"5F43FD"},{"b":"400000","o":"75FA3C"},{"b":"400000","o":"811AE4"},{"b":"7F5A3B089000","o":"7F18"},{"b":"7F5A3A0C0000","o":"E2B9D"}],"processInfo":{ "mongodbVersion" : "2.7.8-pre-", "gitVersion" : "6c25c480af3ac850d3e6637ac952cfaabb728301", "uname" : { "sysname" : "Linux", "release" : "3.10.42-52.145.amzn1.x86_64", "version" : "#1 SMP Tue Jun 10 23:46:43 UTC 2014", "machine" : "x86_64" }, "somap" : [ { "elfType" : 2, "b" : "400000", "buildId" : "8F306CC0685A06E5E8DA3FAC5BF0B446B658D22B" }, { "b" : "7FFF1E8FE000", "elfType" : 3, "buildId" : "422398C00613A4323A5E0C656ED6BE4BCFF5FD15" }, { "b" : "7F5A3B089000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "A8E8015F84374807F6920DF95D16BBAAAC02AB41" }, { "b" : "7F5A3AE81000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "CC3700E5EACA3A520D1FD1BA3B469F2EFB1B8F11" }, { "b" : "7F5A3AC7D000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "3D5EB706F159BCCB9A3B3E5F528B512E2C500F90" }, { "b" : "7F5A3A979000", "path" : "/usr/lib64/libstdc++.so.6", "elfType" : 3, "buildId" : "DD6383EEAC49E9BAA9E3D1080AE932F42CF8A385" }, { "b" : "7F5A3A67B000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "226E54A756FDC288551092B4FDE818C8D6D6FC04" }, { "b" : "7F5A3A465000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "C52958E393BDF8E8D090F36DE0F4E620D8736FBF" }, { "b" : "7F5A3A0C0000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "AD1B9E9119DEDF5A95178C0EA240EEBDE71147B6" }, { "b" : "7F5A3B2A5000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "37F70DE2ABEA184F24518EB5CC538FF0F54E081F" } ] }}
 mongos(_ZN5mongo15printStackTraceERSo+0x29) [0xbc4fb9]
 mongos(_ZN5mongo10logContextEPKc+0xE1) [0xb76d81]
 mongos(_ZN5mongo11msgassertedEiPKc+0xAF) [0xb5d48f]
 mongos(+0x75D53C) [0xb5d53c]
 mongos(_ZNK5mongo11BSONElement3OIDEv+0x1C2) [0x9d0392]
 mongos(_ZN5mongo15DistributedLock8lock_tryERKSsbPNS_7BSONObjEd+0x6D70) [0xa92ee0]
 mongos(_ZN5mongo8Balancer3runEv+0x5DD) [0x9f43fd]
 mongos(_ZN5mongo13BackgroundJob7jobBodyEv+0x11C) [0xb5fa3c]
 mongos(+0x811AE4) [0xc11ae4]
 libpthread.so.0(+0x7F18) [0x7f5a3b090f18]
 libc.so.6(clone+0x6D) [0x7f5a3a1a2b9d]
-----  END BACKTRACE  -----
2014-10-14T23:25:17.213+0000 I NETWORK  [Balancer] scoped connection to 172.31.39.168:27019 not being returned to the pool
2014-10-14T23:25:17.213+0000 I NETWORK  [Balancer] scoped connection to 172.31.38.156:27019,172.31.39.168:27019,172.31.35.248:27019 not being returned to the pool
2014-10-14T23:25:17.213+0000 I SHARDING [Balancer] caught exception while doing balance: distributed lock balancer/ip-172-31-37-78:27017:1413328215:1804289383 had errors communicating with individual server 172.31.39.168:27019 :: caused by :: field not found, expected type 7
2014-10-14T23:25:22.709+0000 I SHARDING [LockPinger] cluster 172.31.38.156:27019,172.31.39.168:27019,172.31.35.248:27019 pinged successfully at Tue Oct 14 23:25:22 2014 by distributed lock pinger '172.31.38.156:27019,172.31.39.168:27019,172.31.35.248:27019/ip-172-31-37-78:27017:1413328215:1804289383', sleeping for 30000ms
