-
Type:
Bug
-
Resolution: Cannot Reproduce
-
Priority:
Major - P3
-
None
-
Affects Version/s: None
-
Component/s: Networking, Security
-
None
-
ALL
-
None
-
3
-
None
-
None
-
None
-
None
-
None
-
None
Hello!
I use mongodb 2.2.1 with 1 mongos and 2 shards, 3 replicas each.
We had a major network issue, when mongos and two(one from each shard) mongod's became isolated. When network was up again, we experienced following problem:
Wed Dec 26 20:02:37 [conn1183] authenticate db: partition-validator
{ authenticate: 1, user: "auto", nonce: "5b3de430aaacf9a5", key: "6eb58aa336bfc1eef5779a04761f83d3" }Wed Dec 26 20:02:37 [conn1183] auth: couldn't find user auto, partition-validator.system.users
Wed Dec 26 20:02:37 [conn1183] end connection 192.168.101.11:48742 (103 connections now open)
Wed Dec 26 20:02:47 [Balancer] distributed lock 'balancer/cs-mongodb01g:27000:1356537611:1804289383' acquired, ts : 50db1fa69f69900c8fba7b3b
Wed Dec 26 20:02:47 [Balancer] distributed lock 'balancer/cs-mongodb01g:27000:1356537611:1804289383' unlocked.
Wed Dec 26 20:02:56 [mongosMain] connection accepted from 192.168.101.11:48747 #1184 (104 connections now open)
Wed Dec 26 20:02:56 [conn1184] end connection 192.168.101.11:48747 (103 connections now open)
Wed Dec 26 20:02:56 [mongosMain] connection accepted from 192.168.101.11:48748 #1185 (104 connections now open)
Wed Dec 26 20:02:56 [conn1185] end connection 192.168.101.11:48748 (103 connections now open)
Wed Dec 26 20:02:56 [mongosMain] connection accepted from 192.168.101.11:48749 #1186 (104 connections now open)
Wed Dec 26 20:02:56 [conn1186] authenticate db: partition-validator
Wed Dec 26 20:02:56 [conn1186] auth: couldn't find user auto, partition-validator.system.users
Wed Dec 26 20:02:56 [conn1186] end connection 192.168.101.11:48749 (103 connections now open)
Network started about 19:00 and was fixed exactly in 20:00, authentication was ok before 20:00, during network fail too.
Some logs from the interval 19:00-20:00
Wed Dec 26 19:48:44 [conn16453835] Socket say send() errno:32 Broken pipe 192.168.101.16:27017
Wed Dec 26 19:48:44 [conn16453835] warning: could not get last error from a shard cs_mongodb-sh2/cs-mongodb02d:27017,cs-mongodb02f:27017,cs-mongodb02g:27017 :: caused by :: socket exception [SEND_ERROR] for 192.168.101.16:27017
Wed Dec 26 19:48:51 [Balancer] warning: distributed lock 'balancer/cs-mongodb01g:27000:1355404365:864101839 did not propagate properly. :: caused by :: 8017 update not consistent ns: config.locks query
:
update: { $set:
{ state: 1, who: "cs-mongodb01g:27000:1355404365:864101839:Balancer:636453333", process: "cs-mongodb01g:27000:13 55404365:864101839", when: new Date(1356536930010), why: "doing balance round", ts: ObjectId('50db1c61326d9641ed1f7228') }} gle1:
{ updatedExisting: true, n: 1, connectionId: 36369, waited: 34, err: nu ll, ok: 1.0 }gle2:
{ updatedExisting: false, n: 0, connectionId: 38633, waited: 18, err: null, ok: 1.0 }Wed Dec 26 19:48:51 [Balancer] lock update lost, lock 'balancer/cs-mongodb01g:27000:1355404365:864101839' not propagated.
Wed Dec 26 19:48:51 [Balancer] distributed lock 'balancer/cs-mongodb01g:27000:1355404365:864101839' was not acquired.
Wed Dec 26 19:48:55 [conn16452947] Assertion failure cursorCache.getRef( cursor ).size() src/mongo/s/request.cpp 182
0x787d86 0x754d81 0x6e94fd 0x707996 0x6e9eaf 0x5149c1 0x776c49 0x7f9283d779ca 0x7f928311e70d
/usr/bin/mongos(_ZN5mongo15printStackTraceERSo+0x26) [0x787d86]
/usr/bin/mongos(_ZN5mongo12verifyFailedEPKcS1_j+0xc1) [0x754d81]
/usr/bin/mongos(_ZN5mongo7Request5replyERNS_7MessageERKSs+0x19d) [0x6e94fd]
/usr/bin/mongos(_ZN5mongo13ShardStrategy7getMoreERNS_7RequestE+0x376) [0x707996]
/usr/bin/mongos(_ZN5mongo7Request7processEi+0x14f) [0x6e9eaf]
/usr/bin/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x71) [0x5149c1]
/usr/bin/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x429) [0x776c49]
/lib/libpthread.so.0(+0x69ca) [0x7f9283d779ca]
/lib/libc.so.6(clone+0x6d) [0x7f928311e70d]
Wed Dec 26 19:49:01 [conn16453011] Assertion failure cursorCache.getRef( cursor ).size() src/mongo/s/request.cpp 182
0x787d86 0x754d81 0x6e94fd 0x707996 0x6e9eaf 0x5149c1 0x776c49 0x7f9283d779ca 0x7f928311e70d
/usr/bin/mongos(_ZN5mongo15printStackTraceERSo+0x26) [0x787d86]
/usr/bin/mongos(_ZN5mongo12verifyFailedEPKcS1_j+0xc1) [0x754d81]
/usr/bin/mongos(_ZN5mongo7Request5replyERNS_7MessageERKSs+0x19d) [0x6e94fd]
/usr/bin/mongos(_ZN5mongo13ShardStrategy7getMoreERNS_7RequestE+0x376) [0x707996]
/usr/bin/mongos(_ZN5mongo7Request7processEi+0x14f) [0x6e9eaf]
/usr/bin/mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x71) [0x5149c1]
/usr/bin/mongos(_ZN5mongo3pms9threadRunEPNS_13MessagingPortE+0x429) [0x776c49]
/lib/libpthread.so.0(+0x69ca) [0x7f9283d779ca]
/lib/libc.so.6(clone+0x6d) [0x7f928311e70d]