-
Type:
Bug
-
Resolution: Done
-
Priority:
Major - P3
-
None
-
Affects Version/s: 2.4.5
-
Component/s: None
-
None
-
Environment:CentOS 6.4 x86_64 + pymongo 2.5.2
-
Linux
-
None
-
None
-
None
-
None
-
None
-
None
-
None
The issue only came about when we moved from a single mongod instance to a sharded environment. After a few million inserts I start getting the following OperationFailure error and inserts slow down to a crawl.
I assume this is a bug as the source states:
// it should be impossible because once we call GLE, no other
// writebacks should happen with that connection id
OperationFailure Errors:
ERROR: error inserting 1 documents to shard shard0002:mongo3:27017 at version 16|103||5212c053b8650139d3328156 :: caused by :: got writeback waitfor for older id oid: 5212d98f0000000000000021 s.id: 52133f4c0000000000000082 ident: localhost.localdomain:2
ERROR: error inserting 1 documents to shard shard0001:mongo2:27017 at version 16|103||5212c053b8650139d3328156 :: caused by :: got writeback waitfor for older id oid: 5212d975000000000000001c s.id: 52133f4c0000000000000083 ident: localhost.localdomain:5
ERROR: error inserting 3 documents to shard shard0001:mongo2:27017 at version 16|103||5212c053b8650139d3328156 :: caused by :: got writeback waitfor for older id oid: 5212d98f0000000000000020 s.id: 52133f4c000000000000007d ident: localhost.localdomain:36
(It happens over and over)
mongos log output:
Tue Aug 20 16:36:00.735 [conn58] warning: exception during insert (continue on
error set) :: caused by :: 16460 error inserting 1 documents to shard shard000
0:mongo1:27017 at version 16|103||5212c053b8650139d3328156 :: caused by :: got
writeback waitfor for older id oid: 5212d975000000000000001c s.id: 52133f4c00
00000000000083 ident: localhost.localdomain:5
Tue Aug 20 16:36:00.735 [conn58] warning: swallowing exception during insert
:: caused by :: 16460 error inserting 1 documents to shard
shard0000:mongo1:2701
7 at version 16|103||5212c053b8650139d3328156 :: caused by :: got writeback
waitfor for older id oid: 5212d975000000000000001c s.id:
52133f4c000000000000008
3 ident: localhost.localdomain:5
Tue Aug 20 16:36:00.736 [conn58] Assertion: 14041:got writeback waitfor for
older id oid: 5212d975000000000000001c s.id: 52133f4c0000000000000083 ident:
loc
alhost.localdomain:5
0xa86db1 0xa4cbbb 0xa4d0fc 0x9f2b4f 0x8d376f 0x8d7413 0x9c2960 0x9c522c
0x994011 0x66a161 0xa7351e 0x3bb3e07851 0x3bb36e890d
mongos(_ZN5mongo15printStackTraceERSo+0x21) [0xa86db1]
mongos(_ZN5mongo11msgassertedEiPKc+0x9b) [0xa4cbbb]
mongos() [0xa4d0fc]
mongos(_ZN5mongo17WriteBackListener7waitForERKNS0_15ConnectionIdentERKNS_3OIDE+0xa7f)
[0x9f2b4f]
mongos(_ZN5mongo10ClientInfo17_handleWriteBacksERKSt6vectorINS0_6WBInfoESaIS2_EEb+0x9f)
[0x8d376f]
mongos(_ZN5mongo10ClientInfo12getLastErrorERKSsRKNS_7BSONObjERNS_14BSONObjBuilderERSsb+0x15d3)
[0x8d7413]
mongos(_ZN5mongo13ShardStrategy7_insertERKSsRNS_9DbMessageEiRNS_7RequestE+0x640)
[0x9c2960]
mongos(_ZN5mongo13ShardStrategy7writeOpEiRNS_7RequestE+0x53c) [0x9c522c]
mongos(_ZN5mongo7Request7processEi+0xd1) [0x994011]
mongos(_ZN5mongo21ShardedMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE+0x71)
[0x66a161]
mongos(_ZN5mongo17PortMessageServer17handleIncomingMsgEPv+0x42e) [0xa7351e]
/lib64/libpthread.so.0() [0x3bb3e07851]
/lib64/libc.so.6(clone+0x6d) [0x3bb36e890d]
It's probably also worth noting that the python script doing these inserts is multiprocess.