- Type: Bug
- Resolution: Done
- Priority: Critical - P2
- Affects Version/s: 2.6.1
- Component/s: Sharding
We have a sharded cluster with the following configuration:
2 shard servers (shard-db1, shard-db2)
3 config servers (on shard-db1, shard-db2 and app1)
2 mongos servers (app1, app2)
The collection position is sharded across shard-db1 and shard-db2.
A week ago, something very strange happened.
According to line 379 of shardA.log, a command was issued to delete records in the following date range:
2015-02-11T04:56:35.564-0500 [conn52] remove ds-db.position query: { dt: { $gte: new Date(1412136000000), $lte: new Date(1424840400000) } } keyUpdates:0 numYields:0 locks(micros) w:169 234ms
But according to line 603 of shardA.log, records were deleted for a much larger date range:
2015-02-11T05:27:15.556-0500 [conn43] remove ds-db.position query: { dt: { $gte: new Date(1364702400000), $lte: new Date(1424840400000) } } ndeleted:2375989 keyUpdates:0 numYields:37161 locks(micros) w:3561770402 2200618ms
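To make the two ranges easier to compare, the `new Date(...)` arguments can be converted from epoch milliseconds to calendar dates. The following is a minimal Python sketch, not part of the original report; the helper name `from_epoch_ms` and the dictionary labels are made up for illustration, and only the three numeric values come from the log lines above.

```python
from datetime import datetime, timezone

def from_epoch_ms(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

# Bounds copied from the two remove log lines (labels are illustrative).
bounds = {
    "narrow $gte (line 379)": 1412136000000,
    "wide $gte (line 603)": 1364702400000,
    "shared $lte": 1424840400000,
}

for label, ms in bounds.items():
    print(f"{label}: {from_epoch_ms(ms).isoformat()}")

# Distance between the two lower bounds, in whole days.
gap_days = (bounds["narrow $gte (line 379)"] - bounds["wide $gte (line 603)"]) // 86_400_000
print(f"lower bounds differ by {gap_days} days")
```

In UTC, the narrow lower bound is 2014-10-01T04:00:00, the wide lower bound is 2013-03-31T04:00:00, and the shared upper bound is 2015-02-25T05:00:00, so the second remove reaches back 549 days further than the first.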
The same larger range also appears in ShardB.log, where the command took roughly 100 minutes (5976240 ms):
2015-02-11T06:30:11.176-0500 [conn45] remove ds-db.position query: { dt: { $gte: new Date(1364702400000), $lte: new Date(1424840400000) } } ndeleted:5822749 keyUpdates:0 numYields:110266 locks(micros) w:6417752235 5976240ms
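When comparing these entries, it may help to estimate when each remove actually started. The sketch below assumes that the timestamp on each log line marks when the operation finished and that the trailing `...ms` value is its total duration; that is an assumption about the log format, not something stated in the logs. The function name `estimated_start` and the entry labels are illustrative.

```python
from datetime import datetime, timedelta

def estimated_start(logged_at: str, duration_ms: int) -> datetime:
    """Subtract the reported duration from the log timestamp.

    Assumes the log line is written when the operation completes.
    """
    finished = datetime.strptime(logged_at, "%Y-%m-%dT%H:%M:%S.%f%z")
    return finished - timedelta(milliseconds=duration_ms)

# (label, timestamp, duration in ms) copied from the three remove log lines.
entries = [
    ("shardA.log line 379", "2015-02-11T04:56:35.564-0500", 234),
    ("shardA.log line 603", "2015-02-11T05:27:15.556-0500", 2200618),
    ("ShardB.log", "2015-02-11T06:30:11.176-0500", 5976240),
]

for label, logged_at, duration_ms in entries:
    start = estimated_start(logged_at, duration_ms)
    minutes = duration_ms / 60_000
    print(f"{label}: started ~{start.isoformat(timespec='milliseconds')} "
          f"({minutes:.1f} min)")
```

Under that assumption, both wide-range removes would have started at about 04:50:35 local time, within a few milliseconds of each other and roughly six minutes before the narrow-range remove was logged at 04:56:35. If so, the wide-range remove may have been a separate, earlier command rather than a rewritten version of the narrow one.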
So we have no clue how the delete command picked 1364702400000 as its start date instead of 1412136000000.
Thanks