Type: Bug
Resolution: Incomplete
Priority: Major - P3
Affects Version/s: 3.0.4
Component/s: MapReduce, Sharding, WiredTiger
We hit an error similar to SERVER-16429 on version 3.0.4.
It happens once a week in a sharded environment where the mongod instances run the WiredTiger (WT) storage engine.
The evidence is a crashed shard, while on the other shard a leftover tmp collection remains that was not deleted.
The error log from the failed shard follows:
2015-12-12T00:35:40.814+0000 I COMMAND [conn92487] mr failed, removing collection :: caused by :: WriteConflict
2015-12-12T00:35:40.818+0000 I COMMAND [conn92487] CMD: drop XXXX.tmp.mr.account_231015
2015-12-12T00:35:40.822+0000 I NETWORK [initandlisten] connection accepted from XXX.XXX.XXX.XXX:XXXXX #110728 (47 connections now open)
2015-12-12T00:35:40.920+0000 I COMMAND [conn92487] command XXXX.$cmd command: drop { drop: "tmp.mr.account_231015" } ntoreturn:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:122 locks:{ Global: { acquireCount: { r: 8, w: 4 } }, Database: { acquireCount: { r: 1, w: 1, R: 1, W: 4 }, acquireWaitCount: { W: 4 }, timeAcquiringMicros: { W: 7627049380 } }, Collection: { acquireCount: { r: 1, w: 1, W: 1 } } } 102ms
2015-12-12T00:35:40.920+0000 I QUERY [conn110694] query XXXX.endpoints query: { $query: { gw: { $gt: 0 }, $or: [ { status: "unmanaged" }, { status: "managed" } ] }, $readPreference: { mode: "secondaryPreferred" } } planSummary: IXSCAN { gw: -1.0, status: -1.0 } ntoreturn:0 ntoskip:0 nscanned:0 nscannedObjects:0 keyUpdates:0 writeConflicts:0 numYields:1 nreturned:0 reslen:20 locks:{ Global: { acquireCount: { r: 4 } }, Database: { acquireCount: { r: 2 }, acquireWaitCount: { r: 2 }, timeAcquiringMicros: { r: 3690878996 } }, Collection: { acquireCount: { r: 2 } } } 106ms
2015-12-12T00:35:40.920+0000 I QUERY [conn110692] query XXXX.endpoints query: { $query: { gw: { $gt: 0 }, $or: [ { status: "unmanaged" }, { status: "managed" } ] }, $readPreference: { mode: "secondaryPreferred" } } planSummary: IXSCAN { gw: -1.0, status: -1.0 } ntoreturn:0 ntoskip:0 nscanned:0 nscannedObjects:0 keyUpdates:0 writeConflicts:0 numYields:3 nreturned:0 reslen:20 locks:{ Global: { acquireCount: { r: 8 } }, Database: { acquireCount: { r: 4 }, acquireWaitCount: { r: 4 }, timeAcquiringMicros: { r: 3690908386 } }, Collection: { acquireCount: { r: 4 } } } 182338ms
2015-12-12T00:35:40.927+0000 I COMMAND [conn92491] command admin.$cmd command: listDatabases { listDatabases: 1 } ntoreturn:1 keyUpdates:0 writeConflicts:0 numYields:0 reslen:290 locks:{ Global: { acquireCount: { r: 6 } }, Database: { acquireCount: { r: 3 }, acquireWaitCount: { r: 1 }, timeAcquiringMicros: { r: 14825915727 } } } 179101ms
2015-12-12T00:35:40.978+0000 E NETWORK [conn92487] Uncaught std::exception: std::exception, terminating
2015-12-12T00:35:40.978+0000 I CONTROL [conn92487] dbexit: rc: 100
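For anyone triaging the log above, here is a minimal Python sketch that splits a pasted run of mongod log text into individual entries and pulls out the connection id, the reported duration, and the timeAcquiringMicros values. It is not part of the original report. The helper names (split_entries, summarize) are hypothetical, and the regular expressions assume the 3.0-style plain-text log layout seen in this ticket, with no timestamp-like text inside an entry.

```python
import re

# Timestamp that opens every entry in the log above (assumed 3.0-style layout).
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}")
CONNECTION = re.compile(r"\[(\w+)\]")
DURATION = re.compile(r" (\d+)ms$")
LOCK_WAIT_BLOCK = re.compile(r"timeAcquiringMicros: \{ ([^}]*)\}")
MODE_VALUE = re.compile(r"(\w): (\d+)")


def split_entries(blob):
    """Split pasted log text into one string per entry, cutting at timestamps."""
    joined = " ".join(blob.split())  # collapse stray line breaks inside entries
    starts = [m.start() for m in TIMESTAMP.finditer(joined)]
    ends = starts[1:] + [len(joined)]
    return [joined[a:b].strip() for a, b in zip(starts, ends)]


def summarize(entry):
    """Return (connection, duration_ms or None, {lock mode: microseconds})."""
    conn = CONNECTION.search(entry)
    duration = DURATION.search(entry)
    waits = {}
    for body in LOCK_WAIT_BLOCK.findall(entry):
        for mode, micros in MODE_VALUE.findall(body):
            # Sum per mode in case several lock levels report the same mode.
            waits[mode] = waits.get(mode, 0) + int(micros)
    return (
        conn.group(1) if conn else None,
        int(duration.group(1)) if duration else None,
        waits,
    )
```

Calling summarize on the listDatabases entry from conn92491 returns ('conn92491', 179101, {'r': 14825915727}). Entries without a trailing duration, such as the "mr failed" line, return None for the duration and an empty dict for the lock waits.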