Details
- Bug
- Status: Closed
- Major - P3
- Resolution: Fixed
- None
- Fully Compatible
- ALL
Description
ISSUE SUMMARY
MongoDB running with the WiredTiger storage engine, under high load with append-only workloads and no reads, may fail to find pages to evict from cache and hang.
USER IMPACT
mongod keeps running but becomes unresponsive.
WORKAROUNDS
Once the process becomes stuck, mongod must be restarted.
AFFECTED VERSIONS
MongoDB 3.0.0 through 3.0.6
FIX VERSION
The fix is included in the 3.0.7 production release.
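The affected range listed above can be expressed as a small version check. This is a minimal sketch: the function names `parse_version` and `is_affected` are hypothetical, and the check covers only the 3.0.x production range stated in this ticket, not development builds.

```python
import re


def parse_version(text):
    """Extract (major, minor, patch) from a string such as 'v3.0.6'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def is_affected(version_text):
    """True if the version is in the affected range 3.0.0 through 3.0.6."""
    return (3, 0, 0) <= parse_version(version_text) <= (3, 0, 6)
```

For example, `is_affected("3.0.6")` is true and `is_affected("3.0.7")` is false, matching the fix version above.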
Configuration:
3-member replica set
db version v3.1.7-pre-
git version: 4cf56d86a386039839dc10bb761bd28c829be426
Two problems:
1) The primary node is up and running but cannot perform any CRUD operations (mongostat and operations such as db. . insert({}) hang); however, failover did not occur.
2) WiredTiger runs an endless loop in mongod!__wt_tree_walk and blocks CRUD operations, with no timeout or watchdog for robustness (see the debugger output for the lock owner).
0:460> !cs -l
-----------------------------------------
DebugInfo = 0x000000de00a25740
Critical section = 0x000000de7fc780c0 (+0xDE7FC780C0)
LOCKED
LockCount = 0x0
WaiterWoken = No
OwningThread = 0x000000000000097c
RecursionCount = 0x1
LockSemaphore = 0xD8C
SpinCount = 0x0000000000000fa0

2 Id: 11d4.97c Suspend: 1 Teb: 00007ff7`4fe68000 Unfrozen
Child-SP RetAddr Call Site
000000de`01dafc30 00007ff7`51567749 mongod!__wt_tree_walk+0x1a8 [c:\data\mci\src\src\third_party\wiredtiger\src\btree\bt_walk.c @ 243]
000000de`01dafcc0 00007ff7`515672e7 mongod!__evict_walk_file+0x329 [c:\data\mci\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 1154]
000000de`01dafd60 00007ff7`51566764 mongod!__evict_walk+0x2b7 [c:\data\mci\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 1032]
000000de`01dafdf0 00007ff7`51566d5b mongod!__evict_lru_walk+0x24 [c:\data\mci\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 789]
000000de`01dafe20 00007ff7`51566f58 mongod!__evict_pass+0x25b [c:\data\mci\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 502]
000000de`01dafe80 00007ffb`5e534f7f mongod!__evict_server+0x38 [c:\data\mci\src\src\third_party\wiredtiger\src\evict\evict_lru.c @ 169]
000000de`01dafeb0 00007ffb`5e535126 MSVCR120!beginthreadex+0x107
000000de`01dafee0 00007ffb`6d3f15dd MSVCR120!endthreadex+0x192
000000de`01daff10 00007ffb`6d7343d1 KERNEL32!BaseThreadInitThunk+0xd
000000de`01daff40 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
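The report notes that the eviction walk loops with no timeout or watchdog. As an illustration of what such a check could look like in general, here is a minimal sketch of a progress-based stall detector. It is not WiredTiger code; the class name `StallDetector`, its parameters, and the idea of feeding it a progress counter are all assumptions made for this example.

```python
import time


class StallDetector:
    """Flags a stall when a progress counter stops advancing for too long."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_value = None
        self._last_change = clock()

    def observe(self, progress_value):
        """Record the latest counter value; return True if work looks stalled."""
        now = self._clock()
        if progress_value != self._last_value:
            # Progress was made: remember the value and reset the timer.
            self._last_value = progress_value
            self._last_change = now
            return False
        # No progress: stalled once the timeout has elapsed.
        return (now - self._last_change) >= self._timeout_s
```

A monitoring thread could call `observe()` periodically with a counter such as the number of pages evicted so far, and raise an alert or abort the walk when it returns true. The clock is passed in so the detector can be exercised without real waiting.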
RS.Status
EitanRs3a:PRIMARY> rs.status()
{
    "set" : "EitanRs3a",
    "date" : ISODate("2015-08-18T14:45:29.611Z"),
    "myState" : 1,
    "term" : NumberLong(0),
    "heartbeatIntervalMillis" : NumberLong(2000),
    "members" : [
        {
            "_id" : 0,
            "name" : "eitan5:5002",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 66715,
            "optime" : Timestamp(1439894846, 12455),
            "optimeDate" : ISODate("2015-08-18T10:47:26Z"),
            "electionTime" : Timestamp(1439842421, 2),
            "electionDate" : ISODate("2015-08-17T20:13:41Z"),
            "configVersion" : 3,
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "Eitan1:5002",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 66704,
            "optime" : Timestamp(1439894723, 4673),
            "optimeDate" : ISODate("2015-08-18T10:45:23Z"),
            "lastHeartbeat" : ISODate("2015-08-18T14:45:29.030Z"),
            "lastHeartbeatRecv" : ISODate("2015-08-18T14:45:28.841Z"),
            "pingMs" : 2,
            "electionTime" : Timestamp(1439906145, 1),
            "electionDate" : ISODate("2015-08-18T13:55:45Z"),
            "configVersion" : 3
        },
        {
            "_id" : 2,
            "name" : "Eitan6:5002",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 66663,
            "optime" : Timestamp(1439893521, 7147),
            "optimeDate" : ISODate("2015-08-18T10:25:21Z"),
            "lastHeartbeat" : ISODate("2015-08-18T14:45:29.041Z"),
            "lastHeartbeatRecv" : ISODate("2015-08-18T14:45:28.844Z"),
            "pingMs" : 1,
            "syncingTo" : "eitan5:5002",
            "configVersion" : 3
        }
    ],
    "ok" : 1
}
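The rs.status() output above lists two members with stateStr PRIMARY, and every optime is several hours behind the status date. A minimal sketch of how such a snapshot could be checked follows. The helper names `find_primaries` and `optime_lag_seconds` are hypothetical, and the values are copied by hand from the output above (the seconds part of each optime Timestamp, and the status date truncated to whole seconds).

```python
from datetime import datetime, timezone

# "date" field of the rs.status() output, truncated to whole seconds.
STATUS_DATE = datetime(2015, 8, 18, 14, 45, 29, tzinfo=timezone.utc)

# name, stateStr, and the seconds part of "optime" for each member.
MEMBERS = [
    {"name": "eitan5:5002", "stateStr": "PRIMARY", "optime_secs": 1439894846},
    {"name": "Eitan1:5002", "stateStr": "PRIMARY", "optime_secs": 1439894723},
    {"name": "Eitan6:5002", "stateStr": "SECONDARY", "optime_secs": 1439893521},
]


def find_primaries(members):
    """Return the names of all members reporting themselves as PRIMARY."""
    return [m["name"] for m in members if m["stateStr"] == "PRIMARY"]


def optime_lag_seconds(members, status_date):
    """Return, per member, how far its last optime trails the status date."""
    now = int(status_date.timestamp())
    return {m["name"]: now - m["optime_secs"] for m in members}


primaries = find_primaries(MEMBERS)
lags = optime_lag_seconds(MEMBERS, STATUS_DATE)
```

More than one name in `primaries`, or a large value in `lags` for the primary, is the kind of signal that would show the stuck state from the outside even though every member still reports health 1.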
Issue Links
- is depended on by: WT-1973 MongoDB changes for WiredTiger 2.7.0 (Closed)