[SERVER-19751] WiredTiger panic halt in eviction-server
Created: 04/Aug/15 | Updated: 24/Aug/15 | Resolved: 12/Aug/15
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.0.5 |
| Fix Version/s: | 3.0.6, 3.1.7 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Michael Templeman | Assignee: | Michael Cahill (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | RF, WTmem |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | AWS Linux amzn-ami-hvm-2015.03.0.x86_64-gp2 (ami-1ecae776) on an i2.xlarge instance |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | Linux |
| Backport Completed: | |
| Participants: | |
| Description |
The primary server on the primary shard, running MongoDB 3.0.5, encountered a panic halt. Log report of the crash:
The backtrace:
We have had repeated occurrences of the primary server's cache going to 100% and remaining there until the instance is stepped down. We also experience frequent OOM crashes as the memory footprint of mongod grows beyond available memory. This happens despite setting the WiredTiger cache size to 6GB less than available memory.
| Comments |
| Comment by Githook User [ 06/Aug/15 ] |
Author: Keith Bostic (keithbostic, keith.bostic@mongodb.com)
Message: Merge pull request #2107 from wiredtiger/pthread-create-retry
| Comment by Githook User [ 05/Aug/15 ] |
Author: Keith Bostic (keithbostic, keith.bostic@mongodb.com)
Message: Merge pull request #2107 from wiredtiger/pthread-create-retry
| Comment by Githook User [ 05/Aug/15 ] |
Author: Michael Cahill (michaelcahill, michael.cahill@mongodb.com)
Message:
| Comment by Michael Templeman [ 04/Aug/15 ] |
Oops, I forgot to report that the ulimit was set to 64000, THP was turned off, and tcp_keepalive_time is set to 120.
| Comment by Michael Templeman [ 04/Aug/15 ] |
Ramon, the i2.xlarge EC2 instance has 30GB of memory. Nothing other than mongod runs on the server. mongod was started with:
I don't know what the actual memory utilization was when the process halted. Past experience has been that under certain operations the cache goes to 100% and the process memory utilization, as reported by mongostat, rises to around 29GB (at which point a crash is imminent).
| Comment by Ramon Fernandez Marina [ 04/Aug/15 ] |
mike@meshfire.com, the error message:
indicates that the eviction server failed to create a new thread with EAGAIN. pthread_create(3) reads:
Can you please provide more details about the memory configuration and utilization on the affected node, as well as information about system limits for the mongod process? Thanks,
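
One way to gather the system limits requested above on Linux is to read `/proc/<pid>/limits` for the running mongod. The following is a minimal sketch, not part of the original report: the function names `parse_proc_limits` and `read_limits` are hypothetical, and it assumes the usual procfs layout of limit name, soft limit, hard limit, and an optional units column.

```python
import re
from pathlib import Path

# One data row of /proc/<pid>/limits: a limit name (may contain spaces),
# a soft value, a hard value, and an optional units word.
_ROW = re.compile(r"^(.*?)\s+(unlimited|\d+)\s+(unlimited|\d+)(?:\s+(\S+))?\s*$")


def _to_number(value):
    """Return None for 'unlimited', otherwise the integer value."""
    return None if value == "unlimited" else int(value)


def parse_proc_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard, units)}."""
    limits = {}
    for line in text.splitlines():
        match = _ROW.match(line)
        if match is None:
            continue  # header row or anything that is not a limit line
        name, soft, hard, units = match.groups()
        limits[name] = (_to_number(soft), _to_number(hard), units or "")
    return limits


def read_limits(pid):
    """Read and parse the limits of a running process (Linux only)."""
    return parse_proc_limits(Path(f"/proc/{pid}/limits").read_text())
```

Given the mongod pid, `read_limits(pid)["Max processes"]` and `read_limits(pid)["Max open files"]` would show the soft and hard caps actually in effect for the process, which can differ from what `ulimit` reports in an interactive shell. A low "Max processes" value is one of the conditions under which pthread_create can return EAGAIN.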
| Comment by Ramon Fernandez Marina [ 04/Aug/15 ] |
Hi mike@meshfire.com, thanks for the report – we're investigating. I'd recommend the {noformat} macro for logs; I've edited the description above to use it.
| Comment by Michael Templeman [ 04/Aug/15 ] |
Darn. I screwed up the formatting again. For log dumps, what is the proper format command?