[SERVER-6151] mongos crashed : corrupted unsorted chunks Created: 21/Jun/12 Updated: 16/Nov/21 Resolved: 01/Oct/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Stability |
| Affects Version/s: | 2.0.6 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Klébert Hodin | Assignee: | Greg Studer |
| Resolution: | Incomplete | Votes: | 1 |
| Labels: | mongos | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: | linux x86_64 2.6.18-274.3.1.el5.centos.plus |
| Attachments: | mongodb.filtered.log |
| Operating System: | Linux |
| Participants: |
| Description |
|
Mongos crashed.
As seen in MMS, the mongos process was using a large amount of virtual memory (18 GB) before the failure. |
| Comments |
| Comment by Greg Studer [ 01/Oct/12 ] |
|
The original submitter went away; the second issue reported here is unrelated. |
| Comment by Greg Studer [ 28/Aug/12 ] |
|
The previous issue was related to mongos - it seems like you're experiencing a problem with mongod (which could be related, but I'm guessing is unlikely). This problem seems more related to https://jira.mongodb.org/browse/SERVER-2652. |
| Comment by David Gubler [ 27/Aug/12 ] |
|
I'm having similar issues (although I cannot tell if they're actually the same) with 2.0.7. The error occurred when I tried to shut down mongodb (/etc/init.d/mongodb stop). See the attached log file (mongodb.filtered.log); I have removed connect/disconnect/auth log statements. The environment is Debian squeeze with kernel 2.6.39 from backports. |
| Comment by Greg Studer [ 10/Jul/12 ] |
|
Note - the above tools would require a debug build of mongos for the trace to be usable. |
| Comment by Greg Studer [ 06/Jul/12 ] |
|
Unfortunately no - we're working on recording log info but aren't there yet. The logs would have helped us pull out anything that seemed abnormal. Is it possible on your end to run memory usage tools on the mongos while it is running? It would be extremely useful to get memory profiling information; one option is to use tcmalloc's heap profiler, as documented here: http://gperftools.googlecode.com/svn/trunk/doc/heapprofile.html The resulting output would point to the portion of the mongos codebase that is allocating the memory. This will have a performance impact, though it may be manageable if you're running background hadoop processes. |
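(For anyone following along: a minimal sketch of wiring the gperftools heap profiler around mongos, assuming mongos is linked against tcmalloc per the debug-build note above; the output path, allocation interval, and configdb string below are placeholders, not values from this ticket.)
{code:python}
#!/usr/bin/env python
# Hypothetical sketch: start mongos under the gperftools heap profiler.
# Assumes mongos is linked against tcmalloc; paths and flags are placeholders.
import os
import subprocess

env = dict(os.environ)
# gperftools reads these environment variables when tcmalloc is linked in:
env["HEAPPROFILE"] = "/var/tmp/mongos.hprof"                       # prefix for heap dumps
env["HEAP_PROFILE_ALLOCATION_INTERVAL"] = str(512 * 1024 * 1024)   # dump every ~512 MB allocated

# Launch mongos with the usual options (configdb string is a placeholder).
subprocess.Popen(
    ["mongos", "--configdb", "cfg1:27019,cfg2:27019,cfg3:27019", "--port", "27017"],
    env=env,
)

# The resulting /var/tmp/mongos.hprof.NNNN.heap dumps can then be summarized with
# the gperftools pprof tool, e.g.:  pprof --text $(which mongos) mongos.hprof.0001.heap
{code}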
| Comment by Klébert Hodin [ 06/Jul/12 ] |
|
No, we are not. This period is 7 days long. What kind of information do you need? Startup logs? |
| Comment by Greg Studer [ 05/Jul/12 ] |
|
Thanks for the additional information - discussing with mongo-hadoop maintainers now. From the previous crash, it appears you ran out of file descriptors on your system (which shouldn't have caused a crash, but indicates that you (or mongos) were using an unexpected number of connections). Are you able to post the mongos log for a full period from startup to crash (feel free to post a SUPPORT ticket if you need to keep the log private)? |
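(A small diagnostic sketch of how the file-descriptor theory could be checked on Linux while mongos is running; the pgrep-based pid lookup is an assumption, not part of the original report.)
{code:python}
#!/usr/bin/env python
# Hypothetical sketch: compare the number of open file descriptors held by a
# running mongos process against its per-process limit on Linux.
import os
import subprocess

# Assumes a single mongos process; pid discovery method is an assumption.
pid = int(subprocess.check_output(["pgrep", "-o", "mongos"]).strip())

open_fds = len(os.listdir("/proc/%d/fd" % pid))

# The soft/hard limits of the *target* process (not this script) live in /proc.
with open("/proc/%d/limits" % pid) as f:
    limits_line = next(l for l in f if l.startswith("Max open files"))

print("mongos pid %d has %d open file descriptors" % (pid, open_fds))
print(limits_line.strip())
{code}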
| Comment by Grégoire Seux [ 04/Jul/12 ] |
|
Some clarifications: we run mongo-hadoop to dump a collection of around 70 GB to HDFS. The job runs around 100 map tasks at the same time. Since mongo-hadoop does not support connecting to multiple mongos instances, one of them gets all the load (100 chunks retrieved concurrently). Chunk size is fixed at 64 MB. Memory used by that mongos grows a lot until it crashes. |
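(To make the described access pattern concrete, a rough sketch of many concurrent readers funnelled through a single mongos, assuming pymongo and placeholder host, database, collection, and shard-key names; this illustrates the load shape only, not the reporter's actual mongo-hadoop job.)
{code:python}
#!/usr/bin/env python
# Rough sketch of the load pattern described above: ~100 concurrent readers all
# going through one mongos, each streaming a large slice of the collection.
# Host, namespace, and worker count are placeholders, not taken from the ticket.
import threading
from pymongo import MongoClient

MONGOS_URI = "mongodb://mongos-host:27017/"
WORKERS = 100

def stream_slice(worker_id):
    client = MongoClient(MONGOS_URI)
    coll = client["mydb"]["mycollection"]
    count = 0
    # Each worker scans its share of the data, similar to one hadoop map task
    # reading one input split through the shared mongos.
    for doc in coll.find({"shard_key": {"$mod": [WORKERS, worker_id]}}):
        count += 1
    print("worker %d read %d documents" % (worker_id, count))

threads = [threading.Thread(target=stream_slice, args=(i,)) for i in range(WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
{code}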
| Comment by Klébert Hodin [ 04/Jul/12 ] |
|
Any updates on this issue? We found out this crash occurs when running the mongo-hadoop adapter. |
| Comment by Klébert Hodin [ 21/Jun/12 ] |
|
We're using glibc version 2.5. |
| Comment by Scott Hernandez (Inactive) [ 21/Jun/12 ] |
|
What glibc version are you using? |