[SERVER-10260] Server Crash with: warning: DR102 too much data written uncommitted 315.318MB Created: 19/Jul/13 Updated: 10/Dec/14 Resolved: 04/Apr/14 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 2.4.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Esteban Feldman | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | crash | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Linux 3.2.0-23-virtual #36-Ubuntu SMP Tue Apr 10 22:29:03 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux Ubuntu 12.04.2 |
||
| Operating System: | Linux |
| Participants: |
| Description |
|
The system had been working for a long time with no problems. Yesterday it crashed with this message in the log. I upgraded from 2.4.3 to 2.4.5 to see if that would mitigate it, but it is still happening. It's a sharded environment of 4 machines with no replication and 3 config servers. Backtrace: 0xdd9e31 0x921161 0x92123b 0x9212b2 0x9213e3 0x921450 0x91ae8a 0x80022b 0x9d3ae3 0x9db144 0x9dcdcc 0xac35ff 0xac4092 0xac44e1 0xa8c1d3 0xa8fe67 0x9f2ff8 0x9f8588 0x6e8b68 0xdc659e |
| Comments |
| Comment by Daniel Pasette (Inactive) [ 25/Jul/13 ] |
|
You can try reducing journalCommitInterval to 50 ms. Let me know if this helps. |
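A minimal sketch of passing that value at startup, assuming the mongod 2.4 command-line option `--journalCommitInterval` (a value in milliseconds). The helper name `mongod_argv` and the `/data/db` path are hypothetical, used only for illustration:

```python
def mongod_argv(dbpath, commit_interval_ms=50):
    """Build an argument list for mongod with an explicit journal commit interval."""
    # The interval is expressed in milliseconds; reject obviously invalid values.
    if not isinstance(commit_interval_ms, int) or commit_interval_ms <= 0:
        raise ValueError("commit interval must be a positive integer (milliseconds)")
    return [
        "mongod",
        "--dbpath", dbpath,
        "--journalCommitInterval", str(commit_interval_ms),
    ]


if __name__ == "__main__":
    # Print the command line instead of launching the server.
    print(" ".join(mongod_argv("/data/db")))
```

The list can be handed to `subprocess.Popen` once the remaining options for the shard (port, log path, and so on) have been added.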
| Comment by Esteban Feldman [ 25/Jul/13 ] |
|
@Dan, thanks, I didn't understand. So the thing is that I can't put the journal files on another disk. Is there any other alternative? Thanks |
| Comment by Daniel Pasette (Inactive) [ 25/Jul/13 ] |
|
Regarding putting the journal on a different physical disk: before starting mongod, you can symlink the journal/ directory located in the dbpath to a dedicated hard drive. This speeds up the frequent (fsynced) sequential writes to the current journal file. I don't think there is any explicit mention of this in the current documentation. There is an outstanding documentation ticket for this with a bit more information. Regarding journalCommitInterval: this is a server configuration setting. It defaults to 100ms if the journal directory is on the same physical volume as the data files. If the journal directory is on a separate volume, the default lowers automatically to 30ms. |
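The symlink step can be sketched roughly as follows. This is an illustration only, assuming mongod is stopped while it runs; the function name `relocate_journal` and both directory arguments are hypothetical:

```python
import os
import shutil


def relocate_journal(dbpath, journal_volume):
    """Move <dbpath>/journal onto journal_volume and leave a symlink in its place.

    Run only while mongod is stopped. Returns the new journal directory.
    """
    src = os.path.join(dbpath, "journal")
    dst = os.path.abspath(os.path.join(journal_volume, "journal"))

    # Already relocated: report where the existing link points.
    if os.path.islink(src):
        return os.path.realpath(src)
    if os.path.exists(dst):
        raise FileExistsError(dst)

    if os.path.isdir(src):
        # Keep existing journal files; shutil.move also works across filesystems.
        shutil.move(src, dst)
    else:
        # No journal yet: create an empty directory on the dedicated volume.
        os.makedirs(dst)

    # <dbpath>/journal now resolves to the directory on the other volume.
    os.symlink(dst, src)
    return dst
```

After this, mongod is started with the same dbpath as before and follows the link when it opens the journal.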
| Comment by Esteban Feldman [ 24/Jul/13 ] |
|
Hi Dan. |
| Comment by Daniel Pasette (Inactive) [ 23/Jul/13 ] |
|
DR102 is a warning indicating that the journal isn't keeping up with the write load. There are a couple of recommendations for dealing with DR102 warnings:
|