[SERVER-2508] Database gets corrupted and reports duplicate collection names Created: 09/Feb/11 Updated: 30/Mar/12 Resolved: 20/Jun/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 1.6.5 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Reinaldo Giudici | Assignee: | Aaron Staple |
| Resolution: | Cannot Reproduce | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Linux cloud-mongo22 2.6.32-28-server #55-Ubuntu SMP Mon Jan 10 23:57:16 UTC 2011 x86_64 GNU/Linux |
||
| Operating System: | ALL | ||||
| Description |
|
We have gotten into this situation at least three times, on different 1.6 versions. Attached are the database files with the corrupt data.
The general flow is that the LEADERBOARD_ collections get created and dropped based on time and other variables; they last for about 24h. For example: , cond: null, $reduce: "function(doc,prev) { prev.numEntries ++ }", initial: { numEntries: 0 }} } reslen:26486 1103ms to generate some data on the rank.
Here is a sample of the magnitude of the problem: 49111 collections with only 318 unique names:
echo "show collections" | mongo localhost:27017/ee895ca8156f4c7fa4104b1b4c9a8c38 | sort -u | wc -l
show collections will show (partial output) |
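The total-versus-unique comparison in the description can be sketched as a pair of small shell helpers. This is only an illustration, not part of the original report: the function names `summarize_names` and `duplicated_names` are hypothetical, and the sketch assumes the listing arrives on stdin with one collection name per line. Any banner lines the `mongo` shell prints would be counted as names too, just as they would be by `sort -u | wc -l`.

```shell
#!/bin/sh
# Hypothetical helpers: both read one collection name per line on stdin.

# Print the total number of non-empty lines and the number of distinct names.
summarize_names() {
  awk 'NF { total++; if (!seen[$0]++) unique++ }
       END { printf "total=%d unique=%d\n", total, unique }'
}

# Print "count name" for every name that occurs more than once,
# most frequent first.
duplicated_names() {
  awk 'NF { seen[$0]++ }
       END { for (n in seen) if (seen[n] > 1) print seen[n], n }' | sort -rn
}

# Possible usage against the database from the report (requires a running mongod):
#   echo "show collections" | mongo localhost:27017/ee895ca8156f4c7fa4104b1b4c9a8c38 | summarize_names
#   echo "show collections" | mongo localhost:27017/ee895ca8156f4c7fa4104b1b4c9a8c38 | duplicated_names | head
```

A healthy database should report the same value for `total` and `unique`; a large gap between the two, such as the 49111 versus 318 reported here, indicates duplicated namespace entries.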
| Comments |
| Comment by Aaron Staple [ 20/Jun/11 ] |
|
Hi Reinaldo - please reopen if you have further problems. |
| Comment by Aaron Staple [ 14/Jun/11 ] |
|
Hi Reinaldo - could you send the log from 1.7.5? |
| Comment by Aaron Staple [ 22/Feb/11 ] |
|
Hi Reinaldo - would it be possible to attach the log from 1.7.5? Thanks |
| Comment by Reinaldo Giudici [ 16/Feb/11 ] |
|
Attached is the full log |
| Comment by Eliot Horowitz (Inactive) [ 16/Feb/11 ] |
|
Can you send the full logs? |
| Comment by Reinaldo Giudici [ 16/Feb/11 ] |
|
This did happen again once after the upgrade to 1.7.5. |
| Comment by Eliot Horowitz (Inactive) [ 10/Feb/11 ] |
|
It could be any one of a number of fixes. |
| Comment by Reinaldo Giudici [ 10/Feb/11 ] |
|
I do not have a standalone test that is able to reproduce this. It has happened in production a few times. |
| Comment by Eliot Horowitz (Inactive) [ 10/Feb/11 ] |
|
I think this is already fixed in 1.7.x |
| Comment by Reinaldo Giudici [ 09/Feb/11 ] |
|
And digging even further into the logs:
Wed Feb 9 01:00:24 [conn9812] CMD: drop ee895ca8156f4c7fa4104b1b4c9a8c38.LEADEBOARD_cityofwonder-22-6-103_l |
| Comment by Reinaldo Giudici [ 09/Feb/11 ] |
|
Here are the errors in the logs from when this was happening:
Wed Feb 9 11:25:10 [conn10156] getFile(): n=-2 |