[SERVER-17510] "Didn't find RecordId in WiredTigerRecordStore" on collections after an idle period Created: 09/Mar/15 Updated: 23/May/18 Resolved: 09/Mar/15 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | WiredTiger |
| Affects Version/s: | 3.0.0 |
| Fix Version/s: | 3.0.1, 3.1.0 |
| Type: | Bug | Priority: | Critical - P2 |
| Reporter: | Michael Cahill (Inactive) | Assignee: | Michael Cahill (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | ET | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible | ||||||||
| Operating System: | ALL | ||||||||
| Backport Completed: | |||||||||
| Steps To Reproduce: | No self-contained repro has been found. This bug has only been observed on complex systems. In particular, workloads that continually update collections will not hit this problem: only workloads where collections are idle for a period and are then updated can trigger this bug. |
| Description |
|
A bug in an internal WiredTiger thread could cause corruption in collections that become idle, then are later updated again. A WiredTiger thread that discards old collections in cache could (under rare circumstances) start discarding a collection but give up part way through, leaving an incomplete tree in cache. If that collection was subsequently updated, the on-disk tree could become corrupted. A repairDatabase operation may be required to salvage data. |
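To judge which collections might need a repair, one option is to scan mongod.log for the assertion message in this ticket's title and collect the namespaces it names. The following is only a minimal sketch: the `ns:` field layout is an assumption based on the single log line quoted in the comments, and the function name `affected_namespaces` is hypothetical, not part of any MongoDB tool.

```python
import re

# Message text taken from the ticket title. The "ns:<namespace>" layout is an
# assumption drawn from the one log line quoted in the comments on this ticket.
ASSERTION_TEXT = "Didn't find RecordId in WiredTigerRecordStore"
NS_PATTERN = re.compile(r"\bns:(\S+)")


def affected_namespaces(log_lines):
    """Return a sorted list of namespaces named in matching assertion lines.

    Hypothetical helper for illustration only.
    """
    namespaces = set()
    for line in log_lines:
        # Skip every line that does not carry the assertion message.
        if ASSERTION_TEXT not in line:
            continue
        match = NS_PATTERN.search(line)
        if match:
            namespaces.add(match.group(1))
    return sorted(namespaces)
```

Passing an open log file (for example `affected_namespaces(open("mongod.log"))`) would list each namespace once. A match only shows that the assertion fired for that collection; it does not by itself confirm on-disk corruption.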
| Comments |
| Comment by Ramon Fernandez Marina [ 23/May/18 ] |
|
onkarb, MongoDB 3.0 is EOL – please upgrade to a supported version (MongoDB 3.6 at the moment) and if the problem persists open a new ticket. Thanks, |
| Comment by onkar [ 23/May/18 ] |
|
Hi, We are using MongoDB 3.0.15 in standalone mode and are facing a similar issue. The following log line appears in mongod.log while backing up the database with the mongodump command: 2018-05-22T01:43:50.759-0700 I QUERY [conn770575] assertion 28556 Didn't find RecordId in WiredTigerRecordStore ns:analytics_data_C88351b08d33664e013d8ad07ea32ff51.client_visit_data query:{ $query: {}, $snapshot: true } We recently migrated from MongoDB 3.0.6 to 3.0.15. To fix this issue we ran the repairDatabase() command on all databases, but the issue keeps reappearing. It is also observed on more than one collection. I have attached the complete backtrace from mongod.log and the output of the db.serverStatus() command. I would like to understand
Thanks in advance. Regards, Onkar |
| Comment by Ramon Fernandez Marina [ 19/May/15 ] |
|
sega, if you upgraded this replica set from 3.0.0 to 3.0.3 (possibly via 3.0.1 and/or 3.0.2) this is most likely If you either installed this replica set from scratch, or upgraded it but never ran 3.0.0, please open a separate ticket and upload all the relevant information (including logs for all nodes). Thanks, |
| Comment by Sergey I. Yarkin [ 19/May/15 ] |
|
ramon.fernandez, I have a replica set with 3 nodes, all running the same version of mongod (3.0.3) |
| Comment by Ramon Fernandez Marina [ 19/May/15 ] |
|
sega, were you running 3.0.0 at one point? If you were, it's possible that you were affected by |
| Comment by Sergey I. Yarkin [ 19/May/15 ] |
|
Hi, I still have this issue in 3.0.3 (repairDatabase temporarily resolves it) |
| Comment by Ramon Fernandez Marina [ 18/Mar/15 ] |
|
Hi sallgeud, this issue was fixed in the 3.0.1 stable release and the 3.1.0 development release, both published yesterday. Regards, |
| Comment by Chad Kreimendahl [ 18/Mar/15 ] |
|
So this wasn't fixed? |