[DOCS-5151] Standardize and Substantiate how a document is marked for deletion Created: 02/Apr/15 Updated: 11/Jan/17 Resolved: 27/Jul/16 |
|
| Status: | Closed |
| Project: | Documentation |
| Component/s: | bsonspec, manual |
| Affects Version/s: | mongodb-3.0 |
| Fix Version/s: | 01112017-cleanup |
| Type: | Improvement | Priority: | Minor - P4 |
| Reporter: | Yazad Khambata | Assignee: | Unassigned |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | bson | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Participants: | |
| Days since reply: | 7 years, 29 weeks ago |
| Comments |
| Comment by Emily Hall [ 27/Jul/16 ] |
|
Closed for housekeeping on 7/27/2016 by Emily Hall. |
| Comment by Yazad Khambata [ 02/Apr/15 ] |
|
I understand that the structure of a deleted document is internal to MongoDB, but there is value in standardizing and documenting it. One immediate benefit would be the ability to recover data from MongoDB data files relatively reliably with simple Python scripts; in the longer run it could pave the way for in-house and third-party data recovery tools. This would be useful after an accidental deletion or in digital forensic investigations, particularly when no node has a delay long enough to stop the delete from propagating. In such cases the data has not yet been physically removed, so there is still hope of recovering it. Currently it seems that when a document is deleted, its first 4 bytes are replaced by \xee\xee\xee\xee, which overwrites the document's size field. That makes recovery difficult, if not impossible. The issue is that this behavior is not documented anywhere, so researchers and tool makers have to discover it by getting their hands dirty and by browsing through the many scripts available online, which don't necessarily work. Even if MongoDB understandably chooses not to standardize the internals of the delete structure in the long run, a warning and an explanation of what the structure looks like internally would be helpful. Let me know if there is any more information or justification that I can provide for this ticket. |
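
To illustrate the kind of simple Python script the comment has in mind, here is a minimal sketch that scans raw bytes for the \xee\xee\xee\xee pattern. It assumes only what the reporter observed, namely that the first 4 bytes of a deleted document are overwritten with that value; this is not documented MongoDB behavior. The names `DELETED_MARKER` and `find_deleted_candidates` are hypothetical and exist only for this example.

```python
from typing import List

# Assumption: marker value as observed by the reporter, not a documented format.
DELETED_MARKER = b"\xee\xee\xee\xee"


def find_deleted_candidates(data: bytes, marker: bytes = DELETED_MARKER) -> List[int]:
    """Return the byte offsets of every non-overlapping occurrence of marker."""
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        offsets.append(pos)
        # Resume after the current match so overlapping hits are not counted twice.
        pos = data.find(marker, pos + len(marker))
    return offsets


if __name__ == "__main__":
    # Synthetic bytes for demonstration only; this is not a real data file.
    sample = b"\x10\x00\x00\x00abc" + DELETED_MARKER + b"payload" + DELETED_MARKER + b"x"
    print(find_deleted_candidates(sample))  # prints [7, 18]
```

Each offset is only a candidate: the same four bytes can occur inside ordinary data, and because the overwritten bytes held the document's size, the end of each candidate document would still have to be worked out by other means.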