- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 2.0.7, 2.2.0
- Component/s: Internal Code
- None
- Environment: Differing effects in Linux and Windows, but an issue in both
- Fully Compatible
- ALL
The NamespaceDetails::__stdAlloc() routine starting at line 321 in src/mongo/db/namespace_details.cpp in today's master branch (line 308 in db/namespace.cpp in today's 2.0 branch) is used to allocate space for documents in non-capped collections. This routine has code to check for a bad link in either the NamespaceDetails::deletedList[] or in a deleted record pointed to by that list. On finding a bad pointer, it will log a warning and print a stack trace. After this, it will incorrectly attempt to change both the bad pointer and the pointer that pointed to the bad pointer.
There are two parts to "incorrectly":
1) These pointers live in memory-mapped files, and the code that changes them (lines 339 and 340 in src/mongo/db/namespace_details.cpp in today's master branch) does not use the journaling mechanism (e.g. getDur().writingDiskLoc()) to change these values. On Linux or Windows, this means that inconsistent values may be written to disk, and replaying the journal will not correct them because the journal never recorded the change. On Windows there is the additional risk of an access violation, because the private view is mapped read-only unless the page has been made PAGE_WRITECOPY (copy-on-write) by another (correctly journaled) write to the page within the last 100 milliseconds. This is the access violation that we see as part of SERVER-7068.
2) This code should not attempt to patch a broken chain at all, because the breakage could consist of valid data that is incorrectly pointed to; it should fassert and stop mongod before any damage is done.
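To make the two points concrete, here is a minimal toy model in C++. It is a sketch under stated assumptions, not MongoDB code: ToyStore, journaledWrite, rawWrite, recover, chainIsIntact, and kNullLink are all hypothetical names, and the "data file" is just a vector of slot indexes. It shows (1) that a write that bypasses the journal does not survive recovery, and (2) a validation pass that reports a bad link instead of rewriting it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Toy model only. None of these names are MongoDB APIs.
constexpr std::int32_t kNullLink = -1;

struct ToyStore {
    std::vector<std::int32_t> flushed;  // slots as of the last flush to "disk"
    std::vector<std::int32_t> view;     // in-memory view that the code mutates
    std::vector<std::pair<std::size_t, std::int32_t>> journal;  // (slot, value)

    explicit ToyStore(const std::vector<std::int32_t>& initial)
        : flushed(initial), view(initial) {}

    // Journaled write: record the change first, then apply it to the view.
    void journaledWrite(std::size_t slot, std::int32_t value) {
        assert(slot < view.size());
        journal.emplace_back(slot, value);
        view[slot] = value;
    }

    // Unjournaled write: changes the view but leaves no journal record.
    void rawWrite(std::size_t slot, std::int32_t value) {
        assert(slot < view.size());
        view[slot] = value;
    }

    // Crash recovery: start from the last flushed image, replay the journal.
    std::vector<std::int32_t> recover() const {
        std::vector<std::int32_t> out = flushed;
        for (const auto& entry : journal) out[entry.first] = entry.second;
        return out;
    }
};

// Each slot holds the index of the next free slot, or kNullLink.
// Returns true only if every link reachable from `head` is in range.
// The walk is bounded by the slot count, so a cycle is reported as corrupt.
bool chainIsIntact(const std::vector<std::int32_t>& slots, std::int32_t head) {
    std::int32_t cur = head;
    for (std::size_t steps = 0; steps <= slots.size(); ++steps) {
        if (cur == kNullLink) return true;
        if (cur < 0 || static_cast<std::size_t>(cur) >= slots.size()) return false;
        cur = slots[cur];
    }
    return false;  // more hops than slots: the chain loops
}
```

In this model, a caller that gets false from chainIsIntact would stop the process rather than call rawWrite to "repair" the link, which is the behaviour the ticket asks for.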
- is related to: SERVER-7068 Windows access violation in NamespaceDetails::__stdAlloc() during replication (Closed)