- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: 2.4.0-rc1, 2.4.11, 2.6.4, 2.7.5
- Component/s: Storage
- Environment: Windows
- Sprint: Platform 8 08/28/15, Platform 7 08/10/15, Platform 9 (09/18/15), Platform A (10/09/15), Platform B (10/30/15), Platform C (11/20/15), Platform D (12/11/15)
On Windows, when the size of the data files is close to half the virtual address space limit, the files can initially be opened, mapped and used (collection created and extents allocated) just fine. However, when the server is simply stopped and restarted (with extents already allocated for a collection), it crashes because it is unable to map the files.
I've narrowed this issue down to a change between 2.4.0-rc0 and 2.4.0-rc1, though it still exists in 2.4.11, 2.6 and 2.7 (and presents slightly differently in 2.6 and 2.7 than it does in 2.4). It looks as though some of the data files may somehow be mapped multiple times; some of them are certainly unmapped several times. This may be related to SERVER-12567.
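For anyone who wants to check the multiple-mapping suspicion directly, here is a hypothetical diagnostic sketch (not MongoDB code, just plain Win32/psapi calls; the file name and usage are placeholders) that walks a process's address space and counts how many mapped views point at each file. A data file reported with more than one view would be mapped more than once.

```cpp
// Hypothetical diagnostic, not MongoDB code: walk a process's address space
// (e.g. a running mongod, PID passed on the command line) and count how many
// distinct mapped views point at each file.
// Build with: cl /EHsc mapcount.cpp psapi.lib
#include <windows.h>
#include <psapi.h>
#include <cstdio>
#include <cstdlib>
#include <map>
#include <string>

int main(int argc, char** argv) {
    DWORD pid = (argc > 1) ? static_cast<DWORD>(std::atoi(argv[1])) : GetCurrentProcessId();
    HANDLE proc = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ, FALSE, pid);
    if (!proc) { std::printf("OpenProcess failed: %lu\n", GetLastError()); return 1; }

    std::map<std::string, int> viewsPerFile;
    MEMORY_BASIC_INFORMATION mbi;
    unsigned char* addr = nullptr;

    // VirtualQueryEx fails once we pass the highest user-mode address.
    while (VirtualQueryEx(proc, addr, &mbi, sizeof(mbi)) == sizeof(mbi)) {
        // Count only the first region of each mapped view.
        if (mbi.Type == MEM_MAPPED && mbi.BaseAddress == mbi.AllocationBase) {
            char name[MAX_PATH];
            if (GetMappedFileNameA(proc, mbi.BaseAddress, name, sizeof(name)) > 0)
                ++viewsPerFile[name];  // name is reported in \Device\... form
        }
        addr = static_cast<unsigned char*>(mbi.BaseAddress) + mbi.RegionSize;
    }

    for (const auto& kv : viewsPerFile)
        std::printf("%d view(s): %s\n", kv.second, kv.first.c_str());

    CloseHandle(proc);
    return 0;
}
```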
On Windows 2008 R2, the virtual address space limit for 64-bit user processes is 8TB. I've done all of this testing without journalling to simplify things, but when I previously looked at it with journalling on, the situation was similar, just with an effective limit of 4TB instead. The results are the same whether the "2008plus" or "legacy" win32 x64 builds are used.
A workaround is to use Windows 2012 R2 instead of 2008 R2, where the limit is 128TB instead of 8TB. However, this problem will still affect Windows 2012 R2 for datasets around the 32TB mark (with journalling).
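For reference, the usable user-mode range (and the 64KB allocation granularity that becomes relevant for the 2.7.5 error further down) can be confirmed at runtime. A minimal sketch, not part of the attached tests:

```cpp
// Minimal sketch: print the user-mode virtual address range and the mapping
// granularity for this machine. On Windows Server 2008 R2 x64 the range is
// roughly 8TB; on Windows Server 2012 R2 it is roughly 128TB.
#include <windows.h>
#include <cstdio>

int main() {
    SYSTEM_INFO si;
    GetSystemInfo(&si);
    unsigned long long lo = reinterpret_cast<unsigned long long>(si.lpMinimumApplicationAddress);
    unsigned long long hi = reinterpret_cast<unsigned long long>(si.lpMaximumApplicationAddress);
    std::printf("usable user address space: ~%llu GB\n", (hi - lo) >> 30);
    std::printf("allocation granularity:    %lu bytes\n",
                static_cast<unsigned long>(si.dwAllocationGranularity));
    return 0;
}
```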
By contrast, on Linux, if I use "ulimit -v 10485760" to limit the virtual address space to 10GB, then all of these versions behave as expected (see the mmap sketch below), i.e. they are able to
- create 9GB of data files
- restart and then open the data files.
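The sketch mentioned above is just the mmap call under that ulimit (the "test.0" path is a placeholder): once the process's mapped size would exceed the 10GB limit, mmap fails with errno 12, matching the Linux messages in the table further down.

```cpp
// Minimal sketch, not from the attached tests: map one data file read-write.
// Run with "ulimit -v 10485760" in effect; once the process's total virtual
// size would exceed 10GB, mmap returns MAP_FAILED with errno 12 (ENOMEM).
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

int main(int argc, char** argv) {
    const char* path = (argc > 1) ? argv[1] : "test.0";  // placeholder data file name
    int fd = open(path, O_RDWR);
    if (fd < 0) { std::perror("open"); return 1; }

    off_t len = lseek(fd, 0, SEEK_END);  // map the whole file
    void* p = mmap(nullptr, static_cast<size_t>(len),
                   PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        // This is where "errno:12 Cannot allocate memory" shows up in the logs.
        std::printf("mmap failed: errno:%d %s\n", errno, std::strerror(errno));
    } else {
        std::printf("mapped %lld bytes at %p\n", static_cast<long long>(len), p);
        munmap(p, static_cast<size_t>(len));
    }
    close(fd);
    return 0;
}
```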
Very verbose logfiles are attached. They show the results for
- Windows 2008 R2 vs Linux (10GB vmem limit)
- MongoDB versions 2.4.0-rc0, 2.4.0-rc1, 2.6.4, and 2.7.5
- Creating capped collections of various sizes.
The Windows logfiles do not show any file allocation messages because the files were allocated with an external tool that used (the Windows equivalent of) fast_allocate. (Otherwise, allocating TBs of data files on Windows takes hours instead of seconds, even on SSDs; any fast-allocation bugs don't matter here, since this is only testing the ability to mmap the files.) You can tell when the dbpath has been cleared out by noting when local.ns gets allocated. A useful command to see the main timeline in each log is something like:
grep -E 'create collection test.|MongoDB starting|dbstats|assert|Map.*errno|allocating new datafile .*local.ns' mongod-2.4.0-rc1.log
Some of the smaller tests were done on an i2.8xlarge instance with 8x 800GB local SSDs in RAID0 (~6TB). The tests above this size used a hs1.8xlarge with 16x 2TB local disks in RAID0.
The results of the tests are:
OS | vmem | MongoDB | size | allocate | restart | db.stats | expected? |
---|---|---|---|---|---|---|---|
Windows | 8TB | 2.4.0-rc0 | 3.5TB | Works | Works | Works | Expected |
Windows | 8TB | 2.4.0-rc0 | 3.8TB | Works | Works | Works | Expected |
Windows | 8TB | 2.4.0-rc0 | 4.5TB | Works | Works | Works | Expected |
Windows | 8TB | 2.4.0-rc0 | 5.5TB | Works | Works | Works | Expected |
Windows | 8TB | 2.4.0-rc0 | 7.5TB | Works | Works | Works | Expected |
Windows | 8TB | 2.4.0-rc0 | 8.5TB | Fails | Fails | Fails | Expected |
OS | vmem | MongoDB | size | allocate | restart | db.stats | expected? |
---|---|---|---|---|---|---|---|
Windows | 8TB | 2.4.0-rc1 | 3.5TB | Works | Works | Works | Expected |
Windows | 8TB | 2.4.0-rc1 | 3.7TB | Works | Works | Works | Expected |
Windows | 8TB | 2.4.0-rc1 | 3.8TB | Works | Works | Works | Expected |
Windows | 8TB | 2.4.0-rc1 | 3.9TB | Works | Works | Fails | Unexpected |
Windows | 8TB | 2.4.0-rc1 | 4.5TB | Works | Works | Fails | Unexpected |
Windows | 8TB | 2.4.0-rc1 | 5.5TB | Works | Works | Fails | Unexpected |
Windows | 8TB | 2.4.0-rc1 | 7.5TB | Works | Works | Fails | Unexpected |
Windows | 8TB | 2.4.0-rc1 | 8.5TB | Fails | Fails | Fails | Expected |
OS | vmem | MongoDB | size | allocate | restart | db.stats | expected? |
---|---|---|---|---|---|---|---|
Windows | 8TB | 2.6.4 | 3.5TB | Works | Works | Works | Expected |
Windows | 8TB | 2.6.4 | 3.9TB | Works | Fails | | Unexpected |
Windows | 8TB | 2.6.4 | 5.5TB | Works | Fails | | Unexpected |
Windows | 8TB | 2.6.4 | 7.5TB | Works | Fails | | Unexpected |
Windows | 8TB | 2.6.4 | 8.5TB | Fails | Fails | | Expected |
OS | vmem | MongoDB | size | allocate | restart | db.stats | expected? |
---|---|---|---|---|---|---|---|
Windows | 8TB | 2.7.5 | 3.5TB | Works | Works | Works | Expected |
Windows | 8TB | 2.7.5 | 3.9TB | Works | Fails | | Unexpected |
Windows | 8TB | 2.7.5 | 5.5TB | Works | Fails | | Unexpected |
Windows | 8TB | 2.7.5 | 7.5TB | Works | Fails | | Unexpected |
Windows | 8TB | 2.7.5 | 8.5TB | Fails | Fails | | Expected |
OS | vmem | MongoDB | size | allocate | restart | db.stats | expected? |
---|---|---|---|---|---|---|---|
Linux | 10GB | 2.4.0-rc0 | 9GB | Works | Works | Works | Expected |
Linux | 10GB | 2.4.0-rc0 | 11GB | Fails | Works | Fails | Expected |
Linux | 10GB | 2.4.0-rc1 | 9GB | Works | Works | Works | Expected |
Linux | 10GB | 2.4.0-rc1 | 11GB | Fails | Works | Fails | Expected |
Linux | 10GB | 2.6.4 | 9GB | Works | Works | Works | Expected |
Linux | 10GB | 2.6.4 | 11GB | Fails | Fails | | Expected |
Linux | 10GB | 2.7.5 | 9GB | Works | Works | Works | Expected |
Linux | 10GB | 2.7.5 | 11GB | Fails | Fails | | Expected |
Where failures occur, the messages are as follows (a small repro of the 2.7.5 alignment error follows the table):
OS | MongoDB | Message |
---|---|---|
Windows | 2.4.0-rc0, 2.4.0-rc1, and 2.6.4 | errno:487 Attempt to access invalid address. |
Windows | 2.7.5 | errno:1132 The base address or the file offset specified does not have the proper alignment. |
Linux | 2.4.0-rc0, 2.4.0-rc1, 2.6.4, and 2.7.5 | errno:12 Cannot allocate memory |
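The 2.7.5 message is Windows error 1132 (ERROR_MAPPED_ALIGNMENT), which MapViewOfFileEx returns when the suggested base address is not a multiple of the 64KB allocation granularity; that seems consistent with SERVER-19805 below. A minimal sketch (not MongoDB code) that reproduces the same error code:

```cpp
// Minimal repro sketch, not MongoDB code: ask MapViewOfFileEx to place a view
// at an address that is page-aligned but not 64KB-aligned. It fails with
// GetLastError() == 1132 (ERROR_MAPPED_ALIGNMENT), the same code as the
// 2.7.5 log message above.
#include <windows.h>
#include <cstdio>

int main() {
    const DWORD size = 1 << 20;  // a 1MB pagefile-backed mapping is enough for the demo
    HANDLE mapping = CreateFileMappingA(INVALID_HANDLE_VALUE, nullptr,
                                        PAGE_READWRITE, 0, size, nullptr);
    if (!mapping) { std::printf("CreateFileMapping failed: %lu\n", GetLastError()); return 1; }

    // Find a free, granularity-aligned region, release it, then deliberately
    // offset the suggested base address by one 4KB page.
    void* probe = VirtualAlloc(nullptr, 2 * size, MEM_RESERVE, PAGE_NOACCESS);
    if (!probe) { std::printf("VirtualAlloc failed: %lu\n", GetLastError()); return 1; }
    VirtualFree(probe, 0, MEM_RELEASE);
    void* misaligned = static_cast<char*>(probe) + 4096;

    void* view = MapViewOfFileEx(mapping, FILE_MAP_ALL_ACCESS, 0, 0, size, misaligned);
    if (!view) {
        std::printf("MapViewOfFileEx failed: errno:%lu\n", GetLastError());  // prints 1132
    } else {
        UnmapViewOfFile(view);
    }
    CloseHandle(mapping);
    return 0;
}
```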
- related to SERVER-19805: MMap memory mapped file address allocation code cannot handle addresses non-aligned to memory mapped granularity size (Closed)