[SERVER-1163] Writing 44K records a second and mongod.exe gives 'FlushViewOfFile failed 33' warning and then a client issue Created: 27/May/10  Updated: 12/Jul/16  Resolved: 24/Mar/12

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 1.4.2
Fix Version/s: 2.0.5, 2.1.1

Type: Bug Priority: Major - P3
Reporter: Douglas Taft Assignee: Tad Marshall
Resolution: Done Votes: 8
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows 7 client using the C# driver, writing to a database on Windows 2008 x64


Attachments: Text File log-1.6.0-replset.1.txt     Text File log.txt    
Issue Links:
Duplicate
is duplicated by SERVER-1459 FlushViewOfFile failed 33 (and othe n... Closed
is duplicated by SERVER-5631 Still getting error [DataFileSync] Fl... Closed
Related
related to SERVER-1459 FlushViewOfFile failed 33 (and othe n... Closed
related to SERVER-5819 Use FileAllocator on Windows Closed
Backwards Compatibility: Fully Compatible
Operating System: Windows
Participants:

 Description   

A client application is attempting to write a large amount of data. For several minutes the data is written without error, with peak throughput of roughly 44K messages a second. Then the client receives an error stating that the application cannot connect to the database, and the following error appears in the database log: FlushViewOfFile failed 33

I have attached a log with the test runs included.



 Comments   
Comment by Tad Marshall [ 08/Feb/13 ]

A hotfix is available from Microsoft for Windows 7 and Windows Server 2008 R2 that eliminates the root cause of the issue.

http://support.microsoft.com/kb/2731284

Comment by auto [ 18/Apr/12 ]

Author:

{u'login': u'tadmarshall', u'name': u'Tad Marshall', u'email': u'tad@10gen.com'}

Message: SERVER-1163 Retry calls to FlushViewOfFile on error 33

FlushViewOfFile() will return error code 33 (ERROR_LOCK_VIOLATION)
in some cases, but this is a "transient" error, and just retrying
repeatedly will (supposedly) always work. This changes the code
to retry up to one million times and for up to 60 seconds if it
continues to get ERROR_LOCK_VIOLATION.
Branch: v2.0
https://github.com/mongodb/mongo/commit/83fa0f7bc1611e015024b6acac23b14cca44f35e

Comment by auto [ 27/Mar/12 ]

Author:

{u'login': u'tadmarshall', u'name': u'Tad Marshall', u'email': u'tad@10gen.com'}

Message: SERVER-1163 Fix race condition in FlushViewOfFile loop

Make timeout determination only after a call to FlushViewOfFile,
not before. We would need a 60 second pause between two lines
of code to lose this race, but it was possible.
Branch: master
https://github.com/mongodb/mongo/commit/b5a14dd396630e3429491d4abc256a54b7bfb595

Comment by Erich Siedler [ 24/Mar/12 ]

Sleep(0) is special, from the docs: "If you specify 0 milliseconds, the thread will relinquish the remainder of its time slice but remain ready."

We are just giving other threads (kernel included) a chance to run immediately; if there aren't any ready to run, our thread will resume immediately.

But your measurements suggest that maybe we don't need to bother. Thanks.

Comment by Tad Marshall [ 24/Mar/12 ]

I am going to close this ticket as "fixed". Most of the information in this report is rather old and refers to versions that will not be getting any further updates: 1.4.2 and 1.6.0. If we see issues related to FlushViewOfFile in version 2.1.1 or later, it's time for a new ticket.

The failure of FlushViewOfFile is completely reproducible. Our "core test suite" hits it in every run in Windows. You can reproduce it yourself if you want to: build "test.exe" from the MongoDB GitHub repository, and run "test pdfile --dbpath C:\mongoTest" after creating a C:\mongoTest (or anything you like) folder. On my machine, I get the "FlushViewOfFile error 33" in seconds, every time.

We have never seen crashes or data corruption that could be reliably connected with this error. The meaning of the error is that we have asked Windows to sync our memory-mapped file (a portion of a database) to disk NOW and Windows has said "not right now, I'm busy". It doesn't mean that something bad has happened and it doesn't mean that the data will not be written to disk: we just can't force it to happen at that instant.

We got some help from a Microsoft engineer who talked to the "NTFS owners" and told us that if we retry repeatedly, we will eventually get a successful response from this API.

That's what the new code change does. We limit "repeatedly" to one million retries or 60 seconds, whichever comes first, and this gets us past the failure we see in our test runs every day. On my home machine running Windows 7 x64, the 32-bit version succeeded after 6 retries taking 4 milliseconds. On my work machine running Windows 7 x64 under VirtualBox under Ubuntu 11.10, the 64-bit version succeeded after 24 retries taking 6400 milliseconds.

If you read the code (see https://github.com/mongodb/mongo/commit/c0ed1fde6a726909ae99b928a711e3c38837a4ae in the check-in above) you will see that we report every outcome of FlushViewOfFile other than "worked the first time". This means that (for now) every situation that previously reported "FlushViewOfFile failed 33" will still report something about FlushViewOfFile. If this code change does what we think it will do, the new report will say "FlushViewOfFile for <filename> succeeded after <number> attempts taking <number> ms". After we accumulate some experience with this new code, we may (and probably will) limit the reporting to only cases where the count or time exceeds some arbitrary limits, but today's code will report any count other than 1.

@Erich Siedler, thanks for the link. Both because the internal Windows code already does a "sleep" (KeDelayExecutionThread) on STATUS_FILE_LOCK_CONFLICT and because we don't want to pause if we don't have to, I think we don't want to add a Sleep(0). According to the documentation, Sleep() can have a resolution of as much as 15 milliseconds and if we can do 6 retries in 4 milliseconds (and then succeed) I'd rather not slow anything down unnecessarily. We're already hitting the disk by flushing a memory-mapped file to disk, and we'd like the delays to be only what the disk subsystem forces on us.

I hope this isn't just too-long-didn't-read, but if you see issues in 2.1.1 or later, please post a new Jira ticket and let us know if there is more to do. Thanks!
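
For illustration, a minimal sketch of the retry-and-report approach described above (the names, limits, and log format here are approximations of the linked commits, not the actual MongoDB source):

    #include <windows.h>
    #include <cstdio>

    // Illustrative sketch only: retry FlushViewOfFile on ERROR_LOCK_VIOLATION,
    // give up after a retry count or time limit, and log any outcome other than
    // success on the first attempt.
    bool flushViewWithRetry(const char* filename, void* view, size_t len) {
        const int   kMaxRetries = 1000000;       // "up to one million times"
        const DWORD kMaxMillis  = 60 * 1000;     // "... and for up to 60 seconds"
        const DWORD start = GetTickCount();
        int attempts = 0;
        for (;;) {
            ++attempts;
            if (FlushViewOfFile(view, len)) {
                if (attempts > 1)                // report any count other than 1
                    std::printf("FlushViewOfFile for %s succeeded after %d attempts taking %lu ms\n",
                                filename, attempts, GetTickCount() - start);
                return true;
            }
            if (GetLastError() != ERROR_LOCK_VIOLATION)
                break;                           // some other, non-transient error
            // The timeout is checked only AFTER a flush attempt (see the race-condition fix above).
            if (attempts >= kMaxRetries || GetTickCount() - start >= kMaxMillis)
                break;
        }
        std::printf("FlushViewOfFile for %s failed %lu after %d attempts\n",
                    filename, GetLastError(), attempts);
        return false;
    }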

Comment by Erich Siedler [ 23/Mar/12 ]

Apparently a known issue.

https://connect.microsoft.com/VisualStudio/feedback/details/552859/memorymappedviewaccessor-flush-throws-ioexception

The semi-official workaround calls for a pause between retries. I think we should at least do a ::Sleep(0).

Comment by auto [ 23/Mar/12 ]

Author:

{u'login': u'tadmarshall', u'name': u'Tad Marshall', u'email': u'tad@10gen.com'}

Message: SERVER-1163 Retry calls to FlushViewOfFile on error 33

FlushViewOfFile() will return error code 33 (ERROR_LOCK_VIOLATION)
in some cases, but this is a "transient" error, and just retrying
repeatedly will (supposedly) always work. This changes the code
to retry up to one million times and for up to 60 seconds if it
continues to get ERROR_LOCK_VIOLATION.
Branch: master
https://github.com/mongodb/mongo/commit/c0ed1fde6a726909ae99b928a711e3c38837a4ae

Comment by Tad Marshall [ 23/Dec/11 ]

@Viktar, can you run VMMap ( http://technet.microsoft.com/en-us/sysinternals/dd535533 ) and post a screenshot of the mongod.exe process when it is approaching its problem point? Also, the output from db.serverStatus() and db.stats(), please? I'd like to get a handle on what this looks like from outside (VMMap) and inside to help me figure out what is going wrong. Thanks!

Comment by Tad Marshall [ 15/Dec/11 ]

@Viktar – do you have a server log that includes the hang that you could post to this Jira ticket? How much RAM is in the machine, and what is mongod.exe's "Memory (Private Working Set)" at the time, according to Windows Task Manager? The FlushViewOfFile API call is important for forcing modified database pages back to disk, and if we get a lot of failures on that call then we will eat up page file space until it gets better. After the hang, did it go back to normal?

Comment by Viktar Basharymau [ 15/Dec/11 ]

I have the same issue on Windows 2008 using Java driver.

N:\mongodb\bin>mongod --version
db version v2.0.0, pdfile version 4.5
Thu Dec 15 12:55:10 git version: 695c67dff0ffc361b8568a13366f027caa406222

When inserting a large amount of data, mongod just hangs after 6 or 7 million rows are inserted, and I can't connect to MongoDB even from the mongo command line client.

Comment by Tad Marshall [ 05/Dec/11 ]

I did some more research and single-stepped through some of the NtFlushVirtualMemory() routine that FlushViewOfFile() translates into. NtFlushVirtualMemory() eventually calls MiFlushSectionInternal(), and this is the routine that returns the STATUS_FILE_LOCK_CONFLICT (0xC0000054) error that gets translated into DOS error 33 (ERROR_LOCK_VIOLATION) that we see in the return from FlushViewOfFile(). The funny thing is that NtFlushVirtualMemory() has code to detect this specific error code, and it retries 5 times before returning the error to us. Any other error is returned without retrying ... so Microsoft knows something about this failure, or the code thinks it does. More research is needed.

Comment by Tad Marshall [ 28/Nov/11 ]

I think that this is "an OS issue" and may require working with Microsoft to get it fixed.

From way back in Windows 2000 days, MS posted this Knowledge Base article: http://support.microsoft.com/kb/266749 .

It says "When you create a file with the CreateFileMapping function, you may receive an ERROR_LOCK_VIOLATION response. This error occurs if the NTFS Transaction log becomes full while you are creating the file mapping. Memory Manager is not waiting long enough for NTFS to flush the transaction log and returns the error, causing the function not to succeed. To resolve this problem, obtain the latest service pack for Windows 2000." Right.

Here's how the folks working on the OpenJDK addressed it in MappedByteBuffer.c:

    /*
     * FlushViewOfFile can fail with ERROR_LOCK_VIOLATION if the memory
     * system is writing dirty pages to disk. As there is no way to
     * synchronize the flushing then we retry a limited number of times.
     */
    retry = 0;
    do {
        result = FlushViewOfFile(a, (DWORD)len);
        if ((result != 0) || (GetLastError() != ERROR_LOCK_VIOLATION))
            break;
        retry++;
    } while (retry < 3);

I'll investigate how we might get some Microsoft attention to the problem.

Comment by Dwight Merriman [ 20/Jun/11 ]

In an isolated test not using most of the MongoDB code I can get this to happen, suggesting maybe an OS issue.
When I try to flush <= 64MB it works, at least in the one test I am doing.
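
For reference, a bare-bones standalone test in that spirit might look roughly like the sketch below; the file name, sizes, and pass count are invented for illustration, and this is not the actual test referred to above.

    #include <windows.h>
    #include <cstdio>

    // Hypothetical standalone repro sketch: memory-map a large file, dirty its
    // pages, and flush repeatedly while the OS is also writing dirty pages back.
    int main() {
        const DWORD kSize = 256 * 1024 * 1024;   // comfortably above the 64MB mentioned above
        HANDLE file = CreateFileW(L"flushrepro.dat", GENERIC_READ | GENERIC_WRITE,
                                  0, NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
        if (file == INVALID_HANDLE_VALUE) return 1;
        HANDLE mapping = CreateFileMappingW(file, NULL, PAGE_READWRITE, 0, kSize, NULL);
        if (mapping == NULL) return 1;
        char* view = (char*)MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, kSize);
        if (view == NULL) return 1;

        for (int pass = 0; pass < 50; ++pass) {
            for (DWORD i = 0; i < kSize; i += 4096)      // touch every page
                view[i] = (char)pass;
            if (!FlushViewOfFile(view, kSize))
                std::printf("FlushViewOfFile failed %lu on pass %d\n", GetLastError(), pass);
        }
        UnmapViewOfFile(view);
        CloseHandle(mapping);
        CloseHandle(file);
        return 0;
    }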

Comment by Ed Rooth [ 12/Feb/11 ]

Update: it seems to only happen when building the indexes. I'm guessing it's because this is a write-intensive operation.

Comment by Ed Rooth [ 07/Feb/11 ]

We are also getting this error consistently with v1.6.5 on Windows 64-bit.

Comment by AndrewK [ 31/Aug/10 ]

Alas, no luck. Are you able to repro this on your side?

[Primary node - started first]
D:\bin\mongodb-win32-x86_64-2010-08-31\bin\mongod.exe --port 5001 --rest --dbpath .\node1 --replSet devset/localhost:5002

[Secondary node - this node is the one that generates the error]
D:\bin\mongodb-win32-x86_64-2010-08-31\bin\mongod.exe --port 5002 --rest --dbpath .\node2 --replSet devset/localhost:5001

Tue Aug 31 13:38:15 MongoDB starting : pid=12220 port=5002 dbpath=.\node2 64-bit

    • NOTE: This is a development version (1.7.0-pre-) of MongoDB.
    • Not recommended for production.

Tue Aug 31 13:38:15 db version v1.7.0-pre-, pdfile version 4.5
Tue Aug 31 13:38:15 git version: 387e0e4fcdb3498fc84f3f5386a15e450ea4c5f1
Tue Aug 31 13:38:15 sys info: windows (6, 1, 7600, 2, '') BOOST_LIB_VERSION=1_42
Tue Aug 31 13:38:15 [initandlisten] waiting for connections on port 5002
Tue Aug 31 13:38:15 [initandlisten] ******
Tue Aug 31 13:38:15 [initandlisten] creating replication oplog of size: 944MB... (use --oplogSize to change)
Tue Aug 31 13:38:16 [initandlisten] ******
Tue Aug 31 13:38:16 [websvr] web admin interface listening on port 6002
Tue Aug 31 13:38:16 [initandlisten] connection accepted from 127.0.0.1:60994 #1
Tue Aug 31 13:38:16 [initandlisten] connection accepted from 127.0.0.1:60993 #2
Tue Aug 31 13:38:17 [startReplSets] replSet got config version 2 from a remote, saving locally
Tue Aug 31 13:38:17 [startReplSets] replSet info saving a newer config version to local.system.replset
Tue Aug 31 13:38:18 [startReplSets] FlushViewOfFile failed 33 file: ./node2/local.1

Comment by Mathias Stearn [ 30/Aug/10 ]

Ok, I changed the fix around. Could you try out tomorrow's nightly?

Comment by auto [ 30/Aug/10 ]

Author:

{'login': 'RedBeard0531', 'name': 'Mathias Stearn', 'email': 'mathias@10gen.com'}

Message: Per-file flush mutex on Windows SERVER-1163
http://github.com/mongodb/mongo/commit/2eb1ac7c193e7c0f1234f1facb732b199d5fc037
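
As a rough illustration of the idea behind that one-line commit message (not the actual change, whose synchronization primitives may differ), a per-file flush mutex could be sketched like this:

    #include <windows.h>
    #include <map>
    #include <mutex>
    #include <string>

    // Illustrative only: serialize flushes so that at most one thread flushes a
    // given mapped file at a time.
    static std::mutex g_registryMutex;
    static std::map<std::string, std::mutex> g_flushMutexes;    // one mutex per file

    bool flushView(const std::string& filename, void* view, size_t len) {
        std::mutex* fileMutex;
        {
            std::lock_guard<std::mutex> reg(g_registryMutex);
            fileMutex = &g_flushMutexes[filename];              // created on first use
        }
        std::lock_guard<std::mutex> lock(*fileMutex);           // one flush per file at a time
        return FlushViewOfFile(view, len) != 0;
    }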

Comment by auto [ 30/Aug/10 ]

Author:

{'login': 'RedBeard0531', 'name': 'Mathias Stearn', 'email': 'mathias@10gen.com'}

Message: Revert "make sure only one flush happens at a time SERVER-1163"

This reverts commit 6dbd8594e56135347fca75f15ba08578cfacbe1a.
http://github.com/mongodb/mongo/commit/29596f5f9681595b515469c8bc5a2d8bc8d8c0f5

Comment by AndrewK [ 28/Aug/10 ]

Right, we may be on to something here...

I can't get it to produce the error when running a single node. I started with an empty dbpath and then pumped data in, did queries, left it sitting around for a while etc. By this stage, the replSet would definitely have produced the error (in fact, the replSet produces the error on the second node almost immediately after starting it with an empty dbpath and without any queries/inserts).

Here's the command line for the single node:
D:\bin\mongodb-win32-x86_64-2010-08-27\bin\mongod.exe --port 5002 --rest --dbpath .\node2 --master

Here are the command lines for the replSet:
D:\bin\mongodb-win32-x86_64-2010-08-27\bin\mongod.exe --port 5001 --rest --dbpath .\node1 --replSet devset/localhost:5002
D:\bin\mongodb-win32-x86_64-2010-08-27\bin\mongod.exe --port 5002 --rest --dbpath .\node2 --replSet devset/localhost:5001

Comment by Mathias Stearn [ 28/Aug/10 ]

Could you try without repl sets (just a regular --master node)? It looks like repl sets are using a different flush call, so I'd like to take that out of the equation and try to isolate the issue. Also, can you give the full command line you use when starting mongod?

Comment by AndrewK [ 27/Aug/10 ]

Still getting the error:

Fri Aug 27 23:11:33 db version v1.7.0-pre-, pdfile version 4.5
Fri Aug 27 23:11:33 git version: 948b3ad552579f4dfaae50c30d89bad71766f8d0
Fri Aug 27 23:11:33 sys info: windows (6, 1, 7600, 2, '') BOOST_LIB_VERSION=1_42
Fri Aug 27 23:11:33 [initandlisten] waiting for connections on port 5002
Fri Aug 27 23:11:33 [initandlisten] ******
Fri Aug 27 23:11:33 [initandlisten] creating replication oplog of size: 944MB... (use --oplogSize to change)
Fri Aug 27 23:11:34 [initandlisten] ******
Fri Aug 27 23:11:34 [websvr] web admin interface listening on port 6002
Fri Aug 27 23:11:34 [initandlisten] connection accepted from 127.0.0.1:8932 #1
Fri Aug 27 23:11:34 [startReplSets] replSet got config version 2 from a remote, saving locally
Fri Aug 27 23:11:34 [startReplSets] replSet info saving a newer config version to local.system.replset
Fri Aug 27 23:11:34 [initandlisten] connection accepted from 127.0.0.1:8934 #2
Fri Aug 27 23:11:34 [startReplSets] FlushViewOfFile failed 33 file: ./node2/local.1

Comment by AndrewK [ 26/Aug/10 ]

Mathias Stearn: I'm not set up to compile - can you put it into a nightly?

I can get the error pretty reliably every time I start things with a fresh --dbpath

Getting it on an existing --dbpath is a little harder but still happens with time.

Let's assume that the corrupted oplog is a red herring for now. Let's see if we can get rid of those errors and then work from there.

Comment by Mathias Stearn [ 25/Aug/10 ]

AndrewK: could you try with master, or apply that patch to your branch of choice?

Do you see the error within a few seconds every time you launch the server? If so, does it happen if you point the server to a clean --dbpath?

Since flushing is a background operation unrelated to modifying the data, I wouldn't expect it to cause corruption unless Windows does something seriously screwy with mmapped files, so this probably isn't the cause of your oplog issues.

Comment by auto [ 25/Aug/10 ]

Author:

{'login': 'RedBeard0531', 'name': 'Mathias Stearn', 'email': 'mathias@10gen.com'}

Message: make sure only one flush happens at a time SERVER-1163
http://github.com/mongodb/mongo/commit/6dbd8594e56135347fca75f15ba08578cfacbe1a

Comment by AndrewK [ 17/Aug/10 ]

From console:

> db.runCommand('fsync')

{ "numFiles" : 4, "ok" : 1 }

In logs:
Tue Aug 17 11:02:20 [conn21] CMD fsync: sync:1 lock:0

I've tried issuing the command many times and never see the error.

When testing with 1.4.x a while ago, I landed up with a corrupt oplog for no apparent reason. I don't know if this was the cause - there were no server restarts so I will assume that it may have been. I'll try and see if I can land up with a similar situation again.

What worries me is what is causing that error message - is there a chance two threads are trying to modify the file(s)? Can some debugging code be added to see if multiple attempts to flush are occurring?

Lastly, the error appears within seconds of starting the database server. Surely you don't call flush that quickly?

Comment by Mathias Stearn [ 16/Aug/10 ]

The reason it doesn't happen in 1.2.4 is that it was the last release where flush() was a noop on Windows.

Have you noticed any issues related to this other than a warning in the log? If you run db.adminCommand('fsync') do you see any errors? This may be a transient non-issue since we call flush every 60 seconds anyway.

Comment by AndrewK [ 12/Aug/10 ]

Although I'm not the original submitter, please get in touch if you want to work through this - I am eager to do what ever you need in order to help reproduce this.

The version information for the 1.2.4 build, where this error does not appear to be present, is:

db version v1.2.4, pdfile version 4.5
Thu Aug 12 14:53:58 git version: 5cf582d3d96b882c400c33e7670b811ccd47f477

Comment by AndrewK [ 12/Aug/10 ]

Here is the log output from the 3rd node in a 3-node replica set using 1.6.0.

All nodes on same Windows 7 64bit machine (not a VM), local un-raided disk. No anti-virus.
Nodes 1 & 2 are synced with a single collection of 1m objects.
Node 3 has already been configured to be part of the set and has previously synced successfully.

Node 3's data path is cleaned. The node is then restarted with:

mongod.exe --rest --dbpath .\node3 --port 5003 --replSet devset/localhost:5002 --logpath .\node3\log.txt --vvvvv

The attached log (log-1.6.0-replset.1.txt) is the resulting output.

The following appears pretty early on - without any need to generate updates/deletes or anything.
FlushViewOfFile failed 33 file: ./node3/local.1

Sometimes when running node 3, there is no error. However, restarting with a fresh data path a few times almost always generates this error.

Comment by AndrewK [ 12/Aug/10 ]

This is still a problem with 1.4.4 and 1.6.0. I know you'd prefer not to go back and fix 1.4.4 - but this may help in tracing the change. The problem happens on both Windows 7 64bit and Windows Server 2008 R2 64bit. Both instances are with local non-raided drives. I've seen the error happen with both clean data paths and with pre-existing data.

The machines this has been tested on have no anti-virus software and nothing else is accessing the files.

Importantly, it doesn't happen with 1.2.4 - this was the last version I can find where this error doesn't occur.

This output is from 1.6.0. Unfortunately running with --vv doesn't show any additional output around the same time as the error. I've also seen this error happen with the data files.

(from node1)
[startReplSets] FlushViewOfFile failed 33 file: ./node1/local.1

(from node2)
[DataFileSync] FlushViewOfFile failed 33 file: ./node2/local.1

OS: Windows 7 64bit
Thu Aug 12 11:17:50 MongoDB starting : pid=10520 port=5002 dbpath=.\node1 64-bit
Thu Aug 12 11:17:50 db version v1.6.0, pdfile version 4.5
Thu Aug 12 11:17:50 git version: 2c7f164b653f0d703947572ede064aed41cc2185
Thu Aug 12 11:17:50 sys info: windows (6, 1, 7600, 2, '') BOOST_LIB_VERSION=1_42

Comment by Douglas Taft [ 15/Jun/10 ]

We can't reproduce the error. I have tried several configurations, and nothing causes the error. As far as I can tell, nothing has changed in the data or in MongoDB, and yet I can't reproduce the error. I am going to run a few more tests overnight, but at this point it looks like our error has disappeared. I have added a slave, and I will add another slave tomorrow. I will update tomorrow if we are still error free.

Comment by Eliot Horowitz (Inactive) [ 15/Jun/10 ]

I see.
Running with -vv should be sufficient.
It'll log everything.

Comment by Douglas Taft [ 15/Jun/10 ]

I will run it not as a service to get the log.

What I want from the C# client is some kind of log of events. I want to give you as much data about the issue as possible.

Comment by Eliot Horowitz (Inactive) [ 15/Jun/10 ]

Can you run mongo as a normal daemon for now instead of a service?
There are some issues with the service stuff and we haven't had a chance to work on that yet.

Not sure what you mean by a log from the C# client

Comment by Douglas Taft [ 15/Jun/10 ]

I am using the string below to install MongoDB as a service. Unfortunately it is not logging. What am I doing wrong?

G:\MongoDB\bin\mongod.exe --logpath G:\MongoDB\log\log.txt -vv --dbpath G:\MongoDB\db\ --directoryperdb --install

I also tried just the single -v

G:\MongoDB\bin\mongod.exe --logpath G:\MongoDB\log\log.txt -v --dbpath G:\MongoDB\db\ --directoryperdb --install

Also, is there a way to log from the C# MongoDB client?

Comment by Eliot Horowitz (Inactive) [ 14/Jun/10 ]

Can you try again with -vv and send us the full log and when the client had a problem?

Comment by Douglas Taft [ 11/Jun/10 ]

The data volume is on a NetApp SAN, on a relatively new aggregate on new shelves. We have not had any issues with the new disks, shelves, aggregates, or volumes. I had the storage guys comb the logs on both the filers and the systems to ensure the load issue was not disk related.

Comment by Douglas Taft [ 11/Jun/10 ]

Eliot,

Here are the details on the system where MongoDB is running.
OS: Windows 2008 Ent
CPU cores: 8
RAM: 32GB

The data for MongoDB is on its own volume, 100GB.

I will try and rerun this afternoon with -v enabled for logging. The -v did not seem that verbose, though; does -v log all events?

Comment by Eliot Horowitz (Inactive) [ 11/Jun/10 ]

Mongo will retry

Some other people with the same issue

http://groups.google.ie/group/microsoft.public.win32.programmer.kernel/browse_frm/thread/a7a70e5bf111f31f/3053356d049d22bc?hl=en

But I'm pretty sure that issue is harmless

Comment by Matthias Goetzke [ 11/Jun/10 ]

We are also running into occasional FlushViewOfFile errors (code 33); I assume that is the Windows error code for ERROR_LOCK_VIOLATION.

How is mongo dealing with this error? Is it retrying, or could data potentially be lost?

Comment by Eliot Horowitz (Inactive) [ 10/Jun/10 ]

We're still trying to figure out exactly what that means in Windows.
The docs are quite obtuse.

Are you sure you have enough disk space?

Can you try running with mongod -v?

Comment by Douglas Taft [ 08/Jun/10 ]

What does that error mean? Why did the error occur during our load process?

The error from the client was short, "Client failed to connect."

If you have a way for me to produce a more verbose error log I would be more than happy to run our test again.

Comment by Eliot Horowitz (Inactive) [ 08/Jun/10 ]

No - that error shouldn't cause any kind of problem like that.
Can you send the client error?

Comment by Douglas Taft [ 08/Jun/10 ]

Sorry, the database did not shut down. The client receives an error indicating that it is unable to connect to the database. I assumed that the last error in the log was related to the client's issue because that was the only activity against the database at that time.

Comment by Mathias Stearn [ 27/May/10 ]

Is that the last line in the log? Are you sure the db crashed? Usually there is a shutting down message, even on crashes.
