[SERVER-6302] Race condition when multiple fsyncLock() invocations made on a mongod instance. Created: 04/Jul/12  Updated: 17/May/19  Resolved: 02/Nov/16

Status: Closed
Project: Core Server
Component/s: Concurrency
Affects Version/s: 2.1.2
Fix Version/s: 3.4.0-rc3

Type: Bug Priority: Critical - P2
Reporter: Tyler Brock Assignee: James Wahlin
Resolution: Done Votes: 4
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by SERVER-7536 fsyncUnlock should indicate how many ... Closed
is duplicated by SERVER-9349 Race condition when fsyncLocking and ... Closed
is duplicated by SERVER-11119 mongod connections hang because of as... Closed
Related
related to SERVER-1423 reads often aren't possible while in ... Closed
is related to SERVER-7536 fsyncUnlock should indicate how many ... Closed
is related to DOCS-9265 3.4: Clarify multiple fsyncLock behav... Closed
is related to SERVER-26875 Create FSM test that exercises concur... Closed
Backwards Compatibility: Fully Compatible
Sprint: Integration 2016-10-31, Query 2016-11-21

 Description   

There is currently no specification for what should happen when multiple agents attempt to fsyncLock() a mongod instance, and the observed behavior is confusing. From a single mongo shell:

> db.runCommand({fsync: 1, lock: true})
{
	"info" : "now locked against writes, use db.fsyncUnlock() to unlock",
	"seeAlso" : "http://dochub.mongodb.org/core/fsynccommand",
	"ok" : 1
}
> db.runCommand({fsync: 1, lock: true})
{
	"info" : "now locked against writes, use db.fsyncUnlock() to unlock",
	"seeAlso" : "http://dochub.mongodb.org/core/fsynccommand",
	"ok" : 1
}
> db.fsyncUnlock()
{ "ok" : 1, "info" : "unlock completed" }
> db.fsyncUnlock()
{ "ok" : 1, "info" : "unlock completed" }
> db.fsyncUnlock()

At first blush, this makes the fsync lock appear to be a shared, recursive resource in which every acquisition must be paired with a release. However, this is misleading. What actually happens is that the second invocation reports success while the first acquirer still holds the lock, and on mongod the second lock request is merely queued. When the first acquisition is released with fsyncUnlock(), the pending request is processed and the fsync lock is re-acquired. Between that release and re-acquisition there is a brief window during which the fsync lock is not held at all, so writes can slip through.

We should:

  1. Fix the race condition, keeping the data files locked read-only until all clients have unlocked (see the sketch after this list).
  2. Make the lock state clearer to clients, so that the response message provides the information needed to understand the current state.
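
A minimal sketch of the reference-counting approach, written as shell-style JavaScript purely for illustration (FsyncLockState and its methods are hypothetical names, not mongod internals):

// Hypothetical sketch, not server code: writes stay blocked from the
// first lock() until the last paired unlock(), so the unheld window
// described above never occurs.
function FsyncLockState() {
    this.lockCount = 0;
}

FsyncLockState.prototype._flushAndBlockWrites = function() {
    // placeholder: fsync the data files and take a global write block
};

FsyncLockState.prototype._allowWrites = function() {
    // placeholder: release the global write block
};

FsyncLockState.prototype.lock = function() {
    if (this.lockCount === 0) {
        this._flushAndBlockWrites(); // first locker blocks writes
    }
    return ++this.lockCount; // returned so clients can see the state
};

FsyncLockState.prototype.unlock = function() {
    if (this.lockCount === 0) {
        throw new Error("fsyncUnlock called when not locked");
    }
    if (--this.lockCount === 0) {
        this._allowWrites(); // last unlocker lets writes resume
    }
    return this.lockCount;
};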


 Comments   
Comment by J Rassi [ 04/Nov/16 ]

david.golden and I spoke in person. While we disagree about the exact standard server changes should be held to before flagging them to downstream consumers in general, we both agree that the bar for making unflagged user-visible changes should be much higher during the server RC process than during the normal development cycle. Most drivers have already shipped their 3.4-ready release candidates, and there is a real time/stress cost to doing last-minute releases to work around such changes from the server.

Comment by James Wahlin [ 04/Nov/16 ]

I will send out a notification email. This ticket also adds something that allows for better testing: we added a "lockCount" field to both the fsyncLock and fsyncUnlock response messages, which makes it possible to evaluate lock state in the presence of multiple clients performing locks and unlocks.

The following demonstrates fsyncLock and fsyncUnlock response changes between MongoDB 3.2 and 3.4.

3.2.10:

replset:PRIMARY> db.fsyncLock()
{
    "info" : "now locked against writes, use db.fsyncUnlock() to unlock",
    "seeAlso" : "http://dochub.mongodb.org/core/fsynccommand",
    "ok" : 1
}
replset:PRIMARY> db.fsyncLock()
{
    "info" : "now locked against writes, use db.fsyncUnlock() to unlock",
    "seeAlso" : "http://dochub.mongodb.org/core/fsynccommand",
    "ok" : 1
}
replset:PRIMARY> db.fsyncUnlock()
{ "info" : "unlock completed", "ok" : 1 }
replset:PRIMARY> db.fsyncUnlock()
{ "info" : "unlock completed", "ok" : 1 }
replset:PRIMARY> db.fsyncUnlock()
{ "ok" : 0, "errmsg" : "not locked" }

3.4.0-rc3:

> db.fsyncLock();
{
    "info" : "now locked against writes, use db.fsyncUnlock() to unlock",
    "lockCount" : NumberLong(1),
    "seeAlso" : "http://dochub.mongodb.org/core/fsynccommand",
    "ok" : 1
}
> db.fsyncLock();
{
    "info" : "now locked against writes, use db.fsyncUnlock() to unlock",
    "lockCount" : NumberLong(2),
    "seeAlso" : "http://dochub.mongodb.org/core/fsynccommand",
    "ok" : 1
}
> db.fsyncUnlock();
{ "info" : "fsyncUnlock completed", "lockCount" : NumberLong(1), "ok" : 1 }
> db.fsyncUnlock();
{ "info" : "fsyncUnlock completed", "lockCount" : NumberLong(0), "ok" : 1 }
> db.fsyncUnlock();
{ "ok" : 0, "errmsg" : "fsyncUnlock called when not locked" }

Comment by David Golden [ 04/Nov/16 ]

While it's nice to think that the "answer" is for drivers not to depend on fields other than "ok" and "code", end users may also be depending on specific messages from the server, particularly errmsg, either in application code or in log monitoring.

Anything the server sends out is effectively part of its API and changes should be judicious, really necessary and flagged.

For reference, consider the Perl interpreter itself (which I'm just a bit familiar with). Release notes always include a list of changes to diagnostics because the maintainers have learned over decades that users always wind up looking for specific output strings (even if they shouldn't) and complain when their code breaks.

If the volume of uassert changes is high, maybe batching them is the answer.

Comment by J Rassi [ 04/Nov/16 ]

Personally, I don't agree that "Driver Changes: Maybe" and/or emails to downstream-changes should happen for tickets which don't affect the "code" or "ok" fields for a command response.

Consider the scenario where an email to downstream-changes were sent for every resolved ticket containing a commit that changes the value of "errmsg" in some command response. There are >1000 calls to the uassert() macro currently in master in the server repository; almost every uassert() call contains a unique error message, and I believe that >95% of these calls can bubble up to the command execution framework (causing the verbatim error message to be sent back to the user). I believe that hundreds of these calls are added and removed in every server release, and, if such an email were sent for each one, the downstream-changes list would turn into a spam list. Furthermore, the quality of informational and error messages in the server certainly needs improvement, and I would love to keep the barrier low for server developers to improve them.

Instead, I think that drivers should not depend on fields from the command response outside of "ok" and "code", except in rare circumstances.
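
To illustrate (a hedged mongo shell sketch of that guidance, not existing driver code):

var res = db.runCommand({fsyncUnlock: 1});
if (res.ok === 1) {
    // success: treat res.info as human-readable only; its wording may change
    print("unlocked");
} else {
    // branch on the numeric res.code when present, never on the errmsg text
    print("unlock failed" + (res.code !== undefined ? ", code " + res.code : ""));
}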

Comment by David Golden [ 04/Nov/16 ]

This commit broke tests for all versions of the Perl driver by changing the response message for "unlock completed".

Please make sure that changes to database responses get flagged at least "Driver Changes Maybe" and notify the "downstream-changes" user list.

Comment by Githook User [ 02/Nov/16 ]

Author:

James Wahlin <james.wahlin@10gen.com> (jameswahlin)

Message: SERVER-6302 Fix multiple fsyncLock() invocation race condition
Branch: master
https://github.com/mongodb/mongo/commit/fd16deb6dd3d08756f15c181facc707cb53f4e15

Comment by Hiroaki [ 11/Sep/14 ]

This seems like one of the most important safety and stability issues for a database product.
Why has it been neglected for more than a year!?

Comment by Thomas Rueckstiess [ 21/Jun/13 ]

As a work-around for automated scripts, currentOp can be inspected to determine whether the node is already locked, for example:

$ locked=`mongo --quiet --eval "printjson(db.currentOp().fsyncLock)"`
$ echo $locked
true

Note that in the unlocked case, $locked is "undefined", not "false".

A script should not proceed if the lock is already taken, and should instead wait and retry after some time.
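
A hedged sketch of that wait/retry, moved into the mongo shell for brevity (the 5-second interval is an arbitrary choice):

// Poll until no fsync lock is held. Note that this check-then-act is
// itself racy between clients, which is part of what this ticket covers.
while (db.currentOp().fsyncLock === true) {
    print("fsync lock already held; retrying in 5 seconds");
    sleep(5000); // mongo shell sleep() takes milliseconds
}
// no fsync lock held at this point; safe to proceed with db.fsyncLock()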
