[SERVER-5533] fsync and lock on a Secondary causes the shell to freeze/hang - worse with auth enabled
Created: 06/Apr/12 | Updated: 23/Feb/15 | Resolved: 11/Jun/12
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Shell |
| Affects Version/s: | 2.0.3, 2.0.4, 2.1.1 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Adam Comerford | Assignee: | Andy Schwerin |
| Resolution: | Duplicate | Votes: | 1 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Environment: | mongod 2.0.4 - Mac OS X test env |
| Operating System: | ALL |
| Description |
Originally reported here: https://groups.google.com/forum/?fromgroups#!topic/mongodb-user/7o_bYPUJuSU

This has been reproduced with and without authentication. The keyfile/auth version of the bug is much more painful because there is no way to remove the lock and get out of the bad state short of sending kill -9 to mongod. In a non-authenticated setup it is possible to issue the unlock command. Repro steps below.
| Comments |
| Comment by Andy Schwerin [ 14/May/12 ] |

QLock improvements did not resolve this bug. Looks like it needs more direct examination.
| Comment by auto [ 09/May/12 ] |

Author: Andy Schwerin (andy10gen, schwerin@10gen.com)
Message: Fix remap private view memory leak and formalize w->X upgrade process. Makes the upgrade R_to_W() block until it succeeds. Replaces "runExclusively" with a new "X" state, which is equivalent to "W", but Because X_to_w() is effectively a barrier, we use generation counters to make Use of w_to_X() is wrapped in the Lock::DBWrite::UpgradeToExclusive() guard Decision making about which condition variables to notify is more fully May help This patch also introduces a directed test of the w->X functionality,
| Comment by Adam Comerford [ 06/Apr/12 ] |

Updated affects version - reproduced on 2.1.1
| Comment by Adam Comerford [ 06/Apr/12 ] |

To reproduce: Create a replica set, similar to this one:
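A minimal sketch of such a set, assuming three local members with the third one hidden. The set name, hosts, ports, and the helper `buildReplSetConfig` are hypothetical and are not taken from the report:

```javascript
// Hypothetical helper: build a replica set config in which one member is
// hidden. A hidden member must also have priority 0 so it cannot become primary.
function buildReplSetConfig(setName, hosts, hiddenIndex) {
  var members = hosts.map(function (host, i) {
    var member = { _id: i, host: host };
    if (i === hiddenIndex) {
      member.priority = 0;
      member.hidden = true;
    }
    return member;
  });
  return { _id: setName, members: members };
}

// Assumed set name and ports; adjust to the local test environment.
var cfg = buildReplSetConfig(
  "testSet",
  ["localhost:27017", "localhost:27018", "localhost:27019"],
  2
);

// In a mongo shell connected to the first member:
//   rs.initiate(cfg)
```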
Start mongod with the --keyFile option to enable auth. I have created the following user in the tests:
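As a rough sketch of the user creation step, assuming the 2.0-era shell helper `db.addUser(username, password)` run against the `admin` database. The wrapper name `createAdminUser` and the credentials are placeholders, not the values used in the report:

```javascript
// Hypothetical wrapper around the 2.0-era shell helper db.addUser().
// adminDb is expected to be a handle on the "admin" database.
function createAdminUser(adminDb, username, password) {
  return adminDb.addUser(username, password);
}

// In a mongo shell connected to the primary (placeholder credentials):
//   createAdminUser(db.getSiblingDB("admin"), "admin", "secret")
```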
Now, fsync and lock the hidden secondary, as if for a snapshot:
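A minimal sketch of the lock step, assuming it is issued as the `fsync` command with the `lock` option against the `admin` database. The function name `fsyncAndLock` and the credentials in the usage comment are placeholders:

```javascript
// Sketch: flush pending writes to disk and block further writes.
// adminDb is expected to be a handle on the "admin" database of the
// hidden secondary.
function fsyncAndLock(adminDb) {
  return adminDb.runCommand({ fsync: 1, lock: true });
}

// In a mongo shell connected to the hidden secondary:
//   var adminDb = db.getSiblingDB("admin");
//   adminDb.auth("admin", "secret");   // placeholder credentials
//   printjson(fsyncAndLock(adminDb));
```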
Disconnect and reconnect. Try auto-completing db. commands; this is usually enough to cause the hang. If not, run "show dbs". The shell simply stops responding, and Ctrl-C and similar have no effect. Once it hangs, a standard shell cannot be connected. You can still get a shell as follows; however, once you try to auth (in order to unlock), it freezes again:
In terms of the mongod process, even Ctrl-C/SIGINT can't break out:
Finally, it is possible to recreate the hang with authentication disabled. Thankfully, with authentication disabled, it is still possible to run the unlock command and restore functionality.
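For reference, a minimal sketch of the unlock step, assuming the 2.0-era mechanism of querying the `$cmd.sys.unlock` pseudo-collection in the `admin` database. The function name `fsyncUnlock` is a placeholder chosen for this sketch:

```javascript
// Sketch: release a lock taken with { fsync: 1, lock: true }.
// In 2.0-era servers the unlock is issued as a query against the
// admin.$cmd.sys.unlock pseudo-collection.
// adminDb is expected to be a handle on the "admin" database.
function fsyncUnlock(adminDb) {
  return adminDb.getCollection("$cmd.sys.unlock").findOne();
}

// In a mongo shell connected to the locked secondary:
//   printjson(fsyncUnlock(db.getSiblingDB("admin")));
```

With authentication enabled this does not help in the scenario described, since the shell freezes at the auth step before the unlock can be sent.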