[SERVER-7493] Possible for read starvation to cause migration to get stuck in critical section |
| Created: 27/Oct/12 | Updated: 11/Jul/16 | Resolved: 16/Nov/12 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | 2.2.0 |
| Fix Version/s: | 2.2.2, 2.3.1 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Spencer Brody (Inactive) | Assignee: | Spencer Brody (Inactive) |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Participants: | |
| Description |
|
If a migration aborts, it calls the done() method on MigrateFromStatus, which is what takes that server out of the critical section. That method, however, tries to acquire the read lock on the database in which the migration is taking place. While the server is in the critical section, all requests on that collection hang in setShardVersion, which waits for the server to leave the critical section, and setShardVersion takes the database's write lock. So if many queries are coming in to that namespace on many different threads, the setShardVersion commands can starve the read-lock request made by done(), preventing the migration from ever finishing. The proposed fix is to change MigrateFromStatus::done to use a write lock rather than a read lock so that the lock acquisition will be greedy. |
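The starvation described above can be illustrated with a toy model. The sketch below is not MongoDB's lock implementation; it assumes a writer-preferring lock in which pending writers are served in FIFO order, each holder keeps the lock for one tick, and new writers (standing in for setShardVersion callers) arrive on every tick. The function name `ticks_until_granted` and all of its parameters are hypothetical, chosen only for this illustration.

```python
from collections import deque


def ticks_until_granted(mode, backlog, arrivals_per_tick, max_ticks=1000):
    """Simulate a toy writer-preferring lock (an assumed model, not MongoDB's).

    mode: "r" or "w", the lock mode requested by the tracked caller
          (standing in for MigrateFromStatus::done).
    backlog: writers already queued when the tracked caller arrives.
    arrivals_per_tick: new writers that join the queue on every tick.
    Returns the tick on which the tracked caller gets the lock, or None
    if it is still waiting after max_ticks.
    """
    if mode not in ("r", "w"):
        raise ValueError("mode must be 'r' or 'w'")

    writers = deque(["other"] * backlog)  # FIFO queue of pending writers
    reader_waiting = mode == "r"
    if mode == "w":
        writers.append("tracked")  # queues behind the existing backlog

    for tick in range(max_ticks):
        if writers:
            # Writer preference: any pending writer goes before readers.
            if writers.popleft() == "tracked":
                return tick
        elif reader_waiting:
            # The reader only runs once no writer is pending.
            return tick
        # The holder releases after one tick; more writers arrive
        # before the next grant decision.
        writers.extend(["other"] * arrivals_per_tick)
    return None


if __name__ == "__main__":
    print("read lock, steady writers: ", ticks_until_granted("r", 3, 1))
    print("write lock, steady writers:", ticks_until_granted("w", 3, 1))
    print("read lock, writers stop:   ", ticks_until_granted("r", 3, 0))
```

Under these assumptions, a read request never gets through while at least one writer arrives per tick, whereas a write request only has to wait out the backlog that was queued ahead of it. That is the intuition behind making the acquisition in done() greedy.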
| Comments |
| Comment by auto [ 16/Nov/12 ] |
|
Author: Spencer T Brody <spencer@10gen.com> Date: 2012-11-05T19:19:11Z Message: Use global lock when exiting critical section because it is greedier. Also add verbose logging around exiting critical section. |
| Comment by Spencer Brody (Inactive) [ 27/Oct/12 ] |
|
https://github.com/mongodb/mongo/commit/4b50937dd119852b6c076902b748286b50306401 |