[SERVER-32284] awaitReplication can hang when the optime to wait for does not match the minSnapshot. Created: 12/Dec/17 Updated: 30/Oct/23 Resolved: 18/Jan/18 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 3.7.2 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Daniel Gottlieb (Inactive) | Assignee: | Benety Goh |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | rollback-functional |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Repl 2018-01-01, Repl 2018-01-15, Repl 2018-01-29 |
| Participants: | |
| Linked BF Score: | 0 |
| Description |
|
ReplicationCoordinatorImpl::_awaitReplication_inlock accepts an opTime and a minSnapshot to wait for. The method registers itself on a waiter list for a condition notification and returns successfully once _doneWaitingForReplication_inlock returns true. For that predicate to return true, a valid snapshot must exist at the minSnapshot time. However, the condition variable is notified when _doneWaitingForReplication_inlock succeeds with a trivially true minSnapshot value. Notifying a waiter also removes it from the list of waiters that are notified when optimes advance. The predicate for _awaitReplication_inlock is therefore stronger than the condition for being notified, and because notification happens at most once, a client can hang waiting for a follow-up notification that will never come. |
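
The hang described above can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not the server code: `Waiter`, `Coordinator`, `Outcome`, `runScenario`, and the integer optime/snapshot values are all hypothetical stand-ins, and the condition variable is modelled as a wakeup counter so that the example terminates instead of actually blocking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>

// Hypothetical stand-in for a client blocked in an awaitReplication-style call.
struct Waiter {
    std::uint64_t opTime;       // optime the client waits for
    std::uint64_t minSnapshot;  // snapshot the client also requires (0 = none)
    int wakeups = 0;            // how many times the client was notified
    bool returned = false;      // true once the client's own predicate held at a wakeup
};

// Hypothetical, heavily simplified coordinator (assumption: plain integers
// stand in for optimes and snapshot timestamps).
class Coordinator {
public:
    void addWaiter(Waiter* w) { _waiters.push_back(w); }

    // Full predicate: the optime is reached AND a snapshot exists at minSnapshot.
    bool doneWaiting(std::uint64_t opTime, std::uint64_t minSnapshot) const {
        return _lastApplied >= opTime && _snapshot >= minSnapshot;
    }

    void advanceOpTime(std::uint64_t t) { _lastApplied = t; wakeReadyWaiters(); }
    void advanceSnapshot(std::uint64_t s) { _snapshot = s; wakeReadyWaiters(); }
    std::size_t waiterCount() const { return _waiters.size(); }

private:
    // The notifier checks the predicate with a trivially true minSnapshot (0)
    // and removes every waiter it notifies, so each waiter is woken at most once.
    void wakeReadyWaiters() {
        for (auto it = _waiters.begin(); it != _waiters.end();) {
            Waiter* w = *it;
            if (doneWaiting(w->opTime, 0)) {
                it = _waiters.erase(it);
                ++w->wakeups;
                // The woken client re-checks its own, stronger predicate.
                w->returned = doneWaiting(w->opTime, w->minSnapshot);
            } else {
                ++it;
            }
        }
    }

    std::list<Waiter*> _waiters;
    std::uint64_t _lastApplied = 0;
    std::uint64_t _snapshot = 0;
};

struct Outcome {
    int wakeups;
    bool returned;
    bool predicateTrueAtEnd;
    std::size_t waitersLeft;
};

// The optime reaches 5 first; the snapshot reaches 7 only afterwards.
Outcome runScenario(std::uint64_t minSnapshot) {
    Coordinator coord;
    Waiter client{5, minSnapshot};
    coord.addWaiter(&client);
    assert(coord.waiterCount() == 1);
    coord.advanceOpTime(5);    // the client is notified and dropped from the list here
    coord.advanceSnapshot(7);  // nobody is left on the list to notify
    return Outcome{client.wakeups, client.returned,
                   coord.doneWaiting(client.opTime, client.minSnapshot),
                   coord.waiterCount()};
}
```

In this model, `runScenario(7)` ends with the full predicate satisfied, but the client was woken exactly once, before the snapshot existed, and is no longer on the waiter list, so it would block indefinitely. `runScenario(0)`, where the snapshot requirement is trivially true, returns at the first wakeup.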
| Comments |
| Comment by Benety Goh [ 18/Jan/18 ] |
|
This bug is fixed by removing the minSnapshot logic, which is no longer used, from awaitReplication. |
| Comment by Githook User [ 18/Jan/18 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 18/Jan/18 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Benety Goh [ 17/Jan/18 ] |
|
references to awaitReplicationOfLastOpForClient() were removed in |
| Comment by Githook User [ 17/Jan/18 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 17/Jan/18 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 17/Jan/18 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 17/Jan/18 ] |
|
Author: {'name': 'Benety Goh', 'email': 'benety@mongodb.com', 'username': 'benety'} Message: |
| Comment by Githook User [ 09/Jan/18 ] |
|
Author: {'name': 'Benety Goh', 'username': 'benety', 'email': 'benety@mongodb.com'} Message: |
| Comment by Gregory McKeon (Inactive) [ 09/Jan/18 ] |
|
benety.goh should this be assigned to you? |