- Type: Bug
- Resolution: Won't Fix
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Replication
- None
- Fully Compatible
- ALL
- Repl 2017-02-13, Repl 2017-04-17, Repl 2017-05-08
- 0
In OplogFetcher::_callback(), we check to see if the query response actually has metadata, and set a boolean hasMetadata.
However, later in the function, we unconditionally call shouldStopFetching() and pass it the metadata, whether or not hasMetadata is true.
This results in erroneous behavior. In particular, a node can get stuck in a tight CPU loop when chaining is turned off: the node chooses what it believes to be a primary as its sync source, but immediately afterwards TopologyCoordinatorImpl::shouldChangeSyncSource() returns true, because the metadata config version (which is null) does not match the current config version. The node then picks a sync source again, and the cycle repeats in a tight loop.
You can see the effects of this in the log here:
https://logkeeper.mongodb.org/build/7419231f517600f1d972ad9bc50cb45b/test/582a071abe07c472fe0b4a36
This tight loop appears to be part of the reason why the test suite failed (a replica set lost quorum due to slow heartbeats).
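The missing guard described above can be sketched roughly as follows. This is a minimal illustration and not the actual server code: `ReplMetadata`, `kUninitializedConfigVersion`, `shouldChangeSyncSourceUnguarded()`, `shouldChangeSyncSourceGuarded()`, and `runExample()` are hypothetical stand-ins, and the sentinel value of -1 is an assumption. It only shows why comparing an unset config version against the current one always reports a mismatch, and how checking for the presence of metadata first avoids that.

```cpp
#include <cassert>

// Hypothetical sentinel for "no config version was received" (an assumption).
const long long kUninitializedConfigVersion = -1;

// Hypothetical stand-in for the replication metadata in a query response.
struct ReplMetadata {
    long long configVersion;
};

// Unguarded check: compares versions even when the metadata was never filled in.
// With unset metadata this reports a mismatch on every call, which is the
// condition that drives the re-selection loop described in this ticket.
bool shouldChangeSyncSourceUnguarded(const ReplMetadata& metadata,
                                     long long currentConfigVersion) {
    return metadata.configVersion != currentConfigVersion;
}

// Guarded check: a null pointer means the response carried no metadata, so
// there is no evidence for changing the sync source.
bool shouldChangeSyncSourceGuarded(const ReplMetadata* metadata,
                                   long long currentConfigVersion) {
    if (metadata == nullptr) {
        return false;
    }
    return metadata->configVersion != currentConfigVersion;
}

// Small usage example exercising both variants.
void runExample() {
    const long long currentConfigVersion = 3;
    ReplMetadata unset = {kUninitializedConfigVersion};
    ReplMetadata matching = {3};
    ReplMetadata stale = {2};

    // Unset metadata always looks like a config mismatch when unguarded.
    assert(shouldChangeSyncSourceUnguarded(unset, currentConfigVersion));

    // With the guard, missing metadata never triggers a sync source change.
    assert(!shouldChangeSyncSourceGuarded(nullptr, currentConfigVersion));
    assert(!shouldChangeSyncSourceGuarded(&matching, currentConfigVersion));
    assert(shouldChangeSyncSourceGuarded(&stale, currentConfigVersion));
}
```

Whether missing metadata should mean "keep the current sync source" or be handled some other way is a design decision for the real code; the sketch simply picks the former.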
- related to
  - SERVER-26528 Add additional logging when sync source is changed or cleared (Closed)