[SERVER-48675] Simple 4.4.0-rc7 repset test - restarted member not catching up on oplog [or count drift] Created: 09/Jun/20 Updated: 13/May/21 Resolved: 13/May/21 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | 4.4.0-rc7 |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Paul Done | Assignee: | Bruce Lucas (Inactive) |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Ubuntu 20.04 |
||
| Operating System: | ALL | ||||
| Steps To Reproduce: | Follow the process at https://github.com/pkdone/MongoDB-AUTO-HA with these changes: (1) In monitor.sh, replace .countDocuments({}) with the old method that was being used: .count() (2) Increase the rate of document insertion by reducing the sleep in insert.py from 0.025s to 0.005s, so that the count variance appears sooner and is more obvious (3) Ensure the laptop/workstation is under load; you may need to kill three or four primaries before witnessing the behaviour |
||||
| Participants: | |||||
| Description |
|
A simple replica set test with 3 mongod servers in one replica set works fine on 4.2.6 and on many earlier versions of MongoDB over the years. I just tried it on 4.4.0-rc7: when a primary is killed and then restarted, it does not seem to catch up on the oplog. To reproduce, follow this process: https://github.com/pkdone/MongoDB-AUTO-HA More info to follow below |
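For context on the count variance mentioned in the steps to reproduce: the metadata-based count and the exact count are different operations, and the metadata-based one is documented as potentially inaccurate after an unclean shutdown. The following is a minimal sketch, not the repository's monitor script. It assumes a PyMongo-style collection object, uses `estimated_document_count()` as the closest current equivalent of the old `.count()`, and the helper name `count_drift` is hypothetical. Whether this difference explains the behaviour reported here is not established by this ticket.

```python
def count_drift(collection):
    """Return (estimated, exact, drift) for a PyMongo-style collection.

    Hypothetical helper for illustration only; `collection` is assumed to
    expose estimated_document_count() and count_documents(filter).
    """
    # Metadata-based count: fast, but may be stale after an unclean shutdown.
    estimated = collection.estimated_document_count()
    # Exact count: runs a query over all documents matching the empty filter.
    exact = collection.count_documents({})
    return estimated, exact, exact - estimated


# Example usage (needs a reachable replica set; the URI, database and
# collection names below are placeholders, not taken from the ticket):
#   from pymongo import MongoClient
#   coll = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")["test"]["records"]
#   print(count_drift(coll))
```

A non-zero drift on a restarted member while the exact counts agree across members would point at the counting method rather than at missing oplog entries.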
| Comments |
| Comment by Daniel Pasette (Inactive) [ 10/Jun/20 ] |
|
Wonder if this is just a side effect of “replicate before journaling” |
| Comment by Paul Done [ 09/Jun/20 ] |
|
As per last comment - probably works as intended |
| Comment by Bruce Lucas (Inactive) [ 09/Jun/20 ] |
|
paul.done, can you please attach log files and FTDC (diagnostic.data) for the whole replica set covering a test, along with a timeline: when did you start the test, and when did you do the node restart? |