[SERVER-25376] Add a hook to check that oplogs in a replset do not diverge Created: 01/Aug/16 Updated: 21/Jul/17 Resolved: 30/Sep/16 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Testing Infrastructure |
| Affects Version/s: | 3.3.10 |
| Fix Version/s: | 3.4.0-rc0 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Robert Guo (Inactive) | Assignee: | Jonathan Abrahams |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||
| Sprint: | TIG 2016-10-10 | ||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
Add a hook, or modify the dbhash hook, to check that the oplog matches across replica set members after calling ReplSetTest.awaitReplication(). This can't be done with a simple dbhash comparison because the oplog is a capped collection and the number of documents may differ between members. We will need to traverse the oplogs and verify that they share a common history by comparing entries one by one. |
| Comments |
| Comment by Githook User [ 02/Oct/16 ] |
|
Author: Jonathan Abrahams (hptabster) <jonathan@mongodb.com> Message: |
| Comment by Githook User [ 30/Sep/16 ] |
|
Author: Jonathan Abrahams (hptabster) <jonathan@mongodb.com> Message: |
| Comment by Jonathan Abrahams [ 22/Sep/16 ] |
|
The proposed logic is to traverse backwards through the oplog on the primary, using conn.getDB('local').getCollection('oplog.rs').find().sort({$natural: -1}), and compare each document with the corresponding document on a secondary until the cursor is exhausted on either node or a pair of documents fails to match. If there is a mismatch, dump the oplog entries surrounding the failure from both nodes. |
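A minimal sketch of the comparison described above, not the committed hook. It assumes both oplogs have already been read into arrays, newest entry first (as find().sort({$natural: -1}).toArray() would return them), and that matching entries serialize identically. The helper name findOplogDivergence, the context parameter, and the simplified entries with numeric ts values are all hypothetical and used only for illustration.

```javascript
// Hypothetical helper: compares two oplog snapshots, newest entry first.
// Returns null if the overlapping range matches, otherwise the index of the
// first mismatch plus the surrounding entries from both nodes.
function findOplogDivergence(primaryEntries, secondaryEntries, context = 2) {
    // Stop when either side is exhausted; the oplog is capped, so lengths may differ.
    const limit = Math.min(primaryEntries.length, secondaryEntries.length);
    for (let i = 0; i < limit; i++) {
        // Assumption: identical entries produce identical JSON (same field order).
        if (JSON.stringify(primaryEntries[i]) !== JSON.stringify(secondaryEntries[i])) {
            const start = Math.max(0, i - context);
            const end = i + context + 1;
            return {
                index: i,
                primary: primaryEntries.slice(start, end),
                secondary: secondaryEntries.slice(start, end),
            };
        }
    }
    return null;
}

// Simplified stand-ins for oplog documents (real entries carry Timestamp values).
const primary = [{ts: 5, op: "i"}, {ts: 4, op: "u"}, {ts: 3, op: "i"}, {ts: 2, op: "i"}];
const secondaryOk = [{ts: 5, op: "i"}, {ts: 4, op: "u"}, {ts: 3, op: "i"}];
const secondaryBad = [{ts: 5, op: "i"}, {ts: 4, op: "d"}, {ts: 3, op: "i"}];

console.log(findOplogDivergence(primary, secondaryOk));
console.log(findOplogDivergence(primary, secondaryBad, 1));
```

Comparing only the overlapping range reflects the point in the description that the two capped collections may hold different numbers of documents; a real hook would read the entries from cursors on each node rather than from in-memory arrays.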
| Comment by Judah Schvimer [ 01/Aug/16 ] |
|
We should add this hook to the initial sync passthrough as well, after we wait for the initial sync node to get into secondary mode but before the CleanEveryN hook runs.