[SERVER-30947] checkOplogs function should dump more oplog entries on failure Created: 05/Sep/17 Updated: 30/Oct/23 Resolved: 11/Sep/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 3.4.16, 3.5.13 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | William Schultz (Inactive) | Assignee: | Katherine Walker (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | neweng | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Backwards Compatibility: | Fully Compatible | ||||
| Backport Requested: | v3.4 | ||||
| Sprint: | Repl 2017-09-11, Repl 2017-10-02 | ||||
| Participants: | |||||
| Description |
|
In replsettest.js, the checkOplogs function verifies that the oplogs of all nodes in a replica set match. If two oplogs differ, it currently prints only the last 10 oplog entries of each node to the logs. A test may execute hundreds or thousands of operations, so a limit of 10 entries is somewhat arbitrary and not always helpful when debugging a failure. We should consider raising this limit significantly, to 100 or 1000 entries, or possibly just dumping the entire oplog of each node. This check (hopefully) does not fail often, so when it does, it would be nice to have as much debugging information as possible. |
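The behavior described above can be illustrated with a minimal, self-contained sketch. This is not the actual replsettest.js code: oplogs are modeled as plain in-memory arrays ordered oldest to newest, and the signatures of `dumpOplog` and `checkOplogs` here, as well as the helper `findFirstMismatch`, are assumptions made for illustration. The sketch treats the dump limit as optional, so omitting it dumps the whole oplog.

```javascript
// Hypothetical stand-ins for the test-framework helpers; not the real
// replsettest.js implementation. An "oplog" is a plain array of entries,
// ordered oldest to newest.

// Return the newest `limit` entries, newest first. If `limit` is omitted,
// return the entire oplog.
function dumpOplog(oplog, limit) {
    const count = (limit === undefined) ? oplog.length : limit;
    const start = Math.max(oplog.length - count, 0);
    return oplog.slice(start).reverse();
}

// Return the index of the first entry that differs, or -1 if the oplogs
// match. JSON.stringify is a simplification: it is sensitive to key order.
function findFirstMismatch(oplogA, oplogB) {
    const length = Math.max(oplogA.length, oplogB.length);
    for (let i = 0; i < length; i++) {
        if (JSON.stringify(oplogA[i]) !== JSON.stringify(oplogB[i])) {
            return i;
        }
    }
    return -1;
}

// Compare every node's oplog against the first node's. On a mismatch, throw
// an error whose message includes a dump of each node's oplog.
function checkOplogs(oplogsByNode, limit) {
    const names = Object.keys(oplogsByNode);
    const reference = oplogsByNode[names[0]];
    for (const name of names.slice(1)) {
        const index = findFirstMismatch(reference, oplogsByNode[name]);
        if (index !== -1) {
            const dumps = names.map(
                (n) => n + ": " + JSON.stringify(dumpOplog(oplogsByNode[n], limit)));
            throw new Error("oplog mismatch between " + names[0] + " and " + name +
                            " at index " + index + "\n" + dumps.join("\n"));
        }
    }
}
```

Calling `checkOplogs(oplogsByNode, 10)` would mimic the current 10-entry dump, while `checkOplogs(oplogsByNode)` dumps every entry, which is the option the ticket leans toward.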
| Comments |
| Comment by Githook User [ 17/May/18 ] |
|
Author: {'email': 'katherine.walker@mongodb.com', 'username': 'kvwalker', 'name': 'kvwalker'} Message: This reverts commit 820abe30691f09011183b63ab63cb1e9c43f3d9e. (cherry picked from commit 52bbaa007cd84631d6da811d9a05b59f2dfad4f3) |
| Comment by Ramon Fernandez Marina [ 11/Sep/17 ] |
|
Author: {'username': 'kvwalker', 'name': 'kvwalker', 'email': 'katherine.walker@mongodb.com'} Message: This reverts commit 820abe30691f09011183b63ab63cb1e9c43f3d9e. |
| Comment by Nathan Myers [ 11/Sep/17 ] |
|
I cannot list the BFG tickets for the failures because there is no In lieu of such a list, try Build failures are an exceptionally noisy signal, so there is a chance After backing the patch out, the new failures went away, and other new |
| Comment by William Schultz (Inactive) [ 11/Sep/17 ] |
|
nathan.myers I am also confused by this revert. This commit was a one-line change in our JavaScript test framework. What are the failures you are referring to? |
| Comment by Max Hirschhorn [ 10/Sep/17 ] |
|
Re-opening this ticket since the changes were reverted.
nathan.myers, given that ReplSetTest#checkOplogs() is a function that helps ensure consistency of the oplog across a replica set, and Katherine's change simply increased the number of oplog entries dumped as context upon failure, I find it unlikely that the changes from 1baf806 are responsible. Could you provide a link to the Evergreen failures you observed, so we can figure out whether another recent commit to mongodb/mongo could be responsible? |
| Comment by Ramon Fernandez Marina [ 10/Sep/17 ] |
|
Author: {'username': 'nathan-myers-mongo', 'name': 'Nathan Myers', 'email': 'ncm@cantrip.org'} Message: Revert " This reverts commit 1baf806e71f2d4d2710b9c818b3f954557c4ad16. |
| Comment by Ramon Fernandez Marina [ 08/Sep/17 ] |
|
Author: {'username': 'kvwalker', 'name': 'kvwalker', 'email': 'katherine.walker@10gen.com'} Message: |
| Comment by Spencer Brody (Inactive) [ 05/Sep/17 ] |
|
Yeah, we should probably just remove the limit from the dumpOplog function and always print the whole thing. |