[SERVER-47071] CheckReplOplogs can fail to detect a mismatch Created: 24/Mar/20  Updated: 29/Oct/23  Resolved: 10/Apr/20

Status: Closed
Project: Core Server
Component/s: Replication, Testing Infrastructure
Affects Version/s: None
Fix Version/s: 4.0.19, 4.2.7, 4.4.0-rc3, 4.7.0

Type: Bug
Priority: Major - P3
Reporter: Judah Schvimer
Assignee: Lingzhi Deng
Resolution: Fixed
Votes: 2
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Problem/Incident
is caused by SERVER-37042 Handle exceptions from cursor.next in... Closed
Related
related to WT-5849 Data mismatch detected in agg_match.j... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested: v4.4, v4.2, v4.0
Sprint: Repl 2020-04-20
Participants:

 Description   

As this comment from max.hirschhorn shows, CheckReplOplogs did not fail when a node was missing an oplog entry. We should investigate why and fix the hook so that it catches that case.



 Comments   
Comment by Githook User [ 23/Apr/20 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-47071: skipCheckDBHashes for seed_secondary_without_sessions_table.js
Branch: v4.0
https://github.com/mongodb/mongo/commit/7bffc86bd02fb992f14bcce7bf686ca67adbf6a7

Comment by Githook User [ 23/Apr/20 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-47071: Fix CheckReplOplogs hook

(cherry picked from commit 1700c42daa8b88c17ca49814b7752b9a1b14d3db)
(cherry picked from commit 560ee7f0ac9407c37989d2b00cabbc27e183edbc)
Branch: v4.0
https://github.com/mongodb/mongo/commit/ef971dc0d15f1829081245fdb9986ea6040d1d81

Comment by Githook User [ 23/Apr/20 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-47071: Fix CheckReplOplogs hook

(cherry picked from commit 1700c42daa8b88c17ca49814b7752b9a1b14d3db)
(cherry picked from commit 560ee7f0ac9407c37989d2b00cabbc27e183edbc)
Branch: v4.2
https://github.com/mongodb/mongo/commit/a7206698e24a77abdcf2d1edae718a822a947d98

Comment by Githook User [ 23/Apr/20 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-47071: Fix TypeError after the CheckReplOplogs hook fails

(cherry picked from commit 560ee7f0ac9407c37989d2b00cabbc27e183edbc)
Branch: v4.4
https://github.com/mongodb/mongo/commit/c3b47f6e122ae36ff297944a9d90ab9f1151ec17

Comment by Githook User [ 23/Apr/20 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-47071: Fix CheckReplOplogs hook

(cherry picked from commit 1700c42daa8b88c17ca49814b7752b9a1b14d3db)
Branch: v4.4
https://github.com/mongodb/mongo/commit/fd3180b60cba3d28b651b7b2d69e70f8b7e2931d

Comment by Githook User [ 09/Apr/20 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-47071: Fix TypeError after the CheckReplOplogs hook fails
Branch: master
https://github.com/mongodb/mongo/commit/560ee7f0ac9407c37989d2b00cabbc27e183edbc

Comment by Lingzhi Deng [ 09/Apr/20 ]

Found another issue: if the hook fails, "this" inside assertOplogEntriesEq is undefined because of how we call the function. I think we should use assertOplogEntriesEq.call instead. This bug doesn't affect the oplog checks themselves, but it prevents the hook from dumping the oplog when a check fails. I am reopening this ticket for the fix.
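
To illustrate the "this" problem described above, here is a minimal sketch. It is not the actual hook code: assertEntriesEq, checker, and dumpOplog are hypothetical stand-ins, and the function is put in strict mode on the assumption that "this" is undefined (rather than the global object) on a plain call.

```javascript
// Hypothetical stand-in for the comparison function; not the real hook code.
// On a mismatch it relies on `this` to dump diagnostics before throwing.
function assertEntriesEq(a, b) {
    "use strict";  // assumption: with strict mode, a plain call leaves `this` undefined
    if (a.ts !== b.ts) {
        this.dumpOplog();
        throw new Error("oplog mismatch");
    }
}

// Hypothetical object that owns the diagnostic helper.
const checker = {
    dumped: false,
    dumpOplog: function() {
        this.dumped = true;
    },
};

// Plain call: `this` is undefined, so the diagnostic step itself throws a
// TypeError and the oplog is never dumped.
let plainResult = "no error";
try {
    assertEntriesEq({ts: 1}, {ts: 2});
} catch (e) {
    plainResult = e.constructor.name;
}

// Using .call binds `this` explicitly, so the dump runs and the intended
// mismatch error is the one that surfaces.
let callResult = "no error";
try {
    assertEntriesEq.call(checker, {ts: 1}, {ts: 2});
} catch (e) {
    callResult = e.message;
}
```

In this sketch the mismatch is detected either way; only the diagnostic step depends on "this", which is consistent with the note that the oplog checks themselves are unaffected.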

Comment by Githook User [ 08/Apr/20 ]

Author:

{'name': 'Lingzhi Deng', 'email': 'lingzhi.deng@mongodb.com', 'username': 'ldennis'}

Message: SERVER-47071: Fix CheckReplOplogs hook
Branch: master
https://github.com/mongodb/mongo/commit/1700c42daa8b88c17ca49814b7752b9a1b14d3db

Comment by Lingzhi Deng [ 08/Apr/20 ]

This was caused by SERVER-37042. OplogReader.next() and OplogReader.hasNext() are both missing a return statement, so they effectively return undefined. Because undefined is falsy, every subsequent hasNext() check evaluates as false, and the entire oplog check is skipped.
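
The failure mode described above can be reproduced with a small sketch. This is not the real OplogReader: FakeCursor, makeBuggyReader, makeFixedReader, and countCompared are hypothetical names that only show how a wrapper without a return statement silently skips a comparison loop.

```javascript
// Hypothetical stand-in for a cursor over oplog entries.
class FakeCursor {
    constructor(docs) {
        this.docs = docs;
        this.pos = 0;
    }
    hasNext() {
        return this.pos < this.docs.length;
    }
    next() {
        return this.docs[this.pos++];
    }
}

// Buggy wrapper: delegates to the cursor but forgets to return the result,
// so both methods evaluate to undefined.
function makeBuggyReader(cursor) {
    return {
        hasNext: function() {
            cursor.hasNext();
        },
        next: function() {
            cursor.next();
        },
    };
}

// Fixed wrapper: identical except that the results are returned.
function makeFixedReader(cursor) {
    return {
        hasNext: function() {
            return cursor.hasNext();
        },
        next: function() {
            return cursor.next();
        },
    };
}

// Counts how many entries a check loop would actually visit.
function countCompared(reader) {
    let visited = 0;
    while (reader.hasNext()) {
        reader.next();
        visited++;
    }
    return visited;
}

const entries = [{ts: 1}, {ts: 2}, {ts: 3}];
const buggyCount = countCompared(makeBuggyReader(new FakeCursor(entries)));
const fixedCount = countCompared(makeFixedReader(new FakeCursor(entries)));
```

With the buggy wrapper the loop body never runs, so no entries are visited and nothing can be reported as a mismatch; with the fixed wrapper all three entries are visited.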

Generated at Thu Feb 08 05:13:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.