[SERVER-47284] Log oplog entries from all nodes after a test failure Created: 02/Apr/20  Updated: 06/Dec/22  Resolved: 21/Apr/20

Status: Closed
Project: Core Server
Component/s: Replication, Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Task
Priority: Major - P3
Reporter: Samyukta Lanka
Assignee: Backlog - Server Tooling and Methods (STM) (Inactive)
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Server Tooling & Methods
Participants:

 Description   

There are many test failures where it would help the replication team to diagnose what is going on if we could see the contents of each node's oplog.

One option is that we could log the last 100 oplog entries for each node when a test fails, similar to what happens with a dbhash mismatch.
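
A rough sketch of what such a dump could look like is below. It is illustrative only and is not an existing hook in the test infrastructure. It assumes `nodes` is an array of mongo shell connections (for example, the `nodes` array of a ReplSetTest instance); `dumpRecentOplogEntries` and the `log` callback are hypothetical names introduced for this example. A node that cannot be reached is reported and skipped, so one dead node does not prevent dumping the others.

```javascript
// Sketch only: `nodes` is assumed to be an array of mongo shell connections,
// and `log` is a hypothetical output callback (e.g. `print` in the mongo shell).
function dumpRecentOplogEntries(nodes, log, limit = 100) {
    for (const node of nodes) {
        let entries;
        try {
            // The oplog of a replica set member lives in local.oplog.rs.
            entries = node.getDB("local")
                          .getCollection("oplog.rs")
                          .find()
                          .sort({$natural: -1})  // newest entries first
                          .limit(limit)
                          .toArray();
        } catch (e) {
            // The node may already be down after the failure; keep going.
            log(`Could not read oplog from ${node.host}: ${e}`);
            continue;
        }
        log(`Last ${entries.length} oplog entries from ${node.host}:`);
        for (const entry of entries) {
            // In the mongo shell, tojson(entry) would render BSON types more faithfully.
            log(JSON.stringify(entry));
        }
    }
}
```

Under these assumptions, a test harness could call something like `dumpRecentOplogEntries(rst.nodes, print)` from its failure path, where `rst` is the ReplSetTest instance that spawned the set.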



 Comments   
Comment by Samyukta Lanka [ 21/Apr/20 ]

I don't have specific BFs in mind at the moment. I'm going to close this ticket for now, but I'll reopen it when we come across a BF where having access to the contents of the oplog could have helped us diagnose it.

Comment by Brooke Miller [ 14/Apr/20 ]

Hey samy.lanka, any update on the above question? Are there any BFs that we can add to this to show as an example so that we can investigate this?

Comment by Max Hirschhorn [ 02/Apr/20 ]

There are many test failures where it would help the replication team to diagnose what is going on if we could see the contents of each node's oplog.

samy.lanka, are these test failures concentrated in particular test files or test suites where it'd be possible to enable capturing data files? Trying to run more JavaScript code within a mongo shell which spawned a replica set using ReplSetTest seems cumbersome given that a test failure is by definition an uncaught JavaScript exception.

Generated at Thu Feb 08 05:13:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.