Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 8.3.0-rc0, 8.2.7
Affects Version/s: None
Component/s: None
Labels:
None

Assigned Teams:

Replication
Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Requested:

v8.2, v8.0
Steps To Reproduce:

I was able to reproduce this bug by writing a test that pauses the OplogWriter on a secondary node and uses currentOp() and killOp() to kill the OplogWriter operation.
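
The kill step of such a test might look roughly like the sketch below. It is not the actual killOp_against_repl_threads.js test: the helper name `findOpIdsByDesc` and the sample document are hypothetical, and it assumes the `desc` field reported by currentOp() matches the thread name that appears as `ctx` in the server logs (for example `OplogWriter-0`). Pausing the OplogWriter, which the real test does first, is omitted.

```javascript
// Hypothetical helper: collect the opids of operations whose `desc`
// starts with the given prefix, from a currentOp()-style result.
function findOpIdsByDesc(currentOpResult, descPrefix) {
    return (currentOpResult.inprog || [])
        .filter((op) => typeof op.desc === "string" && op.desc.startsWith(descPrefix))
        .map((op) => op.opid);
}

// Made-up stand-in for the result of currentOp() on a secondary.
const sampleCurrentOp = {
    inprog: [
        {opid: 38914, desc: "OplogWriter-0", active: true},
        {opid: 40001, desc: "conn1", active: true},
    ],
};

const targets = findOpIdsByDesc(sampleCurrentOp, "OplogWriter");
console.log(targets);

// In a shell connected to the secondary, the same helper could be used as:
//   const ops = findOpIdsByDesc(db.currentOp({"$all": true}), "OplogWriter");
//   ops.forEach((opid) => db.killOp(opid));
```

The selection logic is kept as a pure function so it can be checked without a running mongod; only the commented-out lines touch a real server.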
Sprint:
Repl 2025-10-13, Repl 2025-11-10, Repl 2025-11-24, Repl 2026-01-19, Repl 2026-02-02
Linked BF Score:
200
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

In the past few months we've had various AFs (e.g. AF-1462) because a customer is running killOp() commands and killing internal replication operations, which leads to a crash of the mongod process.

We recently tried to improve logging when this happens (see: SERVER-101858), but that change doesn't fix the issue: it catches the exception and then does an fassert().

When reproducing this bug in a test we get logs like this:

[js_test:killOp_against_repl_threads] d20041| {"t":{"$date":"2025-10-09T23:32:04.950+00:00"},"s":"I",  "c":"COMMAND",  "id":558700,  "ctx":"conn1","msg":"Successful killOp","attr":{"remote":"127.0.0.1:60526","metadata":{"application":{"name":"MongoDB Shell"},"driver":{"name":"MongoDB Internal Client","version":"8.3.0-alpha0"},"os":{"type":"Linux","name":"Ubuntu","architecture":"aarch64","version":"22.04"}},"db":"admin","command":{"killOp":1,"op":38914,"lsid":{"id":{"$uuid":"ee5c83c6-52b8-4f0e-9ef4-5b212507a834"}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1760052722,"i":2}},"signature":{"hash":{"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}},"keyId":0}},"$readPreference":{"mode":"secondaryPreferred"},"$db":"admin"}}}
...
[js_test:killOp_against_repl_threads] d20041| {"t":{"$date":"2025-10-09T23:32:05.004+00:00"},"s":"I",  "c":"REPL",     "id":10185800,"ctx":"OplogWriter-0","msg":"OplogWriter threw a DBException","attr":{"what":"operation was interrupted","exception":"Interrupted: operation was interrupted"}}
[js_test:killOp_against_repl_threads] d20041| {"t":{"$date":"2025-10-09T23:32:05.004+00:00"},"s":"F",  "c":"ASSERT",   "id":23089,   "ctx":"OplogWriter-0","msg":"Fatal assertion","attr":{"msgid":10185801,"location":"src/mongo/db/repl/oplog_writer.cpp:62:31:auto mongo::repl::OplogWriter::startup()::(anonymous class)::operator()(const executor::TaskExecutor::CallbackArgs &)"}}
[js_test:killOp_against_repl_threads] d20041| {"t":{"$date":"2025-10-09T23:32:05.004+00:00"},"s":"F",  "c":"ASSERT",   "id":23090,   "ctx":"OplogWriter-0","msg":"\n\n***aborting after fassert() failure\n\n"}
[js_test:killOp_against_repl_threads] d20041| {"t":{"$date":"2025-10-09T23:32:05.004+00:00"},"s":"F",  "c":"CONTROL",  "id":6384300, "ctx":"OplogWriter-0","msg":"Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}}
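
To spot this pattern in a larger log, one option is to parse the structured log lines and look for a "Successful killOp" entry (id 558700) followed by a fatal-severity entry. The sketch below assumes each line carries a `[test] node| ` prefix ahead of the JSON payload, as in the excerpt above. The function names are hypothetical, and the sample lines are trimmed copies of the entries above with most of `attr` removed.

```javascript
// Parse the JSON payload of a prefixed log line; returns null when the
// line has no parsable JSON object (for example the "..." separator).
function parseLogLine(line) {
    const start = line.indexOf("{");
    if (start === -1) return null;
    try {
        return JSON.parse(line.slice(start));
    } catch (e) {
        return null;
    }
}

// Find the first successful killOp and the first fatal ("F") entry after it.
function findKillThenFatal(lines) {
    const entries = lines.map(parseLogLine).filter((e) => e !== null);
    const killIdx = entries.findIndex((e) => e.id === 558700);
    if (killIdx === -1) return null;
    const fatal = entries.slice(killIdx + 1).find((e) => e.s === "F");
    if (!fatal) return null;
    return {
        killedOp: entries[killIdx].attr.command.op,
        fatalCtx: fatal.ctx,
        fatalId: fatal.id,
        msgid: fatal.attr ? fatal.attr.msgid : undefined,
    };
}

// Trimmed copies of the log lines above.
const sampleLines = [
    '[js_test:killOp_against_repl_threads] d20041| {"t":{"$date":"2025-10-09T23:32:04.950+00:00"},"s":"I","c":"COMMAND","id":558700,"ctx":"conn1","msg":"Successful killOp","attr":{"db":"admin","command":{"killOp":1,"op":38914}}}',
    '...',
    '[js_test:killOp_against_repl_threads] d20041| {"t":{"$date":"2025-10-09T23:32:05.004+00:00"},"s":"I","c":"REPL","id":10185800,"ctx":"OplogWriter-0","msg":"OplogWriter threw a DBException","attr":{"what":"operation was interrupted"}}',
    '[js_test:killOp_against_repl_threads] d20041| {"t":{"$date":"2025-10-09T23:32:05.004+00:00"},"s":"F","c":"ASSERT","id":23089,"ctx":"OplogWriter-0","msg":"Fatal assertion","attr":{"msgid":10185801}}',
];

const crash = findKillThenFatal(sampleLines);
console.log(crash);
```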

Our current documentation states the user should not try to kill internal DB operations, but this is not foolproof.

I suggest we do one of the following options to prevent this crash from happening:

1. [Preferred] Prevent the killOp() command from killing internal Repl operations.
2. Only allow users with internal privilege (on top of killop) to kill an internal operation.

The first option is preferred since there is no valid reason for a user or an operator to kill an internal Repl operation that I could find.
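
The difference between the two options can be modeled with a small decision function. This is only an illustrative sketch, not the server implementation: `isInternalReplOp`, `mayKill`, the policy names, and the idea of recognizing replication threads by a `desc` prefix are all assumptions made for the example.

```javascript
// Assumed thread-name prefixes for internal replication operations.
const REPL_THREAD_PREFIXES = ["OplogWriter", "OplogApplier"];

// Hypothetical check: treat an op as internal Repl work based on its `desc`.
function isInternalReplOp(op) {
    const desc = op.desc || "";
    return REPL_THREAD_PREFIXES.some((prefix) => desc.startsWith(prefix));
}

// policy "never": internal Repl ops cannot be killed by killOp() at all
//                 (the first, preferred option).
// policy "internalOnly": killing them also requires the `internal`
//                 privilege on top of `killop` (the second option).
function mayKill(op, privileges, policy) {
    if (!privileges.includes("killop")) return false;
    if (!isInternalReplOp(op)) return true;
    if (policy === "never") return false;
    return privileges.includes("internal");
}

const writerOp = {opid: 38914, desc: "OplogWriter-0"};
const userOp = {opid: 40001, desc: "conn1"};

console.log(mayKill(writerOp, ["killop"], "never"));
console.log(mayKill(writerOp, ["killop", "internal"], "internalOnly"));
console.log(mayKill(userOp, ["killop"], "never"));
```

Under the "never" policy no combination of privileges reaches the internal operation, which is why it rules out this crash entirely, whereas "internalOnly" still leaves a path for a sufficiently privileged caller.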

causes

SERVER-120520 Use of backports_required_for_multiversion_tests.yml for killOp_against_repl_threads.js test

Closed

is related to

SERVER-101858 OplogWriterImpl does not handle interruptions gracefully

Closed

related to

SERVER-118705 Re-add test to check that OplogApplier thread cannot get killed by killOp()

Closed

Assignee: Pierre Turin
Reporter: Pierre Turin
Participants: Githook User, Pierre Turin
Votes: 0
Watchers: 11

Created: Oct 10 2025 01:31:25 AM UTC
Updated: Mar 09 2026 03:29:06 PM UTC
Resolved: Jan 23 2026 10:49:17 PM UTC
