-
Type:
Bug
-
Resolution: Unresolved
-
Priority:
Minor - P4
-
None
-
Affects Version/s: None
-
Component/s: None
-
None
-
Server Programmability
-
ALL
-
-
None
-
None
-
None
-
None
-
None
-
None
-
None
The blockTimeMS failpoint does not seem to reliably block for the full duration. I suspect this is due to the fact that blockTimeMS is implemented using a non-monotonic clock.
Take the following C++ driver example:
static constexpr auto block_time = std::chrono::milliseconds(50); db.run_command(make_document( kvp("configureFailPoint", "failCommand"), kvp("mode", make_document(kvp("times", 1))), kvp("data", make_document( kvp("failCommands", bsoncxx::builder::basic::make_array("ping")), kvp("blockTimeMS", std::chrono::milliseconds(block_time).count()), kvp("blockConnection", true))))); auto const start = std::chrono::steady_clock::now(); db.run_command(make_document(kvp("ping", 1))); auto const end = std::chrono::steady_clock::now(); auto const elapsed = end - start; REQUIRE(elapsed >= block_time); // Fails!
On RHEL 8, the elapsed duration is consistently just shy (<1ms) of the block time.
Using the full example here, I was able to reproduce this on Evergreen with the following output:
[2026/02/11 10:21:57.682] ../src/mongocxx/test/v_noabi/block_time_ms_bug_repro.cpp:52: failed: to_raw_us(elapsed) >= to_raw_us(block_time) for: 49365.36299999999755528 >= 50000.0 with 1 message: 'iteration := 0' [2026/02/11 10:21:57.732] ../src/mongocxx/test/v_noabi/block_time_ms_bug_repro.cpp:52: failed: to_raw_us(elapsed) >= to_raw_us(block_time) for: 49272.5 >= 50000.0 with 1 message: 'iteration := 1' [2026/02/11 10:21:57.782] ../src/mongocxx/test/v_noabi/block_time_ms_bug_repro.cpp:52: failed: to_raw_us(elapsed) >= to_raw_us(block_time) for: 49332.54299999999784632 >= 50000.0 with 1 message: 'iteration := 2' [2026/02/11 10:21:57.832] ../src/mongocxx/test/v_noabi/block_time_ms_bug_repro.cpp:52: failed: to_raw_us(elapsed) >= to_raw_us(block_time) for: 49312.42100000000209548 >= 50000.0 with 1 message: 'iteration := 3'
This pattern continues the majority of the 100 iterations. To be clear, the above duration values are in microseconds. Since the client-side code is measuring the duration using std::chrono::steady_clock, I feel confident that these duration measurements are accurate.
I suspected this issue was caused by the server basing the sleep duration off the system clock, which is subject to time jumps. After investigating, this appears to be the case:
- blockTimeMS is implemented using OperationContext::sleepFor(): https://github.com/10gen/mongo/blob/9c8a5d399b1ebf13a6fe86ef62e18f02bcf1cf6f/src/mongo/db/commands.cpp#L767
- sleepFor() calls waitForConditionOrInterruptFor(): https://github.com/10gen/mongo/blob/9c8a5d399b1ebf13a6fe86ef62e18f02bcf1cf6f/src/mongo/util/interruptible.h#L254
- waitForConditionOrInterruptFor() calls getExpirationDateForWaitForValue(): https://github.com/10gen/mongo/blob/9c8a5d399b1ebf13a6fe86ef62e18f02bcf1cf6f/src/mongo/util/interruptible.h#L233
- getExpirationDateForWaitForValue() declares the wake up time as getServiceContext()>getPreciseClockSource()>now() + waitFor: https://github.com/10gen/mongo/blob/9c8a5d399b1ebf13a6fe86ef62e18f02bcf1cf6f/src/mongo/db/operation_context.cpp#L575
- Unless a non-null ClockSource is passed to the ServiceContext constructor, _preciseClockSource is initialized using SystemClockSource: https://github.com/10gen/mongo/blob/9c8a5d399b1ebf13a6fe86ef62e18f02bcf1cf6f/src/mongo/db/service_context.cpp#L560
- SystemClockSource::now() returns Date_t::now(): https://github.com/10gen/mongo/blob/77e0b1ec702cc2cd0ba1381ae793387149ecffae/src/mongo/util/system_clock_source.cpp#L39
- Descending into the call stack, eventually we reach curTimeDuration(), which computes the current time using gettimeofday(): https://github.com/10gen/mongo/blob/77e0b1ec702cc2cd0ba1381ae793387149ecffae/src/mongo/util/time_support.cpp#L599
From the gettimeofday documentation:
The time returned by gettimeofday() is affected by discontinuous jumps in the system time (e.g., if the system administrator manually changes the system time). If you need a monotonically increasing clock, see clock_gettime(2).
To correctly implement blockTimeMS, the server should be measuring the duration using a steady, monotonic clock. Otherwise, there are technically no guarantees that the block duration will be honored.
In practice, blockTimeMS seems to get close enough to the correct block duration most of the time, which is perhaps why this issue has gone undetected. Additionally, I could only ever reproduce this issue on RHEL 8 – I am not sure why.
- is related to
-
SERVER-103039 Allow to use failCommand failpoint to hang comamnds before their execution.
-
- Closed
-