Loading...

XML

Word

Printable

JSON

Type: Bug
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 4.1.9
Affects Version/s: 3.6.9, 4.0.5, 4.1.6
Component/s: Sharding
Labels:
- sharding-wfbf-day

Backwards Compatibility:
Fully Compatible
Operating System:
ALL
Backport Requested:

v4.0, v3.6
Steps To Reproduce:
Hide

Apply this patch, which turns off all the background threads (the periodic noop writer defaults to off for the noPassthrough suite for mongods started through resmoke.py) and modifies a jstest that tests that noop writes are done for snapshot readConcern against a shard to instead test against a config server:

diff --git a/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js b/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js index b74832e..1ba4ec4 100644 --- a/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js +++ b/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js @@ -44,14 +44,14 @@ // Propagate 'clusterTime' to shard 1. This ensures that its next write will be at time >= // 'clusterTime'. - testDB1.coll1.find().itcount(); + st.s.getDB("config").coll1.find().itcount(); // Attempt a snapshot read at 'clusterTime' on shard 1. Test that it performs a noop write to // advance its lastApplied optime past 'clusterTime'. The snapshot read itself may fail if the // noop write advances the node's majority commit point past 'clusterTime' and it releases that // snapshot. Test reading from the primary. const shard1Session = - st.rs1.getPrimary().getDB("test1").getMongo().startSession({causalConsistency: false}); + st.configRS.getPrimary().getDB("test1").getMongo().startSession({causalConsistency: false}); shard1Session.startTransaction({readConcern: {level: "snapshot", atClusterTime: clusterTime}}); res = shard1Session.getDatabase("test1").runCommand({find: "coll1"}); if (res.ok === 0) { diff --git a/src/mongo/db/read_concern_mongod.cpp b/src/mongo/db/read_concern_mongod.cpp index 0967968..f45a835 100644 --- a/src/mongo/db/read_concern_mongod.cpp +++ b/src/mongo/db/read_concern_mongod.cpp @@ -134,6 +134,8 @@ Status makeNoopWriteIfNeeded(OperationContext* opCtx, LogicalTime clusterTime) { auto shardingState = ShardingState::get(opCtx); // standalone replica set, so there is no need to advance the OpLog on the primary. if (!shardingState->enabled()) { + log() << "xxx not executing noop write to advance last applied op time because " + "sharding state not enabled"; return Status::OK(); } diff --git a/src/mongo/db/repl/replication_coordinator_impl.cpp b/src/mongo/db/repl/replication_coordinator_impl.cpp index c66445f..f14fe0b 100644 --- a/src/mongo/db/repl/replication_coordinator_impl.cpp +++ b/src/mongo/db/repl/replication_coordinator_impl.cpp @@ -1313,6 +1313,7 @@ Status ReplicationCoordinatorImpl::waitUntilOpTimeForReadUntil(OperationContext* } if (readConcern.getArgsAfterClusterTime() || readConcern.getArgsAtClusterTime()) { + log() << "xxx waiting until cluster time for read"; return _waitUntilClusterTimeForRead(opCtx, readConcern, deadline); } else { return _waitUntilOpTimeForReadDeprecated(opCtx, readConcern); diff --git a/src/mongo/s/catalog/replset_dist_lock_manager.cpp b/src/mongo/s/catalog/replset_dist_lock_manager.cpp index 3a960b6..b708b94 100644 --- a/src/mongo/s/catalog/replset_dist_lock_manager.cpp +++ b/src/mongo/s/catalog/replset_dist_lock_manager.cpp @@ -87,9 +87,9 @@ ReplSetDistLockManager::ReplSetDistLockManager(ServiceContext* globalContext, ReplSetDistLockManager::~ReplSetDistLockManager() = default; void ReplSetDistLockManager::startUp() { - if (!_execThread) { - _execThread = stdx::make_unique<stdx::thread>(&ReplSetDistLockManager::doTask, this); - } + // if (!_execThread) { + // _execThread = stdx::make_unique<stdx::thread>(&ReplSetDistLockManager::doTask, this); + // } } void ReplSetDistLockManager::shutDown(OperationContext* opCtx) { diff --git a/src/mongo/s/server.cpp b/src/mongo/s/server.cpp index 5558bed..3789aa1 100644 --- a/src/mongo/s/server.cpp +++ b/src/mongo/s/server.cpp @@ -420,8 +420,8 @@ ExitCode runMongosServer(ServiceContext* serviceContext) { // Construct the sharding uptime reporter after the startup parameters have been parsed in order // to ensure that it picks up the server port instead of reporting the default value. - shardingUptimeReporter.emplace(); - shardingUptimeReporter->startPeriodicThread(); + // shardingUptimeReporter.emplace(); + // shardingUptimeReporter->startPeriodicThread(); clusterCursorCleanupJob.go();

You will see the test hang after logging "xxx not executing noop write to advance last applied op time because sharding state not enabled", e.g.:

[js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.697-0500 c15525| 2019-01-04T17:31:13.696-0500 I SHARDING [conn73] Marking collection config.coll1 as collection version: <unsharded> [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.698-0500 s15528| 2019-01-04T17:31:13.698-0500 I COMMAND [conn8] command config.coll1 appName: "MongoDB Shell" command: find { find: "coll1", filter: {}, lsid: { id: UUID("f05234e5-e178-4706-81bf-2d6bc810cd6a") }, $clusterTime: { clusterTime: Timestamp(1546641073, 12), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $db: "config" } nShards:1 cursorExhausted:1 numYields:0 nreturned:0 reslen:226 protocol:op_msg 7ms [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 [jsTest] ---- [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 [jsTest] New session started with sessionID: { "id" : UUID("b9e7aa51-9d3d-4f98-94fa-8a333228a54d") } [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.776-0500 [jsTest] ---- [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.776-0500 [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.776-0500 [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.795-0500 c15525| 2019-01-04T17:31:13.795-0500 I COMMAND [conn1] xxx not executing noop write to advance last applied op time because sharding state not enabled [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.795-0500 c15525| 2019-01-04T17:31:13.795-0500 I REPL [conn1] xxx waiting until cluster time for read
Show
Apply this patch, which turns off all the background threads (the periodic noop writer defaults to off for the noPassthrough suite for mongods started through resmoke.py) and modifies a jstest that tests that noop writes are done for snapshot readConcern against a shard to instead test against a config server: diff --git a/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js b/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js index b74832e..1ba4ec4 100644 --- a/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js +++ b/jstests/noPassthrough/readConcern_atClusterTime_noop_write.js @@ -44,14 +44,14 @@ // Propagate 'clusterTime' to shard 1. This ensures that its next write will be at time >= // 'clusterTime'. - testDB1.coll1.find().itcount(); + st.s.getDB("config").coll1.find().itcount(); // Attempt a snapshot read at 'clusterTime' on shard 1. Test that it performs a noop write to // advance its lastApplied optime past 'clusterTime'. The snapshot read itself may fail if the // noop write advances the node's majority commit point past 'clusterTime' and it releases that // snapshot. Test reading from the primary. const shard1Session = - st.rs1.getPrimary().getDB("test1").getMongo().startSession({causalConsistency: false}); + st.configRS.getPrimary().getDB("test1").getMongo().startSession({causalConsistency: false}); shard1Session.startTransaction({readConcern: {level: "snapshot", atClusterTime: clusterTime}}); res = shard1Session.getDatabase("test1").runCommand({find: "coll1"}); if (res.ok === 0) { diff --git a/src/mongo/db/read_concern_mongod.cpp b/src/mongo/db/read_concern_mongod.cpp index 0967968..f45a835 100644 --- a/src/mongo/db/read_concern_mongod.cpp +++ b/src/mongo/db/read_concern_mongod.cpp @@ -134,6 +134,8 @@ Status makeNoopWriteIfNeeded(OperationContext* opCtx, LogicalTime clusterTime) { auto shardingState = ShardingState::get(opCtx); // standalone replica set, so there is no need to advance the OpLog on the primary. if (!shardingState->enabled()) { + log() << "xxx not executing noop write to advance last applied op time because " + "sharding state not enabled"; return Status::OK(); } diff --git a/src/mongo/db/repl/replication_coordinator_impl.cpp b/src/mongo/db/repl/replication_coordinator_impl.cpp index c66445f..f14fe0b 100644 --- a/src/mongo/db/repl/replication_coordinator_impl.cpp +++ b/src/mongo/db/repl/replication_coordinator_impl.cpp @@ -1313,6 +1313,7 @@ Status ReplicationCoordinatorImpl::waitUntilOpTimeForReadUntil(OperationContext* } if (readConcern.getArgsAfterClusterTime() || readConcern.getArgsAtClusterTime()) { + log() << "xxx waiting until cluster time for read"; return _waitUntilClusterTimeForRead(opCtx, readConcern, deadline); } else { return _waitUntilOpTimeForReadDeprecated(opCtx, readConcern); diff --git a/src/mongo/s/catalog/replset_dist_lock_manager.cpp b/src/mongo/s/catalog/replset_dist_lock_manager.cpp index 3a960b6..b708b94 100644 --- a/src/mongo/s/catalog/replset_dist_lock_manager.cpp +++ b/src/mongo/s/catalog/replset_dist_lock_manager.cpp @@ -87,9 +87,9 @@ ReplSetDistLockManager::ReplSetDistLockManager(ServiceContext* globalContext, ReplSetDistLockManager::~ReplSetDistLockManager() = default; void ReplSetDistLockManager::startUp() { - if (!_execThread) { - _execThread = stdx::make_unique<stdx::thread>(&ReplSetDistLockManager::doTask, this); - } + // if (!_execThread) { + // _execThread = stdx::make_unique<stdx::thread>(&ReplSetDistLockManager::doTask, this); + // } } void ReplSetDistLockManager::shutDown(OperationContext* opCtx) { diff --git a/src/mongo/s/server.cpp b/src/mongo/s/server.cpp index 5558bed..3789aa1 100644 --- a/src/mongo/s/server.cpp +++ b/src/mongo/s/server.cpp @@ -420,8 +420,8 @@ ExitCode runMongosServer(ServiceContext* serviceContext) { // Construct the sharding uptime reporter after the startup parameters have been parsed in order // to ensure that it picks up the server port instead of reporting the default value. - shardingUptimeReporter.emplace(); - shardingUptimeReporter->startPeriodicThread(); + // shardingUptimeReporter.emplace(); + // shardingUptimeReporter->startPeriodicThread(); clusterCursorCleanupJob.go(); You will see the test hang after logging "xxx not executing noop write to advance last applied op time because sharding state not enabled", e.g.: [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.697-0500 c15525| 2019-01-04T17:31:13.696-0500 I SHARDING [conn73] Marking collection config.coll1 as collection version: <unsharded> [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.698-0500 s15528| 2019-01-04T17:31:13.698-0500 I COMMAND [conn8] command config.coll1 appName: "MongoDB Shell" command: find { find: "coll1", filter: {}, lsid: { id: UUID("f05234e5-e178-4706-81bf-2d6bc810cd6a") }, $clusterTime: { clusterTime: Timestamp(1546641073, 12), signature: { hash: BinData(0, 0000000000000000000000000000000000000000), keyId: 0 } }, $db: "config" } nShards:1 cursorExhausted:1 numYields:0 nreturned:0 reslen:226 protocol:op_msg 7ms [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 [jsTest] ---- [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.775-0500 [jsTest] New session started with sessionID: { "id" : UUID("b9e7aa51-9d3d-4f98-94fa-8a333228a54d") } [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.776-0500 [jsTest] ---- [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.776-0500 [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.776-0500 [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.795-0500 c15525| 2019-01-04T17:31:13.795-0500 I COMMAND [conn1] xxx not executing noop write to advance last applied op time because sharding state not enabled [js_test:readConcern_atClusterTime_noop_write] 2019-01-04T17:31:13.795-0500 c15525| 2019-01-04T17:31:13.795-0500 I REPL [conn1] xxx waiting until cluster time for read
Sprint:
Sharding 2019-02-11, Sharding 2019-02-25
Linked BF Score:
18
Confidence Status:
None
Work Order:
3
CAR Domain/s:
None

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name(s):
None
Goal Link:
None

This issue affects:

but it isn't as bad as it sounds, because there are several other background processes that will periodically cause writes against the CSRS anyway, such as

distributed lock pinger thread on mongoses and mongods
sharding uptime reporter thread on mongoses
periodic noop writer thread on mongod primaries

It just means that afterClusterTime or atClusterTime requests against the config server may have high latency (several seconds).

It can be fixed pretty easily by just checking serverGlobalParams.clusterRole instead of ShardingState::enabled.

is related to

SERVER-53935 Noop write for afterClusterTime/atClusterTime on config servers could override opCtx's readConcernArgs

Closed

Assignee:: Blake Oler
Reporter:: Esha Maharishi (Inactive)
Participants:: Blake Oler, Esha Maharishi, Githook User
Votes:: 0 Vote for this issue
Watchers:: 4 Start watching this issue

Created:: Jan 04 2019 10:46:35 PM UTC
Updated:: Oct 29 2023 10:25:21 PM UTC
Resolved:: Feb 14 2019 05:33:16 PM UTC
Confidence Status Last Update:: 28/Jan/19 4:48 PM

Details

Description

Attachments

Issue Links

Activity

People

Dates