[SERVER-30912] shardCollection command hangs when running under a session Created: 31/Aug/17  Updated: 30/Oct/23  Resolved: 19/Sep/17

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.5.12
Fix Version/s: 3.6.0-rc0

Type: Bug Priority: Major - P3
Reporter: Eddie Louie Assignee: Jack Mulrow
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Gantt Dependency
has to be done before SERVER-30965 Run the fuzzer with a permanent logic... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

repro_bf6386.js

(function() {
    "use strict";
 
    const st = new ShardingTest({mongos: 1, config: 1, shards: 2, rs: {nodes: 1}});
    const db = st.s.startSession().getDatabase("test");
 
    assert.commandWorked(db.adminCommand({enableSharding: "test"}));
    assert.commandWorked(db.adminCommand({shardCollection: "test.mycoll", key: {_id: "hashed"}}));
 
    st.stop();
})();

Run with:
python ./buildscripts/resmoke.py --suites=no_server repro_bf6386.js

Sprint: Sharding 2017-10-02
Participants:
Linked BF Score: 0

 Description   

The shardCollection command hangs when it is run through a session against a cluster with at least two shards. The hang reproduces with both the mmapv1 and wiredTiger storage engines. The two stack traces involved are:

Thread 112: "conn9" (Thread 0x7f1d6f943700 (LWP 11174))
#0  0x00007f1db4c9b360 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f1db547d91c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x000000b51e1541eb in mongo::executor::ThreadPoolTaskExecutor::wait (this=<optimized out>, cbHandle=...) at src/mongo/executor/thread_pool_task_executor.cpp:450
#3  0x000000b51d85da1d in mongo::ShardRemote::_runCommand (this=0xb52685bd20, opCtx=<optimized out>, readPref=..., dbName=..., maxTimeMSOverride=..., cmdObj=owned BSONObj 373 bytes @ 0xb5268aa688 = {...}) at src/mongo/s/client/shard_remote.cpp:204
#4  0x000000b51d8a5564 in mongo::Shard::runCommandWithFixedRetryAttempts (this=0xb52685bd20, opCtx=opCtx@entry=0xb5268a3640, readPref=..., dbName="admin", cmdObj=owned BSONObj 373 bytes @ 0xb5268aa688 = {...}, maxTimeMSOverride=..., retryPolicy=mongo::Shard::RetryPolicy::kIdempotent) at src/mongo/s/client/shard.cpp:171
#5  0x000000b51d8a5ecf in mongo::Shard::runCommandWithFixedRetryAttempts (this=<optimized out>, opCtx=opCtx@entry=0xb5268a3640, readPref=..., dbName="admin", cmdObj=owned BSONObj 373 bytes @ 0xb5268aa688 = {...}, retryPolicy=mongo::Shard::RetryPolicy::kIdempotent) at src/mongo/s/client/shard.cpp:155
#6  0x000000b51d68e27a in mongo::MigrationSourceManager::commitChunkMetadataOnConfig (this=this@entry=0x7f1d6f9405d0, opCtx=opCtx@entry=0xb5268a3640) at src/mongo/db/s/migration_source_manager.cpp:328
#7  0x000000b51d1c9d8d in mongo::(anonymous namespace)::MoveChunkCommand::_runImpl (opCtx=opCtx@entry=0xb5268a3640, moveChunkRequest=..., this=<optimized out>, this=<optimized out>, this=<optimized out>, this=<optimized out>, this=<optimized out>, this=<optimized out>, this=<optimized out>, this=<optimized out>) at src/mongo/db/s/move_chunk_command.cpp:237
#8  0x000000b51d1cb9e3 in mongo::(anonymous namespace)::MoveChunkCommand::run (this=this@entry=0xb51f456b40 <mongo::(anonymous namespace)::moveChunkCmd>, opCtx=opCtx@entry=0xb5268a3640, dbname="admin", cmdObj=unowned BSONObj 798 bytes @ 0xb526896c1d = {...}, result=...) at src/mongo/db/s/move_chunk_command.cpp:138
#9  0x000000b51e2401b6 in mongo::BasicCommand::enhancedRun (this=0xb51f456b40 <mongo::(anonymous namespace)::moveChunkCmd>, opCtx=0xb5268a3640, request=..., result=...) at src/mongo/db/commands.cpp:390
#10 0x000000b51e23ccff in mongo::Command::publicRun (this=0xb51f456b40 <mongo::(anonymous namespace)::moveChunkCmd>, opCtx=0xb5268a3640, request=..., result=...) at src/mongo/db/commands.cpp:328
#11 0x000000b51d20690d in mongo::(anonymous namespace)::runCommandImpl (opCtx=opCtx@entry=0xb5268a3640, command=command@entry=0xb51f456b40 <mongo::(anonymous namespace)::moveChunkCmd>, request=..., replyBuilder=replyBuilder@entry=0xb520de5630, startOperationTime=..., startOperationTime@entry=...) at src/mongo/db/service_entry_point_mongod.cpp:474
#12 0x000000b51d20793f in mongo::(anonymous namespace)::execCommandDatabase (opCtx=<optimized out>, command=command@entry=0xb51f456b40 <mongo::(anonymous namespace)::moveChunkCmd>, request=..., replyBuilder=<optimized out>, this=<optimized out>, this=<optimized out>, this=<optimized out>) at src/mongo/db/service_entry_point_mongod.cpp:700
#13 0x000000b51d209b81 in mongo::(anonymous namespace)::<lambda()>::operator()(void) const (__closure=__closure@entry=0x7f1d6f941dc0) at src/mongo/db/service_entry_point_mongod.cpp:815
#14 0x000000b51d20aa1e in mongo::(anonymous namespace)::runCommands (message=..., opCtx=0xb5268a3640) at src/mongo/db/service_entry_point_mongod.cpp:822
#15 mongo::ServiceEntryPointMongod::handleRequest (this=<optimized out>, opCtx=0xb5268a3640, m=...) at src/mongo/db/service_entry_point_mongod.cpp:1084
#16 0x000000b51d21319e in mongo::ServiceStateMachine::_processMessage (this=this@entry=0xb5268a72b0, guard=...) at src/mongo/transport/service_state_machine.cpp:317
#17 0x000000b51d211488 in mongo::ServiceStateMachine::_runNextInGuard (this=this@entry=0xb5268a72b0, guard=...) at src/mongo/transport/service_state_machine.cpp:407
#18 0x000000b51d212aff in mongo::ServiceStateMachine::runNext (this=0xb5268a72b0) at src/mongo/transport/service_state_machine.cpp:373
#19 0x000000b51d20e621 in mongo::ServiceEntryPointImpl::<lambda()>::operator() (__closure=0xb526861560) at src/mongo/transport/service_entry_point_impl.cpp:89
#20 std::_Function_handler<void(), mongo::ServiceEntryPointImpl::startSession(mongo::transport::SessionHandle)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/5/functional:1871
#21 0x000000b51e61f854 in std::function<void ()>::operator()() const (this=<optimized out>) at /usr/include/c++/5/functional:2267
#22 mongo::(anonymous namespace)::runFunc (ctx=0xb526861960) at src/mongo/transport/service_entry_point_utils.cpp:55
#23 0x00007f1db4c956ba in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#24 0x00007f1db49cb3dd in clone () from /lib/x86_64-linux-gnu/libc.so.6
 
Thread 123: "conn12" (Thread 0x7f8bc6a75700 (LWP 11115))
Duplicate Thread 122: "conn11" (Thread 0x7f8bc6b76700 (LWP 11114))
#0  0x00007f8c010da360 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f8c018bc91c in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2  0x0000002fe5855cc3 in mongo::OperationContext::<lambda()>::operator() (__closure=<optimized out>) at src/mongo/db/operation_context.cpp:307
#3  mongo::OperationContext::waitForConditionOrInterruptNoAssertUntil (this=0x2ff011d000, cv=..., m=..., deadline=...) at src/mongo/db/operation_context.cpp:311
#4  0x0000002fe5855e60 in mongo::OperationContext::waitForConditionOrInterruptNoAssert (this=<optimized out>, cv=..., m=...) at src/mongo/db/operation_context.cpp:234
#5  0x0000002fe5855f2d in mongo::OperationContext::waitForConditionOrInterrupt (this=this@entry=0x2ff011d000, cv=..., m=...) at src/mongo/db/operation_context.cpp:229
#6  0x0000002fe481a0be in mongo::OperationContext::waitForConditionOrInterrupt<mongo::SessionCatalog::checkOutSession(mongo::OperationContext*)::<lambda()> > (pred=..., m=..., cv=..., this=0x2ff011d000) at src/mongo/db/operation_context.h:177
#7  mongo::SessionCatalog::checkOutSession (this=<optimized out>, opCtx=opCtx@entry=0x2ff011d000) at src/mongo/db/session_catalog.cpp:146
#8  0x0000002fe481a46a in mongo::OperationContextSession::OperationContextSession (this=<optimized out>, opCtx=0x2ff011d000) at src/mongo/db/session_catalog.cpp:203
#9  0x0000002fe444364b in mongo::(anonymous namespace)::execCommandDatabase (opCtx=<optimized out>, command=command@entry=0x2fe66948e0 <mongo::cmdCreate>, request=..., replyBuilder=<optimized out>, this=<optimized out>, this=<optimized out>, this=<optimized out>) at src/mongo/db/service_entry_point_mongod.cpp:593
#10 0x0000002fe4445b81 in mongo::(anonymous namespace)::<lambda()>::operator()(void) const (__closure=__closure@entry=0x7f8bc6a73dc0) at src/mongo/db/service_entry_point_mongod.cpp:815
#11 0x0000002fe4446a1e in mongo::(anonymous namespace)::runCommands (message=..., opCtx=0x2ff011d000) at src/mongo/db/service_entry_point_mongod.cpp:822
#12 mongo::ServiceEntryPointMongod::handleRequest (this=<optimized out>, opCtx=0x2ff011d000, m=...) at src/mongo/db/service_entry_point_mongod.cpp:1084
#13 0x0000002fe444f19e in mongo::ServiceStateMachine::_processMessage (this=this@entry=0x2ff01059b0, guard=...) at src/mongo/transport/service_state_machine.cpp:317
#14 0x0000002fe444d488 in mongo::ServiceStateMachine::_runNextInGuard (this=this@entry=0x2ff01059b0, guard=...) at src/mongo/transport/service_state_machine.cpp:407
#15 0x0000002fe444eaff in mongo::ServiceStateMachine::runNext (this=0x2ff01059b0) at src/mongo/transport/service_state_machine.cpp:373
#16 0x0000002fe444a621 in mongo::ServiceEntryPointImpl::<lambda()>::operator() (__closure=0x2ff0103640) at src/mongo/transport/service_entry_point_impl.cpp:89
#17 std::_Function_handler<void(), mongo::ServiceEntryPointImpl::startSession(mongo::transport::SessionHandle)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/5/functional:1871
#18 0x0000002fe585b854 in std::function<void ()>::operator()() const (this=<optimized out>) at /usr/include/c++/5/functional:2267
#19 mongo::(anonymous namespace)::runFunc (ctx=0x2ff0103660) at src/mongo/transport/service_entry_point_utils.cpp:55
#20 0x00007f8c010d46ba in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#21 0x00007f8c00e0a3dd in clone () from /lib/x86_64-linux-gnu/libc.so.6
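
One possible reading of the traces above, offered as a sketch and not as a confirmed root cause: the first thread is blocked in moveChunk waiting on a remote command, while the second is blocked in SessionCatalog::checkOutSession waiting for a session that some other operation still holds. The toy model below shows in plain JavaScript how an exclusive per-session checkout can hang when an operation holding the session waits on nested work that needs the same session. Every name in it (ToySessionCatalog, runUnderSession, orTimeout, and the two scenario functions) is hypothetical and is not taken from the server source.

```javascript
// Toy model only: these names are hypothetical, not the server's implementation.
class ToySessionCatalog {
    constructor() {
        this.checkedOut = new Set();  // session ids currently held by an operation
        this.waiters = new Map();     // session id -> queue of resolve callbacks
    }

    // Resolves once the caller holds the session exclusively.
    checkOut(sessionId) {
        if (!this.checkedOut.has(sessionId)) {
            this.checkedOut.add(sessionId);
            return Promise.resolve();
        }
        return new Promise((resolve) => {
            if (!this.waiters.has(sessionId)) {
                this.waiters.set(sessionId, []);
            }
            this.waiters.get(sessionId).push(resolve);
        });
    }

    // Hands the session to the next waiter, or marks it free.
    checkIn(sessionId) {
        const queue = this.waiters.get(sessionId) || [];
        if (queue.length > 0) {
            queue.shift()();  // ownership passes directly to the waiter
        } else {
            this.checkedOut.delete(sessionId);
        }
    }
}

// Runs `body` while holding the session, releasing it afterwards.
async function runUnderSession(catalog, sessionId, body) {
    await catalog.checkOut(sessionId);
    try {
        return await body();
    } finally {
        catalog.checkIn(sessionId);
    }
}

// Resolves to "hung" if `promise` has not settled within `ms` milliseconds.
function orTimeout(promise, ms) {
    let timer;
    const timeout = new Promise((resolve) => {
        timer = setTimeout(() => resolve("hung"), ms);
    });
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Outer operation holds session "s1" and waits on inner work that also needs "s1".
function nestedSameSession() {
    const catalog = new ToySessionCatalog();
    const outer = runUnderSession(catalog, "s1", () =>
        runUnderSession(catalog, "s1", async () => "done"));
    return orTimeout(outer, 50);
}

// Same outer operation, but the inner work does not check out the session.
function nestedWithoutInnerCheckout() {
    const catalog = new ToySessionCatalog();
    const outer = runUnderSession(catalog, "s1", async () => "done");
    return orTimeout(outer, 50);
}
```

In this toy model, nestedSameSession() resolves to "hung" because the inner checkout can never be satisfied while the outer operation holds the session, and nestedWithoutInnerCheckout() resolves to "done". Whether the real hang follows exactly this shape is an assumption; the traces only show where each thread was waiting.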



 Comments   
Comment by Githook User [ 19/Sep/17 ]

Author: Jack Mulrow (jsmulrow) <jack.mulrow@mongodb.com>

Message: SERVER-30912 Use session in concurrency suites with SessionOptions
Branch: master
https://github.com/mongodb/mongo/commit/86f4e7955327bc009dd4a25135bb741886d0ccaa

Comment by Ramon Fernandez Marina [ 14/Sep/17 ]

Author: Jack Mulrow (jsmulrow) <jack.mulrow@mongodb.com>

Message: SERVER-30912 Only check out sessions for write commands
Branch: master
https://github.com/mongodb/mongo/commit/5767ee2421fa6c7934a90e9083f07743a83dcf71
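
The commit message above says sessions are checked out only for write commands. A minimal sketch of such a gate is shown below, assuming a request is a command document whose first field names the command and which carries the session id in an lsid field. The function name shouldCheckOutSession and the contents of TOY_WRITE_COMMANDS are hypothetical, chosen for illustration; the actual set of commands and the actual check are defined by the linked commit, not by this sketch.

```javascript
// Hypothetical list for illustration only; not taken from the server commit.
const TOY_WRITE_COMMANDS = new Set(["insert", "update", "delete", "findAndModify"]);

// Returns true only when the request carries a session id AND is a write command.
function shouldCheckOutSession(request) {
    const commandName = Object.keys(request)[0];  // first field names the command
    const hasSessionId = Object.prototype.hasOwnProperty.call(request, "lsid");
    return hasSessionId && TOY_WRITE_COMMANDS.has(commandName);
}
```

Under this gate, a request such as {create: "mycoll", lsid: {...}} (the command the second stack trace was running) would skip the checkout entirely, so it could not block waiting for the session.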

Generated at Thu Feb 08 04:25:25 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.