[SERVER-30412] mongos Segmentation fault during aggregation workload Created: 28/Jul/17  Updated: 27/Oct/23  Resolved: 09/Jan/20

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 3.5.11
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: James O'Leary Assignee: [DO NOT USE] Backlog - Sharding Team
Resolution: Gone away Votes: 0
Labels: sysperf-36
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-22760 Sharded aggregation pipelines which i... Closed
Assigned Teams:
Sharding
Operating System: ALL
Sprint: Sharding 2017-10-02
Participants:

 Description   

There is a persistent segmentation fault being thrown during one of the MMAPv1 aggregation tests.

See the following log lines:

2017-07-28T15:54:20.722+0000 F -        [NetworkInterfaceASIO-TaskExecutorPool-9-0] Invalid access at address: 0xabbb9380
2017-07-28T15:54:20.723+0000 F -        [NetworkInterfaceASIO-TaskExecutorPool-9-0] Got signal: 11 (Segmentation fault).
 
 0x55e0a78bc891 0x55e0a78bbaa9 0x55e0a78bc116 0x7fae31559370 0x55e0a7aa9533
----- BEGIN BACKTRACE -----
{"backtrace":[
{"b":"55E0A69AB000","o":"F11891","s":"_ZN5mongo15printStackTraceERSo"},{"b":"55E0A69AB000","o":"F10AA9"},
{"b":"55E0A69AB000","o":"F11116"},
{"b":"7FAE3154A000","o":"F370"},
{"b":"55E0A69AB000","o":"10FE533"}],
"processInfo":{
"mongodbVersion" : "3.5.10-167-g62edb2d-patch-5977444a2fbabe33e10018a5", 
"gitVersion" : "62edb2ddc7926312bafd33c932a5d9ed14d863f0", 
"compiledModules" : [], 
"uname" : { 
"sysname" : "Linux", "release" : "4.4.41-36.55.amzn1.x86_64", "version" : "#1 SMP Wed Jan 18 01:03:26 UTC 2017", "machine" : "x86_64" }, 
"somap" : [ 
{ "b" : "55E0A69AB000", "elfType" : 3, "buildId" : "95D20FE9CF1BF002A8F4F2EB183F5E8ED50E4C47" }, 
{ "b" : "7FFC3AAB1000", "elfType" : 3, "buildId" : "F3A72C9C20A0FD0E902360D4BC280F2002571040" }, 
{ "b" : "7FAE31E82000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "F965E296DCBFAD601FA60EF6EC19AFEDE633C777" }, 
{ "b" : "7FAE31C7E000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "EB575314825D0BB0D64D2251E8B779E52FA8D419" }, 
{ "b" : "7FAE3197C000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "8284A38DE969B169CBFAF326C5D63342797B8010" }, 
{ "b" : "7FAE31766000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "3FD5F89DE59E124AB1419A0BD16775B4096E84FD" }, 
{ "b" : "7FAE3154A000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "1F4696737495F92BDF68A7201E121A571F0FA762" }, 
{ "b" : "7FAE31188000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "4B81214AF3D685CD8B94A4F8C19D1C6459F2B630" }, 
{ "b" : "7FAE3208A000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "24686735C8371E29ED42E282A7CCE6DE67CF345A" } ] }}
 mongos(_ZN5mongo15printStackTraceERSo+0x41) [0x55e0a78bc891]
 mongos(+0xF10AA9) [0x55e0a78bbaa9]
 mongos(+0xF11116) [0x55e0a78bc116]
 libpthread.so.0(+0xF370) [0x7fae31559370]
 mongos(+0x10FE533) [0x55e0a7aa9533]
-----  END BACKTRACE  -----



 Comments   
Comment by Kaloian Manassiev [ 10/Oct/17 ]

jim.oleary, how do you propose that we continue with this ticket?

Comment by James O'Leary [ 10/Oct/17 ]

I have gone through all the system failures on sys_perf_linux_3_shard_agg_query_comparison_bestbuy_(WT|MMAPv1) from the end of August.

There has been no recurrence of this backtrace since then, but:

  • BF-6538 is still happening, which could be masking this issue
  • these tests run relatively infrequently
  • there are a number of failures where no logs or core dumps were captured, so we don't know what caused those failures.
Comment by Kevin Duong [ 10/Oct/17 ]

pinging jim.oleary to help address this further as needed.

Comment by Nathan Myers [ 06/Oct/17 ]

Is it still possible to reproduce this?

Comment by Kevin Duong [ 06/Oct/17 ]

Changing this from "debugging with submitter" to "3.6 required".

Comment by Githook User [ 01/Sep/17 ]

Author:

{'username': 'gormanb', 'name': 'Bernard Gorman', 'email': 'bernard.gorman@gmail.com'}

Message: SERVER-30412 Ensure that aggregation splitpoints are not shared between shard and merge pipelines on mongoS
Branch: master
https://github.com/mongodb/mongo/commit/d34c7cf640a6e12b4f1abe86ef3c96d1216f0654

Comment by Bernard Gorman [ 30/Aug/17 ]

This turned out to be a bit subtle. The bug was exposed by this test in the bestbuy_agg_query_comparison.js workload (note that the find is converted to an equivalent aggregation):

runFind({find: testColl.getName(), filter: {type: "Music"}, limit: 1000}, true,
    makeTestName("find_limit"));

Uniquely among SplittableDocumentSources, when mongoS splits the pipeline at a $limit stage, the $limit returns a pointer to the same object for both the shard and merge pipelines. Previously this didn't matter, since both the shard and merge pipelines were simply serialised to command objects and sent to the relevant shards, but after SERVER-22760 the merge pipeline object produced in ClusterAggregate::runAggregate is the actual machinery that will execute on mongoS.

When we begin to run the merge pipeline on mongos, the $limit splitpoint is correctly stitched into it. We retrieve the first batch, register the cursor containing the pipeline with the ClusterCursorManager, and return the results. However, at this point the shard pipeline - which has already been dispatched to the remotes - is destroyed as its unique_ptr goes out of scope; as part of this process, its deleter calls stitch() on the pipeline. The $limit stage, which exists both in the shard pipeline being destroyed and in the merge pipeline stored in the ClusterCursorManager, is stitched back into the shard pipeline and redirected to point to the preceding shard stage, which is promptly destroyed. When we next retrieve the merge pipeline from the cursor manager and run it, the $limit is pointing to nothing and segfaults.
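
To make the lifetime sequence above concrete, here is a minimal, self-contained C++ sketch of the mechanism, using simplified stand-in types (Stage, Pipeline, and a stitch() that rewires raw back-pointers). These are illustrative only, not the actual mongo classes or interfaces:

#include <iostream>
#include <memory>
#include <string>
#include <vector>

// Stand-in for a DocumentSource: a raw back-pointer to its input stage,
// analogous to the pointer that stitch() rewires.
struct Stage {
    std::string name;
    Stage* source = nullptr;
};

// Stand-in for a Pipeline of ref-counted stages.
struct Pipeline {
    std::vector<std::shared_ptr<Stage>> stages;

    // Point every stage at its predecessor, as described for stitch().
    void stitch() {
        for (size_t i = 1; i < stages.size(); ++i)
            stages[i]->source = stages[i - 1].get();
    }

    // Per the description above, the shard pipeline's deleter calls stitch().
    ~Pipeline() { stitch(); }
};

int main() {
    // The $limit splitpoint is a single object shared by both pipelines (the bug).
    auto limit = std::make_shared<Stage>(Stage{"$limit"});

    // In the real code this pipeline ends up held by the ClusterCursorManager.
    Pipeline mergePipeline;
    mergePipeline.stages = {std::make_shared<Stage>(Stage{"$mergeCursors"}), limit};
    mergePipeline.stitch();  // $limit correctly points at $mergeCursors

    {
        // Shard pipeline is dispatched, then destroyed when its unique_ptr
        // goes out of scope in ClusterAggregate::runAggregate.
        auto shardPipeline = std::make_unique<Pipeline>();
        shardPipeline->stages = {std::make_shared<Stage>(Stage{"shardCursor"}), limit};
    }  // ~Pipeline() re-stitches $limit onto shardCursor, which is freed right after.

    // The merge pipeline still owns $limit, but limit->source now dangles;
    // calling getNext() through it is the use-after-free that crashes in
    // DocumentSourceLimit::getNext() on the subsequent getMore.
    std::cout << "merge pipeline's $limit now has a dangling source pointer\n";
    return 0;
}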

The reason that this bug wasn't picked up by the integration tests previously is that we used a $limit of 50. When we execute this on mongoS, we hit EOF before the first batch of 101 is filled, so we return those results and we don't register the cursor with the CursorManager. The merge pipeline therefore doesn't outlive the ClusterAggregate::runAggregate method and is destroyed along with the shard pipeline. The bug only manifests in cases where we split the pipeline at a $limit stage of at least 102 and the resulting merge pipeline is executable on mongoS.
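
For contrast, and reusing the includes and stand-in types from the sketch above, the following hypothetical main() illustrates the direction the githook commit above describes ("splitpoints are not shared between shard and merge pipelines"): the shard pipeline receives its own copy of the $limit splitpoint, so destroying it can no longer re-stitch the stage that the merge pipeline still holds. The actual patch is in the linked commit and differs in detail.

int main() {
    auto limit = std::make_shared<Stage>(Stage{"$limit"});

    Pipeline mergePipeline;
    mergePipeline.stages = {std::make_shared<Stage>(Stage{"$mergeCursors"}), limit};
    mergePipeline.stitch();

    {
        auto shardPipeline = std::make_unique<Pipeline>();
        // Hypothetical fix direction: a distinct copy of the splitpoint, not the
        // object the merge pipeline holds.
        shardPipeline->stages = {std::make_shared<Stage>(Stage{"shardCursor"}),
                                 std::make_shared<Stage>(Stage{"$limit (shard copy)"})};
    }  // ~Pipeline() re-stitches only its own copy; the merge pipeline is untouched.

    // Still valid: prints "$mergeCursors".
    std::cout << "merge pipeline's $limit reads from: " << limit->source->name << "\n";
    return 0;
}

With no shared splitpoint, the shard pipeline's teardown only touches objects it owns, and the merge pipeline held by the ClusterCursorManager remains safe to run on the next getMore.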

Comment by James O'Leary [ 30/Aug/17 ]

david.storch, I agree, we can only focus on the current error for the moment. Let's find and fix the issue with the SERVER-22760 functionality.

Once this is resolved, it will likely be necessary to investigate the original issue (possibly with an older build) even if it appears to be gone from the current version. Unfortunately, since the original stacktrace doesn't appear to be very informative, this may require coordination between us to get whatever information is required (maybe a core dump would be more useful).

In the meantime, I'll see if the original backtrace is more useful if line numbers are generated.

Comment by David Storch [ 29/Aug/17 ]

jim.oleary, ah, you're right, there do appear to be two issues at play here. I propose that we focus our efforts for now on the issue related to SERVER-22760. Once that is fixed, we can monitor the test to see if the original crash recurs. Does that sound good?

Comment by James O'Leary [ 29/Aug/17 ]

Given that the SERVER-22760 change was committed on August 8th, this may not be the original issue described in this ticket.

The original trace is dated before that change was committed.

Comment by David Storch [ 23/Aug/17 ]

This looks like it was introduced by bernard.gorman's work in SERVER-22760 to execute the merging part of an aggregation pipeline on mongos. I'm reassigning so that he can take a look.

Comment by James O'Leary [ 23/Aug/17 ]

nathan.myers The test continues to fail persistently; the latest failure is on v3.5.12-20-g1602ed3.

This stack trace looks more useful; here is the full backtrace with line numbers:

 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/util/stacktrace_posix.cpp:172:0: mongo::printStackTrace(std::ostream&)
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/util/signal_handlers_synchronous.cpp:180:0: mongo::(anonymous namespace)::printSignalAndBacktrace(int)
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/util/signal_handlers_synchronous.cpp:276:0: mongo::(anonymous namespace)::abruptQuitWithAddrSignal(int, siginfo_t*, void*)
 ??:0:0: ??
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/db/pipeline/document_source_limit.cpp:77:0: mongo::DocumentSourceLimit::getNext()
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/db/pipeline/pipeline.cpp:480:0: mongo::Pipeline::getNext()
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/query/router_stage_aggregation_merge.cpp:44:0: mongo::RouterStageAggregationMerge::next()
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/query/cluster_client_cursor_impl.cpp:90:0: mongo::ClusterClientCursorImpl::next()
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/query/cluster_cursor_manager.cpp:117:0: mongo::ClusterCursorManager::PinnedCursor::next()
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/query/cluster_find.cpp:443:0: mongo::ClusterFind::runGetMore(mongo::OperationContext*, mongo::GetMoreRequest const&)
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/commands/cluster_getmore_cmd.cpp:107:0: mongo::(anonymous namespace)::ClusterGetMoreCmd::run(mongo::OperationContext*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, mongo::BSONObj const&, mongo::BSONObjBuilder&)
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/db/commands.cpp:390:0: mongo::BasicCommand::enhancedRun(mongo::OperationContext*, mongo::OpMsgRequest const&, mongo::BSONObjBuilder&)
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/db/commands.cpp:328:0: mongo::Command::publicRun(mongo::OperationContext*, mongo::OpMsgRequest const&, mongo::BSONObjBuilder&)
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/commands/strategy.cpp:222:0: execCommandClient
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/commands/strategy.cpp:264:0: runAgainstRegistered
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/commands/strategy.cpp:284:0: mongo::(anonymous namespace)::runCommand(mongo::OperationContext*, mongo::OpMsgRequest const&, mongo::BSONObjBuilder&&) [clone .constprop.408]
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/commands/strategy.cpp:431:0: mongo::Strategy::clientCommand(mongo::OperationContext*, mongo::Message const&)::{lambda()#1}::operator()() const
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/commands/strategy.cpp:441:0: mongo::Strategy::clientCommand(mongo::OperationContext*, mongo::Message const&)
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/s/service_entry_point_mongos.cpp:92:0: mongo::ServiceEntryPointMongos::handleRequest(mongo::OperationContext*, mongo::Message const&)
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/transport/service_state_machine.cpp:317:0: mongo::ServiceStateMachine::_processMessage(mongo::ServiceStateMachine::ThreadGuard&)
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/transport/service_state_machine.cpp:407:0: mongo::ServiceStateMachine::_runNextInGuard(mongo::ServiceStateMachine::ThreadGuard&)
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/transport/service_state_machine.cpp:373:0: mongo::ServiceStateMachine::runNext()
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/transport/service_entry_point_impl.cpp:89:0: operator()
 /opt/mongodbtoolchain/v2/include/c++/5.4.0/functional:1871:0: std::_Function_handler<void (), mongo::ServiceEntryPointImpl::startSession(std::shared_ptr<mongo::transport::Session>)::{lambda()#2}>::_M_invoke(std::_Any_data const&)
 /opt/mongodbtoolchain/v2/include/c++/5.4.0/functional:2267:0: std::function<void ()>::operator()() const
 /data/mci/65f6578e523734710ec8326c80c5e776/src/src/mongo/transport/service_entry_point_utils.cpp:55:0: mongo::(anonymous namespace)::runFunc(void*)
 ??:0:0: ??
 ??:0:0: ??

The useful links are:

If this isn't sufficient, let me know what you need to progress the issue.

The raw trace looks like:

2017-08-22T23:24:02.322+0000 F -        [conn13] Invalid access at address: 0
2017-08-22T23:24:02.327+0000 F -        [conn13] Got signal: 11 (Segmentation fault).
 
 0x563e00a96b71 0x563e00a95d89 0x563e00a963f6 0x7fdbf951f370 0x563e003a275b 0x563e00364fdd 0x563e000dc262 0x563e000d99b9 0x563e002fdf2a 0x563e000d165b 0x563e0005baa6 0x563e0046ec06 0x563e0046b74f 0x563e000abc6b 0x563e000ad054 0x563e000ad979 0x563dfffc42a1 0x563dfffdd1ce 0x563dfffdb425 0x563dfffdcb07 0x563dfffd84b1 0x563e0095b764 0x7fdbf9517dc5 0x7fdbf92456ed
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"563DFFB3C000","o":"F5AB71","s":"_ZN5mongo15printStackTraceERSo"},{"b":"563DFFB3C000","o":"F59D89"},{"b":"563DFFB3C000","o":"F5A3F6"},{"b":"7FDBF9510000","o":"F370"},{"b":"563DFFB3C000","o":"86675B","s":"_ZN5mongo19DocumentSourceLimit7getNextEv"},{"b":"563DFFB3C000","o":"828FDD","s":"_ZN5mongo8Pipeline7getNextEv"},{"b":"563DFFB3C000","o":"5A0262","s":"_ZN5mongo27RouterStageAggregationMerge4nextEv"},{"b":"563DFFB3C000","o":"59D9B9","s":"_ZN5mongo23ClusterClientCursorImpl4nextEv"},{"b":"563DFFB3C000","o":"7C1F2A","s":"_ZN5mongo20ClusterCursorManager12PinnedCursor4nextEv"},{"b":"563DFFB3C000","o":"59565B","s":"_ZN5mongo11ClusterFind10runGetMoreEPNS_16OperationContextERKNS_14GetMoreRequestE"},{"b":"563DFFB3C000","o":"51FAA6"},{"b":"563DFFB3C000","o":"932C06","s":"_ZN5mongo12BasicCommand11enhancedRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE"},{"b":"563DFFB3C000","o":"92F74F","s":"_ZN5mongo7Command9publicRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE"},{"b":"563DFFB3C000","o":"56FC6B"},{"b":"563DFFB3C000","o":"571054"},{"b":"563DFFB3C000","o":"571979","s":"_ZN5mongo8Strategy13clientCommandEPNS_16OperationContextERKNS_7MessageE"},{"b":"563DFFB3C000","o":"4882A1","s":"_ZN5mongo23ServiceEntryPointMongos13handleRequestEPNS_16OperationContextERKNS_7MessageE"},{"b":"563DFFB3C000","o":"4A11CE","s":"_ZN5mongo19ServiceStateMachine15_processMessageERNS0_11ThreadGuardE"},{"b":"563DFFB3C000","o":"49F425","s":"_ZN5mongo19ServiceStateMachine15_runNextInGuardERNS0_11ThreadGuardE"},{"b":"563DFFB3C000","o":"4A0B07","s":"_ZN5mongo19ServiceStateMachine7runNextEv"},{"b":"563DFFB3C000","o":"49C4B1"},{"b":"563DFFB3C000","o":"E1F764"},{"b":"7FDBF9510000","o":"7DC5"},{"b":"7FDBF914E000","o":"F76ED","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.5.12-20-g1602ed3", "gitVersion" : "1602ed3a3b4b101f88be519b13dd69b1c0f04343", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.4.41-36.55.amzn1.x86_64", "version" : "#1 SMP Wed Jan 18 01:03:26 UTC 2017", "machine" : "x86_64" }, "somap" : [ { "b" : "563DFFB3C000", "elfType" : 3, "buildId" : "43A63AF528BD9045C8519ED5E6754DA80AB1BC59" }, { "b" : "7FFFCBFEF000", "elfType" : 3, "buildId" : "F3A72C9C20A0FD0E902360D4BC280F2002571040" }, { "b" : "7FDBF9E48000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "F965E296DCBFAD601FA60EF6EC19AFEDE633C777" }, { "b" : "7FDBF9C44000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "EB575314825D0BB0D64D2251E8B779E52FA8D419" }, { "b" : "7FDBF9942000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "8284A38DE969B169CBFAF326C5D63342797B8010" }, { "b" : "7FDBF972C000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "3FD5F89DE59E124AB1419A0BD16775B4096E84FD" }, { "b" : "7FDBF9510000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "1F4696737495F92BDF68A7201E121A571F0FA762" }, { "b" : "7FDBF914E000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "4B81214AF3D685CD8B94A4F8C19D1C6459F2B630" }, { "b" : "7FDBFA050000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "24686735C8371E29ED42E282A7CCE6DE67CF345A" } ] }}
 mongos(_ZN5mongo15printStackTraceERSo+0x41) [0x563e00a96b71]
 mongos(+0xF59D89) [0x563e00a95d89]
 mongos(+0xF5A3F6) [0x563e00a963f6]
 libpthread.so.0(+0xF370) [0x7fdbf951f370]
 mongos(_ZN5mongo19DocumentSourceLimit7getNextEv+0x7B) [0x563e003a275b]
 mongos(_ZN5mongo8Pipeline7getNextEv+0x3D) [0x563e00364fdd]
 mongos(_ZN5mongo27RouterStageAggregationMerge4nextEv+0x32) [0x563e000dc262]
 mongos(_ZN5mongo23ClusterClientCursorImpl4nextEv+0x179) [0x563e000d99b9]
 mongos(_ZN5mongo20ClusterCursorManager12PinnedCursor4nextEv+0x2A) [0x563e002fdf2a]
 mongos(_ZN5mongo11ClusterFind10runGetMoreEPNS_16OperationContextERKNS_14GetMoreRequestE+0x20B) [0x563e000d165b]
 mongos(+0x51FAA6) [0x563e0005baa6]
 mongos(_ZN5mongo12BasicCommand11enhancedRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE+0x76) [0x563e0046ec06]
 mongos(_ZN5mongo7Command9publicRunEPNS_16OperationContextERKNS_12OpMsgRequestERNS_14BSONObjBuilderE+0x1F) [0x563e0046b74f]
 mongos(+0x56FC6B) [0x563e000abc6b]
 mongos(+0x571054) [0x563e000ad054]
 mongos(_ZN5mongo8Strategy13clientCommandEPNS_16OperationContextERKNS_7MessageE+0x59) [0x563e000ad979]
 mongos(_ZN5mongo23ServiceEntryPointMongos13handleRequestEPNS_16OperationContextERKNS_7MessageE+0x5B1) [0x563dfffc42a1]
 mongos(_ZN5mongo19ServiceStateMachine15_processMessageERNS0_11ThreadGuardE+0xEE) [0x563dfffdd1ce]
 mongos(_ZN5mongo19ServiceStateMachine15_runNextInGuardERNS0_11ThreadGuardE+0x1B5) [0x563dfffdb425]
 mongos(_ZN5mongo19ServiceStateMachine7runNextEv+0x157) [0x563dfffdcb07]
 mongos(+0x49C4B1) [0x563dfffd84b1]
 mongos(+0xE1F764) [0x563e0095b764]
 libpthread.so.0(+0x7DC5) [0x7fdbf9517dc5]
 libc.so.6(clone+0x6D) [0x7fdbf92456ed]
-----  END BACKTRACE  -----

Demangled:

mongo::printStackTrace(std::basic_ostream<char, std::char_traits<char> >&)
 
 
 
mongo::DocumentSourceLimit::getNext()
mongo::Pipeline::getNext()
mongo::RouterStageAggregationMerge::next()
mongo::ClusterClientCursorImpl::next()
mongo::ClusterCursorManager::PinnedCursor::next()
mongo::ClusterFind::runGetMore(mongo::OperationContext*, mongo::GetMoreRequest const&)
 
mongo::BasicCommand::enhancedRun(mongo::OperationContext*, mongo::OpMsgRequest const&, mongo::BSONObjBuilder&)
mongo::Command::publicRun(mongo::OperationContext*, mongo::OpMsgRequest const&, mongo::BSONObjBuilder&)
 
 
mongo::Strategy::clientCommand(mongo::OperationContext*, mongo::Message const&)
mongo::ServiceEntryPointMongos::handleRequest(mongo::OperationContext*, mongo::Message const&)
mongo::ServiceStateMachine::_processMessage(mongo::ServiceStateMachine::ThreadGuard&)
mongo::ServiceStateMachine::_runNextInGuard(mongo::ServiceStateMachine::ThreadGuard&)
mongo::ServiceStateMachine::runNext()
 

Comment by Nathan Myers [ 17/Aug/17 ]

The attached backtrace identifies only addresses in the signal-handling code. Studying the logs, I see that the only events logged, other than periodic connections, disconnections, and metadata refreshes, occurred more than an hour before the crash. The events prior to that point were a series of autosplit operations not followed by chunk migrations ("but no migrations allowed"), and then, 33 seconds later:

2017-07-28T14:47:03.046+0000 I NETWORK  [listener] connection accepted from 10.2.0.10:35532 #5 (1 connection now open)
2017-07-28T14:47:03.046+0000 I NETWORK  [conn5] received client metadata from 10.2.0.10:35532 conn: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "3.5.10-167-g62edb2d-patch-5977444a2fbabe33e10018a5" }, os: { type: "Linux", name: "Amazon Linux AMI release 2016.09", architecture: "x86_64", version: "Kernel 4.4.41-36.55.amzn1.x86_64" } }
2017-07-28T14:47:03.048+0000 W COMMAND  [conn5] mongos collstats doesn't know about: operationTime
2017-07-28T14:47:03.048+0000 W COMMAND  [conn5] mongos collstats doesn't know about: $gleStats
2017-07-28T14:47:03.048+0000 W COMMAND  [conn5] mongos collstats doesn't know about: $clusterTime
2017-07-28T14:47:03.048+0000 W COMMAND  [conn5] mongos collstats doesn't know about: $configServerState
2017-07-28T14:47:03.049+0000 W COMMAND  [conn5] mongos collstats doesn't know about: operationTime
2017-07-28T14:47:03.049+0000 W COMMAND  [conn5] mongos collstats doesn't know about: $gleStats
2017-07-28T14:47:03.049+0000 W COMMAND  [conn5] mongos collstats doesn't know about: $clusterTime
2017-07-28T14:47:03.049+0000 W COMMAND  [conn5] mongos collstats doesn't know about: $configServerState
2017-07-28T14:47:03.050+0000 W COMMAND  [conn5] mongos collstats doesn't know about: operationTime
2017-07-28T14:47:03.050+0000 W COMMAND  [conn5] mongos collstats doesn't know about: $gleStats
2017-07-28T14:47:03.050+0000 W COMMAND  [conn5] mongos collstats doesn't know about: $clusterTime
2017-07-28T14:47:03.050+0000 W COMMAND  [conn5] mongos collstats doesn't know about: $configServerState
2017-07-28T14:47:03.058+0000 I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-7-0] Connecting to 10.2.0.71:27017
2017-07-28T14:47:03.059+0000 I ASIO     [NetworkInterfaceASIO-TaskExecutorPool-7-0] Successfully connected to 10.2.0.71:27017, took 1ms (1 connections now open to 10.2.0.71:27017)
2017-07-28T14:47:03.067+0000 I NETWORK  [conn5] end connection 10.2.0.10:35532 (0 connections now open)
[67 minutes pass]
2017-07-28T15:54:20.722+0000 F -        [NetworkInterfaceASIO-TaskExecutorPool-9-0] Invalid access at address: 0xabbb9380
2017-07-28T15:54:20.723+0000 F -        [NetworkInterfaceASIO-TaskExecutorPool-9-0] Got signal: 11 (Segmentation fault).

Comment by Daniel Pasette (Inactive) [ 31/Jul/17 ]

From the mongos log file:

mongos version v3.5.10-167-g62edb2d-patch-5977444a2fbabe33e10018a5
git version: 62edb2ddc7926312bafd33c932a5d9ed14d863f0
