[SERVER-19699] Save diagnostic files on failure - Windows
Created: 09/Jul/15  Updated: 07/Oct/15  Resolved: 18/Sep/15

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: None
Fix Version/s: 3.1.9

Type: Improvement
Priority: Major - P3
Reporter: Jonathan Abrahams
Assignee: Jonathan Abrahams
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related: is related to SERVER-12259 startup option to provide core dumps (Closed)
Backwards Compatibility: Fully Compatible
Sprint: TIG 9 (09/18/15)

 Description   

Saving the Windows diagnostic files as a build artifact when mongod or mongos fails is helpful for debugging the failure.

These files include:

  • C:\data\mci\src\mongod.*.mdmp
  • mongod.pdb, mongos.pdb, mongo.pdb

Save the Linux core dumps as a build artifact as well (see SERVER-12259)
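
As a rough illustration of the request above, the following Python sketch gathers the listed files into a single gzip-compressed tar archive that a build system could then upload as an artifact. It is not the change that was committed for this ticket. The function names, the archive format, and the assumption that the minidumps and .pdb files sit in the same directory are all made up for the example; only the file patterns come from the list above.

```python
import glob
import os
import tarfile

# File patterns taken from the ticket description.
DIAGNOSTIC_PATTERNS = ("mongod.*.mdmp", "mongod.pdb", "mongos.pdb", "mongo.pdb")


def collect_diagnostic_files(src_dir, patterns=DIAGNOSTIC_PATTERNS):
    """Return the sorted paths in src_dir that match any of the patterns."""
    found = set()
    for pattern in patterns:
        found.update(glob.glob(os.path.join(src_dir, pattern)))
    return sorted(found)


def archive_diagnostics(src_dir, archive_path):
    """Write matching files to a tar.gz archive and return their base names.

    Returns an empty list, and writes no archive, when nothing matches.
    """
    files = collect_diagnostic_files(src_dir)
    if not files:
        return []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            # Store only the base name so the archive has a flat layout.
            tar.add(path, arcname=os.path.basename(path))
    return [os.path.basename(path) for path in files]
```

For example, calling `archive_diagnostics(r"C:\data\mci\src", "diagnostics.tgz")` would return the names of the archived files, or an empty list if no matching files exist in that directory.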



 Comments   
Comment by Githook User [ 18/Sep/15 ]

Author: Jonathan Abrahams (username: hptabster, email: jonathan@mongodb.com)

Message: SERVER-19699 Save diagnostics files on failure, Windows.

This is part of the Evergreen system to save the Windows minidump in an
archive format.
Branch: master
https://github.com/mongodb/mongo/commit/7e6433a78e27e152c37702193b1f658f2c97287f

Comment by Eitan Klein [ 19/Aug/15 ]

Thanks ian@10gen.com for the update. Let me talk with Ernie to understand the issue, and I will get back to you.

Comment by Ian Whalen (Inactive) [ 19/Aug/15 ]

eitan.klein: Discussion with the Evergreen team has indicated that the tougher part of this might end up being the cross-platform naming conventions. We might be able to serialize the work if someone wanted to take a stab at implementing a version of this that just assumes the behavior in EVG-445. Then we can just push the two fixes at basically the same time?

Comment by Ernie Hershey [ 14/Aug/15 ]

Sorry. That does make sense. There's no reason not to do that too.

Comment by Ian Whalen (Inactive) [ 14/Aug/15 ]

ernie.hershey@10gen.com, that doesn't really answer my question above. Can't we just add a one-liner to start preserving this stuff in S3 and then worry about setting up flytrap later? I'd prefer not to block a super simple solution on a significantly more complicated system that isn't even planned until 3.4.

Comment by Ernie Hershey [ 14/Aug/15 ]

Because if we have a full implementation of flytrap and breakpad, capturing diagnostic info consistently across platforms with a unified way to view them, I don't know if it makes sense to do this as well.

Comment by Atul Kachru [ 14/Aug/15 ]

Hey ernie.hershey@10gen.com, why was this moved to features we are not sure of?

Comment by Ian Whalen (Inactive) [ 10/Aug/15 ]

Once EVG-445 is done, does it make sense to at least implement this as a simple s3.put so we don't lose the core dumps entirely, and then focus on storing that data in a cleaner system (like flytrap) as a follow-on?

Comment by Ernie Hershey [ 10/Aug/15 ]

We'd like to do this via breakpad/flytrap, segmenting internal data from normal reports. We can discuss further as well.

Comment by Ian Whalen (Inactive) [ 31/Jul/15 ]

Pending completion of EVG-445, this will be possible simply by adding the correct s3.put to evergreen.yml

Comment by Jonathan Abrahams [ 09/Jul/15 ]

See the full log here: https://logkeeper.mongodb.org/build/559dfc1abe07c41b95188be3/test/559dfc1a90413031ce18827a

Relevant info:

[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.459+0000  m31101| 2015-07-09T04:52:13.458+0000 I CONTROL  [conn12] mongod.exe    f:\dd\vctools\crt\crtw32\stdcpp\thr\mutex.c(49)                                  mtx_do_lock+0x27
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.459+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    ...\src\mongo\s\d_state.cpp(858)                                                 mongo::ShardedConnectionInfo::addHook+0x77
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.459+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    ...\src\mongo\db\pipeline\pipeline_d.cpp(120)                                    mongo::PipelineD::prepareCursorSource+0x31f
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.459+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    ...\src\mongo\db\commands\pipeline_command.cpp(239)                              mongo::PipelineCommand::run+0x40c
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.459+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    ...\src\mongo\db\dbcommands.cpp(1316)                                            mongo::Command::run+0x3ae
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.459+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    ...\src\mongo\db\dbcommands.cpp(1270)                                            mongo::Command::execCommand+0x938
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.460+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    ...\src\mongo\db\commands.cpp(16707566)                                          mongo::runCommands+0x256
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.460+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    ...\src\mongo\db\instance.cpp(293)                                               mongo::`anonymous namespace'::receivedRpc+0x1e7
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.460+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    ...\src\mongo\db\instance.cpp(509)                                               mongo::assembleResponse+0x7de
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.460+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    ...\src\mongo\db\db.cpp(167)                                                     mongo::MyMessageHandler::process+0xa2
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.460+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    ...\src\mongo\util\net\message_server_port.cpp(231)                              mongo::PortMessageServer::handleIncomingMsg+0x47d
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.460+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    c:\program files (x86)\microsoft visual studio 12.0\vc\include\thr\xthread(187)  std::_LaunchPad<std::_Bind<0,void,boost::_bi::bind_t<void,boost::_mfi::mf0<void,mongo::CatalogManagerLegacy>,boost::_bi::list1<boost::_bi::value<mongo::CatalogManagerLegacy * __ptr64> > > > >::_Go+0x2c
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.460+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    f:\dd\vctools\crt\crtw32\stdcpp\thr\threadcall.cpp(28)                           _Call_func+0x14
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.460+0000  m31101| 2015-07-09T04:52:13.459+0000 I CONTROL  [conn12] mongod.exe    f:\dd\vctools\crt\crtw32\startup\threadex.c(376)                                 _callthreadstartex+0x17
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.460+0000  m31101| 2015-07-09T04:52:13.460+0000 I CONTROL  [conn12] mongod.exe    f:\dd\vctools\crt\crtw32\startup\threadex.c(354)                                 _threadstartex+0x102
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:13.460+0000  m31101| 2015-07-09T04:52:13.460+0000 I CONTROL  [conn12] kernel32.dll
...
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.555+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\util\stacktrace_windows.cpp(174)                                   mongo::printStackTrace+0x43
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.555+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\util\log.cpp(134)                                                  mongo::logContext+0xa4
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.555+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\util\assert_util.cpp(225)                                          mongo::msgasserted+0xf1
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.555+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\util\assert_util.cpp(217)                                          mongo::msgasserted+0x13
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.555+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\s\commands\cluster_pipeline_cmd.cpp(372)                           mongo::`anonymous namespace'::PipelineCommand::aggRunCommand+0x37d
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.555+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\s\commands\cluster_pipeline_cmd.cpp(226)                           mongo::`anonymous namespace'::PipelineCommand::run+0x1609
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.555+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\s\s_only.cpp(123)                                                  mongo::Command::execCommandClientBasic+0x457
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.555+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\s\s_only.cpp(164)                                                  mongo::Command::runAgainstRegistered+0x207
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.555+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\s\strategy.cpp(310)                                                mongo::Strategy::clientCommandOp+0x98f
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.556+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\s\request.cpp(117)                                                 mongo::Request::process+0x4ba
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.556+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\s\server.cpp(142)                                                  mongo::ShardedMessageHandler::process+0x6c
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.556+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    ...\src\mongo\util\net\message_server_port.cpp(231)                              mongo::PortMessageServer::handleIncomingMsg+0x47d
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.556+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    c:\program files (x86)\microsoft visual studio 12.0\vc\include\thr\xthread(187)  std::_LaunchPad<std::_Bind<0,void,boost::_bi::bind_t<void,boost::_mfi::mf0<void,mongo::CatalogManagerLegacy>,boost::_bi::list1<boost::_bi::value<mongo::CatalogManagerLegacy * __ptr64> > > > >::_Go+0x2c
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.556+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    f:\dd\vctools\crt\crtw32\stdcpp\thr\threadcall.cpp(28)                           _Call_func+0x14
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.556+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    f:\dd\vctools\crt\crtw32\startup\threadex.c(376)                                 _callthreadstartex+0x17
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.558+0000  m30999| 2015-07-09T04:52:16.554+0000 I CONTROL  [conn314] mongos.exe    f:\dd\vctools\crt\crtw32\startup\threadex.c(354)                                 _threadstartex+0x102
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.558+0000  m30999| 2015-07-09T04:52:16.555+0000 I CONTROL  [conn314] kernel32.dll                                                                                   BaseThreadInitThunk+0xd
[js_test:fsm_all_sharded_replication] 2015-07-09T04:52:16.558+0000  m30999| 2015-07-09T04:52:16.555+0000 I CONTROL  [conn314]

Comment by Eitan Klein [ 09/Jul/15 ]

It's also called a crash dump file, which is produced as a result of an access violation.

jonathan.abrahams, can you give them an example of a report versus the lack of an actionable memory dump for investigating the problem?

Generated at Thu Feb 08 03:51:47 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.