[SERVER-16278] Race between shutdown and fsync flush with WiredTiger Created: 21/Nov/14  Updated: 23/Jan/15  Resolved: 20/Jan/15

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 3.0.0-rc6

Type: Bug Priority: Major - P3
Reporter: Kaloian Manassiev Assignee: Mark Benvenuto
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:

 Description   

It looks like if fsync flush is run just as the system is shutting down, the engine gets ripped out from underneath the command and it crashes with an access violation (AV):

	
 m30000| 2014-11-21T20:00:06.109+0000 I CONTROL  [signalProcessingThread] got signal 15 (Terminated), will terminate after current cmd ends
 m30000| 2014-11-21T20:00:06.109+0000 I COMMAND  [signalProcessingThread] now exiting
 m30000| 2014-11-21T20:00:06.109+0000 I NETWORK  [signalProcessingThread] shutdown: going to close listening sockets...
 m30000| 2014-11-21T20:00:06.109+0000 I NETWORK  [signalProcessingThread] closing listening socket: 17
 m30000| 2014-11-21T20:00:06.109+0000 I NETWORK  [signalProcessingThread] closing listening socket: 18
 m30000| 2014-11-21T20:00:06.109+0000 I NETWORK  [signalProcessingThread] removing socket file: /tmp/mongodb-30000.sock
 m30000| 2014-11-21T20:00:06.109+0000 I NETWORK  [signalProcessingThread] shutdown: going to flush diaglog...
 m30000| 2014-11-21T20:00:06.109+0000 I NETWORK  [signalProcessingThread] shutdown: going to close sockets...
 m30000| 2014-11-21T20:00:06.109+0000 I STORAGE  [signalProcessingThread] WiredTigerKVEngine shutting down
...
 m30000| 2014-11-21T20:00:06.265+0000 F -        [conn44] Invalid access at address: 0x7cc
 m30000| 2014-11-21T20:00:06.274+0000 F -        [conn44] Got signal: 11 (Segmentation fault).
...
 m30000|  mongod(mongo::printStackTrace(std::ostream&) 0x29) [0xf3ebd9]
 m30000|  mongod( 0xB3E542) [0xf3e542]
 m30000|  mongod( 0xB3E86E) [0xf3e86e]
 m30000|  libpthread.so.0( 0xECA0) [0x2b30c24a5ca0]
 m30000|  mongod(__wt_realloc 0xE0) [0x1343e50]
 m30000|  mongod(__wt_meta_track_on 0x83) [0x1342703]
 m30000|  mongod(__wt_txn_checkpoint 0x815) [0x136b9d5]
 m30000|  mongod( 0xF60116) [0x1360116]
 m30000|  mongod(mongo::WiredTigerKVEngine::flushAllFiles(bool) 0xAE) [0xd531de]
 m30000|  mongod(mongo::FSyncCommand::run(mongo::OperationContext*, std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool) 0x2D5) [0x946605]
 m30000|  mongod(mongo::_execCommand(mongo::OperationContext*, mongo::Command*, std::string const&, mongo::BSONObj&, int, std::string&, mongo::BSONObjBuilder&, bool) 0x34) [0x9b60b4]
 m30000|  mongod(mongo::Command::execCommand(mongo::OperationContext*, mongo::Command*, int, char const*, mongo::BSONObj&, mongo::BSONObjBuilder&, bool) 0xC62) [0x9b6f52]
 m30000|  mongod(mongo::_runCommands(mongo::OperationContext*, char const*, mongo::BSONObj&, mongo::_BufBuilder<mongo::TrivialAllocator>&, mongo::BSONObjBuilder&, bool, int) 0x2D0) [0x9b7ac0]
 m30000|  mongod(mongo::newRunQuery(mongo::OperationContext*, mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&, bool) 0x101C) [0xbb8f7c]
 m30000|  mongod(mongo::assembleResponse(mongo::OperationContext*, mongo::Message&, mongo::DbResponse&, mongo::HostAndPort const&, bool) 0xBB3) [0xa9d3d3]
 m30000|  mongod(mongo::MyMessageHandler::process(mongo::Message&, mongo::AbstractMessagingPort*, mongo::LastError*) 0xE0) [0x7df9f0]
 m30000|  mongod(mongo::PortMessageServer::handleIncomingMsg(void*) 0x421) [0xefb211]
 m30000|  libpthread.so.0( 0x683D) [0x2b30c249d83d]
 m30000|  libc.so.6(clone 0x6D) [0x2b30c3327fcd]
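
A minimal sketch of one common way to close this kind of race, under stated assumptions: the flush path and the shutdown path take the same mutex, and the flush checks a "shut down" flag before touching engine state. `GuardedEngine`, `cleanShutdown`, `flushCount` and `demoFlushesAroundShutdown` are hypothetical names used for illustration only; this is not MongoDB's or WiredTiger's code, and the actual fix in the linked commits may work differently.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a storage engine wrapper (not the real WiredTigerKVEngine).
class GuardedEngine {
public:
    // Returns true if a flush ran, false if the engine was already shut down.
    bool flushAllFiles() {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_shutDown) {
            return false;  // engine resources are gone; do not touch them
        }
        ++_flushCount;  // placeholder for the real checkpoint call
        return true;
    }

    // Waits for any in-progress flush (it holds the mutex), then marks the engine closed.
    void cleanShutdown() {
        std::lock_guard<std::mutex> lk(_mutex);
        _shutDown = true;
    }

    // Number of flushes that actually ran.
    int flushCount() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _flushCount;
    }

private:
    mutable std::mutex _mutex;
    bool _shutDown = false;
    int _flushCount = 0;
};

// Exercises the guard: one flush before shutdown succeeds, one after is refused.
inline int demoFlushesAroundShutdown() {
    GuardedEngine engine;
    bool before = engine.flushAllFiles();
    engine.cleanShutdown();
    bool after = engine.flushAllFiles();
    assert(before && !after);
    return engine.flushCount();
}
```

With a guard like this, an fsync that arrives after shutdown has started gets a clean refusal instead of dereferencing freed engine state. The cost is that shutdown blocks until an in-flight flush completes.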



 Comments   
Comment by Githook User [ 20/Jan/15 ]

Author: Mark Benvenuto (markbenvenuto, mark.benvenuto@mongodb.com)

Message: SERVER-16278: Race between shutdown and fsync flush with WiredTiger
Branch: master
https://github.com/mongodb/mongo/commit/f8c0611e742c07a293da9cb49baac211fe1d8874

Comment by Mark Benvenuto [ 24/Nov/14 ]

I think this has been fixed in WiredTiger (WT) by the most recent pull: https://github.com/mongodb/mongo/commit/792f66beb03f0c0293b56e9aad16f94ad718357a
