[SERVER-2662] seg fault equivalent when query plan cannot recover from yield, does not assert, and is yielded again Created: 02/Mar/11  Updated: 12/Jul/16  Resolved: 17/Mar/11

Status: Closed
Project: Core Server
Component/s: Concurrency
Affects Version/s: None
Fix Version/s: 1.8.1, 1.9.0

Type: Bug Priority: Major - P3
Reporter: Aaron Staple Assignee: Aaron Staple
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

Thread 7 Crashed:
0 libSystem.B.dylib 0x00007fff86fcfc75 __abort + 113
1 libSystem.B.dylib 0x00007fff86fcfcd9 abort_report_np + 0
2 libSystem.B.dylib 0x00007fff86fbcc90 __pthread_markcancel + 0
3 mongod 0x000000010008410c boost::shared_ptr<mongo::Cursor>::operator->() const + 64 (shared_ptr.hpp:419)
4 mongod 0x0000000100091098 mongo::ClientCursor::ClientCursor(int, boost::shared_ptr<mongo::Cursor> const&, std::string const&, mongo::BSONObj) + 860 (clientcursor.cpp:260)
5 mongod 0x00000001001a7feb mongo::UserQueryOp::prepareToYield() + 211 (query.cpp:694)
6 mongod 0x00000001000ba12a mongo::QueryPlanSet::Runner::prepareToYield(mongo::QueryOp&) + 76 (queryoptimizer.cpp:659)
7 mongod 0x00000001000bb703 mongo::QueryPlanSet::Runner::mayYield(std::vector<boost::shared_ptr<mongo::QueryOp>, std::allocator<boost::shared_ptr<mongo::QueryOp> > > const&) + 125 (queryoptimizer.cpp:544)
8 mongod 0x00000001000be23a mongo::QueryPlanSet::Runner::run() + 1172 (queryoptimizer.cpp:600)
9 mongod 0x00000001000c09ef mongo::QueryPlanSet::runOp(mongo::QueryOp&) + 591 (queryoptimizer.cpp:494)
10 mongod 0x00000001000c0d57 mongo::MultiPlanScanner::runOpOnce(mongo::QueryOp&) + 157 (queryoptimizer.cpp:715)
11 mongod 0x0000000100247f6c mongo::MultiPlanScanner::runOp(mongo::QueryOp&) + 38 (queryoptimizer.cpp:732)
12 mongod 0x00000001001a975a boost::shared_ptr<mongo::UserQueryOp> mongo::MultiPlanScanner::runOp<mongo::UserQueryOp>(mongo::UserQueryOp&) + 42 (queryoptimizer.h:290)
13 mongod 0x00000001001a015f mongo::runQuery(mongo::Message&, mongo::QueryMessage&, mongo::CurOp&, mongo::Message&) + 4906 (query.cpp:1126)
14 mongod 0x0000000100235e1e mongo::receivedQuery(mongo::Client&, mongo::DbResponse&, mongo::Message&) + 189 (instance.cpp:176)
15 mongod 0x0000000100237253 mongo::assembleResponse(mongo::Message&, mongo::DbResponse&, mongo::SockAddr const&) + 860 (instance.cpp:283)
16 mongod 0x00000001002e9b8c mongo::connThread(mongo::MessagingPort*) + 724 (db.cpp:239)
17 mongod 0x00000001002f2847 void boost::_bi::list1<boost::_bi::value<mongo::MessagingPort*> >::operator()<void (mongo::MessagingPort*), boost::_bi::list0>(boost::_bi::type<void>, void (&)(mongo::MessagingPort), boost::_bi::list0&, int) + 59 (bind.hpp:253)
18 mongod 0x000000010015836a boost::_bi::bind_t<void, void (mongo::MessagingPort*), boost::_bi::list1<boost::_bi::value<mongo::MessagingPort*> > >::operator()() + 54 (bind_template.hpp:20)
19 mongod 0x0000000100158388 boost::detail::thread_data<boost::_bi::bind_t<void, void (mongo::MessagingPort*), boost::_bi::list1<boost::_bi::value<mongo::MessagingPort*> > > >::run() + 28 (thread.hpp:56)
20 libboost_thread-mt.dylib 0x0000000100f3e404 thread_proxy + 132
21 libSystem.B.dylib 0x00007fff86f1a536 _pthread_start + 331
22 libSystem.B.dylib 0x00007fff86f1a3e9 thread_start + 13

virtual bool prepareToYield() {
    if ( _findingStartCursor.get() ) {
        return _findingStartCursor->prepareToYield();
    }
    else {
        if ( ! _cc ) {
            _cc.reset( new ClientCursor( QueryOption_NoCursorTimeout , _c , _pq.ns() ) );
        }
        return _cc->prepareToYield( _yieldData );
    }
}

virtual void recoverFromYield() {
    _nYields++;

    if ( _findingStartCursor.get() ) {
        _findingStartCursor->recoverFromYield();
    }
    else if ( ! ClientCursor::recoverFromYield( _yieldData ) ) {
        _c.reset();
        _cc.reset();
        _so.reset();

        if ( _capped ) {
            msgassertedNoTrace( 13338, str::stream() << "capped cursor overrun during query: " << _pq.ns() );
        }
        else {
            // we don't fail query since we're fine with returning partial data if collection dropped
            // todo: this is wrong. the cursor could be gone if closeAllDatabases command just ran
        }
    }
}

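If recoverFromYield() clears _c and does not assert, the query optimizer may call prepareToYield() again before next(). Because _cc was also reset, prepareToYield() constructs a new ClientCursor from the null _c, and the ClientCursor constructor dereferences the null shared_ptr, which triggers the shared_ptr assertion and abort seen in the stack trace above.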
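
The failure mode can be illustrated with a small standalone sketch. It is not the mongod code and not the actual patch; the only thing taken from this ticket is the idea in the commit message (do not attempt to yield again after an earlier yield failed and dropped the cursor). All names here (Cursor, CursorHandle, YieldingOp, runWithYields, collectionStillExists) are hypothetical stand-ins, and std::shared_ptr is used in place of boost::shared_ptr.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for mongo::Cursor.
struct Cursor {
    bool ok = true;
};

// Hypothetical stand-in for ClientCursor: like the real constructor in the
// stack trace, it uses the cursor immediately, so a null pointer is fatal.
struct CursorHandle {
    explicit CursorHandle(const std::shared_ptr<Cursor>& c) : cursor(c) {
        assert(cursor && "CursorHandle built from a null cursor");
    }
    std::shared_ptr<Cursor> cursor;
};

// Hypothetical stand-in for the query op that owns _c and _cc.
struct YieldingOp {
    std::shared_ptr<Cursor> c = std::make_shared<Cursor>();  // plays the role of _c
    std::unique_ptr<CursorHandle> cc;                        // plays the role of _cc
    bool collectionStillExists = true;  // controls whether recovery succeeds
    int nYields = 0;

    // Guarded version: refuse to yield once the cursor has been dropped,
    // instead of wrapping a null pointer.
    bool prepareToYield() {
        if (!c) {
            return false;
        }
        if (!cc) {
            cc.reset(new CursorHandle(c));
        }
        return true;
    }

    // Mirrors the silent branch in the ticket: on a failed recovery the
    // cursor state is cleared and nothing is asserted.
    void recoverFromYield() {
        ++nYields;
        if (!collectionStillExists) {
            c.reset();
            cc.reset();
        }
    }
};

// Simulates a runner that yields repeatedly without calling next() in
// between; returns how many yields completed.
int runWithYields(YieldingOp& op, int attempts) {
    int completed = 0;
    for (int i = 0; i < attempts; ++i) {
        if (!op.prepareToYield()) {
            break;  // cursor is gone; do not yield again
        }
        op.recoverFromYield();
        ++completed;
    }
    return completed;
}
```

Without the null check at the top of prepareToYield(), the second yield attempt in this sketch would construct a CursorHandle from a null pointer and hit the assertion, which is the analogue of the abort in the trace.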



 Comments   
Comment by auto [ 19/Mar/11 ]

Author: Aaron (astaple) <aaron@10gen.com>

Message: SERVER-2662 don't attempt to yield a query after an earlier yield fails and drops the cursor
https://github.com/mongodb/mongo/commit/15c6cce90ad277ecb10cc9f4f8a53f4061358259

Comment by auto [ 19/Mar/11 ]

Author: Aaron (astaple) <aaron@10gen.com>

Message: SERVER-2662 test
https://github.com/mongodb/mongo/commit/ab41f59e8f8d5bd9f40513cd1121f4365d3484c4

Comment by auto [ 17/Mar/11 ]

Author: Aaron (astaple) <aaron@10gen.com>

Message: SERVER-2662 don't attempt to yield a query after an earlier yield fails and drops the cursor
https://github.com/mongodb/mongo/commit/d3a9fe9ae236b7b3013533453c0b33fd94051bb6

Comment by auto [ 07/Mar/11 ]

Author: Aaron (astaple) <aaron@10gen.com>

Message: SERVER-2662 test
https://github.com/mongodb/mongo/commit/93572fc05b83f3635abd1dc2d55aa810a46b8a83
