[SERVER-6833] Secondary crashes when replicating a document that the primary accepted (with a field name containing a "$") Created: 23/Aug/12  Updated: 11/Jul/16  Resolved: 04/Sep/13

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 2.2.0-rc1
Fix Version/s: 2.5.2

Type: Bug Priority: Minor - P4
Reporter: Robert Stam Assignee: Matt Dannenberg
Resolution: Done Votes: 3
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
depends on SERVER-7150 update() able to create fields which ... Closed
depends on SERVER-6399 Refactor update() code Closed
Duplicate
is duplicated by SERVER-6537 Replicaset stop replication Closed
is duplicated by SERVER-18938 Inserting $id will crash Secondaries Closed
Related
is related to SERVER-7557 Update doesn't support "$"s in the fi... Closed
is related to CSHARP-547 Use of $csharpnull when serializing a... Closed
is related to SERVER-10163 Relax constraints in new new update f... Closed
Operating System: ALL
Backport Requested:
v2.4
Participants:

 Description   

The following code using the C# driver crashes the secondaries.

var server = MongoServer.Create("mongodb://Kilimanjaro:10001,Kilimanjaro:10002,Kilimanjaro:10003/?safe=true");
var database = server.GetDatabase("test");
var collection = database.GetCollection("test");
 
var document = new BsonDocument
{
    { "_id", new BsonDocument("$csharpnull", true) }
};
 
collection.Drop();
collection.Insert(document);

While it might be argued that the C# driver shouldn't have used a "$" in a field name (and will be changed to not do so), nonetheless the server should have protected itself and not crashed. If the document submitted by the client driver was not acceptable then it should have been rejected by the primary instead of allowing the secondaries to crash.

I've marked this as minor because it's a pretty rare edge case that probably doesn't occur very often in the wild (because it only occurs when serializing a class with a property of type BsonNull whose value is C# null, which isn't really all that useful). However, it is possible that this is just a special case of other funny documents that the primary accepts that subsequently crash the secondaries.

This is the log output from the secondary:

DBException+0x6e
mongod.exe  ...\src\mongo\util\net\sock.h(140)              mongo::SocketException::SocketException+0xd3
mongod.exe  ...\src\mongo\util\net\sock.cpp(763)            mongo::Socket::recv+0x300
mongod.exe  ...\src\mongo\util\net\message_port.cpp(164)    mongo::MessagingPort::recv+0x8c
mongod.exe  ...\src\mongo\util\net\message_port.cpp(227)    mongo::MessagingPort::recv+0x31
mongod.exe  ...\src\mongo\client\dbclient.cpp(1137)         mongo::DBClientConnection::call+0x96
mongod.exe  ...\src\mongo\client\dbclientcursor.cpp(66)     mongo::DBClientCursor::init+0xd1
mongod.exe  ...\src\mongo\client\dbclient.cpp(780)          mongo::DBClientBase::query+0xd8
mongod.exe  ...\src\mongo\client\dbclientinterface.h(1084)  mongo::DBClientConnection::query+0xb4
mongod.exe  ...\src\mongo\client\dbclient.cpp(664)          mongo::DBClientInterface::findN+0xb7
mongod.exe  ...\src\mongo\client\dbclient.cpp(684)          mongo::DBClientInterface::findOne+0x77
mongod.exe  ...\src\mongo\client\dbclient.cpp(354)          mongo::DBClientWithCommands::runCommand+0x2a6
mongod.exe  ...\src\mongo\client\dbclient.cpp(733)          mongo::DBClientConnection::runCommand+0x2d
mongod.exe  ...\src\mongo\db\repl\heartbeat.cpp(158)        mongo::requestHeartbeat+0x4fc
mongod.exe  ...\src\mongo\db\repl\heartbeat.cpp(266)        mongo::ReplSetHealthPollTask::_requestHeartbeat+0x1b9
mongod.exe  ...\src\mongo\db\repl\heartbeat.cpp(213)        mongo::ReplSetHealthPollTask::doWork+0x1ba
mongod.exe  ...\src\mongo\util\concurrency\task.cpp(58)     mongo::task::Task::run+0x5c
mongod.exe  ...\src\mongo\util\background.cpp(64)           mongo::BackgroundJob::jobBody+0x25c
Wed Aug 22 23:54:06 [rsHealthPoll] DBClientCursor::init call() failed
Wed Aug 22 23:54:06 [rsHealthPoll] warning: DBException thrown :: caused by :: 10276 DBClientBase::findN: transport error: Kilimanjaro:10002 ns: admin.$cmd query: { replSetHeartbeat: "myset", v: 3, pv: 1, checkEmpty: false, from: "Kilimanjaro:10003", $auth: {} }
mongod.exe    ...\src\mongo\util\stacktrace.cpp(158)                           mongo::printStackTrace+0x3e
mongod.exe    ...\src\mongo\util\assert_util.cpp(67)                           mongo::DBException::traceIfNeeded+0xa6
mongod.exe    ...\src\mongo\util\assert_util.cpp(141)                          mongo::uasserted+0x144
mongod.exe    ...\src\mongo\client\dbclient.cpp(666)                           mongo::DBClientInterface::findN+0x1ae
mongod.exe    ...\src\mongo\client\dbclient.cpp(684)                           mongo::DBClientInterface::findOne+0x77
mongod.exe    ...\src\mongo\client\dbclient.cpp(354)                           mongo::DBClientWithCommands::runCommand+0x2a6
mongod.exe    ...\src\mongo\client\dbclient.cpp(733)                           mongo::DBClientConnection::runCommand+0x2d
mongod.exe    ...\src\mongo\db\repl\heartbeat.cpp(158)                         mongo::requestHeartbeat+0x4fc
mongod.exe    ...\src\mongo\db\repl\heartbeat.cpp(266)                         mongo::ReplSetHealthPollTask::_requestHeartbeat+0x1b9
mongod.exe    ...\src\mongo\db\repl\heartbeat.cpp(213)                         mongo::ReplSetHealthPollTask::doWork+0x1ba
mongod.exe    ...\src\mongo\util\concurrency\task.cpp(58)                      mongo::task::Task::run+0x5c
mongod.exe    ...\src\mongo\util\background.cpp(64)                            mongo::BackgroundJob::jobBody+0x25c
mongod.exe    ...\src\third_party\boost\boost\bind\mem_fn_template.hpp(165)    boost::_mfi::mf1<void,mongo::BackgroundJob,boost::shared_ptr<mongo::BackgroundJob::JobStatus> >::operator()+0x47
mongod.exe    ...\src\third_party\boost\boost\thread\detail\thread.hpp(63)     boost::detail::thread_data<boost::_bi::bind_t<void,boost::_mfi::mf1<void,mongo::BackgroundJob,boost::shared_ptr<mongo::BackgroundJob::JobStatus> >,boost::_bi::list2<boost::_bi::value<mongo::BackgroundJob * __ptr64>,boost::_bi::value<boost::shared_ptr<mongo::BackgroundJob::JobStatus> > > > >::run+0x31
mongod.exe    ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(180)  boost::`anonymous namespace'::thread_start_function+0x21
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(314)      _callthreadstartex+0x17
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(292)      _threadstartex+0x7f
kernel32.dll                                                                   BaseThreadInitThunk+0xd
Wed Aug 22 23:54:15 [rsHealthPoll] replSet member Kilimanjaro:10002 is up
Wed Aug 22 23:54:25 [initandlisten] connection accepted from 192.168.1.17:19284#6 (3 connections now open)
Wed Aug 22 23:54:25 [conn5] warning: DBException thrown :: caused by :: 9001 socket exception [CLOSED] for 192.168.1.17:19278
Wed Aug 22 23:54:25 [initandlisten] connection accepted from 192.168.1.17:19287#7 (4 connections now open)
mongod.exe    ...\src\mongo\util\stacktrace.cpp(158)                           mongo::printStackTrace+0x3e
mongod.exe    ...\src\mongo\util\assert_util.cpp(67)                           mongo::DBException::traceIfNeeded+0xa6
mongod.exe    ...\src\mongo\util\assert_util.h(93)                             mongo::DBException::DBException+0x6e
mongod.exe    ...\src\mongo\util\net\sock.h(140)                               mongo::SocketException::SocketException+0xd3
mongod.exe    ...\src\mongo\util\net\sock.cpp(741)                             mongo::Socket::recv+0x1e5
mongod.exe    ...\src\mongo\util\net\message_port.cpp(164)                     mongo::MessagingPort::recv+0x8c
mongod.exe    ...\src\mongo\util\net\message_server_port.cpp(75)               mongo::pms::threadRun+0x46a
mongod.exe    ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(180)  boost::`anonymous namespace'::thread_start_function+0x21
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(314)      _callthreadstartex+0x17
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(292)      _threadstartex+0x7f
kernel32.dll                                                                   BaseThreadInitThunk+0xd
Wed Aug 22 23:54:26 [conn5] end connection 192.168.1.17:19278 (3 connections now open)
Wed Aug 22 23:54:27 [rsHealthPoll] replSet member Kilimanjaro:10002 is now in state STARTUP2
Wed Aug 22 23:54:55 [conn6] warning: DBException thrown :: caused by :: 9001 socket exception [CLOSED] for 192.168.1.17:19284
Wed Aug 22 23:54:55 [initandlisten] connection accepted from 192.168.1.17:19291#8 (4 connections now open)
Wed Aug 22 23:54:55 [conn7] warning: DBException thrown :: caused by :: 9001 socket exception [CLOSED] for 192.168.1.17:19287
Stack trace failed, SymInitialize failed with error 87
Wed Aug 22 23:54:55 [initandlisten] connection accepted from 192.168.1.17:19294#9 (5 connections now open)
Wed Aug 22 23:54:55 [conn7] end connection 192.168.1.17:19287 (3 connections now open)
mongod.exe    ...\src\mongo\util\stacktrace.cpp(158)                           mongo::printStackTrace+0x3e
mongod.exe    ...\src\mongo\util\assert_util.cpp(67)                           mongo::DBException::traceIfNeeded+0xa6
mongod.exe    ...\src\mongo\util\assert_util.h(93)                             mongo::DBException::DBException+0x6e
mongod.exe    ...\src\mongo\util\net\sock.h(140)                               mongo::SocketException::SocketException+0xd3
mongod.exe    ...\src\mongo\util\net\sock.cpp(741)                             mongo::Socket::recv+0x1e5
mongod.exe    ...\src\mongo\util\net\message_port.cpp(164)                     mongo::MessagingPort::recv+0x8c
mongod.exe    ...\src\mongo\util\net\message_server_port.cpp(75)               mongo::pms::threadRun+0x46a
mongod.exe    ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(180)  boost::`anonymous namespace'::thread_start_function+0x21
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(314)      _callthreadstartex+0x17
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(292)      _threadstartex+0x7f
kernel32.dll                                                                   BaseThreadInitThunk+0xd
Wed Aug 22 23:54:56 [conn6] end connection 192.168.1.17:19284 (3 connections now open)
Wed Aug 22 23:55:25 [conn8] warning: DBException thrown :: caused by :: 9001 socket exception [CLOSED] for 192.168.1.17:19291
Wed Aug 22 23:55:25 [initandlisten] connection accepted from 192.168.1.17:19300#10 (4 connections now open)
Wed Aug 22 23:55:25 [rsHealthPoll] replSet member Kilimanjaro:10002 is now in state RECOVERING
Wed Aug 22 23:55:25 [conn9] warning: DBException thrown :: caused by :: 9001 socket exception [CLOSED] for 192.168.1.17:19294
Stack trace failed, SymInitialize failed with error 87
Wed Aug 22 23:55:25 [initandlisten] connection accepted from 192.168.1.17:19303#11 (5 connections now open)
Wed Aug 22 23:55:25 [conn9] end connection 192.168.1.17:19294 (3 connections now open)
mongod.exe    ...\src\mongo\util\stacktrace.cpp(158)                           mongo::printStackTrace+0x3e
mongod.exe    ...\src\mongo\util\assert_util.cpp(67)                           mongo::DBException::traceIfNeeded+0xa6
mongod.exe    ...\src\mongo\util\assert_util.h(93)                             mongo::DBException::DBException+0x6e
mongod.exe    ...\src\mongo\util\net\sock.h(140)                               mongo::SocketException::SocketException+0xd3
mongod.exe    ...\src\mongo\util\net\sock.cpp(741)                             mongo::Socket::recv+0x1e5
mongod.exe    ...\src\mongo\util\net\message_port.cpp(164)                     mongo::MessagingPort::recv+0x8c
mongod.exe    ...\src\mongo\util\net\message_server_port.cpp(75)               mongo::pms::threadRun+0x46a
mongod.exe    ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(180)  boost::`anonymous namespace'::thread_start_function+0x21
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(314)      _callthreadstartex+0x17
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(292)      _threadstartex+0x7f
kernel32.dll                                                                   BaseThreadInitThunk+0xd
Wed Aug 22 23:55:26 [conn8] end connection 192.168.1.17:19291 (3 connections now open)
Wed Aug 22 23:55:27 [rsHealthPoll] replSet member Kilimanjaro:10002 is now in state SECONDARY
Wed Aug 22 23:55:55 [conn10] warning: DBException thrown :: caused by :: 9001 socket exception [CLOSED] for 192.168.1.17:19300
Wed Aug 22 23:55:55 [initandlisten] connection accepted from 192.168.1.17:19310#12 (4 connections now open)
Wed Aug 22 23:55:55 [conn11] warning: DBException thrown :: caused by :: 9001 socket exception [CLOSED] for 192.168.1.17:19303
Stack trace failed, SymInitialize failed with error 87
Wed Aug 22 23:55:55 [initandlisten] connection accepted from 192.168.1.17:19313#13 (5 connections now open)
Wed Aug 22 23:55:55 [conn11] end connection 192.168.1.17:19303 (3 connections now open)
mongod.exe    ...\src\mongo\util\stacktrace.cpp(158)                           mongo::printStackTrace+0x3e
mongod.exe    ...\src\mongo\util\assert_util.cpp(67)                           mongo::DBException::traceIfNeeded+0xa6
mongod.exe    ...\src\mongo\util\assert_util.h(93)                             mongo::DBException::DBException+0x6e
mongod.exe    ...\src\mongo\util\net\sock.h(140)                               mongo::SocketException::SocketException+0xd3
mongod.exe    ...\src\mongo\util\net\sock.cpp(741)                             mongo::Socket::recv+0x1e5
mongod.exe    ...\src\mongo\util\net\message_port.cpp(164)                     mongo::MessagingPort::recv+0x8c
mongod.exe    ...\src\mongo\util\net\message_server_port.cpp(75)               mongo::pms::threadRun+0x46a
mongod.exe    ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(180)  boost::`anonymous namespace'::thread_start_function+0x21
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(314)      _callthreadstartex+0x17
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(292)      _threadstartex+0x7f
kernel32.dll                                                                   BaseThreadInitThunk+0xd
Wed Aug 22 23:55:56 [conn10] end connection 192.168.1.17:19300 (3 connections now open)
Wed Aug 22 23:55:58 [initandlisten] connection accepted from 192.168.1.17:19314#14 (4 connections now open)
Wed Aug 22 23:56:03 [repl writer worker 1] warning: DBException thrown :: caused by :: 10068 invalid operator: $csharpnull
mongod.exe  ...\src\mongo\util\stacktrace.cpp(158)                           mongo::printStackTrace+0x3e
mongod.exe  ...\src\mongo\util\assert_util.cpp(67)                           mongo::DBException::traceIfNeeded+0xa6
mongod.exe  ...\src\mongo\util\assert_util.cpp(141)                          mongo::uasserted+0x144
mongod.exe  ...\src\mongo\db\matcher.cpp(285)                                mongo::Matcher::addOp+0x127
mongod.exe  ...\src\mongo\db\matcher.cpp(493)                                mongo::Matcher::parseMatchExpressionElement+0x36a
mongod.exe  ...\src\mongo\db\matcher.cpp(532)                                mongo::Matcher::Matcher+0x165
mongod.exe  ...\src\mongo\db\matcher_covered.cpp(34)                         mongo::CoveredIndexMatcher::CoveredIndexMatcher+0x48
mongod.exe  ...\src\mongo\db\queryoptimizer.cpp(482)                         mongo::QueryPlan::matcher+0x85
mongod.exe  ...\src\mongo\db\queryoptimizercursorimpl.cpp(823)               mongo::CursorGenerator::singlePlanCursor+0x12f
mongod.exe  ...\src\mongo\db\queryoptimizercursorimpl.cpp(845)               mongo::CursorGenerator::generate+0xad
mongod.exe  ...\src\mongo\db\queryoptimizercursorimpl.cpp(724)               mongo::NamespaceDetailsTransient::getCursor+0x79
mongod.exe  ...\src\mongo\db\ops\update.cpp(215)                             mongo::_updateObjects+0x6ff
mongod.exe  ...\src\mongo\db\ops\update.cpp(487)                             mongo::updateObjects+0x288
mongod.exe  ...\src\mongo\db\oplog.cpp(779)                                  mongo::applyOperation_inlock+0x4d9
mongod.exe  ...\src\mongo\db\repl\rs_sync.cpp(82)                            mongo::replset::SyncTail::syncApply+0x23e
mongod.exe  ...\src\mongo\db\repl\rs_sync.cpp(114)                           mongo::replset::multiSyncApply+0x4d
mongod.exe  ...\src\mongo\util\concurrency\thread_pool.cpp(66)               mongo::threadpool::Worker::loop+0x94
mongod.exe  ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(180)  boost::`anonymous namespace'::thread_start_function+0x21
mongod.exe  f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(314)      _callthreadstartex+0x17
mongod.exe  f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(292)      _threadstartex+0x7f
Wed Aug 22 23:56:04 [repl writer worker 1] ERROR: writer worker caught exception: invalid operator: $csharpnull on: { ts: Timestamp 1345694163000|1, h: 4794769045432243117, op: "i", ns: "test.test", o: { _id: { $csharpnull: true } } }
Wed Aug 22 23:56:04 [repl writer worker 1]   Fatal Assertion 16360
mongod.exe    ...\src\mongo\util\stacktrace.cpp(158)                                      mongo::printStackTrace+0x3e
mongod.exe    ...\src\mongo\util\assert_util.cpp(126)                                     mongo::fassertFailed+0x43
mongod.exe    ...\src\mongo\db\repl\rs_sync.cpp(118)                                      `mongo::replset::multiSyncApply'::`1'::catch$0+0xb5
mongod.exe    f:\dd\vctools\crt_bld\SELF_64_amd64\crt\prebuild\eh\amd64\handlers.asm(44)  _CallSettingFrame+0x20
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\prebuild\eh\frame.cpp(1337)         __CxxCallCatchBlock+0xeb
ntdll.dll                                                                                 RtlRestoreContext+0x2e2
mongod.exe    ...\src\mongo\db\repl\rs_sync.cpp(114)                                      mongo::replset::multiSyncApply+0x4d
mongod.exe    ...\src\mongo\util\concurrency\thread_pool.cpp(66)                          mongo::threadpool::Worker::loop+0x94
mongod.exe    ...\src\third_party\boost\libs\thread\src\win32\thread.cpp(180)             boost::`anonymous namespace'::thread_start_function+0x21
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(314)                 _callthreadstartex+0x17
mongod.exe    f:\dd\vctools\crt_bld\self_64_amd64\crt\src\threadex.c(292)                 _threadstartex+0x7f
kernel32.dll                                                                              BaseThreadInitThunk+0xd
Wed Aug 22 23:56:04 [repl writer worker 1]
 
***aborting after fassert() failure
 



 Comments   
Comment by Matt Dannenberg [ 31/Jul/13 ]

I believe Scott's fix for SERVER-7557 which changed okForStorage fixed this as well.

However, I am not certain of how to demonstrate/check that this is no longer a problem as it used to get blocked at some levels anyway and needed a broken driver to test.

Comment by Tad Marshall [ 27/Aug/12 ]

The Fatal Assertion is new in 2.2.0-rc1, but the issue with dollar signs in _id subfields interfering with replication was also seen in 2.0.3 – see SERVER-6537.

Comment by Craig Wilson [ 23/Aug/12 ]

Ok, after more expirimentation, it seems like this problem occurs only on documents that have an embedded document as an _id value where the first element starts with a $.

OK:
{ _id :

{ a : 1}

}
{ _id :

{ a : 1, $b : 2 }

}

NOT OK:
{ _id :

{ $a : 1 }

}
{ _id :

{ $a : 1, b : 2 }

}
{ _id :

{ $a : 1, $b : 2 }

}

The NOT OK's crash the secondaries upon replication and prevent recovery because they will try to catch up and hit the same issue.

I would assume this case is extremely rare, so probably not as high a priority. Someone should probably confirm this though...

Also, it should be noted that the C# driver prevents any field with a $ except for a few preserved for legacy reasons, so that even lessens the probability that this will affect someone.

Comment by Craig Wilson [ 23/Aug/12 ]

While it is an edge case with the C# driver to automatically persist this field name, there is no accounting for how many people have prefixed a field name with a $(intentionally or otherwise) and whether or not this is the cause. Until this is investigated and it's ramifications clarified, it should probably be classified as major or critical.

Comment by Tad Marshall [ 23/Aug/12 ]

Fixed line breaks in log output to improve readability.

Generated at Thu Feb 08 03:12:49 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.