[SERVER-11670] attempt to create a large collection in local db causes continuous asserts Created: 12/Nov/13  Updated: 11/Jul/16  Resolved: 13/Feb/14

Status: Closed
Project: Core Server
Component/s: Concurrency
Affects Version/s: 2.5.3
Fix Version/s: 2.6.0-rc0

Type: Bug Priority: Major - P3
Reporter: Asya Kamsky Assignee: Eric Milkie
Resolution: Done Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-8579 Consolidate Mongod Lock/Resource Sche... Closed
Operating System: ALL
Steps To Reproduce:

Run a large aggregation (of oplog.rs, for example) with a $out stage.

db.oplog.rs.aggregate({$project:{ts:1,h:1,op:1,oid:"$o._id",s:{$concat:["$o.shard","$ns"]}}},{$out:"newCollection"})

 Description   

I was running aggregations and writing the output to a file, but the database I was analyzing was the "local" DB. Apparently, if _aCommitIsNeeded detects that we hold a write lock on the "local" DB, it fasserts(), but since I'm running a regular build, it just keeps fasserting until I kill -9 it.



 Comments   
Comment by Githook User [ 13/Feb/14 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-11670 do not print stack trace if attempting to commit journal while in local lock
Branch: master
https://github.com/mongodb/mongo/commit/0003b39a16344bad7fbfa765863d3c712bff3aaa

Comment by Asya Kamsky [ 25/Jan/14 ]

edward.norris@lumension.com regardless of this ticket, it is not OK to write anything to the local database. It is reserved for MongoDB's own use, and using it for user collections (to avoid replicating them) is not guaranteed to be problem-free (beyond the problem you have already run into).

Comment by Ed Norris [ 23/Jan/14 ]

I'm seeing something similar with 2.4.9 on Windows.
When I write a bunch of data (about 100MB in my tests) to the "local" DB, it will occasionally spend 30-90 seconds emitting stack traces and errors like "ERROR: can't commitNow from commitIfNeeded, as we are in local db lock".

(The reason I'm writing to the local DB is that I'm doing a mapreduce operation on a replica set, and I figure it will save a little time if the raw pre-reduced data doesn't get replicated. The output of the mapreduce operation goes to a replicated DB. If this is a bad idea, let me know.)

Comment by Asya Kamsky [ 07/Dec/13 ]

I had about 4700 of these in my log:

2013-11-12T13:44:37.104-0800 [conn1] ERROR: can't commitNow from commitIfNeeded, as we are in local db lock
2013-11-12T13:44:37.110-0800 [conn1] 0x10060f60b 0x1001d85b4 0x1001d6a49 0x10028e521 0x10028eb14 0x1002952a6 0x10028aea1 0x100040c96 0x100340103 0x10033f2f6 0x10016867e 0x10016828e 0x10016787d 0x1001a9d95 0x1001aab46 0x1001ab77c 0x1003053ae 0x100305e9d 0x100293d2a 0x1000076e4
 0   mongod                              0x000000010060f60b _ZN5mongo15printStackTraceERSo + 43
 1   mongod                              0x00000001001d85b4 _ZN5mongo3dur11DurableImpl16_aCommitIsNeededEv + 348
 2   mongod                              0x00000001001d6a49 _ZN5mongo3dur11DurableImpl14commitIfNeededEb + 49
 3   mongod                              0x000000010028e521 _ZN5mongo11insertMultiEbPKcRSt6vectorINS_7BSONObjESaIS3_EERNS_5CurOpE + 113
 4   mongod                              0x000000010028eb14 _ZN5mongo14receivedInsertERNS_7MessageERNS_5CurOpE + 1316
 5   mongod                              0x00000001002952a6 _ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE + 7318
 6   mongod                              0x000000010028aea1 _ZN5mongo14DBDirectClient3sayERNS_7MessageEbPSs + 107
 7   mongod                              0x0000000100040c96 _ZN5mongo12DBClientBase6insertERKSsRKSt6vectorINS_7BSONObjESaIS4_EEi + 782
 8   mongod                              0x0000000100340103 _ZN5mongo17DocumentSourceOut5spillEPNS_12DBClientBaseERKSt6vectorINS_7BSONObjESaIS4_EE + 29
 9   mongod                              0x000000010033f2f6 _ZN5mongo17DocumentSourceOut7getNextEv + 716
 10  mongod                              0x000000010016867e _ZN5mongo14PipelineCursor7getNextEv + 64
 11  mongod                              0x000000010016828e _ZN5mongo14PipelineCursor2okEv + 30
 12  mongod                              0x000000010016787d _ZN5mongo15PipelineCommand3runERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb + 2505
 13  mongod                              0x00000001001a9d95 _ZN5mongo12_execCommandEPNS_7CommandERKSsRNS_7BSONObjEiRSsRNS_14BSONObjBuilderEb + 37
 14  mongod                              0x00000001001aab46 _ZN5mongo7Command11execCommandEPS0_RNS_6ClientEiPKcRNS_7BSONObjERNS_14BSONObjBuilderEb + 1920
 15  mongod                              0x00000001001ab77c _ZN5mongo12_runCommandsEPKcRNS_7BSONObjERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi + 1388
 16  mongod                              0x00000001003053ae _ZN5mongo11runCommandsEPKcRNS_7BSONObjERNS_5CurOpERNS_11_BufBuilderINS_16TrivialAllocatorEEERNS_14BSONObjBuilderEbi + 46
 17  mongod                              0x0000000100305e9d _ZN5mongo8runQueryERNS_7MessageERNS_12QueryMessageERNS_5CurOpES1_ + 2301
 18  mongod                              0x0000000100293d2a _ZN5mongo16assembleResponseERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE + 1818
 19  mongod                              0x00000001000076e4 _ZN5mongo16MyMessageHandler7processERNS_7MessageEPNS_21AbstractMessagingPortEPNS_9LastErrorE + 308

Which seems to be caused by this:

                        printStackTrace();
                        dassert(false); // this will make _DEBUG builds terminate. so we will notice in buildbot.
                        return false;

So, dassert which I thought I tracked down to

src/mongo/util/assert_util.h:# define MONGO_dassert(x) fassert(16199, (x))

but I see that's only on _DEBUG builds.

So, I guess it's not an assert, just an infinite loop of stack traces...

Comment by Andy Schwerin [ 06/Dec/13 ]

asya, you mean that it masserts, right? fasserts are fatal.

Generated at Thu Feb 08 03:26:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.