[SERVER-3594] new crash in 1.9.1: couldn't make room for new record (len: 135548) in capped ns <database>.system.profile Created: 15/Aug/11  Updated: 11/Jul/16  Resolved: 16/Sep/11

Status: Closed
Project: Core Server
Component/s: Stability
Affects Version/s: 1.9.1
Fix Version/s: 2.0.0-rc2

Type: Bug Priority: Major - P3
Reporter: Paul Harvey Assignee: Mathias Stearn
Resolution: Done Votes: 0
Labels: crash, insert
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Single-core VMWare 4GB RAM Ubuntu 10.04.3 LTS instance
2.6.32-33-server #71-Ubuntu SMP Wed Jul 20 17:42:25 UTC 2011 x86_64 GNU/Linux
Single stand-alone mongod, no sharding, no replica
Using auth, noprealloc, nosmallfiles, directoryperdb
mongod --version
db version v1.9.0, pdfile version 4.5
Mon Aug 15 10:52:08 git version: 434d8b47e569578d932e3caedd3bb16e8c0836a3

This is from mongodb-10gen-unstable package

Foswiki trunk svn rev 12313 with CPAN MongoDB version 0.43


Attachments: File example.json     File foswiki.log     File mongod-nightly.log     File mongod-nightly2.log     File mongod-nightly3.log     File mongod.log    
Issue Links:
Related
is related to SERVER-3618 Need to check size of object before t... Closed
Participants:

 Description   

We have reverted to MongoDB 1.9.0 because of a new crash in 1.9.1, triggered during the initial import of our data. The crash happened reliably (even after stopping & deleting /var/lib/mongodb/* & reimporting) until downgrading to 1.9.0.

Running the import - it works okay initially, until it gets to some 'interesting' data (attached an example), which is interesting because it has many fields which Foswiki indexes (many more than the 64 limit; basically, all the FIELD.<foo>.value fields). For what it's worth, the code that adds these indexes is at https://github.com/foswiki/MongoDBPlugin/blob/master/lib/Foswiki/Plugins/MongoDBPlugin.pm#L309

I have been trying to reproduce on our test environment so I can get some verbose log output, but as usual, it's working fine there...



 Comments   
Comment by Paul Harvey [ 14/Sep/11 ]

My apologies for the lengthy delay, fighting deadlines.

Success! Did a several full import runs with mongodb-10gen package
db version v2.0.0, pdfile version 4.5
Wed Sep 14 14:18:35 git version: 695c67dff0ffc361b8568a13366f027caa406222

I guess we can close this now. Thank you

Comment by Mathias Stearn [ 13/Sep/11 ]

How did the test go?

Comment by Paul Harvey [ 06/Sep/11 ]

Sorry for the delay, I will try to test today.

Comment by Mathias Stearn [ 06/Sep/11 ]

Have you been able to test with one of the RCs? I think this has been resolved, but I'd rather not close this case without confirmation.

Comment by Mathias Stearn [ 01/Sep/11 ]

Could you make sure that you don't have a script that is creating the system.profile collection to be 128k? It should now default to 1MB.

Could you also include the full log if this happens again?

Comment by Mathias Stearn [ 01/Sep/11 ]

Have you been able to retest? Based on the log from the 24th, it looks like you were using a pre-existing system.profile collection.

Comment by Paul Harvey [ 26/Aug/11 ]

Do you mean with a more recent build? The last run was done after rm -rf /var/lib/mongodb/* (or at least, I'm pretty sure).

Will try to test again next week.

Comment by Mathias Stearn [ 25/Aug/11 ]

Could you try dropping the system.profile collection? It should now default to 1MB.

Comment by Mathias Stearn [ 25/Aug/11 ]

I reopened. Looking into this now.

Comment by Paul Harvey [ 24/Aug/11 ]

Either I'm an idiot, or I don't have permission to re-open. Anyway, I hope somebody who sees this can re-open it for me

Comment by Paul Harvey [ 24/Aug/11 ]

Although it seems to get through much more data before failing, it does still fail.

Attached mongod-nightly3.log

Comment by Paul Harvey [ 24/Aug/11 ]

Re-confirmed crashes during import using the old
mongod --version
db version v2.0.0-rc0-pre-, pdfile version 4.5
Wed Aug 24 13:04:39 git version: 4edeb8f2b0050dc918a7d935afd458dd05ff515a

Now waiting for a new import to complete, using the new nightly
mongod --version
db version v2.0.0-rc0-pre-, pdfile version 4.5
Wed Aug 24 13:06:41 git version: 278f5cee2c98f9f0edac641111a6794e9c828f96

Comment by Mathias Stearn [ 23/Aug/11 ]

Please reopen if you still see this with the latest nightlies

Comment by Mathias Stearn [ 22/Aug/11 ]

OK, the latest nightly on our site should now have the fix. Could you d/l it and try again?

Comment by Paul Harvey [ 18/Aug/11 ]

The import failed again. Edit: build 4edeb8f2b0050dc918a7d935afd458dd05ff515a is older than the fixes committed, will wait for the next nightly.

mongod-nightly2.log comes from 4edeb8f2b0050dc918a7d935afd458dd05ff515a

Comment by Paul Harvey [ 18/Aug/11 ]

Now testing:

db version v2.0.0-rc0-pre-, pdfile version 4.5
Thu Aug 18 17:44:18 git version: 4edeb8f2b0050dc918a7d935afd458dd05ff515a

Still running, but seems to be getting further than usual.

Immediately prior to the current import run, I have confirmed again that the test environment crashed with db version v1.9.2-pre-, pdfile version 4.5
Thu Aug 18 17:05:21 git version: 7cad9757de992d4db6f6e991dedd970d9a17d0c0

Will report tomorrow if it runs to completion.

Comment by Paul Harvey [ 17/Aug/11 ]

Awesome, thanks! Will try the next nightly.

Comment by Mathias Stearn [ 16/Aug/11 ]

I think I've fixed this issue. Please either pull from master and rebuild or wait until the next nightly comes out and try again with profiling on.

Good catch by the way!

Comment by Mathias Stearn [ 16/Aug/11 ]

I think I found the issue and it is definitely in profile. For now I'd suggest leaving it disabled unless some part of your app depends on it.

Comment by Paul Harvey [ 16/Aug/11 ]

The import ran to completion, with profiling disabled. It's enabled again, waiting on another run with objectcheck enabled..

Comment by Paul Harvey [ 16/Aug/11 ]

Okay, I'll those two things in isolation first.

Indeed, 1.9 in production, and we do notice the "NOT RECOMMENDED FOR PRODUCTION" in the logs

Foswiki makes terribly JS-heavy queries, necessary to emulate the full expressiveness of Foswiki's native QuerySearch, which made 1.8.1 & 2 crash in terribly hard to reproduce ways (see SERVER-3131). 1.9.0 has been surprisingly solid. Anyway, we're a research team, so normal rules don't apply

Comment by Mathias Stearn [ 16/Aug/11 ]

It looks like if is dieing in the profiler. just to confirm, could you try disabling it to see if it still crashes? Also, try starting with --objcheck to make sure there are no garbage documents trying to make their way in.

PS - are you using 1.9 for a production deployment? We recommend running even releases since the odd ones are unstable dev builds. That said, I'm glad we have brave souls who can find bugs like this in the wild.

Comment by Paul Harvey [ 16/Aug/11 ]

Now testing db version v1.9.2-pre-, pdfile version 4.5
Tue Aug 16 12:06:21 git version: 7cad9757de992d4db6f6e991dedd970d9a17d0c0

Update: It failed again, somewhere else. Attached mongod-nightly.log, somewhat [redacted], which ran with verbose=true set in /etc/mongodb.conf.

So it seems the "many fields" document detail is a red herring. The document it was crashing on this time had nothing remarkable about it. It reliably (twice or more) crashes in the same place on each import run, both days I tried a > 1.9.0 mongod, even after stop, delete /var/lib/mongodb/* & start.

Let me know if you want me to try something else.

Comment by Mathias Stearn [ 15/Aug/11 ]

Could you test with the latest 1.9 nightly? There has been a lot of changes since 1.9.1.

Comment by Paul Harvey [ 15/Aug/11 ]

Created http://foswiki.org/Tasks/Item11042 - I'm wondering if the failing assert is new since 1.9.0?

Comment by Paul Harvey [ 15/Aug/11 ]

I'm not sure if calling ensureIndex on > 64 fields is related at all here, it's just unusual that this is happening on the very first piece of data which has enough fields for this to happen.

The code isn't crashing in the ensureIndex bit - it's crashing in the bits that add a new topic.

Generated at Thu Feb 08 03:03:29 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.