[SERVER-7928] Deleting documents constantly corrupts collection Created: 13/Dec/12  Updated: 11/Jul/16  Resolved: 11/Apr/13

Status: Closed
Project: Core Server
Component/s: Concurrency, Index Maintenance, Querying
Affects Version/s: 2.2.2
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Artem Chivchalov Assignee: Thomas Rueckstiess
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Debian 5, Linux kernel 2.6.26-2-amd64, mongod 2.2.2 Dec 13, php driver 1.3.0RC3-dev, node.js driver 1.2.-0


Attachments: Zip Archive mongodb.log.bugged.2.zip     Zip Archive mongodb.log.bugged.zip    
Issue Links:
Related
is related to SERVER-7511 update code does not properly determi... Closed
Operating System: Linux
Participants:

 Description   

I have a collection named "games" with rapidly changing content: documents are inserted every few seconds. The collection previously had a TTL index on the field "info.createtime".

The problem is that I constantly get this error on the collection:

Invalid BSONObj size: -286331154 (0xEEEEEEEE) first element: _id: ObjectId('50b5bafbae210be727000000')

I have found that it appears when documents are deleted while other clients are running queries that use that index. I tried removing the expireAfterSeconds attribute from the index and running a cron job that removes expired items instead (with the $atomic option), but the error remains the same.

I am stuck on this problem because the error occurs constantly. The console command db.repairDatabase() fixes the problem, but it reappears within the next few hours.

After further investigation I was able to work around this by calling reIndex() every time I remove expired items.

gamesCollection.remove( {'info.createtime': {$lt: hour_ago}, $atomic: 1}, {}, function() {
    gamesCollection.reIndex(function(err, result) {});
});

The error no longer appears, but this is an ugly workaround.
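The workaround described above can be sketched as a small helper that also surfaces errors from both steps instead of discarding them. This is only an illustration: the function name `removeExpiredAndReindex` and the `cutoff` parameter are hypothetical, and `collection` is assumed to expose `remove(selector, options, callback)` and `reIndex(callback)` in the same way as the `gamesCollection` object in the snippet above.

```javascript
// Sketch only: remove expired documents, then rebuild the collection's indexes.
// `collection` is assumed to behave like gamesCollection in the report.
// `cutoff` is a hypothetical parameter standing in for hour_ago.
function removeExpiredAndReindex(collection, cutoff, done) {
    var selector = { 'info.createtime': { $lt: cutoff }, $atomic: 1 };
    collection.remove(selector, {}, function (removeErr) {
        // Do not reindex if the remove itself failed.
        if (removeErr) { return done(removeErr); }
        collection.reIndex(function (reindexErr, result) {
            done(reindexErr || null, result);
        });
    });
}
```

It would be called as `removeExpiredAndReindex(gamesCollection, hour_ago, function (err, result) { ... })`. Rebuilding the indexes after every cleanup remains a stopgap rather than a fix.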

document example: http://pastebin.com/uFMrVfLR
db.games.getIndexes(): http://pastebin.com/Cz95ejK8
db.games.validate(): http://pastebin.com/BRfqp0kq



 Comments   
Comment by Stennie Steneker (Inactive) [ 11/Apr/13 ]

Hi Artem,

I'm closing this issue as a fix for SERVER-7511 has been included as of MongoDB 2.2.3.

If you are still seeing these assertions after upgrading to MongoDB 2.2.3 or newer, a single reIndex on the index mentioned in the error message should resolve the problem.

Regards,
Stephen

Comment by Thomas Rueckstiess [ 09/Jan/13 ]

Hi Artem,

Apologies for the delay. We've found a related bug that is likely causing this problem.

It manifests under these conditions:

  1. have an index (or part of a compound index) that reaches into an array, for example an index on {'a.b':1} and an example document {_id:0, 'a':[ {'b':'foo'} ] }

  2. update a document in the array with the $ operator syntax (line 1 below) or the array index syntax (line 2 below):

    {'$set': {'a.$': {'b':'bar'}}} 
    {'$set': {'a.0': {'b':'bar'}}}

You have an index on

{
  "info.createtime" : 1,
  "info.type" : 1,
  "players.leave" : 1,
  "info.params.competition" : 1
}

and I found updates of this type in your log file:

update: { $set: { players.1: { v: 26, id: 1, ...

So your case fulfills both conditions (the index part reaching into the array is players.leave).
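As a rough way to check other updates against the two conditions listed above, the following sketch compares the paths in an update's $set with the fields of an index key. The helper name `findAffectedIndexFields` is hypothetical and is not part of MongoDB or any driver. It is a heuristic: it treats a trailing `$` or numeric path segment as array element addressing, which is an assumption, since a numeric segment could also be an ordinary field name.

```javascript
// Sketch only: list the index fields that reach into an array whose element
// is replaced via 'a.$' or 'a.0' style $set paths (the two conditions above).
function findAffectedIndexFields(indexKeys, update) {
    var setDoc = (update && update.$set) || {};
    var affected = [];
    Object.keys(setDoc).forEach(function (path) {
        var parts = path.split('.');
        var last = parts[parts.length - 1];
        // Assumption: a trailing '$' or all-digit segment addresses an array element.
        var positional = last === '$' || /^\d+$/.test(last);
        if (parts.length < 2 || !positional) { return; }
        var arrayPrefix = parts.slice(0, -1).join('.') + '.';
        Object.keys(indexKeys).forEach(function (field) {
            // The index field must reach into the array, e.g. 'players.leave'.
            if (field.indexOf(arrayPrefix) === 0 && affected.indexOf(field) === -1) {
                affected.push(field);
            }
        });
    });
    return affected;
}
```

With the index key shown above and an update of the form `{ $set: { 'players.1': { ... } } }`, the helper reports `players.leave`, matching the analysis in this comment.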

We are currently working on fixing the bug. Please check SERVER-7511 for possible workarounds in the meantime.

Regards,
Thomas

Comment by Artem Chivchalov [ 27/Dec/12 ]

Any progress with this? Can I do something else to help you sort out this issue?

Comment by Artem Chivchalov [ 19/Dec/12 ]

Upgraded OS to Debian 6 and kernel 2.6.32-5-amd64, the problem persists.

Comment by Eliot Horowitz (Inactive) [ 17/Dec/12 ]

Can you send the full log from the server?
Ideally you could:
1) increase log verbosity
2) reindex
3) wait for it to happen again
4) send full log
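Steps 1 and 2 above could look roughly like this in the mongo shell. This is a sketch under assumptions: the function name `prepareForRepro` is hypothetical, the verbosity level 2 is an arbitrary choice not specified in this ticket, and the collection name `games` is taken from the description. `db` is assumed to be the shell's database handle for the affected database.

```javascript
// Sketch only: raise log verbosity at runtime, then rebuild the indexes
// of the affected collection so the next corruption starts from a clean state.
function prepareForRepro(db, level) {
    // setParameter with logLevel changes verbosity without a restart.
    var verbosity = db.adminCommand({ setParameter: 1, logLevel: level });
    // Rebuild all indexes on the "games" collection from the report.
    var reindex = db.games.reIndex();
    return { verbosity: verbosity, reindex: reindex };
}
```

In the shell this would be invoked as `prepareForRepro(db, 2)`, after which the server is left running until the error reappears and the full log is collected.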

Comment by Artem Chivchalov [ 15/Dec/12 ]

I connected mms-agent yesterday, in case it helps.

Comment by Artem Chivchalov [ 14/Dec/12 ]

I think so, but I am not totally sure. For now it is running on 2.2.2 with no TTL, and I continue to receive this error when deleting items without calling reIndex() right afterwards. The error is 100% reproducible after waiting a while. reIndex() always helps.

Comment by Eliot Horowitz (Inactive) [ 14/Dec/12 ]

Was this running with a TTL on 2.2.0 or 2.2.1?
