[SERVER-6904] Inner Collection (w/ > 10000 entries) with Indexed Fields Dramatically hurt performance when adding new fields on the outer document or entries in the inner collection Created: 30/Aug/12  Updated: 15/Feb/13  Resolved: 21/Jan/13

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 2.2.0-rc0, 2.2.0
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Gabriel Lesperance Assignee: Aaron Staple
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Behavior witnessed on OS X & Ubuntu 12.04


Issue Links:
Duplicate
duplicates SERVER-8192 Optimize btree key generation Closed
duplicates SERVER-8193 Optimize in place updates that modify... Closed
Operating System: ALL
Participants:

 Description   

The following has been tested on mongoDB v2.2.0rc0 & mongoDB v2.2.0:

      1. The test:

First initialize the database:

> use test
> db.dropDatabase()
> db.createCollection('testdocuments')
> db.testdocuments.insert(

{ _id : ObjectId("000000000000000000000001") }

)

Then populate a document's inner collection with objects:

> for ( var i = 1; i < 10000; i++) { db.testdocuments.update(

{ _id : ObjectId("000000000000000000000001") }

, { $push : { 'a' :

{ uid1 : i, uid2: i }

} }) }

Then add a new outer field:

> db.testdocuments.update(

{ _id : ObjectId("000000000000000000000001") }

, { $set :

{ 'aNewOuterField' : 1 }

})

All is well and the update takes normal time. But, if those inner documents are indexed:

> db.testdocuments.ensureIndex(

{ 'a.uid1' : 1 }

)

mongodb-console-> Thu Aug 30 14:17:31 [conn4] build index test.testdocuments

{ a.uid1: 1.0 }

mongodb-console-> Thu Aug 30 14:17:34 [conn4] build index done. scanned 1 total records. 2.244 secs
mongodb-console-> Thu Aug 30 14:17:34 [conn4] insert test.system.indexes keyUpdates:0 locks(micros) w:2244536 2244ms

and we add another outer field :

> db.testdocuments.update(

{ _id : ObjectId("000000000000000000000001") }

, { $set :

{ 'anotherOuterField' : 1 }

})

Then we get the MongoDB console reporting the action as slow:

mongodb-console-> Thu Aug 30 14:20:01 [conn4] update test.testdocuments query:

{ _id: ObjectId('000000000000000000000001') }

update: { $set:

{ anotherOuterField: 1.0 }

} idhack:1 nupdated:1 keyUpdates:0 locks(micros) w:4191036 4191ms

Indeed, the last operation has been reported to have taken more than 4 seconds even though there was no index modification at all.

This might be due to the document being moved but setting the profiler 'on',

> db.setProfilingLevel(1)

and adding another outer field:

> db.testdocuments.update(

{ _id : ObjectId("000000000000000000000001") }

, { $set :

{ 'oneLastOuterField' : 1 }

})

gives us the following:

> db.system.profile.find().pretty()

{
"ts" : ISODate("2012-08-30T18:23:03.053Z"),
"op" : "update",
"ns" : "test.testdocuments",
"query" :

{ "_id" : ObjectId("000000000000000000000001") }

,
"updateobj" : {
"$set" :

{ "oneLastOuterField" : 1 }

},
"idhack" : true,
"nupdated" : 1,
"keyUpdates" : 0,
"numYield" : 0,
"lockStats" : {
"timeLockedMicros" :

{ "r" : NumberLong(0), "w" : NumberLong(4064791) }

,
"timeAcquiringMicros" :

{ "r" : NumberLong(0), "w" : NumberLong(47) }

},
"millis" : 4064,
"client" : "127.0.0.1",
"user" : ""
}

which definitely shows that the delay isn't due to the document being 'moved'.

Adding to the inner collection is not faster either as

> db.testdocuments.update(

{ _id : ObjectId("000000000000000000000001") }

, { $push : { 'a' :

{ uid1 : -1, uid2: -1 }

} })

gives us the following profile

> db.system.profile.find().pretty()

{
"ts" : ISODate("2012-08-30T18:32:34.080Z"),
"op" : "update",
"ns" : "test.testdocuments",
"query" :

{ "_id" : ObjectId("000000000000000000000001") }

,
"updateobj" : {
"$push" : {
"a" :

{ "uid1" : -1, "uid2" : -1 }

}
},
"nscanned" : 1,
"nupdated" : 1,
"keyUpdates" : 1,
"numYield" : 0,
"lockStats" : {
"timeLockedMicros" :

{ "r" : NumberLong(0), "w" : NumberLong(4057496) }

,
"timeAcquiringMicros" :

{ "r" : NumberLong(0), "w" : NumberLong(213) }

},
"millis" : 4057,
"client" : "127.0.0.1",
"user" : ""
}



 Comments   
Comment by Gabriel Lesperance [ 21/Jan/13 ]

No problem. Thanks for the support.

Comment by Aaron Staple [ 21/Jan/13 ]

Hi Gabriel - I'm going to resolve this as a duplicate of SERVER-8192 and SERVER-8193. Please feel free to reopen if you have additional questions.

Comment by Aaron Staple [ 16/Jan/13 ]

Hi Gabriel,

When an update is requested that can't be accomplished by simply modifying the value of an existing field in place (think $inc modifying a field in place), and the document does not need to move to a new disk location, the update code will do the following:

1) extract all index keys for the old version of the document
2) extract all index keys for the new version of the document
3) find the index keys that were added or removed, and update the indexes appropriately

The slowdown you are seeing results from steps 1 and 2. After you add your a.uid index, steps 1 and 2 each need to extract 10k keys and store them in an std::set. (Before you added the index in your test, steps 1 and 2 did not need to be run for this index.)

I've filed two tickets to work on relevant optimizations:

SERVER-8192 is for optimizing the code that extracts the index keys from a document
SERVER-8193 is for optimizing update's index handling code in cases where there are no mods on fields belonging to the index

Comment by Gabriel Lesperance [ 09/Jan/13 ]

Any progress on this?

Generated at Thu Feb 08 03:13:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.