[SERVER-1779] "finding median" too slow. Created: 12/Sep/10  Updated: 16/Nov/21  Resolved: 13/Sep/10

Status: Closed
Project: Core Server
Component/s: Performance
Affects Version/s: 1.6.2, 1.7.0
Fix Version/s: 1.7.1

Type: Bug Priority: Major - P3
Reporter: Sergei Tulentsev Assignee: Eliot Horowitz (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

ubuntu 9.10
24G RAM

the db is on a RAID0 of 2 disks


Operating System: Linux
Participants:

 Description   

Here's an excerpt from the logs:
Sun Sep 12 10:51:14 [conn6] Finding median for index: { uiq: 1.0 } between { : 24430203 } and { : 25662603 } took 3161 ms.
Sun Sep 12 10:51:36 [conn6] Finding median for index: { uiq: 1.0 } between { : 26814034 } and { : 27937206 } took 14448 ms.
Sun Sep 12 10:52:10 [conn6] Finding median for index: { uiq: 1.0 } between { : 22297203 } and { : 23324407 } took 2012 ms.
Sun Sep 12 10:52:55 [conn6] Finding median for index: { uiq: 1.0 } between { : 60706232 } and { : 62958603 } took 23454 ms.
Sun Sep 12 10:53:05 [conn6] Finding median for index: { uiq: 1.0 } between { : 25662603 } and { : 26814034 } took 3050 ms.
Sun Sep 12 10:53:09 [conn6] Finding median for index: { uiq: 1.0 } between { : 23324407 } and { : 24430203 } took 2804 ms.
Sun Sep 12 10:53:34 [conn6] Finding median for index: { uiq: 1.0 } between { : 37270223 } and { : 38933407 } took 24029 ms.
Sun Sep 12 10:54:19 [conn6] Finding median for index: { uiq: 1.0 } between { : 40766434 } and { : 42625800 } took 13718 ms.
Sun Sep 12 10:54:53 [conn6] Finding median for index: { uiq: 1.0 } between { : 58696810 } and { : 60706232 } took 11311 ms.
Sun Sep 12 10:55:11 [conn6] Finding median for index: { uiq: 1.0 } between { : 30362400 } and { : 31572033 } took 9520 ms.
Sun Sep 12 10:55:35 [conn6] Finding median for index: { uiq: 1.0 } between { : 72665435 } and { : 96292200 } took 7396 ms.
Sun Sep 12 10:55:53 [conn6] Finding median for index: { uiq: 1.0 } between { : 68727613 } and { : 72665435 } took 7880 ms.
Sun Sep 12 10:56:04 [conn6] Finding median for index: { uiq: 1.0 } between { : 38933407 } and { : 40766434 } took 10045 ms.
Sun Sep 12 10:56:43 [conn6] Finding median for index: { uiq: 1.0 } between { : 35630403 } and { : 37270223 } took 7486 ms.
Sun Sep 12 14:31:02 [conn10] Finding median for index: { uiq: 1.0 } between { : 65599205 } and { : 68727613 } took 95402 ms.
Sun Sep 12 14:32:34 [conn10] Finding median for index: { uiq: 1.0 } between { : 62958603 } and { : 65599205 } took 72608 ms.
Sun Sep 12 14:34:19 [conn6] Finding median for index: { uiq: 1.0 } between { : 56850022 } and { : 58696810 } took 58778 ms.
Sun Sep 12 14:35:20 [conn6] Finding median for index: { uiq: 1.0 } between { : 55276200 } and { : 56850022 } took 50957 ms.
Sun Sep 12 14:43:17 [conn6] Finding median for index: { uiq: 1.0 } between { : 45860418 } and { : 47944851 } took 47859 ms.
Sun Sep 12 14:44:10 [conn6] Finding median for index: { uiq: 1.0 } between { : 51170454 } and { : 52417803 } took 38282 ms.
Sun Sep 12 14:44:47 [conn6] Finding median for index: { uiq: 1.0 } between { : 53762457 } and { : 55276200 } took 27610 ms.
Sun Sep 12 14:45:55 [conn6] Finding median for index: { uiq: 1.0 } between { : 42625800 } and { : 44283041 } took 34589 ms.
Sun Sep 12 14:46:19 [conn6] Finding median for index: { uiq: 1.0 } between { : 47944851 } and { : 49876203 } took 17147 ms.
Sun Sep 12 14:47:07 [conn6] Finding median for index: { uiq: 1.0 } between { : 49876203 } and { : 51170454 } took 23866 ms.
Sun Sep 12 14:47:48 [conn6] Finding median for index: { uiq: 1.0 } between { : 44283041 } and { : 45860418 } took 18224 ms.
Sun Sep 12 14:48:53 [conn6] Finding median for index: { uiq: 1.0 } between { : 52417803 } and { : 53762457 } took 21154 ms.

The indexes surely fit in memory. The whole DB does!

> db.answers.stats()
{
    "sharded" : true,
    "ns" : "pravdorub_production.answers",
    "count" : 273039209,
    "size" : 33919877756,
    "avgObjSize" : 124.23079410547223,
    "storageSize" : 39745093120,
    "nindexes" : 2,
    "nchunks" : 92,
    "shards" : {
        "brussels" : {
            "ns" : "pravdorub_production.answers",
            "count" : 136990615,
            "size" : 17019609560,
            "avgObjSize" : 124.23923755652896,
            "storageSize" : 19872546560,
            "numExtents" : 40,
            "nindexes" : 2,
            "lastExtentSize" : 1991168256,
            "paddingFactor" : 1,
            "flags" : 1,
            "totalIndexSize" : 14175980928,
            "indexSizes" : { "_id_" : 6682338992, "uiq_1" : 7493641936 },
            "ok" : 1
        },
        "moscow" : {
            "ns" : "pravdorub_production.answers",
            "count" : 136048594,
            "size" : 16900268196,
            "avgObjSize" : 124.22229219068592,
            "storageSize" : 19872546560,
            "numExtents" : 40,
            "nindexes" : 2,
            "lastExtentSize" : 1991168256,
            "paddingFactor" : 1,
            "flags" : 1,
            "totalIndexSize" : 13670329728,
            "indexSizes" : { "_id_" : 6047942320, "uiq_1" : 7622387408 },
            "ok" : 1
        }
    },
    "ok" : 1
}

> db.stats()
{
    "raw" : {
        "brussels/moskau8:27018" : { "collections" : 3, "objects" : 137156015, "avgObjSize" : 124.23939636916398, "dataSize" : 17040180512, "storageSize" : 19872554752, "numExtents" : 42, "indexes" : 2, "indexSize" : 14192291200, "fileSize" : 45009076224, "ok" : 1 },
        "moscow/moskau7:27018" : { "collections" : 3, "objects" : 136252914, "avgObjSize" : 124.22246986952514, "dataSize" : 16925673504, "storageSize" : 19872554496, "numExtents" : 42, "indexes" : 2, "indexSize" : 13691178368, "fileSize" : 45009076224, "ok" : 1 }
    },
    "objects" : 273408929,
    "avgObjSize" : 124.23096107442782,
    "dataSize" : 33965854016,
    "storageSize" : 39745109248,
    "numExtents" : 84,
    "indexes" : 4,
    "indexSize" : 27883469568,
    "fileSize" : 90018152448,
    "ok" : 1
}



 Comments   
Comment by Eliot Horowitz (Inactive) [ 13/Sep/10 ]

They won't block.
It's possible it could make the split point non-ideal, but the system already expects that can happen.

Comment by Sergei Tulentsev [ 13/Sep/10 ]

Here's a question: after splitVector yields, what happens to writes that affect records in the key range? Will they wait in a queue without affecting other queries, or will they block?

Comment by Eliot Horowitz (Inactive) [ 13/Sep/10 ]

Made splitVector yield, so even when it takes a long time it won't block.
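
Conceptually, yielding means the index scan no longer holds the write lock for its entire duration; it releases and reacquires the lock periodically so queued writes get a chance to run. A toy C++ analogue of that pattern (illustration only, with made-up names; not the actual d_split.cpp change):

// Toy analogue of the yielding pattern (assumption for illustration; this is
// not MongoDB's actual locking code). A long scan over shared data drops the
// lock between batches, so writers queued on the same lock run instead of
// waiting for the whole scan to finish.
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <vector>

std::mutex dataLock;                 // stands in for the server's write lock
std::vector<long long> indexKeys;    // stands in for the { uiq: 1 } index keys

long long scanWithYields(std::size_t batchSize) {   // batchSize > 0 assumed
    long long scanned = 0;
    std::size_t pos = 0;
    while (true) {
        std::lock_guard<std::mutex> guard(dataLock);              // lock held per batch only
        std::size_t end = std::min(pos + batchSize, indexKeys.size());
        for (; pos < end; ++pos) {
            ++scanned;                                             // "visit" one index key
        }
        if (pos >= indexKeys.size())
            return scanned;
        // guard goes out of scope here: the lock is released between batches
        // (the "yield"), letting a concurrent writer waiting on dataLock proceed.
    }
}

The trade-off is the one noted above: a write that slips in during a yield can shift the data slightly, so the computed split point may end up non-ideal, which the system already tolerates.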

Comment by auto [ 13/Sep/10 ]

Author:

{'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'}

Message: yield in splitVector so even if it takes a while won't block SERVER-1779
http://github.com/mongodb/mongo/commit/5a4bfedb88fe94ed1a8de5eeeaf0b4985c22b4d8

Comment by Sergei Tulentsev [ 13/Sep/10 ]

That file doesn't even show up in the project tree. Interesting.

And here is the new data from the field. 4 splits in a row.

Mon Sep 13 19:17:12 [conn15] Finding the split vector for pravdorub_production.answers over { uiq: 1.0 } keyCount: 845625 numSplits: 4 took 7238 ms.
Mon Sep 13 19:17:12 [conn15] query admin.$cmd ntoreturn:1 command: { splitVector: "pravdorub_production.answers", keyPattern: { uiq: 1.0 }, min: { uiq: 22297203 }, max: { uiq: 22764634 }, maxChunkSizeBytes: 209715200 } reslen:137 7239ms

Mon Sep 13 19:17:21 [conn15] Finding the split vector for pravdorub_production.answers over { uiq: 1.0 } keyCount: 845625 numSplits: 4 took 5989 ms.
Mon Sep 13 19:17:21 [conn15] query admin.$cmd ntoreturn:1 command: { splitVector: "pravdorub_production.answers", keyPattern: { uiq: 1.0 }, min: { uiq: 25036348 }, max: { uiq: 25662603 }, maxChunkSizeBytes: 209715200 } reslen:137 5990ms

Mon Sep 13 19:17:28 [conn15] Finding the split vector for pravdorub_production.answers over { uiq: 1.0 } keyCount: 845625 numSplits: 4 took 5955 ms.
Mon Sep 13 19:17:28 [conn15] query admin.$cmd ntoreturn:1 command: { splitVector: "pravdorub_production.answers", keyPattern: { uiq: 1.0 }, min: { uiq: 25662603 }, max: { uiq: 26296816 }, maxChunkSizeBytes: 209715200 } reslen:137 5955ms

Mon Sep 13 19:17:39 [conn15] Finding the split vector for pravdorub_production.answers over { uiq: 1.0 } keyCount: 845625 numSplits: 4 took 10032 ms.
Mon Sep 13 19:17:39 [conn15] query admin.$cmd ntoreturn:1 command: { splitVector: "pravdorub_production.answers", keyPattern: { uiq: 1.0 }, min: { uiq: 26296816 }, max: { uiq: 26814034 }, maxChunkSizeBytes: 209715200 } reslen:137 10033ms

On another shard, however, it's a bit worse.
Mon Sep 13 19:17:38 [conn11] Finding the split vector for pravdorub_production.answers over { uiq: 1.0 } keyCount: 845625 numSplits: 4 took 73671 ms.
Mon Sep 13 19:17:38 [conn11] query admin.$cmd ntoreturn:1 command: { splitVector: "pravdorub_production.answers", keyPattern: { uiq: 1.0 }, min: { uiq: 21323400 }, max: { uiq: 21776428 }, maxChunkSizeBytes: 209715200 } reslen:137 74327ms

Just a reminder: this is happening under heavy write load. One mongos is mongoimporting data as fast as it can.

Comment by Eliot Horowitz (Inactive) [ 13/Sep/10 ]

xcode...

erh@erh-tm1 ~/work/mongo -> git grep "doing delete inline"
s/d_migrate.cpp: log() << "doing delete inline" << endl;

Comment by Sergei Tulentsev [ 13/Sep/10 ]

Nope, mongos didn't get upgraded.
Will report later.

By the way, I tried searching the sources for the message text, like "Finding median" or "doing delete inline", and found nothing. Either my XCode can't search or these strings are not there. Which is it?

Comment by Eliot Horowitz (Inactive) [ 13/Sep/10 ]

Did you upgrade mongos and mongod?
Shouldn't be getting that medianKey message at all with this build

Comment by Sergei Tulentsev [ 13/Sep/10 ]

Yes, I saw your commit.

sergio@cs2607:~$ mongod --version
db version v1.7.1-pre-, pdfile version 4.5
Mon Sep 13 18:10:46 git version: 03a596e18b839a452b935141c329e4a1a6fa9b9f

Comment by Eliot Horowitz (Inactive) [ 13/Sep/10 ]

Can you send the git hash of the version you're running?
It got pushed at 7:30 am Eastern today.

Comment by Sergei Tulentsev [ 13/Sep/10 ]

There's nothing new in the logs from the last nightly. I guess your change didn't make it in.

Mon Sep 13 16:39:57 [conn6] Finding median for index: { uiq: 1.0 } between { : 51170454 } and { : 52417803 } took 29713 ms.
Mon Sep 13 16:39:57 [conn6] query admin.$cmd ntoreturn:1 command: { medianKey: "pravdorub_production.answers", keyPattern: { uiq: 1.0 }, min: { uiq: 51170454 }, max: { uiq: 52417803 } } reslen:75 29743ms
Mon Sep 13 16:39:57 [conn6] Finding size for ns: pravdorub_production.answers between { : 51170454 } and { : 52417803 } took 432 ms.

When you test it on your laptop, do you have some other activity going on? Like heavy writing?

I don't really mind if it takes a minute to find the median key. I just hate that it blocks the whole thing.

We definitely need some better granularity for locks.

Comment by Eliot Horowitz (Inactive) [ 13/Sep/10 ]

Counted b-trees are a possibility, but a long-term thing to look at.

Your case still doesn't make sense, because on my laptop I can scan the equivalent number of index nodes in < 2 secs consistently.

Can you try last night's nightly? The change above should tell us what the problem is even if it doesn't fix it.

Comment by Hans Meiser [ 13/Sep/10 ]

I guess the main problem is the search for the median. In d_split a complete index scan (exactly 1 + 1/2 passes) is made to find the median.
I think this can be slow with many entries in the btree.
Could you use a solution like the one described at http://www.chiark.greenend.org.uk/~sgtatham/algorithms/cbtree.html?
This could also speed up the skip command with large n.
If you only store counts in the nodes, the overhead is negligible.
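
For illustration, a minimal sketch of that counted-tree idea, using a binary order-statistics tree for brevity (an assumption for the sketch; MongoDB's B-tree buckets would store per-bucket counts instead). With a count in every node, the k-th key, and hence the median of a range of known size, is found in O(height) instead of by scanning half the index:

// Sketch only (hypothetical structure, not MongoDB code): each node carries the
// size of its subtree, so rank queries descend the tree instead of scanning it.
#include <cstddef>

struct Node {
    long long key;
    std::size_t count;   // number of keys in this subtree, including this node
    Node* left;
    Node* right;
};

static std::size_t subtreeSize(const Node* n) { return n ? n->count : 0; }

// Returns the node holding the k-th smallest key (0-based), or nullptr.
const Node* kthKey(const Node* root, std::size_t k) {
    while (root) {
        std::size_t leftSize = subtreeSize(root->left);
        if (k < leftSize) {
            root = root->left;        // k-th key lies in the left subtree
        } else if (k == leftSize) {
            return root;              // this node is the k-th key
        } else {
            k -= leftSize + 1;        // skip the left subtree and this node
            root = root->right;
        }
    }
    return nullptr;
}

The median of the whole tree is then just kthKey(root, subtreeSize(root) / 2), and maintaining the counts only costs an increment or decrement along each insert and delete path.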

Comment by auto [ 12/Sep/10 ]

Author:

{'login': 'erh', 'name': 'Eliot Horowitz', 'email': 'eliot@10gen.com'}

Message: use splitVector instead of datasize and mediankey. 1 command vs. 2 and should be faster and more extensible.
still some code cleanup to do probably
SERVER-1550 SERVER-1779
http://github.com/mongodb/mongo/commit/2418da510eaf2fc5d1661ba816942086dfc9090e
