[SERVER-18823] WT find performance is 1/2 that of MMv1 when using a multikey index Created: 04/Jun/15  Updated: 15/Sep/15  Resolved: 08/Jun/15

Status: Closed
Project: Core Server
Component/s: Index Maintenance, WiredTiger
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: David Hows Assignee: David Hows
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Backwards Compatibility: Fully Compatible
Sprint: Quint Iteration 5
Participants:

 Description   

Reproducer:

var tags = [ 'a','b','c','d'];
db.test.ensureIndex( { tags : 1, string : 1} )
for(x=0;x<1000000;x++){
	var tg = [];
	for(t=0;t<Math.floor(Math.random()*10);t++){tg.push(tags[x%4]+t)}
	db.test.insert( { tags : tg, string : tags[x%4] })
}
db.test.find( { tags : "b0", string : "b" } ).explain(verbose="executionStats")

When running the above code, the find will take twice as long under WT as it does on MMv1.

Logs from WT

2015-06-04T16:37:36.363+1000 I QUERY    [conn3] query test.test query: { query: { tags: "b0", string: "b" }, $explain: true } ntoreturn:0 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:1934 reslen:1517 locks:{ Global: { acquireCount: { r: 3870 } }, Database: { acquireCount: { r: 1935 } }, Collection: { acquireCount: { r: 1935 } } } 603ms
2015-06-04T16:37:36.893+1000 I QUERY    [conn3] query test.test query: { query: { tags: "b0", string: "b" }, $explain: true } ntoreturn:0 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:1934 reslen:1517 locks:{ Global: { acquireCount: { r: 3870 } }, Database: { acquireCount: { r: 1935 } }, Collection: { acquireCount: { r: 1935 } } } 526ms
2015-06-04T16:37:38.330+1000 I QUERY    [conn3] query test.test query: { query: { tags: "b0", string: "b" }, $explain: true } ntoreturn:0 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:1934 reslen:1517 locks:{ Global: { acquireCount: { r: 3870 } }, Database: { acquireCount: { r: 1935 } }, Collection: { acquireCount: { r: 1935 } } } 490ms
2015-06-04T16:37:39.602+1000 I QUERY    [conn3] query test.test query: { query: { tags: "b0", string: "b" }, $explain: true } ntoreturn:0 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:1934 reslen:1517 locks:{ Global: { acquireCount: { r: 3870 } }, Database: { acquireCount: { r: 1935 } }, Collection: { acquireCount: { r: 1935 } } } 490ms
2015-06-04T16:37:40.484+1000 I QUERY    [conn3] query test.test query: { query: { tags: "b0", string: "b" }, $explain: true } ntoreturn:0 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:1934 reslen:1517 locks:{ Global: { acquireCount: { r: 3870 } }, Database: { acquireCount: { r: 1935 } }, Collection: { acquireCount: { r: 1935 } } } 493ms

Logs from MMv1

2015-06-04T16:38:53.158+1000 I QUERY    [conn1] query test.test query: { query: { tags: "b0", string: "b" }, $explain: true } ntoreturn:0 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:2159 reslen:1517 locks:{ Global: { acquireCount: { r: 4320 } }, MMAPV1Journal: { acquireCount: { r: 2160 } }, Database: { acquireCount: { r: 2160 } }, Collection: { acquireCount: { R: 2160 } } } 536ms
2015-06-04T16:38:53.542+1000 I QUERY    [conn1] query test.test query: { query: { tags: "b0", string: "b" }, $explain: true } ntoreturn:0 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:2110 reslen:1517 locks:{ Global: { acquireCount: { r: 4222 } }, MMAPV1Journal: { acquireCount: { r: 2111 } }, Database: { acquireCount: { r: 2111 } }, Collection: { acquireCount: { R: 2111 } } } 287ms
2015-06-04T16:38:54.520+1000 I QUERY    [conn1] query test.test query: { query: { tags: "b0", string: "b" }, $explain: true } ntoreturn:0 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:2110 reslen:1517 locks:{ Global: { acquireCount: { r: 4222 } }, MMAPV1Journal: { acquireCount: { r: 2111 } }, Database: { acquireCount: { r: 2111 } }, Collection: { acquireCount: { R: 2111 } } } 289ms
2015-06-04T16:38:55.278+1000 I QUERY    [conn1] query test.test query: { query: { tags: "b0", string: "b" }, $explain: true } ntoreturn:0 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:2110 reslen:1517 locks:{ Global: { acquireCount: { r: 4222 } }, MMAPV1Journal: { acquireCount: { r: 2111 } }, Database: { acquireCount: { r: 2111 } }, Collection: { acquireCount: { R: 2111 } } } 295ms
2015-06-04T16:38:56.008+1000 I QUERY    [conn1] query test.test query: { query: { tags: "b0", string: "b" }, $explain: true } ntoreturn:0 ntoskip:0 keyUpdates:0 writeConflicts:0 numYields:2110 reslen:1517 locks:{ Global: { acquireCount: { r: 4222 } }, MMAPV1Journal: { acquireCount: { r: 2111 } }, Database: { acquireCount: { r: 2111 } }, Collection: { acquireCount: { R: 2111 } } } 282ms



 Comments   
Comment by Martin Bligh [ 15/Sep/15 ]

Was a while since we did this, but I think this is roughly what we were using for a test in the end, just to make the repro easy:

var num_queries = 10
print('QUERYING ' + num_queries)
var beforeMs = new Date().getTime()
for(x = 0; x < num_queries; x++) {
	db.test.find( { tags : "b0", string : "b" } ).itcount()
}
var intervalMs = new Date().getTime() - beforeMs
print((1000*num_queries)/intervalMs + ' queries/s')

Comment by Martin Bligh [ 08/Jun/15 ]

Discussed with Michael:

                "winningPlan" : {
                        "stage" : "FETCH",
                        "inputStage" : {
                                "stage" : "IXSCAN",
                                "keyPattern" : {
                                        "tags" : 1,
                                        "string" : 1
                                },
                                "indexName" : "tags_1_string_1",
                                "isMultiKey" : true,
                                "indexVersion" : 1,
                                "direction" : "forward",
                                "indexBounds" : {
                                        "tags" : [
                                                "[\"b0\", \"b0\"]"
                                        ],
                                        "string" : [
                                                "[\"b\", \"b\"]"
                                        ]
                                }
                        }
                },

For each item in the index we're walking, WT maps to a record ID that we have to search the collection for, whereas mmap maps directly to a diskLoc, so this is one scenario where we expect WT to be slower in the single-threaded case.

Comment by Alexander Gorrod [ 05/Jun/15 ]

It appears to me as though there is a performance difference for explain, rather than the underlying query. I populated a database using the script in the reproduction steps.

Then I run the following script:

var num_queries = 100000
var num_explained_queries = 1000
var first = new Date().getTime()
for(x=0;x<num_queries;x++){
	db.test.find( { tags : "b0", string : "b" } )
}
var second = new Date().getTime()
var interval = second - first
print('Did ' + num_queries + ' queries in: ' + interval + ' ms')
 
for(x=0;x<num_explained_queries;x++){
	db.test.find( { tags : "b0", string : "b" } ).explain(verbose="executionStats")
}
var third = new Date().getTime()
var interval = third - second
print('Did ' + num_explained_queries + ' explained queries in: ' + interval + ' ms')

With WiredTiger I see:

$ ../../mongo/mongo 18823_run.js 
MongoDB shell version: 3.1.4-pre-
connecting to: test
Did 100000 queries in: 717 ms
Did 1000 explained queries in: 30572 ms

With mmapv1 I see:

$ ../../mongo/mongo 18823_run.js 
MongoDB shell version: 3.1.4-pre-
connecting to: test
Did 100000 queries in: 792 ms
Did 1000 explained queries in: 13498 ms

In summary: The queries ran 10% faster when using WiredTiger, but the explain command was twice as slow. The explain part of the reproducer dominates the query itself.

Generated at Thu Feb 08 03:48:51 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.