[SERVER-13041] count queries that can be index only are slower in 2.6 Created: 05/Mar/14  Updated: 11/Jul/16  Resolved: 07/Mar/14

Status: Closed
Project: Core Server
Component/s: Performance, Querying
Affects Version/s: 2.6.0-rc0
Fix Version/s: 2.6.0-rc2

Type: Bug Priority: Major - P3
Reporter: Mark Callaghan Assignee: hari.khalsa@10gen.com
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File c2.js     File i.py    
Operating System: ALL
Steps To Reproduce:

Create a 10M doc collection with 3 columns (_id, c1, c2). Values inserted were from (0,0,0), (1,1,1), .... to (9999999, 9999999, 9999999).

import pymongo
import os
import time
 
def f(id, nrows):
    print 'hello', id, nrows
    client = pymongo.Connection(w=0)
    db = client.i
    coll = db.i
    for batch in xrange(nrows / 10000):
      docs = []
      for x in xrange(10000):
        row = (batch * 10000) + x
        docs.append({"_id": row, "c1":row, "c2":row})
      coll.insert(docs)
      print 'inserted ', batch
 
if __name__ == '__main__':
    f(1, 10000000)

Then create an index on c1...

db.i.ensureIndex({c1:1})

Then run these queries...

use i
db.i.find().count()
db.i.find({ _id : { $gt : 4000000 } }).count()
db.i.find({ c1 : { $gt : 4000000 } }).count()
db.i.find({ c1 : { $gt : 4000000 } }).hint({ c1 : 1 }).count()
db.i.find({ c1 : { $gt : 4000000 } }, { _id:0 }).count()
db.i.find({ c1 : { $gt : 4000000 } }, { _id:0 }).hint({ c1 : 1 }).count()
 
db.i.find({ c2 : { $gt : 4000000 } }).count()
db.i.find({ c2 : { $gt : 4000000 } }, { _id:0 }).count()

then look at the mongod error log for slow query results.

Participants:

 Description   

Count queries that can be index only are slower in 2.6 rc0 than in 2.4.9. For a 10M doc collection with an index on c1 I ran these queries. Queries 1,2 can use the PK on _id. Queries 2,3,4,5 can use the index on c1. Queries 6,7 should do a full scan.

1) db.i.find({ _id : { $gt : 4000000 } }).count()
2) db.i.find({ c1 : { $gt : 4000000 } }).count()
3) db.i.find({ c1 : { $gt : 4000000 } }).hint({ c1 : 1 }).count()
4) db.i.find({ c1 : { $gt : 4000000 } }, { _id:0 }).count()
5) db.i.find({ c1 : { $gt : 4000000 } }, { _id:0 }).hint({ c1 : 1 }).count()
6) db.i.find({ c2 : { $gt : 4000000 } }).count()
7) db.i.find({ c2 : { $gt : 4000000 } }, { _id:0 }).count()

I ran the set of queries twice and looked at times for the second run. All data is cached in RAM.

For 2.4.9 the times are ~750 miillis each for the first 5 queries and then ~2.4 seconds for the last 2 queries.

For 2.6 rc0 the times are ~2 seconds each for the first 5 queries and then ~2.9 seconds for the last 2 queries that should do a full scan. The index-only count queries are much slower in 2.6. Tull scan count queries are also slower (2.9 vs 2.4 seconds) but the difference is less significant.

Alas, I can't do find().count().explain()

Sorry for mixing javascript and python, I am new to javascript. I also don't have the latest version of pymongo.



 Comments   
Comment by hari.khalsa@10gen.com [ 07/Mar/14 ]

According to our benchmarks 2.6 should now be faster than 2.4. I'm going to resolve this ticket but do reopen if it's not solved for you. Thanks again for reporting it.

Comment by Davide Italiano [ 06/Mar/14 ]

In my test case scenarios this commit definitely fixes the regression:

POST:
Threads: 1 ops/sec: 9.65344e+06
Threads: 2 ops/sec: 1.84128e+07
Threads: 4 ops/sec: 1.91791e+07
Threads: 6 ops/sec: 2.6462e+07
Threads: 8 ops/sec: 3.38295e+07
 
PRE:
Threads: 1 ops/sec: 4.20415e+06
Threads: 2 ops/sec: 7.41015e+06
Threads: 4 ops/sec: 1.07181e+07
Threads: 6 ops/sec: 1.11049e+07
Threads: 8 ops/sec: 1.34372e+07
 

Also, this is faster than 2.4.9, results here for completeness:

Threads: 1 ops/sec: 8.95095e+06
Threads: 2 ops/sec: 1.4573e+07
Threads: 4 ops/sec: 2.32288e+07
Threads: 6 ops/sec: 2.22272e+07
Threads: 8 ops/sec: 2.84657e+07

FYI, to reproduce, get mongo-perf from github, build it and run

./benchmark --test command --raw --multi-db

and wait for the output of Commands::CountsIntIDRange

Comment by Mark Callaghan [ 06/Mar/14 ]

Thanks, will repeat tests on RC2

Comment by hari.khalsa@10gen.com [ 06/Mar/14 ]

Hi mdcallag, thanks for the heads up. I've submitted a change that should improve indexed count performance.

To be honest, there was some superfluous code in the fast-count execution stage that was left-over from when the count stage was created by chipping away at the index scan stage. My change removes that cruft. I'll have more concrete numbers about the impact when our performance suites run internally. Feel encouraged to test the change out and report back here. Unfortunately this change didn't make the RC1 cutoff so you'll have to pull the latest code from master to test it yourself.

Thanks again!

Comment by Githook User [ 06/Mar/14 ]

Author:

{u'username': u'hkhalsa', u'name': u'Hari Khalsa', u'email': u'hkhalsa@10gen.com'}

Message: SERVER-13041 remove some slow leftover ixscan junk from fast count stage
Branch: master
https://github.com/mongodb/mongo/commit/d1dc7cf2b213d77103658ccd2ea4816b33a27f6a

Comment by Mark Callaghan [ 05/Mar/14 ]

From explain output for that last batch of queries.

  • 2.4.9 uses BtreeCursor for all but the first (which gets BasicCursor)
  • 2.6 rc0 uses BtreeCursor for all but the first and the fourth (which get BasicCursor)
Comment by Mark Callaghan [ 05/Mar/14 ]

Replaced the .count() with a predicate on c2 that matches zero rows and repeated tests. In this case 2.6 did better.

use i
db.i.find({c2 : { $lt : 0} }).explain()
db.i.find({ _id : { $gt : 4000000 }, c2 : { $lt : 0} }).explain()
db.i.find({ c1 : { $gt : 4000000 }, c2 : { $lt : 0} }).explain()
db.i.find({ c1 : { $gt : 4000000 }, c2 : { $lt : 0} }).hint({ c1 : 1 }).explain()
db.i.find({ c1 : { $gt : 4000000 }, c2 : { $lt : 0} }, { _id:0 }).explain()
db.i.find({ c1 : { $gt : 4000000 }, c2 : { $lt : 0} }, { _id:0 }).hint({ c1 : 1 }).explain()
 
db.i.find({c2 : { $lt : 0} })
db.i.find({ _id : { $gt : 4000000 }, c2 : { $lt : 0} })
db.i.find({ c1 : { $gt : 4000000 }, c2 : { $lt : 0} })
db.i.find({ c1 : { $gt : 4000000 }, c2 : { $lt : 0} }).hint({ c1 : 1 })
db.i.find({ c1 : { $gt : 4000000 }, c2 : { $lt : 0} }, { _id:0 })
db.i.find({ c1 : { $gt : 4000000 }, c2 : { $lt : 0} }, { _id:0 }).hint({ c1 : 1 })

And from the tail of the mongod error log the result for the 3rd and 5th queries are interesting to me because I have hinted the index scan and 2.6 is faster. That might be from a different change that avoids evaluating a predicate implied by the index scan.

for 2.4.9
query: { c2: { $lt: 0.0 } } ntoreturn:0 ntoskip:0 nscanned:10000000 keyUpdates:0 locks(micros) r:2609711 nreturned:0 reslen:20 2609ms
query: { _id: { $gt: 4000000.0 }, c2: { $lt: 0.0 } } ntoreturn:0 ntoskip:0 nscanned:5999999 keyUpdates:0 locks(micros) r:8731502 nreturned:0 reslen:20 8731ms
query: { c1: { $gt: 4000000.0 }, c2: { $lt: 0.0 } } ntoreturn:0 ntoskip:0 nscanned:5999999 keyUpdates:0 locks(micros) r:8912593 nreturned:0 reslen:20 8912ms
query: { query: { c1: { $gt: 4000000.0 }, c2: { $lt: 0.0 } }, $hint: { c1: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:5999999 keyUpdates:0 locks(micros) r:5516241 nreturned:0 reslen:20 5516ms
query: { c1: { $gt: 4000000.0 }, c2: { $lt: 0.0 } } ntoreturn:0 ntoskip:0 nscanned:5999999 keyUpdates:0 locks(micros) r:8869878 nreturned:0 reslen:20 8869ms
query: { query: { c1: { $gt: 4000000.0 }, c2: { $lt: 0.0 } }, $hint: { c1: 1.0 } } ntoreturn:0 ntoskip:0 nscanned:5999999 keyUpdates:0 locks(micros) r:5628963 nreturned:0 reslen:20 5628ms
 
for 2.6.0 rc0
query: { c2: { $lt: 0.0 } } planSummary: COLLSCAN ntoreturn:0 ntoskip:0 keyUpdates:0 numYields:4 locks(micros) r:5256568 nreturned:0 reslen:20 2795ms
query: { _id: { $gt: 4000000.0 }, c2: { $lt: 0.0 } } planSummary: IXSCAN { _id: 1 } ntoreturn:0 ntoskip:0 keyUpdates:0 numYields:2 locks(micros) r:9018143 nreturned:0 reslen:20 4600ms
query: { c1: { $gt: 4000000.0 }, c2: { $lt: 0.0 } } planSummary: IXSCAN { c1: 1.0 } ntoreturn:0 ntoskip:0 keyUpdates:0 numYields:5 locks(micros) r:8510758 nreturned:0 reslen:20 4606ms
query: { query: { c1: { $gt: 4000000.0 }, c2: { $lt: 0.0 } }, $hint: { c1: 1.0 } } planSummary: IXSCAN { c1: 1.0 } ntoreturn:0 ntoskip:0 keyUpdates:0 numYields:5 locks(micros) r:9050400 nreturned:0 reslen:20 4640ms
query: { c1: { $gt: 4000000.0 }, c2: { $lt: 0.0 } } planSummary: COLLSCAN ntoreturn:0 ntoskip:0 keyUpdates:0 numYields:5 locks(micros) r:5754720 nreturned:0 reslen:20 2918ms
query: { query: { c1: { $gt: 4000000.0 }, c2: { $lt: 0.0 } }, $hint: { c1: 1.0 } } planSummary: IXSCAN { c1: 1.0 } ntoreturn:0 ntoskip:0 keyUpdates:0 numYields:9 locks(micros) r:8809494 nreturned:0 reslen:20 4652ms

Comment by Mark Callaghan [ 05/Mar/14 ]

Just learned not to paste code, reformatting kills it. Attached queries in one file and python for loading in another.

Generated at Thu Feb 08 03:30:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.