[SERVER-20281] Optimizer sometimes chooses an improper index or does a full scan. Created: 03/Sep/15  Updated: 08/Feb/23  Resolved: 26/Sep/15

Status: Closed
Project: Core Server
Component/s: MMAPv1, Performance
Affects Version/s: 2.4.14
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: 아나 하리 Assignee: Sam Kleinman (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Steps To Reproduce:

Don't know.

Participants:

 Description   

It looks like the optimizer sometimes chooses an improper index or does a full collection scan.
There are a lot of disk reads (page faults) when this happens: in the normal state there are only a few hundred disk reads, but they suddenly climb to around 10k.
I am not sure which execution plan the optimizer chose, because there seems to be no way to see the execution plan of a currently running query.

The execution plan is good when I execute this query manually, but sometimes the query seems to run a different way. The weird thing is that I cannot find any documents matching this "creator_id" even while this pattern of query is running.

Is there any way to tell which execution plan this query chose?

  • Linux version : 2.6.32-358.6.2.el6.x86_64
  • Total disk data size : 1.3TB (total index size is about 1/3)
  • Total memory : 390GB
  • Indexes

    db01:PRIMARY> db.system.indexes.find().pretty()
    {
        "v" : 1,
        "key" : {
            "_id" : 1
        },
        "ns" : "db1.coll1",
        "name" : "_id_"
    }
    {
        "v" : 1,
        "key" : {
            "creator_id" : 1,
            "fd1" : 1,
            "fd2" : 1,
            "fd3" : 1,
            "fd4" : -1
        },
        "ns" : "db1.coll1",
        "name" : "creator_id_1_fd1_1_fd2_1_fd3_1_fd4_-1"
    }
    {
        "v" : 1,
        "key" : {
            "creator_id" : 1,
            "fd5" : 1,
            "fd6" : 1,
            "_id" : -1
        },
        "ns" : "db1.coll1",
        "name" : "creator_id_1_fd5_1_fd6_1__id_-1"
    }
    {
        "v" : 1,
        "key" : {
            "creator_id" : "hashed"
        },
        "ns" : "db1.coll1",
        "name" : "creator_id_hashed"
    }
    

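The query shape in this ticket is an equality on creator_id plus a range and sort on _id. Under the usual compound-index prefix rule, none of the indexes above has _id immediately after creator_id, which (this is my reading, not something confirmed in the ticket) would leave the server without an index that serves this shape cleanly. A minimal Node.js sketch of that check — servesQuery is a hypothetical helper for illustration, not server code:

```javascript
// Index-prefix rule sketch: an equality on creator_id followed by a
// range + sort on _id wants creator_id as the first key and _id as
// the very next key in the compound index.
function servesQuery(keyPattern) {
  const keys = Object.keys(keyPattern);
  return keys[0] === "creator_id" && keys[1] === "_id";
}

// Key patterns copied from the system.indexes output above.
const indexes = [
  { _id: 1 },
  { creator_id: 1, fd1: 1, fd2: 1, fd3: 1, fd4: -1 },
  { creator_id: 1, fd5: 1, fd6: 1, _id: -1 },
  { creator_id: "hashed" }, // hashed: no range scans at all
];

console.log(indexes.filter(servesQuery).length); // 0 — none fits this shape
```

If that reading is right, an index such as { creator_id: 1, _id: 1 } — or a hint() to pin the plan — would be the usual workaround.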
  • Slow query profiling

    db01:PRIMARY> db.system.profile.find().limit(1).pretty()
    {
        "op" : "getmore",
        "ns" : "db1.coll1",
        "query" : {
            "$comment" : "{\"cmt\": []}",
            "$query" : {
                "_id" : {
                    "$gt" : ObjectId("51c161965ddbb86926acf403"),
                    "$lt" : ObjectId("ffffffffffffffffffffffff")
                },
                "creator_id" : ObjectId("51c1601bb14bd287915f6f76")
            },
            "$orderby" : {
                "_id" : 1
            }
        },
        "cursorid" : NumberLong("7719472921984034942"),
        "ntoreturn" : 50,
        "keyUpdates" : 0,
        "numYield" : 847485,
        "lockStats" : {
            "timeLockedMicros" : {
                "r" : NumberLong("41763789634"),
                "w" : NumberLong(0)
            },
            "timeAcquiringMicros" : {
                "r" : NumberLong("21953017915"),
                "w" : NumberLong(7)
            }
        },
        "nreturned" : 6,
        "responseLength" : 152,
        "millis" : 21953838,
        "ts" : ISODate("2015-07-16T00:58:03.175Z"),
        "client" : "127.0.0.10",
        "allUsers" : [ ],
        "user" : ""
    }
    

  • MongoDB Log

... getmore db1.coll1 query: { $comment: "{"cmt": []}", $query: { _id: { $gt: ObjectId('51c161965ddbb86926acf403'), $lt: ObjectId('ffffffffffffffffffffffff') }, creator_id: ObjectId('51c1601bb14bd287915f6f76') }, $orderby: { _id: 1 } } cursorid:7327170102889815944 ntoreturn:50 keyUpdates:0 numYields: 899085 locks(micros) r:48861004385 nreturned:12 reslen:284 25602157ms
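For what it's worth, the profile entry above can be sanity-checked with a little arithmetic (a plain Node.js sketch; the numbers are copied from the system.profile document, the interpretation is mine): the getmore ran for roughly six hours to return 6 documents, and nearly all of that elapsed time was spent acquiring the read lock — consistent with a plan scanning a very large key range under MMAPv1's global read lock.

```javascript
// Figures copied from the system.profile document above.
const millis = 21953838;                  // total elapsed time of the getmore
const acquiringReadMicros = 21953017915;  // lockStats.timeAcquiringMicros.r
const nreturned = 6;                      // documents actually returned

const elapsedHours = millis / 3600000;
const acquiringShare = (acquiringReadMicros / 1000) / millis;

console.log(elapsedHours.toFixed(1));           // "6.1" hours for 6 documents
console.log((acquiringShare * 100).toFixed(1)); // "100.0" — ~all time waiting on the read lock
```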



 Comments   
Comment by Ramon Fernandez Marina [ 26/Sep/15 ]

matt.lee, I'm going to close this issue since we haven't heard back from you for a while. As Dan mentioned, you can use explain() to find out why a specific query plan was chosen. And as Sam mentioned, if this behavior turns out to be a bug in 2.4, the recommended approach is to upgrade to the latest stable release (3.0.6 at the time of this writing).

Regards,
Ramón.

Comment by Sam Kleinman (Inactive) [ 09/Sep/15 ]

Hi Matt,

I wanted to check in and see if you were able to reproduce this issue on a different version, or if you were able to find what you needed using the 2.4 explain() method.

Regards,
sam

Comment by Daniel Pasette (Inactive) [ 04/Sep/15 ]

You can still run an explain on the query to understand which index was chosen in 2.4

Here's the documentation for the 2.4 branch:
https://docs.mongodb.org/v2.4/reference/operator/meta/explain/
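Per those docs, the shell's explain() in 2.4 corresponds to the $explain query meta operator. A minimal sketch of that envelope for the query in this ticket (a plain Node.js object for illustration; the ObjectId values from the profile entry are shown as strings):

```javascript
// 2.4-style explain envelope: wrap the original $query/$orderby and
// add $explain so the server returns the plan instead of documents.
const explainQuery = {
  $query: {
    _id: {
      $gt: "51c161965ddbb86926acf403", // ObjectId in a real shell session
      $lt: "ffffffffffffffffffffffff",
    },
    creator_id: "51c1601bb14bd287915f6f76",
  },
  $orderby: { _id: 1 },
  $explain: 1,
};

console.log(Object.keys(explainQuery)); // [ '$query', '$orderby', '$explain' ]
```

In the shell this is simply db.coll1.find({...}).sort({_id: 1}).explain(), run against the same predicate the profiler recorded.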

Comment by 아나 하리 [ 03/Sep/15 ]

Hi Sam.

So there's no way to introspect what the optimizer is doing in MongoDB 2.4.

Regards,
Matt.

Comment by Sam Kleinman (Inactive) [ 03/Sep/15 ]

I just want to confirm that you're seeing this issue in 2.4 of MongoDB. The query system underwent many improvements in the 2.6 release. Have you tried to reproduce this issue in 2.6 or 3.0?

Additionally, query introspection (i.e. the profiler, the explain operation, and slowms logging) was improved substantially for the 3.0 release which should provide some of the additional data that will help you address these issues.

Regards,
sam

Comment by 아나 하리 [ 03/Sep/15 ]

Also, the slow query logging in the mongodb log file doesn't tell me how many documents were scanned (the nscanned field) in this case. Is that normal?

Generated at Thu Feb 08 03:53:43 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.