[SERVER-16759] Query uses unexpected index Created: 07/Jan/15  Updated: 06/Apr/23  Resolved: 10/Jan/15

Status: Closed
Project: Core Server
Component/s: Index Maintenance
Affects Version/s: 2.6.6
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: Alexander S. Assignee: Ramon Fernandez Marina
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

We didn't have such problems with 2.4, so this looks like a regression in 2.6.6.

We have the following indexes:

[
        {
                "v" : 1,
                "name" : "_id_",
                "key" : {
                        "_id" : 1
                },
                "ns" : "crm_production.sunspot_index_queue_entries"
        },
        {
                "v" : 1,
                "key" : {
                        "record_class_name" : 1,
                        "record_id" : 1
                },
                "name" : "record_class_name_1_record_id_1",
                "ns" : "crm_production.sunspot_index_queue_entries"
        },
        {
                "v" : 1,
                "key" : {
                        "priority" : -1,
                        "run_at" : 1,
                        "lock" : 1,
                        "record_class_name" : 1
                },
                "name" : "priority_-1_run_at_1_lock_1_record_class_name_1",
                "ns" : "crm_production.sunspot_index_queue_entries"
        }
]

And the following query:

'2015-01-07T04:46:25.014-0500 [conn1866] query crm_production.sunspot_index_queue_entries 
  query: { 
    $query: { priority: { $gte: -100 }, run_at: { $lte: new Date(1420623970574) }, lock: null, record_class_name: { $in: [ "Site" ] } },
    $orderby: { priority: -1, run_at: 1 } 
  }
  planSummary: 
    IXSCAN { record_class_name: 1.0, record_id: 1.0 }, 
    IXSCAN { record_class_name: 1.0, record_id: 1.0 } cursorid:366254546045 ntoreturn:100 ntoskip:0 nscanned:1421527 nscannedObjects:1421527 keyUpdates:0 numYields:578 locks(micros) r:27845385 nreturned:100 reslen:11362 14434ms'

Previously the query didn't take more than a few ms; now it takes 15 seconds on average.
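A minimal workaround sketch, assuming the mongo shell and the collection above (illustration only, not something verified in this ticket): hint() pins the query to the intended compound index so the planner cannot fall back to record_class_name_1_record_id_1.

    // Force the compound index by name; filter, sort and limit match the slow query above.
    db.sunspot_index_queue_entries.find({
        priority: { $gte: -100 },
        run_at: { $lte: new Date(1420623970574) },
        lock: null,
        record_class_name: { $in: [ "Site" ] }
    }).sort({ priority: -1, run_at: 1 })
      .hint("priority_-1_run_at_1_lock_1_record_class_name_1")
      .limit(100)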



 Comments   
Comment by Ramon Fernandez Marina [ 10/Jan/15 ]

Hi heaven, thanks for letting us know. I'm going to resolve this ticket as "gone away", but if the issue appears again after you've refilled the collection please reopen the ticket and send us the information requested above so we can investigate if there's a bug here.

Regards,
Ramón.

Comment by Alexander S. [ 10/Jan/15 ]

It seems the problem went away after re-creating the index. We need a few more days to refill the collection and retry.

Comment by Ramon Fernandez Marina [ 08/Jan/15 ]

Hi heaven,

We'll need some more information to diagnose this behavior. Could you please follow the steps below and upload the resulting information?

  • Increase the log level by running:

    db.adminCommand( { setParameter: 1, logLevel: 2 } )

  • Display query shapes from the plan cache:

    db.user_data.getPlanCache().listQueryShapes()

  • Run getPlansByQuery on the offending query:

    db.user_data.getPlanCache().getPlansByQuery({ priority: { $gte: -100 }, run_at: { $lte: new Date(1420623970574) }, lock: null, record_class_name: { $in: [ "Site" ] } })

  • Run the same query with explain(true):

    db.user_data.find({priority: { $gte: -100 }, run_at: { $lte: new Date(1420623970574) }, lock: null, record_class_name: { $in: [ "Site" ] } }).explain(true)

  • Re-set the log level to the previous value (probably 0 if you didn't change it before) to prevent logs from taking too much space:

    db.adminCommand( { setParameter: 1, logLevel: 0 } )
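Put together, a sketch of the plan-cache commands run against the collection from this ticket (the user_data name above presumably stands in for the actual collection; crm_production.sunspot_index_queue_entries is assumed from the description):

    // Inspect cached query shapes and the plans cached for the slow query shape.
    var coll = db.getSiblingDB("crm_production").sunspot_index_queue_entries;
    printjson(coll.getPlanCache().listQueryShapes());
    printjson(coll.getPlanCache().getPlansByQuery({
        priority: { $gte: -100 },
        run_at: { $lte: new Date(1420623970574) },
        lock: null,
        record_class_name: { $in: [ "Site" ] }
    }));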

You can put each piece of information in a file, zip all files along with the mongod.log log files from your cluster and post it to this ticket. If the resulting file is too large (hundreds of megabytes) please let me know and I'll send you different upload instructions.

Thanks,
Ramón.

Comment by Alexander S. [ 08/Jan/15 ]

Any suggestions, please? This is really hurting us.

We also tried:

{
                "v" : 1,
                "key" : {
                        "priority" : -1,
                        "record_class_name" : 1,
                        "run_at" : 1,
                        "lock" : 1
                },
                "name" : "priority_-1_record_class_name_1_run_at_1_lock_1",
                "ns" : "crm_production.sunspot_index_queue_entries"
        }

And the following query:

  query: { 
    $query: { record_class_name: { $in: [ "Site" ] }, run_at: { $lte: new Date(1420708758000) }, lock: null },
    $orderby: { priority: -1, record_class_name: 1, run_at: 1 }
  }

This worked with 2.4; I remember we had over 50,000,000 records in the queue and didn't experience any performance issues. But none of the above works in 2.6.6.
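A sketch of verifying which index the 2.6 planner picks for this second query shape (assuming the collection from the description; shown for illustration only):

    // explain(true) shows the winning plan and the rejected candidate plans
    // for the alternative query shape above.
    db.sunspot_index_queue_entries.find({
        record_class_name: { $in: [ "Site" ] },
        run_at: { $lte: new Date(1420708758000) },
        lock: null
    }).sort({ priority: -1, record_class_name: 1, run_at: 1 }).explain(true)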

Comment by Alexander S. [ 07/Jan/15 ]

Please change the type from feature to bug.

UPD: sometimes it works, sometimes it doesn't. We didn't change anything in our code, but just started getting performance alerts after the upgrade. db.upgradeCheckAllDBs() shows everything is fine with the database.
