[SERVER-21178] Super slow query and increased memory usage on inefficient range queries Created: 28/Oct/15  Updated: 17/Nov/15  Resolved: 17/Nov/15

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 3.0.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Hoyt Ren Assignee: David Storch
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongo_index_bound_test.txt     Text File mongo_query_explain.txt    
Issue Links:
Duplicate
duplicates SERVER-20616 Plan ranker sampling from the beginni... Backlog
Related
related to SERVER-15086 Allow for efficient range queries ove... Closed
Operating System: ALL
Steps To Reproduce:

1. Insert documents matching the structure shown in the description.
2. Build the index.
3. Run the query against the index.
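
The steps above can be sketched in the mongo shell as follows (collection, field names, and values are taken from the description; reproducing the actual slowdown requires a much larger data volume than this single document):

```javascript
// Reproduction sketch (assumes a running mongod and the mongo shell).
// Step 1: insert a document matching the structure from the description.
db.order.insert({
  order_id: 485548,
  order_status: [
    { update_time: ISODate("2015-10-28T02:45:00Z"), status: 1 },
    { update_time: ISODate("2015-10-28T03:18:00Z"), status: 2 }
  ]
});

// Step 2: build the compound multikey index.
db.order.createIndex({ "order_status.update_time": -1, "order_status.status": -1 });

// Step 3: run the range query and inspect the chosen index bounds.
db.order.find({
  order_status: {
    $elemMatch: {
      update_time: {
        $gte: ISODate("2015-10-28T02:00:00Z"),
        $lt: ISODate("2015-10-28T03:00:00Z")
      },
      status: 2
    }
  }
}).explain();
```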

Participants:

 Description   

Querying one hour's worth of data, which is about 10,000 documents, takes more than 20 minutes, and these queries can use up all of the server's 100GB of memory.

Details:
I set up a server with the WiredTiger engine, configured like:

wiredTiger:
    engineConfig:
      cacheSizeGB: 100
      statisticsLogDelaySecs: 0
      journalCompressor: snappy
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: snappy
    indexConfig:
      prefixCompression: true

I have a collection named 'order'. The documents look like:

{
  order_id:485548,
  order_status:
  [
    {
      update_time:ISODate("2015-10-28T02:45:00Z"),
      status:1
    },
    {
      update_time:ISODate("2015-10-28T03:18:00Z"),
      status:2
    }
  ]
}

I built an index:

"key" : {"order_status.update_time" : -1,  "order_status.status" : -1}

then I run this query:

db.order.find({ "order_status" : { "$elemMatch" : { "update_time" : { "$gte" : ISODate("2015-10-28T02:00:00Z"), "$lt" : ISODate("2015-10-28T03:00:00Z") }, "status" : 2 } } }).explain();

I get:

"indexBounds" :
{
  "order_status.update_time" :
  [
    "[new Date(1446001200000), true)"
  ],
  "order_status.status" :
  [
    "[2.0, 2.0]"
  ]
}

I guess this means the scan covers all data before '2015-10-28T03:00:00Z', which is more than 40GB of data.

After I change the query to

db.order.find({ "order_status" : { "$elemMatch" : { "update_time" : { "$gte" : ISODate("2015-10-28T02:00:00Z") }, "status" : 2 } } }).explain();

I get:

"indexBounds" :
{
  "order_status.update_time" :
  [
    "[new Date(9223372036854775807), new Date(1445997600000)]"
  ],
  "order_status.status" :
  [
    "[2.0, 2.0]"
  ]
}

This looks better and runs much faster (it finishes within a second), but how can I get my original query to work?



 Comments   
Comment by David Storch [ 17/Nov/15 ]

Hi Hoyt Ren,

Thanks for the additional output. The information contained in the explain makes it clear why the slower plan is being selected by the query engine: this is a variation of SERVER-20616. The way that the plan ranker samples from the data during its trial execution period misleads the engine by making it look like the "(new Date(1447750800000), true)" index bounds are more selective.

Fixes for either SERVER-20616 or SERVER-15086 would resolve your problem. Short of those fixes, you can work around the problem in your application. Assuming you cannot change your schema, indexing, or query patterns, you could remove the $lt predicate from your query so that the MongoDB server receives the following:

db.order.find({'order_status':{$elemMatch:{'update_time':{$gte:ISODate('2015-11-12T16:00:00+08')},'status':2}}},{_id:0,order_id:1,'buyers.uid':1})

On the client side, you will then need to filter out any results where "update_time" exceeds the $lt upper bound.
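
That client-side filtering step can be sketched in JavaScript as below. This is an illustrative sketch, not code from the ticket: the sample documents and the `lowerBound`/`upperBound` names are invented, and it assumes the projection also returns `order_status` so the client can see `update_time`.

```javascript
// Client-side post-filter: the server query keeps only the $gte predicate,
// so the client must drop results whose matching status-2 entry falls at or
// past the intended $lt upper bound.
const lowerBound = new Date("2015-11-12T08:00:00Z"); // $gte sent to the server
const upperBound = new Date("2015-11-12T09:00:00Z"); // $lt enforced client-side

// Example result set as it might come back from the server.
const results = [
  { order_id: 1, order_status: [{ update_time: new Date("2015-11-12T08:30:00Z"), status: 2 }] },
  { order_id: 2, order_status: [{ update_time: new Date("2015-11-12T09:30:00Z"), status: 2 }] }
];

// Keep a document only if some array element satisfies the full $elemMatch,
// including the upper bound the server query omitted.
const filtered = results.filter(doc =>
  doc.order_status.some(s =>
    s.status === 2 && s.update_time >= lowerBound && s.update_time < upperBound
  )
);

console.log(filtered.map(d => d.order_id)); // [ 1 ]
```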

I am closing this ticket as a duplicate of SERVER-20616, since that is the underlying problem causing the engine to choose the slower plan.

Best,
Dave

Comment by Hoyt Ren [ 17/Nov/15 ]

Hi David,

Thanks for teaching me about the 'allPlansExecution' option. I attached the output as 'mongo_query_explain.txt'. It took about 20 minutes to generate this output, so the problem is clearly still there.

Sorry for a mistake in my earlier comment: ISODate('2015-09-12T17:00:00+08') should be ISODate('2015-11-12T17:00:00+08'). I hope this doesn't confuse you.

Comment by David Storch [ 16/Nov/15 ]

Hi Hoyt Ren,

In order to help me diagnose, can you please post the full explain output from both queries you mention above? Please use the allPlansExecution explain mode in order to ensure that all relevant diagnostic information is included. The syntax is

db.collection.find(<predicate>, <projection>).explain("allPlansExecution");

I will put this ticket into the "waiting" state pending the requested additional information.

Best,
Dave

Comment by Hoyt Ren [ 12/Nov/15 ]

After upgrading my server to 3.0.7, following https://docs.mongodb.org/manual/tutorial/upgrade-revision/#upgrade-sharded-cluster,
I find the load on the server is still high and the query is still slow sometimes. Then I ran this test:

db.order.find({'order_status':{$elemMatch:{'update_time':{$gte:ISODate('2015-11-12T16:00:00+08'),$lt:ISODate('2015-09-12T17:00:00+08')},'status':2}}}, {_id:0,order_id:1,'buyers.uid':1}).explain();

I get:

"indexBounds" : {
	"order_status.update_time" : [
		"(new Date(1442048400000), true)"
	],
	"order_status.status" : [
		"[2.0, 2.0]"
	]
}

I change the query to:

db.order.find({'order_status':{$elemMatch:{'update_time':{$gte:ISODate('2015-11-12T16:00:00+08')},'status':2}}},{_id:0,order_id:1,'buyers.uid':1}).explain();

I get:

"indexBounds" : {
	"order_status.update_time" : [
		"[new Date(9223372036854775807), new Date(1447315200000)]"
	],
	"order_status.status" : [
		"[2.0, 2.0]"
	]
}

when I issue:

db.version()

I get:

3.0.7

It seems that on the production server, the patch doesn't work.

I haven't rebuilt the index because I can't interrupt the service. Any suggestions on what I can try?

Comment by Ramon Fernandez Marina [ 29/Oct/15 ]

Thanks for the update on 3.0.7, glad to hear it performs better for your use case. I'm going to close this ticket as a duplicate of SERVER-15086, feel free to vote for it and watch it for updates.

Regards,
Ramón.

Comment by Hoyt Ren [ 29/Oct/15 ]

The detailed output of the test attached as 'mongo_index_bound_test.txt'.

Comment by Hoyt Ren [ 29/Oct/15 ]

Hello Ramon Fernandez,

Thank you for the working solution and the quick help. Now I understand this is a long-standing issue that has been discussed frequently. Even though it hasn't been completely resolved, we can always apply a workaround.

As I tested 3.0.7, it's much better. When I built the index ascending, mongo automatically chose the right direction for the bounds according to the amount of data, and the query became much faster. However, when I built the index descending, it seems to always scan from the largest value, even when scanning from the smallest would be much faster. This doesn't bother me because I usually expect mongo to scan from the largest, but the behavior is still odd.

I guess [new Date(9223372036854775807)] and [true] mean the same thing here.
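
That guess looks right: 9223372036854775807 is 2^63 - 1, the maximum signed 64-bit integer, which the explain output appears to use as the sentinel for an unbounded upper date endpoint, so it plays the same role as the "true" (unbounded) marker. A quick check in JavaScript, using BigInt since the value exceeds what a double can represent exactly:

```javascript
// 9223372036854775807 is 2^63 - 1, the largest signed 64-bit integer.
// BigInt is needed because the value is too large for a double to hold exactly.
const sentinel = 9223372036854775807n;
console.log(sentinel === 2n ** 63n - 1n);               // true
console.log(Number(sentinel) > Number.MAX_SAFE_INTEGER); // true
```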

I attached my testing details and removed irrelevant lines from the output. For the test, the full data range is from "2015-10-28T16:00:00Z" to "2015-10-29T08:00:00Z". I hope this helps.

I will upgrade my production servers as soon as possible, and I hope you can completely fix this problem later. Good work.

Comment by Ramon Fernandez Marina [ 28/Oct/15 ]

Hoyt Ren, unfortunately the behavior with index bounds that you're observing is a known limitation with multikey indices described in SERVER-15086, one that we hope to remove in the near future.

However we've made some recent improvements in 3.0.7 that may help with your specific case: SERVER-16042. Would you be able to try out this scenario in 3.0.7 and see if the query planner chooses better index bounds?

Thanks,
Ramón.
