[SERVER-17810] "matchTested" exec stage statistic is misleading Created: 30/Mar/15  Updated: 05/Feb/16  Resolved: 09/Jun/15

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: None
Fix Version/s: 3.1.5

Type: Improvement Priority: Major - P3
Reporter: David Storch Assignee: Qingyang Chen
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Minor Change
Sprint: Quint Iteration 5
Participants:

 Description   

Four query execution stages keep a counter called matchTested: AND_SORTED, FETCH, IXSCAN, and OR.

It would seem reasonable to assume that matchTested means the number of results that were tested against a filter (the number that passed the filter plus the number that were rejected by the filter). In fact, all of the stages count matchTested as just the number of results that pass the filter.

Not only is this confusing naming, but it is a less useful statistic: the number passing the filter can already be inferred via the advanced counter. We should consider changing the meaning of the counter to align with its naming.
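To make the two readings of the counter concrete, here is a minimal C++ sketch. It is not the server's code: `StageCounters`, `runFilter`, and both `matchTestedAs...` fields are hypothetical names used only for illustration, and the integer inputs stand in for results flowing through a stage.

```cpp
#include <functional>
#include <vector>

// Hypothetical counters; the names mirror this ticket, not the server's structs.
struct StageCounters {
    long long advanced = 0;             // results returned to the parent stage
    long long matchTestedAsPassed = 0;  // behaviour described above: counts only passes
    long long matchTestedAsTested = 0;  // meaning the name suggests: passes + rejections
};

// Run a filter over the inputs and maintain both variants of the counter.
inline StageCounters runFilter(const std::vector<int>& inputs,
                               const std::function<bool(int)>& filter) {
    StageCounters c;
    for (int value : inputs) {
        ++c.matchTestedAsTested;        // incremented before the outcome is known
        if (filter(value)) {
            ++c.matchTestedAsPassed;    // incremented only inside the "passes" branch
            ++c.advanced;               // a passing result is returned upward
        }
    }
    return c;
}
```

With inputs 1 through 5 and a filter that keeps even values, the "passed" variant ends at 2, identical to advanced, while the "tested" variant ends at 5.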

Edit: For IXSCAN, FETCH, and OR, the value of matchTested can be inferred from the value of the advanced counter. As such, this information is redundant: we should remove it for these three stages. For AND_SORTED, we should change its semantics per above.



 Comments   
Comment by Githook User [ 09/Jun/15 ]

Author: Qingyang Chen <qingyang.chen@10gen.com>

Message: SERVER-17810 remove matchTested specificStat

Closes #976

Signed-off-by: David Storch <david.storch@10gen.com>
Branch: master
https://github.com/mongodb/mongo/commit/ee2e87ef994fb486b05cbec24eb16d95b3226136

Comment by J Rassi [ 04/Jun/15 ]

Sounds good. Edited the description to reflect discussion.

Comment by David Storch [ 04/Jun/15 ]

I meant for all stages that keep a matchTested statistic, but on second thought we might want to keep it for AND_SORTED. It's usually pretty obvious how many documents are getting filtered out at each step using just the advanced numbers for each stage in the tree. It's helpful for AND_SORTED because it can help you distinguish between docs dropped because they weren't in the intersection versus docs in the intersection set that got dropped because they didn't pass the filter.
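The AND_SORTED distinction can be sketched roughly as follows. This is an illustration under stated assumptions, not the stage's real implementation: `AndSortedCounters` and `intersectAndFilter` are hypothetical names, the two children are modelled as ascending lists of unique integer ids, and matchTested is given the "tested against the filter" meaning.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical counters for a two-way sorted intersection with a filter.
struct AndSortedCounters {
    long long matchTested = 0;  // ids in the intersection, tested against the filter
    long long advanced = 0;     // ids in the intersection that also passed the filter
};

// Assumes both inputs are sorted ascending and contain no duplicates.
inline AndSortedCounters intersectAndFilter(const std::vector<int>& left,
                                            const std::vector<int>& right,
                                            const std::function<bool(int)>& filter) {
    AndSortedCounters c;
    std::size_t i = 0, j = 0;
    while (i < left.size() && j < right.size()) {
        if (left[i] < right[j]) {
            ++i;                         // only in left: dropped by the intersection
        } else if (right[j] < left[i]) {
            ++j;                         // only in right: dropped by the intersection
        } else {
            ++c.matchTested;             // in the intersection, so the filter is applied
            if (filter(left[i])) {
                ++c.advanced;
            }
            ++i;
            ++j;
        }
    }
    return c;
}
```

For left = {1, 2, 3, 4, 6}, right = {2, 4, 5, 6}, and a filter keeping ids greater than 2, this gives matchTested = 3 and advanced = 2: one id was rejected by the filter, and the rest of the drop is due to the intersection alone.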

The stages that keep such a statistic are:

  • IXSCAN
  • FETCH
  • AND_SORTED
  • OR

Comment by J Rassi [ 04/Jun/15 ]

david.storch: do you mean remove the matchTested statistic for all stages, or just for the FETCH stage?

Comment by Qingyang Chen [ 04/Jun/15 ]

I've made some changes that should fix this problem, and one of them is in fetch.cpp. My fix is essentially just taking the ++_specificStats.matchTested out of the Filter::passes block.

In fetch.cpp (L210): It seems that matchTested would equal docsExamined whenever a filter exists, so it doesn't seem to add any information.
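A rough sketch of why the counter becomes redundant once the increment is moved out of the passes branch. This is a simplified model, not the code in fetch.cpp: `FetchCounters` and `fetchAll` are hypothetical, documents are modelled as integers, and an empty `std::function` stands in for "no filter".

```cpp
#include <functional>
#include <vector>

// Hypothetical counters for a simplified FETCH-like stage.
struct FetchCounters {
    long long docsExamined = 0;  // documents fetched
    long long matchTested = 0;   // documents tested against the filter
    long long advanced = 0;      // documents returned to the parent
};

// An empty std::function models a stage without a filter (an assumption of this sketch).
inline FetchCounters fetchAll(const std::vector<int>& docs,
                              const std::function<bool(int)>& filter) {
    FetchCounters c;
    for (int doc : docs) {
        ++c.docsExamined;
        if (filter) {
            ++c.matchTested;     // counted outside the "passes" branch
        }
        if (!filter || filter(doc)) {
            ++c.advanced;        // no filter, or the filter accepted the document
        }
    }
    return c;
}
```

In this model matchTested is either equal to docsExamined (filter present) or zero (no filter), so it carries no information beyond the existing counters.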

Comment by David Storch [ 04/Jun/15 ]

I would be happy to resolve this by removing the matchTested statistic.

Comment by J Rassi [ 04/Jun/15 ]

I'm not convinced that we should even be keeping track of matchTested for FETCH. If the FETCH has no filter, it seems like matchTested would always be zero; if the FETCH has a filter, it seems like it would always be equal to the value of the child's advanced counter. david.storch, is this correct, and what do you think?

Comment by Qingyang Chen [ 04/Jun/15 ]

FETCH doesn't seem to be currently outputting matchTested in executionStats. Should I add this line as well?
