[SERVER-17263] and_common-inl.h:37 assertion error on query Created: 12/Feb/15  Updated: 20/May/15  Resolved: 20/May/15

Status: Closed
Project: Core Server
Component/s: Querying
Affects Version/s: 2.6.4
Fix Version/s: None

Type: Bug
Priority: Critical - P2
Reporter: Jonathan Su
Assignee: J Rassi
Resolution: Incomplete
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File and_common-inl.log    
Issue Links:
Duplicate
is duplicated by SERVER-17262 working_set.cpp:68 assertion error on... Closed
Operating System: ALL
Participants:

 Description   

At around the same time as https://jira.mongodb.org/browse/SERVER-17262 started happening for us, we are also getting this error when performing a regular query:

Assertion failure dest->hasLoc() src/mongo/db/exec/and_common-inl.h 37

This is happening in production in a sharded cluster setup. There hasn't been any change to how we query or store data for a long time, but both assertion errors started happening recently and consistently. Not being able to query is breaking a critical path for us right now.

Any insights you may be able to share on this would be much appreciated.

Thank you!



 Comments   
Comment by Ramon Fernandez Marina [ 07/Apr/15 ]

jonathan@cloudwords.com, we haven't heard back from you for some time. We'd like to be able to reproduce this issue on our end. Would you be able to provide the information requested by Jason above?

Thanks,
Ramón.

Comment by Jonathan Su [ 13/Feb/15 ]

Hi Jason,

Thank you for your response. We are testing out switching off internalQueryPlannerEnableIndexIntersection. Interestingly, when I switched that off, it revealed a different issue: the sort query exceeds the in-memory limit, so I suppose it was previously relying on index intersection. That doesn't make sense to me, though, because the same data set and query/sort ran without this problem prior to 2.6 (and index intersection).

I am also trying to get the query plan detail for you but have not been successful yet. Will post here when I get it.

Jonathan

Comment by J Rassi [ 12/Feb/15 ]

To reason about what caused this error, I'll need to see what the full execution tree looks like for this query. I see from the stack trace that this plan is being invoked by the CachedPlanRunner, so we'd be able to get the full execution tree out of the plan cache (but only while the issue is being actively reproduced). To isolate the problem, could you reproduce this error and paste into a comment the output of running the following against the affected mongod?

var query = { $and: [ { c: "7976" }, { $or: [ { _id: null }, { $and: [ { sr: { $regex: "^en-za$", $options: "i" } }, { $or: [ { tt: { $elemMatch: { l: { $regex: "^es-ve$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-py$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-sv$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-pr$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-bo$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-cr$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-gt$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-do$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-la$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-cl$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-uy$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-es$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-ni$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-co$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-mx$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-pa$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-pe$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-ec$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-hn$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-us$", $options: "i" } } } }, { tt: { $elemMatch: { l: { $regex: "^es-ar$", $options: "i" } } } } ] }, { pp: { $all: [ { n: "cw_gp", v: "4239" } ] } } ] } ] } ] };
var sort = { h: 1 };
db.getSiblingDB("tmx").tmx.getPlanCache().listQueryShapes();
db.getSiblingDB("tmx").tmx.getPlanCache().getPlansByQuery(query,{},sort);

Separately, note that for the workaround above you may need to explicitly flush the plan cache for the affected collection after setting the value of the "internalQueryPlannerEnableIndexIntersection" parameter (a process restart will also flush the plan cache, but you'll need to set the parameter upon each restart). You can flush the plan cache by running the following: db.getSiblingDB("tmx").tmx.getPlanCache().clear().
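The two workaround steps discussed in this ticket (disable index intersection, then flush the plan cache) could be combined roughly as in the sketch below. This is only an illustration: `applyWorkaround` is a hypothetical helper name, not a shell built-in, and `conn` is assumed to behave like the mongo shell's `db` object when connected directly to an affected mongod. The read-back via `getParameter` is an extra check added here, not something requested in the ticket.

```javascript
// Sketch only: applyWorkaround is a hypothetical helper, not part of the mongo shell.
// `conn` is assumed to behave like the shell's `db` object.
function applyWorkaround(conn, dbName, collName) {
    // Step 1: disable index intersection plans at runtime.
    var setRes = conn.adminCommand({
        setParameter: 1,
        internalQueryPlannerEnableIndexIntersection: false
    });
    if (!setRes.ok) {
        throw new Error("setParameter failed: " + JSON.stringify(setRes));
    }

    // Step 2 (added check, an assumption of this sketch): read the value back.
    var getRes = conn.adminCommand({
        getParameter: 1,
        internalQueryPlannerEnableIndexIntersection: 1
    });
    if (getRes.internalQueryPlannerEnableIndexIntersection !== false) {
        throw new Error("parameter still enabled: " + JSON.stringify(getRes));
    }

    // Step 3: flush the plan cache for the affected collection so that
    // previously cached plans are not reused.
    conn.getSiblingDB(dbName).getCollection(collName).getPlanCache().clear();

    return getRes;
}

// In the mongo shell, run against each affected mongod:
// applyWorkaround(db, "tmx", "tmx");
```

Because the parameter is set at runtime here, the helper would need to be run again after every process restart, and on every affected mongod in the cluster.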

~ Jason Rassi

Comment by J Rassi [ 12/Feb/15 ]

Hi,

From the stack trace, I suspect this is a novel issue affecting the hash-based index intersection query execution stage (also known as "AND_HASH"). As a temporary workaround for your production system, you should disable index intersection query plans by issuing the following command against all affected mongod instances. The query planner will still be able to resolve all application queries with single-index plans.

db.adminCommand({setParameter:1, internalQueryPlannerEnableIndexIntersection: false})

Upgrading to 2.6.5 or greater (2.6.7 is the latest release in the 2.6 series) is another viable workaround for this issue, as AND_HASH is disabled by default in those releases.

~ Jason Rassi
