[SERVER-55098] [SBE] jstests/noPassthrough/read_majority.js hangs indefinitely Created: 09/Mar/21  Updated: 29/Oct/23  Resolved: 18/Apr/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 5.0.0-rc0

Type: Bug Priority: Major - P3
Reporter: Mihai Andrei Assignee: Justin Seyster
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Backwards Compatibility: Fully Compatible
Operating System: ALL
Steps To Reproduce:

buildscripts/resmoke.py --suites=no_passthrough --mongodSetParameters='{featureFlagSBE: true}' jstests/noPassthrough/read_majority.js

Sprint: Query Execution 2021-04-05, Query Execution 2021-04-19, Query Execution 2021-05-03
Participants:

 Comments   
Comment by Githook User [ 17/Apr/21 ]

Author:

{'name': 'Justin Seyster', 'email': 'justin.seyster@mongodb.com', 'username': 'jseyster'}

Message: SERVER-55098 read_majority.js test fixup
Branch: master
https://github.com/mongodb/mongo/commit/46f91e5bd299bdfb6618c3fe8a82df68eadd30b8

Comment by Justin Seyster [ 16/Apr/21 ]

An update to the update: The difference between the classic executor and SBE that results in the hang is that the classic executor acquires its AutoGetCollectionForReadMaybeLockFree using just the collection's NamespaceString, whereas SBE uses a NamespaceStringOrUUID that is populated with both the name and the UUID. I don't know the details as to why, but passing the UUID to the locking function causes it to wait for the next snapshot. I think that when locking by just the name, the locking functions don't know about the rename and so don't know that they should wait for the rename to be majority committed before proceeding. Regardless, the query engine eventually discovers the rename and bails, so this case is still covered.
The lock acquisition that Mihai's patch adds is also by namespace only, which is why that patch stops this test from waiting indefinitely.

Comment by Justin Seyster [ 15/Apr/21 ]

An update on this: Mihai's proposed fix for SERVER-55658 also fixes this hang. I have absolutely no idea why! Regardless, I don't see any evidence of incorrect behavior with or without that fix, so I'd say there's nothing more to do here.

I'll figure out how to clean up the red herring "isIxscan" failures, and we should be able to call this one good.

Comment by Justin Seyster [ 15/Apr/21 ]

After some time investigating, I have a rough handle on why we're seeing this behavior. The most important data point, though, is in the test's comments:

https://github.com/mongodb/mongo/blob/7590a2807ec192957ae00c2e72d171d4da3ab2b2/jstests/noPassthrough/read_majority.js#L202-L205

 

For this operation, timing out is actually the desired behavior. It seems this test is failing because SBE is doing a better job. I think the best move will be to modify this test so that both the classic behavior (returning an empty result set) and the SBE behavior (timing out) pass the test.
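As a rough illustration of what "both behaviors pass" could look like, here is a minimal sketch in plain JavaScript. It is not the committed test fixup. The helper name `isAcceptableOutcome` is hypothetical, and the sketch assumes the read is issued with a `maxTimeMS` so that the SBE case surfaces as a MaxTimeMSExpired error (code 50) rather than an indefinite wait.

```javascript
// Hypothetical helper (not from the actual fix): decide whether a raw
// command response matches either of the two outcomes described above.
const MAX_TIME_MS_EXPIRED = 50;  // numeric value of ErrorCodes.MaxTimeMSExpired

function isAcceptableOutcome(res) {
    if (res.ok === 1) {
        // Classic executor: the command succeeds but returns no documents.
        const batch = (res.cursor && res.cursor.firstBatch) || [];
        return batch.length === 0;
    }
    // SBE: the command waits on the snapshot and eventually times out.
    return res.code === MAX_TIME_MS_EXPIRED;
}

// Example responses, shaped like find command replies (values are made up).
const classicReply = {ok: 1, cursor: {id: 0, firstBatch: []}};
const sbeReply = {ok: 0, code: 50, codeName: "MaxTimeMSExpired"};
console.log(isAcceptableOutcome(classicReply), isAcceptableOutcome(sbeReply));
```

A test built this way would still reject a reply that returns documents, or one that fails with some other error.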

Comment by Martin Neupauer [ 30/Mar/21 ]

However, all the explain-related failures are simply a red herring. When I manually disable those asserts, we do hang.

Comment by Martin Neupauer [ 30/Mar/21 ]

It is not hanging on me.

However, it fails with this assert:

assert(isIxscan(db, getExplainPlan({version: 1})));

That does not make sense as the explain looks like this (i.e. it contains IXSCAN):

{
  "queryPlan" : {
    "stage" : "FETCH",
    "planNodeId" : 2,
    "inputStage" : {
      "stage" : "IXSCAN",
      "planNodeId" : 1,
      "keyPattern" : { "version" : 1 },
      "indexName" : "version_1",
      "isMultiKey" : false,
      "multiKeyPaths" : { "version" : [ ] },
      "isUnique" : false,
      "isSparse" : false,
      "isPartial" : false,
      "indexVersion" : 2,
      "direction" : "forward",
      "indexBounds" : { "version" : [ "[1.0, 1.0]" ] }
    }
  },
  "slotBasedPlan" : {
    "slots" : "$$RESULT=s5 $$RID=s6 env: { timeZoneDB = s1 (TimeZoneDatabase(America/Porto_Velho...Eire)) }",
    "stages" : "[2] nlj [] [s2]     left         [1] nlj [] [s3, s4]             left                 [1] project [s3 = KS(2B020104), s4 = KS(2B02FE04)]                 [1] limit 1                 [1] coscan             right                 [1] ixseek s3 s4 s2 [] @\"80ae21a1-c3e1-4080-9d4e-df77e2ed0f23\" @\"version_1\" true                         right         [2] limit 1         [2] seek s2 s5 s6 [] @\"80ae21a1-c3e1-4080-9d4e-df77e2ed0f23\" true     "
  }
}
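To show that the IXSCAN stage is reachable by following inputStage links from the queryPlan root of the explain output above, here is a minimal plain-JavaScript sketch. `planHasStage` is a hypothetical helper written for this illustration; it is not the `isIxscan` implementation from the test libraries, and it makes no claim about where that helper actually looks.

```javascript
// Hypothetical helper: depth-first search of an explain plan tree for a
// stage name, following the "inputStage" / "inputStages" child links.
function planHasStage(node, stageName) {
    if (node === null || typeof node !== "object") {
        return false;
    }
    if (node.stage === stageName) {
        return true;
    }
    const children = [];
    if (node.inputStage) {
        children.push(node.inputStage);
    }
    if (Array.isArray(node.inputStages)) {
        children.push(...node.inputStages);
    }
    return children.some((child) => planHasStage(child, stageName));
}

// Abbreviated copy of the explain output shown above.
const explain = {
    queryPlan: {
        stage: "FETCH",
        planNodeId: 2,
        inputStage: {stage: "IXSCAN", planNodeId: 1, indexName: "version_1"},
    },
};
console.log(planHasStage(explain.queryPlan, "IXSCAN"));
```

If a check like this succeeds on the queryPlan subtree while the assert still fails, that would suggest the helper is looking at a different part of the explain document, though the thread does not confirm the cause.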

Generated at Thu Feb 08 05:35:27 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.