[SERVER-79779] AsyncResultsMerger leaks shard cursor when getMore fails due to not primary error
Created: 07/Aug/23  Updated: 29/Oct/23  Resolved: 25/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.2.0-rc0, 7.0.3

Type: Bug
Priority: Major - P3
Reporter: Jordi Serra Torrens
Assignee: Foteini Alvanaki
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File repro-server-79779.js    
Issue Links:
Backports
Depends
Problem/Incident
causes PYTHON-3953 PyMongo should send killCursors on Ma... Closed
Related
is related to SERVER-81338 Improve the approach of sending killC... Open
Assigned Teams:
Query Execution
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v7.1, v7.0
Sprint: QE 2023-08-21, QE 2023-09-04, QE 2023-09-18, QE 2023-10-02
Participants:
Linked BF Score: 119

 Description   

After mongos has established a cursor within a multi-document transaction, if a shard steps down, subsequent requests fail with a NotWritablePrimary error (or fail later with InterruptedDueToReplStateChange). In this case, AsyncResultsMerger does not attempt to clean up the shard cursors, because NotWritablePrimary and InterruptedDueToReplStateChange are not part of the list of errors for which it issues killCursors. As a result, the shard cursor is leaked.

NotWritablePrimary and InterruptedDueToReplStateChange (or the NotPrimaryError category?) should be made part of that list.

Edit: 'LockTimeout' errors can also occur, and AsyncResultsMerger will also not attempt to clean up cursors in this case.
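
The allowlist behavior described above can be sketched as a small Python model. This is only an illustration, not the server's C++ code: `ASSUMED_KILL_LIST`, its single entry, and both function names are assumptions made for the example, since the actual list is not reproduced in this ticket. Only the three error names come from the description.

```python
# Illustrative model only; the real check lives in mongos (C++).

# Placeholder for the allowlist mentioned above. Its single entry is an
# assumption chosen for illustration, not the actual server list.
ASSUMED_KILL_LIST = frozenset({"ExceededTimeLimit"})

# Error names taken from this ticket's description.
TICKET_ERRORS = (
    "NotWritablePrimary",
    "InterruptedDueToReplStateChange",
    "LockTimeout",
)


def sends_kill_cursors(error_name, kill_list=ASSUMED_KILL_LIST):
    """Return True if a getMore failure with this error triggers cleanup."""
    return error_name in kill_list


def leaked_errors(kill_list=ASSUMED_KILL_LIST):
    """Return the ticket errors that would leave the shard cursor open."""
    return [e for e in TICKET_ERRORS if not sends_kill_cursors(e, kill_list)]
```

With the assumed list, `leaked_errors()` returns all three ticket errors. Passing `ASSUMED_KILL_LIST | set(TICKET_ERRORS)` returns an empty list, which corresponds to the proposal to add those errors to the list.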



 Comments   
Comment by Githook User [ 28/Sep/23 ]

Author:

{'name': 'Foteini Alvanaki', 'email': 'foteini.alvanaki@mongodb.com', 'username': ''}

Message: SERVER-79779 avoid tautological compare
Branch: v7.0
https://github.com/mongodb/mongo/commit/600d218aa077119b7943f5b7812ffc5c540a65bc

Comment by Githook User [ 26/Sep/23 ]

Author:

{'name': 'Billy Donahue', 'email': 'billy.donahue@mongodb.com', 'username': 'BillyDonahue'}

Message: SERVER-79779 avoid tautological compare
Branch: master
https://github.com/mongodb/mongo/commit/6c74a2981a51caf7254f804601e0ecb7c06be3eb

Comment by Githook User [ 25/Sep/23 ]

Author:

{'name': 'Foteini Alvanaki', 'email': 'foteini.alvanaki@mongodb.com', 'username': ''}

Message: SERVER-79779 ARM sends killCursors to all non-exhausted cursors which have not already been killed
Branch: v7.0
https://github.com/mongodb/mongo/commit/8b0c71dd9434f504bcae792473e43df5fd3112e1

Comment by Githook User [ 22/Sep/23 ]

Author:

{'name': 'Foteini Alvanaki', 'email': 'foteini.alvanaki@mongodb.com', 'username': ''}

Message: SERVER-79779 ARM sends killCursors to all non-exhausted cursors which have not already been killed
Branch: master
https://github.com/mongodb/mongo/commit/f94d52e533bdcb3b7ff761fa33e7e028e6f7087a
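
One reading of the commit message above is that cleanup no longer depends on which error occurred. The following Python sketch models that selection rule under stated assumptions: `RemoteCursor`, `kill_sent`, and `cursors_to_kill` are hypothetical names for this example, not identifiers from the commit. It relies on the MongoDB convention that a cursor id of 0 means the cursor is exhausted.

```python
from dataclasses import dataclass


@dataclass
class RemoteCursor:
    """Hypothetical record of one shard cursor tracked by the merger."""
    shard: str
    cursor_id: int           # 0 means the shard cursor is exhausted
    kill_sent: bool = False  # True once killCursors was already issued


def cursors_to_kill(remotes):
    """Select every non-exhausted cursor that has not already been killed."""
    return [r for r in remotes if r.cursor_id != 0 and not r.kill_sent]
```

For example, given one open cursor, one exhausted cursor, and one cursor that was already killed, only the open cursor is selected.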

Comment by Foteini Alvanaki [ 14/Aug/23 ]

Even though the change to check for errors in the isNotPrimaryError category has been merged, test failures have occurred again. I am investigating in which cases cursors are left open.

Comment by Githook User [ 11/Aug/23 ]

Author:

{'name': 'Foteini Alvanaki', 'email': 'foteini.alvanaki@mongodb.com', 'username': ''}

Message: SERVER-79779 clean cursor when error is isNotPrimaryError category
Branch: master
https://github.com/mongodb/mongo/commit/9577c29b45dd2d65a3cc4ad20cd0b2bf004b5f08

Comment by Kevin Cherkauer [ 08/Aug/23 ]

foteini.alvanaki@mongodb.com assigning this to you as it is a hot BF, so it needs an owner, and it doesn't look like you have a BF currently.

Comment by Kevin Cherkauer [ 08/Aug/23 ]

jordi.serra-torrens@mongodb.com Thank you, I moved these back to QE and will look for an owner.

Comment by Jordi Serra Torrens [ 07/Aug/23 ]

I'm wondering what the reasoning is for the current short list of errors for which AsyncResultsMerger issues killCursors, and whether we could take a more holistic approach to it.

Generated at Thu Feb 08 06:41:52 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.