[SERVER-55173] Segmentation fault in WiredTigerSession::releaseCursor Created: 11/Mar/21 Updated: 29/Oct/23 Resolved: 28/Mar/22 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | 4.4.4, 4.2.14, 5.0.0-rc3 |
| Fix Version/s: | 5.3.2, 5.0.8, 6.0.0-rc0, 4.4.15, 4.2.22 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Kelsey Schubert | Assignee: | Yuhong Zhang |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | Atlas_Failure_Analysis, query-director-triage | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||
| Backwards Compatibility: | Fully Compatible | ||||||||||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||||||||||
| Backport Requested: |
v5.3, v5.0, v4.4, v4.2
|
||||||||||||||||||||||||||||
| Sprint: | Query Execution 2021-06-14, Query Execution 2021-06-28, Query Execution 2021-07-12, Query Execution 2021-07-26, QE 2021-08-09, QE 2021-08-23, QE 2021-09-06, QE 2021-09-20, Execution Team 2021-11-01, Execution Team 2021-11-15, Execution Team 2021-11-29, Execution Team 2021-12-13, Execution Team 2021-12-27, Execution Team 2022-01-10, Execution Team 2022-01-24, Execution Team 2022-02-07, Execution Team 2022-02-21, Execution Team 2022-03-07, Execution Team 2022-03-21, Execution Team 2022-04-04 | ||||||||||||||||||||||||||||
| Participants: | |||||||||||||||||||||||||||||
| Case: | (copied to CRM) | ||||||||||||||||||||||||||||
| Linked BF Score: | 10 | ||||||||||||||||||||||||||||
| Description |
|
| Comments |
| Comment by Githook User [ 11/Jun/22 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Yuhong Zhang', 'email': 'yuhong.zhang@mongodb.com', 'username': 'YuhongZhang98'}Message: (cherry picked from commit bc940d6b0adc9254b62e9daf9c92d2c92f8b083d) | |||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 09/May/22 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Yuhong Zhang', 'email': 'yuhong.zhang@mongodb.com', 'username': 'YuhongZhang98'}Message: (cherry picked from commit bc940d6b0adc9254b62e9daf9c92d2c92f8b083d) | |||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 19/Apr/22 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Yuhong Zhang', 'email': 'yuhong.zhang@mongodb.com', 'username': 'YuhongZhang98'}Message: (cherry picked from commit bc940d6b0adc9254b62e9daf9c92d2c92f8b083d) | |||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 15/Apr/22 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Yuhong Zhang', 'email': 'yuhong.zhang@mongodb.com', 'username': 'YuhongZhang98'}Message: (cherry picked from commit bc940d6b0adc9254b62e9daf9c92d2c92f8b083d) | |||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Githook User [ 28/Mar/22 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
Author: {'name': 'Yuhong Zhang', 'email': 'yuhong.zhang@mongodb.com', 'username': 'YuhongZhang98'}Message: | |||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Kyle Suarez [ 27/Sep/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
After a good discussion, we are pursuing matthew.saltz's suggestion to wait in the WiredTigerCursorCache for all cursors to be returned to the cache before proceeding to shut down. Out of our several proposed solutions, this one is the simplest, and we will get quick feedback on whether or not this proposal is viable. Sending this to the Storage Execution team backlog as we've agreed it makes the most sense for them to explore this change. CC louis.williams | |||||||||||||||||||||||||||||||||||||||||||||||||||
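For readers following the thread, the "wait for all cursors to be returned before shutting down" idea can be sketched roughly as below. This is a minimal illustration, not the committed fix: `CursorCacheSketch`, `checkOut`, `release`, `shutdown`, and `demo` are hypothetical names, and the real WiredTigerCursorCache interface is not shown in this ticket.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a cursor cache; not MongoDB's actual API.
class CursorCacheSketch {
public:
    // Hand out a cursor. Refuses once shutdown has begun.
    bool checkOut() {
        std::lock_guard<std::mutex> lk(_mutex);
        if (_shuttingDown) return false;
        ++_outstanding;
        return true;
    }

    // Return a cursor to the cache and wake a waiting shutdown() call.
    void release() {
        std::lock_guard<std::mutex> lk(_mutex);
        --_outstanding;
        if (_outstanding == 0) _allReturned.notify_all();
    }

    // Block until every checked-out cursor has been returned.
    void shutdown() {
        std::unique_lock<std::mutex> lk(_mutex);
        _shuttingDown = true;
        _allReturned.wait(lk, [this] { return _outstanding == 0; });
    }

    int outstanding() const {
        std::lock_guard<std::mutex> lk(_mutex);
        return _outstanding;
    }

private:
    mutable std::mutex _mutex;
    std::condition_variable _allReturned;
    int _outstanding = 0;
    bool _shuttingDown = false;
};

// Usage: a worker returns its cursor shortly after shutdown starts waiting.
inline int demo() {
    CursorCacheSketch cache;
    cache.checkOut();
    std::thread worker([&cache] {
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        cache.release();
    });
    cache.shutdown();  // returns only after the worker's release()
    worker.join();
    return cache.outstanding();
}
```

The point of the design is that shutdown cannot tear down the storage engine while another thread still holds a cursor it will later try to release.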
| Comment by Sanjeeth Mallesh [ 02/Sep/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
We're encountering the same error on mongod versions `4.2.12` and `4.2.15`. Signal 11 is raised and the mongod service enters a failed/crashed state. The segmentation fault is observed on all other related nodes running the same mongod versions (`4.2.12` and `4.2.15`).
| |||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Ethan Zhang (Inactive) [ 10/Jun/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
Per discussion during my meeting with Dave, sending this to him to investigate and Dave can hand it off with what he finds later. | |||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Louis Williams [ 17/May/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
Thanks robert@appsignal.com, this confirms my suspicion in my previous comment. This cursorFreer needs to be run while still holding a lock. | |||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Robert Beekman [ 17/May/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
We're seeing the same error when shutting down a `4.2.12` version Mongod. This error happened on multiple servers when shutting down (e.g. all 3 nodes in the same replica set had this segfault).
| |||||||||||||||||||||||||||||||||||||||||||||||||||
| Comment by Louis Williams [ 19/Mar/21 ] | |||||||||||||||||||||||||||||||||||||||||||||||||||
|
The cursorFreer calls into the storage engine to reset the cursor without holding locks, which technically isn't allowed. Is this happening during a rollback or shutdown? |
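As a rough illustration of the concern above, the sketch below resets a cursor only while holding a lock that also excludes shutdown. It is a simplified model under stated assumptions: `EngineSketch`, `CursorHolderSketch`, and `freeCursor` are hypothetical names, and the ticket does not say which lock the real cursorFreer would need to hold.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the storage engine; not a MongoDB or WiredTiger type.
struct EngineSketch {
    std::mutex lock;   // assumed: the lock that excludes rollback/shutdown
    bool open = true;  // false once the engine has been shut down
    int resets = 0;    // number of cursor resets performed

    void close() {
        std::lock_guard<std::mutex> lk(lock);
        open = false;
    }
};

// Hypothetical owner of a cursor that must be reset before it is cached.
class CursorHolderSketch {
public:
    explicit CursorHolderSketch(EngineSketch& engine) : _engine(engine) {}

    // Reset the cursor only while holding the engine lock, and only if the
    // engine is still open. Returns false if the engine is already closed.
    bool freeCursor() {
        std::lock_guard<std::mutex> lk(_engine.lock);
        if (!_engine.open) return false;
        ++_engine.resets;  // stands in for the call into the storage engine
        return true;
    }

private:
    EngineSketch& _engine;
};
```

Because the reset and the shutdown take the same lock, the reset either completes before the engine closes or observes that it is closed and skips the call.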