[SERVER-21311] Segfault in RocksDB running grow_hash_table.js and sortj.js in the jsCore_small_oplog_rs suite Created: 05/Nov/15  Updated: 20/Mar/16  Resolved: 20/Mar/16

Status: Closed
Project: Core Server
Component/s: Storage
Affects Version/s: None
Fix Version/s: 3.2.0-rc3

Type: Bug Priority: Major - P3
Reporter: Spencer Brody (Inactive) Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: HTML File grow_hash_table.html, HTML File sortj.html
Issue Links:
Related
related to SERVER-21617 Re-enable jstestfuzz task on the Rock... Closed
is related to SERVER-21322 Disable jstestfuzz on the RocksDB var... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Participants:
Linked BF Score: 0

Comments
Comment by Igor Canadi [ 23/Nov/15 ]

Thanks Kamran!

Comment by Kamran K. [ 22/Nov/15 ]

igor, I filed a separate ticket to re-enable jstestfuzz (SERVER-21617) now that this bug seems to be patched.

Comment by Igor Canadi [ 10/Nov/15 ]

Thanks Mathias, I'll keep digging. I was able to repro the failure and also confirmed that it no longer reproduces with the snapshot manager turned off: https://github.com/mongodb-partners/mongo-rocks/commit/617675b5c44bcdfa9503d0e0375b5eab8fa2165a

Comment by Mathias Stearn [ 10/Nov/15 ]

igor, the SnapshotThread should be completely shut down before we get to the storage engine's cleanShutdown(). If you find it isn't, that is a bug that we need to fix.

Comment by Spencer Brody (Inactive) [ 10/Nov/15 ]

Hi Igor,
I don't work on the storage engine code, so I'm not sure. Perhaps redbeard0531 or geert.bosch would be better able to answer your questions.
Mathias/Geert, can you please take a look at these questions from Igor?

Comment by Igor Canadi [ 10/Nov/15 ]

I temporarily disabled RocksDB's support for the snapshot manager (the read committed feature). That should hopefully keep the tests green while we investigate.

BTW, do you think jstestfuzz failed because of the same issue? The stack trace in that failure is vague, I'm assuming because it's a release build. If it's the same failure, would you mind re-enabling the test?

Comment by Igor Canadi [ 10/Nov/15 ]

Spencer – is it possible that SnapshotThread is running when cleanShutdown() is called?

I'm still trying to repro, but the assertion seems to come from SnapshotThread, and this is my most naive explanation.

Comment by Igor Canadi [ 06/Nov/15 ]

Thanks Spencer, I will take a look!

Comment by Spencer Brody (Inactive) [ 05/Nov/15 ]

Logs attached. igor, can you take a look?
