WiredTiger object freed without properly updating associated Python object

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major - P3
    • Affects Version/s: None
    • Component/s: Test Python
    • Storage Engines, Storage Engines - Foundations
    • 5

      • Currently, the SWIG layer links each WT native object (session, cursor) to its associated Python object through an internal pointer on the C side. This pointer holds the address of the corresponding Python object.
      • When a parent object, for example a connection, is closed, all of its child objects are closed implicitly. At that point the associated Python object is set to None, so that accessing the dangling pointer from Python code does not cause a hard crash.
      • When an object is closed explicitly from Python code, the internal pointer is set to NULL to keep the Python object's reference count correct.
      • The problem arises when the close method of a cursor (or a session) returns EBUSY, which is technically an error but must leave the object in a valid state. When the caller then attempts to call close again, the object must not already have been freed or closed.
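The intended close semantics described above can be modelled in a few lines of Python. This is only an illustrative sketch, not the SWIG-generated code: `NativeCursor`, `PyCursor`, and the `busy_closes` parameter are hypothetical names introduced here, and the "native" object is a plain Python stand-in.

```python
import errno


class NativeCursor:
    """Hypothetical stand-in for a native WT cursor."""

    def __init__(self, busy_closes=0):
        # Number of close attempts that should report EBUSY before succeeding.
        self.busy_closes = busy_closes
        self.freed = False

    def close(self):
        if self.freed:
            raise RuntimeError("use after free")
        if self.busy_closes > 0:
            self.busy_closes -= 1
            return errno.EBUSY  # an error, but the object stays valid
        self.freed = True
        return 0


class PyCursor:
    """Hypothetical stand-in for the Python-side wrapper object."""

    def __init__(self, native):
        self._native = native

    @property
    def is_open(self):
        return self._native is not None

    def close(self):
        if self._native is None:
            raise ValueError("cursor already closed")
        ret = self._native.close()
        if ret == 0:
            # Detach from the native object only after a successful close.
            self._native = None
        return ret


cursor = PyCursor(NativeCursor(busy_closes=1))
first = cursor.close()   # returns errno.EBUSY, wrapper is still open
second = cursor.close()  # returns 0, wrapper is now detached
```

In this model a retry after EBUSY is always safe, because the native object is freed and the wrapper detached in the same successful call.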

      Most of the time the above scenario works well. Occasionally, however, a WT cursor gets freed without the Python side being updated. The result is a segfault.

      To reproduce the problem, run the following shell command (from the build directory):

      for i in {1..1000}; do echo "[Round $i]"; python3 ../test/suite/run.py test_ovfl01 -v 2 || break; done
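For environments without a bash-compatible shell, the same retry loop could be written in Python. This is a rough equivalent under assumptions: `run_until_failure` is a hypothetical helper, and the commented invocation reuses the path and arguments from the shell command above.

```python
import subprocess
import sys


def run_until_failure(cmd, rounds=1000):
    """Run cmd up to `rounds` times.

    Returns the 1-based round number of the first non-zero exit status,
    or None if every round succeeded.
    """
    for i in range(1, rounds + 1):
        print(f"[Round {i}]", flush=True)
        result = subprocess.run(cmd)
        # A crash such as a segfault also yields a non-zero return code.
        if result.returncode != 0:
            return i
    return None


# Example invocation from the build directory (not executed here):
# run_until_failure(["python3", "../test/suite/run.py", "test_ovfl01", "-v", "2"])
```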

      Investigation:

      1. It seems that __curfile_close frees the cursor prematurely.
      2. After it calls __wt_cursor_close, it attempts to release the dhandle via __wt_session_release_dhandle, which may return EBUSY.
      3. The EBUSY error is then propagated all the way to the caller of WT_CURSOR::close. The caller is the Python test test_ovfl01.
      4. On a subsequent call to WT_CURSOR::close, the underlying cursor is already freed. The Python code holds a garbage pointer at this point, and a segmentation fault occurs.
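The suspected sequence can be illustrated with a small Python model. It is a sketch of the ordering described in the steps above, not the WiredTiger implementation: `FakeCursor`, `curfile_close_model`, and `release_dhandle_busy` are hypothetical names, the model assumes the free happens during the __wt_cursor_close step, and a Python exception stands in for what would be undefined behavior (here, a segfault) in C.

```python
import errno


class FakeCursor:
    """Hypothetical stand-in for the underlying cursor memory."""

    def __init__(self):
        self.freed = False


def release_dhandle_busy():
    """Stand-in for a dhandle release that reports EBUSY."""
    return errno.EBUSY


def curfile_close_model(cursor, release_dhandle):
    """Model of the suspected ordering: free first, release the dhandle second."""
    if cursor.freed:
        # In C this would be a use-after-free; the model raises instead.
        raise RuntimeError("close called on freed cursor")
    cursor.freed = True        # assumed: the cursor is freed in this step
    return release_dhandle()   # may return EBUSY, which is propagated


cursor = FakeCursor()
first = curfile_close_model(cursor, release_dhandle_busy)

# The caller sees EBUSY, assumes the cursor is still valid, and retries.
try:
    curfile_close_model(cursor, release_dhandle_busy)
    second = "ok"
except RuntimeError as exc:
    second = str(exc)
```

The first call reports EBUSY even though the cursor is already gone, so the retry operates on freed state.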

            Assignee:
            [DO NOT USE] Backlog - Storage Engines Team
            Reporter:
            Alex Blekhman