Uploaded image for project: 'WiredTiger'
  1. WiredTiger
  2. WT-9405

Fix segfault when opening history store cursor during rollback

    • 8
    • Storage Engines - 2022-09-05, Storage Engines - 2022-09-19
    • v6.1

      Currently there is a segmentation fault hit when opening a hs cursor during rollback. The assertion hit is:

      if (WT_READING_CHECKPOINT(session)) {
               /*
               * Propagate the checkpoint setting to the history cursor. Use the indicated history store
               * checkpoint. If that's null, it means there is no history store checkpoint to read and we
               * aren't supposed to come here.
               */
              WT_ASSERT(session, session->hs_checkpoint != NULL);
              len = strlen("checkpoint=") + strlen(session->hs_checkpoint) + 1;
              WT_RET(__wt_malloc(session, len, &tmp));
              WT_ERR(__wt_snprintf(tmp, len, "checkpoint=%s", session->hs_checkpoint));
              open_cursor_cfg[1] = tmp;
       }
      

      The segmentation fault happens when we are trying to access the members from the session's dhandle. At the faulting assert, the session's handle should already have an allocated dhandle with a checkpoint conveyed through this check:

      #define WT_READING_CHECKPOINT(s) ((s)->dhandle != NULL && WT_DHANDLE_IS_CHECKPOINT((s)->dhandle))
      

      Furthermore taking a closer look at the session's dhandle structure from the core dump there are further inconsistencies.
      The backtrace comes from BF-25450 and is shown as:

      #0  0x00007f00e3b03715 in __strlen_avx2 () from /lib64/libc.so.6
      #1  0x00007f00e39f77c7 in vfprintf () from /lib64/libc.so.6
      #2  0x00007f00e3ab0d4d in __vsnprintf_chk () from /lib64/libc.so.6
      #3  0x00007f00d77ce6f3 in __wt_vsnprintf_len_incr (buf=<optimized out>, size=<optimized out>, retsizep=retsizep@entry=0x7f00aa2666c0, fmt=<optimized out>, ap=ap@entry=0x7f00aa266570) at /usr/include/bits/stdio2.h:80
      #4  0x00007f00d782eeb7 in __wt_snprintf_len_set (buf=buf@entry=0x7f00aa2667b6 "\"session_dhandle_name\":\"", size=<optimized out>, retsizep=retsizep@entry=0x7f00aa2666c0, fmt=fmt@entry=0x7f00d7893805 "\"session_dhandle_name\":\"%s\",")
          at src/third_party/wiredtiger/src/include/misc_inline.h:120
      #5  0x00007f00d782f94f in __eventv (session=0x7f00d0308420, is_json=true, error=error@entry=0, func=func@entry=0x7f00d789ac20 <__PRETTY_FUNCTION__.47640> "__curhs_file_cursor_open", line=line@entry=37, category=category@entry=WT_VERB_DEFAULT, 
          level=WT_VERBOSE_ERROR, fmt=0x7f00d788c2bf "%s", ap=0x7f00aa2677e0) at src/third_party/wiredtiger/src/support/err.c:284
      #6  0x00007f00d765f5ab in __wt_errx_func (session=session@entry=0x7f00d0308420, func=func@entry=0x7f00d789ac20 <__PRETTY_FUNCTION__.47640> "__curhs_file_cursor_open", line=line@entry=37, category=category@entry=WT_VERB_DEFAULT, 
          fmt=fmt@entry=0x7f00d788c2bf "%s") at src/third_party/wiredtiger/src/support/err.c:488
      #7  0x00007f00d763f3b4 in __curhs_file_cursor_open (session=0x7f00d0308420, owner=0x0, cursorp=0x7f00cefc43a0) at src/third_party/wiredtiger/src/cursor/cur_hs.c:37
      #8  0x00007f00d774bd1d in __wt_curhs_open (session=session@entry=0x7f00d0308420, owner=owner@entry=0x0, cursorp=cursorp@entry=0x7f00aa2679c8) at src/third_party/wiredtiger/src/cursor/cur_hs.c:1165
      #9  0x00007f00d7859549 in __rollback_to_stable_btree_hs_truncate (session=0x7f00d0308420, btree_id=37) at src/third_party/wiredtiger/src/txn/txn_rollback_to_stable.c:1409
      #10 0x00007f00d786323e in __rollback_to_stable_btree_apply (session=0x7f00d0308420, uri=0x7f00d4fdac38 "file:index-33--5375033861641645217.wt", 
          config=0x7f00ce0eb5cf "access_pattern_hint=none,allocation_size=4KB,app_metadata=(formatVersion=8),assert=(write_timestamp=on),block_allocation=best,block_compressor=,cache_resident=false,checksum=on,collator=,columns=,dict"..., 
          rollback_timestamp=7102467085296467976) at src/third_party/wiredtiger/src/txn/txn_rollback_to_stable.c:1697
      #11 0x00007f00d78637b9 in __rollback_to_stable_btree_apply_all (rollback_timestamp=7102467085296467976, session=0x7f00d0308420) at src/third_party/wiredtiger/src/txn/txn_rollback_to_stable.c:1771
      #12 __rollback_to_stable (session=0x7f00d0308420, no_ckpt=<optimized out>) at src/third_party/wiredtiger/src/txn/txn_rollback_to_stable.c:1838
      #13 0x00007f00d78645ce in __wt_rollback_to_stable (session=<optimized out>, session@entry=0x7f00d0304020, cfg=cfg@entry=0x7f00aa267ed0, no_ckpt=no_ckpt@entry=false) at src/third_party/wiredtiger/src/txn/txn_rollback_to_stable.c:1878
      #14 0x00007f00d770c164 in __conn_rollback_to_stable (wt_conn=<optimized out>, config=0x0) at src/third_party/wiredtiger/src/conn/conn_api.c:1341
      

      Local Variables:

      (gdb) p *session->dhandle
      $10 = {rwlock = {u = {v = 14829735431805717965, s = {current = 205 '\315', next = 205 '\315', reader = 205 '\315', readers_queued = 205 '\315', readers_active = 3452816845}}, stat_read_count_off = -12851, stat_write_count_off = -12851, 
          stat_app_usecs_off = -12851, stat_int_usecs_off = -12851, stat_session_usecs_off = -12851, cond_readers = 0xcdcdcdcdcdcdcdcd, cond_writers = 0xcdcdcdcdcdcdcdcd}, q = {tqe_next = 0xcdcdcdcdcdcdcdcd, tqe_prev = 0xcdcdcdcdcdcdcdcd}, hashq = {
          tqe_next = 0xcdcdcdcdcdcdcdcd, tqe_prev = 0xcdcdcdcdcdcdcdcd}, name = 0xcdcdcdcdcdcdcdcd <error: Cannot access memory at address 0xcdcdcdcdcdcdcdcd>, name_hash = 14829735431805717965, 
        checkpoint = 0xcdcdcdcdcdcdcdcd <error: Cannot access memory at address 0xcdcdcdcdcdcdcdcd>, checkpoint_order = -3617008641903833651, cfg = 0xcdcdcdcdcdcdcdcd, 
        meta_base = 0xcdcdcdcdcdcdcdcd <error: Cannot access memory at address 0xcdcdcdcdcdcdcdcd>, meta_base_length = 14829735431805717965, orig_meta_base = 0xcdcdcdcdcdcdcdcd <error: Cannot access memory at address 0xcdcdcdcdcdcdcdcd>, 
        session_ref = 3452816845, session_inuse = -842150451, excl_ref = 3452816845, timeofdeath = 14829735431805717965, excl_session = 0xcdcdcdcdcdcdcdcd, dsrc = 0xcdcdcdcdcdcdcdcd, handle = 0xcdcdcdcdcdcdcdcd, type = 3452816845, compact_skip = 205, 
        close_lock = {lock = {__data = {__lock = -842150451, __count = 3452816845, __owner = -842150451, __nusers = 3452816845, __kind = -842150451, __spins = -12851, __elision = -12851, __list = {__prev = 0xcdcdcdcdcdcdcdcd, __next = 0xcdcdcdcdcdcdcdcd}}, 
            __size = '\315' <repeats 40 times>, __align = -3617008641903833651}, name = 0xcdcdcdcdcdcdcdcd <error: Cannot access memory at address 0xcdcdcdcdcdcdcdcd>, stat_count_off = -12851, stat_app_usecs_off = -12851, stat_int_usecs_off = -12851, 
          stat_session_usecs_off = -12851, initialized = -51 '\315'}, stats = {0xcdcdcdcdcdcdcdcd <repeats 23 times>}, stat_array = 0xcdcdcdcdcdcdcdcd, flags = 3452816845, ts_flags = 3452816845}
      (gdb) p session
      $11 = (WT_SESSION_IMPL *) 0x7f00d0308420
      

      Taking a closer look at the session variable, we can see that dhandle memory is actually freed, from the weird memory 0xcdcdcdcdcdcdcdcd, which is how WiredTiger deals with freeing memory.

      Problem
      Summarising the evidence from above, I assume that we are dealing with a dhandle being discarded, and when we are performing rollback, we open the history cursor which attempts to read the discarded dhandle's checkpoint and ends up segmentation faulting.

      I have also narrowed down that this has something to do with the checkpoint cursor change in WT-5927.

            Assignee:
            jie.chen@mongodb.com Jie Chen
            Reporter:
            jie.chen@mongodb.com Jie Chen
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

              Created:
              Updated:
              Resolved: