  1. WiredTiger
  2. WT-6892

Segfault after wiredtiger open with repeatable read built in

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None

      Description

      Creating this to track a possible bug identified in a patch build. The patch was trying to reproduce the WT_NOTFOUND bug that's been hanging around.

      Essentially, I created a MongoDB patch build with this diff (applied on top of v4.4):

      src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp
      diff --git a/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp b/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp
      index f1d127e774..c74f7503b1 100644
      --- a/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp
      +++ b/src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp
      @@ -1597,6 +1597,15 @@ Status WiredTigerRecordStore::updateRecord(OperationContext* opCtx,
           invariant(c);
           setKey(c, id);
           int ret = wiredTigerPrepareConflictRetry(opCtx, [&] { return c->search(c); });
      +    if (ret == 0) {
      +        std::int64_t key;
      +        c->get_key(c, &key);
      +        c->flags = c->flags | WT_CURSTD_DEBUG_RESET_EVICT;
      +        c->reset(c);
      +        c->set_key(c, key);
      +        c->flags = c->flags & ~WT_CURSTD_DEBUG_RESET_EVICT;
      +        ret = wiredTigerPrepareConflictRetry(opCtx, [&] { return c->search(c); });
      +    }
           invariantWTOK(ret);
           WT_ITEM old_value;
       

      I then ran the patch build on mongodb-mongo-v4.4, and it reproduced both a segfault and a WT_NOTFOUND.

      [cpp_unit_test:storage_wiredtiger_prefixed_record_store_and_index_test] 2020-11-10T10:18:57.291+0000 | 2020-11-10T10:18:57.291Z I  STORAGE  4795906 [main] "WiredTiger opened","attr":{"durationMillis":19}
      [cpp_unit_test:storage_wiredtiger_prefixed_record_store_and_index_test] 2020-11-10T10:18:57.291+0000 | 2020-11-10T10:18:57.291Z I  RECOVERY 23987   [main] "WiredTiger recoveryTimestamp","attr":{"recoveryTimestamp":{"$timestamp":{"t":0,"i":0}}}
      [cpp_unit_test:storage_wiredtiger_prefixed_record_store_and_index_test] 2020-11-10T10:18:57.304+0000 | 2020-11-10T10:18:57.304Z F  -        23083   [main] "Invariant failure","attr":{"expr":"ret","error":"UnknownError: -31803: WT_NOTFOUND: item not found","file":"src/mongo/db/storage/wiredtiger/wiredtiger_record_store.cpp","line":1609}
      [cpp_unit_test:storage_wiredtiger_prefixed_record_store_and_index_test] 2020-11-10T10:18:57.304+0000 | 2020-11-10T10:18:57.304Z F  -        23084   [main] "\n\n***aborting after invariant() failure\n\n"
      [cpp_unit_test:storage_wiredtiger_prefixed_record_store_and_index_test] 2020-11-10T10:18:57.304+0000 | 2020-11-10T10:18:57.304Z F  CONTROL  4757800 [main] "Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}
       
      [cpp_unit_test:storage_wiredtiger_prefixed_record_store_and_index_test] 2020-11-10T11:42:06.784+0000 | 2020-11-10T11:42:06.783Z I  STORAGE  4795906 [main] "WiredTiger opened","attr":{"durationMillis":22}
      [cpp_unit_test:storage_wiredtiger_prefixed_record_store_and_index_test] 2020-11-10T11:42:06.785+0000 | 2020-11-10T11:42:06.783Z I  RECOVERY 23987   [main] "WiredTiger recoveryTimestamp","attr":{"recoveryTimestamp":{"$timestamp":{"t":0,"i":0}}}
      [cpp_unit_test:storage_wiredtiger_prefixed_record_store_and_index_test] 2020-11-10T11:42:06.799+0000 | 2020-11-10T11:42:06.799Z F  CONTROL  4757800 [main] "Writing fatal message","attr":{"message":"Invalid access at address: 0x13e400000000"}
      [cpp_unit_test:storage_wiredtiger_prefixed_record_store_and_index_test] 2020-11-10T11:42:06.799+0000 | 2020-11-10T11:42:06.799Z F  CONTROL  4757800 [main] "Writing fatal message","attr":{"message":"Got signal: 11 (Segmentation fault).\n"}
      

      The scenario seen in this test doesn't feel like the same one as those we're investigating. The failing test is a unit test: storage_wiredtiger_prefixed_record_store_and_index_test
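
For reference, the search, reset, re-search sequence that the diff adds to updateRecord can be sketched in isolation. This is only an illustration under stated assumptions: `FakeCursor`, `kNotFound`, and `searchResetSearch` are hypothetical names, the cursor is a plain in-memory stand-in rather than the WiredTiger cursor API, and it does not model eviction or the `WT_CURSTD_DEBUG_RESET_EVICT` flag. It shows only the expectation the patch relies on: a key found by the first search should still be found after the cursor is reset and the key is set again.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Same numeric value the log above reports for WT_NOTFOUND.
constexpr int kNotFound = -31803;

// Hypothetical in-memory stand-in for a cursor; NOT the WiredTiger API.
class FakeCursor {
public:
    explicit FakeCursor(const std::map<std::int64_t, std::string>& table) : _table(table) {}

    void setKey(std::int64_t key) {
        _key = key;
        _hasKey = true;
    }

    std::int64_t getKey() const {
        assert(_hasKey);  // a key must have been set before reading it back
        return _key;
    }

    // Returns 0 if the current key exists in the table, kNotFound otherwise.
    int search() const {
        if (!_hasKey)
            return kNotFound;
        return _table.count(_key) > 0 ? 0 : kNotFound;
    }

    // Forgets the key, so the caller must set it again before searching.
    void reset() { _hasKey = false; }

private:
    const std::map<std::int64_t, std::string>& _table;
    std::int64_t _key = 0;
    bool _hasKey = false;
};

// Mirrors the shape of the diff: search, save the key, reset, restore the key, search again.
int searchResetSearch(FakeCursor& c, std::int64_t id) {
    c.setKey(id);
    int ret = c.search();
    if (ret == 0) {
        const std::int64_t key = c.getKey();
        c.reset();
        c.setKey(key);
        ret = c.search();
    }
    return ret;
}
```

With this stand-in, calling `searchResetSearch` for a key that exists returns 0 both before and after the reset. The patch build above instead hit WT_NOTFOUND on the second search against the real cursor.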

            People

            Assignee:
            luke.pearson Luke Pearson
            Reporter:
            luke.pearson Luke Pearson