WiredTiger / WT-12182

Investigate tier_storage_copy failure in schema_abort

    • Storage Engines

      Running schema_abort with tiered storage occasionally (1 in 10 runs on my virtual workstation) results in an apparent failure:

      $ ./test_schema_abort -T 10 -t 5 -PT -Po dir_store
      Running test command: ./test_schema_abort -T 10 -t 5 -PT -Po dir_store 
      Parent: compatibility: false, in-mem log sync: false, timestamp in use: true, tiered in use: true
      Parent: Create 10 threads; sleep 5 seconds
      CONFIG: test_schema_abort -PT -h WT_TEST.test_schema_abort -s 0 -T 10 -t 5 -PSD2053098,E9969491
      Create checkpoint thread
      Create timestamp thread
      Create 10 writer threads
      Thread 0 starts at 0
      Thread 5 starts at 5000000000
      Thread 7 starts at 7000000000
      Thread 3 starts at 3000000000
      Thread 2 starts at 2000000000
      Thread 1 starts at 1000000000
      Thread 9 starts at 9000000000
      Thread 4 starts at 4000000000
      Thread 6 starts at 6000000000
      Thread 8 starts at 8000000000
      SET STABLE: 7c 124
      Checkpoint 2084440 complete: Flush: NO. Minimum ts 124
      Checkpoint 2084441 complete: Flush: NO. Minimum ts 2949
      Checkpoint 2084442 complete: Flush: NO. Minimum ts 3479
      Checkpoint 2084443 complete: Flush: YES. Minimum ts 5373
      Finished a flush_tier
      Checkpoint 2084444 complete: Flush: NO. Minimum ts 6058
      Checkpoint 2084445 complete: Flush: YES. Minimum ts 8546
      [1702917105:433934][30535:0x7f6658fc2700], tiered-server: [WT_VERB_DEFAULT][ERROR]: void *__tiered_server(void *), 473: storage server error from tier_storage_copy: WT_NOTFOUND: item not found
      Finished a flush_tier
      [1702917105:433978][30535:0x7f6658fc2700], tiered-server: [WT_VERB_DEFAULT][ERROR]: void *__tiered_server(void *), 473: the process must exit and restart: WT_PANIC: WiredTiger library panic
      [1702917105:433994][30535:0x7f6658fc2700], tiered-server: [WT_VERB_DEFAULT][ERROR]: void __wt_abort(WT_SESSION_IMPL *), 28: aborting WiredTiger library
      Kill child
      Open database, run recovery and verify content
      Got stable_val 8596
      812134 records verified
      $ echo $?
      0
      

      Despite the errors and the apparent panic in the child process, schema_abort does not register this as a failure: the test exits with status 0 rather than a nonzero error value.
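
      One way a parent process could tell an intentional kill apart from an unexpected abort is to inspect the wait status of the child. The following is only a minimal sketch, not the actual schema_abort code: it assumes the parent terminates the child with SIGKILL (the log only says "Kill child"), and the names child_died_as_expected, run_and_kill_child, child_runs_until_killed and child_aborts are hypothetical.

```c
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: true only if the child was terminated by SIGKILL,
 * which this sketch assumes is the signal the parent sends on purpose.
 */
static bool
child_died_as_expected(int status)
{
    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL);
}

/*
 * Hypothetical helper: fork, run child_fn in the child for run_seconds,
 * kill it, and return EXIT_SUCCESS only if SIGKILL is what ended it.
 * A child that already died from SIGABRT (for example after a library
 * panic) or that exited on its own is reported as EXIT_FAILURE.
 */
static int
run_and_kill_child(void (*child_fn)(void), unsigned int run_seconds)
{
    pid_t pid;
    int status;

    if ((pid = fork()) < 0)
        return (EXIT_FAILURE);
    if (pid == 0) {
        child_fn();
        _exit(EXIT_SUCCESS);
    }

    sleep(run_seconds);

    /* kill() still succeeds if the child has died but is not yet reaped. */
    if (kill(pid, SIGKILL) != 0)
        return (EXIT_FAILURE);
    if (waitpid(pid, &status, 0) != pid)
        return (EXIT_FAILURE);

    return (child_died_as_expected(status) ? EXIT_SUCCESS : EXIT_FAILURE);
}

/* Stand-in for a healthy workload: block until the parent kills us. */
static void
child_runs_until_killed(void)
{
    for (;;)
        pause();
}

/* Stand-in for a child that panics and aborts before the parent's kill. */
static void
child_aborts(void)
{
    abort();
}
```

      With this shape, run_and_kill_child(child_aborts, 1) reports a failure because the wait status carries SIGABRT rather than SIGKILL, so the test's exit code would reflect the panic instead of hiding it.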

      This ticket is to investigate and fix the tiered storage failure.

            Assignee: Susan LoVerso (sue.loverso@mongodb.com)
            Reporter: Keith Smith (keith.smith@mongodb.com)