WiredTiger / WT-932

format infinite loop (new-split branch)

    • Type: Task
    • Resolution: Done
    • Fix Version/s: WT2.2
    • Affects Version/s: None
    • Component/s: None
    • Labels:

      @agorrod, @michaelcahill: there's a stall in the new-split branch. I was hoping Michael's WT-931 would fix it, but I can still reproduce the problem. Here's the config I'm using, and the more threads, the sooner it fires:

      file_type=row
      data_source=file
      checkpoints=1
      cache=5
      compression=none
      leaf_page_max=12
      internal_page_max=12
      ops=1000000
      rows=1000
      key_max=32
      value_max=32
      

      and we end up with checkpoint in an infinite loop walking the tree:

      #0  __wt_tree_walk (session=0x8024ff180, pagep=0x7ffffddee9d8, flags=320)
          at ../src/btree/bt_walk.c:317
      #1  0x0000000000446afd in __wt_sync_file (session=0x8024ff180, syncop=8)
          at ../src/btree/bt_evict.c:655
      #2  0x0000000000457477 in __wt_bt_cache_op (session=0x8024ff180,
          ckptbase=0x8062fe400, op=8) at ../src/btree/bt_sync.c:59
      #3  0x00000000004403bc in __checkpoint_worker (session=0x8024ff180,
          cfg=0x7ffffddeede0, is_checkpoint=1) at ../src/txn/txn_ckpt.c:750
      

      It looks to me like checkpoint is looping between two pages: the "couple" page and the next page (which is a WT_REF_SPLIT page). Checkpoint reads the split page, gets a WT_RESTART return, returns to the "couple" page, does a next, and winds up on the split page again.
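
      To make the suspected loop concrete, here is a toy model of it. This is only a sketch under assumptions: the toy_* names, the two-state enum, and the restart cap are all made up for illustration and are not WiredTiger code or its real ref states. The walker keeps a "couple" slot, tries to step to the next slot, and on a restart falls back to the couple slot and tries "next" again. If the next ref stays split, nothing ever changes between retries, so the walk never advances.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ref states for this toy model; not WiredTiger's definitions. */
typedef enum { TOY_REF_MEM, TOY_REF_SPLIT } toy_ref_state;

#define TOY_RESTART (-1)

/* Try to move onto slot idx; a split ref makes the caller restart. */
static int
toy_page_in(const toy_ref_state *refs, size_t idx)
{
    return (refs[idx] == TOY_REF_SPLIT ? TOY_RESTART : 0);
}

/*
 * Walk the slots from 0 to n - 1, coupling from one slot to the next.
 * On a restart, stay on the couple slot and retry "next".
 * Returns the number of restarts taken, or -1 if the walk gave up after
 * max_restarts retries (the stand-in for "infinite loop" in this model).
 */
static long
toy_walk(const toy_ref_state *refs, size_t n, long max_restarts)
{
    size_t couple = 0;
    long restarts = 0;

    /* The walk has to start on a usable slot. */
    assert(n > 0 && refs[0] == TOY_REF_MEM);

    while (couple + 1 < n) {
        size_t next = couple + 1;

        if (toy_page_in(refs, next) == TOY_RESTART) {
            /* Nothing resolves the split ref, so the retry is identical. */
            if (++restarts >= max_restarts)
                return (-1);
            continue;
        }
        couple = next;
    }
    return (restarts);
}
```

      In this model, a walk over { TOY_REF_MEM, TOY_REF_SPLIT, TOY_REF_MEM } hits the restart cap every time, while the same walk with no split ref finishes with zero restarts. The cap exists only so the sketch terminates; the stalled checkpoint has no such limit.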

      I can reproduce the problem even without deepening the tree, so this is a fundamental issue in splitting (maybe an eviction race with checkpoint, maybe a race inside split itself).

            Assignee: keith.bostic@mongodb.com Keith Bostic (Inactive)
            Reporter: keith.bostic@mongodb.com Keith Bostic (Inactive)
            Votes: 0
            Watchers: 1
