WiredTiger / WT-1698

Reconfiguring eviction opens a race with WT_CONNECTION::close

    Details

    • Type: Task
    • Status: Open
    • Priority: Major - P3
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Backlog
    • Labels:

      Description

      After the eviction.threads_max setting is increased, the eviction server may resize its array of worker contexts and allocate more sessions. This happens asynchronously in the eviction server thread because it owns the eviction worker contexts.

      The problem is that when opening a session we assert that the connection is not in the process of closing. That assertion has caught a number of bugs in the past where internal threads race with connection close if the connection is only open for a short period.

      One possible fix would be to set a flag when the reconfigure happens and have the eviction server clear the flag once it has processed the setting change. WT_CONNECTION::close would have to wait for the flag to be cleared before we start asserting that no new sessions can be opened.
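
      To make that handshake concrete, here is a minimal sketch. It is an illustration only: WT_CONN_EVICT_RECONFIG is an invented flag name and __evict_workers_resize() stands in for whatever the eviction server already uses to resize its worker array; the surrounding macros (F_SET, F_CLR, F_ISSET, S2C, WT_RET, __wt_yield) are the usual WiredTiger ones.

      /*
       * Hypothetical sketch of the reconfigure/close handshake; the flag and
       * helper names are assumptions, not the actual implementation.
       */

      /* Reconfigure path: record the pending change. */
      static int
      __evict_note_reconfig(WT_SESSION_IMPL *session)
      {
          F_SET(S2C(session), WT_CONN_EVICT_RECONFIG);
          /* The existing reconfigure path already wakes the eviction server. */
          return (0);
      }

      /* Eviction server loop: apply the change, then clear the flag. */
      static int
      __evict_apply_reconfig(WT_SESSION_IMPL *session)
      {
          WT_CONNECTION_IMPL *conn = S2C(session);

          if (F_ISSET(conn, WT_CONN_EVICT_RECONFIG)) {
              /* Reallocate worker contexts, open sessions, start threads. */
              WT_RET(__evict_workers_resize(session));
              F_CLR(conn, WT_CONN_EVICT_RECONFIG);
          }
          return (0);
      }

      /*
       * WT_CONNECTION::close: wait for the eviction server to settle before we
       * start asserting that no new sessions can be opened.
       */
      static void
      __close_wait_evict_reconfig(WT_SESSION_IMPL *session)
      {
          while (F_ISSET(S2C(session), WT_CONN_EVICT_RECONFIG))
              __wt_yield();
      }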

        Issue Links

          Activity

          Keith Bostic added a comment -

          @michaelcahill: I don't see any place where we assert, when opening a session, that we are not in the process of closing the connection. Is that still the case?

          Since connection close waits on the servers to exit, would it be sufficient for connection close to shut out reconfiguration, then wait on the server threads?

          Michael Cahill added a comment -

          @keithbostic, we do check whether a connection is closing before allowing a session to open, here https://github.com/wiredtiger/wiredtiger/blob/develop/src/session/session_api.c#L1031

          I would like to keep that if possible, because it has caught a bunch of problems with server threads in the past.

          The problem isn't connection close racing with reconfiguration. The problem is that sometime after a reconfiguration, the eviction server may notice the change and reallocate some memory, open some sessions and start some threads. We don't do the realloc during reconfigure because the eviction server owns that memory.

          If we could shut down the eviction server early in close, we may be able to address this, but evicting whole trees currently waits on the eviction server (to clear the eviction walk point), and I'm hesitant to make connection close too much of a special case. I'm sure this problem is solvable, and there is some way to wait during close for the eviction server to settle down, but I don't see a simple solution.
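
          As a sketch of that deferred reallocation (names assumed, modelled on evict_lru.c of this era; the exact fields and helper may differ from the tree):

          static int
          __evict_server_pass(WT_SESSION_IMPL *session)
          {
              WT_CONNECTION_IMPL *conn = S2C(session);

              /*
               * Reconfigure only stored the new maximum; the realloc, the
               * session opens and the thread creates all happen here, possibly
               * long after reconfigure returned, and possibly after
               * WT_CONNECTION::close has started. That window is the race.
               */
              if (conn->evict_workers_alloc < conn->evict_workers_max)
                  WT_RET(__evict_workers_resize(session));

              /* ... the rest of the eviction server pass ... */
              return (0);
          }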

          Keith Bostic added a comment -

          @michaelcahill, I was confused because I expected it to be a test of WT_CONN_CLOSING, but it's testing WT_CONN_SERVER_RUN. Is there any reason we aren't testing WT_CONN_SERVER_RUN everywhere? That is, does WT_CONN_CLOSING serve any special purpose?

          I don't see anything in __wt_async_destroy or __wt_lsm_manager_destroy that makes me think session open couldn't test WT_CONN_SERVER_RUN.

          diff --git a/dist/flags.py b/dist/flags.py
          index f1eb6b2..b4171a1 100644
          --- a/dist/flags.py
          +++ b/dist/flags.py
          @@ -87,7 +87,6 @@ flags = {
               'conn' : [
                   'CONN_CACHE_POOL',
                   'CONN_CKPT_SYNC',
          -        'CONN_CLOSING',
                   'CONN_EVICTION_RUN',
                   'CONN_LEAK_MEMORY',
                   'CONN_LOG_SERVER_RUN',
          diff --git a/src/conn/conn_dhandle.c b/src/conn/conn_dhandle.c
          index 7756158..d920b0e 100644
          --- a/src/conn/conn_dhandle.c
          +++ b/src/conn/conn_dhandle.c
          @@ -375,7 +375,7 @@ __conn_btree_open(
                      F_ISSET(dhandle, WT_DHANDLE_EXCLUSIVE) &&
                      !LF_ISSET(WT_DHANDLE_LOCK_ONLY));
           
          -       WT_ASSERT(session, !F_ISSET(S2C(session), WT_CONN_CLOSING));
          +       WT_ASSERT(session, F_ISSET(S2C(session), WT_CONN_SERVER_RUN));
           
                  /*
                   * If the handle is already open, it has to be closed so it can be
          diff --git a/src/conn/conn_open.c b/src/conn/conn_open.c
          index 0a3d35a..da6b605 100644
          --- a/src/conn/conn_open.c
          +++ b/src/conn/conn_open.c
          @@ -110,9 +110,6 @@ __wt_connection_close(WT_CONNECTION_IMPL *conn)
                  F_CLR(conn, WT_CONN_SERVER_RUN);
                  WT_TRET(__wt_async_destroy(session));
                  WT_TRET(__wt_lsm_manager_destroy(session));
          -
          -       F_SET(conn, WT_CONN_CLOSING);
          -
                  WT_TRET(__wt_checkpoint_server_destroy(session));
                  WT_TRET(__wt_statlog_destroy(session, 1));
                  WT_TRET(__wt_sweep_destroy(session));
          

          Keith Bostic added a comment -

          I think this problem can occur in two other ways: the async code opens an internal session as part of WT_CONNECTION.async_new_op, and LSM will start the LSM manager thread/session when an LSM tree is opened. In other words, those two code paths can attempt to create an internal session while WT_CONNECTION.close is in progress. It shouldn't happen in a well-written application, of course, but that's not terribly reassuring.

          Keith Bostic added a comment -

          FTR, I just saw what I think is a related core dump with WT-1613's CONFIG:

           (gdb) where
           #0  __evict_worker (arg=0x116bb30) at src/evict/evict_lru.c:391
           #1  0x00007ff05c1bbf18 in start_thread () from /lib64/libpthread.so.0
           #2  0x00007ff05bad6b2d in clone () from /lib64/libc.so.6
           (gdb) allstack

           Thread 3 (Thread 0x7ff05b1d8740 (LWP 28191)):
           #0  0x00007ff05c1bcfb7 in pthread_join () from /lib64/libpthread.so.0
           #1  0x000000000042e453 in __wt_thread_join (session=session@entry=0x114e510,
               tid=<optimized out>) at src/os_posix/os_thread.c:36
           #2  0x000000000041a4e8 in __wt_evict_destroy (session=session@entry=0x114e510)
               at src/evict/evict_lru.c:361
           #3  0x0000000000411cf0 in __wt_connection_close (conn=conn@entry=0x1185370)
               at src/conn/conn_open.c:157
           #4  0x000000000040c2d6 in __conn_close (wt_conn=0x1185370, config=0x0)
               at src/conn/conn_api.c:723
           #5  0x000000000040b794 in wts_close () at wts.c:406
           #6  0x0000000000404005 in main (argc=<optimized out>, argv=<optimized out>)
               at t.c:222

           Thread 2 (Thread 0x7ff05b1d6700 (LWP 24434)):
           #0  0x00007ff05c1bcfb7 in pthread_join () from /lib64/libpthread.so.0
           #1  0x000000000042e453 in __wt_thread_join (session=session@entry=0x114e7d0,
               tid=<optimized out>) at src/os_posix/os_thread.c:36
           #2  0x000000000041b9a4 in __evict_server (arg=0x114e7d0)
               at src/evict/evict_lru.c:185
           #3  0x00007ff05c1bbf18 in start_thread () from /lib64/libpthread.so.0
           #4  0x00007ff05bad6b2d in clone () from /lib64/libc.so.6

           Thread 1 (Thread 0x7ff04eff5700 (LWP 31593)):
           #0  __evict_worker (arg=0x116bb30) at src/evict/evict_lru.c:391
           #1  0x00007ff05c1bbf18 in start_thread () from /lib64/libpthread.so.0
           #2  0x00007ff05bad6b2d in clone () from /lib64/libc.so.6
          

          Thread 3 has already pthread-joined all of the eviction workers, and Thread 2 is attempting to pthread-join an eviction worker.

          Michael Cahill added a comment -

          The fix in WT-1792 should address the stack above, but not the original problem: we could still hit the assertion about opening a session concurrent with connection close after reconfiguring the number of eviction threads.

          The reason for the WT_CONN_CLOSING flag is that async threads and LSM threads can open handles. We clear WT_CONN_SERVER_RUN to signal to all server threads to shut down; we set WT_CONN_CLOSING only after we're sure we shouldn't open more data handles. It isn't pretty, but the two can't be combined.

          I'm not concerned about applications that do async operations or LSM operations concurrent with WT_CONNECTION::close – they're clearly violating the API by accessing handles after they are invalidated.

          We could complicate close further by setting some other flag once we're sure all the threads that open sessions have finished, but that's more ugliness like WT_CONN_CLOSING. Or we could just get rid of these assertions and wrap parts of close in api_lock to avoid racing with server threads opening sessions (though we'd also have to set WT_SESSION_INTERNAL inside the lock when opening a session).
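
          For the api_lock idea, a rough sketch (illustration only; __do_open_session() is a hypothetical stand-in for the existing session-open path, not a real function in the tree):

          /* Internal session open, serialized against close by conn->api_lock. */
          static int
          __open_internal_session_locked(
              WT_SESSION_IMPL *session, WT_SESSION_IMPL **sessionp)
          {
              WT_CONNECTION_IMPL *conn = S2C(session);
              WT_DECL_RET;

              __wt_spin_lock(session, &conn->api_lock);
              if (!F_ISSET(conn, WT_CONN_SERVER_RUN))
                  ret = EBUSY;        /* Close has shut out new sessions. */
              else if ((ret = __do_open_session(conn, sessionp)) == 0)
                  F_SET(*sessionp, WT_SESSION_INTERNAL);  /* Set inside the lock. */
              __wt_spin_unlock(session, &conn->api_lock);
              return (ret);
          }

          /* And in __wt_connection_close, flip the flag under the same lock. */
          __wt_spin_lock(session, &conn->api_lock);
          F_CLR(conn, WT_CONN_SERVER_RUN);
          __wt_spin_unlock(session, &conn->api_lock);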

          Ramon Fernandez added a comment -

          Additional ticket information from GitHub

          This ticket was referenced in the following commits:
           8d918f0ef8588056ecf729e72ffdd8bc0a79fd6c
