  Core Server / SERVER-17195

WiredTiger: highly contended write traffic may lock collection

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical - P2
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-rc8
    • Fix Version/s: 3.0.0-rc9, 3.1.0
    • Component/s: WiredTiger
    • Backwards Compatibility:
      Fully Compatible
    • Operating System:
      ALL
    • Steps To Reproduce:

      I use a special program to run this test; I will provide access to it.

      To run it:

      Read/write on the same collection:

      ./hammer.test.linux run concurrentUpdateSingleDoc

      Read/write on different collections:

      HT_DIFFERENT_COL=yes ./hammer.test.linux run concurrentUpdateSingleDoc


      Description

      Test design

      With WiredTiger, I designed the following test (a minimal sketch of the workload follows the list):

      • 64 threads update a single document, {_id: 1}, with $inc to create highly contended write traffic
      • 4 other threads read another document, {_id: 2}, from either the same collection or a separate collection
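
      A minimal pymongo sketch of this workload (the database/collection names, the "inc" counter field, and the use of Python threads are illustrative assumptions, not taken from the hammer tool):

          # Hypothetical re-creation of the hammer workload: 64 writer threads
          # hammer {_id: 1} with $inc while 4 reader threads fetch {_id: 2},
          # either from the same collection or from a separate one.
          import os
          import threading
          import time
          from pymongo import MongoClient

          client = MongoClient("localhost", 27017)
          write_coll = client.hammer.docs
          read_coll = (client.hammer.docs_read       # HT_DIFFERENT_COL=yes
                       if os.environ.get("HT_DIFFERENT_COL") == "yes"
                       else write_coll)              # default: same collection

          write_coll.replace_one({"_id": 1}, {"_id": 1, "inc": 0}, upsert=True)
          read_coll.replace_one({"_id": 2}, {"_id": 2, "val": 0}, upsert=True)

          def writer():
              while True:
                  write_coll.update_one({"_id": 1}, {"$inc": {"inc": 1}})

          def reader():
              while True:
                  read_coll.find_one({"_id": 2})

          for target, count in ((writer, 64), (reader, 4)):
              for _ in range(count):
                  threading.Thread(target=target, daemon=True).start()

          time.sleep(120)  # let the workload run; per-second rates are tracked externally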

      Observation

      When both updates and reads go to the same collection, there is very often a pause in both reads and writes (throughput drops close to 0, with an occasional operation still getting through). I can reproduce this on a regular basis.

      The pause can last quite long in some cases; for example, the following pause lasts more than 3 minutes:

      2015/02/05 21:47:53  -      0       0
      ...
      2015/02/05 21:51:15  -      0       0

      ➜  ~  ./hammer.test.linux run concurrentUpdateSingleDoc | tee t1
      Run test:  concurrentUpdateSingleDoc
      run test  0 : concurrentUpdateSingleDoc
      2015/02/05 21:47:42 start tracing
      2015/02/05 21:47:42  - update    read
      2015/02/05 21:47:43  -  17978    1661
      2015/02/05 21:47:44  -  18132    1654
      2015/02/05 21:47:45  -  18641    1695
      2015/02/05 21:47:46  -  18788    1460
      2015/02/05 21:47:47  -  18947    1441
      2015/02/05 21:47:48  -  17609    2300
      2015/02/05 21:47:49  -  18707    1672
      2015/02/05 21:47:50  -  18622    1711
      2015/02/05 21:47:51  -  18825    1414
      2015/02/05 21:47:52  -   7313     625
      2015/02/05 21:47:53  -      0       0
      2015/02/05 21:47:54  -      0       0
      2015/02/05 21:47:55  -      0       0
      2015/02/05 21:47:56  -      0       0
      2015/02/05 21:47:57  -      1       0
      2015/02/05 21:47:58  -      1       0
      2015/02/05 21:47:59  -      1       0
      2015/02/05 21:48:00  -      0       0
      2015/02/05 21:48:01  -      1       0
      2015/02/05 21:48:02  -      1       0
      2015/02/05 21:48:03  -      2       0
      2015/02/05 21:48:04  -      0       0
      2015/02/05 21:48:05  -      0       0
      2015/02/05 21:48:06  -      1       0
      2015/02/05 21:48:07  -      2       0
      2015/02/05 21:48:08  -      1       0
      2015/02/05 21:48:09  -      0       0
      2015/02/05 21:48:10  -      2       1
      2015/02/05 21:48:11  -      0       0
      2015/02/05 21:48:12  -      0       0
      2015/02/05 21:48:13  -    164      25
      2015/02/05 21:48:14  -      0       0
      2015/02/05 21:48:15  -      0       0
      2015/02/05 21:48:16  -      0       0
      2015/02/05 21:48:17  -      0       0

      When updates and reads go to different collections, there is still a pause for writes, but reads are largely OK and actually benefit from the reduced traffic once writes to the other collection stall.

      ~  HT_DIFFERENT_COL=yes ./hammer.test.linux run concurrentUpdateSingleDoc
      Read from a separate collection!
      Run test:  concurrentUpdateSingleDoc
      run test  0 : concurrentUpdateSingleDoc
      2015/02/05 21:52:12 start tracing
      2015/02/05 21:52:12  - update    read
      2015/02/05 21:52:13  -  17277    2287
      2015/02/05 21:52:14  -  18588    1681
      2015/02/05 21:52:15  -  18640    1639
      2015/02/05 21:52:16  -   5766    8497
      2015/02/05 21:52:17  -   5651    7026
      2015/02/05 21:52:18  -   5367    7361
      2015/02/05 21:52:19  -   5407    7556
      2015/02/05 21:52:20  -   5320    8257
      2015/02/05 21:52:21  -   5621    6964
      2015/02/05 21:52:22  -   5360    7299
      2015/02/05 21:52:23  -   5458    6815
      2015/02/05 21:52:24  -   5366    7550
      2015/02/05 21:52:25  -   5370    7481
      2015/02/05 21:52:26  -   5449    8295
      2015/02/05 21:52:27  -   4358    6687
      2015/02/05 21:52:28  -   5214    7275
      2015/02/05 21:52:29  -   4180    6212
      2015/02/05 21:52:30  -   5230    6743
      2015/02/05 21:52:31  -   5294    8363
      2015/02/05 21:52:32  -   5372    7549
      2015/02/05 21:52:33  -   5494    6636
      2015/02/05 21:52:34  -   2181   17581
      2015/02/05 21:52:35  -      0   24946
      2015/02/05 21:52:36  -      0   24915
      2015/02/05 21:52:37  -      0   24767
      2015/02/05 21:52:38  -      0   24723
      2015/02/05 21:52:39  -      0   24309
      2015/02/05 21:52:40  -      0   24201
      2015/02/05 21:52:41  -      1   24879
      2015/02/05 21:52:42  -      0   24789
      2015/02/05 21:52:43  -      0   24684
      2015/02/05 21:52:44  -      1   25125
      2015/02/05 21:52:45  -      0   24900
      2015/02/05 21:52:46  -      1   25148
      2015/02/05 21:52:47  -      0   25011
      2015/02/05 21:52:48  -      1   24925
      2015/02/05 21:52:49  -      0   24981
      2015/02/05 21:52:50  -      0   24892
      2015/02/05 21:52:51  -      1   25020

      Setup

      • single mongod
      • server build info

                "version" : "3.0.0-rc8",
                "gitVersion" : "9d0714cdcffadb203ff68730acedbc40875403da",

      • server config

                "argv" : [
                        "/home/rzhang/wt/bin/rc8/mongod",
                        "--dbpath=/data/ssd-sdc/dbs-wt",
                        "--storageEngine=wiredTiger",
                        "--logpath=/data/ssd-sdc/logs/mongod.log",
                        "--fork"
                ],

      Attachments

      • hammer.test.linux.gz (1.66 MB, Rui Zhang)
      • gdb.png (431 kB)
      • ts.png (291 kB)

        Activity

        sue.loverso Sue LoVerso added a comment - edited

        I am able to reproduce this. I have opened https://github.com/wiredtiger/wiredtiger/issues/1647 for this issue. Since we're updating the same thing over and over, WT ends up with a very long list of update structures to walk, and at that length the walks start taking a very long time.

        We'll evaluate the fix for inclusion in the next WT drop.
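
        For intuition only, a toy Python model (not the actual WiredTiger code) of why such a per-key chain of update records degrades: each operation walks the chain before adding a record, so per-operation cost grows with the chain length and total cost grows quadratically.

            # Toy model: every update walks the full per-key chain of update
            # records before prepending a new one, mimicking the long-list
            # walks described above.
            import time

            class Update:
                __slots__ = ("value", "next")
                def __init__(self, value, nxt=None):
                    self.value = value
                    self.next = nxt

            head = None
            start = time.time()
            for i in range(1, 10001):
                node, length = head, 0
                while node is not None:         # walk the whole chain
                    node, length = node.next, length + 1
                head = Update(i, head)          # prepend the new record
                if i % 2500 == 0:
                    print(f"{i:>6} updates, chain length {length + 1}, "
                          f"{time.time() - start:.1f}s elapsed")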

        sallgeud Chad Kreimendahl added a comment -

        I see this as "fixed" and also labeled 3.1. Is that because the next WiredTiger iteration won't be included until 3.1, or will this update be included in 3.0?

        michael.cahill Michael Cahill added a comment -

        Hi Chad Kreimendahl, sorry for the confusion. I've updated the fix version; this will be addressed in 3.0.

        pasette Dan Pasette added a comment -

        Sorry to be a bit pedantic with the Jira hygiene, but we can't mark it "resolved" until we get a new drop from WT. That's planned for tomorrow morning. Then we'll resolve and backport.

        rui.zhang Rui Zhang added a comment -

        I tested rc9 and ran the script for 120 seconds without any drop in throughput; the overall update throughput is also higher than that of rc8.

        rc9 shows around 28k update throughput with no drop:

        insert query update delete getmore command % dirty % used flushes  vsize   res qr|qw ar|aw netIn netOut conn     time
            *0    *0  28042     *0       0     1|0     0.0    0.0       0 349.0M 81.0M  0|15   1|2    4m     2m   22 19:31:12
            *0    *0  28117     *0       0     1|0     0.0    0.0       0 349.0M 81.0M  0|16   1|2    4m     2m   22 19:31:13

        rc8 peaks at around 22k, and TPS drops to close to 0 after around 30 seconds:

        insert query update delete getmore command % dirty % used flushes  vsize   res qr|qw ar|aw       netIn      netOut conn     time
            *0    *0  22629     *0       0     1|0     0.0    0.0       0 348.0M 80.0M  0|13   1|6          3m          2m   22 19:32:17
            *0    *0  22510     *0       0     1|0     0.0    0.0       0 348.0M 80.0M  0|13   1|6          3m          2m   22 19:32:18
            *0    *0  22408     *0       0     1|0     0.0    0.0       0 348.0M 80.0M  0|13   1|6          3m          2m   22 19:32:19
            *0    *0  22494     *0       0     1|0     0.0    0.0       0 349.0M 80.0M  0|15   1|4          3m          2m   22 19:32:20
            *0    *0  13963     *0       0     2|0     0.0    0.0       0 349.0M 81.0M   0|0  1|20          2m          1m   22 19:32:21
            *0    *0      4     *0       0     1|0     0.0    0.0       0 350.0M 81.0M   0|0  1|20        675b         16k   22 19:32:22
            *0    *0     *0     *0       0     1|0     0.0    0.0       0 350.0M 82.0M   0|0  1|20         79b         16k   22 19:32:23
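
        For reference, the per-second update rate shown in the stats above can also be sampled directly from serverStatus with a small pymongo loop (a hypothetical helper, not part of the hammer tool):

            # Print the per-second delta of the serverStatus update opcounter,
            # i.e. the same quantity as the "update" column above.
            import time
            from pymongo import MongoClient

            client = MongoClient("localhost", 27017)
            prev = client.admin.command("serverStatus")["opcounters"]["update"]
            while True:
                time.sleep(1)
                cur = client.admin.command("serverStatus")["opcounters"]["update"]
                print(time.strftime("%Y/%m/%d %H:%M:%S"), "-", cur - prev)
                prev = cur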


          People

          • Votes: 1
          • Watchers: 25
