Core Server / SERVER-75052

5.0 shard replica startup takes much longer than 4.x due to WT checkpoint

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 5.0.15
    • Component/s: None
    • Operating System: ALL
      Run a 5.0 sharded cluster with many databases and collections, and heavy read and write volume.
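      For reference, here is a minimal sketch of the kind of load involved, using pymongo against the cluster's mongos. The host name, database/collection counts, and document sizes are illustrative assumptions, not our actual workload.

      # repro_load.py -- illustrative sketch only, not our production workload.
      # Creates many databases/collections and applies continuous read/write
      # volume so WiredTiger accumulates plenty of dirty data before a restart.
      import random
      import string
      from pymongo import MongoClient

      client = MongoClient("mongodb://mongos.example.com:27017")  # hypothetical mongos address

      for d in range(50):                      # "many databases"
          db = client[f"loaddb{d}"]
          for c in range(100):                 # "... and collections"
              coll = db[f"coll{c}"]
              # Heavy write volume: batches of ~1 KB documents.
              docs = [{"k": random.random(),
                       "pad": "".join(random.choices(string.ascii_letters, k=1024))}
                      for _ in range(1000)]
              coll.insert_many(docs)
              # Heavy read volume: scan a slice back out.
              list(coll.find().limit(100))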

      In 4.x, our shard replicas consistently took around 2 minutes to start up. In 5.0, they are taking anywhere from 3.5 minutes to 12 minutes to start up. The additional startup time is primarily consumed by a WT checkpoint: it appears that 5.0 performs a checkpoint as part of the startup process, whereas 4.x waited until after startup was complete to take the first checkpoint.
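      (For what it's worth, checkpoint activity can also be watched via serverStatus rather than the logs. A quick pymongo sketch follows; the literal WiredTiger statistic names are an assumption and vary across versions, so check wt.keys() on a given build.)

      # checkpoint_stats.py -- sketch; WT statistic names vary across versions,
      # so treat the literal keys below as assumptions and inspect wt.keys().
      from pymongo import MongoClient

      client = MongoClient("mongodb://shard1repl3.example.com:27017")  # host from the logs above
      wt = client.admin.command("serverStatus")["wiredTiger"]
      txn = wt.get("transaction", {})

      for key in ("transaction checkpoints",
                  "transaction checkpoint most recent time (msecs)",
                  "transaction checkpoint total time (msecs)"):
          print(key, "=", txn.get(key))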

       

      Here's an abridged example of a 4.2 startup:

      2022-11-22T02:45:20.979+0000 I  CONTROL  [main] ***** SERVER RESTARTED *****
      2022-11-22T02:47:17.352+0000 I  REPL     [rsBackgroundSync] Changed sync source from empty to shard1repl3.example.com:27017
      // TOTAL: 1m57s
      //...
      2022-11-22T02:49:07.362+0000 I  STORAGE  [WTCheckpointThread] WiredTiger message [1669085347:362181][77440:0x7f956c557700], file:enelglobalservicessrl3/collection-11873-6569341069224170235.wt, WT_SESSION.checkpoint: Checkpoint has been running for 20 seconds and wrote: 65000 pages (3797 MB)
      2022-11-22T02:49:11.100+0000 I  STORAGE  [WTCheckpointThread] WiredTiger message [1669085351:100075][77440:0x7f956c557700], WT_SESSION.checkpoint: Checkpoint ran for 23 seconds and wrote: 83927 pages (5232 MB)
      // CHECKPOINT: 23s

       

      Notice that the startup takes less than 2 minutes, and the first checkpoint runs a couple of minutes later.

       

      And here's an abridged example of a 5.0 startup on the exact same machine:

       {"t":{"$date":"2023-03-20T13:33:45.171+00:00"},"s":"I",  "c":"CONTROL",  "id":20698,   "ctx":"-","msg":"***** SERVER RESTARTED *****"}
      // ...
      {"t":{"$date":"2023-03-20T13:34:53.896+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1679319293:896857][116638:0x7f46be029b80], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 1, snapshot max: 1 snapshot count: 0, oldest timestamp: (1679318801, 3202) , meta checkpoint timestamp: (1679319101, 3202) base write gen: 1070042338"}}
      {"t":{"$date":"2023-03-20T13:41:37.518+00:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"Checkpointer","msg":"WiredTiger message","attr":{"message":"[1679319697:518964][116638:0x7f46adbe4700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 10, snapshot max: 10 snapshot count: 0, oldest timestamp: (1679318801, 3202) , meta checkpoint timestamp: (1679319101, 3202) base write gen: 1070042338"}}
      // CHECKPOINT: 7m12s
      // ...
      {"t":{"$date":"2023-03-20T13:42:04.502+00:00"},"s":"I",  "c":"REPL",     "id":21088,   "ctx":"BackgroundSync","msg":"Changed sync source","attr":{"oldSyncSource":"empty","newSyncSource":"shard1repl3.example.com:27017"}}
      // STARTUP: 8m19s
      

       

      Notice that the startup takes over 8 minutes, and over 7 minutes of that is spent performing a checkpoint.
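      In case it helps anyone reproduce the measurement, the STARTUP/CHECKPOINT annotations above can be derived from the structured 5.0 log with something like the sketch below. The log ids (20698, 22430, 21088) and field layout are taken from the lines above; the checkpoint duration is approximated as the span between the first and last checkpoint-progress messages.

      # startup_timing.py -- derive the STARTUP/CHECKPOINT annotations from a
      # 5.0 mongod log, whose lines are JSON documents as shown above.
      import json
      from datetime import datetime

      def ts(doc):
          return datetime.fromisoformat(doc["t"]["$date"])

      restarted = sync_source = None
      checkpoints = []
      with open("mongod.log") as f:
          for line in f:
              try:
                  doc = json.loads(line)
              except ValueError:
                  continue
              if doc.get("id") == 20698:         # ***** SERVER RESTARTED *****
                  restarted, sync_source, checkpoints = ts(doc), None, []
              elif doc.get("id") == 21088 and sync_source is None:
                  sync_source = ts(doc)          # first sync source => startup done
              elif doc.get("id") == 22430 and "CHECKPOINT_PROGRESS" in line:
                  checkpoints.append(ts(doc))

      if restarted and sync_source:
          print("STARTUP:", sync_source - restarted)
      if len(checkpoints) >= 2:
          print("CHECKPOINT:", checkpoints[-1] - checkpoints[0])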

      An 8-12 minute startup time results in a 10-14 minute restart time, which is unacceptable: it makes rolling restarts for upgrades and other maintenance take 5-7 times longer, and, even more importantly, it prevents us from expediently dealing with cluster availability and performance issues.

      Is it viable to switch back to the 4.x behavior of waiting until startup is complete to perform the first WT checkpoint?

       

            Assignee: Yuan Fang (yuan.fang@mongodb.com)
            Reporter: Ian Springer (ian.springer@salesforce.com)
            Votes: 0
            Watchers: 4
