Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-18887

WiredTiger instances taking 2 hours to startup and transition to Secondary state

    XMLWordPrintable

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Duplicate
    • Affects Version/s: 3.0.0, 3.0.1, 3.0.2, 3.0.3
    • Fix Version/s: None
    • Component/s: Performance, WiredTiger
    • Labels:
      None
    • Environment:
      Windows 2012 R2 VM running in Azure, 7GB RAM
    • Operating System:
      Windows
    • Steps To Reproduce:
      Hide

      Restart the Windows Server MongoDB is running on. On restart the system will take about 2 hours to get to the 'Secondary' state during which times it appears to be scanning every collection and index file (~2000 for us).

      Show
      Restart the Windows Server MongoDB is running on. On restart the system will take about 2 hours to get to the 'Secondary' state during which times it appears to be scanning every collection and index file (~2000 for us).
    • Sprint:
      Platform 7 08/10/15

      Description

      WiredTiger instances are taking a long time to startup to the point where it becomes a Secondary or Primary. The log output up to where the long delay occurs before the transition to the Secondary/Primary state is shown below:

      2015-06-09T00:41:34.920+0000 I CONTROL ***** SERVER RESTARTED *****
      2015-06-09T00:41:35.206+0000 I CONTROL Trying to start Windows service 'MongoDB'
      2015-06-09T00:41:35.224+0000 I STORAGE Service running
      2015-06-09T00:41:35.254+0000 W - [initandlisten] Detected unclean shutdown - m:/mongodb/data\mongod.lock is not empty.
      2015-06-09T00:41:35.254+0000 W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
      2015-06-09T00:41:35.255+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=1G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),

      During this time MongoDB appears to be reading every file in our database, we guess its some kind of file integrity check. I'm guessing this might be normal following an unclean shutdown in which case we need help with the shutdown issue . We've noticed a few Jira Issues relating to abnormal shutdowns due to the Windows Service manager and timeouts but they appear to be getting marked as fixed but as of yet our system keeps getting this problem. I have tried shutting down the service from a command prompt as well as the service manager and normal OS restarts but every one results in a long startup time. I'm not certain they all have produced the 'unclean shutdown' log message though.

      We are running a 5 node replica set, 2 MMAP, 2 WT and 1 arbiter. Our database size is ~280GB (MMAP) / 180GB (WT). We were planning to transition to WiredTiger but this and a few other problems are showstoppers for us at the moment.

        Attachments

        1. datadircontents.txt
          291 kB
        2. dircontents.txt
          291 kB
        3. journaldircontents.txt
          0.4 kB
        4. mongodb_log_000001.zip
          222 kB
        5. mongodb_log_000002.zip
          245 kB
        6. mongowtstartup.log
          2.88 MB

          Issue Links

            Activity

              People

              Assignee:
              mark.benvenuto Mark Benvenuto
              Reporter:
              rogerdawson7 Roger Dawson
              Participants:
              Votes:
              0 Vote for this issue
              Watchers:
              13 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: