  1. Core Server
  2. SERVER-67936

Server stuck after systemd automatically restarts mongod following a crash

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: None
    • Labels: None
    • ALL

      Hi,

      I'm running MongoDB 4.4 on Ubuntu 20.

      I've created the following file:

      # /etc/systemd/system/mongod.service.d/always_restart.conf
      [Service]
      Restart=always
      RestartSec=60
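
A drop-in like the one above only takes effect once systemd has reloaded its unit files (`systemctl daemon-reload`). As a minimal sketch, assuming Python 3.9+ is available on the host, the effective restart settings could be read back from `systemctl show`. The helper names `parse_show_output` and `effective_restart_settings` are hypothetical and used here for illustration only.

```python
import subprocess


def parse_show_output(text: str) -> dict[str, str]:
    """Parse the KEY=VALUE lines printed by `systemctl show`."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank lines or anything without an '='
            props[key.strip()] = value.strip()
    return props


def effective_restart_settings(unit: str = "mongod") -> dict[str, str]:
    """Ask systemd which Restart settings it actually uses for the unit."""
    result = subprocess.run(
        ["systemctl", "show", unit,
         "--property=Restart", "--property=RestartUSec"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_show_output(result.stdout)
```

Calling `effective_restart_settings()` on the affected host and comparing the returned `Restart` and `RestartUSec` values with the drop-in is one way to check that the override was picked up.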
      

      MongoDB runs on a server shared with other services, and I suspect an out-of-memory error. Here are the logs from journalctl:

      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]: {"t":{"$date":"2022-07-10T02:22:40.381+00:00"},"s":"F",  "c":"CONTROL",  "id":4522200, "ctx":"conn12260","msg":"Writing to log file failed, aborting appl>
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]: BACKTRACE: {"backtrace":[{"a":"AAAAB0BD15A4","b":"AAAAAD410000","o":"37C15A4"},{"a":"AAAAB0BD3E80","b":"AAAAAD410000","o":"37C3E80","s":"_ZN5mongo15print>
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAB0BD15A4","b":"AAAAAD410000","o":"37C15A4"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAB0BD3E80","b":"AAAAAD410000","o":"37C3E80","s":"_ZN5mongo15printStackTraceERSo","s+":"40"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAB0B8F4EC","b":"AAAAAD410000","o":"377F4EC","s":"_ZN5mongo5logv214FileRotateSink7consumeERKN5boost3log12v2s_mt_posix11record_viewERKNSt>
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAB0BAAEDC","b":"AAAAAD410000","o":"379AEDC","s":"_ZN5boost3log12v2s_mt_posix5sinks13unlocked_sinkIN5mongo5logv216CompositeBackendIJNS5_>
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAB0CB1A1C","b":"AAAAAD410000","o":"38A1A1C","s":"_ZN5boost3log12v2s_mt_posix4core16push_record_moveERNS1_6recordE","s+":"164"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAB0B9C734","b":"AAAAAD410000","o":"378C734","s":"_ZN5mongo5logv26detail9doLogImplEiRKNS0_11LogSeverityERKNS0_10LogOptionsENS_10StringDa>
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAAE57A000","b":"AAAAAD410000","o":"116A000"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAAE581C68","b":"AAAAAD410000","o":"1171C68","s":"_ZN5mongo9transport19ServiceStateMachine4Impl14cleanupSessionERKNS_6StatusE","s+":"F0"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAAE582024","b":"AAAAAD410000","o":"1172024","s":"_ZN5mongo9transport19ServiceStateMachine4Impl15scheduleNewLoopENS_6StatusE","s+":"2AC"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAAE582418","b":"AAAAAD410000","o":"1172418"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAAE5826AC","b":"AAAAAD410000","o":"11726AC","s":"_ZN5mongo9transport19ServiceStateMachine4Impl12startNewLoopERKNS_6StatusE","s+":"19C"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAAE582BC8","b":"AAAAAD410000","o":"1172BC8"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAB041FAA4","b":"AAAAAD410000","o":"300FAA4"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAAEBE5D94","b":"AAAAAD410000","o":"17D5D94","s":"_ZZN5mongo15unique_functionIFvvEE8makeImplIZNS_9transport15ServiceExecutor8scheduleENS>
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAB041FCC0","b":"AAAAAD410000","o":"300FCC0"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAB0423810","b":"AAAAAD410000","o":"3013810"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"AAAAB04238C4","b":"AAAAAD410000","o":"30138C4"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"FFFF81D633F0","b":"FFFF81D5C000","o":"73F0"}
      Jul 10 02:22:40 pp-agg-1638951636 mongod[373116]:   Frame: {"a":"FFFF81CBB0DC","b":"FFFF81BEB000","o":"D00DC"}
      Jul 10 02:22:43 pp-agg-1638951636 systemd[1]: mongod.service: Main process exited, code=exited, status=1/FAILURE
      Jul 10 02:22:43 pp-agg-1638951636 systemd[1]: mongod.service: Failed with result 'exit-code'.
      Jul 10 02:22:48 pp-agg-1638951636 systemd[1]: mongod.service: Scheduled restart job, restart counter is at 1.
      Jul 10 02:22:48 pp-agg-1638951636 systemd[1]: Stopped MongoDB Database Server.
      Jul 10 02:22:48 pp-agg-1638951636 systemd[1]: Started MongoDB Database Server.
      

      The issue is that after the automatic restart, `systemctl status mongod` reports the service as active, and I can even see the `mongod` process, but I cannot connect to the server. The memory usage of the process is also very low (about 50 MB), and the process did not write any startup_log. When I manually run `systemctl restart mongod`, it works.

      My guess is that the automatic restart happens too soon after the crash, before the memory has been released. Even so, `systemctl status mongod` shouldn't report the service as active.
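
The gap described above, where systemd reports the unit as active while the server is unreachable, can be checked independently of `systemctl status`. The following is a rough sketch, assuming mongod listens on localhost at the default port 27017 (adjust if the configuration differs). The function name `accepts_connections` is hypothetical.

```python
import socket


def accepts_connections(host: str = "127.0.0.1",
                        port: int = 27017,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    A successful connect only shows that something is listening on the
    port; it does not prove that mongod is answering commands.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, ...
        return False
```

If `accepts_connections()` returns `False` while `systemctl status mongod` shows the unit as active, that would match the stuck state reported here.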

            Assignee: Edwin Zhou (edwin.zhou@mongodb.com)
            Reporter: moroine bentefrit (moroine.bentefrit@gmail.com)
            Votes: 0
            Watchers: 3
