  1. Documentation
  2. DOCS-14239

Investigate changes in SERVER-53852: MongoDB hangs randomly

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major - P3
    • Resolution: Won't Do
    • Affects Version/s: None
    • Fix Version/s: 4.4.6, 5.0.0-rc0
    • Component/s: manual, Server
    • Labels:
      None
    • Last comment by Customer:
      true
    • Story Points:
      2
    • Sprint:
      ServerDocs2020: Mar9 - Mar16, ServerDocs2020: Mar16 - Mar23

      Description

      Downstream Change Summary

      Fixed a bug that could cause mongod to hang when a write to the log file fails.

      Description of Linked Ticket

      I am running a MongoDB 4.4.2 cluster with one Primary, one Secondary, and one hidden Secondary. On the hidden Secondary, MongoDB sometimes just hangs (roughly once every 2 days); it also happened once on the Primary. By "hangs", I mean:

      • I am not able to connect to mongod via the mongo shell
      • The Secondary stops replicating and starts lagging (until I restart it manually)
      • However, running `rs.status()` on the Primary server shows that the hung Secondary is reachable
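The third symptom (a member that reports as reachable while falling behind) can be checked programmatically. The following is a minimal sketch, assuming the status document has the shape returned by the `replSetGetStatus` command, with `members[].name`, `members[].stateStr`, `members[].health`, and `members[].optimeDate` fields. The function name `find_lagging_members` and the 60-second threshold are hypothetical choices made for illustration, not part of the linked ticket.

```python
from datetime import datetime, timedelta


def find_lagging_members(status, max_lag=timedelta(seconds=60)):
    """Return (name, lag) pairs for secondaries that report as healthy
    but whose last applied operation trails the primary by more than max_lag.

    `status` is assumed to look like replSetGetStatus output.
    """
    members = status.get("members", [])
    primary = next((m for m in members if m.get("stateStr") == "PRIMARY"), None)
    if primary is None:
        # Without a primary there is no reference point for lag.
        return []

    lagging = []
    for member in members:
        if member.get("stateStr") != "SECONDARY":
            continue
        lag = primary["optimeDate"] - member["optimeDate"]
        # health == 1 means the member answered heartbeats, which matches
        # the "reachable but not replicating" symptom described above.
        if member.get("health") == 1 and lag > max_lag:
            lagging.append((member["name"], lag))
    return lagging
```

With a driver such as PyMongo, the input document could be obtained with `client.admin.command("replSetGetStatus")`; that call is left out of the sketch so it runs without a live deployment.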

      I referred to https://jira.mongodb.org/browse/SERVER-34190 which looked like a similar issue (but it was fixed in 3.6.4). So I have attached the files that were requested in that issue:

      1. Output of the gdb command: gdb -p $(pidof mongod) -batch -ex 'thread apply all bt' > gdb_`date +"%Y-%m-%d_%H-%M-%S"`.txt
      2. Last 500 lines of mongod.log
      3. The latest files in the diagnostic.data folder
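The first two items in the list above could be gathered with a small helper script. This is a minimal sketch only: the function names are hypothetical, the timestamp pattern in the output file name is an assumption (a hyphen-separated date and time), and the gdb flags used are `-p`, `-batch`, and `-ex`.

```python
from collections import deque
from datetime import datetime


def build_gdb_command(pid):
    """Build the argument list that dumps backtraces for every thread of `pid`."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def backtrace_filename(now=None):
    """Return a timestamped output file name (the pattern is an assumption)."""
    now = now or datetime.now()
    return now.strftime("gdb_%Y-%m-%d_%H-%M-%S.txt")


def tail_lines(path, count=500):
    """Return the last `count` lines of a text file such as mongod.log."""
    with open(path, "r", errors="replace") as handle:
        # A bounded deque keeps only the most recent `count` lines in memory.
        return list(deque(handle, maxlen=count))
```

To capture the backtrace, the argument list can be passed to `subprocess.run` with `stdout` set to a file opened under the name from `backtrace_filename()`. Attaching gdb typically requires the same user as the mongod process or root.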

      Please let me know if you need anything else or you want me to try running some commands.

      Scope of changes

      Impact to Other Docs

      MVP (Work and Date)

      Resources (Scope or Design Docs, Invision, etc.)

              People

              Assignee:
              andrew.feierabend Andrew Feierabend (Inactive)
              Reporter:
              backlog-server-pm Backlog - Core Eng Program Management Team
              Participants:
              Last commenter:
              Backlog - Core Eng Program Management Team
              Votes:
              0
              Watchers:
              3

                Dates

                Created:
                Updated:
                Resolved:
                Days since reply:
                29 weeks, 6 days ago
                Date of 1st Reply: