[SERVER-74499] mongod 4.4 not restarting on centOS Created: 01/Mar/23  Updated: 01/Mar/23  Resolved: 01/Mar/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Critical - P2
Reporter: django unchained Assignee: Chris Kelly
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

OS : CentOS Linux release 7.9.2009 (Core)
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
RAM: 12 GB


Issue Links:
Duplicate
duplicates SERVER-74345 mongodb-org-server 4.4.19, 5.0.15, 6.... Closed
Operating System: ALL
Participants:

 Description   

I have around 6.0 GB of data on the file system. At one point, the mongod just stopped out of nowhere.

The last message in log while the service was still running is the following:

 

{"t":{"$date":"2023-02-24T07:04:09.259+01:00"},"s":"I",  "c":"CONTROL",  "id":23377,   "ctx":"SignalHandler","msg":"Received signal","attr":{"signal":15,"error":"Terminated"}} 

 

I tried upping the server RAM and performed a --repair on the data, but that did not help. Even using a different data folder while starting up is not working. From the logs, I fail to understand where else to look why mongod is not starting.

Maybe the OS is the culprit here?

mongod.conf:

 

# mongod.conf# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log# Where and how to store data.
storage:
  dbPath: /data/mongo  journal:
    enabled: true
#  engine:
#  wiredTiger:# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0  # Listen to local interface only, comment to listen on all interfaces.
#security:
# BEGIN ANSIBLE MANAGED BLOCK
security:
  authorization: enabled
# END ANSIBLE MANAGED BLOCK 

log after trying to restart:

{"t":{"$date":"2023-03-01T17:31:57.485+01:00"},"s":"I",  "c":"CONTROL",  "id":20698,   "ctx":"main","msg":"***** SERVER RESTARTED *****"}
{"t":{"$date":"2023-03-01T17:31:57.491+01:00"},"s":"I",  "c":"CONTROL",  "id":23285,   "ctx":"main","msg":"Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'"}
{"t":{"$date":"2023-03-01T17:31:57.510+01:00"},"s":"I",  "c":"NETWORK",  "id":4648601, "ctx":"main","msg":"Implicit TCP FastOpen unavailable. If TCP FastOpen is required, set tcpFastOpenServer, tcpFastOpenClient, and tcpFastOpenQueueSize."}
{"t":{"$date":"2023-03-01T17:31:57.510+01:00"},"s":"I",  "c":"STORAGE",  "id":4615611, "ctx":"initandlisten","msg":"MongoDB starting","attr":{"pid":1897,"port":27017,"dbPath":"/data/mongo","architecture":"64-bit","host":"example.com"}}
{"t":{"$date":"2023-03-01T17:31:57.510+01:00"},"s":"I",  "c":"CONTROL",  "id":23403,   "ctx":"initandlisten","msg":"Build Info","attr":{"buildInfo":{"version":"4.4.19","gitVersion":"9a996e0ad993148b9650dc402e6d3b1804ad3b8a","openSSLVersion":"OpenSSL 1.0.1e-fips 11 Feb 2013","modules":[],"allocator":"tcmalloc","environment":{"distmo
d":"rhel70","distarch":"x86_64","target_arch":"x86_64"}}}}
{"t":{"$date":"2023-03-01T17:31:57.510+01:00"},"s":"I",  "c":"CONTROL",  "id":51765,   "ctx":"initandlisten","msg":"Operating System","attr":{"os":{"name":"CentOS Linux release 7.9.2009 (Core)","version":"Kernel 3.10.0-1160.83.1.el7.x86_64"}}}
{"t":{"$date":"2023-03-01T17:31:57.510+01:00"},"s":"I",  "c":"CONTROL",  "id":21951,   "ctx":"initandlisten","msg":"Options set by command line","attr":{"options":{"config":"/etc/mongod.conf","net":{"bindIp":"0.0.0.0","port":27017},"processManagement":{"fork":true,"pidFilePath":"/var/run/mongodb/mongod.pid","timeZoneInfo":"/usr/sha
re/zoneinfo"},"security":{"authorization":"enabled"},"storage":{"dbPath":"/data/mongo","journal":{"enabled":true}},"systemLog":{"destination":"file","logAppend":true,"path":"/var/log/mongodb/mongod.log"}}}}
{"t":{"$date":"2023-03-01T17:31:57.511+01:00"},"s":"I",  "c":"STORAGE",  "id":22270,   "ctx":"initandlisten","msg":"Storage engine to use detected by data files","attr":{"dbpath":"/data/mongo","storageEngine":"wiredTiger"}}
{"t":{"$date":"2023-03-01T17:31:57.511+01:00"},"s":"I",  "c":"STORAGE",  "id":22315,   "ctx":"initandlisten","msg":"Opening WiredTiger","attr":{"config":"create,cache_size=5495M,session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000,close_scan_interval=10,close_handle_minimum=250),statistics_log=(wait=0),verbose=[recovery_progress,checkpoint_progress,compact_progress],"}}
{"t":{"$date":"2023-03-01T17:31:58.294+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1677688318:294080][1897:0x7f8cadc83bc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 388 through 389"}}
{"t":{"$date":"2023-03-01T17:31:58.378+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1677688318:378980][1897:0x7f8cadc83bc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 389 through 389"}}
{"t":{"$date":"2023-03-01T17:31:58.531+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1677688318:531717][1897:0x7f8cadc83bc0], txn-recover: [WT_VERB_RECOVERY | WT_VERB_RECOVERY_PROGRESS] Main recovery loop: starting at 388/5376 to 389/256"}}
{"t":{"$date":"2023-03-01T17:31:58.665+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1677688318:665731][1897:0x7f8cadc83bc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 388 through 389"}}
{"t":{"$date":"2023-03-01T17:31:58.762+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1677688318:762392][1897:0x7f8cadc83bc0], txn-recover: [WT_VERB_RECOVERY_PROGRESS] Recovering log 389 through 389"}}
{"t":{"$date":"2023-03-01T17:31:58.830+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1677688318:830165][1897:0x7f8cadc83bc0], txn-recover: [WT_VERB_RECOVERY | WT_VERB_RECOVERY_PROGRESS] Set global recovery timestamp: (0, 0)"}}
{"t":{"$date":"2023-03-01T17:31:58.830+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1677688318:830222][1897:0x7f8cadc83bc0], txn-recover: [WT_VERB_RECOVERY | WT_VERB_RECOVERY_PROGRESS] Set global oldest timestamp: (0, 0)"}}
{"t":{"$date":"2023-03-01T17:31:58.832+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"initandlisten","msg":"WiredTiger message","attr":{"message":"[1677688318:832669][1897:0x7f8cadc83bc0], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 1, snapshot max: 1 snapshot count: 0, oldest timestamp: (0, 0) , meta checkpoint timestamp: (0, 0) base write gen: 9312832"}}
{"t":{"$date":"2023-03-01T17:31:58.842+01:00"},"s":"I",  "c":"STORAGE",  "id":4795906, "ctx":"initandlisten","msg":"WiredTiger opened","attr":{"durationMillis":1331}}
{"t":{"$date":"2023-03-01T17:31:58.842+01:00"},"s":"I",  "c":"RECOVERY", "id":23987,   "ctx":"initandlisten","msg":"WiredTiger recoveryTimestamp","attr":{"recoveryTimestamp":{"$timestamp":{"t":0,"i":0}}}}
{"t":{"$date":"2023-03-01T17:31:58.850+01:00"},"s":"I",  "c":"STORAGE",  "id":22262,   "ctx":"initandlisten","msg":"Timestamp monitor starting"}
{"t":{"$date":"2023-03-01T17:31:58.853+01:00"},"s":"W",  "c":"CONTROL",  "id":22178,   "ctx":"initandlisten","msg":"/sys/kernel/mm/transparent_hugepage/enabled is 'always'. We suggest setting it to 'never'","tags":["startupWarnings"]}
{"t":{"$date":"2023-03-01T17:31:58.853+01:00"},"s":"W",  "c":"CONTROL",  "id":22181,   "ctx":"initandlisten","msg":"/sys/kernel/mm/transparent_hugepage/defrag is 'always'. We suggest setting it to 'never'","tags":["startupWarnings"]}
{"t":{"$date":"2023-03-01T17:31:58.867+01:00"},"s":"I",  "c":"STORAGE",  "id":20536,   "ctx":"initandlisten","msg":"Flow Control is enabled on this deployment"}
{"t":{"$date":"2023-03-01T17:31:58.870+01:00"},"s":"I",  "c":"FTDC",     "id":20625,   "ctx":"initandlisten","msg":"Initializing full-time diagnostic data capture","attr":{"dataDirectory":"/data/mongo/diagnostic.data"}}
{"t":{"$date":"2023-03-01T17:31:58.871+01:00"},"s":"I",  "c":"REPL",     "id":6015317, "ctx":"initandlisten","msg":"Setting new configuration state","attr":{"newState":"ConfigReplicationDisabled","oldState":"ConfigPreStart"}}
{"t":{"$date":"2023-03-01T17:31:58.873+01:00"},"s":"I",  "c":"NETWORK",  "id":23015,   "ctx":"listener","msg":"Listening on","attr":{"address":"/tmp/mongodb-27017.sock"}}
{"t":{"$date":"2023-03-01T17:31:58.873+01:00"},"s":"I",  "c":"NETWORK",  "id":23015,   "ctx":"listener","msg":"Listening on","attr":{"address":"0.0.0.0"}}
{"t":{"$date":"2023-03-01T17:31:58.873+01:00"},"s":"I",  "c":"NETWORK",  "id":23016,   "ctx":"listener","msg":"Waiting for connections","attr":{"port":27017,"ssl":"off"}}
{"t":{"$date":"2023-03-01T17:31:58.876+01:00"},"s":"I",  "c":"CONTROL",  "id":23377,   "ctx":"SignalHandler","msg":"Received signal","attr":{"signal":15,"error":"Terminated"}}
{"t":{"$date":"2023-03-01T17:31:58.876+01:00"},"s":"I",  "c":"CONTROL",  "id":23378,   "ctx":"SignalHandler","msg":"Signal was sent by kill(2)","attr":{"pid":1,"uid":0}}
{"t":{"$date":"2023-03-01T17:31:58.876+01:00"},"s":"I",  "c":"CONTROL",  "id":23381,   "ctx":"SignalHandler","msg":"will terminate after current cmd ends"}
{"t":{"$date":"2023-03-01T17:31:58.876+01:00"},"s":"I",  "c":"REPL",     "id":4784900, "ctx":"SignalHandler","msg":"Stepping down the ReplicationCoordinator for shutdown","attr":{"waitTimeMillis":10000}}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"COMMAND",  "id":4784901, "ctx":"SignalHandler","msg":"Shutting down the MirrorMaestro"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"SHARDING", "id":4784902, "ctx":"SignalHandler","msg":"Shutting down the WaitForMajorityService"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"CONTROL",  "id":4784903, "ctx":"SignalHandler","msg":"Shutting down the LogicalSessionCache"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"NETWORK",  "id":20562,   "ctx":"SignalHandler","msg":"Shutdown: going to close listening sockets"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"NETWORK",  "id":23017,   "ctx":"listener","msg":"removing socket file","attr":{"path":"/tmp/mongodb-27017.sock"}}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"NETWORK",  "id":4784905, "ctx":"SignalHandler","msg":"Shutting down the global connection pool"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"STORAGE",  "id":4784906, "ctx":"SignalHandler","msg":"Shutting down the FlowControlTicketholder"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"-",        "id":20520,   "ctx":"SignalHandler","msg":"Stopping further Flow Control ticket acquisitions."}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"STORAGE",  "id":4784908, "ctx":"SignalHandler","msg":"Shutting down the PeriodicThreadToAbortExpiredTransactions"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"STORAGE",  "id":4784934, "ctx":"SignalHandler","msg":"Shutting down the PeriodicThreadToDecreaseSnapshotHistoryCachePressure"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"REPL",     "id":4784909, "ctx":"SignalHandler","msg":"Shutting down the ReplicationCoordinator"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"SHARDING", "id":4784910, "ctx":"SignalHandler","msg":"Shutting down the ShardingInitializationMongoD"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"REPL",     "id":4784911, "ctx":"SignalHandler","msg":"Enqueuing the ReplicationStateTransitionLock for shutdown"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"-",        "id":4784912, "ctx":"SignalHandler","msg":"Killing all operations for shutdown"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"-",        "id":4695300, "ctx":"SignalHandler","msg":"Interrupted all currently running operations","attr":{"opsKilled":3}}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"COMMAND",  "id":4784913, "ctx":"SignalHandler","msg":"Shutting down all open transactions"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"REPL",     "id":4784914, "ctx":"SignalHandler","msg":"Acquiring the ReplicationStateTransitionLock for shutdown"}
{"t":{"$date":"2023-03-01T17:31:58.877+01:00"},"s":"I",  "c":"INDEX",    "id":4784915, "ctx":"SignalHandler","msg":"Shutting down the IndexBuildsCoordinator"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"REPL",     "id":4784916, "ctx":"SignalHandler","msg":"Reacquiring the ReplicationStateTransitionLock for shutdown"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"REPL",     "id":4784917, "ctx":"SignalHandler","msg":"Attempting to mark clean shutdown"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"NETWORK",  "id":4784918, "ctx":"SignalHandler","msg":"Shutting down the ReplicaSetMonitor"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"SHARDING", "id":4784921, "ctx":"SignalHandler","msg":"Shutting down the MigrationUtilExecutor"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"CONTROL",  "id":4784925, "ctx":"SignalHandler","msg":"Shutting down free monitoring"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"CONTROL",  "id":20609,   "ctx":"SignalHandler","msg":"Shutting down free monitoring"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":4784927, "ctx":"SignalHandler","msg":"Shutting down the HealthLog"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":4784929, "ctx":"SignalHandler","msg":"Acquiring the global lock for shutdown"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":4784930, "ctx":"SignalHandler","msg":"Shutting down the storage engine"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":22320,   "ctx":"SignalHandler","msg":"Shutting down journal flusher thread"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":22321,   "ctx":"SignalHandler","msg":"Finished shutting down journal flusher thread"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":20282,   "ctx":"SignalHandler","msg":"Deregistering all the collections"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":22261,   "ctx":"SignalHandler","msg":"Timestamp monitor shutting down"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":22317,   "ctx":"SignalHandler","msg":"WiredTigerKVEngine shutting down"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":22318,   "ctx":"SignalHandler","msg":"Shutting down session sweeper thread"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":22319,   "ctx":"SignalHandler","msg":"Finished shutting down session sweeper thread"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":22322,   "ctx":"SignalHandler","msg":"Shutting down checkpoint thread"}
{"t":{"$date":"2023-03-01T17:31:58.878+01:00"},"s":"I",  "c":"STORAGE",  "id":22323,   "ctx":"SignalHandler","msg":"Finished shutting down checkpoint thread"}
{"t":{"$date":"2023-03-01T17:31:58.879+01:00"},"s":"I",  "c":"STORAGE",  "id":4795902, "ctx":"SignalHandler","msg":"Closing WiredTiger","attr":{"closeConfig":"leak_memory=true,"}}
{"t":{"$date":"2023-03-01T17:31:58.881+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"SignalHandler","msg":"WiredTiger message","attr":{"message":"[1677688318:881176][1897:0x7f8ca8c9b700], close_ckpt: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 4, snapshot max: 4 snapshot count: 0, oldest timestamp: (0, 0), meta checkpoint timestamp: (0, 0) base write gen: 9312832"}}
{"t":{"$date":"2023-03-01T17:31:58.900+01:00"},"s":"I",  "c":"STORAGE",  "id":4795901, "ctx":"SignalHandler","msg":"WiredTiger closed","attr":{"durationMillis":21}}
{"t":{"$date":"2023-03-01T17:31:58.900+01:00"},"s":"I",  "c":"STORAGE",  "id":22279,   "ctx":"SignalHandler","msg":"shutdown: removing fs lock..."}
{"t":{"$date":"2023-03-01T17:31:58.900+01:00"},"s":"I",  "c":"-",        "id":4784931, "ctx":"SignalHandler","msg":"Dropping the scope cache for shutdown"}
{"t":{"$date":"2023-03-01T17:31:58.900+01:00"},"s":"I",  "c":"FTDC",     "id":4784926, "ctx":"SignalHandler","msg":"Shutting down full-time data capture"}
{"t":{"$date":"2023-03-01T17:31:58.900+01:00"},"s":"I",  "c":"FTDC",     "id":20626,   "ctx":"SignalHandler","msg":"Shutting down full-time diagnostic data capture"}
{"t":{"$date":"2023-03-01T17:31:58.900+01:00"},"s":"I",  "c":"CONTROL",  "id":20565,   "ctx":"SignalHandler","msg":"Now exiting"}
{"t":{"$date":"2023-03-01T17:31:58.900+01:00"},"s":"I",  "c":"CONTROL",  "id":23138,   "ctx":"SignalHandler","msg":"Shutting down","attr":{"exitCode":0}} 

 



 Comments   
Comment by Chris Kelly [ 01/Mar/23 ]

Hi farhadino@gmail.com,

This looks like SERVER-74345 which affects users who use MongoDB on the Debian or RPM packages, and upgraded to versions 4.4.19, 5.0.15, or 6.0.5. 

To update your configuration, consider running these commands (as root if needed):

 

systemctl stop mongod.service
sed -i.bak '/fork: true/d' /etc/mongod.conf   
systemctl start mongod.service

I'm going to close this ticket as a duplicate - please monitor the linked ticket for further information. 

Christopher

Comment by django unchained [ 01/Mar/23 ]

Sorry, I meant to write Mongo DB version 4.4, not 3.4. Please update the Title of the issue.

Generated at Thu Feb 08 06:27:36 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.