Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-37318

POD can't start in a kubernetes replicaset

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Closed
    • Major - P3
    • Resolution: Done
    • 4.0.2
    • None
    • None
    • ALL

    Description

      Hi,

      we have an environment made up of at least 3 preemptible nodes (max 5 nodes) running with Kubernetes on Google Cloud Platform.

      Over that we were working with a MongoDB replicaset made up of 3 PODs, version 3.6 docker image.

      Now we are testing MongoDB version 4.0.2 docker image.

      Yesterday we done a lot of tests without problems, also with different resources (CPU and memory limits) and different Kubernetes node types: these operations required to update, recreate and move PODs from nodepools and other ones. All tests yesterday was ok.

      Today the first POD go always to CrashLoopBackOff, others 2 PODs are secondary.

      In the logs we found this:

      2018-09-26T08:09:00.028+0000 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
      2018-09-26T08:09:00.028+0000 I CONTROL [main] note: noprealloc may hurt performance in many applications
      2018-09-26T08:09:00.034+0000 I CONTROL [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host="xxx"
      2018-09-26T08:09:00.034+0000 I CONTROL [initandlisten] db version v4.0.2
      2018-09-26T08:09:00.034+0000 I CONTROL [initandlisten] git version: fc1573ba18aee42f97a3bb13b67af7d837826b47
      2018-09-26T08:09:00.034+0000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
      2018-09-26T08:09:00.034+0000 I CONTROL [initandlisten] allocator: tcmalloc
      2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] modules: none
      2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] build environment:
      2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] distmod: ubuntu1604
      2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] distarch: x86_64
      2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] target_arch: x86_64
      2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] options: { net:

      { bindIp: "0.0.0.0" }

      , replication: { replSet: "xxx" }, storage: { mmapv1:

      { preallocDataFiles: false, smallFiles: true }

      } }
      2018-09-26T08:09:00.035+0000 W STORAGE [initandlisten] Detected unclean shutdown - /data/db/mongod.lock is not empty.
      2018-09-26T08:09:00.035+0000 I STORAGE [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
      2018-09-26T08:09:00.035+0000 W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
      2018-09-26T08:09:00.035+0000 I STORAGE [initandlisten]
      2018-09-26T08:09:00.035+0000 I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
      2018-09-26T08:09:00.035+0000 I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
      2018-09-26T08:09:00.035+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=7010M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
      2018-09-26T08:09:00.920+0000 I STORAGE [initandlisten] WiredTiger message [1537949340:920297][1:0x7f093a5bca00], txn-recover: Main recovery loop: starting at 358/256
      2018-09-26T08:09:00.920+0000 I STORAGE [initandlisten] WiredTiger message [1537949340:920849][1:0x7f093a5bca00], txn-recover: Recovering log 358 through 359
      2018-09-26T08:09:01.018+0000 I STORAGE [initandlisten] WiredTiger message [1537949341:18578][1:0x7f093a5bca00], txn-recover: Recovering log 359 through 359
      2018-09-26T08:09:01.113+0000 I STORAGE [initandlisten] WiredTiger message [1537949341:113092][1:0x7f093a5bca00], txn-recover: Set global recovery timestamp: 5baa520500000001
      2018-09-26T08:09:01.127+0000 I RECOVERY [initandlisten] WiredTiger recoveryTimestamp. Ts: Timestamp(1537888773, 1)
      2018-09-26T08:09:01.127+0000 I STORAGE [initandlisten] Triggering the first stable checkpoint. Initial Data: Timestamp(1537888773, 1) PrevStable: Timestamp(0, 0) CurrStable: Timestamp(1537888773, 1)
      2018-09-26T08:09:01.336+0000 I STORAGE [initandlisten] Starting WiredTigerRecordStoreThread local.oplog.rs
      2018-09-26T08:09:01.337+0000 I STORAGE [initandlisten] The size storer reports that the oplog contains 18753157 records totaling to 4939639735 bytes
      2018-09-26T08:09:01.337+0000 I STORAGE [initandlisten] Sampling from the oplog between Sep 25 12:59:40:46211 and Sep 25 15:18:21:1 to determine where to place markers for truncation
      2018-09-26T08:09:01.337+0000 I STORAGE [initandlisten] Taking 993 samples and assuming that each section of oplog contains approximately 188664 records totaling to 49694682 bytes
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 12:59:53:9643
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:00:07:2576
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:27:27632
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:44:4341
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:47:47383
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:51:33808
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:55:21357
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:59:7729
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:04:32:23512
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:05:40:4956
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:05:45:5332
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:05:59:23243
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:06:03:28610
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:06:07:34419
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:06:11:38253
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:06:15:39932
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:06:32:45647
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:48:49:15931
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:48:55:2698
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:00:30146
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:06:26530
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:12:18860
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:18:12411
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:29:10614
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:35:1915
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:40:33321
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:46:28232
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:52:25477
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:58:21096
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:50:25:736
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:50:30:35534
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:50:36:30659
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:50:42:28853
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:50:48:25546
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:51:23:25962
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:17:10717
      2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:23:19423
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:35:35771
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:41:6549
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:45:7728
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:49:5615
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:53:3866
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:57:11428
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:13:01:5984
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:15:1151
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:19:1424
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:22:32516
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:26:26611
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:30:29725
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:34:26472
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:38:20402
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:42:21922
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:46:20812
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:29:35488
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:33:34887
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:37:28908
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:41:28395
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:45:32895
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:50:794
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:53:32253
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:57:32519
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:31:04:10062
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:34:29:10436
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:34:36:14880
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:34:46:5600
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:34:57:8110
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:35:08:9095
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:35:19:7910
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:35:59:12359
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:36:10:12750
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:36:21:14260
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:36:32:18461
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:36:43:15327
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:36:54:10347
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:37:11:6788
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:43:17:17640
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:43:28:16675
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:43:40:368
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:43:51:2300
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:44:02:2576
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:46:34:7711
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:46:45:13268
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:46:56:18566
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:47:07:16849
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:47:18:18226
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:47:30:1623
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:10:46:9019
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:11:09:4009
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:11:32:6258
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:11:57:5684
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:12:20:6362
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:12:44:1273
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:13:07:3997
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:13:35:2151
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:14:59:7291
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:15:27:5583
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:15:50:6116
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:16:12:4857
      2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:16:32:351
      2018-09-26T08:09:01.421+0000 W STORAGE [initandlisten] Detected configuration for non-active storage engine mmapv1 when current storage engine is wiredTiger
      2018-09-26T08:09:01.421+0000 I CONTROL [initandlisten]
      2018-09-26T08:09:01.421+0000 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
      2018-09-26T08:09:01.421+0000 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
      2018-09-26T08:09:01.421+0000 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
      2018-09-26T08:09:01.422+0000 I CONTROL [initandlisten]
      2018-09-26T08:09:01.422+0000 I CONTROL [initandlisten]
      2018-09-26T08:09:01.422+0000 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
      2018-09-26T08:09:01.422+0000 I CONTROL [initandlisten] ** We suggest setting it to 'never'
      2018-09-26T08:09:01.422+0000 I CONTROL [initandlisten]
      2018-09-26T08:09:01.720+0000 I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
      2018-09-26T08:09:01.723+0000 I REPL [initandlisten] Rollback ID is 28
      2018-09-26T08:09:01.723+0000 I REPL [initandlisten] Recovering from stable timestamp: Timestamp(1537888773, 1) (top of oplog: { ts: Timestamp(1537888701, 1), t: 27 }, appliedThrough: { ts: Timestamp(1537888773, 1), t: 28 }, TruncateAfter: Timestamp(0, 0))
      2018-09-26T08:09:01.723+0000 I REPL [initandlisten] Starting recovery oplog application at the stable timestamp: Timestamp(1537888773, 1)
      2018-09-26T08:09:01.723+0000 F REPL [initandlisten] Applied op { : Timestamp(1537888773, 1) } not found. Top of oplog is { : Timestamp(1537888701, 1) }.
      2018-09-26T08:09:01.723+0000 F - [initandlisten] Fatal Assertion 40313 at src/mongo/db/repl/replication_recovery.cpp 345
      2018-09-26T08:09:01.723+0000 F - [initandlisten]

      ***aborting after fassert() failure

       

       We can't access to the POD so we can read files or db.

      How can we resolve our problem?

      Thank you

      Mauro

      Attachments

        Activity

          People

            kelsey.schubert@mongodb.com Kelsey Schubert
            maurotintori Mauro Tintori
            Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: