-
Type: Bug
-
Resolution: Done
-
Priority: Major - P3
-
None
-
Affects Version/s: 4.0.2
-
Component/s: Replication, Stability, WiredTiger
-
None
-
ALL
Hi,
we have an environment made up of at least 3 preemptible nodes (max 5 nodes) running with Kubernetes on Google Cloud Platform.
Over that we were working with a MongoDB replicaset made up of 3 PODs, version 3.6 docker image.
Now we are testing MongoDB version 4.0.2 docker image.
Yesterday we done a lot of tests without problems, also with different resources (CPU and memory limits) and different Kubernetes node types: these operations required to update, recreate and move PODs from nodepools and other ones. All tests yesterday was ok.
Today the first POD go always to CrashLoopBackOff, others 2 PODs are secondary.
In the logs we found this:
2018-09-26T08:09:00.028+0000 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2018-09-26T08:09:00.028+0000 I CONTROL [main] note: noprealloc may hurt performance in many applications
2018-09-26T08:09:00.034+0000 I CONTROL [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host="xxx"
2018-09-26T08:09:00.034+0000 I CONTROL [initandlisten] db version v4.0.2
2018-09-26T08:09:00.034+0000 I CONTROL [initandlisten] git version: fc1573ba18aee42f97a3bb13b67af7d837826b47
2018-09-26T08:09:00.034+0000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
2018-09-26T08:09:00.034+0000 I CONTROL [initandlisten] allocator: tcmalloc
2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] modules: none
2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] build environment:
2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] distmod: ubuntu1604
2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] distarch: x86_64
2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] target_arch: x86_64
2018-09-26T08:09:00.035+0000 I CONTROL [initandlisten] options: { net:
, replication: { replSet: "xxx" }, storage: { mmapv1:
{ preallocDataFiles: false, smallFiles: true } } }
2018-09-26T08:09:00.035+0000 W STORAGE [initandlisten] Detected unclean shutdown - /data/db/mongod.lock is not empty.
2018-09-26T08:09:00.035+0000 I STORAGE [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2018-09-26T08:09:00.035+0000 W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
2018-09-26T08:09:00.035+0000 I STORAGE [initandlisten]
2018-09-26T08:09:00.035+0000 I STORAGE [initandlisten] ** WARNING: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine
2018-09-26T08:09:00.035+0000 I STORAGE [initandlisten] ** See http://dochub.mongodb.org/core/prodnotes-filesystem
2018-09-26T08:09:00.035+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=7010M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2018-09-26T08:09:00.920+0000 I STORAGE [initandlisten] WiredTiger message [1537949340:920297][1:0x7f093a5bca00], txn-recover: Main recovery loop: starting at 358/256
2018-09-26T08:09:00.920+0000 I STORAGE [initandlisten] WiredTiger message [1537949340:920849][1:0x7f093a5bca00], txn-recover: Recovering log 358 through 359
2018-09-26T08:09:01.018+0000 I STORAGE [initandlisten] WiredTiger message [1537949341:18578][1:0x7f093a5bca00], txn-recover: Recovering log 359 through 359
2018-09-26T08:09:01.113+0000 I STORAGE [initandlisten] WiredTiger message [1537949341:113092][1:0x7f093a5bca00], txn-recover: Set global recovery timestamp: 5baa520500000001
2018-09-26T08:09:01.127+0000 I RECOVERY [initandlisten] WiredTiger recoveryTimestamp. Ts: Timestamp(1537888773, 1)
2018-09-26T08:09:01.127+0000 I STORAGE [initandlisten] Triggering the first stable checkpoint. Initial Data: Timestamp(1537888773, 1) PrevStable: Timestamp(0, 0) CurrStable: Timestamp(1537888773, 1)
2018-09-26T08:09:01.336+0000 I STORAGE [initandlisten] Starting WiredTigerRecordStoreThread local.oplog.rs
2018-09-26T08:09:01.337+0000 I STORAGE [initandlisten] The size storer reports that the oplog contains 18753157 records totaling to 4939639735 bytes
2018-09-26T08:09:01.337+0000 I STORAGE [initandlisten] Sampling from the oplog between Sep 25 12:59:40:46211 and Sep 25 15:18:21:1 to determine where to place markers for truncation
2018-09-26T08:09:01.337+0000 I STORAGE [initandlisten] Taking 993 samples and assuming that each section of oplog contains approximately 188664 records totaling to 49694682 bytes
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 12:59:53:9643
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:00:07:2576
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:27:27632
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:44:4341
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:47:47383
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:51:33808
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:55:21357
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:03:59:7729
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:04:32:23512
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:05:40:4956
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:05:45:5332
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:05:59:23243
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:06:03:28610
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:06:07:34419
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:06:11:38253
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:06:15:39932
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:06:32:45647
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:48:49:15931
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:48:55:2698
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:00:30146
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:06:26530
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:12:18860
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:18:12411
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:29:10614
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:35:1915
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:40:33321
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:46:28232
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:52:25477
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:49:58:21096
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:50:25:736
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:50:30:35534
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:50:36:30659
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:50:42:28853
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:50:48:25546
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 13:51:23:25962
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:17:10717
2018-09-26T08:09:01.405+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:23:19423
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:35:35771
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:41:6549
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:45:7728
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:49:5615
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:53:3866
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:12:57:11428
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:13:01:5984
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:15:1151
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:19:1424
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:22:32516
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:26:26611
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:30:29725
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:34:26472
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:38:20402
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:42:21922
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:17:46:20812
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:29:35488
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:33:34887
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:37:28908
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:41:28395
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:45:32895
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:50:794
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:53:32253
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:20:57:32519
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:31:04:10062
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:34:29:10436
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:34:36:14880
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:34:46:5600
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:34:57:8110
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:35:08:9095
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:35:19:7910
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:35:59:12359
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:36:10:12750
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:36:21:14260
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:36:32:18461
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:36:43:15327
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:36:54:10347
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:37:11:6788
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:43:17:17640
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:43:28:16675
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:43:40:368
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:43:51:2300
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:44:02:2576
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:46:34:7711
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:46:45:13268
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:46:56:18566
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:47:07:16849
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:47:18:18226
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 14:47:30:1623
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:10:46:9019
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:11:09:4009
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:11:32:6258
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:11:57:5684
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:12:20:6362
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:12:44:1273
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:13:07:3997
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:13:35:2151
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:14:59:7291
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:15:27:5583
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:15:50:6116
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:16:12:4857
2018-09-26T08:09:01.406+0000 I STORAGE [initandlisten] Placing a marker at optime Sep 25 15:16:32:351
2018-09-26T08:09:01.421+0000 W STORAGE [initandlisten] Detected configuration for non-active storage engine mmapv1 when current storage engine is wiredTiger
2018-09-26T08:09:01.421+0000 I CONTROL [initandlisten]
2018-09-26T08:09:01.421+0000 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2018-09-26T08:09:01.421+0000 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2018-09-26T08:09:01.421+0000 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2018-09-26T08:09:01.422+0000 I CONTROL [initandlisten]
2018-09-26T08:09:01.422+0000 I CONTROL [initandlisten]
2018-09-26T08:09:01.422+0000 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2018-09-26T08:09:01.422+0000 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2018-09-26T08:09:01.422+0000 I CONTROL [initandlisten]
2018-09-26T08:09:01.720+0000 I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
2018-09-26T08:09:01.723+0000 I REPL [initandlisten] Rollback ID is 28
2018-09-26T08:09:01.723+0000 I REPL [initandlisten] Recovering from stable timestamp: Timestamp(1537888773, 1) (top of oplog: { ts: Timestamp(1537888701, 1), t: 27 }, appliedThrough: { ts: Timestamp(1537888773, 1), t: 28 }, TruncateAfter: Timestamp(0, 0))
2018-09-26T08:09:01.723+0000 I REPL [initandlisten] Starting recovery oplog application at the stable timestamp: Timestamp(1537888773, 1)
2018-09-26T08:09:01.723+0000 F REPL [initandlisten] Applied op { : Timestamp(1537888773, 1) } not found. Top of oplog is { : Timestamp(1537888701, 1) }.
2018-09-26T08:09:01.723+0000 F - [initandlisten] Fatal Assertion 40313 at src/mongo/db/repl/replication_recovery.cpp 345
2018-09-26T08:09:01.723+0000 F - [initandlisten]
***aborting after fassert() failure
We can't access to the POD so we can read files or db.
How can we resolve our problem?
Thank you
Mauro