- Type: Bug
- Resolution: Done
- Priority: Major - P3
- Affects Version/s: None
- Component/s: None
Hi,
I'm running MongoDB 4.2.18 in a three-node replica set. After a system update, only two of the three replicas came back up. The third replica cannot start: it aborts with the error "Common point must be at least stable timestamp".
Mongodb will be initialized as cluster replicaset
2022-05-12T08:43:58.818+0000 I CONTROL [main] Automatically disabling TLS 1.0, to force-enable TLS 1.0 specify --sslDisabledProtocols 'none'
2022-05-12T08:43:58.819+0000 W ASIO [main] No TransportLayer configured during NetworkInterface startup
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=a18c0ccd6d96
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] db version v4.2.18
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] git version: f65ce5e25c0b26a00d091a4d24eec1a8b3a4c016
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.1.1 11 Sep 2018
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] allocator: tcmalloc
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] modules: none
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] build environment:
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] distmod: ubuntu1804
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] distarch: x86_64
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] target_arch: x86_64
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] 16384 MB of memory available to the process out of 253753 MB total system memory
2022-05-12T08:43:58.902+0000 I CONTROL [initandlisten] options: { net: { bindIp: "*" }, replication: { oplogSizeMB: 5120, replSet: "rs0" }, security: { authorization: "enabled", keyFile: "/run/secrets/replicakeyfile" }, setParameter: { diagnosticDataCollectionEnabled: "false", internalQueryExecMaxBlockingSortBytes: "104857600" }, storage: { journal: { enabled: true } } }
2022-05-12T08:43:58.903+0000 W STORAGE [initandlisten] Detected unclean shutdown - /data/db/mongod.lock is not empty.
2022-05-12T08:43:58.903+0000 I STORAGE [initandlisten] Detected data files in /data/db created by the 'wiredTiger' storage engine, so setting the active storage engine to 'wiredTiger'.
2022-05-12T08:43:58.903+0000 W STORAGE [initandlisten] Recovering data from the last clean checkpoint.
2022-05-12T08:43:58.903+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=7680M,cache_overflow=(file_max=0M),session_max=33000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000,close_scan_interval=10,close_handle_minimum=250),statistics_log=(wait=0),verbose=[recovery_progress,checkpoint_progress],
2022-05-12T08:43:59.509+0000 I STORAGE [initandlisten] WiredTiger message [1652345039:509891][1:0x7f0061e2bb00], txn-recover: Recovering log 2289 through 2290
2022-05-12T08:43:59.584+0000 I STORAGE [initandlisten] WiredTiger message [1652345039:584257][1:0x7f0061e2bb00], txn-recover: Recovering log 2290 through 2290
2022-05-12T08:43:59.709+0000 I STORAGE [initandlisten] WiredTiger message [1652345039:709408][1:0x7f0061e2bb00], txn-recover: Main recovery loop: starting at 2289/256 to 2290/256
2022-05-12T08:43:59.709+0000 I STORAGE [initandlisten] WiredTiger message [1652345039:709900][1:0x7f0061e2bb00], txn-recover: Recovering log 2289 through 2290
2022-05-12T08:43:59.805+0000 I STORAGE [initandlisten] WiredTiger message [1652345039:805211][1:0x7f0061e2bb00], file:collection-10-199254800462122439.wt, txn-recover: Recovering log 2290 through 2290
2022-05-12T08:43:59.879+0000 I STORAGE [initandlisten] WiredTiger message [1652345039:879915][1:0x7f0061e2bb00], file:collection-10-199254800462122439.wt, txn-recover: Set global recovery timestamp: (1652199868, 2)
2022-05-12T08:43:59.886+0000 I RECOVERY [initandlisten] WiredTiger recoveryTimestamp. Ts: Timestamp(1652199868, 2)
2022-05-12T08:43:59.888+0000 I STORAGE [initandlisten] No table logging settings modifications are required for existing WiredTiger tables. Logging enabled? 0
2022-05-12T08:43:59.890+0000 I STORAGE [initandlisten] Starting OplogTruncaterThread local.oplog.rs
2022-05-12T08:43:59.890+0000 I STORAGE [initandlisten] The size storer reports that the oplog contains 133091 records totaling to 17956205 bytes
2022-05-12T08:43:59.890+0000 I STORAGE [initandlisten] Sampling the oplog to determine where to place markers for truncation
2022-05-12T08:43:59.892+0000 I STORAGE [initandlisten] Sampling from the oplog between Apr 26 13:17:05:1 and May 10 16:24:28:2 to determine where to place markers for truncation
2022-05-12T08:43:59.892+0000 I STORAGE [initandlisten] Taking 3 samples and assuming that each section of oplog contains approximately 397928 records totaling to 53687151 bytes
2022-05-12T08:43:59.892+0000 I STORAGE [initandlisten] WiredTiger record store oplog processing took 2ms
2022-05-12T08:43:59.893+0000 I STORAGE [initandlisten] Timestamp monitor starting
2022-05-12T08:43:59.893+0000 I CONTROL [initandlisten] ** WARNING: You are running this process as the root user, which is not recommended.
2022-05-12T08:43:59.893+0000 I CONTROL [initandlisten]
2022-05-12T08:43:59.894+0000 I CONTROL [initandlisten]
2022-05-12T08:43:59.894+0000 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/enabled is 'always'.
2022-05-12T08:43:59.894+0000 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2022-05-12T08:43:59.894+0000 I CONTROL [initandlisten]
2022-05-12T08:43:59.894+0000 I CONTROL [initandlisten] ** WARNING: /sys/kernel/mm/transparent_hugepage/defrag is 'always'.
2022-05-12T08:43:59.894+0000 I CONTROL [initandlisten] ** We suggest setting it to 'never'
2022-05-12T08:43:59.894+0000 I CONTROL [initandlisten]
2022-05-12T08:43:59.899+0000 I STORAGE [initandlisten] Flow Control is enabled on this deployment.
2022-05-12T08:43:59.901+0000 I FTDC [initandlisten] Initializing full-time diagnostic data capture with directory '/data/db/diagnostic.data'
2022-05-12T08:43:59.905+0000 I REPL [initandlisten] Rollback ID is 1
2022-05-12T08:43:59.906+0000 I REPL [initandlisten] Recovering from stable timestamp: Timestamp(1652199868, 2) (top of oplog: { ts: Timestamp(1652199868, 2), t: 15 }, appliedThrough: { ts: Timestamp(1652199868, 2), t: 15 }, TruncateAfter: Timestamp(0, 0))
2022-05-12T08:43:59.906+0000 I REPL [initandlisten] Starting recovery oplog application at the stable timestamp: Timestamp(1652199868, 2)
2022-05-12T08:43:59.906+0000 I REPL [initandlisten] No oplog entries to apply for recovery. Start point is at the top of the oplog.
2022-05-12T08:43:59.908+0000 I CONTROL [LogicalSessionCacheRefresh] Sessions collection is not set up; waiting until next sessions refresh interval: Replication has not yet been configured
2022-05-12T08:43:59.909+0000 I CONTROL [LogicalSessionCacheReap] Failed to reap transaction table: NotYetInitialized: Replication has not yet been configured
2022-05-12T08:43:59.909+0000 I NETWORK [listener] Listening on /tmp/mongodb-27017.sock
2022-05-12T08:43:59.909+0000 I NETWORK [listener] Listening on 0.0.0.0
2022-05-12T08:43:59.909+0000 I NETWORK [listener] waiting for connections on port 27017
2022-05-12T08:44:03.153+0000 W REPL [replexec-0] Locally stored replica set configuration does not have a valid entry for the current node; waiting for reconfig or remote heartbeat; Got "NodeNotFound: No host described in new configuration 136369957 for replica set rs0 maps to this node" while validating { _id: "rs0", version: 136369957, protocolVersion: 1, writeConcernMajorityJournalDefault: true, members: [ { _id: 1, host: "10.0.10.130:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 0, host: "10.0.11.89:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "10.0.11.156:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: -1, catchUpTakeoverDelayMillis: 30000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('62572b1492bafe31c889776d') } }
2022-05-12T08:44:03.153+0000 I REPL [replexec-0] New replica set config in use: { _id: "rs0", version: 136369957, protocolVersion: 1, writeConcernMajorityJournalDefault: true, members: [ { _id: 1, host: "10.0.10.130:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 0, host: "10.0.11.89:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "10.0.11.156:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: -1, catchUpTakeoverDelayMillis: 30000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('62572b1492bafe31c889776d') } }
2022-05-12T08:44:03.154+0000 I REPL [replexec-0] This node is not a member of the config
2022-05-12T08:44:03.154+0000 I REPL [replexec-0] transition to REMOVED from STARTUP
2022-05-12T08:44:37.890+0000 I NETWORK [listener] connection accepted from 127.0.0.1:34882 #3 (1 connection now open)
2022-05-12T08:44:37.891+0000 I NETWORK [conn3] received client metadata from 127.0.0.1:34882 conn3: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.2.18" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "18.04" } }
2022-05-12T08:44:37.896+0000 I NETWORK [conn3] end connection 127.0.0.1:34882 (0 connections now open)
2022-05-12T08:44:42.257+0000 I NETWORK [listener] connection accepted from 127.0.0.1:35764 #4 (1 connection now open)
2022-05-12T08:44:42.258+0000 I NETWORK [conn4] received client metadata from 127.0.0.1:35764 conn4: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.2.18" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "18.04" } }
2022-05-12T08:44:42.283+0000 I ACCESS [conn4] Successfully authenticated as principal admin on admin from client 127.0.0.1:35764
2022-05-12T08:44:42.293+0000 I NETWORK [conn4] end connection 127.0.0.1:35764 (0 connections now open)
2022-05-12T08:44:43.242+0000 I NETWORK [listener] connection accepted from 127.0.0.1:36306 #5 (1 connection now open)
2022-05-12T08:44:43.243+0000 I NETWORK [conn5] received client metadata from 127.0.0.1:36306 conn5: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.2.18" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "18.04" } }
2022-05-12T08:44:43.267+0000 I ACCESS [conn5] Successfully authenticated as principal admin on admin from client 127.0.0.1:36306
2022-05-12T08:44:43.274+0000 I NETWORK [conn5] end connection 127.0.0.1:36306 (0 connections now open)
2022-05-12T08:44:43.517+0000 I NETWORK [listener] connection accepted from 127.0.0.1:36452 #6 (1 connection now open)
2022-05-12T08:44:43.518+0000 I NETWORK [conn6] received client metadata from 127.0.0.1:36452 conn6: { application: { name: "MongoDB Shell" }, driver: { name: "MongoDB Internal Client", version: "4.2.18" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "18.04" } }
2022-05-12T08:44:43.541+0000 I ACCESS [conn6] Successfully authenticated as principal admin on admin from client 127.0.0.1:36452
2022-05-12T08:44:43.547+0000 I NETWORK [conn6] end connection 127.0.0.1:36452 (0 connections now open)
2022-05-12T08:44:44.375+0000 I NETWORK [listener] connection accepted from 10.0.10.130:58370 #7 (1 connection now open)
2022-05-12T08:44:44.442+0000 I ACCESS [conn7] Successfully authenticated as principal __system on local from client 10.0.10.130:58370
2022-05-12T08:44:44.442+0000 I NETWORK [conn7] end connection 10.0.10.130:58370 (0 connections now open)
2022-05-12T08:44:44.443+0000 I NETWORK [listener] connection accepted from 10.0.10.130:58376 #8 (1 connection now open)
2022-05-12T08:44:44.444+0000 I NETWORK [conn8] received client metadata from 10.0.10.130:58376 conn8: { driver: { name: "NetworkInterfaceTL", version: "4.2.18" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "18.04" } }
2022-05-12T08:44:44.445+0000 I ACCESS [conn8] Successfully authenticated as principal __system on local from client 10.0.10.130:58376
2022-05-12T08:44:44.446+0000 I CONNPOOL [Replication] Connecting to 10.0.10.130:27017
2022-05-12T08:44:44.458+0000 I NETWORK [listener] connection accepted from 10.0.11.89:38938 #10 (2 connections now open)
2022-05-12T08:44:44.478+0000 I REPL [replexec-0] New replica set config in use: { _id: "rs0", version: 136466117, protocolVersion: 1, writeConcernMajorityJournalDefault: true, members: [ { _id: 1, host: "10.0.10.130:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 0, host: "10.0.11.89:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 2, host: "10.0.11.157:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: -1, catchUpTakeoverDelayMillis: 30000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('62572b1492bafe31c889776d') } }
2022-05-12T08:44:44.478+0000 I REPL [replexec-0] This node is 10.0.11.157:27017 in the config
2022-05-12T08:44:44.478+0000 I REPL [replexec-0] Scheduling a task to begin or continue replication
2022-05-12T08:44:44.478+0000 I REPL [replexec-0] transition to STARTUP2 from REMOVED
2022-05-12T08:44:44.478+0000 I REPL [replexec-4] Starting replication storage threads
2022-05-12T08:44:44.479+0000 I CONNPOOL [Replication] Connecting to 10.0.11.89:27017
2022-05-12T08:44:44.479+0000 I REPL [replexec-1] Member 10.0.10.130:27017 is now in state PRIMARY
2022-05-12T08:44:44.479+0000 I REPL [replexec-4] transition to RECOVERING from STARTUP2
2022-05-12T08:44:44.479+0000 I REPL [replexec-4] Starting replication fetcher thread
2022-05-12T08:44:44.479+0000 I REPL [replexec-4] Starting replication applier thread
2022-05-12T08:44:44.479+0000 I REPL [replexec-4] Starting replication reporter thread
2022-05-12T08:44:44.480+0000 I REPL [rsSync-0] Starting oplog application
2022-05-12T08:44:44.480+0000 I REPL [rsBackgroundSync] waiting for 3 pings from other members before syncing
2022-05-12T08:44:44.480+0000 I REPL [rsSync-0] transition to SECONDARY from RECOVERING
2022-05-12T08:44:44.480+0000 I REPL [rsSync-0] Resetting sync source to empty, which was :27017
2022-05-12T08:44:44.483+0000 I REPL [replexec-3] Member 10.0.11.89:27017 is now in state SECONDARY
2022-05-12T08:44:44.484+0000 I NETWORK [listener] connection accepted from 10.0.11.89:38940 #14 (3 connections now open)
2022-05-12T08:44:44.484+0000 I NETWORK [conn14] received client metadata from 10.0.11.89:38940 conn14: { driver: { name: "NetworkInterfaceTL", version: "4.2.18" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "18.04" } }
2022-05-12T08:44:44.536+0000 I ACCESS [conn10] Successfully authenticated as principal __system on local from client 10.0.11.89:38938
2022-05-12T08:44:44.536+0000 I NETWORK [conn10] end connection 10.0.11.89:38938 (2 connections now open)
2022-05-12T08:44:44.555+0000 I NETWORK [listener] connection accepted from 10.0.11.89:38944 #15 (3 connections now open)
2022-05-12T08:44:44.555+0000 I ACCESS [conn14] Successfully authenticated as principal __system on local from client 10.0.11.89:38940
2022-05-12T08:44:44.555+0000 I NETWORK [conn15] received client metadata from 10.0.11.89:38944 conn15: { driver: { name: "NetworkInterfaceTL", version: "4.2.18" }, os: { type: "Linux", name: "Ubuntu", architecture: "x86_64", version: "18.04" } }
2022-05-12T08:44:44.558+0000 I ACCESS [conn15] Successfully authenticated as principal __system on local from client 10.0.11.89:38944
2022-05-12T08:44:45.222+0000 I NETWORK [listener] connection accepted from 10.0.9.34:59800 #16 (4 connections now open)
2022-05-12T08:44:45.222+0000 I NETWORK [listener] connection accepted from 10.0.9.34:59801 #17 (5 connections now open)
2022-05-12T08:44:45.222+0000 I NETWORK [conn16] received client metadata from 10.0.9.34:59800 conn16: { driver: { name: "mongo-java-driver|sync|spring-data", version: "4.1.2" }, os: { type: "Linux", name: "Linux", architecture: "amd64", version: "3.10.0-1160.49.1.el7.x86_64" }, platform: "Java/AdoptOpenJDK/15.0.2+7", application: { name: "analytics-core-ms" } }
2022-05-12T08:44:45.222+0000 I NETWORK [conn17] received client metadata from 10.0.9.34:59801 conn17: { driver: { name: "mongo-java-driver|sync|spring-data", version: "4.1.2" }, os: { type: "Linux", name: "Linux", architecture: "amd64", version: "3.10.0-1160.49.1.el7.x86_64" }, platform: "Java/AdoptOpenJDK/15.0.2+7", application: { name: "analytics-core-ms" } }
2022-05-12T08:44:45.480+0000 I REPL [rsBackgroundSync] sync source candidate: 10.0.11.89:27017
2022-05-12T08:44:45.480+0000 I CONNPOOL [RS] Connecting to 10.0.11.89:27017
2022-05-12T08:44:45.565+0000 I REPL [rsBackgroundSync] Changed sync source from empty to 10.0.11.89:27017
2022-05-12T08:44:45.612+0000 I REPL [rsBackgroundSync] Starting rollback due to OplogStartMissing: Our last optime fetched: { ts: Timestamp(1652199868, 2), t: 15 }. source's GTE: { ts: Timestamp(1652201127, 2), t: 15 }
2022-05-12T08:44:45.612+0000 I REPL [rsBackgroundSync] Replication commit point: { ts: Timestamp(0, 0), t: -1 }
2022-05-12T08:44:45.612+0000 I REPL [rsBackgroundSync] Rollback using 'recoverToStableTimestamp' method.
2022-05-12T08:44:45.612+0000 I REPL [rsBackgroundSync] Scheduling rollback (sync source: 10.0.11.89:27017)
2022-05-12T08:44:45.612+0000 I ROLLBACK [rsBackgroundSync] transition to ROLLBACK
2022-05-12T08:44:45.612+0000 I REPL [rsBackgroundSync] State transition ops metrics: { lastStateTransition: "rollback", userOpsKilled: 0, userOpsRunning: 6 }
2022-05-12T08:44:45.612+0000 I REPL [rsBackgroundSync] transition to ROLLBACK from SECONDARY
2022-05-12T08:44:45.612+0000 I NETWORK [rsBackgroundSync] Skip closing connection for connection # 15
2022-05-12T08:44:45.612+0000 I NETWORK [rsBackgroundSync] Skip closing connection for connection # 14
2022-05-12T08:44:45.612+0000 I NETWORK [rsBackgroundSync] Skip closing connection for connection # 8
2022-05-12T08:44:45.612+0000 I NETWORK [conn16] end connection 10.0.9.34:59800 (4 connections now open)
2022-05-12T08:44:45.612+0000 I NETWORK [conn17] end connection 10.0.9.34:59801 (3 connections now open)
2022-05-12T08:44:45.612+0000 I ROLLBACK [rsBackgroundSync] Waiting for all background operations to complete before starting rollback
2022-05-12T08:44:45.612+0000 I ROLLBACK [rsBackgroundSync] Finished waiting for background operations to complete before rollback
2022-05-12T08:44:45.612+0000 I ROLLBACK [rsBackgroundSync] finding common point
2022-05-12T08:44:45.677+0000 I ROLLBACK [rsBackgroundSync] Rollback common point is { ts: Timestamp(1652199863, 1), t: 14 }
2022-05-12T08:44:45.677+0000 F ROLLBACK [rsBackgroundSync] Common point must be at least stable timestamp, common point: Timestamp(1652199863, 1), stable timestamp: Timestamp(1652199868, 2)
2022-05-12T08:44:45.677+0000 F - [rsBackgroundSync] Fatal Assertion 51121 at src/mongo/db/repl/rollback_impl.cpp 969
2022-05-12T08:44:45.677+0000 F - [rsBackgroundSync] ***aborting after fassert() failure