[SERVER-61723] Two primary nodes found in one replica set. Created: 24/Nov/21  Updated: 10/Jun/22  Resolved: 22/Dec/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.2.17
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Zijun Tian Assignee: Edwin Zhou
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

We have a replica set of 5 nodes.

According to the log, the primary node was handling an aggregation that drove its CPU utilization to 100% within a short time. After 10 seconds without a response from the primary, an election started and a new primary was elected. The election result was sent to the previous primary, and its log showed it stepping down and changing its state to secondary. However, the state did not actually change, for reasons unknown, and when we run the `rs.status()` command on any node in the cluster we see two primary nodes at the same time (although the other 3 secondary nodes sync from the new primary).

As a result, some users connecting to the cluster with PyMongo ran into connection issues while others did not. My guess is that the affected users were connected to the wrong primary node (the previous one).

We tried removing the previous primary and adding it back, but there were still two primary nodes. We then had to reboot the previous primary and add it back to the cluster; this time it went into the ROLLBACK state and, after several minutes, became a secondary.
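
For illustration, the split view can be confirmed from the shell. This is a minimal sketch (not part of the original report): it assumes port 27017 and omits authentication options, and the host names are taken from the rs.status() output attached below.

# List the members that the queried node currently reports as PRIMARY:
mongo --quiet --eval 'rs.status().members.filter(function (m) { return m.stateStr === "PRIMARY"; }).forEach(function (m) { print(m.name); })'

# Ask each member directly which node it believes is primary:
for host in runner6.tuc.tusimple.ai 172.30.73.20 172.30.74.112 172.30.76.235 runner39.sm.tusimple.ai; do
  mongo --quiet --host "$host" --port 27017 --eval 'print(db.isMaster().primary)'
done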



 Comments   
Comment by Edwin Zhou [ 22/Dec/21 ]

Hi zijun.tian@tusimple.ai,

I haven't heard from you in a while so I will now close this ticket. Please let us know if you run into this issue again after attempting to profile the database on the latest version of MongoDB v4.2 and are able to collect stack traces.

Best,
Edwin

Comment by Edwin Zhou [ 06/Dec/21 ]

Hi zijun.tian@tusimple.ai,

Thank you for following up and for your effort in this investigation. Knowing that profiling was active during this time strongly narrows this behavior down to SERVER-59226, but we will need stack traces to confirm. I will leave this ticket in "Waiting for Reporter" while we await your collection of stack traces.

Best,
Edwin

Comment by Zijun Tian [ 02/Dec/21 ]

Yes, we had the profiling level set to 2 at the time. I will collect the stack traces if it happens again.

Thanks!
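
(For reference, a minimal sketch of checking and resetting the profiler from the mongo shell; <db_name> is a placeholder for the database that was being profiled, and authentication options are omitted.)

# Show the current profiler level and slowms threshold:
mongo --quiet <db_name> --eval 'printjson(db.getProfilingStatus())'

# Turn the profiler off (level 0) until the 4.2.18 fix is deployed:
mongo --quiet <db_name> --eval 'printjson(db.setProfilingLevel(0))'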

Comment by Edwin Zhou [ 02/Dec/21 ]

Hi zijun.tian@tusimple.ai,

In my initial look at this diagnostic data, we do indeed see two nodes appearing as primaries. There appears to be contention on the ReplicationStateTransitionLock on the node that is stepping down. It's possible that this is an occurrence of SERVER-59226, which causes contention on this lock during database profiling. This behavior is fixed in 4.2.18, and I recommend upgrading to the latest version of 4.2.x.

Can you confirm whether you were profiling the database at this time? There may be a different behavior that exhibits this issue, but we need stack traces to confirm that this is the case.

If you are able to reproduce this behavior, or if it happens again, can you collect stack traces on the stalled node, that is, the primary that is stalled and unable to step down?

gdb -p $(pidof mongod) -batch -ex 'thread apply all bt' > gdb_`date +"%Y-%m-%d_%H-%M-%S"`.txt

Best,
Edwin

Comment by Edwin Zhou [ 30/Nov/21 ]

Hi zijun.tian@tusimple.ai,

Thank you again for following up. I'm confirming that I've received your diagnostic data files.

Best,
Edwin

Comment by Zijun Tian [ 30/Nov/21 ]

Hi Edwin,

It seems I only replaced the filename once. I have now replaced both instances and uploaded them again. Please have a look, thanks!

Comment by Edwin Zhou [ 30/Nov/21 ]

Hi zijun.tian@tusimple.ai,

I apologize for the inconvenience; however, the data still isn't in the portal. Can you confirm that you're replacing the two instances of <filename> in the curl command? One instance is in -F attributes and the second instance is in -F file.

Thanks for your patience.

Best,
Edwin
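
(For illustration only, a hypothetical shape of such an upload command; the portal URL and the attributes payload here are placeholders rather than the actual portal's API, and <filename> marks the two spots referred to above.)

# Hypothetical sketch: <filename> must be substituted in both -F fields.
curl -X POST "https://<upload-portal-url>" \
     -F 'attributes={"name": "<filename>"}' \
     -F 'file=@<filename>'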

Comment by Zijun Tian [ 30/Nov/21 ]

Hi Edwin,

I uploaded it again; it's a zip file called "mongo.zip", 167 MB in size.

Can you confirm that?

Thanks!

Comment by Edwin Zhou [ 30/Nov/21 ]

Hi zijun.tian@tusimple.ai,

Thank you for following up. Unfortunately I don't see that the diagnostic data has been uploaded to the upload portal. Can you try again and confirm?

Best,
Edwin

Comment by Zijun Tian [ 30/Nov/21 ]

Hi Edwin,

I've uploaded them.

Comment by Edwin Zhou [ 29/Nov/21 ]

Hi zijun.tian@tusimple.ai,

Thank you for following up, and I apologize for the confusion. The metrics files that you see are indeed the FTDC files I previously referred to. You may upload all of the log files and the diagnostic.data directory for every node.

Best,
Edwin

Comment by Zijun Tian [ 29/Nov/21 ]

Hi Edwin,

I did not see any files called "ftdc" under the diagnostic.data folder. The only file missing there is "metrics.interim".

Will it be OK if I just upload all the logs & diagnostic.data from the 5 nodes of the replica set?

Comment by Edwin Zhou [ 29/Nov/21 ]

Hi zijun.tian@tusimple.ai,

Thank you for uploading the mongod log file and the timestamp of the incident; however, it's unclear which node this log file is associated with. Can you please also upload the log files from the remaining nodes and the FTDC data from the $dbpath/diagnostic.data directory (the contents are described here) for the entire replica set?

Best,
Edwin

Comment by Zijun Tian [ 29/Nov/21 ]

Hi @Edwin Zhou

I uploaded the files as a zip file "mongo.zip".

The incident happened on Nov. 23 at 21:21:00 (UTC).

Comment by Edwin Zhou [ 29/Nov/21 ]

Hi zijun.tian@tusimple.ai,

Would you please archive (tar or zip) the mongod.log files and the $dbpath/diagnostic.data directory (the contents are described here) and upload them to this support uploader location?

Files uploaded to this portal are visible only to MongoDB employees and are routinely deleted after some time.

Best,
Edwin
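
(A minimal sketch of the archiving step, with placeholder paths; run once per node before uploading.)

# Bundle the mongod log and the diagnostic.data directory into one archive:
tar czvf mongo-<hostname>.tar.gz /path/to/mongod.log <dbpath>/diagnostic.data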

Comment by Zijun Tian [ 24/Nov/21 ]

TSMongo:PRIMARY> rs.status()
{
    "set" : "TSMongo",
    "date" : ISODate("2021-11-23T21:32:50.146Z"),
    "myState" : 1,
    "term" : NumberLong(321),
    "syncingTo" : "",
    "syncSourceHost" : "",
    "syncSourceId" : -1,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "majorityVoteCount" : 3,
    "writeMajorityCount" : 3,
    "optimes" : {
        "lastCommittedOpTime" : { "ts" : Timestamp(1637702498, 79), "t" : NumberLong(321) },
        "lastCommittedWallTime" : ISODate("2021-11-23T21:21:38.754Z"),
        "readConcernMajorityOpTime" : { "ts" : Timestamp(1637702498, 79), "t" : NumberLong(321) },
        "readConcernMajorityWallTime" : ISODate("2021-11-23T21:21:38.754Z"),
        "appliedOpTime" : { "ts" : Timestamp(1637702510, 3), "t" : NumberLong(321) },
        "durableOpTime" : { "ts" : Timestamp(1637702510, 3), "t" : NumberLong(321) },
        "lastAppliedWallTime" : ISODate("2021-11-23T21:21:50.186Z"),
        "lastDurableWallTime" : ISODate("2021-11-23T21:21:50.186Z")
    },
    "lastStableRecoveryTimestamp" : Timestamp(1637702498, 79),
    "lastStableCheckpointTimestamp" : Timestamp(1637702498, 79),
    "electionCandidateMetrics" : {
        "lastElectionReason" : "priorityTakeover",
        "lastElectionDate" : ISODate("2021-10-14T22:12:06.781Z"),
        "electionTerm" : NumberLong(321),
        "lastCommittedOpTimeAtElection" : { "ts" : Timestamp(1634249526, 43), "t" : NumberLong(320) },
        "lastSeenOpTimeAtElection" : { "ts" : Timestamp(1634249526, 43), "t" : NumberLong(320) },
        "numVotesNeeded" : 3,
        "priorityAtElection" : 14,
        "electionTimeoutMillis" : NumberLong(10000),
        "priorPrimaryMemberId" : 10,
        "numCatchUpOps" : NumberLong(0),
        "newTermStartDate" : ISODate("2021-10-14T22:12:07.346Z"),
        "wMajorityWriteAvailabilityDate" : ISODate("2021-10-14T22:12:09.733Z")
    },
    "electionParticipantMetrics" : {
        "votedForCandidate" : false,
        "electionTerm" : NumberLong(320),
        "lastVoteDate" : ISODate("2021-10-14T22:11:43.722Z"),
        "electionCandidateMemberId" : 10,
        "voteReason" : "candidate's data is staler than mine. candidate's last applied OpTime: { ts: Timestamp(1634249493, 27), t: 319 }, my last applied OpTime: { ts: Timestamp(1634249503, 26), t: 319 }",
        "lastAppliedOpTimeAtElection" : { "ts" : Timestamp(1634249503, 26), "t" : NumberLong(319) },
        "maxAppliedOpTimeInSet" : { "ts" : Timestamp(1634249503, 26), "t" : NumberLong(319) },
        "priorityAtElection" : 14
    },
    "members" : [
        {
            "_id" : 6,
            "name" : "runner6.tuc.tusimple.ai:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 419865,
            "optime" : { "ts" : Timestamp(1637702498, 79), "t" : NumberLong(321) },
            "optimeDurable" : { "ts" : Timestamp(1637702498, 79), "t" : NumberLong(321) },
            "optimeDate" : ISODate("2021-11-23T21:21:38Z"),
            "optimeDurableDate" : ISODate("2021-11-23T21:21:38Z"),
            "lastHeartbeat" : ISODate("2021-11-23T21:32:48.158Z"),
            "lastHeartbeatRecv" : ISODate("2021-11-23T21:32:49.326Z"),
            "pingMs" : NumberLong(60),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "172.30.76.235:27017",
            "syncSourceHost" : "172.30.76.235:27017",
            "syncSourceId" : 10,
            "infoMessage" : "",
            "configVersion" : 40
        },
        {
            "_id" : 8,
            "name" : "172.30.73.20:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 11319935,
            "optime" : { "ts" : Timestamp(1637702510, 3), "t" : NumberLong(321) },
            "optimeDate" : ISODate("2021-11-23T21:21:50Z"),
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "electionTime" : Timestamp(1634249526, 44),
            "electionDate" : ISODate("2021-10-14T22:12:06Z"),
            "configVersion" : 40,
            "self" : true,
            "lastHeartbeatMessage" : ""
        },
        {
            "_id" : 9,
            "name" : "172.30.74.112:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 1614789,
            "optime" : { "ts" : Timestamp(1637702498, 79), "t" : NumberLong(321) },
            "optimeDurable" : { "ts" : Timestamp(1637702498, 79), "t" : NumberLong(321) },
            "optimeDate" : ISODate("2021-11-23T21:21:38Z"),
            "optimeDurableDate" : ISODate("2021-11-23T21:21:38Z"),
            "lastHeartbeat" : ISODate("2021-11-23T21:32:50.088Z"),
            "lastHeartbeatRecv" : ISODate("2021-11-23T21:32:50.033Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "172.30.76.235:27017",
            "syncSourceHost" : "172.30.76.235:27017",
            "syncSourceId" : 10,
            "infoMessage" : "",
            "configVersion" : 40
        },
        {
            "_id" : 10,
            "name" : "172.30.76.235:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 11319922,
            "optime" : { "ts" : Timestamp(1637702498, 79), "t" : NumberLong(321) },
            "optimeDurable" : { "ts" : Timestamp(1637702498, 79), "t" : NumberLong(321) },
            "optimeDate" : ISODate("2021-11-23T21:21:38Z"),
            "optimeDurableDate" : ISODate("2021-11-23T21:21:38Z"),
            "lastHeartbeat" : ISODate("2021-11-23T21:32:48.447Z"),
            "lastHeartbeatRecv" : ISODate("2021-11-23T21:32:48.405Z"),
            "pingMs" : NumberLong(0),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "electionTime" : Timestamp(1637702509, 2),
            "electionDate" : ISODate("2021-11-23T21:21:49Z"),
            "configVersion" : 40
        },
        {
            "_id" : 11,
            "name" : "runner39.sm.tusimple.ai:27017",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 938547,
            "optime" : { "ts" : Timestamp(1637702498, 79), "t" : NumberLong(321) },
            "optimeDurable" : { "ts" : Timestamp(1637702498, 79), "t" : NumberLong(321) },
            "optimeDate" : ISODate("2021-11-23T21:21:38Z"),
            "optimeDurableDate" : ISODate("2021-11-23T21:21:38Z"),
            "lastHeartbeat" : ISODate("2021-11-23T21:32:50.146Z"),
            "lastHeartbeatRecv" : ISODate("2021-11-23T21:32:49.842Z"),
            "pingMs" : NumberLong(50),
            "lastHeartbeatMessage" : "",
            "syncingTo" : "runner6.tuc.tusimple.ai:27017",
            "syncSourceHost" : "runner6.tuc.tusimple.ai:27017",
            "syncSourceId" : 6,
            "infoMessage" : "",
            "configVersion" : 40
        }
    ],
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1637702510, 3),
        "signature" : {
            "hash" : BinData(0,"SNiJ0h81Pke0Q0fobM3KxmpXbR8="),
            "keyId" : NumberLong("7019048177479188481")
        }
    },
    "operationTime" : Timestamp(1637702510, 3)
}
