[SERVER-71509] mongodb 4.4 STARTUP2 state still participate in write majorities.--this is bug? Created: 21/Nov/22  Updated: 02/Dec/22  Resolved: 01/Dec/22

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 4.4.18
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: jing xu Assignee: Yuan Fang
Resolution: Done Votes: 0
Labels: Bug
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: ALL
Participants:

 Description   

hello:
from doc:
https://www.mongodb.com/docs/v4.4/reference/write-concern/#causally-consistent-sessions-and-write-concerns
w: "majority" Behavior
Starting in MongoDB 4.4, replica set members in the STARTUP2 state do not participate in write majorities.

but i test,i find that STARTUP2 state still participate in write majorities.
i'm psa replication.when i add new node to it. the new node state is startup2.
but replication writableVotingMembersCount from 2 to 3.
so i insert data with writeConcern that is hang or timeout.

the follwoing is my test:
shard1:PRIMARY> rs.status();

{
        "set" : "shard1",
        "date" : ISODate("2022-11-21T05:02:51.701Z"),
        "myState" : 1,
        "term" : NumberLong(2),
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "heartbeatIntervalMillis" : NumberLong(2000),
        "majorityVoteCount" : 2,
        "writeMajorityCount" : 2,
        "votingMembersCount" : 3,
        "writableVotingMembersCount" : 2,
 
 
        "members" : [
                {
                        "_id" : 0,
                        "name" : "100.130.10.149:41001",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 200,
                        "optime" : {
                                "ts" : Timestamp(1669006964, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(1669006964, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDate" : ISODate("2022-11-21T05:02:44Z"),
                        "optimeDurableDate" : ISODate("2022-11-21T05:02:44Z"),
                        "lastHeartbeat" : ISODate("2022-11-21T05:02:50.799Z"),
                        "lastHeartbeatRecv" : ISODate("2022-11-21T05:02:49.782Z"),
                        "pingMs" : NumberLong(0),
                        "lastHeartbeatMessage" : "",
                        "syncSourceHost" : "100.130.9.150:41001",
                        "syncSourceId" : 1,
                        "infoMessage" : "",
                        "configVersion" : 1,
                        "configTerm" : 2
                },
                {
                        "_id" : 1,
                        "name" : "100.130.9.150:41001",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 5797,
                        "optime" : {
                                "ts" : Timestamp(1669006964, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDate" : ISODate("2022-11-21T05:02:44Z"),
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "",
                        "electionTime" : Timestamp(1669005454, 1),
                        "electionDate" : ISODate("2022-11-21T04:37:34Z"),
                        "configVersion" : 1,
                        "configTerm" : 2,
                        "self" : true,
                        "lastHeartbeatMessage" : ""
                },
                {
                        "_id" : 2,
                        "name" : "100.130.10.150:41001",
                        "health" : 1,
                        "state" : 7,
                        "stateStr" : "ARBITER",
                        "uptime" : 5399,
                        "lastHeartbeat" : ISODate("2022-11-21T05:02:50.151Z"),
                        "lastHeartbeatRecv" : ISODate("2022-11-21T05:02:50.159Z"),
                        "pingMs" : NumberLong(0),
                        "lastHeartbeatMessage" : "",
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "",
                        "configVersion" : 1,
                        "configTerm" : 2
                }
        ],
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1669006964, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        },
        "operationTime" : Timestamp(1669006964, 1)
}
 
 
shard1:PRIMARY> rs.printSlaveReplicationInfo()
source: 100.130.10.149:41001
        syncedTo: Mon Nov 21 2022 13:03:54 GMT+0800 (CST)
        0 secs (0 hrs) behind the primary
 
shard1:PRIMARY> db.POCCOLL.insert({_id:1,name:"testWriteConcern"},{writeConcern:{w:"majority",wtimeout:5000}})
WriteResult({ "nInserted" : 1 })
 
 
shard1:PRIMARY> rs.add("100.130.9.149:41001")
{
        "ok" : 1,
        "$clusterTime" : {
                "clusterTime" : Timestamp(1669007243, 1),
                "signature" : {
                        "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
                        "keyId" : NumberLong(0)
                }
        },
        "operationTime" : Timestamp(1669007243, 1)
}
shard1:PRIMARY> rs.printSlaveReplicationInfo()
source: 100.130.10.149:41001
        syncedTo: Mon Nov 21 2022 13:07:23 GMT+0800 (CST)
        0 secs (0 hrs) behind the primary
source: 100.130.9.149:41001
        syncedTo: Thu Jan 01 1970 08:00:00 GMT+0800 (CST)
        1669007243 secs (463613.12 hrs) behind the primary
 
 
shard1:PRIMARY> rs.status();
{
        "set" : "shard1",
        "date" : ISODate("2022-11-21T05:07:36.186Z"),
        "myState" : 1,
        "term" : NumberLong(2),
        "syncSourceHost" : "",
        "syncSourceId" : -1,
        "heartbeatIntervalMillis" : NumberLong(2000),
        "majorityVoteCount" : 3,
        "writeMajorityCount" : 3,
        "votingMembersCount" : 4,
        "writableVotingMembersCount" : 3,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "100.130.10.149:41001",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 485,
                        "optime" : {
                                "ts" : Timestamp(1669007254, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(1669007254, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDate" : ISODate("2022-11-21T05:07:34Z"),
                        "optimeDurableDate" : ISODate("2022-11-21T05:07:34Z"),
                        "lastHeartbeat" : ISODate("2022-11-21T05:07:35.802Z"),
                        "lastHeartbeatRecv" : ISODate("2022-11-21T05:07:35.807Z"),
                        "pingMs" : NumberLong(0),
                        "lastHeartbeatMessage" : "",
                        "syncSourceHost" : "100.130.9.150:41001",
                        "syncSourceId" : 1,
                        "infoMessage" : "",
                        "configVersion" : 2,
                        "configTerm" : 2
                },
                {
                        "_id" : 1,
                        "name" : "100.130.9.150:41001",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 6082,
                        "optime" : {
                                "ts" : Timestamp(1669007254, 1),
                                "t" : NumberLong(2)
                        },
                        "optimeDate" : ISODate("2022-11-21T05:07:34Z"),
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "",
                        "electionTime" : Timestamp(1669005454, 1),
                        "electionDate" : ISODate("2022-11-21T04:37:34Z"),
                        "configVersion" : 2,
                        "configTerm" : 2,
                        "self" : true,
                        "lastHeartbeatMessage" : ""
                },
                {
                        "_id" : 2,
                        "name" : "100.130.10.150:41001",
                        "health" : 1,
                        "state" : 7,
                        "stateStr" : "ARBITER",
                        "uptime" : 5684,
                        "lastHeartbeat" : ISODate("2022-11-21T05:07:35.802Z"),
                        "lastHeartbeatRecv" : ISODate("2022-11-21T05:07:35.805Z"),
                        "pingMs" : NumberLong(0),
                        "lastHeartbeatMessage" : "",
                        "syncSourceHost" : "",
                        "syncSourceId" : -1,
                        "infoMessage" : "",
                        "configVersion" : 2,
                        "configTerm" : 2
                },
                {
                        "_id" : 3,
                        "name" : "100.130.9.149:41001",
                        "health" : 1,
                        "state" : 5,
                        "stateStr" : "STARTUP2",
                        "uptime" : 12,
                        "optime" : {
                                "ts" : Timestamp(0, 0),
                                "t" : NumberLong(-1)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(0, 0),
                                "t" : NumberLong(-1)
                        },
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "optimeDurableDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2022-11-21T05:07:35.813Z"),
                        "lastHeartbeatRecv" : ISODate("2022-11-21T05:07:35.327Z"),
                        "pingMs" : NumberLong(0),
                        "lastHeartbeatMessage" : "",
                        "syncSourceHost" : "100.130.9.150:41001",
                        "syncSourceId" : 1,
                        "infoMessage" : "",
                        "configVersion" : 2,
                        "configTerm" : 2
                }
        ],
}

so now i insert data again ,startup2 state participate in write majorities.
so it is bug for it or other reason?

shard1:PRIMARY> db.POCCOLL.insert({_id:3,name:"testWriteConcern"},{writeConcern:{w:"majority",wtimeout:5000}})
WriteResult({
        "nInserted" : 1,
        "writeConcernError" : {
                "code" : 64,
                "codeName" : "WriteConcernFailed",
                "errmsg" : "waiting for replication timed out",
                "errInfo" : {
                        "wtimeout" : true,
                        "writeConcern" : {
                                "w" : "majority",
                                "wtimeout" : 5000,
                                "provenance" : "clientSupplied"
                        }
                }
        }
})



 Comments   
Comment by jing xu [ 02/Dec/22 ]

hi Yuan Fang:
don't close it.i think it is wrong from doc,so that is it a bug or wrong doc?

Comment by Yuan Fang [ 01/Dec/22 ]

Hi 601290552@qq.com,

Thank you for your report, it is correct that "Starting in MongoDB 4.4, replica set members in the STARTUP2 state do not participate in write majorities". With regards to your concern on:

but replication writableVotingMembersCount from 2 to 3

writableVotingMembersCount represents the number of data-bearing members configured with votes: 1. If I understand correctly, the count is not impacted by the state (e.g. STARTUP2) of members.
As for the insert failure, the "waiting for replication timed out" error message indicates that the update is waiting longer than the timeout limit (wtimeout) for the data to replicate to other nodes in the replica set.
To approach this issue, I would encourage you to start by asking our community for help by posting on the MongoDB Developer Community Forums.

If the discussion there leads you to suspect a bug in the MongoDB server, then we'd want to investigate it as a possible bug here in the SERVER project.

Regards,
Yuan

Generated at Thu Feb 08 06:19:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.