[SERVER-31759] Sync Issue on sharded cluster Created: 28/Oct/17  Updated: 07/Jan/18  Resolved: 08/Dec/17

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: 3.4.10
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Kalpesh Chheda [X] Assignee: Mark Agarunov
Resolution: Incomplete Votes: 0
Labels: Sync
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Windows Server 2016


Attachments: Text File MongodbPerServerConfig.txt     PNG File image.png    
Participants:

 Description   

I have 3 servers, 3 replica sets, and 3 shards. I manually deleted a database and it was recreated by the server application. Now, if I access the data on 2 of the servers it is in sync, but 1 server is working on its own and is not in sync. Can someone please help me?

Before deleting the database everything was working fine and the databases on all 3 servers were in sync. The output of rs.status() and sh.status() looks OK to me; it would be appreciated if someone could check it.
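
(For reference, a minimal 3.4-shell sketch for comparing each member's applied optime and each secondary's lag; these are standard shell helpers and nothing below is specific to this cluster.)

r3:SECONDARY> rs.printSlaveReplicationInfo()   // prints each secondary's lag behind the primary
r3:SECONDARY> rs.status().members.forEach(function (m) { print(m.name + " [" + m.stateStr + "] optime: " + m.optime.ts); })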

r3:SECONDARY> rs.status()
{
        "set" : "r3",
        "date" : ISODate("2017-10-25T07:23:29.279Z"),
        "myState" : 2,
        "term" : NumberLong(9),
        "syncingTo" : "10.0.0.6:30013",
        "heartbeatIntervalMillis" : NumberLong(2000),
        "optimes" : {
                "lastCommittedOpTime" : {
                        "ts" : Timestamp(1508916202, 1),
                        "t" : NumberLong(9)
                },
                "appliedOpTime" : {
                        "ts" : Timestamp(1508916202, 1),
                        "t" : NumberLong(9)
                },
                "durableOpTime" : {
                        "ts" : Timestamp(1508916202, 1),
                        "t" : NumberLong(9)
                }
        },
        "members" : [
                {
                        "_id" : 0,
                        "name" : "10.0.0.6:30013",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 925313,
                        "optime" : {
                                "ts" : Timestamp(1508916202, 1),
                                "t" : NumberLong(9)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(1508916202, 1),
                                "t" : NumberLong(9)
                        },
                        "optimeDate" : ISODate("2017-10-25T07:23:22Z"),
                        "optimeDurableDate" : ISODate("2017-10-25T07:23:22Z"),
                        "lastHeartbeat" : ISODate("2017-10-25T07:23:28.588Z"),
                        "lastHeartbeatRecv" : ISODate("2017-10-25T07:23:28.024Z"),
                        "pingMs" : NumberLong(0),
                        "electionTime" : Timestamp(1507990921, 1),
                        "electionDate" : ISODate("2017-10-14T14:22:01Z"),
                        "configVersion" : 5
                },
                {
                        "_id" : 1,
                        "name" : "10.0.0.4:30013",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 925340,
                        "optime" : {
                                "ts" : Timestamp(1508916202, 1),
                                "t" : NumberLong(9)
                        },
                        "optimeDate" : ISODate("2017-10-25T07:23:22Z"),
                        "syncingTo" : "10.0.0.6:30013",
                        "configVersion" : 5,
                        "self" : true
                },
                {
                        "_id" : 2,
                        "name" : "10.0.0.5:30013",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 925313,
                        "optime" : {
                                "ts" : Timestamp(1508916202, 1),
                                "t" : NumberLong(9)
                        },
                        "optimeDurable" : {
                                "ts" : Timestamp(1508916202, 1),
                                "t" : NumberLong(9)
                        },
                        "optimeDate" : ISODate("2017-10-25T07:23:22Z"),
                        "optimeDurableDate" : ISODate("2017-10-25T07:23:22Z"),
                        "lastHeartbeat" : ISODate("2017-10-25T07:23:28.600Z"),
                        "lastHeartbeatRecv" : ISODate("2017-10-25T07:23:28.558Z"),
                        "pingMs" : NumberLong(0),
                        "syncingTo" : "10.0.0.4:30013",
                        "configVersion" : 5
                }
        ],
        "ok" : 1
}

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "minCompatibleVersion" : 5,
        "currentVersion" : 6,
        "clusterId" : ObjectId("598df72eee032936603c5e81")
}
  shards:
        {  "_id" : "r1",  "host" : "r1/10.0.0.4:30011,10.0.0.5:30011,10.0.0.6:30011",  "state" : 1 }
        {  "_id" : "r2",  "host" : "r2/10.0.0.4:30012,10.0.0.5:30012,10.0.0.6:30012",  "state" : 1 }
        {  "_id" : "r3",  "host" : "r3/10.0.0.4:30013,10.0.0.5:30013,10.0.0.6:30013",  "state" : 1 }
  active mongoses:
        "3.4.7" : 3
 autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
                Balancer lock taken at Sat Oct 14 2017 14:21:46 GMT+0000 (Coordinated Universal Time) by ConfigServer:Balancer
        Failed balancer rounds in last 5 attempts:  3
        Last reported error:  could not find host matching read preference { mode: "primary" } for set r1
        Time of Reported error:  Sat Aug 12 2017 06:47:04 GMT+0000 (Coordinated Universal Time)
        Migration Results for the last 24 hours:
                No recent migrations
  databases:
        {  "_id" : "TestAiCraftDatabase3",  "primary" : "r2",  "partitioned" : false }
        {  "_id" : "XyZDB",  "primary" : "r2",  "partitioned" : false }



 Comments   
Comment by Mark Agarunov [ 08/Dec/17 ]

Hello KalpeshChheda,

We haven’t heard back from you for some time, so I’m going to mark this ticket as resolved. If this is still an issue for you, please provide additional information and we will reopen the ticket.

Thanks,
Mark

Comment by Mark Agarunov [ 01/Dec/17 ]

Hello KalpeshChheda,

We still need additional information to diagnose the problem. If this is still an issue for you, would you please provide the complete log files from all affected mongod and mongos nodes?

Thanks,
Mark

Comment by Mark Agarunov [ 06/Nov/17 ]

Hello KalpeshChheda,

Thank you for the update. The location of the log files should be specified in the configuration file for mongod and mongos. I've created a secure upload portal so that you can send us these files privately.

Thanks,
Mark
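
(For reference, a minimal sketch of the systemLog section of a mongod/mongos YAML configuration file; the Windows path below is a made-up example and is not taken from the attached MongodbPerServerConfig.txt.)

systemLog:
    destination: file
    path: c:\data\logs\mongod-30013.log    # hypothetical path; use the value from your own config file
    logAppend: true

The file named by path (or by the --logpath command-line option, if the node was started that way) is the log to collect for each mongod and mongos.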

Comment by Kalpesh Chheda [X] [ 02/Nov/17 ]

Hi Mark,

For now I have cleared everything and recreated the cluster, and it is working fine again. I saw that there are open bugs related to my issue, so I have decided not to delete databases from now on and to keep working. The setup is still in testing, though, so I can recreate the situation. If you need any other logs, please tell me and I can capture logs from before and after as well.
It would also be nice if you could tell me how to collect the logs. I am using Windows Server 2016.

Thanks,
Kalpesh
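
(As an aside, one way to see where a running mongod or mongos is writing its log, assuming the shell can connect to that node, is the getCmdLineOpts admin command; if no log path was configured the field is simply absent and the process logs to stdout.)

mongos> db.adminCommand({ getCmdLineOpts: 1 }).parsed.systemLog

The path field in the returned document, when present, is the log file that node is writing to.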

Comment by Mark Agarunov [ 30/Oct/17 ]

Hello KalpeshChheda,

Thank you for the report. To get a better idea of what may be causing this behavior, could you please provide the complete log files from all affected mongod and mongos nodes? This should provide some insight into why this node is not properly replicating.

Thanks,
Mark
