[SERVER-3126] replset - common point problem Created: 20/May/11  Updated: 02/Aug/18  Resolved: 28/Oct/11

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 1.8.1
Fix Version/s: None

Type: Task Priority: Critical - P2
Reporter: MartinS Assignee: Gregory McKeon (Inactive)
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

as in SERVER-3125


Participants:

 Description   

after changing oplog size in in config and restarting mongodb (in order: node no 1 - secondary, node no 2 - secondary, node no 3 - primary)

Set name: testreplset
Majority up: yes
Member id Up cctime Last heartbeat Votes Priority State Messages optime skew
172.17.0.251:27017 0 1 3.3 mins 1 sec ago 1 1 RECOVERING error RS102 too stale to catch up 4dd6539b:81
172.17.0.252:27017 1 1 3.3 mins 1 sec ago 1 1 PRIMARY 4dd6546d:7c
172.17.0.253:27017 (me) 2 1 3.4 mins 1 1 ROLLBACK rollback 2 error findcommonpoint waiting a while before trying again 4dd65853:8

Recent replset log activity:

Fri May 20 14:15:09 [startReplSets] couldn't connect to localhost:27017: couldn't connect to server localhost:27017
14:15:09 [startReplSets] replSet STARTUP2
14:15:09 [replica set sync] replSet SECONDARY
14:15:09 [rs Manager] replSet can't see a majority, will not try to elect self
14:15:09 [ReplSetHealthPollTask] replSet info 172.17.0.252:27017 is up
14:15:09 [ReplSetHealthPollTask] replSet member 172.17.0.252:27017 PRIMARY
14:15:11 [ReplSetHealthPollTask] replSet info 172.17.0.251:27017 is up
14:15:11 [ReplSetHealthPollTask] replSet member 172.17.0.251:27017 RECOVERING
14:15:12 [replica set sync] replSet we are ahead of the primary, will try to roll back
14:15:12 [replica set sync] replSet rollback 0
14:15:12 [replica set sync] replSet ROLLBACK
14:15:12 [replica set sync] replSet rollback 1
14:15:12 [replica set sync] replSet rollback 2 FindCommonPoint
14:15:12 [replica set sync] replSet info rollback our last optime: May 20 14:02:27:8
14:15:12 [replica set sync] replSet info rollback their last optime: May 20 13:45:49:7c
14:15:12 [replica set sync] replSet info rollback diff in end of log times: 998 seconds
14:15:12 [replica set sync] replSet rollback error RS101 reached beginning of local oplog
14:15:12 [replica set sync] replSet them: 172.17.0.252:27017 scanned: 58620
14:15:12 [replica set sync] replSet theirTime: May 20 13:45:49 4dd6546d:7c
14:15:12 [replica set sync] replSet ourTime: May 20 13:57:33 4dd6572d:1dd
14:15:12 [replica set sync] replSet rollback 2 error RS101 reached beginning of local oplog [2]
14:15:25 [replica set sync] replSet we are ahead of the primary, will try to roll back
14:15:25 [replica set sync] replSet rollback 0
14:15:25 [replica set sync] replSet rollback 1
14:15:25 [replica set sync] replSet rollback 2 FindCommonPoint
14:15:25 [replica set sync] replSet rollback 2 error findcommonpoint waiting a while before trying again
14:15:39 .....
14:15:52 .....
14:16:05 .....
14:16:18 [replica set sync] replSet we are ahead of the primary, will try to roll back
14:16:18 [replica set sync] replSet rollback 0
14:16:18 [replica set sync] replSet rollback 1
14:16:18 [replica set sync] replSet rollback 2 FindCommonPoint
14:16:18 [replica set sync] replSet info rollback our last optime: May 20 14:02:27:8
14:16:18 [replica set sync] replSet info rollback their last optime: May 20 13:45:49:7c
14:16:18 [replica set sync] replSet info rollback diff in end of log times: 998 seconds
14:16:18 [replica set sync] replSet rollback error RS101 reached beginning of local oplog
14:16:18 [replica set sync] replSet them: 172.17.0.252:27017 scanned: 58620
14:16:18 [replica set sync] replSet theirTime: May 20 13:45:49 4dd6546d:7c
14:16:18 [replica set sync] replSet ourTime: May 20 13:57:33 4dd6572d:1dd
14:16:18 [replica set sync] replSet rollback 2 error RS101 reached beginning of local oplog [2]
14:16:32 [replica set sync] replSet we are ahead of the primary, will try to roll back
14:16:32 [replica set sync] replSet rollback 0
14:16:32 [replica set sync] replSet rollback 1
14:16:32 [replica set sync] replSet rollback 2 FindCommonPoint
14:16:32 [replica set sync] replSet rollback 2 error findcommonpoint waiting a while before trying again
14:16:45 .....
14:16:58 .....
14:17:11 .....
14:17:24 [replica set sync] replSet we are ahead of the primary, will try to roll back
14:17:24 [replica set sync] replSet rollback 0
14:17:24 [replica set sync] replSet rollback 1
14:17:24 [replica set sync] replSet rollback 2 FindCommonPoint
14:17:24 [replica set sync] replSet info rollback our last optime: May 20 14:02:27:8
14:17:24 [replica set sync] replSet info rollback their last optime: May 20 13:45:49:7c
14:17:24 [replica set sync] replSet info rollback diff in end of log times: 998 seconds
14:17:25 [replica set sync] replSet rollback error RS101 reached beginning of local oplog
14:17:25 [replica set sync] replSet them: 172.17.0.252:27017 scanned: 58620
14:17:25 [replica set sync] replSet theirTime: May 20 13:45:49 4dd6546d:7c
14:17:25 [replica set sync] replSet ourTime: May 20 13:57:33 4dd6572d:1dd
14:17:25 [replica set sync] replSet rollback 2 error RS101 reached beginning of local oplog [2]
14:17:38 [replica set sync] replSet we are ahead of the primary, will try to roll back
14:17:38 [replica set sync] replSet rollback 0
14:17:38 [replica set sync] replSet rollback 1
14:17:38 [replica set sync] replSet rollback 2 FindCommonPoint
14:17:38 [replica set sync] replSet rollback 2 error findcommonpoint waiting a while before trying again
14:17:51 .....
14:18:04 .....



 Comments   
Comment by Kristina Chodorow (Inactive) [ 28/Oct/11 ]

You can prevent rollback with w:"majority" in v2.0.

Comment by MartinS [ 26/May/11 ]

I don't wont to recover my current data cause it's only test before drive mongo into production.

But I want to not have to use rollback, I'd like mongo to stop working if 1/2 of nodes are stale (like when hosts are down).

Comment by Kristina Chodorow (Inactive) [ 24/May/11 ]

I'll take a look at the logs. If you always use a write concern of w:<majority of servers>, you should never have rollbacks.

Comment by MartinS [ 24/May/11 ]

logs are in SERVER-3125

One important information: I won't to get data back with rollback. That was only test data.

And about primary host election with secondaries in incoplete state:
This should be configurable. I won't use MongoDB to get rollback from time to time. I would like to may configure my replicaset stops when 1/2 of RS nodes is not up to date with rest.

And one configuration directive could be very useful: forcing to keep oplog if changes were not commited on all nodes.

Comment by Kristina Chodorow (Inactive) [ 20/May/11 ]

You cannot, at the moment, change oplog size by just changing the config option. You have to drop the oplog and recreate it with a new size. The oplog size matters a great deal!

> Primary question is: why node that cannot sync became a new primary afrter restart primary host.

If you have logs from all three during the time in question, I can probably interpret them for you.

> And then: Why host with only actual db try to rollback after restart.

Basically, if you have a set with a primary & two secondaries and the primary is ahead, when it goes down, one of the secondaries will be elected primary even if it doesn't have all of the data the primary had. Then this new primary will start taking writes, so when the old primary comes back up, it will have data that it never applied to the secondaries, but it can't apply it automatically because it might conflict with writes that have happened while it was down. So it rolls back its conflicting data. See http://www.snailinaturtleneck.com/blog/2011/01/19/how-to-use-replica-set-rollbacks/ for an example.

Comment by MartinS [ 20/May/11 ]

This is no matter how big this logs are.

or: size no matters :]

Primary question is: why node that cannot sync became a new primary afrter restart primary host.
And then: Why host with only actual db try to rollback after restart.

Comment by Kristina Chodorow (Inactive) [ 20/May/11 ]

Hmm, how big are your oplogs? Can you send the output of db.printReplicationInfo() from the primary and recoving nodes?

Comment by MartinS [ 20/May/11 ]

this time was max 15 minutes, how to force not to delete oplog until all nodes are in the same state?

Comment by Kristina Chodorow (Inactive) [ 20/May/11 ]

This:

14:15:12 [replica set sync] replSet rollback error RS101 reached beginning of local oplog

means you're going to have to resync this server. The servers have been diverged for too long a time to reconcile automatically.

Generated at Thu Feb 08 03:02:09 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.