[SERVER-3155] --fastsync + rs.add() allows for inconsistent data Created: 26/May/11  Updated: 11/Dec/18  Resolved: 13/Jun/11

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 1.8.1
Fix Version/s: 1.9.1

Type: Bug
Priority: Major - P3
Reporter: Gaetan Voyer-Perrault
Assignee: Kristina Chodorow (Inactive)
Resolution: Done
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File rs_fastsync_test.sh    
Issue Links:
Related
related to SERVER-3250 Should rs.initiate() allow more than ... Closed
Operating System: ALL
Participants:

 Description   

The attached script can be run on a single instance for ease of reproduction. The script is complex; the basic premise is below.

Basic Repro
======
(a) Start replica set nodes with --fastsync and slightly different data
(b) Configuring with rs.initiate(cfg) => fails
(c) Configuring with rs.initiate() + rs.add(...) + rs.add(...) => succeeds
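
For readers without the attached script, steps (b) and (c) can be sketched with the shell helpers rs.initiate and rs.add roughly as follows. This is an illustration only, not the contents of rs_fastsync_test.sh: the set name and every host other than localhost:6901 are assumptions, and the two wrapper functions are hypothetical names that take the shell's `rs` object as an argument.

```javascript
// Hypothetical set name and hosts; only localhost:6901 appears in this ticket.
var SET_NAME = "fastsyncTest";
var HOSTS = ["localhost:6900", "localhost:6901", "localhost:6902"];

// Build a replica set config document: { _id, members: [{ _id, host }, ...] }.
function buildConfig(setName, hosts) {
  var members = [];
  for (var i = 0; i < hosts.length; i++) {
    members.push({ _id: i, host: hosts[i] });
  }
  return { _id: setName, members: members };
}

// Step (b): initiate with every member listed in one config.
function initiateAll(rs) {
  return rs.initiate(buildConfig(SET_NAME, HOSTS));
}

// Step (c): initiate with the default config, then add the other hosts one by one.
function initiateThenAdd(rs) {
  var results = [rs.initiate()];
  for (var i = 1; i < HOSTS.length; i++) {
    results.push(rs.add(HOSTS[i]));
  }
  return results;
}
```

In a shell connected to the first node, `initiateAll(rs)` corresponds to step (b) and `initiateThenAdd(rs)` to step (c).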

Problem #1
======
Step (b) fails with the message below.
This does not jibe with the fact that I can add the same members one at a time.
{
"errmsg" : "couldn't initiate : member localhost:6901 has data already, cannot initiate set. All members except initiator must be empty.",
"ok" : 0
}

Also, I have --fastsync on. Why can't I start a replica set with known good data?

Problem #2
======
Step (c) succeeds even though the data differs between the nodes.

Expected resolutions
======
#1: Consistency.
The following two should behave the same:
1. rs.initiate( [a,b,c] )
2. rs.initiate( [a] ), rs.reconfig( [a,b] ), rs.reconfig( [a,b,c] )

Both of these should fail if the data does not match.
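
One way to read this expectation is that both paths should run the same per-member check before accepting a host. The sketch below models that idea in plain JavaScript. It is not MongoDB's implementation: `dataFingerprint`, `checkMember`, `initiateSet`, `addMember` and `initiateIncrementally` are hypothetical names, and how a server would actually compare data files under --fastsync is left open.

```javascript
// Toy model: each member is { host, dataFingerprint }, where dataFingerprint is a
// hypothetical stand-in for "what data this node holds" (for example, a hash).

// Shared check used by both paths: the candidate must match the seed's data.
function checkMember(seed, candidate) {
  if (candidate.dataFingerprint !== seed.dataFingerprint) {
    return {
      ok: 0,
      errmsg: "member " + candidate.host + " has data that differs from " + seed.host
    };
  }
  return { ok: 1 };
}

// Path 1: initiate with all members at once.
function initiateSet(members) {
  var seed = members[0];
  for (var i = 1; i < members.length; i++) {
    var res = checkMember(seed, members[i]);
    if (!res.ok) return res;
  }
  return { ok: 1, members: members.map(function (m) { return m.host; }) };
}

// Path 2, single step: add one candidate to an existing set.
function addMember(set, seed, candidate) {
  var res = checkMember(seed, candidate);
  if (!res.ok) return res;
  return { ok: 1, members: set.members.concat([candidate.host]) };
}

// Path 2, whole sequence: start with one member, then add the rest one at a time.
function initiateIncrementally(members) {
  var set = initiateSet([members[0]]);
  for (var i = 1; i < members.length && set.ok; i++) {
    set = addMember(set, members[0], members[i]);
  }
  return set;
}
```

Because both paths go through `checkMember`, in this model they accept the same member lists and reject a mismatched member with the same error.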

#2: Bring up new sets with existing data.
If [a,b,c] have the same data files, then rs.initiate( [a,b,c] ) should work.



 Comments   
Comment by Kristina Chodorow (Inactive) [ 13/Jun/11 ]

I've made a new ticket for the initiate() vs. add() behavior discrepancy.

Comment by Kristina Chodorow (Inactive) [ 27/May/11 ]

I don't think Problem #2 is fixable: the basic premise of fastsync is that you have a known-good set of data that the DB doesn't have to check. If you don't, then you have to resync.
