[SERVER-8139] New replication dep. on minvalid collection causes bad behavior Created: 10/Jan/13 Updated: 11/Jul/16 Resolved: 24/Jan/13 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 2.4.0-rc0 |
| Type: | Improvement | Priority: | Blocker - P1 |
| Reporter: | Scott Hernandez (Inactive) | Assignee: | Eric Milkie |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Participants: |
| Description |
|
The new behavior of using minvalid to determine whether an initial sync has completed, and whether regular replication should start, leads to some significant problems. Cases:
The upgrade case is bad because earlier versions had no need for the minvalid collection, so it was neither maintained nor guaranteed to exist (especially on the primary). The same problem arises if replicas were seeded with a copy of the data files that does not include it. |
| Comments |
| Comment by auto [ 24/Jan/13 ] |
|
Author: Eric Milkie <milkie@10gen.com>, 2013-01-24T18:24:56Z. Message: 'h' is not needed in minValid recorded in the database; it is never |
| Comment by auto [ 24/Jan/13 ] |
|
Author: Eric Milkie <milkie@10gen.com>, 2013-01-24T16:37:32Z. Message: |
| Comment by Scott Hernandez (Inactive) [ 16/Jan/13 ] |
|
Eric, so the only logic change is that if the minvalid collection has the zeroed doc, then the initial sync has not completed and the node must wipe and restart? At startup, if the minvalid collection is missing, or has any non-zero ts/h fields, then replication works normally and does not trigger an initial sync. This sounds reasonable and is similar to the suggestion Kristina and I made to keep a different collection with more state about the initial sync (steps) as an indication of the initial sync state (and completion). I see some advantages to keeping more diagnostic information within that collection (rather than minvalid, which is basically a boolean of initial sync active/done), but both effectively provide the same marker that indicates whether the initial sync has started/is active and whether it is done. |
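The startup rule described in the comment above can be sketched roughly as follows. This is only an illustration in Python, not the server's actual code: the function name `needs_initial_sync` is hypothetical, the minvalid document is modeled as a plain dict with integer `ts` and `h` values, and a document that lacks both fields is assumed to count as zeroed.

```python
from typing import Any, Mapping, Optional


def needs_initial_sync(minvalid_doc: Optional[Mapping[str, Any]]) -> bool:
    """Return True if the node should wipe its data and restart initial sync.

    Hypothetical helper: `minvalid_doc` is the single document read from the
    minvalid collection, or None if the collection is missing or empty.
    """
    if minvalid_doc is None:
        # Missing collection (upgraded node, or node seeded from a file copy):
        # replication proceeds normally, no initial sync is forced.
        return False
    # Any non-zero ts/h value means a previous initial sync completed.
    # Assumption: absent fields are treated the same as zero.
    has_nonzero_field = any(minvalid_doc.get(key) for key in ("ts", "h"))
    # A zeroed document marks an initial sync that started but never finished.
    return not has_nonzero_field
```

Under these assumptions, only the zeroed marker document forces a wipe and resync; a missing collection and a populated document both fall through to normal replication.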
| Comment by Eric Milkie [ 15/Jan/13 ] |
|
Proposal: Then, we can use this value as part of the initial sync criteria. It will only exist if an initial sync was attempted but never completed. |