[SERVER-39607] MongoDB 3.4 should unconditionally use afterOpTime for initial sync oplog fetching Created: 15/Feb/19  Updated: 06/Dec/22  Resolved: 30/May/19

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.4.19
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Tess Avitabile (Inactive) Assignee: Backlog - Replication Team
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
Related
is related to SERVER-30927 Use readConcern afterClusterTime for ... Closed
Assigned Teams:
Replication
Operating System: ALL
Steps To Reproduce:

Run the following test on 3.6:

	(function() {
	    'use strict';
	
	    load('jstests/replsets/rslib.js');
	    const basename = 'initial_sync_visibility';
	
	    jsTestLog('Bring up set');
	    const rst = new ReplSetTest({name: basename, nodes: 1});
	    rst.startSet();
	    rst.initiate();
	
	    const primary = rst.getPrimary();
	    const primaryDB = primary.getDB(basename);
 
	    assert.commandWorked(primary.adminCommand({setFeatureCompatibilityVersion: "3.4"}));
	
	    jsTestLog('Create a collection');
	    assert.writeOK(primaryDB['coll'].save({_id: "visible"}));
	    jsTestLog('Make sure synced');
	    rst.awaitReplication();
	
	    jsTestLog('Activate WT visibility failpoint and write an invisible document');
	    assert.commandWorked(primaryDB.adminCommand(
	        {configureFailPoint: 'WTPausePrimaryOplogDurabilityLoop', mode: 'alwaysOn'}));
	    assert.writeOK(primaryDB['coll'].save({_id: "invisible"}));
	
	    jsTestLog('Bring up a new node');
	    // On a 3.6 build, binVersion 'last-stable' starts the new node with the 3.4 binary.
	    const secondary = rst.add({setParameter: 'numInitialSyncAttempts=3', binVersion: "last-stable"});
	    rst.reInitiate();
	    assert.eq(primary, rst.getPrimary(), 'Primary changed after reconfig');
	
	    jsTestLog('Wait for new node to start cloning');
	    secondary.setSlaveOk();
	    const secondaryDB = secondary.getDB(basename);
	    wait(function() {
	        return secondaryDB.stats().collections >= 1;
	    }, 'never saw new node starting to clone, was waiting for collections in: ' + basename);
	
	    jsTestLog('Disable WT visibility failpoint on primary making all visible.');
	    assert.commandWorked(primaryDB.adminCommand(
	        {configureFailPoint: 'WTPausePrimaryOplogDurabilityLoop', mode: 'off'}));
	
	    jsTestLog('Wait for both nodes to be up-to-date');
	    rst.awaitSecondaryNodes();
	    rst.awaitReplication();
	
	    jsTestLog('Check all OK');
	    rst.checkReplicatedDataHashes();
	    rst.stopSet(15);
	})();

Participants:
Linked BF Score: 15

 Description   

MongoDB 3.4 adds afterOpTime to its initial sync oplog fetching query only if the featureCompatibilityVersion is 3.4. However, this branch is effectively dead code: the featureCompatibilityVersion is still unset when the query is constructed, so it has its default value of 3.2 and afterOpTime is never added. The afterOpTime is essential when initial syncing from a 3.6 node because of the oplog visibility rules on 3.6; without it, the initial sync can fail with OplogStartMissing. Adding afterOpTime unconditionally should not affect behavior when syncing from a 3.4 or 3.2 node, though this should be tested.
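
The problem can be illustrated with a minimal sketch in plain JavaScript. This is a simplified model and not the server's C++ code: the function names, the DEFAULT_FCV constant, the shape of the find command and the numeric ts value are all assumptions made for illustration. It only shows why a check against a featureCompatibilityVersion that is still at its default never adds afterOpTime, and what the unconditional variant proposed in the title would produce.

```javascript
// Simplified model of the behavior described above. All names are hypothetical;
// this is not the actual oplog fetcher implementation.

// Value in effect while featureCompatibilityVersion is still unset (per the description).
const DEFAULT_FCV = "3.2";

// Builds a (heavily simplified) oplog find command for initial sync.
// afterOpTime is attached only when the FCV seen at construction time is 3.4.
function buildInitialSyncOplogQuery(lastFetched, fcv) {
    const cmd = {
        find: "oplog.rs",
        filter: {ts: {$gte: lastFetched.ts}},
    };
    if (fcv === "3.4") {
        cmd.readConcern = {afterOpTime: lastFetched};
    }
    return cmd;
}

// Variant matching the ticket title: always attach afterOpTime.
function buildInitialSyncOplogQueryUnconditional(lastFetched) {
    return buildInitialSyncOplogQuery(lastFetched, "3.4");
}

// ts is a plain number here for simplicity; a real optime uses a Timestamp.
const lastFetched = {ts: 100, t: 1};

// FCV is unset when the query is constructed, so the default applies
// and the readConcern is omitted.
const current = buildInitialSyncOplogQuery(lastFetched, DEFAULT_FCV);

// The unconditional variant always carries the readConcern.
const proposed = buildInitialSyncOplogQueryUnconditional(lastFetched);

console.log("current has readConcern:", "readConcern" in current);
console.log("proposed readConcern:", JSON.stringify(proposed.readConcern));
```

In this model, the query built with the default FCV has no readConcern, which corresponds to the case where the sync source may not yet expose the requested oplog entry and the fetch can fail with OplogStartMissing.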



 Comments   
Comment by Tess Avitabile (Inactive) [ 30/May/19 ]

We will not fix this unless it comes up in the field.
