[SERVER-34853] rollback (or replay during recovery) of emptycapped leads to invariant
Created: 04/May/18  Updated: 06/Dec/22  Resolved: 29/Jan/20

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: None
Fix Version/s: None

Type: Bug
Priority: Major - P3
Reporter: Judah Schvimer
Assignee: Backlog - Replication Team
Resolution: Won't Fix
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to SERVER-41875 Should ban "emptyCapped" commands on... Closed
Assigned Teams:
Replication
Operating System: ALL
Steps To Reproduce:

(function() {
    'use strict';
    load("jstests/replsets/libs/rollback_test.js");

    const dbName = "emptycapped_rollback";
    const collName = "coll";
    const doc1 = {x: 1};
    const doc2 = {x: 2};
    const doc3 = {x: 3};

    const rollbackTest = new RollbackTest();
    let primary = rollbackTest.getPrimary();
    let testDB = primary.getDB(dbName);
    // Raise log verbosity for the storage recovery component.
    primary.setLogLevel(2, "storage.recovery");

    // Create the capped collection and insert one document, both majority-acknowledged.
    assert.commandWorked(testDB.createCollection(collName, {capped: true, size: 512 * 512, writeConcern: {w: "majority"}}));
    assert.commandWorked(testDB[collName].insert(doc1, {writeConcern: {w: "majority"}}));

    // Restart the nodes so that they're not marked to always update sizes and so the first node is reelected.
    TestData.rollbackShutdowns = true;
    rollbackTest.restartNode(0, 15);
    rollbackTest.restartNode(1, 15);
    primary = rollbackTest.getPrimary();
    testDB = primary.getDB(dbName);

    // With snapshotting disabled, run emptycapped followed by an insert.
    assert.commandWorked(primary.adminCommand({configureFailPoint: "disableSnapshotting", mode: "alwaysOn"}));
    assert.commandWorked(testDB.runCommand({emptycapped: collName}));
    assert.commandWorked(testDB[collName].insert(doc2));

    // Walk through the full rollback sequence, then shut the test down.
    rollbackTest.transitionToRollbackOperations();
    rollbackTest.transitionToSyncSourceOperationsBeforeRollback();
    rollbackTest.transitionToSyncSourceOperationsDuringRollback();
    rollbackTest.transitionToSteadyStateOperations();
    rollbackTest.stop();
}());

 


 Description   

emptycapped is a test-only command, so this is low priority, and I do not think any other code path can hit this. If an emptycapped command is replayed during replication recovery on a collection that was not marked for size adjustment, the server hits an invariant.

 

This is both a problem for replication recovery at startup and for recovery after rollback.



 Comments   
Comment by Eric Milkie [ 08/May/18 ]

It's not something that will be going away soon; in fact, we are planning on making a real database "truncate" command that will operate in the same way. I think we can let this ticket sit in the backlog (or resolve it as Won't Fix) until we start work on "truncate".

Comment by Spencer Brody (Inactive) [ 07/May/18 ]

milkie, can you comment on the utility of the emptycapped command for testing? It doesn't seem to be used in many tests, and the tests that do use it seem to be mainly testing its specific behavior. If we just deleted this command and the tests that use it, would we be losing meaningful test coverage for storage's oplog truncation behavior?

Comment by Judah Schvimer [ 04/May/18 ]

Per kyle.suarez's recommendation, the proposed fix is as follows. In RollbackImpl::_processRollbackOp, if we see an emptycapped command, we must

  • clear the diff in _countDiffs for that UUID; and
  • call SizeRecoveryState::markCollectionAsAlwaysNeedsSizeAdjustment() for that UUID

This allows the record store size to be set to 0 when WiredTigerRecordStore::truncate() is called and allows us to avoid the invariant. What makes me slightly worried is that this would be the first caller of a SizeRecoveryState method outside of WiredTigerRecordStore, so we'll need to pay attention to any deadlocks between storage and rollback involving SizeRecoveryState::_mutex.
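
The two steps above can be illustrated with a toy model in plain JavaScript. This is only a sketch under stated assumptions, not the server's C++ code: SizeRecoveryStateModel, RecordStoreModel, and processRollbackOp are hypothetical stand-ins, the oplog entry shape ({op, ui, o}) is simplified, and the insert/delete count adjustments are assumed for illustration. The model's truncate() throws when the collection is not marked, standing in for the invariant described in this ticket.

```javascript
// Toy model only. Every name here is a hypothetical stand-in for the C++ code.

class SizeRecoveryStateModel {
    constructor() {
        this._alwaysAdjust = new Set();
    }
    markCollectionAsAlwaysNeedsSizeAdjustment(uuid) {
        this._alwaysAdjust.add(uuid);
    }
    collectionAlwaysNeedsSizeAdjustment(uuid) {
        return this._alwaysAdjust.has(uuid);
    }
}

class RecordStoreModel {
    constructor(uuid, sizeRecoveryState) {
        this.uuid = uuid;
        this.sizeRecoveryState = sizeRecoveryState;
        this.numRecords = 0;
    }
    truncate() {
        // Stand-in for the invariant: truncating an unmarked collection fails.
        if (!this.sizeRecoveryState.collectionAlwaysNeedsSizeAdjustment(this.uuid)) {
            throw new Error("invariant: collection not marked for size adjustment");
        }
        this.numRecords = 0;
    }
}

// Processes one oplog entry that is being rolled back.
function processRollbackOp(op, countDiffs, sizeRecoveryState) {
    const isEmptyCapped =
        op.op === "c" && Object.prototype.hasOwnProperty.call(op.o, "emptycapped");
    if (isEmptyCapped) {
        // Step 1: clear the accumulated count diff for this UUID.
        countDiffs.delete(op.ui);
        // Step 2: mark the collection so truncate() may reset its size.
        sizeRecoveryState.markCollectionAsAlwaysNeedsSizeAdjustment(op.ui);
        return;
    }
    // Assumed bookkeeping: undoing an insert lowers the count, undoing a delete raises it.
    if (op.op === "i") {
        countDiffs.set(op.ui, (countDiffs.get(op.ui) || 0) - 1);
    } else if (op.op === "d") {
        countDiffs.set(op.ui, (countDiffs.get(op.ui) || 0) + 1);
    }
}

// Usage: entries are processed newest first, so the insert is seen before emptycapped.
const uuid = "coll-uuid";
const sizeState = new SizeRecoveryStateModel();
const store = new RecordStoreModel(uuid, sizeState);
const countDiffs = new Map();
processRollbackOp({op: "i", ui: uuid, o: {x: 2}}, countDiffs, sizeState);
processRollbackOp({op: "c", ui: uuid, o: {emptycapped: "coll"}}, countDiffs, sizeState);
store.truncate();  // succeeds because the collection was marked
```

The model ignores locking entirely, so it says nothing about the SizeRecoveryState::_mutex deadlock concern raised above.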

 

We must also fix this for startup recovery and write a test for both scenarios.
