[SERVER-52833] Capped collections can contain too many documents after replication recovery Created: 12/Nov/20  Updated: 29/Oct/23  Resolved: 23/Feb/21

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 4.9.0, 4.4.5, 4.2.14, 4.0.25

Type: Bug Priority: Major - P3
Reporter: Gregory Noma Assignee: Gregory Noma
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Related
related to SERVER-34977 subtract capped deletes from fastcoun... Closed
related to SERVER-25025 Improve startup time when there are t... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Backport Requested:
v4.4, v4.2, v4.0
Sprint: Execution Team 2021-02-08, Execution Team 2021-02-22, Execution Team 2021-03-08
Participants:
Linked BF Score: 15

 Description   

Starting in SERVER-25025, we mark a collection as always needing size adjustment only if we are in rollback or replication recovery. However, the inReplicationRecovery flag is only set once we reach ReplicationRecoveryImpl::recoverFromOplog, which happens after the aforementioned check, so collections opened during startup recovery are never marked as needing size adjustment.

This causes an issue for capped collections when the number of inserts applied during replication recovery exceeds the collection's maximum number of documents. Because we erroneously skip the count adjustment, these inserts do not trigger capped deletes of documents inserted earlier in replication recovery, so the capped collection contains more documents than it should until collection validation is run to correct the fast count.
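The mechanism can be sketched in plain JavaScript (a simplified model, not server code; makeCappedCollection, insert, fastCount, and adjustSize are illustrative names standing in for the real size-adjustment machinery):

```javascript
// Minimal model of a capped collection whose deletes are driven by a
// tracked fast count rather than by scanning the documents themselves.
function makeCappedCollection(max) {
    return {max: max, fastCount: 0, docs: []};
}

// Capped deletes fire only when the tracked fastCount exceeds max.
// If size adjustment is disabled, the count never grows, so replayed
// inserts never evict older documents.
function insert(coll, doc, {adjustSize}) {
    coll.docs.push(doc);
    if (adjustSize) {
        coll.fastCount++;
    }
    while (coll.fastCount > coll.max) {
        coll.docs.shift();
        coll.fastCount--;
    }
}

const coll = makeCappedCollection(1);
// Replaying two oplog inserts during startup recovery with size
// adjustment erroneously disabled (the bug): both documents survive.
insert(coll, {b: 1}, {adjustSize: false});
insert(coll, {b: 2}, {adjustSize: false});
console.log(coll.docs.length);  // 2, despite max: 1
```

With adjustSize: true on both inserts, the second insert pushes fastCount past max and evicts the first document, leaving exactly one, which is the behavior the fix restores for documents inserted earlier in startup recovery.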

The following jstest reproduces the issue:

/**
 * Reproduces the issue described in SERVER-52833.
 */
(function() {
"use strict";
 
load("jstests/libs/fail_point_util.js");
 
const rst = new ReplSetTest({nodes: 1});
rst.startSet();
rst.initiate();
 
const primary = rst.getPrimary();
const testDB = primary.getDB("test");
const coll = testDB.getCollection(jsTestName());
 
// A capped collection that can hold at most one document.
assert.commandWorked(testDB.createCollection(coll.getName(), {capped: true, size: 100, max: 1}));
 
// Pin the stable timestamp at the first insert so that everything after it
// must be replayed from the oplog during startup recovery.
const ts = assert.commandWorked(testDB.runCommand({insert: coll.getName(), documents: [{a: 1}]}))
               .operationTime;
configureFailPoint(primary, "holdStableTimestampAtSpecificTimestamp", {timestamp: ts});
 
// These two inserts are replayed during replication recovery after the
// restart. With the bug, neither triggers a capped delete, leaving the
// collection with more documents than its max of 1.
assert.commandWorked(coll.insert([{b: 1}, {b: 2}]));
rst.restart(primary);
 
rst.stopSet();
})();



 Comments   
Comment by Githook User [ 07/May/21 ]

Author:

{'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}

Message: SERVER-52833 Perform capped deletes during startup recovery on documents inserted earlier in startup recovery
Branch: v4.0
https://github.com/mongodb/mongo/commit/c363d1da0bad994427214d5086825f163ef21cc8

Comment by Githook User [ 22/Apr/21 ]

Author:

{'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}

Message: SERVER-52833 Perform capped deletes during startup recovery on documents inserted earlier in startup recovery
Branch: v4.2
https://github.com/mongodb/mongo/commit/b67407b986ea715b0b9948c64a90369809ff6da0

Comment by Githook User [ 18/Mar/21 ]

Author:

{'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}

Message: SERVER-52833 Perform capped deletes during startup recovery on documents inserted earlier in startup recovery

(cherry picked from commit 8b9744aea1d1bde164d0b0fcf6d184ec5161013f)
Branch: v4.4
https://github.com/mongodb/mongo/commit/43a023db9a235fc70bacd1d7b4a1485e53c047e7

Comment by Githook User [ 23/Feb/21 ]

Author:

{'name': 'Gregory Noma', 'email': 'gregory.noma@gmail.com', 'username': 'gregorynoma'}

Message: SERVER-52833 Perform capped deletes during startup recovery on documents inserted earlier in startup recovery
Branch: master
https://github.com/mongodb/mongo/commit/8b9744aea1d1bde164d0b0fcf6d184ec5161013f

Generated at Thu Feb 08 05:29:08 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.