SERVER ticket description: Currently the oplog cannot be dropped while running in replset mode, but it can be dropped as a standalone. Until recently, the procedure for resizing the oplog included dropping it while running as a standalone. However, performing this procedure on an uncleanly shut down 4.0 mongod causes committed writes to be lost, because those writes existed only in the oplog and the resize preserves only the final oplog entry (see DOCS-12230 and SERVER-38174 for more details). It would be much better if attempting this procedure on 4.0 did not result in oplog entries being lost, e.g. if dropping the oplog failed.
Completely forbidding oplog drops (even when standalone) would interfere with the use case of restoring a filesystem snapshot as a test standalone. A better alternative is to forbid dropping the oplog only if local.system.replset contains documents. That way, users who are sure they want to drop the oplog can do so by first removing the documents from local.system.replset (which can't be dropped, but can have its contents removed) and then restarting the standalone, while users who are just trying to perform a manual oplog resize will be stopped before any data loss occurs.
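As a rough mongo shell sketch of the escape hatch described above (illustrative only, not part of the ticket's changes; assumes a standalone mongod), a user who genuinely wants to drop the oplog would first clear the replica set config document and restart:

```javascript
// On the standalone, in the mongo shell:
use local
// The collection itself can't be dropped, but its contents can be removed.
db.system.replset.deleteMany({})
// ...restart mongod (still as a standalone), then the drop would be allowed:
db.oplog.rs.drop()
```

Under the proposal, a node that still has documents in local.system.replset would refuse the drop, which is exactly the state a user mid-way through the old manual resize procedure would be in.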
If we choose not to do this, then at the very least we should improve the "standalone-but-replset-config-exists" startup warning to specifically warn against manually resizing the oplog.
The changes made in this ticket prevent the oplog from being dropped on a standalone node when the WiredTiger storage engine is in use (or any other storage engine that supports the replSetResizeOplog command; currently only WiredTiger supports it). Note that dropping the oplog is already forbidden for nodes running as part of a replica set.
In the past, dropping the oplog was a step in the procedure for manually resizing the oplog. However, dropping the oplog has several harmful side effects, so we are steering users toward the replSetResizeOplog command instead.
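For reference, the supported resize path needs no drop at all. A minimal mongo shell sketch, run against a replica set member (the size value here is an arbitrary example):

```javascript
// Resize the oplog in place; size is the new maximum size in megabytes.
db.adminCommand({ replSetResizeOplog: 1, size: 16000 })
```

Because the command resizes the existing collection in place, committed entries are retained up to the new size limit, avoiding the data-loss window that the drop-and-recreate procedure had.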
(For further information, please see Suganthi's comment on the ticket.)
Dropping the oplog can lead to data inconsistencies, because unapplied oplog entries can be lost. In her attempt to reproduce this issue and observe the inconsistencies, Suganthi instead encountered a server crash due to an fassert: after an unclean shutdown on the MMAPv1 storage engine, startup recovery replays entries from the AppliedThroughTimestamp to the top of the oplog. The server checks whether the first timestamp it finds matches the oplog application start point and, if not, crashes (https://github.com/mongodb/mongo/blob/8f4b0b3817fbf48cc0025632802aec37d21946da/src/mongo/db/repl/replication_recovery.cpp#L134-L138).