[SERVER-22030] Abort if oplog is uncapped when starting in repl mode Created: 30/Dec/15  Updated: 08/Feb/17  Resolved: 19/May/16

Status: Closed
Project: Core Server
Component/s: Replication
Affects Version/s: 3.0.7
Fix Version/s: 3.2.13, 3.3.8

Type: Bug Priority: Critical - P2
Reporter: Kevin Pulo Assignee: Eric Milkie
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Related
related to SERVER-19997 Secondary crashes with oplog stream w... Closed
related to SERVER-20912 rs.initiate allows the creation of a ... Closed
related to SERVER-25519 repl::checkForCappedOplog will segfau... Closed
related to SERVER-20858 Invariant failure in OplogStones for ... Closed
Backwards Compatibility: Minor Change
Operating System: ALL
Backport Completed:
Sprint: Repl 15 (06/03/16)
Participants:

 Description   

In 3.0.7 (MMAPv1), oplog entries have been observed to be in non-monotonic order on secondaries where the oplog is not a capped collection.

SERVER-20858 prevents non-capped oplogs from being created, but this does not help if the system already has a non-capped oplog. In versions where SERVER-20858 isn't fixed, this is an easy situation to get into accidentally, e.g. by following the oplog resize procedure but missing the capped: true parameter, or by missing the createCollection step entirely (the next step then creates the collection, uncapped, when the last oplog entry is inserted).
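
As an illustration of the step that is easy to miss, here is a minimal sketch of the "recreate the oplog" part of a resize, written as a mongo shell style JavaScript function. The function name recreateCappedOplog and its parameters are hypothetical; localDb is assumed to be the `local` database handle (e.g. db.getSiblingDB("local")) on a mongod running as a standalone. It is not the documented procedure verbatim.

```javascript
// Sketch only. `localDb` is assumed to be the `local` database handle from
// the mongo shell; the function name and parameters are hypothetical.
function recreateCappedOplog(localDb, sizeBytes) {
  // Save the newest oplog entry so replication can resume from it.
  var last = localDb.getCollection("oplog.rs")
      .find().sort({ $natural: -1 }).limit(1).next();

  localDb.getCollection("oplog.rs").drop();

  // The step that is easy to miss: create the collection explicitly,
  // with capped: true and a size.
  localDb.createCollection("oplog.rs", { capped: true, size: sizeBytes });

  // Without the createCollection call above, this insert would implicitly
  // create an uncapped oplog.rs collection.
  localDb.getCollection("oplog.rs").insert(last);
  return last;
}
```

Skipping the createCollection call, or omitting capped: true from it, leaves the final insert to produce the uncapped oplog this ticket is about.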

SERVER-20912 hints at a "preflight check". This ticket is to request that a startup warning be generated if --replSet is specified and a non-capped oplog is found. It would also be good to have rs.printReplicationInfo() print a warning if the oplog isn't capped. Both warnings should direct the user to resize their oplog.
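
A minimal sketch of the kind of check being requested, assuming the `capped` field reported by collection stats in the mongo shell. The function name and the warning text are hypothetical and are not what the server or rs.printReplicationInfo() actually prints.

```javascript
// Sketch only: not the server's implementation.
// `stats` is assumed to be the document returned by
// db.getSiblingDB("local").oplog.rs.stats() in the mongo shell.
function oplogCappedWarning(stats) {
  // Treat a missing or falsy `capped` field as "not capped".
  if (stats && stats.capped) {
    return null;
  }
  return "WARNING: local.oplog.rs is not a capped collection; " +
         "resize the oplog to recreate it as capped.";
}

// Assumed usage in the mongo shell:
//   var w = oplogCappedWarning(db.getSiblingDB("local").oplog.rs.stats());
//   if (w) { print(w); }
```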



 Comments   
Comment by Githook User [ 08/Feb/17 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-22030 abort if oplog is uncapped when starting in repl mode

(cherry picked from commit 4e7318bcb63eea1c0cbe453bede94d0e908b351c)
Branch: v3.2
https://github.com/mongodb/mongo/commit/05fb02461089988d244c49401d921473530b7a76

Comment by Eric Milkie [ 19/May/16 ]

Note that this only affects the MMAPv1 storage engine, which does not guarantee insertion order in non-capped collections.

Comment by Githook User [ 19/May/16 ]

Author: Eric Milkie (milkie) <milkie@10gen.com>

Message: SERVER-22030 abort if oplog is uncapped when starting in repl mode
Branch: master
https://github.com/mongodb/mongo/commit/4e7318bcb63eea1c0cbe453bede94d0e908b351c

Comment by Eric Milkie [ 17/May/16 ]

Since version 3.0.9, in the 3.0 branch, users no longer have the ability to create an uncapped oplog (even implicitly).

Comment by Kevin Pulo [ 31/Dec/15 ]

I have been able to repro to confirm the impact — an uncapped oplog, at least on 3.0.8 MMAPv1 secondaries, results in oplog entries that are out of order. I wouldn't be surprised if the impact was wider (eg. perhaps also primaries, perhaps also WT, perhaps also 3.2/master), but it doesn't really matter because even this alone is bad enough.

The repro I used was simple:

  • mlaunch a 3.0.8 MMAPv1 3 member replset (1 arbiter), with --smallfiles.
  • Restart the secondary as a standalone on another port, save the last oplog entry, drop the oplog, and insert the saved entry (creating an uncapped oplog). Restart back in the replset.
  • Load up the replset with this script:

    dumpdata

    #!/bin/bash
    size="$1"
    dest="$2"
    mongo --port 27017 --eval 'size='"$size"'; dest="'"$dest"'"; payload = (new Array(size)).join(" "); c = db.getMongo().getCollection(dest); while(1) { for(i=0; i<1000; i++) { c.insert({payload: payload}); } c.drop() }'
    

    And run several instances of it:

    $ ./dumpdata 200 a.a & ./dumpdata 20000 b.b & ./dumpdata 200 c.c & ./dumpdata 200 d.d & ./dumpdata 200 e.e & ./dumpdata 200 f.f &
    

  • After a few minutes, use the following script on the secondary to observe out of order oplog entries:

    checkoplog

    #!/bin/bash
    mongo --port 27018 --eval 'prev = null; db.getSiblingDB("local").oplog.rs.find({}, {ts:1}).hint({$natural:1}).forEach(function (x) { if (prev != null) { if ((x.ts.t < prev.ts.t) || (x.ts.t == prev.ts.t && x.ts.i < prev.ts.i)) { print("PROBLEM: current ts " + tojson(x.ts) + " is <= prev ts " + tojson(prev.ts)); } } prev = x; } )'
    

    $ ./checkoplog
    MongoDB shell version: 3.0.8
    connecting to: 127.0.0.1:27018/test
    PROBLEM: current ts Timestamp(1451530622, 16) is <= prev ts Timestamp(1451530623, 1277)
    PROBLEM: current ts Timestamp(1451530622, 79) is <= prev ts Timestamp(1451530624, 653)
    PROBLEM: current ts Timestamp(1451530622, 334) is <= prev ts Timestamp(1451530624, 26)
    PROBLEM: current ts Timestamp(1451530622, 576) is <= prev ts Timestamp(1451530623, 4848)
    PROBLEM: current ts Timestamp(1451530622, 1042) is <= prev ts Timestamp(1451530623, 3633)
    PROBLEM: current ts Timestamp(1451530623, 498) is <= prev ts Timestamp(1451530623, 1709)
    PROBLEM: current ts Timestamp(1451530623, 2889) is <= prev ts Timestamp(1451530623, 4432)
    PROBLEM: current ts Timestamp(1451530624, 849) is <= prev ts Timestamp(1451530625, 462)
    PROBLEM: current ts Timestamp(1451530624, 4366) is <= prev ts Timestamp(1451530625, 7)
    PROBLEM: current ts Timestamp(1451530626, 873) is <= prev ts Timestamp(1451530626, 2252)
    PROBLEM: current ts Timestamp(1451530627, 3764) is <= prev ts Timestamp(1451530628, 1932)
    PROBLEM: current ts Timestamp(1451530629, 3682) is <= prev ts Timestamp(1451530630, 353)
    PROBLEM: current ts Timestamp(1451530633, 1145) is <= prev ts Timestamp(1451530633, 1691)
    PROBLEM: current ts Timestamp(1451530638, 577) is <= prev ts Timestamp(1451530638, 1891)
    PROBLEM: current ts Timestamp(1451530643, 1403) is <= prev ts Timestamp(1451530643, 2563)
    PROBLEM: current ts Timestamp(1451530650, 2834) is <= prev ts Timestamp(1451530650, 3241)
    ...
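
The comparison inside the checkoplog one-liner can also be written as a small standalone function, which may be easier to read and to test. This is a sketch: the function names are hypothetical, and timestamps are assumed to be plain objects with the same t (seconds) and i (increment) fields that the script above reads from x.ts.

```javascript
// Compare two oplog timestamps given as { t: seconds, i: increment }.
function tsLessThan(a, b) {
  return a.t < b.t || (a.t === b.t && a.i < b.i);
}

// Return every position where an entry's ts is lower than its predecessor's,
// scanning in the order given (natural order in the script above).
function findOutOfOrder(timestamps) {
  var problems = [];
  for (var k = 1; k < timestamps.length; k++) {
    if (tsLessThan(timestamps[k], timestamps[k - 1])) {
      problems.push({ index: k, current: timestamps[k], prev: timestamps[k - 1] });
    }
  }
  return problems;
}
```

Like the one-liner, this flags only strict decreases; two consecutive entries with equal timestamps are not reported.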
    

Comment by Kevin Pulo [ 30/Dec/15 ]

Sure. Since "downgrading" from a replset to a standalone isn't really a thing, there isn't really any possibility of a "vestigial" oplog.

Comment by Eric Milkie [ 30/Dec/15 ]

I think it would be easiest if we simply refuse to start, in repl mode, if the oplog is not capped. Not sure we need a warning for standalone mode, since a user's reason for starting in standalone mode is almost guaranteed to be because mongod is refusing to start in repl mode due to an uncapped oplog.

Comment by Kevin Pulo [ 30/Dec/15 ]

Refusing to start would be fine, as long as it's only when replication (or master/slave) is enabled. Refusing to start irrespective of supplied options is not okay, because it could effectively lock people out of their instance without any way of fixing the problem that's stopping the server from starting (à la SERVER-21378).

In the absence of --master, --replSet or replication.replSetName, I guess a startup warning would be most appropriate. Though I could probably be talked out of that (i.e. no change from the current behaviour).
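
The policy proposed in this comment can be sketched as a small decision function. This is illustrative only and not the server's code: the function name, the opts object and its field names are hypothetical stand-ins for the --master, --replSet and replication.replSetName settings.

```javascript
// Sketch of the proposed policy, not the server's implementation.
// `opts` is a hypothetical object describing the startup options.
function startupActionForOplog(opts, oplogIsCapped) {
  if (oplogIsCapped) {
    return "ok";
  }
  var replEnabled = Boolean(opts.master || opts.replSet || opts.replSetName);
  // Refuse to start only when replication (or master/slave) is enabled,
  // so that a standalone start remains available for repairing the oplog.
  return replEnabled ? "abort" : "warn";
}
```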

Comment by Eric Milkie [ 30/Dec/15 ]

Should it be a warning, or simply refuse to start if the oplog isn't capped?
