Uploaded image for project: 'Core Server'
  1. Core Server
  2. SERVER-31543

rollback uses non-system temporary namespaces for working collections when handling rename conflicts

    • Type: Icon: Bug Bug
    • Resolution: Fixed
    • Priority: Icon: Major - P3 Major - P3
    • 3.6.0-rc2
    • Affects Version/s: 3.6.0-rc0
    • Component/s: Replication
    • Labels:
    • Fully Compatible
    • ALL
    • Repl 2017-11-13

      When rolling back collection rename operations, the rollback algorithm may need to use temporary namespaces to resolve conflicts with existing collections. Currently, we use temporary <db>.system.tmpXXXXX namespaces for this purpose. This can cause issues for collections with validation enabled because the server internally disallows validation on system namespaces. The proposed fix is to use temporary namespaces in the <db>.rollback.tmpXXXXX namespace.

      --------------

      When rolling back a collMod on a collection coll1 that will later be renamed to coll2 can get stuck when coll2 is a temporary collection, which can happen when coll2 is in a drop-pending state.

      Here's the log from the rollback process, which contains the 3 oplog entries that are rolled back.

      d20003| 2017-10-12T21:12:47.280-0400 I ROLLBACK [rsBackgroundSync] Finding the Common Point
      d20003| 2017-10-12T21:12:47.281-0400 I ROLLBACK [rsBackgroundSync] our last optime:   Oct 12 21:12:37:5
      d20003| 2017-10-12T21:12:47.281-0400 I ROLLBACK [rsBackgroundSync] their last optime: Oct 12 21:12:41:2
      d20003| 2017-10-12T21:12:47.281-0400 I ROLLBACK [rsBackgroundSync] diff in end of log times: -4 seconds
      d20003| 2017-10-12T21:12:47.281-0400 D ROLLBACK [rsBackgroundSync] Updating rollback FixUpInfo for local oplog entry: { ts: Timestamp 1507857157000|5, t: 1, h: -7181224202081172679, v: 2, op: "c", ns: "db_0.$cmd", ui: BinData(4, AA1EAED7602F4A26A40D5C8CDDFF522C), o: { renameCollection: "db_0.two", to: "db_0.three", stayTemp: false, dropTarget: false }, wall: new Date(1507857157681) }
      d20003| 2017-10-12T21:12:47.281-0400 D ROLLBACK [rsBackgroundSync] Updating rollback FixUpInfo for local oplog entry: { ts: Timestamp 1507857157000|4, t: 1, h: -444184082177170960, v: 2, op: "c", ns: "db_0.$cmd", ui: BinData(4, AA1EAED7602F4A26A40D5C8CDDFF522C), o: { collMod: "two", validator: { e: { $ne: 100.0 } }, validationLevel: "moderate", validationAction: "warn" }, o2: { collectionOptions_old: { uuid: BinData(4, AA1EAED7602F4A26A40D5C8CDDFF522C) } }, wall: new Date(1507857157680) }
      d20003| 2017-10-12T21:12:47.281-0400 D ROLLBACK [rsBackgroundSync] Updating rollback FixUpInfo for local oplog entry: { ts: Timestamp 1507857157000|3, t: 1, h: 7609240345702257709, v: 2, op: "c", ns: "db_0.$cmd", ui: BinData(4, C739BDEA66BF4A84887A717DAF27A6E5), o: { drop: "three" }, wall: new Date(1507857157679) }
      d20003| 2017-10-12T21:12:47.281-0400 I ROLLBACK [rsBackgroundSync] Rollback common point is { ts: Timestamp 1507857157000|2, t: 1 }
      d20003| 2017-10-12T21:12:47.281-0400 I ROLLBACK [rsBackgroundSync] Starting refetching documents
      d20003| 2017-10-12T21:12:47.281-0400 I ROLLBACK [rsBackgroundSync] Finished refetching documents. Total size of documents refetched: 0
      d20003| 2017-10-12T21:12:47.281-0400 I ROLLBACK [rsBackgroundSync] Checking the RollbackID and updating the MinValid if necessary
      d20003| 2017-10-12T21:12:47.281-0400 I ROLLBACK [rsBackgroundSync] Setting minvalid to { ts: Timestamp 1507857161000|2, t: 2 }
      d20003| 2017-10-12T21:12:47.281-0400 I ROLLBACK [rsBackgroundSync] Dropping collections to roll back create operations
      d20003| 2017-10-12T21:12:47.281-0400 I ROLLBACK [rsBackgroundSync] Rolling back renameCollection commands and collection drop commands.
      d20003| 2017-10-12T21:12:47.281-0400 D ROLLBACK [rsBackgroundSync] Attempted to rename collection from db_0.system.drop.1507857157i3t1.three to db_0.three but db_0.three exists already. Temporarily renaming collection db_0.three out of the way to db_0.system.tmpUUNYo
      d20003| 2017-10-12T21:12:47.310-0400 W ROLLBACK [rsBackgroundSync] Rollback cannot complete at this time (retrying later): InvalidOptions: Document validators not allowed on system collections. appliedThrough= { ts: Timestamp 1507857157000|5, t: 1 } minvalid= { ts: Timestamp 1507857161000|2, t: 2 }
      

      This bug seems to also be the issue underlying SERVER-31478, where if you try to access a drop-pending collection, lookupNSSByUUID returns an empty string, causing the error shown in that ticket's description.

      The repro has two lines commented out (lines 49 and 50). If those lines are added back in, we get the error message from SERVER-31478. Here's stack trace with oplog entries for reference.

      d20003| 2017-10-12T21:02:49.455-0400 I REPL     [rsBackgroundSync] transition to ROLLBACK
      d20003| 2017-10-12T21:02:49.455-0400 I ROLLBACK [rsBackgroundSync] Starting rollback. Sync source: robert-macbook-pro.local:20001
      d20003| 2017-10-12T21:02:49.457-0400 I ROLLBACK [rsBackgroundSync] Finding the Common Point
      d20003| 2017-10-12T21:02:49.457-0400 I ROLLBACK [rsBackgroundSync] our last optime:   Oct 12 21:02:29:7
      d20003| 2017-10-12T21:02:49.457-0400 I ROLLBACK [rsBackgroundSync] their last optime: Oct 12 21:02:33:2
      d20003| 2017-10-12T21:02:49.457-0400 I ROLLBACK [rsBackgroundSync] diff in end of log times: -4 seconds
      d20003| 2017-10-12T21:02:49.457-0400 D ROLLBACK [rsBackgroundSync] Updating rollback FixUpInfo for local oplog entry: { ts: Timestamp 1507856549000|7, t: 1, h: 3861500656895800221, v: 2, op: "c", ns: "db_0.$cmd", ui: BinData(4, 57BC97DB6F264A5A891590A2D2E3EFCE), o: { renameCollection: "db_0.two", to: "db_0.three", stayTemp: false, dropTarget: false }, wall: new Date(1507856549782) }
      d20003| 2017-10-12T21:02:49.458-0400 D ROLLBACK [rsBackgroundSync] Updating rollback FixUpInfo for local oplog entry: { ts: Timestamp 1507856549000|6, t: 1, h: 2189257796218325053, v: 2, op: "c", ns: "db_0.$cmd", ui: BinData(4, CDD4727C308942C392482D85F70AD907), o: { drop: "three" }, wall: new Date(1507856549781) }
      d20003| 2017-10-12T21:02:49.458-0400 D ROLLBACK [rsBackgroundSync] Updating rollback FixUpInfo for local oplog entry: { ts: Timestamp 1507856549000|5, t: 1, h: 6563969530498729032, v: 2, op: "c", ns: "db_0.$cmd", ui: BinData(4, CDD4727C308942C392482D85F70AD907), o: { create: "three", flags: 3, validationLevel: "strict", validationAction: "warn", idIndex: { v: 2, key: { _id: 1 }, name: "_id_", ns: "db_0.three" } }, wall: new Date(1507856549781) }
      d20003| 2017-10-12T21:02:49.458-0400 D ROLLBACK [rsBackgroundSync] Updating rollback FixUpInfo for local oplog entry: { ts: Timestamp 1507856549000|4, t: 1, h: -2445838164785118562, v: 2, op: "c", ns: "db_0.$cmd", ui: BinData(4, 57BC97DB6F264A5A891590A2D2E3EFCE), o: { collMod: "two", validator: { e: { $ne: 100.0 } }, validationLevel: "moderate", validationAction: "warn" }, o2: { collectionOptions_old: { uuid: BinData(4, 57BC97DB6F264A5A891590A2D2E3EFCE) } }, wall: new Date(1507856549780) }
      d20003| 2017-10-12T21:02:49.458-0400 D ROLLBACK [rsBackgroundSync] Updating rollback FixUpInfo for local oplog entry: { ts: Timestamp 1507856549000|3, t: 1, h: 2320841867216137252, v: 2, op: "c", ns: "db_0.$cmd", ui: BinData(4, 9E0EC528AE3F414DAD8D4294A253D63A), o: { drop: "three" }, wall: new Date(1507856549779) }
      d20003| 2017-10-12T21:02:49.458-0400 I ROLLBACK [rsBackgroundSync] Rollback common point is { ts: Timestamp 1507856549000|2, t: 1 }
      d20003| 2017-10-12T21:02:49.458-0400 I ROLLBACK [rsBackgroundSync] Starting refetching documents
      d20003| 2017-10-12T21:02:49.458-0400 I ROLLBACK [rsBackgroundSync] Finished refetching documents. Total size of documents refetched: 0
      d20003| 2017-10-12T21:02:49.458-0400 I ROLLBACK [rsBackgroundSync] Checking the RollbackID and updating the MinValid if necessary
      d20003| 2017-10-12T21:02:49.458-0400 I ROLLBACK [rsBackgroundSync] Setting minvalid to { ts: Timestamp 1507856553000|2, t: 2 }
      d20003| 2017-10-12T21:02:49.458-0400 I ROLLBACK [rsBackgroundSync] Dropping collections to roll back create operations
      d20003| 2017-10-12T21:02:49.458-0400 E -        [rsBackgroundSync] Assertion: 28539:need a valid database name src/mongo/db/concurrency/d_concurrency.cpp 198
      

            Assignee:
            benety.goh@mongodb.com Benety Goh
            Reporter:
            robert.guo@mongodb.com Robert Guo (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: