[SERVER-31793] Add a hook to check that all collections have UUIDs when featureCompatibilityVersion is 3.6 Created: 01/Nov/17  Updated: 20/Dec/17  Resolved: 20/Dec/17

Status: Closed
Project: Core Server
Component/s: Storage, Testing Infrastructure
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Maria van Keulen Assignee: Kevin Albertson
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-31792 UUIDs should be generated for non-rep... Closed
related to SERVER-32255 UUIDs may be absent from shard second... Closed
related to SERVER-32126 validate() should do basic sanity che... Closed
Backwards Compatibility: Fully Compatible
Sprint: TIG 2018-1-1, TIG 2017-12-18
Participants:

 Description   

The testing infrastructure should have a hook to ensure that all collections of all databases have UUIDs when featureCompatibilityVersion is 3.6 and no collections have UUIDs when featureCompatibilityVersion is 3.4.



 Comments   
Comment by Kevin Albertson [ 18/Dec/17 ]

The work for this is completed in SERVER-32126. The validate command now has UUID checks.

Comment by Max Hirschhorn [ 17/Nov/17 ]

I'd recommend modifying jstests/hooks/validate_collections.js to add this assertion about the UUID because it's already wired up to run between tests and at shutdown.

Comment by Judah Schvimer [ 06/Nov/17 ]

CollMods during rollback will be tested with SERVER-31805.

Comment by Maria van Keulen [ 06/Nov/17 ]

esha.maharishi and I discussed testing for SERVER-31792 and I will include the combinations enumerated in her bullet point list above in the testing for that ticket.

Comment by Max Hirschhorn [ 06/Nov/17 ]

I think Max Hirschhorn's idea is to do this check in the OpObserver if possible.

After discussing with geert.bosch in person, it doesn't seem viable because attempting to acquire the global S lock would be prone to deadlock since we'd still be holding the lock on the admin.system.version collection inside of the OpObserver.

Comment by Max Hirschhorn [ 06/Nov/17 ]

Max Hirschhorn, what's the negative of adding the hook everywhere? We already do this with validate which is a far more expensive check.

I don't think there's an issue with adding it everywhere, but I also don't think we should be relying on this check to test properties of the system that either (a) mongod could check itself, or (b) should have had a targeted test.

It's mainly because assigning UUIDs at startup is fairly complicated, since it differs depending on whether a node is a --shardsvr, --configsvr, or none, and whether it's a standalone or replica set.

It if was known to be complicated, then why didn't we write more exhaustive tests for it? Are there other things around handling UUIDs during upgrade/downgrade that warrant further testing? I asked because I've gotten an impression from the Replication team that we may be missing test cases around replication rollback of collMod operations during the upgrade/downgrade process.

Comment by Judah Schvimer [ 06/Nov/17 ]

Even if we don't call setFCV, every mongod inserts an fCV: 3.4, targetVersion: unset or fCV:3.6, targetVersion: unset document at some point, and in the former case expects no UUIDs and in the latter case expects UUIDs everywhere. I think max.hirschhorn's idea is to do this check in the OpObserver if possible.

Comment by Esha Maharishi (Inactive) [ 06/Nov/17 ]

judah.schvimer, what if setFCV is never called in the test, which is most tests?

I think the important cases are:

  • UUIDs assigned (or not) to admin, local, config, and user-created databases
  • --configsvr, --shardsvr, or none
  • standalone server or replica set
  • setFCV=3.6 called, setFCV=3.4 called, or setFCV never called

We can either explicitly test every legal combination of these (e.g., UUIDs are correctly assigned after startup to admin, local, config, and user databases for a standalone --shardsvr when setFCV never called), which becomes a finite set that someone will have to remember to update if we add/change configuration possibilities, or we can use a hook and assume our test coverage captures all cases and will continue to capture additional cases as they become possible?

Comment by Judah Schvimer [ 06/Nov/17 ]

esha.maharishi, would it not accomplish the same thing by mongod checking for UUIDs on upgrade completion or lack of UUIDs on downgrade completion?
max.hirschhorn, what's the negative of adding the hook everywhere? We already do this with validate which is a far more expensive check.

Comment by Esha Maharishi (Inactive) [ 06/Nov/17 ]

max.hirschhorn, it could be a way to ensure UUIDs get assigned (or not) correctly in all configurations. It's mainly because assigning UUIDs at startup is fairly complicated, since it differs depending on whether a node is a --shardsvr, --configsvr, or none, and whether it's a standalone or replica set.

Alternatively, we could add a test that explicitly checks every combination we can think of.

Comment by Max Hirschhorn [ 06/Nov/17 ]

maria.vankeulen, geert.bosch, is there a reason not to have mongod do this check itself upon fully-upgrading to featureCompatibilityVersion=3.6? Are there non-replicated collections that wouldn't exist upon fully-upgrading to featureCompatibilityVersion=3.6 that would exist at shutdown? I'm not particularly convinced this check would be worth adding to test suites that don't modify the server's featureCompatibilityVersion.

Comment by Judah Schvimer [ 02/Nov/17 ]

This was the result of a discussion about how SERVER-31792 slipped through the cracks. Checking every node for UUIDs at shutdown would have caught this. I don't expect it to have a significant impact on shutdown time since it just runs listCollections on each database, but would make sense for every single node individually at shutdown.

Comment by Maria van Keulen [ 02/Nov/17 ]

Apologies for the confusion; when judah.schvimer, esha.maharishi, and I were discussing this, I believe we wanted it to be run as part of shutting down the MongoDB deployment after each test.

Comment by Max Hirschhorn [ 01/Nov/17 ]

The testing infrastructure should have a hook to ensure that all collections of all databases have UUIDs when featureCompatibilityVersion is 3.6 and no collections have UUIDs when featureCompatibilityVersion is 3.4.

maria.vankeulen, could you expand on when you'd want such a hook to run? Is it something you'd like to run after each test, e.g. as part of shutting down the MongoDB deployment? I'd appreciate some more clarity since it seems like something that should either be (a) a library to use in UUID-related targeted tests, and/or (b) a hook to run with replication-related jstestfuzz*.yml test suites after each test.

Generated at Thu Feb 08 04:28:12 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.