[SERVER-32243] Add an option to have the validate hook skip some collections. Created: 08/Dec/17  Updated: 30/Oct/23  Resolved: 08/Jan/18

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: 3.7.1
Fix Version/s: 3.4.12, 3.6.3, 3.7.1

Type: Improvement
Priority: Major - P3
Reporter: Robert Guo (Inactive)
Assignee: Jonathan Abrahams
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Backports
Depends
Gantt Dependency
has to be done before SERVER-32704 sys-perf: Skip validating oplog as en... Closed
Backwards Compatibility: Fully Compatible
Backport Requested:
v3.6, v3.4, v3.2
Sprint: TIG 2018-1-15
Participants:

 Description   

For certain workloads that require large oplogs, it may be desirable to not validate the oplog.

This ticket intends to have the validate hook ignore collections that are passed in through TestData: e.g. TestData.validateIgnoredNS=['local.oplog.rs'].

Collection validation in the hook is done here.

In more detail, validate does 3 things: it checks that each document is valid BSON, checks that the indexes are valid, and checks the WT table structure. On the oplog, validate can only do the BSON check, because there are no indexes to validate on the oplog and WT can't verify the table on a live replica set since it can't get exclusive access. Effectively we'd only be missing the BSON validation, which I think is not as crucial on the oplog as on user collections. The oplog is always read, so we'd be converting documents in it into BSON anyway, the process of which should uncover some invalid BSON issues. Compare this with a user's document whose BSON gets corrupted when written to disk: if the document is never read again, we'd never find out about it unless we run validate().
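
As a rough illustration of the filtering being requested, here is a minimal sketch in plain JavaScript. The helper name `filterNamespacesToValidate` is hypothetical and is not the hook's actual code; it only assumes that the hook can produce a list of full namespaces ("db.collection") and that the skip list arrives as an array of the same form.

```javascript
// Hypothetical helper (not the hook's real implementation): returns the
// namespaces that should still be validated.
// `allNamespaces` and `skipList` are arrays of "db.collection" strings;
// a missing skip list means nothing is skipped.
function filterNamespacesToValidate(allNamespaces, skipList) {
    var skipped = skipList || [];
    return allNamespaces.filter(function(ns) {
        return skipped.indexOf(ns) === -1;
    });
}

// Example: the oplog is dropped from the list, user collections remain.
var toValidate = filterNamespacesToValidate(
    ["test.users", "local.oplog.rs", "test.orders"],
    ["local.oplog.rs"]);
```

Matching on the full namespace rather than on the collection name alone keeps a user collection that happens to be called `oplog.rs` in another database from being skipped by accident.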



 Comments   
Comment by Githook User [ 30/Jan/18 ]

Author:

{'email': 'jonathan@mongodb.com', 'name': 'Jonathan Abrahams', 'username': 'hptabster'}

Message: SERVER-32243 Add an option to have the validate hook skip some collections
Branch: v3.4
https://github.com/mongodb/mongo/commit/e6783ca2cac1e5d16b822e1508a1c025cdbded81

Comment by Githook User [ 30/Jan/18 ]

Author:

{'email': 'jonathan@mongodb.com', 'name': 'Jonathan Abrahams', 'username': 'hptabster'}

Message: SERVER-32243 Add an option to have the validate hook skip some collections

(cherry picked from commit 56ba266ca7eb46bfca0dc15ba0ca2290237db713)
Branch: v3.6
https://github.com/mongodb/mongo/commit/2856b480004af9f5987654420caeda209f85a2a8

Comment by Githook User [ 16/Jan/18 ]

Author:

{'email': 'henrik.ingo@mongodb.com', 'name': 'Henrik Ingo', 'username': 'henrikingo'}

Message: SERVER-32704 sys-perf: Skip validating oplog as enabled by SERVER-32243

(cherry picked from commit 0784425fa2d58b6a2bff3125b50be7f0d6a7f489)
Branch: v3.4
https://github.com/mongodb/mongo/commit/f1d65569536f53123e85dc879a53b40677de91ce

Comment by Githook User [ 15/Jan/18 ]

Author:

{'email': 'henrik.ingo@mongodb.com', 'name': 'Henrik Ingo', 'username': 'henrikingo'}

Message: SERVER-32704 sys-perf: Skip validating oplog as enabled by SERVER-32243

(cherry picked from commit 0784425fa2d58b6a2bff3125b50be7f0d6a7f489)
Branch: v3.6
https://github.com/mongodb/mongo/commit/0ab896759f14515780740e9d9d984fd6103f866c

Comment by Githook User [ 15/Jan/18 ]

Author:

{'email': 'henrik.ingo@mongodb.com', 'name': 'Henrik Ingo', 'username': 'henrikingo'}

Message: SERVER-32704 sys-perf: Skip validating oplog as enabled by SERVER-32243
Branch: master
https://github.com/mongodb/mongo/commit/0784425fa2d58b6a2bff3125b50be7f0d6a7f489

Comment by Jonathan Abrahams [ 10/Jan/18 ]

henrik.ingo Try using

TestData = { skipValidationNamespaces: ['local.oplog.rs'] };

Comment by Robert Guo (Inactive) [ 10/Jan/18 ]

henrik.ingo I realized there needs to be a couple of changes.

1. The name of the option was changed to skipValidationNamespaces from validateIgnoredNS.

2. You're right that TestData needs to be defined as well.

So you'll need something like this:

TestData = { skipValidationNamespaces: ['local.oplog.rs'] };
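
Assigning the whole object works whether or not TestData was already set, but it also discards any fields a caller may have put there earlier. A minimal sketch of a more defensive variant follows; the helper name `withSkipValidationNamespaces` is hypothetical and exists only for illustration.

```javascript
// Hypothetical helper: returns a new TestData-style object that carries the
// skip list while preserving any fields already present on `existing`.
// `existing` may be undefined (for example, when the shell was started
// without any TestData).
function withSkipValidationNamespaces(existing, namespaces) {
    var result = {};
    var source = existing || {};
    Object.keys(source).forEach(function(key) {
        result[key] = source[key];
    });
    // Copy the array so later changes by the caller do not leak in.
    result.skipValidationNamespaces = namespaces.slice();
    return result;
}

// In the shell this would be used roughly as:
//   TestData = withSkipValidationNamespaces(
//       typeof TestData === "undefined" ? undefined : TestData,
//       ['local.oplog.rs']);
var fresh = withSkipValidationNamespaces(undefined, ["local.oplog.rs"]);
var merged = withSkipValidationNamespaces({someOtherOption: true}, ["local.oplog.rs"]);
```

Here `someOtherOption` is a made-up field used only to show that existing settings survive the merge.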

Comment by Henrik Ingo (Inactive) [ 10/Jan/18 ]

Hi robert.guo, jonathan.abrahams

I'm trying to use this now from the DSI side, by

TestData.validateIgnoredNS=['local.oplog.rs'];
load('jstests/hooks/run_validate_collections');

The first line will give an error. Did you perhaps mean:

TestData = { validateIgnoredNS: ['local.oplog.rs'] };

Comment by Githook User [ 08/Jan/18 ]

Author:

{'name': 'Jonathan Abrahams', 'username': 'hptabster', 'email': 'jonathan@mongodb.com'}

Message: SERVER-32243 Add an option to have the validate hook skip some collections
Branch: master
https://github.com/mongodb/mongo/commit/56ba266ca7eb46bfca0dc15ba0ca2290237db713

Comment by Robert Guo (Inactive) [ 04/Jan/18 ]

Re Kevin: The newly added UUID checks will also be skipped. It's possible to validate only the UUID by adding another option to the validate() command, but I think the additional complexity and documentation outweigh the potential benefits, since it's unlikely that we would somehow fail to generate UUIDs only for large collections.

Re david.daly: The following script should do the job:

TestData = { skipValidationNamespaces: ['local.oplog.rs'] };
load('jstests/hooks/run_validate_collections.js');

I'm a bit hesitant to add this script to the jstests directory since it is used for a very specific purpose; putting it in the perf repo might be better aligned with its use case for the time being.

Alternatively, I think there might be a better solution. It should be possible to change this line and line 45 to decouple the name variable from the JS file name. SCRIPT_NAMES can then be made into a dictionary of lambdas that generate bash or mongo shell commands on the fly. I think doing it this way, instead of using a separate runner file, will be more flexible and should cover future needs for other JS files without the additional complexity of piping in more configuration options.
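
The idea of a dictionary of command-generating functions can be sketched as below. This is written in JavaScript only to match the other snippets in this ticket; the runner being discussed may well be in another language. Everything here is an assumption made for illustration: the entry names, the `buildCommand` helper, the `.js` file path, and the exact `mongo` invocation (it relies on the shell evaluating `--eval` before the listed file).

```javascript
// Hypothetical registry: each entry builds the shell command for one check,
// so the entry name no longer has to match a JS file name.
var SCRIPT_COMMANDS = {
    validate_collections: function(host) {
        return "mongo --host " + host + " jstests/hooks/run_validate_collections.js";
    },
    validate_collections_skip_oplog: function(host) {
        // Configuration is injected through --eval instead of a separate JS file.
        var setup = "TestData = { skipValidationNamespaces: ['local.oplog.rs'] };";
        return "mongo --host " + host + " --eval \"" + setup + "\"" +
            " jstests/hooks/run_validate_collections.js";
    }
};

// Looks up a generator by name and produces the command string for `host`.
function buildCommand(name, host) {
    var builder = SCRIPT_COMMANDS[name];
    if (!builder) {
        throw new Error("Unknown script name: " + name);
    }
    return builder(host);
}

var command = buildCommand("validate_collections_skip_oplog", "localhost:27017");
```

Adding a new check then means adding one entry to the registry, with no extra wrapper script checked in anywhere.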

Comment by Kevin Duong [ 12/Dec/17 ]

robert.guo How will this impact UUID checking? Seems like there's work needed on the server side?
