[SERVER-37197] Validation failure does not cause test to fail
Created: 18/Sep/18  Updated: 29/Oct/23  Resolved: 19/Sep/18

Status: Closed
Project: Core Server
Component/s: Testing Infrastructure
Affects Version/s: 4.1.3
Fix Version/s: 4.1.4

Type: Bug
Priority: Critical - P2
Reporter: Ian Boros
Assignee: Max Hirschhorn
Resolution: Fixed
Votes: 0
Labels: tig-dataconsistency
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Problem/Incident
is caused by SERVER-32642 Return raw command response in the va... Closed
Related
related to SERVER-53854 checkReplicatedDataHashes() may retur... Closed
is related to SERVER-33068 run_check_repl_dbhash.js hook exits w... Closed
Backwards Compatibility: Fully Compatible
Operating System: ALL
Sprint: TIG 2018-09-24
Participants:

 Description   

Here's an evergreen build from the waterfall:
https://evergreen.mongodb.com/task/mongodb_mongo_master_enterprise_rhel_62_64_bit_noPassthroughWithMongod_67ab3ae97432353dea52c74fcc5982ea4d4d7ae6_18_09_18_17_22_50

Take a look at the test all_paths_index_multikey.js

The test passes:
https://logkeeper.mongodb.org/lobster/build/d2e22b32dd16074741aac2c64787c330/test/5ba165d3be07c460f901a988#bookmarks=0%2C74

But validation fails:
https://logkeeper.mongodb.org/lobster/build/d2e22b32dd16074741aac2c64787c330/test/5ba165def84ae80c2001a1d6#bookmarks=0%2C124

Yet the test is not marked as a failure. The same happens if you run it in resmoke locally (the test is marked as successful, but validation fails). This is a pretty serious problem because it means collection validation could be failing for more severe reasons without our knowing.



 Comments   
Comment by Githook User [ 19/Sep/18 ]

Author: Max Hirschhorn <max.hirschhorn@mongodb.com> (visemet)

Message: SERVER-37197 Fix validateCollectionsThread() to check validate result.

It would otherwise silently ignore cases where collection validation had
failed.
Branch: master
https://github.com/mongodb/mongo/commit/78112be586c67efa877636d596b194650e90cbed

Comment by Max Hirschhorn [ 18/Sep/18 ]

Yet the test is not marked as a failure. The same happens if you run it in resmoke locally (the test is marked as successful, but validation fails). This is a pretty serious problem because it means collection validation could be failing for more severe reasons without our knowing.

Thanks for finding this ian.boros! I agree it is a serious problem.

The changes from c8f5485 as part of SERVER-32642 changed the CollectionValidator#validateCollections() function to return an object where it had previously returned a boolean. The if-condition in the snippet below wasn't updated to check the "ok" property. Because an object is always truthy, the negated condition is never true, and so the validateCollectionsThread() function always returns {ok: 1}.

const dbNames = conn.getDBNames();
for (let dbName of dbNames) {
    if (!validatorFunc(conn.getDB(dbName), {full: true})) {
        return {ok: 0, host: host};
    }
}
return {ok: 1};
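
For illustration, here is a minimal sketch of how the loop could check the "ok" property of the returned object instead of its truthiness. It is not necessarily the change made in the linked commit. The function name validateCollectionsOnHost, the stub connection, the stub validators, and the extra dbName and result fields in the failure response are all hypothetical and exist only so the example runs on its own.

```javascript
// Sketch only: assumes validatorFunc returns an object with an "ok" field,
// as described for CollectionValidator#validateCollections() above.
function validateCollectionsOnHost(conn, host, validatorFunc) {
    const dbNames = conn.getDBNames();
    for (const dbName of dbNames) {
        const res = validatorFunc(conn.getDB(dbName), {full: true});
        // An object is always truthy, so test the "ok" property explicitly.
        if (!res.ok) {
            // dbName and result are hypothetical additions for easier debugging.
            return {ok: 0, host: host, dbName: dbName, result: res};
        }
    }
    return {ok: 1};
}

// Hypothetical stand-in for a shell connection, used only by this example.
function makeStubConn(dbNames) {
    return {
        getDBNames: () => dbNames,
        getDB: (name) => ({name: name}),
    };
}

const stubConn = makeStubConn(["admin", "test"]);

// Hypothetical validators: one reports a failure for the "test" database,
// the other reports success for every database.
const failingValidator = (db, options) => ({ok: db.name === "test" ? 0 : 1});
const passingValidator = (db, options) => ({ok: 1});

const failed = validateCollectionsOnHost(stubConn, "localhost:27017", failingValidator);
const passed = validateCollectionsOnHost(stubConn, "localhost:27017", passingValidator);
```

With the original condition, failingValidator would also have produced {ok: 1}, because !{ok: 0} evaluates to false.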
