During the investigating of BF-30194, a profile_lookup.js failure which happens when running as part of fcv_upgrade_downgrade_replica_sets_jscore_passthrough, it was uncovered that commands other than setFeatureCompatibilityVersion can be emitted implicitly as part of FCV upgrade/downgrade procedures.
Tests marked as {}require_profiling are already excluded from fcv_upgrade_downgrade_sharding_jscore_passthrough suites due to mongos lacking the system.profile collection, so this could only happen on fcv_upgrade_downgrade_replica_sets_jscore_passthrough.
For BF-30194 in particular, an unaccounted collMod command happens during the time profiling is enabled which increments the command counter on the collection targeted by the test. This later
Looking at the latest available evergreen failure at the moment of investigating, we can reconstruct the following timeline from the logs:
- [CODE] Profiler gets enabled with the filter "filter":{"command.setFeatureCompatibilityVersion":{"$not":{"$exists":true}},"db":"test"}}}}
[j2:prim] {"t": |
{"$date":"2023-11-29T09:30:37.095+00:00"} |
,"s":"I", "c":"COMMAND", "id":48742, "ctx":"conn320","msg":"Profiler settings changed","attr":{"from": |
{"level":0,"slowms":100,"sampleRate":1} |
,"to":{"level":1,"slowms":100,"sampleRate":1,"filter":{"command.setFeatureCompatibilityVersion":{"$not":{"$exists":true}}}},"db":"test"}} |
- [Problem] A collMod on collection foreign happens right after
[j2:prim] {"t": |
{"$date":"2023-11-29T09:30:37.103+00:00"} |
,"s":"I", "c":"COMMAND", "id":5324200, "ctx":"conn319","msg":"CMD: collMod","attr":{"cmdObj": |
{"collMod":"foreign"} |
}} |
- [CODE] The test query containing $lookup executes:
[j2:prim] {"t": |
{"$date":"2023-11-29T09:30:37.103+00:00"} |
,"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn320","msg":"Slow query","attr":{"type":"command","isFromUserConnection":true,"ns":"test.local","collectionType":"normal","appName":"MongoDB Shell","command":{"aggregate":"local","pipeline":\\\\\{"$lookup":{"from":"foreign","as":"res","localField":"a","foreignField":"a"}},"cursor":{},"lsid":{"id":{"$uuid":"d493d41d-7190-4a35-9a12-165c9f1cd823"}},"$clusterTime":{"clusterTime":{"$timestamp":{"t":1701250236,"i":10}},"signature":{"hash":{"$binary":{"base64":"AAAAAAAAAAAAAAAAAAAAAAAAAAA=","subType":"0"}},"keyId":0}},"$db":"test"},"planSummary":"COLLSCAN","planningTimeMicros":233,"keysExamined":0,"docsExamined":4,"cursorExhausted":true,"numYields":0,"nreturned":3,"queryHash":"9EB1333D","planCacheKey":"9B882338","queryFramework":"sbe","reslen":398,"locks":{"FeatureCompatibilityVersion":{"acquireCount":{"r":1}},"Global":{"acquireCount": |
{"r":1} |
}},"readConcern":{"level":"local","provenance":"implicitDefault"},"writeConcern":{"w":"majority","wtimeout":0,"provenance":"implicitDefault"},"storage":{},"remote":"127.0.0.1:58381","protocol":"op_msg","durationMillis":4}} |
- [CODE] Some time later the profiler gets sopped
[j2:prim] {"t": |
{"$date":"2023-11-29T09:30:37.116+00:00"} |
,"s":"I", "c":"COMMAND", "id":48742, "ctx":"conn320","msg":"Profiler settings changed","attr":{"from":{"level":1,"slowms":100,"sampleRate":1,"filter":{"command.setFeatureCompatibilityVersion":{"$not":{"$exists":true}}}},"to":{"level":0,"slowms":100,"sampleRate":1,"filter":{"command.setFeatureCompatibilityVersion":{"$not":{"$exists":true}}}},"db":"test"}} |
- [CODE] Then the test fails on the foreign command count assertion:
[js_test:profile_lookup] 2023-11-29T09:30:37.137Z assert: [1] != [2] are not equal |
[js_test:profile_lookup] doassert@src/mongo/shell/assert.js:20:14 |
[js_test:profile_lookup] assert.eq@src/mongo/shell/assert.js:176:17 |
[js_test:profile_lookup] @jstests\aggregation\sources\lookup\profile_lookup.js:57:8 |
[js_test:profile_lookup] uncaught exception: Error: [1] != [2] are not equal : |
[js_test:profile_lookup] doassert@src/mongo/shell/assert.js:20:14 |
[js_test:profile_lookup] assert.eq@src/mongo/shell/assert.js:176:17 |
[js_test:profile_lookup] @jstests\aggregation\sources\lookup\profile_lookup.js:57:8 |
[js_test:profile_lookup] Error: [1] != [2] are not equal |
[js_test:profile_lookup] failed to load: jstests\aggregation\sources\lookup\profile_lookup.js |
[js_test:profile_lookup] exiting with code -3 |
The test assumed that the pipeline will be the sole command involving collection foreign between the profiler start / stop calls, and therefore expected the command count on foreign to be equal to 1 (classic engine). This however wasn't the case due to the additional `collMod` command.
- related to
-
SERVER-81563 Remove profiling tests from fcv upgrade downgrade passthroguh suites
- Closed