[SERVER-84343] Create multiversion sharding concurrency suite with continuous stepdown Created: 20/Dec/23  Updated: 21/Dec/23

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Tommaso Tocci Assignee: Backlog - Catalog and Routing
Resolution: Unresolved Votes: 0
Labels: car-qw
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Catalog and Routing
Participants:
Story Points: 3

Description

I realized that our current test coverage is not sufficient to catch bugs caused by the introduction of backward-incompatible metadata.

For instance, adding a phase to a sharding DDL coordinator that previous versions do not recognize would be such a change.

The ideal solution would be to add a suite, both for core-passthrough and FSMs, in which we continuously perform the full upgrade/downgrade procedure, changing both the FCV and the binaries. This proposal is tracked by PM-3219.

On the other hand, there is an easier, intermediate solution that would allow us to catch most of these backward-incompatibility bugs.
We could simply run a sharding continuous stepdown suite (e.g. concurrency_sharded_with_stepdowns) in an implicit multiversion variant (e.g. Enterprise RHEL 8.0 (implicit multiversion & all feature flags)).
By causing elections in these variants, we would implicitly cause the coordinator of a DDL operation to move to a node running a different binary version while the operation is ongoing, allowing us to spot possible backward-incompatibility bugs.
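
To illustrate the failure mode such a suite is meant to catch, here is a toy model in Python. It is not MongoDB code: the phase names, the persisted document shape, and the helper functions are all hypothetical, and the real coordinators are implemented in the server in C++. It only sketches why a stepdown to a node on an older binary can break an in-flight DDL operation when a new phase has been persisted.

```python
from enum import Enum


class PhasesOldBinary(Enum):
    """Phases known to the (hypothetical) previous binary version."""
    UNSET = "unset"
    COMMIT = "commit"


class PhasesNewBinary(Enum):
    """Phases known to the (hypothetical) latest binary version."""
    UNSET = "unset"
    PREPARE = "prepare"  # hypothetical phase added in the new version
    COMMIT = "commit"


def resume_coordinator(state_doc, known_phases):
    """Parse the persisted phase as a newly elected primary would.

    Raises ValueError if the phase is unknown to this binary.
    """
    return known_phases(state_doc["phase"])


def can_resume(state_doc, known_phases):
    """Return True if a node with the given phase set can resume the operation."""
    try:
        resume_coordinator(state_doc, known_phases)
        return True
    except ValueError:
        return False


# State document persisted by a primary running the new binary (hypothetical shape).
persisted = {"op": "renameCollection", "phase": PhasesNewBinary.PREPARE.value}

# After an election, the new primary may run either binary in a multiversion cluster.
resumable_on_new = can_resume(persisted, PhasesNewBinary)
resumable_on_old = can_resume(persisted, PhasesOldBinary)
```

In this sketch the node on the new binary can resume the operation, while the node on the old binary cannot parse the persisted phase. A suite without stepdowns would never exercise the second case.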

We already have sharded_retryable_writes_downgrade, which should cover the core tests, but we are missing the counterpart for concurrency tests.


Generated at Thu Feb 08 06:54:44 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.