[SERVER-48096] PeriodicShardedIndexConsistencyChecker thread on jstests can cause unintended shard refreshes Created: 11/May/20 Updated: 29/Oct/23 Resolved: 29/Jun/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | 4.2.9, 4.4.1, 4.7.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Blake Oler | Assignee: | Tommaso Tocci |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | sharding-wfbf-day |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Backport Requested: | v4.4, v4.2 |
| Sprint: | Sharding 2020-06-29 |
| Participants: | |
| Linked BF Score: | 44 |
| Description |
Problem
The presence of the PeriodicShardedIndexConsistencyChecker thread causes unintended refreshes on shards in all config server stepdown suites. JS tests that rely on a shard's metadata being stale can fail sporadically because this thread runs on stepup.
Possible Solutions
Proven Affected Tests |
| Comments |
| Comment by Githook User [ 18/Aug/20 ] |
|
Author: {'name': 'Tommaso Tocci', 'email': 'tommaso.tocci@mongodb.com', 'username': 'toto-dev'}
Message: (cherry picked from commit e755577b7d01a1442f14a26c995632c3cf6f6b14) |
| Comment by Githook User [ 21/Jul/20 ] |
|
Author: {'name': 'Tommaso Tocci', 'email': 'tommaso.tocci@mongodb.com', 'username': 'toto-dev'}
Message: (cherry picked from commit e755577b7d01a1442f14a26c995632c3cf6f6b14) |
| Comment by Githook User [ 29/Jun/20 ] |
|
Author: {'name': 'Tommaso Tocci', 'email': 'tommaso.tocci@mongodb.com', 'username': 'toto-dev'}
Message: |
| Comment by Tommaso Tocci [ 29/Jun/20 ] |
|
cleanup_orphaned_cmd_prereload.js test has been removed in |
| Comment by Jack Mulrow [ 28/May/20 ] |
|
max.hirschhorn, I'd slightly prefer not to disable the checker in the config stepdown suites since it shouldn't affect the correctness of tests other than those that assert on the staleness of shards, which is really an implementation detail, and it'd be nice to keep as much coverage as possible for that assumption. I don't expect many tests would need to disable the thread and the ones that do probably use fail points we can grep for, so auditing shouldn't be that bad. That said, we do have coverage for the checker with config server stepdowns from the concurrency stepdown suites, and I'd be surprised if any concurrency workload relies on shard staleness, so I'd also be fine disabling the checker in just the sharding_csrs_continuous_config_stepdown suite. |
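For illustration only, here is a rough sketch of what disabling the checker for a single test might look like. It is not the fix that was committed for this ticket. It assumes the checker is controlled by the `enableShardedIndexConsistencyCheck` server parameter on the config servers and that ShardingTest forwards `other.configOptions` to them. `withIndexCheckerDisabled` is a hypothetical helper name, not something in the jstest libraries.

```javascript
// Sketch only: withIndexCheckerDisabled is a hypothetical helper, not part of
// the jstest libraries. It returns a copy of the given ShardingTest options
// with the index consistency checker turned off on the config servers,
// without mutating the caller's object.
function withIndexCheckerDisabled(shardingTestOptions = {}) {
    const other = shardingTestOptions.other || {};
    const configOptions = other.configOptions || {};
    const setParameter = configOptions.setParameter || {};

    return Object.assign({}, shardingTestOptions, {
        other: Object.assign({}, other, {
            configOptions: Object.assign({}, configOptions, {
                setParameter: Object.assign({}, setParameter, {
                    // Assumption: this parameter gates the checker thread.
                    enableShardedIndexConsistencyCheck: false,
                }),
            }),
        }),
    });
}

// Intended use inside a jstest (mongo shell only, so left as a comment here):
// const st = new ShardingTest(withIndexCheckerDisabled({shards: 2}));
```

Keeping the override per test rather than per suite would match the preference above: only tests that assert on shard staleness opt out, and the rest of the suite keeps coverage of the checker.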
| Comment by Max Hirschhorn [ 28/May/20 ] |
|
jack.mulrow, do you have thoughts for how we should handle the PeriodicShardedIndexConsistencyChecker? I saw you recently filed |