[SERVER-44886] Remove and re-add shard test wait time is inconsistent with registry refresh timeout Created: 29/Nov/19 Updated: 06/Dec/22 Resolved: 16/Apr/20 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Marcos José Grillo Ramirez | Assignee: | [DO NOT USE] Backlog - Sharding Team |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: | |
| Assigned Teams: | Sharding |
| Operating System: | ALL |
| Steps To Reproduce: | 1. Run the jstests/sharding/remove2.js test until it fails. |
| Sprint: | Sharding 2019-12-16, Sharding 2019-12-30, Sharding 2020-01-13, Sharding 2020-01-27, Sharding 2020-02-10, Sharding 2020-02-24, Sharding 2020-03-09, Sharding 2020-03-23, Sharding 2020-04-06, Sharding 2020-04-20 |
| Participants: | |
| Linked BF Score: | 19 |
| Description |
|
The remove2.js test waits 20 seconds, hoping that the replica set monitor times out and refreshes the other shard information; however, the registry refresh timeout is 30 seconds, so the wait can end before the refresh happens. The test should instead wait until the registry has been successfully updated with the new shard information. |
| Comments |
| Comment by Dianna Hohensee (Inactive) [ 03/Dec/19 ] |
|
Thanks for pitching in, william.schultz! |
| Comment by William Schultz (Inactive) [ 03/Dec/19 ] |
|
https://github.com/mongodb/mongo/commit/d3b08f2dc93636a04f92d5448fdacfd447704607 should temporarily address this issue. |
| Comment by William Schultz (Inactive) [ 03/Dec/19 ] |
|
Working on a temporary fix for this that I can push very soon. |
| Comment by Dianna Hohensee (Inactive) [ 03/Dec/19 ] |
|
remove2.js has been failing about 90% of the time across many variants since yesterday. esha.maharishi, could this be prioritized? |
| Comment by William Schultz (Inactive) [ 02/Dec/19 ] |
|
I have recently been seeing this test fail a lot in my patch builds: https://evergreen.mongodb.com/version/5ddffb9de3c33144eb0bdf71. |
| Comment by Esha Maharishi (Inactive) [ 01/Dec/19 ] |
|
It may be worth checking if the connPoolStats command (which calls ReplicaSetMonitorManager::report, which calls ReplicaSetMonitor::appendInfo for each replica set) can be used in an assert.soon with a high timeout (e.g., 5 minutes) to make this wait less racy. |
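
The suggestion above could be sketched roughly as follows. This is a minimal illustration, not the fix that was committed: it assumes connPoolStats reports each monitored set under `replicaSets.<name>.hosts[].addr`, and the names `monitorKnowsHost`, `waitForMonitor`, `mongos`, `"rs1"`, and `"localhost:20001"` are hypothetical. `waitForMonitor` is only a stand-in for `assert.soon` so that the sketch runs outside the mongo shell.

```javascript
// Sketch only. The document shape used here (replicaSets.<name>.hosts[].addr)
// is an assumption about connPoolStats output; verify it against the server
// version under test.

// Returns true once the stats document lists `host` under replica set `rsName`.
function monitorKnowsHost(stats, rsName, host) {
    const rs = stats && stats.replicaSets ? stats.replicaSets[rsName] : undefined;
    if (!rs || !Array.isArray(rs.hosts)) {
        return false;
    }
    return rs.hosts.some((h) => h.addr === host);
}

// Hypothetical stand-in for assert.soon: polls getStats() up to maxAttempts
// times and returns the number of the attempt that succeeded.
function waitForMonitor(getStats, rsName, host, maxAttempts) {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        if (monitorKnowsHost(getStats(), rsName, host)) {
            return attempt;
        }
    }
    throw new Error(`monitor for ${rsName} never reported ${host}`);
}

// In the mongo shell, the same predicate could back the suggested wait
// (mongos, "rs1", and "localhost:20001" are placeholder names):
//   assert.soon(() => monitorKnowsHost(mongos.adminCommand({connPoolStats: 1}),
//                                      "rs1", "localhost:20001"),
//               "registry was not refreshed with the re-added shard",
//               5 * 60 * 1000);
```

Polling on the observed state rather than sleeping for a fixed 20 seconds removes the dependency on the 30-second registry timeout: the test proceeds as soon as the refresh is visible and fails only if it never happens within the generous upper bound.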