[DRIVERS-1577] Workarounds for killAllSessions in unified test runner Created: 22/Feb/21 Updated: 06/May/22 |
|
| Status: | Implementing |
| Project: | Drivers |
| Component/s: | Atlas Testing, Unified Test Runner |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Oleg Pudeyev (Inactive) | Assignee: | Jeremy Mikola |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Epic Link: | DRIVERS-828 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Driver Changes: | Needed | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Quarter: | FY22Q3 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Driver Compliance: |
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Description |
Action for DriversThe unified test format spec clarifies additional error codes that may be ignored when executing killAllSessions to work around Original DescriptionUnified test format currently contains the recommendation to kill all sessions before every test:
killAllSessions command does not work in Atlas (HELP-22193). The closest workaround is killing the user's own sessions, but this currently doesn't work due to server bug ( Until When Note that killing sessions is not strictly necessary in the test runner unless transactions are actually being used, and currently Atlas planned maintenance tests do not use transactions. |
| Comments |
| Comment by Githook User [ 09/Apr/21 ] |
|
Author: {'name': 'Jeremy Mikola', 'email': 'jmikola@gmail.com', 'username': 'jmikola'}Message: DRIVERS-1577 Note caveats for killAllSessions (#952) |
| Comment by Jeremy Mikola [ 08/Apr/21 ] |
| Comment by Jeremy Mikola [ 06/Apr/21 ] |
|
Follow-up: andreas.braun and I spoke about this earlier today and agreed to proceed with amending the unified test runner spec to detect an Atlas connection string and disable killAllSessions accordingly. |
| Comment by Jeremy Mikola [ 02/Apr/21 ] |
|
oleg.pudeyev, andreas.braun: If I understand correctly, this isn't just an issue with Comprehensive Atlas Testing (DRIVERS-828), but also applies to any testing against an Atlas cluster. For DRIVERS-828, the workload executor is a logical place to configure the test runner (e.g. invoke a disableKillAllSessions() method or otherwise pass an option when instantiating the test runner). But I'd worry a driver's general test suite might not have a place to do so if the test suite is simply invoked with an arbitrary connection string. Would it make more sense to just require the unified test runner to inspect the connection string before it decides to invoke killAllSessions and NOP if it sees an Atlas hosts? I think that could be done by checking for "mongodb.net" in the hostname. |
| Comment by Jeremy Mikola [ 30/Mar/21 ] |
|
Just spoke with andreas.braun about this today. We agreed that test runners would do well to implement some internal API that allows for disabling killAllSessions behavior. This was done for Beyond that, we can explore the other two ideas Andreas had above about lazily calling killAllSessions for tests that utilize sessions (before and after the tests run) and allowing killAllSessions to fail and ignore errors. Note that the first suggestion might not work for all cases since dropCollection may block on a previous lingering transaction – and in that case, the current test that would block might not utilize sessions at all so the "lazy" check might not trigger. |
| Comment by Jeremy Mikola [ 05/Mar/21 ] |
|
To confirm, the current language in the unified test format spec was lifted from the transactions test README. I believe that the reason the transaction spec test runner instructed calling killAllSessions before the test suite was because we don't know what state the server is in before starting the test suite, and killing all sessions ensures we have a blank slate. That may not be relevant in the case of Atlas driver testing where we're running workloads (which presently don't even use transactions) on new clusters. One idea andreas.braun and I discussed was shifting the responsibility of killing sessions to the end of each test when the entity map and its resources are destroyed; however, that doesn't address the issue of inheriting dirty state if the unified tests are run as part of a larger test suite. I asked in SERVER-54216 if that bug also applies to killAllSessionsByPattern when using LSID patterns (not user criteria). If not, I would propose that the test runner collects LSIDs of all sessions in the entity map at the end of each test and execute a single killAllSessionsByPattern command to kill them individually. Andreas had a few other suggestions:
Lastly, I opened mongodb/docs#5062 and reported the issue with the current server docs conflicting with the bug in |