[DRIVERS-2004] Introduce prose concurrent stress tests for the Connection Pool Created: 03/Dec/21 Updated: 03/Oct/23 |
|
| Status: | Backlog |
| Project: | Drivers |
| Component/s: | CMAP |
| Fix Version/s: | None |
| Type: | Spec Change | Priority: | Unknown |
| Reporter: | Valentin Kavalenka | Assignee: | Unassigned |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
| Driver Changes: | Needed |
| Driver Compliance: |
|
| Description |
Summary
The existing specification tests do not adequately test the Connection Pool under intensive concurrent usage: not only do they use the Connection Pool lightly and not necessarily concurrently, they also try to achieve the same predictable execution every time in order to observe a specific set of events in a specific order. Concurrent stress tests take a different approach: they subject the object under test to intensive, variable usage in order to produce many different execution paths, including otherwise rare ones. This is possible because the assertions in concurrent stress tests are simpler than those in usual tests: they usually expect the absence of unexpected behaviors, e.g., exceptions or deadlocks, and potentially compliance with very basic specified guarantees, like not returning null values from non-null methods.
Motivation
Who is the affected end user?
Driver engineers.
How does this affect the end user?
Having such tests increases the probability of a driver engineer discovering a bug in the Connection Pool before releasing the changes to driver users.
How likely is it that this problem or use case will occur?
It depends on the complexity of the changes to the Connection Pool. While working on "Avoiding connection storms" (
If the problem does occur, what are the consequences and how severe are they?
If we miss a concurrency bug in the Connection Pool, it may cause serious problems in the application that uses the driver, e.g., deadlocks or memory leaks. It is impossible to be more concrete.
Is this issue urgent?
No.
Is this ticket required by a downstream team?
No.
Is this ticket only for tests?
Yes.
Details
I am providing an overview of what the Java MongoDB driver's concurrent stress tests that are not specific to the driver do. I think that the description of these tests should be quite vague, allowing individual drivers to create tests that are more appropriate for them and have higher chances of discovering bugs.
Even implementations of simpler specification prose tests sometimes differ from the prose description, at least in the Java driver.
Concurrent usage stress test.
Create a pool with various minPoolSize/maxPoolSize and maxIdleTimeMS values, as well as any non-standard options that your pool may have. Using extreme option values may be helpful.
Concurrently use the pool (checkOut/checkIn, synchronously/asynchronously) with different numbers of concurrent users.
Spontaneously invalidate (clear followed by ready), spontaneously clear, or spontaneously ready the pool while it is in use; vary the probability of such disturbances across executions.
Expectations:
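To make the description of the concurrent usage stress test more concrete, here is a minimal Java sketch of its overall shape. It is not the Java driver's actual test. `ToyPool`, `ExpectedPoolException`, the `run` method and all parameter values are hypothetical stand-ins introduced only for illustration; a real test would exercise the driver's own Connection Pool and its real options. The sketch treats errors the pool is allowed to raise (a cleared pool, a checkOut timeout) as normal, and counts anything else, including a null connection, as a failure. A run that does not terminate is reported as a possible deadlock.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ConcurrentUsageStressSketch {
    /** Hypothetical marker for errors the pool is allowed to raise. */
    static final class ExpectedPoolException extends RuntimeException {
        ExpectedPoolException(String message) { super(message); }
    }

    /** Hypothetical toy pool; a real test would target the driver's own Connection Pool. */
    static final class ToyPool {
        private final Semaphore permits;
        private final AtomicInteger generation = new AtomicInteger();
        private volatile boolean paused;

        ToyPool(int maxPoolSize) { permits = new Semaphore(maxPoolSize, true); }

        /** Returns a "connection", represented here only by the generation it belongs to. */
        Integer checkOut(long timeoutMs) throws InterruptedException {
            if (paused) throw new ExpectedPoolException("pool is cleared");
            if (!permits.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new ExpectedPoolException("checkOut timed out");
            }
            return generation.get();
        }
        void checkIn(Integer connection) { permits.release(); }
        void clear() { paused = true; generation.incrementAndGet(); }
        void ready() { paused = false; }
    }

    /** Runs one execution and returns the number of unexpected failures observed. */
    static int run(int maxPoolSize, int users, double disturbanceProbability, long durationMs) {
        ToyPool pool = new ToyPool(maxPoolSize);
        AtomicInteger unexpected = new AtomicInteger();
        ExecutorService executor = Executors.newFixedThreadPool(users);
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMs);
        for (int i = 0; i < users; i++) {
            executor.execute(() -> {
                ThreadLocalRandom random = ThreadLocalRandom.current();
                while (System.nanoTime() - deadline < 0) {
                    try {
                        if (random.nextDouble() < disturbanceProbability) {
                            switch (random.nextInt(3)) {
                                case 0: pool.clear(); pool.ready(); break; // invalidate
                                case 1: pool.clear(); break;               // spontaneous clear
                                default: pool.ready();                     // spontaneous ready
                            }
                        }
                        Integer connection = pool.checkOut(100);
                        if (connection == null) unexpected.incrementAndGet(); // basic guarantee violated
                        else pool.checkIn(connection);
                    } catch (ExpectedPoolException e) {
                        // Specified behavior, not a failure.
                    } catch (Throwable t) {
                        unexpected.incrementAndGet();
                    }
                }
            });
        }
        executor.shutdown();
        try {
            // Users that are still running long after the deadline suggest a deadlock.
            if (!executor.awaitTermination(durationMs + 5_000, TimeUnit.MILLISECONDS)) {
                executor.shutdownNow();
                throw new AssertionError("users did not finish: possible deadlock");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError("interrupted while waiting for users", e);
        }
        return unexpected.get();
    }

    public static void main(String[] args) {
        int[] poolSizes = {1, 4, 1_000};           // includes extreme values
        int[] userCounts = {1, 8, 64};
        double[] probabilities = {0.0, 0.01, 0.5}; // how often to disturb the pool
        for (int size : poolSizes) {
            for (int users : userCounts) {
                for (double p : probabilities) {
                    int failures = run(size, users, p, 50);
                    System.out.printf("maxPoolSize=%d users=%d p=%.2f unexpected=%d%n", size, users, p, failures);
                }
            }
        }
    }
}
```

The nested loops in `main` reflect the idea of varying pool options, the number of concurrent users and the disturbance probability between executions; the specific values are arbitrary.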
I cannot stress enough how helpful it is to have assertions (checks of expectations that may be violated if and only if the driver code is incorrect) in the Connection Pool code itself.
Hand-over mechanism concurrent stress test.
While this test may not be relevant to some drivers, I know that it is relevant to some others besides the Java driver. The specification states "the Pool MUST NOT service any newer checkOut requests before fulfilling the original one which could not be fulfilled". This fairness requirement means that a checked-in connection must become available only to the checkOut request that has been waiting the longest. I refer to the mechanism that achieves this behavior as the hand-over mechanism. In some drivers, including the Java driver, it adds enough complexity to warrant additional testing.
Create a pool with connections that do not expire and with no background thread populating it. The maxPoolSize must be equal to maxConnecting + openConnectionsCount + wiggleCount. The meaning of the last two terms will soon become clear.
Expectations:
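The fairness requirement quoted in the hand-over section can be illustrated with a small Java check. This is not the hand-over stress test itself, whose steps are not reproduced here, and it does not use the Java driver's pool. It is a deterministic sketch under stated assumptions: a fair `java.util.concurrent.Semaphore` with a single permit stands in for a pool with one connection, and `HandOverFairnessSketch` and `serviceOrder` are hypothetical names. Waiters are queued one at a time, the single "connection" is then checked in, and the order in which waiters obtain it is recorded.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Semaphore;

final class HandOverFairnessSketch {
    /** Returns the order in which queued waiters obtained the single "connection". */
    static List<Integer> serviceOrder(int waiters) {
        // Hypothetical stand-in for a fair pool with exactly one connection.
        Semaphore pool = new Semaphore(1, true);
        List<Integer> order = Collections.synchronizedList(new ArrayList<>());
        pool.acquireUninterruptibly(); // check out the only connection
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < waiters; i++) {
            final int id = i;
            Thread waiter = new Thread(() -> {
                pool.acquireUninterruptibly(); // blocks until the connection is handed over
                order.add(id);                 // recorded while holding the connection
                pool.release();                // check in, letting the next waiter proceed
            });
            waiter.start();
            threads.add(waiter);
            // Start the next waiter only after this one is queued, so arrival order is known.
            while (pool.getQueueLength() < i + 1) {
                Thread.onSpinWait();
            }
        }
        pool.release(); // check in the connection; the longest waiter should get it first
        for (Thread waiter : threads) {
            try {
                waiter.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting for waiters", e);
            }
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(serviceOrder(5));
    }
}
```

A pool that satisfies the fairness requirement should serve the waiters in the order in which they arrived. A stress variant would repeat this under concurrent check-ins and pool disturbances instead of a single controlled hand-over.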
|
| Comments |
| Comment by Boris Dogadov [ 11/Oct/22 ] |
|
patrick.freed@mongodb.com and I think that this test provides very valuable coverage for any CMAP-related changes, and suggest considering it for the upcoming quarter. |
| Comment by Patrick Freed [ 14/Dec/21 ] |
|
These tests seem like they would be very valuable for all drivers to implement. Putting this in the backlog for now; we will revisit it before the next quarterly planning. |