[JAVA-3969] Investigate implicit session test failure Created: 27/Jan/21  Updated: 01/Mar/21  Resolved: 01/Mar/21

Status: Closed
Project: Java Driver
Component/s: Reactive Streams, Test Coverage
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Jeffrey Yemin Assignee: Jeffrey Yemin
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
is related to JAVA-4027 Mono#doOnTerminate does not guarantee... Closed
Epic Link: Investigate flaky tests

 Description   

https://evergreen.mongodb.com/task/mongo_java_driver_tests_jdk_secure__version~latest_os~linux_topology~replicaset_auth~auth_ssl~ssl_jdk~jdk8_test_82129d04205d9b493ab910cfe3a4b17cf016b41e_21_01_26_23_53_26

[2021/01/26 02:03:17.415] com.mongodb.reactivestreams.SessionsTest > shouldPassAllOutcomes[server-support.json: Server supports implicit sessions] FAILED
[2021/01/26 02:03:17.415]     java.lang.AssertionError: {"insert": "test", "ordered": true, "$db": "session-tests", "lsid": {"id": {"$binary": {"base64": "30+bYp5xRHKeEJlhLM6GzA==", "subType": "04"}}}, "documents": [{"_id": 2}]}, {"find": "test", "filter": {"_id": -1}, "$db": "session-tests", "$clusterTime": {"clusterTime": {"$timestamp": {"t": 1611626597, "i": 8}}, "signature": {"hash": {"$binary": {"base64": "rCY2jhix6gqpEPNrVXHm16N/4oQ=", "subType": "00"}}, "keyId": 6921880654145650691}}, "lsid": {"id": {"$binary": {"base64": "hNMiwO6jTnG2lGmdTWaHYw==", "subType": "04"}}}} expected:<{"id": {"$binary": {"base64": "30+bYp5xRHKeEJlhLM6GzA==", "subType": "04"}}}> but was:<{"id": {"$binary": {"base64": "hNMiwO6jTnG2lGmdTWaHYw==", "subType": "04"}}}>
[2021/01/26 02:03:17.415]         at org.junit.Assert.fail(Assert.java:89)
[2021/01/26 02:03:17.415]         at org.junit.Assert.failNotEquals(Assert.java:835)
[2021/01/26 02:03:17.415]         at org.junit.Assert.assertEquals(Assert.java:120)
[2021/01/26 02:03:17.415]         at com.mongodb.client.AbstractUnifiedTest.executeOperations(AbstractUnifiedTest.java:561)
[2021/01/26 02:03:17.415]         at com.mongodb.client.AbstractUnifiedTest.shouldPassAllOutcomes(AbstractUnifiedTest.java:373)



 Comments   
Comment by Jeffrey Yemin [ 01/Mar/21 ]

Opened JAVA-4027 to fix the underlying issue

Comment by Jeffrey Yemin [ 01/Mar/21 ]

Decided on fourth option, after discussion with valentin.male.kovalenko@gmail.com. Since we document that Californium is supported, we should ensure that the driver behaves consistently and as expected when using that release. Therefore, we should stop depending on the behavior of Mono#doOnTerminate being consistent across all releases. This is easy to do: just wrap the sinkToCallback in another callback that first releases the binding.

Comment by Jeffrey Yemin [ 01/Mar/21 ]

Mystery solved. We're testing with Californium-S-17 release of Project Reactor. The behavior for Mono#doOnTerminate mentioned in the previous comment was not specified until a later release of Project Reactor. Testing with Dysprosium-SR17, I observe the documented behavior and the test no longer fails.

The question now is what to do about it. Options include:

  1. Disable the test
  2. Build and test with Dysprosium-SR17. This is easy but risky since we may unintentionally rely on a feature only available in Dysprosium-SR17.
  3. Compile with Californium-SR23 and test with Dysprosium-SR17.
Comment by Jeffrey Yemin [ 22/Feb/21 ]

The documentation of Mono#doOnTerminate seems to contradict my findings. It claims that

The Runnable is executed first, then the onNext/onComplete/onError signal is propagated downstream.

Comment by Jeffrey Yemin [ 22/Feb/21 ]

The test is trying to prove that the second operation in a sequence of two operations uses the same server session as was created by the first operation. For a synchronous client, this works fine, but in the reactive streams driver it sometimes fails. The root cause of this appears to be quite straightforward. The sequence of events is as follows:

  1. Test worker thread initiates insertOne, checks out a ServerSession, and blocks on the Mono in SyncMongoCollection, which is awaiting on a CountDownLatch
  2. I/O thread receives results of insert command. Calls onNext method of the Mono. The onNext method counts down the latch on the blocking subscriber in SyncMongoCollection
  3. TestWorker await on the latch wakes up, executes the find operation, and checks out another ServerSession 
  4. I/O thread calls Mono#doOnTerminate, which checks in the ServerSession from the first operation, but it's too late: a second server session was already checked out in the previous step.

I think we could make this pass consistently by releasing the ClientSession prior to executing onNext, instead of using Mono#doOnTerminate. But I'm not sure it's worth it, and we might just want to skip the test.

Generated at Thu Feb 08 09:00:54 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.