[JAVA-4183] Calling cursor.next() and cursor.close() concurrently in Load Balanced mode uses the same connection simultaneously Created: 01/Jun/21  Updated: 28/Oct/23  Resolved: 03/Aug/21

Status: Closed
Project: Java Driver
Component/s: Connection Management
Affects Version/s: 4.3.0
Fix Version/s: 4.3.1

Type: Bug Priority: Major - P3
Reporter: Backlog - Core Eng Program Management Team Assignee: Valentin Kavalenka
Resolution: Fixed Votes: 0
Labels: spec-compliance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Issue split
split to JAVA-4260 AsyncQueryBatchCursor should not try ... Backlog
split to JAVA-4259 Fix LB-related bugs in AsyncQueryBatc... Backlog
Quarter: FY22Q2
Documentation Changes: Not Needed

 Description   

DRIVERS Ticket Description
Script Target - If you can read this text, the script has failed


 Comments   
Comment by Githook User [ 03/Aug/21 ]

Author:

{'name': 'Valentin Kovalenko', 'email': 'valentin.kovalenko@mongodb.com', 'username': 'stIncMale'}

Message: Support calling `QueryBatchCursor.close` concurrently with other `QueryBatchCursor` methods (#765)

JAVA-4183
Branch: 4.3.x
https://github.com/mongodb/mongo-java-driver/commit/bfd25ddfb1497c17dacb852fc534de8b7eaff370

Comment by Githook User [ 30/Jul/21 ]

Author:

{'name': 'Valentin Kovalenko', 'email': 'valentin.kovalenko@mongodb.com', 'username': 'stIncMale'}

Message: Support calling `QueryBatchCursor.close` concurrently with other `QueryBatchCursor` methods (#765)

JAVA-4183
Branch: master
https://github.com/mongodb/mongo-java-driver/commit/7206b7100a3eb22ce7588460de0104c49809d8e2

Comment by Valentin Kavalenka [ 19/Jul/21 ]

Java driver cursor types hierarchy

  • legacy API
    • interface com.mongodb.Cursor
      • com.mongodb.DBCursor (uses MongoBatchCursorAdapter wrapping BatchCursor)
      • com.mongodb.MongoCursorAdapter (adapts MongoCursor to Cursor)
  • modern API
    • interface com.mongodb.client.MongoCursor
      • interface com.mongodb.client.MongoChangeStreamCursor
        • com.mongodb.client.internal.MongoChangeStreamCursorImpl (adapts BatchCursor to MongoChangeStreamCursor)
      • com.mongodb.client.internal.MongoBatchCursorAdapter (adapts BatchCursor to MongoCursor)
      • com.mongodb.client.internal.MongoMappingCursor (wraps MongoCursor)
    • interface com.mongodb.internal.operation.BatchCursor
      • interface com.mongodb.internal.operation.AggregateResponseBatchCursor
        • com.mongodb.internal.operation.ChangeStreamBatchCursor (wraps AggregateResponseBatchCursor, implements retries)
        • com.mongodb.internal.operation.QueryBatchCursor (will likely be renamed to CommandBatchCursor in CSOT)
          • com.mongodb.internal.operation.MapReduceInlineResultsCursor
      • interface com.mongodb.internal.operation.MapReduceBatchCursor
        • com.mongodb.internal.operation.MapReduceInlineResultsCursor
    • interface com.mongodb.internal.async.AsyncBatchCursor
      • interface com.mongodb.internal.async.AsyncAggregateResponseBatchCursor
        • com.mongodb.internal.operation.AsyncChangeStreamBatchCursor (wraps AsyncAggregateResponseBatchCursor, implements retries)
        • com.mongodb.internal.operation.AsyncQueryBatchCursor (will likely be renamed to AsyncCommandBatchCursor in CSOT)
      • com.mongodb.internal.operation.AsyncSingleBatchQueryCursor (adapts QueryResult* to AsyncBatchCursor)
        • com.mongodb.internal.operation.MapReduceInlineResultsAsyncCursor
      • interface com.mongodb.internal.operation.MapReduceAsyncBatchCursor
        • com.mongodb.internal.operation.MapReduceInlineResultsAsyncCursor
      • com.mongodb.internal.operation.ListCollectionsOperation.ProjectingAsyncBatchCursor (wraps AsyncBatchCursor)

Analysis

We can see that only QueryBatchCursor and AsyncQueryBatchCursor implement the cursor logic, while all other classes are wrappers.

AsyncQueryBatchCursor supported concurrent next and close operations even before the support for LB was implemented, and as far a I can tell, that is now supported even for LB mode: killCursorOnClose uses the pinnedConnection and is executed after the concurrent getMore returns, even if it returns with no results. This way users indeed can call close to cancel next if it spins doing getMore. However, the driver does not cancel getMore, and this operation is blocked until either results are available or maxTimeMS is elapsed (see AggregatePublisher.maxAwaitTime, FindPublisher.maxAwaitTime, ChangeStreamPublisher.maxAwaitTime.

QueryBatchCursor does not have the same mechanism, most likely because AggregateResponseBatchCursor is documented as @NotThreadSafe. However, the closed field is marked as volatile and before the LB changes the close method has been getting a new connection to killCursor, which strongly suggests that we allowed calling close concurrently with next to cancel it. Thus, QueryBatchCursor indeed has the bug described in this ticket.

A user has access to QueryBatchCursor via MongoCursor. I suppose for asynchronous API (AsyncQueryBatchCursor.close) the close methods is called as a result of calling Subscription.cancel.

Next steps

  1. Write a test that checks that next can be cancelled by close for both AsyncQueryBatchCursor and QueryBatchCursor.
  2. Fix QueryBatchCursor such that the test mentioned above passes.
  3. Fix the documentation
    1. MongoCursor documentation should clearly state that close can be called concurrently with other methods even though MongoCursor is not thread-safe.
    2. Explicitly mark MongoChangeStreamCursor with @NotThreadSafe similarly to MongoCursor.
    3. Fix examples in MongoCursor and MongoChangeStreamCursor as they do not seem to be correct: MongoCollection.find returns FindIterable, which is not MongoCursor. The only way to get MongoCursor/MongoChangeStreamCursor is by calling watch, which returns ChangeStreamIterable, which in turn has the method ChangeStreamIterable.cursor.
Generated at Thu Feb 08 09:01:26 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.