Java Driver / JAVA-3828

Client Side Operations Timeout

    • Type: Epic
    • Resolution: Unresolved
    • Priority: Major - P3
    • None
    • Affects Version/s: None
    • Component/s: CSOT, Retryability
    • Labels: None
    • DRIVERS-555: NA
    • 1. What would you like to communicate to the user about this feature?
      2. Would you like the user to see examples of the syntax and/or executable code and its output?
      3. Which versions of the driver/connector does this apply to?
    • 9
    • 31.5
    • 31
    • 200

      Engineer(s): Slava

      Summary: Allow users to configure the timeout on operations by using a single timeout setting.

      2024-04-12: Removing Target End Date

      What was completed over the last two weeks?

      • Error transformations in review
      • Replace Timeout API with lambdas in review
      • Write Concern timeout in progress

      What's the focus over the next two weeks?

      • Address race conditions, bug fixes, finalize test coverage
      • Undeprecate timeout APIs

      Impediments encountered over the last two weeks:

      • As raised in the last update, additional spec work is needed before we feel comfortable releasing CSOT in the Java Driver, due to the potential for connection churn (tickets filed by Shane/Matt). We will therefore undeprecate some things for now and pick up the remaining spec work once unblocked in Q2.
      • As such, Ross has rolled off this project. Slava will complete the immediate remaining task, after which the project will go on the backlog until the spec work is completed.

       


      2024-03-28: Target date set to 2024-04-12

      What was completed over the last two weeks?

      • Centralize setting maxTimeMS within CommandMessage and support changing timeoutMS at client level
      • Deprecation of timeout config options
      • Evaluating test coverage

      What's the focus over the next two weeks?

      • Address race conditions, bug fixes, finalize test coverage
      • Wrapping up the project

      Impediments encountered over the last two weeks:

      • HELP tickets and a couple bug reports in Kotlin took some time away from CSOT

      2024-03-15: Target date set to 2024-03-29

      What was completed over the last two weeks?

      • Transactions, Convenient API, Client-side Encryption, Change Streams and connection checkout/establishment

      What's the focus over the next two weeks?

      • Wrapping up the project

      Impediments encountered over the last two weeks:

      • HELP ticket came in that required attention and a release and took time away from CSOT

      2024-02-16: No change to target date

      What was completed over the last two weeks?

      • Updates to Handshake, Auth, CRUD, Connection Pool and a refactor for socket timeout

      What's the focus over the next two weeks?

      • Wrapping up changes for Transactions, Convenient API, and Client-side Encryption
      • Change Streams and connection checkout/establishment
      • Note that the CSOT planned end date is 2 weeks away; an additional week may be required to address remaining tickets (TBD)

      Impediments encountered over the last two weeks:

      • Transaction changes have taken a bit longer than expected and required additional refactoring of TimeoutSettings
      • Flaky encryption tests - was able to get clarity on expected test behavior from Kevin

      2024-02-02: No change to target date

      • SDAM, CRUD, GridFS and misc. refactor tasks to support timeouts completed
      • Client-side encryption, transactions + convenient API in progress
      • Handshake and Auth in code review

      2024-01-19: Updated target end date to 2024-03-01

      • Working on Auth, Connection checkout and handshake, CSFLE
      • Connection Pool, SDAM, CRUD in review
      • No blockers, updated with revised estimate (additional 10 eng wks)

      2024-01-01: No change to target end date

      • GridFS synchronous in review
        • reactive streams in progress
      • CRUD deprecations in review
      • Command Execution merged
      • Investigating sync adapter implementation
        • New test added, currently in review

      2023-12-08: No change to target end date

      • Retryable-reads and retryable-writes updates to support CSOT currently in review
      • Command Execution updates to support CSOT in progress
      • Up next: CRUD and SDAM

      2023-11-22: 

      • In progress:
        • Update existing timeouts according to the retryable-reads and retryable-writes specification to support CSOT
        • Cursor handling and timeoutModes now in review

      2023-11-10: 

      • Batch cursor refactor completed
      • Maxim has handed off his part of CSOT to Slava in advance of his upcoming leave
      • In progress:
        • Update existing timeouts according to the retryable-reads and retryable-writes specification to support CSOT
      • Corrected the estimated final cost to 23 weeks
        • No change to the target end date here, est. final cost should have been updated to 23 when we got the revised estimate

      2023-10-30: Updated target end date to 2024-01-19

      • Refactoring to obtain connectTimeoutMS from TimeoutContext rather than setting directly
      • Batch cursor refactor in progress to simplify future CSOT work

      2023-10-13: Updated target end date to 2023-12-15

      • Team has broken down remaining work and updated estimate
      • Async and sync command cursors are now much more closely aligned, with a lot of code reuse
        • Sync now uses a single batch cursor, which also brings the async and sync implementations into line
      • Server selection is in progress using the TimeoutContext now available in OperationContext

      2023-09-29: Updated target end date to 2023-12-01

      • timeoutMS is now in the settings and is used by operations when calculating maxTimeMS
      • The refactor to propagate an OperationContext down through the operation layer into the connection layer was merged earlier this week
      • Server selection is in progress using the TimeoutContext now available in OperationContext
      • The end date is based on a rough estimate atm and will be revised once the team has more confidence in the estimates

      2023-09-15: Set target date to 2023-10-13

      • Prerequisite refactors necessary to implement CSOT have been completed
        • Still need to decide between using timepoints vs deadlines (decision deferred for now)
      • ServerSelection and CRUD in progress
      • Pushed planned end date out as this will not be done by September 22nd

      Engineer(s): Ross, Maxim

      Summary: Allow users to configure the timeout on operations by using a single timeout setting.

      2023-09-01: Set target date to 2023-09-22

      • Refactor of OperationHelper & CommandOperationHelper completed
      • Cursor refactor and representing timeouts internally as deadlines in progress
      • Currently Ross is the only one working on this and Maxim will begin upon his return from PTO

      Engineer(s): Ross

      Summary: Allow users to configure the timeout on operations by using a single timeout setting.

      Cost in Eng Weeks: 8 Original | 8 To Date | 10 Est Final

      2021-06-15:

      • On pause till Kafka 5.0 support is complete

      2021-06-02: Updated target end date to 2021-06-25

      • CRUD changes in review. First round of change requests up
      • Currently working on cursors, timeouts and GridFS
      • Ross is mostly OOO this week since he is moving

      2021-05-18: Maintaining target end date of 2021-05-28

      • CRUD changes in review
      • Currently working on cursors, timeouts and GridFS
      • Ross also worked on answering some user questions on Spark

      2021-05-04: Updated target end date to 2021-05-28

      • Continuing work on the integration with operation layer
      • Close to wrapping up the CRUD integration
      • Server selection up next
      • Ross mentioned that the impl. touches a lot of code and the code review process won't be quick so we've updated the end date accordingly

      2021-04-20: Maintaining target end date of 2021-05-12

      • No new updates since last time. Still working on the operations layer and CRUD integration
      • Ross was OOO the week before last and was Skunkking last week
      • He is also working on 2 HELP tickets atm

      2021-04-06: Maintaining target end date of 2021-05-12

      • Currently working on updating existing timeouts according to the CRUD spec changes
      • Ross is OOO this week

      2021-03-23: Setting initial target end date to 2021-05-12

      • Adding timeoutMS to settings in code review
      • Currently working on updating existing timeouts according to the CRUD spec changes


      Useful Info

      Summary

      Allow users to configure the timeout on operations by using a single timeout setting.

      Motivation

      Users have an array of options governing timeouts:
      Driver timeouts: server selection timeout, socket write timeout, socket read timeout, socket connect timeout.
      Server timeouts: maxTimeMS, maxAwaitTimeMS, wTimeout.
      Users are often unaware that all of these settings exist, or of the effect they have on the timeout behavior of the driver and server. As a result, users often leave these settings at their defaults, which do not necessarily give the timeout behavior they want. Interaction with retryable writes compounds the problem. The current settings give no clear answer to how long a user will wait for any given read or write.

      We should also determine whether the desired behavior is to retry as many times as possible within the defined time period.
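
      As a rough illustration of the single-timeout idea and the retry question above, the sketch below tracks one deadline per operation and derives each attempt's budget from the time remaining. It is not the Java Driver's implementation: OperationDeadline, serverBudgetMs, and retryWithin are hypothetical names, and subtracting a round-trip estimate before sending a server-side limit such as maxTimeMS is an assumption made for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongFunction;
import java.util.function.LongSupplier;

/** Hypothetical sketch: one deadline shared by every attempt of an operation. */
final class OperationDeadline {
    private final long deadlineNanos;
    private final LongSupplier nanoClock;

    /** The clock is injectable so the behavior can be tested without sleeping. */
    OperationDeadline(long timeoutMs, LongSupplier nanoClock) {
        this.nanoClock = nanoClock;
        this.deadlineNanos = nanoClock.getAsLong() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    }

    /** Convenience factory that uses the JVM's monotonic clock. */
    static OperationDeadline startingNow(long timeoutMs) {
        return new OperationDeadline(timeoutMs, System::nanoTime);
    }

    /** Whole milliseconds left before the deadline, never negative. */
    long remainingMs() {
        long remainingNanos = deadlineNanos - nanoClock.getAsLong();
        return Math.max(0, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
    }

    /** Less than one whole millisecond left counts as expired. */
    boolean hasExpired() {
        return remainingMs() <= 0;
    }

    /**
     * Budget that could be handed to the server for one attempt (assumption:
     * the remaining time minus an estimated network round trip).
     */
    long serverBudgetMs(long roundTripMs) {
        long budget = remainingMs() - roundTripMs;
        if (budget <= 0) {
            throw new IllegalStateException("operation timed out before it could be sent");
        }
        return budget;
    }

    /**
     * Retries the attempt until it succeeds or the shared deadline passes.
     * Each attempt receives the milliseconds still remaining when it starts.
     */
    static <T> T retryWithin(OperationDeadline deadline, LongFunction<T> attempt) {
        RuntimeException lastFailure = null;
        while (!deadline.hasExpired()) {
            try {
                return attempt.apply(deadline.remainingMs());
            } catch (RuntimeException e) {
                lastFailure = e; // remember the cause and retry if time remains
            }
        }
        throw new IllegalStateException("operation timed out", lastFailure);
    }
}
```

      With this shape, a caller would create the deadline once (for example OperationDeadline.startingNow(5000)) and pass it to every attempt, so retries consume the same budget rather than each getting a fresh timeout. A real retry policy would also restrict which errors are retryable, which this sketch does not attempt.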

      Cast of Characters

      Lead: Jeff Yemin
      Author: Divjot Arora
      POCs: Java, C, Go, Swift
      Product Owner:
       

      Documentation

      Scope Document


      JQL search link to all sub tasks

            Assignee: Ross Lawley (ross@mongodb.com)
            Reporter: Esha Bhargava (esha.bhargava@mongodb.com)
            Votes: 3
            Watchers: 7

              Created:
              Updated: