[CDRIVER-3615] Reduce race conditions in SDAM error handling
Created: 13/Apr/20  Updated: 28/Oct/23  Resolved: 15/May/20

Status: Closed
Project: C Driver
Component/s: None
Affects Version/s: None
Fix Version/s: 1.17.0-beta2, 1.17.0

Type: Improvement
Priority: Major - P3
Reporter: Backlog - Core Eng Program Management Team
Assignee: Kevin Albertson
Resolution: Fixed
Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by CDRIVER-3535 Reduce Client Time To Recovery On Top... Closed
is depended on by CXX-1989 Reduce race conditions in SDAM error ... Closed
Duplicate
is duplicated by CDRIVER-3648 Resync SDAM error handling spec tests Closed
Related
related to CDRIVER-3645 Cluster nodes are not removed on topo... Backlog
related to PHPC-1589 Reduce race conditions in SDAM error ... Closed
Epic Link: C 4.4 Support

 Description   

Summary of required driver work:
To reduce race conditions in SDAM error handling, drivers will update their SDAM and error handling logic to:

Add topologyVersion field to ServerDescription.
Ignore stale errors based on generation number and topologyVersion.
Implement SDAM spec tests for topologyVersion comparison.
Implement SDAM spec tests for handling mock application errors with the new tests in the tests/errors directory.
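
The stale-error check described in the items above can be sketched in C as follows. This is a minimal illustration, not the libmongoc implementation: the type topology_version_t and the functions compare_topology_version and is_stale_error are hypothetical names, and the comparison rules (a missing topologyVersion or a different processId means "treat the response as newer"; otherwise compare counters) are assumed from the SDAM specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for topologyVersion: {processId, counter}. */
typedef struct {
   uint8_t process_id[12]; /* raw ObjectId bytes */
   int64_t counter;
} topology_version_t;

/* Compare the topologyVersion currently stored on the ServerDescription
 * against the one carried by a response or error.
 * Returns -1 if the response should be treated as newer (including when
 * either value is missing or the processIds differ), 0 if equal, and 1 if
 * the current value is newer. */
static int
compare_topology_version (const topology_version_t *current,
                          const topology_version_t *response)
{
   if (current == NULL || response == NULL) {
      return -1; /* not comparable: assume the response is newer */
   }
   if (memcmp (current->process_id, response->process_id,
               sizeof current->process_id) != 0) {
      return -1; /* server process restarted: assume the response is newer */
   }
   if (current->counter == response->counter) {
      return 0;
   }
   return current->counter < response->counter ? -1 : 1;
}

/* An error is stale if it came from a connection created before the pool
 * was last cleared, or if its topologyVersion is not newer than the one
 * already recorded for the server. */
static bool
is_stale_error (uint32_t pool_generation,
                uint32_t error_generation,
                const topology_version_t *current,
                const topology_version_t *error_tv)
{
   if (error_generation < pool_generation) {
      return true;
   }
   return compare_topology_version (current, error_tv) >= 0;
}
```

Under these assumptions, an error whose topologyVersion equals the recorded one is ignored, while an error with no topologyVersion on a current-generation connection is still handled.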
Spec changes:
Spec commit (part 1 of 2):
Author: Shane Harvey <shane.harvey@mongodb.com> (ShaneHarvey)

Message: SPEC-1663 Reduce race conditions in SDAM error handling (#781)
Add topologyVersion field to ServerDescription.
Add SDAM spec tests for topologyVersion comparison.
Clients ignore stale errors based on generation number and topologyVersion.
Branch: master
https://github.com/mongodb/specifications/commit/7dd9c008be6edaf65368787d5e91c2ac3aeb95b8

Spec commit (part 2 of 2):
Author: Shane Harvey <shane.harvey@mongodb.com> (ShaneHarvey)

Message: SPEC-1663 Add SDAM tests for detecting stale application errors (#787)
Add applicationErrors to configure mock errors.
Add pool generation to outcome assertion.
Add optional "description" field to each phase.
Add tests for network errors before/after handshake completes.
Add tests for not master errors pre/post 4.2.
Branch: master
https://github.com/mongodb/specifications/commit/371e95b18f4783d485218bb1565970be10568e8a
See DRIVERS-1187 for updated details.



 Comments   
Comment by Kevin Albertson [ 04/Jun/20 ]

Subsequent PR: https://github.com/mongodb/mongo-c-driver/pull/609

Comment by Githook User [ 25/May/20 ]

Author: Kevin Albertson <kevin.albertson@mongodb.com> (kevinAlbs)

Message: CDRIVER-3678 fix /Topology/request_scan_on_error

CDRIVER-3615 changed an exotic edge case in error parsing.
An error with a NotMaster error code but a "node is recovering" message
is treated as a "node is recovering" error in multi-threaded error
handling, so a requested scan is expected.

A "node is recovering" error should request a scan in multi-threaded
mode, but not in single-threaded mode. The test made the wrong assertion,
but the check of that assertion was very racy.

This fixes the test and increases the wait to make the test less racy.
Branch: r1.17
https://github.com/mongodb/mongo-c-driver/commit/ad6e9ffef947a09b08368bfcd72fdac2e877dcf7

Comment by Githook User [ 25/May/20 ]

Author: Kevin Albertson <kevin.albertson@mongodb.com> (kevinAlbs)

Message: CDRIVER-3615 reduce races in SDAM err handling
Branch: r1.17
https://github.com/mongodb/mongo-c-driver/commit/db50880f5e4a8a41b893a0c1888ef3ae96ae62d5

Comment by Githook User [ 20/May/20 ]

Author: Kevin Albertson <kevin.albertson@mongodb.com> (kevinAlbs)

Message: CDRIVER-3678 fix /Topology/request_scan_on_error

CDRIVER-3615 changed an exotic edge case in error parsing.
An error with a NotMaster error code but a "node is recovering" message
is treated as a "node is recovering" error in multi-threaded error
handling, so a requested scan is expected.

A "node is recovering" error should request a scan in multi-threaded
mode, but not in single-threaded mode. The test made the wrong assertion,
but the check of that assertion was very racy.

This fixes the test and increases the wait to make the test less racy.
Branch: master
https://github.com/mongodb/mongo-c-driver/commit/6dbadb53cf3568f733496ec8be7e740d5beca3bf

Comment by Githook User [ 15/May/20 ]

Author: Kevin Albertson <kevin.albertson@mongodb.com> (kevinAlbs)

Message: CDRIVER-3615 reduce races in SDAM err handling
Branch: master
https://github.com/mongodb/mongo-c-driver/commit/924184ad11e716b3cdc36cdded2e2aac79f1be8f

Comment by Kevin Albertson [ 06/May/20 ]

PR: https://github.com/mongodb/mongo-c-driver/pull/601
