[GODRIVER-2001] Session.WithTransaction method endless loop Created: 07/May/21 Updated: 28/Oct/23 Resolved: 27/May/21 |
|
| Status: | Closed |
| Project: | Go Driver |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | 1.5.3 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | 峻铭 张 | Assignee: | Benji Rewis (Inactive) |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Documentation Changes: | Not Needed | ||||||||||||
| Description |
|
Version: 1.5.2. I use a context with a short timeout and find that the WithTransaction method gets stuck in an endless loop. Here is my code:
The InsertOne method will return the following error:
The error has both the TransientTransactionError and NetworkError labels, so it causes the WithTransaction method to get stuck in an infinite loop.
|
| Comments |
| Comment by Benji Rewis (Inactive) [ 27/May/21 ] | ||||||||||||||||
|
zjm448172381@gmail.com the fix is now merged and should be available with the next patch of the Go driver. | ||||||||||||||||
| Comment by Githook User [ 27/May/21 ] | ||||||||||||||||
|
Author: {'name': 'Benjamin Rewis', 'email': '32186188+benjirewis@users.noreply.github.com', 'username': 'benjirewis'}Message: | ||||||||||||||||
| Comment by Githook User [ 27/May/21 ] | ||||||||||||||||
|
Author: {'name': 'Benjamin Rewis', 'email': '32186188+benjirewis@users.noreply.github.com', 'username': 'benjirewis'}Message: | ||||||||||||||||
| Comment by 峻铭 张 [ 17/May/21 ] | ||||||||||||||||
|
Thank you @Benji Rewis . There are no other problems at the moment. | ||||||||||||||||
| Comment by Benji Rewis (Inactive) [ 14/May/21 ] | ||||||||||||||||
|
Apologies for the delay zjm448172381@gmail.com ! After discussing with the rest of the team, we’ve decided that we can check specifically for a context deadline exceeded or canceled error during the transaction and return early if one is encountered. A fix is in review now: https://github.com/mongodb/mongo-go-driver/pull/668. Thank you again for your report and all the information you provided. I’ve also created a drivers ticket (DRIVERS-1753) to ask about making the WithTransaction timeout configurable drivers-wide. Let me know if you have any other questions or concerns. | ||||||||||||||||
| Comment by 峻铭 张 [ 13/May/21 ] | ||||||||||||||||
|
Hi @Benji Rewis, I think `withTransactionTimeout` should be configurable. I also think there should be a setting that controls the number of retries. In my case, I expect the `WithTransaction` method to return an error as soon as any error occurs in the transaction. I don't want to retry any command, so I want to be able to configure the retry count. | ||||||||||||||||
| Comment by Benji Rewis (Inactive) [ 12/May/21 ] | ||||||||||||||||
|
Thank you zjm448172381@gmail.com, I can now reproduce the series of errors if I Ping the server before running WithTransaction. Here’s what I think is happening. Without Ping, you will get a server selection timeout with InsertOne in WithTransaction. This is not a TransientTransactionError, so the transaction will not retry. With Ping, server selection will occur through Ping and be successful before WithTransaction, so InsertOne will actually try to write to the wire in roundTrip. Because of the short timeout, InsertOne will fail to write to the wire with context deadline exceeded. Per our drivers-wide specifications, we mark this failure to write as a TransientTransactionError.
InsertOne will continue to fail in writing to the wire because of the short timeout, and we will continue to retry since we don’t know if the write actually went through. The loop is not endless, and will stop within 2 minutes (as defined by the non-configurable withTransactionTimeout). We do want to retry writing when encountering network errors during transactions, so this series of errors is probably expected behavior. If you think the 2 minute delay defined by withTransactionTimeout ought to be something that's configurable, I could bring that up drivers-wide! | ||||||||||||||||
| Comment by 峻铭 张 [ 12/May/21 ] | ||||||||||||||||
|
Hi @Benji Rewis, thanks for your reply. I have ruled out storage and the IP whitelist as causes. I found that the cause is the `Ping` call before the `StartSession` call.
If I add the `Ping` call, `WithTransaction` ends up in an endless loop. If I remove it, everything goes back to normal. The following is the error message from the test run:
I will continue to debug this strange problem.
| ||||||||||||||||
| Comment by Benji Rewis (Inactive) [ 11/May/21 ] | ||||||||||||||||
|
Hello again zjm448172381@gmail.com! Thanks again for your patience as we try to reproduce this error. Using Go driver version 1.5.2 and Golang version 1.13, I cannot replicate the infinite loop against a MongoDB 4.2 standalone, replica set, or sharded cluster. In my case, the error from InsertOne has a NetworkError label but not a TransientTransactionError label, so the transaction is not retried. A normal context deadline exceeded error from an expired context should not have the TransientTransactionError label, so I think there’s something else going on. I have a couple of hypotheses. Are you using Atlas? A MongoNetworkError with the TransientTransactionError label can occur when your IP address is not whitelisted (found under “Network Access/IP Access List” in the Atlas UI). Have you run out of storage space in your MongoDB instance? This could cause a “failed to write” error that might have the TransientTransactionError label. In any case, is there any other info included in the error message? | ||||||||||||||||
| Comment by 峻铭 张 [ 11/May/21 ] | ||||||||||||||||
|
Hello @Benji Rewis , here is my version: mongo version: 4.2 | ||||||||||||||||
| Comment by Benji Rewis (Inactive) [ 10/May/21 ] | ||||||||||||||||
|
Hello zjm448172381@gmail.com! We're still working on reproducing this issue. What MongoDB server version were you running against? And, what version of Go are you using? | ||||||||||||||||
| Comment by Benji Rewis (Inactive) [ 07/May/21 ] | ||||||||||||||||
|
Hello zjm448172381@gmail.com! Thank you for your report; we're looking into this issue now. |