[JAVA-3244] Ways to timeout long mongo write operation Created: 22/Mar/19 Updated: 11/Sep/19 Resolved: 02/Apr/19 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Major - P3 |
| Reporter: | Zhexuan Chen | Assignee: | Unassigned |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
Hi,

We are using mongo-java-driver 3.4.2. Our database is a sharded cluster with 3 shards. The replica sets span data centers across the US, so there are 3 primaries and all other members are secondaries.

We are currently experiencing a cross-DC network issue between one data center and a primary in another data center. As a result, some requests have very long latency (up to 16 minutes) but still succeed.

Our workaround is a retry template on the client side: a request with long latency triggers a retry, and the retry succeeds very quickly. The problem is that we cannot kill the original slow request, so MongoDB eventually executes the operation twice.

We investigated the long latency and found heavy packet loss on the network between mongos and mongod, which may be the root cause.

We would like to time out the long-running write operation, but we found no suitable timeout in the MongoDB documentation or the mongo-java-driver documentation.

Work we have done: in MongoClientOptions we set socketTimeout, connectTimeout, maxConnectionIdleTime, and maxConnectionLifeTime. None of them helped; requests can still take very long (the longest took 16 minutes). |
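The duplicate execution described above can be reproduced without a database. The following is a minimal sketch using only the Java standard library; `RetryDuplicateDemo`, `appliedWrites`, and the latch that simulates the stalled network are all hypothetical stand-ins and not part of the MongoDB driver. It only illustrates that a client-side timeout stops the caller from waiting but does not cancel the in-flight attempt.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

final class RetryDuplicateDemo {

    // Hypothetical stand-in for the server: counts how often the write was applied.
    static final AtomicInteger appliedWrites = new AtomicInteger();

    /** Simulates one stalled attempt plus a fast retry; returns the number of applied writes. */
    static int run() {
        appliedWrites.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CountDownLatch networkRecovered = new CountDownLatch(1);
        try {
            // First attempt: blocked (simulated packet loss) until the latch opens.
            Future<?> slow = pool.submit(() -> {
                networkRecovered.await();
                appliedWrites.incrementAndGet();
                return null;
            });
            try {
                slow.get(100, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The client gives up waiting, but the first attempt is still in flight.
            }
            // Retry: completes immediately.
            pool.submit(() -> { appliedWrites.incrementAndGet(); }).get();
            // The network recovers and the first attempt finally lands as well.
            networkRecovered.countDown();
            slow.get();
            return appliedWrites.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("writes applied: " + run()); // prints "writes applied: 2"
    }
}
```

Both attempts are applied, which matches the behavior reported here: the retry succeeds quickly, and the original request completes later on its own.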
| Comments |
| Comment by Jeffrey Yemin [ 02/Apr/19 ] |
|
For read operations, you can use maxTimeMS to control the execution time of the operation on the server. For write operations, the driver can be configured to [retry writes|https://docs.mongodb.com/manual/core/retryable-writes/] in some situations. Alternatively, killOp can be used, but you'll have to handle idempotency in the application. |
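One way to "handle idempotency in the application" is to attach a client-generated operation id to each logical write and apply the write only if that id has not been seen before. The sketch below uses only the Java standard library; `IdempotentWriteSketch`, `applyOnce`, and the in-memory map are hypothetical and stand in for a collection, where a unique index on such an id field could play the same role (with a duplicate-key error treated as "already applied"). It is an illustration under those assumptions, not the driver's API.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class IdempotentWriteSketch {

    // Hypothetical stand-in for a collection with a unique key on the operation id.
    private final Map<String, String> appliedOps = new ConcurrentHashMap<>();

    /** Applies the write only if this operation id is new; returns true if it was applied. */
    boolean applyOnce(String operationId, String payload) {
        return appliedOps.putIfAbsent(operationId, payload) == null;
    }

    /** Number of distinct operations that were actually applied. */
    int appliedCount() {
        return appliedOps.size();
    }

    public static void main(String[] args) {
        IdempotentWriteSketch store = new IdempotentWriteSketch();
        // The id is generated once per logical write and reused for every retry.
        String opId = UUID.randomUUID().toString();
        boolean first = store.applyOnce(opId, "order-42");
        boolean retry = store.applyOnce(opId, "order-42");
        System.out.println(first + " " + retry + " " + store.appliedCount()); // prints "true false 1"
    }
}
```

With this shape, a late-arriving original request and its retry carry the same id, so whichever arrives second becomes a no-op.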
| Comment by Zhexuan Chen [ 01/Apr/19 ] |
|
Hi Jeff.
|
| Comment by Jeffrey Yemin [ 01/Apr/19 ] |
|
Hi zchen12345. What are the actual operations that are taking so long? Are they read operations, write operations, or both? If they are write operations, are the writes idempotent, in which case retrying them in the client is safe? |