[SERVER-75278] Consider sleeping only if there are more to fetch in ReshardingOplogFetcher Created: 24/Mar/23 Updated: 12/Dec/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | Sharding |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Randolph Tan | Assignee: | Backlog - Cluster Scalability |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||
| Assigned Teams: |
Cluster Scalability
|
||||
| Participants: | |||||
| Linked BF Score: | 5 | ||||
| Description |
|
Consider moving this sleep to maybe the following then statement. And perhaps change this line to something like this:
|
| Comments |
| Comment by Max Hirschhorn [ 07/Jun/23 ] | ||||||||
|
Chatted with Randolph over Slack and I now have a better understanding of what this ticket was intended to cover. The issue with the following code is that the sleep still executes even when moreToCome == false, in which case _reschedule() wouldn't send a new aggregate command to the donor shard anyway. Restructuring the code to remove this sleep when moreToCome == false would be very beneficial, because the sleep in that case happens during the critical section of resharding. The recipient shard is unable to transition to kStrictConsistency until the future returned by the ReshardingOplogFetcher is ready.
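As an illustration of the restructuring described above, here is a minimal sketch, not the actual ReshardingOplogFetcher code. The names `FetchResult`, `runFetchLoop`, `fetchOnce`, and `sleepFor` are hypothetical stand-ins introduced only for this example; the point is that the sleep is reached only on the path where another fetch will follow.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical stand-in for the outcome of one fetch iteration
// (not a type from the server code base).
struct FetchResult {
    bool moreToCome;  // false once no further fetch is needed
};

// Runs fetch iterations until moreToCome is false and returns how many
// times it slept. The sleep sits after the moreToCome check, so the final
// iteration returns without any delay.
inline int runFetchLoop(const std::function<FetchResult()>& fetchOnce,
                        const std::function<void(std::chrono::milliseconds)>& sleepFor,
                        std::chrono::milliseconds interval) {
    assert(interval.count() >= 0);
    int sleeps = 0;
    while (true) {
        const FetchResult result = fetchOnce();
        if (!result.moreToCome) {
            // Nothing left to fetch: finish immediately so that whoever is
            // waiting on completion is not held up by a pointless sleep.
            return sleeps;
        }
        // Another fetch will follow, so pausing here is still useful.
        sleepFor(interval);
        ++sleeps;
    }
}
```

With this shape, a run of N fetch iterations sleeps N - 1 times rather than N, and the saved sleep is the one that would otherwise fall inside the critical section.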
| ||||||||
| Comment by Randolph Tan [ 07/Jun/23 ] | ||||||||
|
We use a sentinel oplog entry to indicate that we don't need to fetch more entries. What I'm proposing is to not sleep if we already know that moreToCome is false. | ||||||||
| Comment by Max Hirschhorn [ 07/Jun/23 ] | ||||||||
I'm not sure how the ReshardingOplogFetcher can know whether there will be more data to fetch without sending a new aggregate command after its cursor has been exhausted. Ultimately we would want to use a tailable, awaitData cursor for this purpose and have the waiting happen on the donor shard side. I can imagine having the ReshardingOplogFetcher sleep only if the cursor returned no results at all when a new aggregate command was sent. That way, a new cursor won't be reestablished immediately back-to-back when there haven't been any new writes destined for the recipient shard. |
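The "sleep only if the cursor returned no results" idea from the last comment could be sketched as a small policy function. This is an illustration under stated assumptions rather than the server's implementation: `BatchSummary`, `delayBeforeNextFetch`, and `idleBackoff` are hypothetical names, and the sketch also folds in the earlier proposal of never sleeping once moreToCome is false.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical summary of one aggregate round trip to the donor shard.
struct BatchSummary {
    std::size_t numDocs;  // oplog entries returned by the newly opened cursor
    bool moreToCome;      // false once the sentinel oplog entry has been seen
};

// Decides how long to wait before sending the next aggregate command.
inline std::chrono::milliseconds delayBeforeNextFetch(
    const BatchSummary& batch, std::chrono::milliseconds idleBackoff) {
    assert(idleBackoff.count() >= 0);
    if (!batch.moreToCome) {
        // No further fetch will be issued, so there is nothing to wait for.
        return std::chrono::milliseconds::zero();
    }
    if (batch.numDocs == 0) {
        // The donor had nothing new: back off to avoid reopening cursors
        // back-to-back while no writes are arriving.
        return idleBackoff;
    }
    // The donor returned data, so more may already be waiting: fetch again now.
    return std::chrono::milliseconds::zero();
}
```

Keeping the decision in a pure function like this makes the three cases (finished, idle, busy) easy to unit test separately from the scheduling code.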