[SERVER-29196] collection cloner only sets batchSize on initial find, not getMores Created: 15/May/17 Updated: 30/Oct/23 Resolved: 07/Jun/17 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | Replication |
| Affects Version/s: | None |
| Fix Version/s: | 3.5.9 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Eric Milkie | Assignee: | Jason Chan |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | neweng |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original Estimate: | Not Specified |
| Issue Links: |
|
| Backwards Compatibility: | Fully Compatible |
| Operating System: | ALL |
| Sprint: | Repl 2017-06-19 |
| Participants: | |
| Description |
|
If you don't specify a batchSize for a getMore, the server reverts to its default batch size; the getMore does not inherit the batchSize from the initial find. The collection cloner only sets batchSize on the initial find, so all of its getMores use the default. |
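To make the described behavior concrete, here is a minimal sketch using pymongo's raw command interface (the connection string, collection name, and batch size are hypothetical, not taken from the server code): the find command is given an explicit batchSize, but each getMore must repeat it, otherwise the server falls back to its default batch size.

```python
# Minimal sketch (hypothetical names/values) of the wire-protocol behavior:
# batchSize on the initial find is NOT inherited by later getMores.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["test"]
BATCH_SIZE = 16  # deliberately small, as in the reproduction attempt below

# Initial find: honors the explicit batchSize.
reply = db.command({"find": "coll", "batchSize": BATCH_SIZE})
cursor_id = reply["cursor"]["id"]
docs = list(reply["cursor"]["firstBatch"])

while cursor_id:
    # Each getMore must carry its own batchSize; if it is omitted here,
    # the server uses its default batch size rather than the value
    # that was sent with the find.
    reply = db.command({"getMore": cursor_id, "collection": "coll",
                        "batchSize": BATCH_SIZE})
    cursor_id = reply["cursor"]["id"]
    docs.extend(reply["cursor"]["nextBatch"])
```

In driver terms, this is roughly the difference between setting the batch size on the cursor (which drivers re-send on every getMore) and setting it only on the first command, which is effectively what the collection cloner was doing.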
| Comments |
| Comment by Githook User [ 07/Jun/17 ] |
|
Author: Jason Chan <jason.chan@mongodb.com> Message: |
| Comment by Githook User [ 06/Jun/17 ] |
|
Author: Jason Chan <jasonchan@Jasons-MacBook-Pro.local> Message: |
| Comment by Eric Milkie [ 16/May/17 ] |
|
I'm hoping that our initial sync perf tests will show some change after this is fixed. |
| Comment by Daniel Pasette (Inactive) [ 15/May/17 ] |
|
Not sure if this will show up in practice, but larger batch sizes have the potential to make intra-cluster compression (on by default in 3.6) more efficient as well. |
| Comment by Eric Milkie [ 15/May/17 ] |
|
Yes, that hardcoded value was added deliberately as part of the initial sync project; we just overlooked applying it to both finds and getMores. I came across this when I tried to make the batchSize very small to reproduce a bug and was surprised to discover that it didn't actually take effect for getMores. |
| Comment by Spencer Brody (Inactive) [ 15/May/17 ] |
|
So we don't actually expose the batch size at all; we just hard-code it here. The question then becomes whether we believe this hard-coded value is meaningful and likely to be better than the default for the majority of workloads. I think the answer is probably yes: since only throughput matters for initial sync, not latency, bigger batch sizes are likely to perform better. |
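As a rough illustration of that throughput argument (purely hypothetical numbers, not measurements from the cloner): every additional getMore batch costs another round trip to the sync source, so on a high-latency link the per-batch round-trip time alone can dominate the clone when batches are small.

```python
# Back-of-the-envelope sketch with made-up numbers: round trips needed to
# drain a cursor over a collection, for different batch sizes.
def getmore_count(doc_count, batch_size):
    """One initial find plus one getMore per additional batch."""
    batches = -(-doc_count // batch_size)  # ceiling division
    return max(batches - 1, 0)

DOCS = 10_000_000
RTT_MS = 1.0  # assumed round-trip cost per getMore, in milliseconds

for batch in (100, 10_000, 100_000):
    trips = getmore_count(DOCS, batch)
    print(f"batchSize={batch:>7}: {trips:>6} getMores, "
          f"~{trips * RTT_MS / 1000:.1f} s spent on round trips alone")
```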
| Comment by Eric Milkie [ 15/May/17 ] |
|
The impact is that getMore efficiency might be greatly improved with a larger batch size. This would translate to less load on the sync source as well as better throughput over high-latency, high-bandwidth connections. |
| Comment by Spencer Brody (Inactive) [ 15/May/17 ] |
|
milkie, what's the impact of this? The collection cloner is only for initial sync; do we even allow users to specify non-default batch sizes for initial sync? |