- Type: Improvement
- Resolution: Duplicate
- Priority: Major - P3
- Affects Version/s: 3.2.11
- Component/s: Sharding
The Problem:
For customers whose availability zones are geographically far apart, secondaries can have an unavoidably high-latency connection to the primary even when data rates are well within our needs. Round-trip times as measured by ping can be in the 40-100 ms range.
In this environment, chunk migration is terribly slow: chunk transfer rates in our cluster average a measly 4 KB/sec. The range deletes that remove the documents belonging to a chunk that was just moved are also terribly slow. The rest of MongoDB works well in our cluster; in fact, our secondaries' optime lag is typically less than 1 second. Standing up local secondaries (i.e., close to the primaries) improves chunk migration and range-delete rates by at least two orders of magnitude, but at tremendous cost, effectively doubling the number of secondary servers we require.
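For anyone trying to confirm where the migration time goes, the per-migration step timings that the balancer records in the config database can be inspected from a mongos. A quick sketch, using the collection and field names as documented for 3.2 (the query values are illustrative):

// Run from a mongo shell connected to a mongos.
var cfg = db.getSiblingDB("config");

// Each migration leaves changelog entries (moveChunk.start, moveChunk.commit,
// moveChunk.from/to) whose details include "step N of 6" timings in
// milliseconds, showing how long the throttled clone/catch-up phases take.
cfg.changelog.find({ what: /moveChunk/ }).sort({ time: -1 }).limit(10).pretty();

// The balancer's current throttle settings (if any have been set) live here:
cfg.settings.find({ _id: "balancer" }).pretty();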
A Possible Solution:
From our experience, when the balancer calls moveChunk and/or issues range deletes, it uses secondary throttling by default with a write concern of at least 2. Why couldn't MongoDB support a "high-latency-tolerant" chunk migration mode, in which moveChunk and the range deletes that follow a migration use a write concern of 1 (i.e., secondary throttling disabled) for all but the final write or delete?
The final write or delete for each chunk could still use secondary throttling with a write concern of 2. I believe this would yield a huge improvement in chunk migration performance for high-latency environments. What are the downsides to this solution? I really can't think of any.
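For reference, the existing knobs already let an operator disable the throttle per migration or for the balancer as a whole. A rough sketch of what that looks like in the mongo shell (the namespace, shard name, and shard key value below are placeholders, not from this ticket):

// Per-migration: moveChunk with secondary throttling disabled. Without an
// explicit writeConcern, each transferred batch is acknowledged at w:1,
// which is essentially what this ticket asks to apply to all but the
// final write/delete of a migration.
db.adminCommand({
  moveChunk: "mydb.mycoll",        // placeholder namespace
  find: { shardKey: 42 },          // placeholder shard key value
  to: "shard0001",                 // placeholder destination shard
  _secondaryThrottle: false
});

// Cluster-wide: tell the balancer to skip secondary throttling for the
// migrations it schedules (config.settings, per the 3.2 documentation).
var cfg = db.getSiblingDB("config");
cfg.settings.update(
  { _id: "balancer" },
  { $set: { _secondaryThrottle: false } },
  { upsert: true }
);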
- duplicates SERVER-23340 Turn off moveChunk secondaryThrottle by default (Closed)