  1. Core Server
  2. SERVER-97844

Improve handling of Chunk Ranges

    • Cluster Scalability
    • Cluster Scalability Priorities

      The chunk map can contain thousands or even millions of chunks, so a call that reads the map can take a very long time. The chunkRanges parameter can be passed by mistake, to be filled in, to functions such as

      void ChunkManager::getShardIdsForRange(const BSONObj& min,
                                             const BSONObj& max,
                                             std::set<ShardId>* shardIds,
                                             std::set<ChunkRange>* chunkRanges,
                                             bool includeMaxBound)

      which can cause serious performance problems.
      The task https://jira.mongodb.org/browse/SERVER-95448 was created after one such incident, and it improves how visible the use of the ChunkRange parameter is in the API.

      However, the creation of chunk ranges could still be improved in the places where they are used:

      • getShardIdsForCanonicalQuery
      • getShardIdsForQuery

      Currently this is mostly the analyzeShardKey case.

      1. Places such as https://github.com/10gen/mongo/blob/4b957288ea685f9238b6e895912d0af598e501ca/src/mongo/s/chunk_manager.cpp#L899 copy chunks from one set to another; this copy can be avoided by using a reference instead.
      2. The set of chunk ranges is passed from one function to another for additional processing; this can be replaced by calculating the chunk ranges in place.
      3. Alternatively, the ChunkManager methods can be extended with additional parameters so that ChunkManager calculates the final chunk ranges result itself and returns a reference to it.
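
      As a rough illustration of options 1 and 2, the sketch below uses simplified stand-in types, not the real MongoDB classes. ToyChunkMap, forEachOverlappingChunk and countOverlappingChunks are hypothetical names, and integer keys replace BSONObj bounds. It assumes chunks are contiguous and keyed by their exclusive max. The idea shown is that callers visit chunks by const reference and compute what they need in place, while the set of chunk ranges is only filled when a caller explicitly asks for it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical, simplified stand-ins; NOT the real MongoDB types.
struct ChunkRange {
    int min;  // inclusive lower bound
    int max;  // exclusive upper bound
    bool operator<(const ChunkRange& o) const {
        return std::tie(min, max) < std::tie(o.min, o.max);
    }
};

struct Chunk {
    ChunkRange range;
    std::string shardId;
};

class ToyChunkMap {
public:
    void add(int min, int max, std::string shardId) {
        assert(min < max);
        _chunks.emplace(max, Chunk{{min, max}, std::move(shardId)});
    }

    // Visits every chunk overlapping [min, max) by const reference,
    // so nothing is copied unless the visitor chooses to copy it.
    template <typename Visitor>
    void forEachOverlappingChunk(int min, int max, Visitor&& visit) const {
        // upper_bound(min) finds the first chunk whose exclusive max is > min.
        for (auto it = _chunks.upper_bound(min);
             it != _chunks.end() && it->second.range.min < max;
             ++it) {
            visit(it->second);
        }
    }

    // Shard ids are always collected; chunk ranges only when the caller
    // passes a non-null pointer, which makes the expensive path opt-in.
    void getShardIdsForRange(int min,
                             int max,
                             std::set<std::string>* shardIds,
                             std::set<ChunkRange>* chunkRanges = nullptr) const {
        assert(shardIds != nullptr);
        forEachOverlappingChunk(min, max, [&](const Chunk& c) {
            shardIds->insert(c.shardId);
            if (chunkRanges != nullptr) {
                chunkRanges->insert(c.range);
            }
        });
    }

private:
    std::map<int, Chunk> _chunks;  // keyed by the chunk's exclusive max
};

// In-place calculation: derive a result directly from the visited chunks
// instead of first materialising a std::set<ChunkRange> and passing it on.
inline std::size_t countOverlappingChunks(const ToyChunkMap& m, int min, int max) {
    std::size_t n = 0;
    m.forEachOverlappingChunk(min, max, [&](const Chunk&) { ++n; });
    return n;
}
```

      With this shape, a caller that only needs shard ids calls getShardIdsForRange(min, max, &ids) and never pays for a range set. A caller such as the analyzeShardKey path could do its processing inside the visitor rather than receiving a copied set.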

       

            Assignee:
            Unassigned Unassigned
            Reporter:
            igor.praznik@mongodb.com Igor Praznik