[SERVER-12581] mapReduce sharded output replace does not fully split Created: 03/Feb/14  Updated: 23/Sep/19  Resolved: 28/Aug/14

Status: Closed
Project: Core Server
Component/s: MapReduce, Sharding
Affects Version/s: 2.5.5
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Randolph Tan Assignee: Randolph Tan
Resolution: Duplicate Votes: 0
Labels: todo_in_code
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File mrShardedOutput.js    
Issue Links:
Duplicate
duplicates SERVER-13402 bulk insert can result in too large c... Closed
Related
related to SERVER-14324 MapReduce does not respect existing s... Closed
related to SERVER-43527 Complete TODO listed in SERVER-12581 Closed
is related to SERVER-12580 Improve mrShardedOutput.js Closed
Operating System: ALL
Participants:

 Description   

The reason lies here:

https://github.com/mongodb/mongo/blob/r2.5.5/src/mongo/s/commands_public.cpp#L1790-1804

                    // do the splitting round
                    ChunkManagerPtr cm = confOut->getChunkManagerIfExists( finalColLong );
                    for ( map<BSONObj, int>::iterator it = chunkSizes.begin() ; it != chunkSizes.end() ; ++it ) {
                        BSONObj key = it->first;
                        int size = it->second;
                        verify( size < 0x7fffffff );
 
                        // key reported should be the chunk's minimum
                        ChunkPtr c =  cm->findIntersectingChunk(key);
                        if ( !c ) {
                            warning() << "Mongod reported " << size << " bytes inserted for key " << key << " but can't find chunk" << endl;
                        } else {
                            c->splitIfShould( size );
                        }
                    }

Note that mongos calls Chunk::splitIfShould, which performs a split at only a single split point. This means that if the resulting chunk is several times larger than the chunk size, only a small section of it is split off into a new chunk, and the remainder stays oversized. For example, if chunkSize is 1 and the resulting chunk is of size 5, the split operation produces one chunk of size 1 and one of size 4.
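
To make the arithmetic above concrete, here is a minimal toy model, not the server code. The names splitOnce and splitFully are hypothetical and do not exist in the mongo source. Sizes are plain integers in the same units as chunkSize, and the model assumes the single split point falls exactly one chunkSize from the start of the chunk, as in the 1 and 4 example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a split with a single split point, as described
// for Chunk::splitIfShould: at most one piece of maxChunkSize is cut off,
// and whatever remains stays in one chunk, however large.
std::vector<int> splitOnce(int chunkSize, int maxChunkSize) {
    assert(maxChunkSize > 0);
    if (chunkSize <= maxChunkSize) {
        return {chunkSize};  // nothing to split
    }
    return {maxChunkSize, chunkSize - maxChunkSize};
}

// Hypothetical alternative for comparison: keep cutting off pieces until
// no remaining piece exceeds maxChunkSize.
std::vector<int> splitFully(int chunkSize, int maxChunkSize) {
    assert(maxChunkSize > 0);
    std::vector<int> pieces;
    while (chunkSize > maxChunkSize) {
        pieces.push_back(maxChunkSize);
        chunkSize -= maxChunkSize;
    }
    if (chunkSize > 0) {
        pieces.push_back(chunkSize);
    }
    return pieces;
}
```

With chunkSize 1 and a resulting chunk of size 5, splitOnce(5, 1) yields pieces of size 1 and 4, matching the behaviour described above, while splitFully(5, 1) yields five pieces of size 1, which is what a full split of the output would look like.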



 Comments   
Comment by Randolph Tan [ 03/Feb/14 ]

Attached test
