[SERVER-3627] sharded map-reduce output should be parallelized and properly distribute chunks Created: 17/Aug/11 Updated: 11/Jul/16 Resolved: 21/Dec/11 |
|
| Status: | Closed |
| Project: | Core Server |
| Component/s: | MapReduce |
| Affects Version/s: | None |
| Fix Version/s: | 2.1.0 |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Chris Westin | Assignee: | Antoine Girbal |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Tested Vista |
||
| Issue Links: |
|
||||||||||||||||||||
| Operating System: | ALL | ||||||||||||||||||||
| Participants: | |||||||||||||||||||||
| Description |
|
See QA-12 for the test case. |
| Comments |
| Comment by Antoine Girbal [ 21/Dec/11 ] |
|
Verified: tests are in bigMapReduce.js and mrShardedOutput.js. |
| Comment by auto [ 21/Dec/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 17/Dec/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 17/Dec/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 17/Dec/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 17/Dec/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 17/Dec/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 17/Dec/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 15/Dec/11 ] |
|
Author: Greg Studer (gregstuder) <greg@10gen.com> Message: buildbot bigMapReduce.js can fail until |
| Comment by auto [ 08/Dec/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 07/Dec/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 01/Dec/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 29/Nov/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 24/Nov/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by auto [ 18/Oct/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: - |
| Comment by Antoine Girbal [ 13/Oct/11 ] |
|
Basically, the problematic code is the part that runs after all M/R jobs have finished on each shard.
1. Send the new records to a temp collection that is aligned with the final collection's sharding.
Cons:
2. Simplify the output modes, with no atomicity guaranteed.
Pros:
Cons:
So far, solution #1 has been implemented, to maintain most functionality. |
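As a rough illustration of solution #1, the sketch below groups reduced records by the chunk of the final collection that would own them, so that each group could be written to a matching temp collection in parallel. The names `chunkIndexFor` and `groupByChunk`, the sorted `splitPoints` array, and the use of `_id` as the shard key are assumptions made for this example; it is not the server's actual code.

```javascript
// Hypothetical sketch: align map-reduce output with the final collection's chunks.
// splitPoints is a sorted array of chunk boundaries. Chunk i covers
// [splitPoints[i-1], splitPoints[i]), with open-ended first and last chunks.
function chunkIndexFor(key, splitPoints) {
  // Binary search for the number of boundaries less than or equal to key.
  let lo = 0;
  let hi = splitPoints.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (splitPoints[mid] <= key) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// Bucket records by owning chunk so each bucket can be sent out independently.
function groupByChunk(records, splitPoints) {
  const groups = new Map();
  for (const rec of records) {
    const idx = chunkIndexFor(rec._id, splitPoints);
    if (!groups.has(idx)) groups.set(idx, []);
    groups.get(idx).push(rec);
  }
  return groups;
}

// Example input (made up): three reduced records and two chunk boundaries.
const grouped = groupByChunk(
  [{ _id: 5, value: 1 }, { _id: 42, value: 7 }, { _id: 17, value: 3 }],
  [10, 20]
);
```

A key equal to a boundary falls into the chunk that starts at that boundary, which matches the lower-inclusive chunk ranges used by sharded collections.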
| Comment by Antoine Girbal [ 13/Oct/11 ] |
|
Ticket was reopened because it would not split properly on certain servers. |
| Comment by auto [ 11/Oct/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: - |
| Comment by Antoine Girbal [ 10/Oct/11 ] |
|
added test |
| Comment by auto [ 10/Oct/11 ] |
|
Author: agirbal <antoine@10gen.com> Message: |
| Comment by Eliot Horowitz (Inactive) [ 22/Aug/11 ] |
|
Should be easy to pre-split. |
| Comment by Antoine Girbal [ 22/Aug/11 ] |
|
Yes. When I put together the methods to do internal insert/update for the sharded system, I had to remove a couple of things because the request and dbmessage objects do not exist. For now the workaround is obviously to presplit, but that is not ideal. |
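The pre-split workaround mentioned above could look roughly like the following sketch, which builds the admin commands needed to split an empty output collection at chosen keys and spread the resulting chunks across shards. `buildPresplitCommands`, the namespace, the shard names, and the split keys are hypothetical; the `split` (with `middle`) and `moveChunk` (with `find` and `to`) command shapes are the ones the admin database accepts through mongos, but they should be checked against the server version in use.

```javascript
// Hypothetical helper: build commands to pre-split a sharded output collection.
// Nothing is sent to a server here; the caller decides how to run each command.
function buildPresplitCommands(ns, splitKeys, shards) {
  const commands = [];
  splitKeys.forEach((key, i) => {
    // Split the chunk containing `key` so that `key` becomes a chunk boundary.
    commands.push({ split: ns, middle: { _id: key } });
    // Move the chunk that now starts at `key` to a shard chosen round-robin.
    commands.push({
      moveChunk: ns,
      find: { _id: key },
      to: shards[(i + 1) % shards.length],
    });
  });
  return commands;
}

// Example values (made up): two split keys and two shards.
const presplit = buildPresplitCommands(
  "test.mrout",
  [100, 200],
  ["shard0000", "shard0001"]
);
```

In the mongo shell, each generated document would be passed to `db.adminCommand(...)` while connected to a mongos, before the map-reduce job writes its output.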