Type: Bug
Resolution: Done
Priority: Major - P3
Fix Version/s: None
Affects Version/s: 2.4.1
Component/s: MapReduce
Labels:
None
Environment:
CentOS release 6.3 (Final) 2.6.32-279.9.1.el6.x86_64
Java driver

Assigned Teams:

Query Optimization
Operating System:
Linux
Steps To Reproduce:

1. All the mongod instances are at 2.4.0-rc1.
2. All the mongos instances are at 2.4.0-rc1 (or at least all the ones we're using to test/demonstrate the MapReduce issue).
3. All the config server instances are at 2.4.0-rc1.
4. The MongoDB Java driver is at 2.10.1.
5. Set the ReadPreference to secondary in every place possible.
6. Execute the MapReduce job.
7. Connect to a primary and a secondary on one of our shards (scatter/gather).
8. Via currentOp(), observe that the job only runs on the primary.

Confidence Status:
None
Work Order:
3

Aha! Reference:
None
Tracking Level:
None
Risk Status:
None
Exec Notes:
None
Goal Name:
None
Goal Link:
None

We've upgraded our cluster to 2.4.0-rc1 and we're still not seeing the MapReduce jobs execute on the secondaries. We were hoping ~~SERVER-7423~~ would be included.

Attached is a sample Java application which can be used to illustrate the issue.

Current state of affairs in 2.5.0:
There are a few issues preventing M/R from running on secondaries:

- We don't pass query options to map reduce, so the actual command can't even tell whether the slaveOk bit is on.
- We currently write the temporary output to the tmp database, which is not allowed on secondaries since it is a write operation.
- We need to route and keep track of which nodes the M/R job runs on, because sharded map reduce is done in 2 stages:
  1. 1st stage: Run mapReduce on every shard.
  2. 2nd stage: Run mapReduce.shardedfinish on every shard. The 2nd stage involves aggregating the results from all other shards and running finalReduce on them.
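To make the two-stage flow above concrete, here is a minimal plain-Java sketch. It does not use the MongoDB driver or any server code. The class `ShardedMapReduceSketch` and the methods `runOnShard` and `finalReduce` are hypothetical names invented for illustration, and the "job" is assumed to be a simple count per key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ShardedMapReduceSketch {

    // Stage 1 (hypothetical stand-in for running mapReduce on one shard):
    // emit (key, 1) for every document and reduce locally by summing.
    static Map<String, Integer> runOnShard(List<String> shardDocs) {
        Map<String, Integer> partial = new TreeMap<>();
        for (String key : shardDocs) {
            partial.merge(key, 1, Integer::sum);
        }
        return partial;
    }

    // Stage 2 (hypothetical stand-in for the final reduce step):
    // aggregate the partial results produced by every shard in stage 1.
    static Map<String, Integer> finalReduce(List<Map<String, Integer>> partials) {
        Map<String, Integer> result = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            for (Map.Entry<String, Integer> e : partial.entrySet()) {
                result.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two in-memory "shards"; each string is treated as a document key.
        List<List<String>> shards = Arrays.asList(
                Arrays.asList("a", "b", "a"),
                Arrays.asList("b", "c"));

        List<Map<String, Integer>> partials = new ArrayList<>();
        for (List<String> shard : shards) {
            partials.add(runOnShard(shard));
        }

        System.out.println(finalReduce(partials)); // prints {a=2, b=2, c=1}
    }
}
```

In this sketch the stage 1 results simply sit in a local list. In a real cluster they would live on whichever node ran stage 1 for each shard, which is presumably why stage 2 needs the routing information described above.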


MapReduceTest.java
3 kB
Mar 05 2013 06:05:11 PM UTC

related to

SERVER-5504 Allow map-reduce jobs to run on replicas, provided the output is into the "local" database

Closed

SERVER-41455 Support running $out or $merge from a secondary with writes to the primary

Closed

Assignee: [DO NOT USE] Backlog - Query Optimization
Reporter: Matt Narrell
Participants: [DO NOT USE] Backlog - Query Optimization, David Storch, Esha Bhargava, Matt Narrell
Votes: 8
Watchers: 16

Created: Mar 05 2013 06:05:11 PM UTC
Updated: Dec 06 2022 05:23:33 AM UTC
Resolved: Feb 04 2022 03:10:03 PM UTC
