[SERVER-7641] Error running MapReduce, which writes result to another db with authorization. Created: 13/Nov/12  Updated: 11/Jul/16  Resolved: 19/Jan/13

Status: Closed
Project: Core Server
Component/s: MapReduce
Affects Version/s: 2.2.1
Fix Version/s: 2.4.0-rc0

Type: Bug
Priority: Major - P3
Reporter: Pavel Chertorogov
Assignee: Daniel Pasette (Inactive)
Resolution: Done
Votes: 0
Labels: Authenticate, MapReduce
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: mrShardedOutputAuth.js, repro_mapreduce.js, repro_setup_auth.js
Backwards Compatibility: Fully Compatible
Participants:

 Description   

I have a problem when trying to run a MapReduce job that writes its result to another database.

Both databases have authorization enabled.

MR out section:
'out' : { 'merge' : $_outputCollection, 'db' : $_anotherDB, 'sharded' : true }

It fails with the error:
MR parallel processing failed:
{ result: "tmp.mrs.views_raw_1348067821_5", errmsg: "exception: splitVector failed: { errmsg: "need to login", ok: 0.0 }", code: 15921, ok: 0.0 }

How should I log in for a MapReduce task?
Maybe 'db' needs to be changed to an array, such as
{'name':'$_anotherDB', 'user':'login', 'password':'pass'}
?



 Comments   
Comment by auto [ 19/Jan/13 ]

Author: Dan Pasette <dan@10gen.com>
Date: 2013-01-13T10:52:44Z

Message: SERVER-7641 - added jstest mrShardedOutputAuth.js
Branch: master
https://github.com/mongodb/mongo/commit/ac08508bd4727043521f8b57da19571525675072

Comment by Daniel Pasette (Inactive) [ 14/Jan/13 ]

This is fixed in 2.3.2.

Attaching a single jstest which doesn't require outside setup.

I took the test a bit further and tested some expected failure cases as well.

Comment by sam.helman@10gen.com [ 15/Nov/12 ]

Attached files to reproduce.

Started setup without auth, ran repro_setup_auth.js to create the appropriate users, then restarted cluster with auth and ran repro_mapreduce.js to reproduce issue. The mapreduce call is a simplified version of the ticket reporter's, but does the same important thing (uses different dbs for input and output). The same error is produced. By experimentation, the "sharded:true" field in the "out" object seems to be the cause of the error.
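
For readers without the attachments, the shape of such a cross-database call can be sketched as follows. This is not the attached repro_mapreduce.js; the helper name buildCrossDbMapReduce, its parameters, and the trivial counting map/reduce functions are assumptions made only for illustration.

```javascript
// Hypothetical helper (not from the attached repro files): builds a
// mapReduce command document whose output is merged into a collection
// in a different database than the input collection.
function buildCrossDbMapReduce(inputColl, outputDb, outputColl, sharded) {
  var out = { merge: outputColl, db: outputDb };
  if (sharded) {
    // This is the field that, by experimentation, triggered the error.
    out.sharded = true;
  }
  return {
    mapreduce: inputColl,
    // Trivial map: emit a count of 1 per document, keyed by _id.
    // emit() is provided by the server when the job runs.
    map: function () { emit(this._id, 1); },
    // Trivial reduce: sum the counts for a key.
    reduce: function (key, values) {
      var total = 0;
      for (var i = 0; i < values.length; i++) {
        total += values[i];
      }
      return total;
    },
    out: out
  };
}

// In the mongo shell, the document would be passed to runCommand on the
// input database, for example:
//   db.getSiblingDB("inputDb").runCommand(
//       buildCrossDbMapReduce("src", "outputDb", "dst", true));
```

Toggling the last argument between true and false gives the two variants to compare when checking whether "sharded:true" is what triggers the failure.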

Setup: mongos running locally on 27107, 3 config servers and 2 standalone mongods as shards all running on another machine.

Comment by Pavel Chertorogov [ 14/Nov/12 ]

The same problem occurs in the shell.

MongoDB shell version: 2.2.1
connecting to: test

 
> use zero_raw
switched to db zero_raw
> db.auth('zero.kz','pass')
1
> use zero_hit
switched to db zero_hit
> db.auth('zero.kz','pass')
1
> use zero_raw
> db.runCommand(
{
	"mapreduce":"data_15658",
	"map":
		function() {
			var hs = this.hs; 
			var cmList = [hs];  
			this.cm.forEach(function(d){
				cmList[d] = hs;
			});                                
			emit({sid:this.sid, t:1352863869}, {cmL: cmList});
		},
	"reduce":
		function(k, values) {
			var result = {cmL:[]};
			values.forEach(function(value) {
				value.cmL.forEach(function(hits, cm){
					if(cm in result.cmL) {
						result.cmL[cm] += hits;
					} else {
						result.cmL[cm] = hits;
					}
				});
			});
			return result;
		},
	"out":{
		"merge":"delta_day_15658",
		"db":"zero_hit",
		"sharded":true,
		"nonAtomic":true
	},
	"verbose":true,
	"query":{"sh":3, "t":{"$gte":1352863869,"$lte":1352864168}}
}
);
{
        "ok" : 0,
        "errmsg" : "MR parallel processing failed: { 
			result: "tmp.mrs.data_15658_1352864509_0", 
			errmsg: "exception: splitVector failed: { errmsg: "need to login", ok: 0.0 }", 
			code: 15921, 
			ok: 0.0 
		}"
}
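
A caller that wants to tell this failure apart from other MapReduce errors could check the result document, as in the sketch below. It relies only on the "ok" and "errmsg" fields and the "need to login" text visible in the output above; the function name isShardAuthFailure is hypothetical, and matching on message text is an assumption rather than a documented interface.

```javascript
// Hypothetical helper: returns true when a command result looks like the
// failure in this ticket, i.e. ok is 0 and the (nested) error message
// contains "need to login".
function isShardAuthFailure(res) {
  return !!res &&
         res.ok === 0 &&
         typeof res.errmsg === "string" &&
         res.errmsg.indexOf("need to login") !== -1;
}
```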

Comment by Pavel Chertorogov [ 13/Nov/12 ]

Hello.

In one Mongo cluster (2 shards, each a replica set, all v2.2.1), the source collection's data is evenly distributed across the two shards. MapReduce writes its results with the merge option to an existing collection in another sharded database. The two databases have the same username and password, but these are different users (as Mongo is designed).

I use PHP 5.3.6 with MongoDriver v1.2.1 to run these MR tasks. I tried many variations of logging in to these two databases via $MongoDB->authenticate() and $MongoDB->command().
In PHP this only works when I log in to the admin db; after that all databases are available and MR works without errors. But for security reasons this is unsafe: somebody could easily break the sharded cluster.
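
A shell-side sketch of the same admin-login workaround is shown below, under the assumption that "conn" is a shell connection object exposing getDB() and that auth() returns a truthy value on success (as in the shell session in this ticket). The function name runWithAdminAuth is hypothetical, and the security concern above applies equally here.

```javascript
// Hypothetical helper: authenticate against the admin database first,
// then run the command on the target database over the same connection.
function runWithAdminAuth(conn, user, password, dbName, command) {
  var adminDb = conn.getDB("admin");
  if (!adminDb.auth(user, password)) {
    throw new Error("authentication against admin failed");
  }
  return conn.getDB(dbName).runCommand(command);
}

// In the mongo shell this might be called as:
//   runWithAdminAuth(db.getMongo(), "adminUser", "adminPass",
//                    "zero_raw", mapReduceCommand);
```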

Tomorrow I will run this MR in the shell, logging in to both databases via db.auth(), and write an additional comment.

Thanks.

Comment by sam.helman@10gen.com [ 13/Nov/12 ]

Hello,

Have you tried logging in to both databases via db.auth() before you run the MapReduce command? Additionally, which driver are you using? Are you running these commands from the shell?
