[SERVER-1667] Failed to insert data while sharding gridfs.fs.chunks with key {"files_id": 1, "n": 1} Created: 24/Aug/10  Updated: 30/Mar/12  Resolved: 25/Aug/10

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 1.6.1
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Che-Ching Wu Assignee: Mathias Stearn
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File mongos.log    
Operating System: ALL
Participants:

 Description   

I built a system consisting of 2 replica-set shards (3 servers each), 1 config server, and 1 mongos,
then sharded the database and gridfs.fs.chunks without any error.
However, when I tried to insert a file with my Python script, it always failed.



 Comments   
Comment by Mathias Stearn [ 25/Aug/10 ]

FYI: I looked into it and this message was hidden by the python driver. See http://jira.mongodb.org/browse/PYTHON-158

Comment by Mathias Stearn [ 25/Aug/10 ]

We currently only support sharding on {files_id:1}. The issue is that we don't yet have code to distribute the md5 calculations between shards, so a file must live fully on a single shard. You should get an error that looks something like this:

13092: "GridFS chunks collection can only be sharded on files_id"
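To illustrate why the md5 step ties a file to one shard, here is a minimal sketch using only Python's hashlib. It is not the server's implementation; the helper name file_md5 and the sample chunks are hypothetical. It only shows that MD5 is a streaming hash that has to consume the chunks of a file in order of n.

```python
import hashlib


def file_md5(chunks):
    """Return the hex MD5 of a file given (n, data) chunk pairs.

    MD5 is a streaming hash: chunk n must be fed in before chunk n + 1,
    so all chunks of the file have to be read, in order, by one process.
    """
    digest = hashlib.md5()
    for _, data in sorted(chunks, key=lambda pair: pair[0]):
        digest.update(data)
    return digest.hexdigest()


# Hypothetical chunks of one file, deliberately listed out of order.
chunks = [(1, b"world"), (0, b"hello ")]
whole_file_md5 = file_md5(chunks)
```

If the chunks of one file were spread over several shards, no single shard could run this loop over local data alone, which matches the restriction described above.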

Comment by Che-Ching Wu [ 25/Aug/10 ]

I have already posted the exception raised by the Python insertion script.
I also found that there is no fs.files collection in the DB.

> show dbs
admin
gridfs_bench2
local
> use gridfs_bench2
switched to db gridfs_bench2
> show collections
fs.chunks
system.indexes
> use config
switched to db config
> db.chunks.find()
{ "_id" : "gridfs_bench2.fs.chunks-files_id_MinKeyn_MinKey", "lastmod" : { "t" : 1000, "i" : 0 }, "ns" : "gridfs_bench2.fs.chunks", "min" : { "files_id" : { $minKey : 1 }, "n" : { $minKey : 1 } }, "max" : { "files_id" : { $maxKey : 1 }, "n" : { $maxKey : 1 } }, "shard" : "shard0000" }
> db.databases.find()

{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "gridfs_bench2", "partitioned" : true, "primary" : "shard0000" }

> db.collections.find()
{ "_id" : "gridfs_bench2.fs.chunks", "lastmod" : "Thu Jan 15 1970 20:18:22 GMT+0000 (UTC)", "dropped" : true, "key" : { "files_id" : 1, "n" : 1 }, "unique" : false }

==============

Here is the log from mongod.

Wed Aug 25 07:17:02 [conn14] dropDatabase gridfs_bench2
Wed Aug 25 07:17:02 [conn13] getmore local.oplog.rs cid:381969710586602011 getMore: { ts: { $gte: new Date(5509233406178
Wed Aug 25 07:17:05 allocating new datafile /var/lib/mongo/gridfs_bench2.ns, filling with zeroes...
Wed Aug 25 07:17:05 done allocating datafile /var/lib/mongo/gridfs_bench2.ns, size: 16MB, took 0.034 secs
Wed Aug 25 07:17:05 allocating new datafile /var/lib/mongo/gridfs_bench2.0, filling with zeroes...
Wed Aug 25 07:17:06 done allocating datafile /var/lib/mongo/gridfs_bench2.0, size: 64MB, took 0.112 secs
Wed Aug 25 07:17:06 [conn14] building new index on { _id: 1 } for gridfs_bench2.fs.chunks
Wed Aug 25 07:17:06 [conn14] Buildindex gridfs_bench2.fs.chunks idxNo:0 { name: "id", ns: "gridfs_bench2.fs.chunks", k
Wed Aug 25 07:17:06 [conn14] done for 0 records 0secs
Wed Aug 25 07:17:06 [conn14] info: creating collection gridfs_bench2.fs.chunks on add index
building new index on { files_id: 1, n: 1 } for gridfs_bench2.fs.chunks
Wed Aug 25 07:17:06 [conn14] Buildindex gridfs_bench2.fs.chunks idxNo:1 { ns: "gridfs_bench2.fs.chunks", key: { files_id
Wed Aug 25 07:17:06 [conn14] done for 0 records 0secs
Wed Aug 25 07:17:06 [conn14] insert gridfs_bench2.system.indexes 153ms
Wed Aug 25 07:17:06 [conn13] getmore local.oplog.rs cid:381969710586602011 getMore: { ts: { $gte: new Date(5509233406178
Wed Aug 25 07:17:06 [conn10] getmore local.oplog.rs cid:5384725566605300308 getMore: { ts: { $gte: new Date(550923340617
Wed Aug 25 07:17:06 allocating new datafile /var/lib/mongo/gridfs_bench2.1, filling with zeroes...
Wed Aug 25 07:17:06 done allocating datafile /var/lib/mongo/gridfs_bench2.1, size: 128MB, took 0.212 secs

==============

Finally, if I use only {files_id:1} as the shard key, there is no error.
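For reference, a minimal sketch of building the shardcollection command document with the supported key. The helper name build_shard_chunks_command is hypothetical, and the check only mirrors the restriction stated in this ticket; it is not server code. The document shape follows the command shown in the mongos log, and OrderedDict is used so that the command name stays the first field.

```python
from collections import OrderedDict


def build_shard_chunks_command(db_name, root="fs", key=(("files_id", 1),)):
    """Build (but do not send) a shardcollection command for <root>.chunks."""
    key_doc = OrderedDict(key)
    # Mirror the restriction from this ticket: only {files_id: 1} is accepted.
    if list(key_doc.items()) != [("files_id", 1)]:
        raise ValueError("GridFS chunks collection can only be sharded on files_id")
    return OrderedDict([
        ("shardcollection", "%s.%s.chunks" % (db_name, root)),
        ("key", key_doc),
    ])


command = build_shard_chunks_command("gridfs_bench2")
```

Sending the document is left out on purpose. With pymongo it would typically go through the command method of the admin database on a mongos connection, but that part depends on the driver version in use.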

Comment by Eliot Horowitz (Inactive) [ 25/Aug/10 ]

There are no errors there.

What happens when you try to write something?

Comment by Che-Ching Wu [ 25/Aug/10 ]

I've modified the Python script to use SON, so the key fields are now in the right order. The mongos output follows.
However, I still get the same error; nothing has changed.

Wed Aug 25 02:16:31 connection accepted from 172.16.1.181:29753 #3
Wed Aug 25 02:16:31 [conn3] couldn't find database [gridfs_bench2] in config db
Wed Aug 25 02:16:31 [conn3] put [gridfs_bench2] on: shard0000:shard1/172.16.1.190:27018,172.16.1.191:27018,172.16.1.192:27018
Wed Aug 25 02:16:31 [conn3] DROP DATABASE: gridfs_bench2
Wed Aug 25 02:16:31 [conn3] DBConfig::dropDatabase: gridfs_bench2
Wed Aug 25 02:16:31 [conn3] config change: { _id: "vm-simplestore1.saas-sd.lava.tw-2010-08-25T02:16:31-0", server: "vm-simplestore1.saas-sd.lava.tw", time: new Date(1282702591538), what: "dropDatabase.start", ns: "gridfs_bench2", details: {} }
Wed Aug 25 02:16:31 [conn3] DBConfig::dropDatabase: gridfs_bench2 dropped sharded collections: 0
Wed Aug 25 02:16:31 [conn3] config change: { _id: "vm-simplestore1.saas-sd.lava.tw-2010-08-25T02:16:31-1", server: "vm-simplestore1.saas-sd.lava.tw", time: new Date(1282702591556), what: "dropDatabase", ns: "gridfs_bench2", details: {} }
Wed Aug 25 02:16:34 [conn3] couldn't find database [gridfs_bench2] in config db
Wed Aug 25 02:16:34 [conn3] put [gridfs_bench2] on: shard0000:shard1/172.16.1.190:27018,172.16.1.191:27018,172.16.1.192:27018
Wed Aug 25 02:16:34 [conn3] enabling sharding on: gridfs_bench2
Wed Aug 25 02:16:34 [conn3] CMD: shardcollection: { shardcollection: "gridfs_bench2.fs.chunks", key: { files_id: 1, n: 1 } }
Wed Aug 25 02:16:34 [conn3] enable sharding on: gridfs_bench2.fs.chunks with shard key: { files_id: 1, n: 1 }
Wed Aug 25 02:16:34 [conn3] no chunks for:gridfs_bench2.fs.chunks so creating first: ns:gridfs_bench2.fs.chunks at: shard0000:shard1/172.16.1.190:27018,172.16.1.191:27018,172.16.1.192:27018 lastmod: 1|0 min: { files_id: MinKey, n: MinKey } max: { files_id: MaxKey, n: MaxKey }

Wed Aug 25 02:16:34 [conn3] end connection 172.16.1.181:29753

Comment by Eliot Horowitz (Inactive) [ 25/Aug/10 ]

Something is weird

notice:

Tue Aug 24 21:20:42 [conn1] CMD: shardcollection: { shardcollection: "gridfs_bench.fs.chunks", key: { n: 1, files_id: 1 } }
Tue Aug 24 21:20:42 [conn1] enable sharding on: gridfs_bench.fs.chunks with shard key: { n: 1, files_id: 1 }

Do you have the commands you used to shard?

Comment by Che-Ching Wu [ 25/Aug/10 ]

BTW, here is the exception from python script.

Traceback (most recent call last):
  File "gridfs_bench.py", line 42, in ?
    fs.put(value, filename="%d/%d.txt" % (fn/4000,fn%4000))
  File "build/bdist.linux-x86_64/egg/gridfs/__init__.py", line 109, in put
  File "build/bdist.linux-x86_64/egg/gridfs/grid_file.py", line 218, in close
  File "build/bdist.linux-x86_64/egg/gridfs/grid_file.py", line 200, in __flush
  File "build/bdist.linux-x86_64/egg/pymongo/database.py", line 293, in command
pymongo.errors.OperationFailure: command SON([('filemd5', ObjectId('4c7443a7865b3b5014000000')), ('root', u'fs')]) failed: db assertion failure

Comment by Che-Ching Wu [ 25/Aug/10 ]

> use config
> db.collections.find()
{ "_id" : "gridfs_bench.fs.chunks", "lastmod" : "Thu Jan 15 1970 20:18:07 GMT+0000 (UTC)", "dropped" : false, "key" : { "files_id" : 1, "n" : 1 }, "unique" : false }

{ "_id" : "gridfs.fs.chunks", "lastmod" : "Thu Jan 15 1970 20:17:00 GMT+0000 (UTC)", "dropped" : true }

>
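As a reading aid for the config.collections output, here is a small sketch that summarizes such entries offline. The helper name describe_collection_entry is hypothetical and the two sample documents are copied from the shell output in this comment (with lastmod omitted); nothing is queried from a live config server. It relies on Python 3.7+ dicts preserving insertion order.

```python
def describe_collection_entry(entry):
    """Summarize one config.collections document: namespace, state, key order."""
    key = entry.get("key")
    return {
        "ns": entry["_id"],
        "dropped": entry["dropped"],
        # Field order matters for a compound shard key, so keep it as a list.
        "key_fields": list(key.keys()) if key else [],
    }


entries = [
    {"_id": "gridfs_bench.fs.chunks", "dropped": False,
     "key": {"files_id": 1, "n": 1}, "unique": False},
    {"_id": "gridfs.fs.chunks", "dropped": True},
]
summaries = [describe_collection_entry(e) for e in entries]
active = [s["ns"] for s in summaries if not s["dropped"]]
```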

Comment by Eliot Horowitz (Inactive) [ 25/Aug/10 ]

Looks like you created the shard key backwards.

You made it on { n: 1, files_id: 1 }; it needs to be on { files_id: 1, n: 1 }.

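A short sketch of why the field order matters on the client side. The variable names are illustrative only. The two key specifications contain the same fields, but a compound key is an ordered list of fields, so they are different keys. Python versions of that era did not guarantee dict ordering, which is presumably why the script had to build the key with an ordered type such as SON.

```python
from collections import OrderedDict

# Field order as it appeared in the mongos log, and as intended.
requested = [("n", 1), ("files_id", 1)]
intended = [("files_id", 1), ("n", 1)]

same_fields = sorted(requested) == sorted(intended)  # same set of fields
same_key = requested == intended                     # order is compared too
leading_field = requested[0][0]                      # first field of the key

# An ordered mapping keeps the fields in the order they were given.
ordered_key = OrderedDict(intended)
ordered_fields = list(ordered_key.keys())
```

Comparing the lists directly, rather than as sets of fields, is the design choice that catches a reversed key.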
Comment by Eliot Horowitz (Inactive) [ 24/Aug/10 ]

Can you send the full mongos log?
