Core Server / SERVER-6870

sh.status() fails with decode failed. probably invalid utf-8 string [???]

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Fix Version/s: None
    • Affects Version/s: 2.2.0-rc2
    • Component/s: Sharding
    • Labels: None
    • Environment:
      Ubuntu 10.04 64bit, Linux

      I am running a shard with two members, 3 configservers and one mongos. I am currently in a state where, on the mongos, sh.status() always fails. The error message is:

      Mon Aug 27 13:58:35 decode failed. probably invalid utf-8 string [???]
      Mon Aug 27 13:58:35 why: TypeError: malformed UTF-8 character sequence at offset 0
      TypeError: malformed UTF-8 character sequence at offset 0

      The same error is also printed when I use the config database and issue db.databases.find({}); I can see 3 databases listed and then the error is displayed. db.databases.find({}).count() shows that there are 5 records there. When connected to the shard members, there are 4 databases on each (and they match, by name at least).
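      This pattern (count() works, find() fails part-way through the listing) fits a config document whose string bytes are not valid UTF-8: the shell must decode every string it prints, while count() never touches the document contents, and "offset 0" in the error means the very first byte of some string already fails. A hedged sketch of how one might pinpoint the bad bytes if the raw field values can be dumped from the attached config dumps (plain Python, no driver; utf8_error_offset and find_bad_fields are my own helper names, and the sample bytes are invented for illustration):

      ```python
      def utf8_error_offset(raw: bytes):
          """Byte offset of the first invalid UTF-8 sequence in raw,
          or None if the bytes decode cleanly (as the shell requires)."""
          try:
              raw.decode("utf-8")
              return None
          except UnicodeDecodeError as exc:
              return exc.start

      def find_bad_fields(doc):
          """Given a mapping of field name -> raw bytes, report the fields
          that would trip a strict UTF-8 decoder, with the failing offset."""
          return {name: off for name, raw in doc.items()
                  if (off := utf8_error_offset(raw)) is not None}

      # Hypothetical config.databases entry whose _id bytes were corrupted:
      doc = {
          "_id": b"\xffroject",      # invalid lead byte -> fails at offset 0
          "primary": b"shard0000",
      }
      print(find_bad_fields(doc))    # -> {'_id': 0}
      ```

      Running something like this over each record would show which of the 5 documents (and which field) carries the corruption.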

      Now, I got into this state during some heavy modifications to the sharding setup, which might be relevant:

      I ran db.runCommand({removeshard: "shard0000"}) to remove a shard that was running very low on disk space. Unfortunately the free space ran out and the mongod process froze/crashed and had to be restarted (this is actually because of the post-cleanup.XXXX.bson files created; I now bravely delete them by hand). The draining process then hangs, and the log displays:

      Mon Aug 27 10:57:16 [conn4] about to log metadata event: { _id: "mongoimg-2012-08-27T07:57:16-19", server: "mongoimg", clientAddr: "192.168.100.40:36263", time: new Date(1346054236814), what: "moveChunk.from", ns: "project.fs.chunks", details: { min: { files_id: ObjectId('4f8323e4af8cd13414001317') }, max: { files_id: ObjectId('4f8414daae8cd11f7200004a') }, step1 of 6: 0, note: "aborted" } }

      every 6 seconds. To solve this, I set the shard's draining status to false and restarted the configservers (more than once), then enabled draining again by issuing db.runCommand({removeshard: "shard0000"}) and restarted the configservers one by one again (this was repeated several times because one of the configservers was moved to a different IP and the repeating log message wouldn't go away).
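      For the post-cleanup.XXXX.bson files mentioned above (chunk-migration backup files that filled the disk and are being deleted by hand), a small script can at least make the manual cleanup repeatable. A hedged sketch, my own code, not a MongoDB tool; the dbpath argument and the dry-run-first behavior are assumptions:

      ```python
      from pathlib import Path

      def remove_cleanup_backups(dbpath, dry_run=True):
          """List (and, with dry_run=False, delete) post-cleanup.*.bson
          chunk-migration backup files left in a mongod dbpath."""
          matched = []
          for f in sorted(Path(dbpath).glob("post-cleanup.*.bson")):
              matched.append(f.name)
              if not dry_run:
                  f.unlink()
          return matched
      ```

      Running it with dry_run=True first shows what would be removed; only delete once you are sure the files are no longer needed, since they are safety copies of data from moved-off chunks.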

      So, currently the shard seems to be draining normally, but sh.status() always fails.

      There is nothing interesting in mongos.log; the string "UTF8" doesn't appear. The only thing I see that might be related to this is:

      Mon Aug 27 11:45:03 [Balancer] moveChunk result: { who: { _id: "project.fs.chunks", process: "mongoimg:27017:1345997938:164881649", state: 2, ts: ObjectId('503b20b8706f005dbbff04b8'), when: new Date(1346052280689), who: "mongoimg:27017:1345997938:164881649:conn28:1347786322", why: "migrate-{ files_id: ObjectId('4f8323e4af8cd13414001317') }" }, errmsg: "the collection metadata could not be locked with lock migrate-{ files_id: MinKey }", ok: 0.0 }

        1. dump-configvr-02.tar.gz
          299 kB
        2. dump-configvr-01.tar.gz
          296 kB
        3. dump-configsvr.tar.gz
          296 kB

            Assignee:
            scotthernandez Scott Hernandez (Inactive)
            Reporter:
            edmnc Edmunds Kalnins
            Votes:
            0
            Watchers:
            2
