Core Server / SERVER-14375

never ending "split failed Cause: the collection's metadata lock is taken"

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 2.6.1
    • Component/s: Sharding
    • Labels: None
    • ALL

      We are presplitting our chunks, which always worked fine until we upgraded to v2.6.1_linux_64bit. Since then we encounter never-ending "split failed Cause: the collection's metadata lock is taken" error messages.
      In the log of the mongod holding the chunk to be split we find:

      2014-06-27T16:42:57.797+0200 [LockPinger] cluster sx210:20020,sx176:20020,sx177:20020 pinged successfully at Fri Jun 27 16:42:57 2014 by distributed lock pinger 'sx210:20020,sx176:20020,sx177:20020/s484:27017:1403879511:112806737', sleeping for 30000ms
      2014-06-27T16:42:58.707+0200 [conn17] received splitChunk request: { splitChunk: "offerStore.offer", keyPattern: { _id: 1.0 }, min: { _id: 2929980021 }, max: { _id: MaxKey }, from: "offerStoreDE5", splitKeys: [ { _id: 2930480021 } ], shardId: "offerStore.offer-_id_2929980021", configdb: "sx210:20020,sx176:20020,sx177:20020" }
      2014-06-27T16:43:01.738+0200 [conn17] received splitChunk request: { splitChunk: "offerStore.offer", keyPattern: { _id: 1.0 }, min: { _id: 2929980021 }, max: { _id: MaxKey }, from: "offerStoreDE5", splitKeys: [ { _id: 2930480021 } ], shardId: "offerStore.offer-_id_2929980021", configdb: "sx210:20020,sx176:20020,sx177:20020" }
      2014-06-27T16:43:04.771+0200 [conn17] received splitChunk request: { splitChunk: "offerStore.offer", keyPattern: { _id: 1.0 }, min: { _id: 2929980021 }, max: { _id: MaxKey }, from: "offerStoreDE5", splitKeys: [ { _id: 2930480021 } ], shardId: "offerStore.offer-_id_2929980021", configdb: "sx210:20020,sx176:20020,sx177:20020" }
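
      For reference, presplitting like ours can be driven from the mongo shell via a mongos; the splitChunk requests in the log above are what the server generates from such calls. A minimal sketch, where the namespace and the starting key are taken from the log above, but the step size and loop bound are illustrative assumptions:

      ```javascript
      // Hypothetical sketch of a presplit loop, run in the mongo shell
      // connected to a mongos. Namespace and starting _id mirror the
      // splitChunk requests in the log; step and count are assumptions.
      var start = 2929980021;
      var step  = 500000;   // distance between split points (assumed)
      var count = 10;       // number of splits to attempt (assumed)

      for (var i = 1; i <= count; i++) {
          var splitPoint = start + i * step;
          // sh.splitAt() splits the chunk containing { _id: splitPoint }
          // exactly at that key; on our cluster this is where the
          // "metadata lock is taken" error surfaces.
          printjson(sh.splitAt("offerStore.offer", { _id: splitPoint }));
      }
      ```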
      

      This relates to SERVER-14047 (https://jira.mongodb.org/browse/SERVER-14047), where we learned that we have to shut down the whole cluster to clean up noTimeout cursors because they may block chunk moves. So we restarted the whole cluster (which is already quite painful!). We left all routers shut down and started only one router on a "private" port, so that only the application doing the presplit was connected to the cluster. Nevertheless, we received the same error messages as above!
      How is it possible that there is still a metadata lock? How can we unblock it? How can we proceed with our presplitting?
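
      As a cheaper check than a full restart, the number of open noTimeout cursors on each shard member can be read from serverStatus in the shell. A small sketch, assuming the 2.6 metrics layout:

      ```javascript
      // Run against each mongod (shard member) to see how many cursors are
      // open without an idle timeout; such cursors were implicated in
      // SERVER-14047 as blockers of chunk migrations.
      var cur = db.serverStatus().metrics.cursor;
      print("open total:     " + cur.open.total);
      print("open noTimeout: " + cur.open.noTimeout);
      print("timed out:      " + cur.timedOut);
      ```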

            Assignee:
            sam.kleinman Sam Kleinman (Inactive)
            Reporter:
            kay.agahd@idealo.de Kay Agahd
            Votes:
            0
            Watchers:
            9

              Created:
              Updated:
              Resolved: