[SERVER-12638] Initial sharding with hashed shard key can result in duplicate split points Created: 06/Feb/14  Updated: 11/Jul/16  Resolved: 03/Apr/14

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: None
Fix Version/s: 2.6.1, 2.7.0

Type: Bug Priority: Major - P3
Reporter: Mikhail Kochegarov [X] Assignee: Randolph Tan
Resolution: Done Votes: 2
Labels: sharding
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment:

Centos 6.3

  • 1 app server running mongos
  • 3 separate servers running config servers (mongoc0, mongoc1, mongoc2)
  • 2 replica sets, each a group of 3 servers running mongod (rs: mongodb0-0, mongodb0-1, mongodb0-2; rs1: mongodb1-0, mongodb1-1, mongodb1-2)

Attachments: File 20140321_1540.rar     File SERVER-12638.js     Text File mongos1-failedShardConfig.log     File repro.js     File repro24.out.gz     Text File shard-problem.txt    
Issue Links:
Related
Operating System: Linux
Backport Completed:
Participants:

 Description   
Issue Status as of April 15, 2014

ISSUE SUMMARY
In certain cases, the initial distribution of chunks for a hashed sharded collection across multiple shards can cause mongos to split at the same split point more than once, resulting in corrupted collection metadata on the shard (not visible in the config server). If chunks in this collection are later migrated, the corrupted chunk data can propagate to the config servers.

These empty chunks can be seen via the getShardVersion command with the {fullMetadata : true} option executed directly against the affected single mongod or replica set primary of the shard.
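
For illustration, a minimal shell sketch of that check, run against the shard itself rather than mongos (the namespace below is the one from this ticket and is a placeholder):

// Connect directly to the shard's mongod or replica set primary, then:
db.getSiblingDB("admin").runCommand({
    getShardVersion: "database.stats_archive_monthly",
    fullMetadata: true
})
// Degenerate chunks appear in the returned metadata as entries whose min equals their max.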

USER IMPACT
This bug can corrupt the config metadata and, in turn, cause existing documents not to be returned correctly.

WORKAROUNDS
If the corrupt metadata has not yet propagated to the config servers, the workaround is to step down or restart all shard primaries after sharding the collection on a hashed shard key. This forces each shard to reload the correct metadata from the config servers.
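
A minimal shell sketch of this workaround, assuming a replica set shard (the host name is a placeholder):

// Step down the primary of each shard; the new primary reloads chunk metadata from the
// config servers. For a standalone shard mongod, restart the process instead.
var conn = new Mongo("mongodb0-0:27017");
conn.getDB("admin").runCommand({ replSetStepDown: 60 });
// Note: the connection may be reset when the primary steps down; this is expected.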

RESOLUTION
The fix prevents splitting on existing chunk boundaries, which avoids creating the degenerate chunks.

AFFECTED VERSIONS
All recent production releases up to 2.6.0 are affected.

PATCHES
The patch is included in the 2.6.1 production release.

Original description

In certain cases, the initial distribution of chunks for a hashed sharded collection to multiple shards can create duplicate split points, resulting in invisible, empty chunks with the same "min" and "max" value in the collection metadata. These should not interfere with normal operation, but if chunks in this collection are later migrated, the metadata may become inconsistent and has to be fixed manually.

These empty chunks can be seen via the getShardVersion command with the "fullMetadata : true" option executed directly against the affected single mongod or the replica set primary of the shard. The workaround is to step down or restart the single mongod or primary, which will correctly reload the metadata from the config server.

Original Description:

After an unexpected reboot of the application server, I found that mongos started showing errors when I ran show collections.

 mongos> show collections;
  Mon Feb  3 22:50:21.680 error: {
    "$err" : "error loading initial database config information :: caused by :: Couldn't load a valid config for database.stats_archive_monthly after 3 attempts. Please try again.",
    "code" : 13282
  } at src/mongo/shell/query.js:128

However, all mongod servers and config servers were healthy and had no issues in their logs.

First of all, I tried rebooting each server in the cluster, with no success; the error still occurred.

Then, after a quick look at the MongoDB source, I found that this error can be caused by overlapping shard key ranges.

Looking at the shard information for the broken collection, I noticed this:

database.stats_archive_monthly
        shard key: { "a" : "hashed" }
        chunks:
            rs1 6
            rs0 6
        { "a" : { "$minKey" : 1 } } -->> { "a" : NumberLong("-7686143364045646500") } on : rs1 Timestamp(2, 0)
        { "a" : NumberLong("-7686143364045646500") } -->> { "a" : NumberLong("-6148914691236517200") } on : rs1 Timestamp(3, 0)
        { "a" : NumberLong("-6148914691236517200") } -->> { "a" : NumberLong("-4611686018427387900") } on : rs1 Timestamp(4, 0)
        { "a" : NumberLong("-4611686018427387900") } -->> { "a" : NumberLong("-3074457345618258600") } on : rs1 Timestamp(5, 0)
        { "a" : NumberLong("-3074457345618258600") } -->> { "a" : NumberLong("-1537228672809129300") } on : rs1 Timestamp(6, 0)
        { "a" : NumberLong("-1537228672809129300") } -->> { "a" : NumberLong(0) } on : rs1 Timestamp(7, 0)
        { "a" : NumberLong(0) } -->> { "a" : NumberLong("7686143364045646500") } on : rs0 Timestamp(7, 1)
        { "a" : NumberLong("1537228672809129300") } -->> { "a" : NumberLong("3074457345618258600") } on : rs0 Timestamp(1, 9)
        { "a" : NumberLong("3074457345618258600") } -->> { "a" : NumberLong("4611686018427387900") } on : rs0 Timestamp(1, 10)
        { "a" : NumberLong("4611686018427387900") } -->> { "a" : NumberLong("6148914691236517200") } on : rs0 Timestamp(1, 11)
        { "a" : NumberLong("6148914691236517200") } -->> { "a" : NumberLong("7686143364045646500") } on : rs0 Timestamp(1, 12)
        { "a" : NumberLong("7686143364045646500") } -->> { "a" : { "$maxKey" : 1 } } on : rs0 Timestamp(1, 13)

There is a range

{ "a" : NumberLong(0) } -->> { "a" : NumberLong("7686143364045646500") } on : rs0 Timestamp(7, 1)

that overlaps the other chunk ranges on the first replica set (rs0).

Some additional statistics: the first replica set (rs0) contains 73 documents, and the second (rs1) contains 0.

rs0:PRIMARY> db.stats_archive_monthly.count();
73
 
rs1:PRIMARY> db.stats_archive_monthly.count();
0

The only query that works with this collection is:

 $mongo_db['stats_archive_monthly'].update( {a: account_id, l_id: location_id, t: time.truncate(interval())}, {'$set' => {u: data.to_i}}, upsert: true)

All data on the DB servers is correct. Since this is a staging environment, all documents have

{"a" : 1}

so they should all end up on only one shard.

Somehow the database is now completely unusable unless it is fully restored.



 Comments   
Comment by Githook User [ 09/Apr/14 ]

Author:

{u'username': u'renctan', u'name': u'Randolph Tan', u'email': u'randolph@10gen.com'}

Message: SERVER-12638 Sharding chunks ranges overlap

(cherry picked from commit bc26f73ef697c844cde8e4561bbf9a9c2f4728be)
Branch: v2.6
https://github.com/mongodb/mongo/commit/337729f2d0b943e0557a1c709a3bc155bf374d51

Comment by Githook User [ 03/Apr/14 ]

Author:

{u'username': u'renctan', u'name': u'Randolph Tan', u'email': u'randolph@10gen.com'}

Message: SERVER-12638 Sharding chunks ranges overlap
Branch: master
https://github.com/mongodb/mongo/commit/bc26f73ef697c844cde8e4561bbf9a9c2f4728be

Comment by Randolph Tan [ 03/Apr/14 ]

The issue comes from the way mongos performs the initial splits for an empty hashed collection. This is currently done in 3 stages: (1) split the hash space into N chunks, where N is the number of shards; (2) move the chunks to each shard; (3) do a finer split on the chunks. The issue comes about when a split point in step 3 also exists as a boundary from step 1. This case is not properly handled in the CollectionMetadata and makes the chunk have a range of min -> min. This is completely internal to the shard, but it can 'taint' the config servers when the chunk gets moved, because the information used in the applyOps for the moveChunk command comes from this bad chunk entry.
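
A tiny JavaScript sketch of that failure mode (illustrative only; splitChunk below is a hypothetical helper, not server code):

// Step 1 creates coarse chunk boundaries (one chunk per shard); step 3 splits each
// chunk further. If a step-3 split point equals an existing boundary, a naive split
// produces a zero-width chunk whose min equals its max.
function splitChunk(chunk, splitPoint) {
    return [
        { min: chunk.min, max: splitPoint },   // degenerate when splitPoint equals chunk.min
        { min: splitPoint, max: chunk.max }
    ];
}

var chunk = { min: NumberLong(0), max: NumberLong("4611686018427387902") };  // boundary at 0 from step 1
printjson(splitChunk(chunk, NumberLong(0)));  // step-3 split point collides with the existing boundary
// first element is the bad chunk: { min: 0, max: 0 }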

Comment by Asya Kamsky [ 24/Mar/14 ]

With hashed sharding the initial pre-split is expected to move the empty chunks evenly around the cluster - this does not engage the "regular" balancer.

Comment by Björn Bullerdieck [X] [ 24/Mar/14 ]

chunks:
    shard_002       2
    shard_001       2
    shard_003       2
    shard_004       2

Why doesn't it look like this after the initialization? That is what I had hoped for (for new collections).

Comment by Ankur Chauhan [ 24/Mar/14 ]

That is a very good point. Why did some of the chunks move even with the balancer disabled?

Comment by Björn Bullerdieck [X] [ 24/Mar/14 ]

Omitting the numInitialChunks argument looks like an easy workaround. With this "fix", I cannot reproduce it.

Comment by Björn Bullerdieck [X] [ 24/Mar/14 ]

But why are some chunks moved and some not, with the balancer deactivated?

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "version" : 3,
        "minCompatibleVersion" : 3,
        "currentVersion" : 4,
        "clusterId" : ObjectId("53300f28512956601e55ce7f")
}
  shards:
        {  "_id" : "shard_001",  "host" : "127.0.0.1:20117" }
        {  "_id" : "shard_002",  "host" : "127.0.0.1:20217" }
        {  "_id" : "shard_003",  "host" : "127.0.0.1:20317" }
        {  "_id" : "shard_004",  "host" : "127.0.0.1:20417" }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
...
        {  "_id" : "ci_400000000000008",  "partitioned" : true,  "primary" : "shard_001" }
                ci_400000000000008.some_collection
                        shard key: { "_id" : "hashed" }
                        chunks:
                                shard_001       8
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6917529027641081850") } on : shard_001 Timestamp(1, 4)
                        { "_id" : NumberLong("-6917529027641081850") } -->> { "_id" : NumberLong("-4611686018427387900") } on : shard_001 Timestamp(1, 5)
                        { "_id" : NumberLong("-4611686018427387900") } -->> { "_id" : NumberLong("-2305843009213693950") } on : shard_001 Timestamp(1, 6)
                        { "_id" : NumberLong("-2305843009213693950") } -->> { "_id" : NumberLong(0) } on : shard_001 Timestamp(1, 7)
                        { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("2305843009213693950") } on : shard_001 Timestamp(1, 8)
                        { "_id" : NumberLong("2305843009213693950") } -->> { "_id" : NumberLong("4611686018427387900") } on : shard_001 Timestamp(1, 9)
                        { "_id" : NumberLong("4611686018427387900") } -->> { "_id" : NumberLong("6917529027641081850") } on : shard_001 Timestamp(1, 10)
                        { "_id" : NumberLong("6917529027641081850") } -->> { "_id" : { "$maxKey" : 1 } } on : shard_001 Timestamp(1, 11)
        {  "_id" : "ci_400000000000001",  "partitioned" : true,  "primary" : "shard_001" }
                ci_400000000000001.some_collection
                        shard key: { "_id" : "hashed" }
                        chunks:
                                shard_002       1
                                shard_001       3
                                shard_003       2
                                shard_004       2
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6917529027641081850") } on : shard_002 Timestamp(4, 0)
                        { "_id" : NumberLong("-6917529027641081850") } -->> { "_id" : NumberLong("-4611686018427387900") } on : shard_001 Timestamp(4, 1)
                        { "_id" : NumberLong("-4611686018427387900") } -->> { "_id" : NumberLong("-2305843009213693950") } on : shard_001 Timestamp(3, 4)
                        { "_id" : NumberLong("-2305843009213693950") } -->> { "_id" : NumberLong(0) } on : shard_001 Timestamp(3, 5)
                        { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("2305843009213693950") } on : shard_003 Timestamp(3, 6)
                        { "_id" : NumberLong("2305843009213693950") } -->> { "_id" : NumberLong("4611686018427387900") } on : shard_003 Timestamp(3, 7)
                        { "_id" : NumberLong("4611686018427387900") } -->> { "_id" : NumberLong("6917529027641081850") } on : shard_004 Timestamp(3, 8)
                        { "_id" : NumberLong("6917529027641081850") } -->> { "_id" : { "$maxKey" : 1 } } on : shard_004 Timestamp(3, 9)
        {  "_id" : "ci_400000000000006",  "partitioned" : true,  "primary" : "shard_001" }
                ci_400000000000006.some_collection
                        shard key: { "_id" : "hashed" }
                        chunks:
                                shard_003       1
                                shard_001       5
                                shard_002       2
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6917529027641081850") } on : shard_003 Timestamp(3, 0)
                        { "_id" : NumberLong("-6917529027641081850") } -->> { "_id" : NumberLong("-4611686018427387900") } on : shard_001 Timestamp(3, 1)
                        { "_id" : NumberLong("-4611686018427387900") } -->> { "_id" : NumberLong("-2305843009213693950") } on : shard_002 Timestamp(2, 4)
                        { "_id" : NumberLong("-2305843009213693950") } -->> { "_id" : NumberLong(0) } on : shard_002 Timestamp(2, 5)
                        { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("2305843009213693950") } on : shard_001 Timestamp(2, 6)
                        { "_id" : NumberLong("2305843009213693950") } -->> { "_id" : NumberLong("4611686018427387900") } on : shard_001 Timestamp(2, 7)
                        { "_id" : NumberLong("4611686018427387900") } -->> { "_id" : NumberLong("6917529027641081850") } on : shard_001 Timestamp(2, 8)
                        { "_id" : NumberLong("6917529027641081850") } -->> { "_id" : { "$maxKey" : 1 } } on : shard_001 Timestamp(2, 9)
        {  "_id" : "ci_400000000000005",  "partitioned" : true,  "primary" : "shard_001" }
                ci_400000000000005.some_collection
                        shard key: { "_id" : "hashed" }
                        chunks:
                                shard_002       1
                                shard_001       7
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6917529027641081850") } on : shard_002 Timestamp(2, 0)
                        { "_id" : NumberLong("-6917529027641081850") } -->> { "_id" : NumberLong("-4611686018427387900") } on : shard_001 Timestamp(2, 1)
                        { "_id" : NumberLong("-4611686018427387900") } -->> { "_id" : NumberLong("-2305843009213693950") } on : shard_001 Timestamp(1, 6)
                        { "_id" : NumberLong("-2305843009213693950") } -->> { "_id" : NumberLong(0) } on : shard_001 Timestamp(1, 7)
                        { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("2305843009213693950") } on : shard_001 Timestamp(1, 8)
                        { "_id" : NumberLong("2305843009213693950") } -->> { "_id" : NumberLong("4611686018427387900") } on : shard_001 Timestamp(1, 9)
                        { "_id" : NumberLong("4611686018427387900") } -->> { "_id" : NumberLong("6917529027641081850") } on : shard_001 Timestamp(1, 10)
                        { "_id" : NumberLong("6917529027641081850") } -->> { "_id" : { "$maxKey" : 1 } } on : shard_001 Timestamp(1, 11)

Comment by Björn Bullerdieck [X] [ 24/Mar/14 ]

@Shaun Verch
1. Deactivate balancer
2. 10 threads; each creates 1 collection, enables sharding, and inserts 100,000 entries into the created collection:

db.adminCommand({ enableSharding : dbName });
db.adminCommand({ shardcollection : dbName + "." + collName, key : { _id : "hashed" }, numInitialChunks : 15 });
fillCollection(100000);  // user's helper that inserts 100,000 documents

3. Activate balancer
4. Check config data (config data has overlapping ranges)

But it looks like the 100,000 entries and deactivating the balancer are not important for this bug.
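
For reference, a single-threaded mongo shell sketch of the same sequence (database names below are placeholders; the attached repro.js / SERVER-12638.js scripts may differ):

// Disable the balancer, shard several collections with numInitialChunks, then
// re-enable the balancer and look for overlapping ranges in config.chunks.
sh.stopBalancer();
for (var i = 0; i < 10; i++) {
    var dbName = "ci_test_" + i;
    db.adminCommand({ enableSharding: dbName });
    db.adminCommand({ shardCollection: dbName + ".some_collection",
                      key: { _id: "hashed" }, numInitialChunks: 15 });
    // inserting documents is optional; per the note above it does not seem to matter
}
sh.startBalancer();
// afterwards, inspect sh.status() or config.chunks for overlapping chunk ranges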

Comment by Asya Kamsky [ 23/Mar/14 ]

I probably should not have called it a race condition - a specific sequence of steps will reproduce this. Part of that sequence is a failure to migrate one of the initially split chunks - this is much more likely to happen when multiple dbs/collections are being sharded at approximately the same time.

The fix for this issue corrected a mistake in the code that manifested differently depending on the sequence of events (and whether there were any failures which caused chunks to land on particular shards).

So it was only a race condition in the sense that you need multiple dbs/collections being sharded at approximately the same time. The more of them there are, the more likely it is that one will fail to complete the "normal" initial steps and create the sequence of events necessary to trigger this bug.

There are several ways you can work around this bug until 2.4.10 comes out - the simplest would be to set up your scripts to shard dbs and enable sharding on "hashed" keys on collections one at a time, checking the previous one's success. The initial steps may take a little longer, but since this is splitting and balancing empty chunks, the whole thing is a matter of seconds.

Another way of avoiding the situation that can trigger this is, I think, omitting the numInitialChunks argument - I haven't been able to trigger this bug when the default number of chunks (numShards*2) is used. Possibly because that uses a regular split, while a larger number of chunks uses a "multi-split" - I'm going to leave the rest of the debugging to the kernel engineers who work on sharding.

It's possible you may need to do both of the above to guarantee the bug can't be triggered.
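
For reference, a minimal shell sketch of the one-at-a-time workaround described above (the namespaces and shard key are placeholders):

// Shard one collection at a time, check that each step succeeded before moving on,
// and omit numInitialChunks so the default pre-split is used.
["db1.coll1", "db2.coll2"].forEach(function (ns) {
    var dbName = ns.split(".")[0];
    assert.commandWorked(db.adminCommand({ enableSharding: dbName }));
    assert.commandWorked(db.adminCommand({ shardCollection: ns, key: { _id: "hashed" } }));
});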

Comment by Ankur Chauhan [ 23/Mar/14 ]

It seems like you have a root cause for the broken shard ranges but the bug mentioned above does not address any race condition. Could you please elaborate on the specifics of the race condition and if possible the exact bug/commits that address it.
Race conditions are difficult to understand without context so this would really help in understanding what series of events caused the problem.

Comment by Asya Kamsky [ 23/Mar/14 ]

It appears the shard ranges were broken as a side effect of SERVER-12515 (though it seems to involve a race condition, so it didn't always manifest unless the circumstances created the necessary sequence of conditions).

The good news is that I cannot reproduce this on either the 2.4 master branch (the intended 2.4.10 release) or the 2.6.0 release candidate.

Comment by Asya Kamsky [ 23/Mar/14 ]

The trouble starts on line 4173 of repro24.out
This run has 10 collections; 9 of them get the wrong chunk range, and the one that doesn't is collection5.

Comment by Asya Kamsky [ 23/Mar/14 ]

sverch, I made a small modification to your script and ran it a few times with 2.6 and couldn't get it to fail, but the first time I tried it on 2.4.8 it failed in exactly the way that others saw. I'm going to attach the script and the output.

Comment by Asya Kamsky [ 22/Mar/14 ]

Björn Bullerdieck, just double checking - you checked the config DB before the balancing started and said all was okay - were all the chunks created? So the only difference between the config before and after is that chunks were moved to different shards and the ranges for some of them became incorrect in the process, is that right?

Comment by Asya Kamsky [ 22/Mar/14 ]

achauhan@brightcove.com it may be helpful to get a copy of your config database - would you be able to attach it to this ticket?

Comment by Ankur Chauhan [ 22/Mar/14 ]

We were able to reproduce this behaviour. Yesterday, after we got things working quite well, I realized that we had the balancer turned off. I then enabled the balancer, and very soon after that the config on one of the collections got corrupted in the same way (overlapping chunks). I have a feeling that something funky is going on with the balancer that is biting us. There were a lot of "chunk migrate aborted" messages in some of the mongod logs. We are running mongos with the --noAutoSplit option and numInitialChunks = 4*shardcount.

Comment by Shaun Verch [ 21/Mar/14 ]

Hi Björn,

I'm trying to reproduce your test, but I'm not entirely clear on what you are doing. Do you mean to:

  1. Deactivate balancer
  2. Spawn 10 threads. In each thread, do the following 100000 times, for different values of "dbName" and "collName":
    1.  db.adminCommand({ enableSharding : dbName }); 

    2.  db.adminCommand({ shardcollection : dbName + "." + collName, key : { _id : "hashed" }, numInitialChunks : 15 }); 

  3. Activate balancer
  4. Check config data (config data has overlapping ranges)

I've attached a script to this ticket that I think is based on what you described using the mongo shell, but I haven't been able to reproduce it. Can you reproduce it creating fewer than "100,000" databases in each thread? Am I understanding that number correctly?

Thanks,
~Shaun Verch

Comment by Björn Bullerdieck [X] [ 21/Mar/14 ]

Yes, I have a little Java program. I'll see if I can extract it.
But it is very simple:
create one MongoClient -> start 10 threads -> (this time: deactivate the balancer)
-> every thread: create a database (ci_40000000000000*) -> create a collection sharded on { _id: "hashed" } with numInitialChunks: 15 -> insert 100,000 entries
=> config looks ok (but chunks are not distributed - sh.status())
-> activate balancer
=> config looks bad

Same result without deactivating the balancer, but I think it is a balancer bug.

I can reproduce it easily

Test system (1 developer machine, Win7 64-bit): 3 single-shard instances (no replicas), 3 config server instances, 1 mongos

Comment by Björn Bullerdieck [X] [ 21/Mar/14 ]

config dump and logs (20140321_1540.rar)

Comment by Scott Hernandez (Inactive) [ 21/Mar/14 ]

Björn Bullerdieck, do you have a script that can reproduce this behavior? If so, please attach it to the issue so we can try to reproduce it as well. If not, can you please upload a mongodump of the config database, and logs from your mongos instances that cover the period when you noticed this behavior?

Comment by Björn Bullerdieck [X] [ 21/Mar/14 ]

The first try with 15 initial chunks was successful, but now...
A different number of chunks does not solve the problem (next try with the balancer deactivated):

   {  "_id" : "ci_400000000000006",  "partitioned" : true,  "primary" : "shard_001" }
           ci_400000000000006.some_collection
                   shard key: { "_id" : "hashed" }
                   chunks:
                           shard_002       5
                           shard_003       5
                           shard_001       5
                   { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-7993589098607472360") } on : shard_002 Timestamp(2, 0)
                   { "_id" : NumberLong("-7993589098607472360") } -->> { "_id" : NumberLong("-6763806160360168920") } on : shard_003 Timestamp(3, 0)
                   { "_id" : NumberLong("-6763806160360168920") } -->> { "_id" : NumberLong("-5534023222112865480") } on : shard_002 Timestamp(4, 0)
                   { "_id" : NumberLong("-5534023222112865480") } -->> { "_id" : NumberLong("-4304240283865562040") } on : shard_003 Timestamp(5, 0)
                   { "_id" : NumberLong("-4304240283865562040") } -->> { "_id" : NumberLong("-3074457345618258600") } on : shard_002 Timestamp(6, 0)
                   { "_id" : NumberLong("-3074457345618258600") } -->> { "_id" : NumberLong("-1844674407370955160") } on : shard_003 Timestamp(7, 0)
                   { "_id" : NumberLong("-1844674407370955160") } -->> { "_id" : NumberLong("-614891469123651720") } on : shard_002 Timestamp(8, 0)
                   { "_id" : NumberLong("-614891469123651720") } -->> { "_id" : NumberLong("614891469123651720") } on : shard_003 Timestamp(9, 0)
                   { "_id" : NumberLong("614891469123651720") } -->> { "_id" : NumberLong("1844674407370955160") } on : shard_002 Timestamp(10, 0)
                   { "_id" : NumberLong("1844674407370955160") } -->> { "_id" : NumberLong("3074457345618258600") } on : shard_003 Timestamp(11, 0)
                   { "_id" : NumberLong("3074457345618258600") } -->> { "_id" : NumberLong("7993589098607472360") } on : shard_001 Timestamp(11, 1)
                   { "_id" : NumberLong("4304240283865562040") } -->> { "_id" : NumberLong("5534023222112865480") } on : shard_001 Timestamp(1, 14)
                   { "_id" : NumberLong("5534023222112865480") } -->> { "_id" : NumberLong("6763806160360168920") } on : shard_001 Timestamp(1, 15)
                   { "_id" : NumberLong("6763806160360168920") } -->> { "_id" : NumberLong("7993589098607472360") } on : shard_001 Timestamp(1, 16)
                   { "_id" : NumberLong("7993589098607472360") } -->> { "_id" : { "$maxKey" : 1 } } on : shard_001 Timestamp(1, 17)

Interesting - the final split point always appears twice, and it is always the same chunk(s) in all the sharded collections:

                   { "_id" : NumberLong("3074457345618258600") } -->> { "_id" : NumberLong("7993589098607472360") } on : shard_001 Timestamp(11, 1)
                   { "_id" : NumberLong("6763806160360168920") } -->> { "_id" : NumberLong("7993589098607472360") } on : shard_001 Timestamp(1, 16)

Comment by Ankur Chauhan [ 20/Mar/14 ]

Is there a possible resolution or a workaround for this bug?

Comment by Björn Bullerdieck [X] [ 20/Mar/14 ]

Same Problem.
Operating System:Win7-64
Affects Version/s: 2.4.9

10 threads creating 10 sharded databases, each with a sharded collection (hashed shard key, 8 initial chunks)

2014-03-20 15:17:48,621 INFO  [main] de.webtrekk.mongodbvscouchbase.AbstractDatabaseHandler: startAccountThreads..
2014-03-20 15:17:48,926 INFO  [pool-2-thread-7] ***.MongoDBUtils: Command: { "enablesharding" : "ci_400000000000007"}. Enabled sharding for database: { "serverUsed" : "***:27017" , "ok" : 1.0}
2014-03-20 15:17:49,059 INFO  [pool-2-thread-5] ***.MongoDBUtils: Command: { "enablesharding" : "ci_400000000000005"}. Enabled sharding for database: { "serverUsed" : "***:27017" , "ok" : 1.0}
...
2014-03-20 15:17:54,480 INFO  [pool-2-thread-3] ***.MongoDBUtils: Command: { "shardCollection" : "ci_400000000000003.some_collection" , "key" : { "_id" : "hashed"} , "numInitialChunks" : 8}. Enabled sharding for database: { "serverUsed" : "***:27017" , "collectionsharded" : "ci_400000000000003.some_collection" , "ok" : 1.0}
2014-03-20 15:17:54,481 INFO  [pool-2-thread-7] ***.MongoDBUtils: Command: { "shardCollection" : "ci_400000000000007.some_collection" , "key" : { "_id" : "hashed"} , "numInitialChunks" : 8}. Enabled sharding for database: { "serverUsed" : "***:27017" , "collectionsharded" : "ci_400000000000007.some_collection" , "ok" : 1.0}
...

Only 2 of 10 are OK:

mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "version" : 3,
        "minCompatibleVersion" : 3,
        "currentVersion" : 4,
        "clusterId" : ObjectId("532af869d4e449358dc2f64d")
}
  shards:
        {  "_id" : "shard_001",  "host" : "***:20117" }
        {  "_id" : "shard_002",  "host" : "***:20217" }
        {  "_id" : "shard_003",  "host" : "***:20317" }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "ci_400000000000007",  "partitioned" : true,  "primary" : "shard_001" }
                ci_400000000000007.some_collection
                        shard key: { "_id" : "hashed" }
                        chunks:
                                shard_002       3
                                shard_003       2
                                shard_001       3
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : NumberLong("-6917529027641081850") } on : shard_002 Timestamp(2, 0)
                        { "_id" : NumberLong("-6917529027641081850") } -->> { "_id" : NumberLong("-4611686018427387900") } on : shard_003 Timestamp(3, 0)
                        { "_id" : NumberLong("-4611686018427387900") } -->> { "_id" : NumberLong("-2305843009213693950") } on : shard_002 Timestamp(4, 0)
                        { "_id" : NumberLong("-2305843009213693950") } -->> { "_id" : NumberLong(0) } on : shard_003 Timestamp(5, 0)
                        { "_id" : NumberLong(0) } -->> { "_id" : NumberLong("2305843009213693950") } on : shard_002 Timestamp(6, 0)
                        { "_id" : NumberLong("2305843009213693950") } -->> { "_id" : NumberLong("6917529027641081850") } on : shard_001 Timestamp(6, 1)
                        { "_id" : NumberLong("4611686018427387900") } -->> { "_id" : NumberLong("6917529027641081850") } on : shard_001 Timestamp(1, 9)
                        { "_id" : NumberLong("6917529027641081850") } -->> { "_id" : { "$maxKey" : 1 } } on : shard_001 Timestamp(1, 10)

The problem: both chunks end at the same max, 6917529027641081850:

 { "_id" : NumberLong("2305843009213693950") } -->> { "_id" : NumberLong("6917529027641081850") } on : shard_001 Timestamp(4, 1)
 { "_id" : NumberLong("4611686018427387900") } -->> { "_id" : NumberLong("6917529027641081850") } on : shard_001 Timestamp(2, 8)

more details:
shard-problem.txt
mongos1-failedShardConfig.log

Comment by Eliot Horowitz (Inactive) [ 07/Feb/14 ]

Is it possible to send all the mongos logs and a mongodump of the config database?

Comment by Mikhail Kochegarov [X] [ 06/Feb/14 ]

In addition, here is the code that was used to create this collection:

db.adminCommand( { shardCollection: "database.stats_archive_monthly", key: {a: "hashed"}, numInitialChunks: 12 } )
db.stats_archive_monthly.ensureIndex( { "a": 1, "l_id":1, "t": 1 } )
db.stats_archive_monthly.ensureIndex( {t: 1}, { expireAfterSeconds: 2678400 + 604800} )
