[SERVER-22380] 2-shard setup with tag ranges but mongo never writes to shard #2 Created: 30/Jan/16  Updated: 03/Feb/16  Resolved: 03/Feb/16

Status: Closed
Project: Core Server
Component/s: Sharding
Affects Version/s: 2.6.11
Fix Version/s: None

Type: Question Priority: Minor - P4
Reporter: Hector Lai Assignee: Unassigned
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Participants:

 Description   

I'm new to MongoDB, especially on the IT side. I'm trying to set up a sharded cluster with replica sets on a single VM (virtual machine).
Our setup:
1 VM (host: vmmongo2) running RHEL 6.6, mongo 2.6.11.

  • 3 config servers (vmmongo2, ports 30001, 30002, 30003)
  • 2 shards
  • 1 mongos
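
A sketch of how the processes on vmmongo2 might be started for this layout (the dbpath/logpath locations are assumptions, and only the first config server and the first member of each shard are shown):

  # config servers (repeat for ports 30002 and 30003 with their own dbpaths)
  mongod --configsvr --port 30001 --dbpath /data/cfg1 --logpath /data/cfg1/mongod.log --fork

  # shard members (repeat for ports 31002/31003 and 32002/32003)
  mongod --shardsvr --replSet rstest2sh1 --port 31001 --dbpath /data/sh1a --logpath /data/sh1a/mongod.log --fork
  mongod --shardsvr --replSet rstest2sh2 --port 32001 --dbpath /data/sh2a --logpath /data/sh2a/mongod.log --fork

  # mongos pointing at all three config servers
  mongos --configdb vmmongo2:30001,vmmongo2:30002,vmmongo2:30003 --port 27017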

Each shard contains a replica set

  • shard #1 has replica set rstest2sh1
  • shard #2 has replica set rstest2sh2

Replica set rstest2sh1 has

  • primary: vmmongo2:31001
  • secondary: vmmongo2:31002, vmmongo2:31003

Replica set rstest2sh2 has

  • primary: vmmongo2:32001
  • secondary: vmmongo2:32002, vmmongo2:32003
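
For reference, a sketch of how the two replica sets might have been initiated and added to the cluster (member _ids are assumptions; rs.initiate() is run against the intended primary of each set, sh.addShard() against mongos):

  // on vmmongo2:31001
  rs.initiate({
    _id: "rstest2sh1",
    members: [
      { _id: 0, host: "vmmongo2:31001" },
      { _id: 1, host: "vmmongo2:31002" },
      { _id: 2, host: "vmmongo2:31003" }
    ]
  })
  // on vmmongo2:32001, initiate rstest2sh2 the same way with ports 32001-32003

  // on mongos
  sh.addShard("rstest2sh1/vmmongo2:31001")
  sh.addShard("rstest2sh2/vmmongo2:32001")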

Shard setting

  • chunk size: 1MB for testing only (default is 64MB). Motivation: trying to force data to be written to shard #2.
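
A sketch of how a 1 MB chunk size can be set, assuming it is done from the mongos shell (this writes the cluster-wide setting to the config database):

  // on mongos
  use config
  db.settings.save({ _id: "chunksize", value: 1 })   // value is in MB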

Sharded Collection

  • Database: DB_TEST2
  • Collection: robots
  • Docs: 30000 of them (all with different names, etc., but all with the same LocalSite: "ASIA")
  • Indexes: {"name":1, "LocalSite":1}
  • Shard Key: "LocalSite" (one of these 3 values: USA,ASIA,EUR)
  • Shard Tags:
    tagUSA,tagEUR added to shard #1 replica set rstest2sh1
    tagASIA added to shard #2 replica set rstest2sh2
  • Shard Tag Ranges:
    • for USA<=LocalSite<USB, save data in shard #1 replica set rstest2sh1
    • for ASIA<=LocalSite<ASIB, save data in shard #2 replica set rstest2sh2
    • for EUR<=LocalSite<EUS, save data in shard #1 replica set rstest2sh1
      (I think mongo only supports ranges, not exact matches, so I'm relying on a value of 'ASIA' comparing as greater than or equal to 'ASIA' inclusively and less than 'ASIB' exclusively, in alphabetical comparison. A sketch of the commands that would produce this configuration follows.)
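
A sketch of the mongos shell commands that would produce the collection, tag, and tag-range configuration described above (the exact invocations are assumptions, not taken from the ticket):

  // on mongos
  sh.enableSharding("DB_TEST2")
  sh.shardCollection("DB_TEST2.robots", { LocalSite: 1 })

  // tags per shard
  sh.addShardTag("rstest2sh1", "tagUSA")
  sh.addShardTag("rstest2sh1", "tagEUR")
  sh.addShardTag("rstest2sh2", "tagASIA")

  // tag ranges: lower bound inclusive, upper bound exclusive
  sh.addTagRange("DB_TEST2.robots", { LocalSite: "USA" },  { LocalSite: "USB" },  "tagUSA")
  sh.addTagRange("DB_TEST2.robots", { LocalSite: "ASIA" }, { LocalSite: "ASIB" }, "tagASIA")
  sh.addTagRange("DB_TEST2.robots", { LocalSite: "EUR" },  { LocalSite: "EUS" },  "tagEUR")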

In the end my collection of 30000 docs shows up entirely on shard #1, replica set rstest2sh1 at vmmongo2, ports 31001, 31002, 31003. However, it is a sad story for shard #2, replica set rstest2sh2:
vmmongo2:32001, db DB_TEST2, collection robots is empty
vmmongo2:32002, db DB_TEST2 does not exist
vmmongo2:32003, db DB_TEST2 does not exist

I'm guessing all the data is written to shard #1 first (why not shard #2? what decides this?), and then, every now and then when the balancer runs, I expect it to move all my 30000 docs (they all have LocalSite set to 'ASIA') to shard #2, replica set rstest2sh2, since that is how I set up the shard tag range.

Any comments on why no data is being written to shard #2 at all would be greatly appreciated.

Thanks.
Hector

==========
P.S.

mongos> db.robots.getShardDistribution()
Shard rstest2sh1 at rstest2sh1/vmmongo2:31001,vmmongo2:31002,vmmongo2:31003
 data : 3.2MiB docs : 30000 chunks : 6
 estimated data per chunk : 546KiB
 estimated docs per chunk : 5000
Totals
 data : 3.2MiB docs : 30000 chunks : 6
 Shard rstest2sh1 contains 100% data, 100% docs in cluster, avg obj size on shard : 112B
 
mongos> sh.getBalancerState()
true
mongos> sh.isBalancerRunning()
true
 
mongos> sh.status()
--- Sharding Status ---
  sharding version: {
        "_id" : 1,
        "version" : 4,
        "minCompatibleVersion" : 4,
        "currentVersion" : 5,
        "clusterId" : ObjectId("56abc258a4cfd0c893c8272a")
}
  shards:
        {  "_id" : "rstest2sh1",  "host" : "rstest2sh1/vmmongo2:31001,vmmongo2:31002,vmmongo2:31003",  "tags" : [ "tagUSA", "tagEUR" ] }
        {  "_id" : "rstest2sh2",  "host" : "rstest2sh2/vmmongo2:32001,vmmongo2:32002,vmmongo2:32003",  "tags" : [ "tagASIA" ] }
  databases:
        {  "_id" : "admin",  "partitioned" : false,  "primary" : "config" }
        {  "_id" : "DB_TEST2",  "partitioned" : true,  "primary" : "rstest2sh1" }
                DB_TEST2.robots
                        shard key: { "LocalSite" : 1 }
                        chunks:
                                rstest2sh1      3
                        { "LocalSite" : { "$minKey" : 1 } } -->> { "LocalSite" : "EUR" } on : rstest2sh1 Timestamp(1, 5)
                        { "LocalSite" : "EUR" } -->> { "LocalSite" : "USA" } on : rstest2sh1 Timestamp(1, 4)
                        { "LocalSite" : "USA" } -->> { "LocalSite" : { "$maxKey" : 1 } } on : rstest2sh1 Timestamp(1, 2)
                         tag: tagASIA  { "LocalSite" : "ASIA" } -->> { "LocalSite" : "ASIB" }
                         tag: tagEUR  { "LocalSite" : "EUR" } -->> { "LocalSite" : "EUS" }
                         tag: tagUSA  { "LocalSite" : "USA" } -->> { "LocalSite" : "USB" }



 Comments   
Comment by Kelsey Schubert [ 03/Feb/16 ]

Hi hectorl,

This is expected behavior. As you have noted, new insertions are not necessarily routed to the shard within the tag range.

Documents are inserted into the correct chunk according to the chunk range. Tag aware sharding relies on the balancer to appropriately redistribute chunks according to the tag ranges specified. Once the chunks have been properly distributed, future insertions should be routed according to the shard tag range.

To avoid waiting for the balancer, you may want to consider presplitting chunks and ensuring that they are appropriately distributed before inserting data.
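
For example, a minimal sketch of presplitting and placing the chunks by hand for the key values in this ticket (an illustration, not the only approach; it assumes the balancer is temporarily stopped with sh.stopBalancer() while chunks are moved):

  // on mongos, after sharding DB_TEST2.robots but before loading data
  sh.splitAt("DB_TEST2.robots", { LocalSite: "ASIA" })
  sh.splitAt("DB_TEST2.robots", { LocalSite: "EUR" })
  sh.splitAt("DB_TEST2.robots", { LocalSite: "USA" })

  // move the chunk that starts at "ASIA" to the shard tagged tagASIA
  sh.moveChunk("DB_TEST2.robots", { LocalSite: "ASIA" }, "rstest2sh2")

Once the chunk covering LocalSite "ASIA" lives on rstest2sh2, new inserts with that key value are routed to it directly.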

For additional information, please consider reviewing our documentation on tag aware sharding.

Please note that SERVER project is for reporting bugs or feature suggestions for the MongoDB server. For MongoDB-related support discussion please post on the mongodb-users group or Stack Overflow with the mongodb tag.

Kind regards,
Thomas
