Core Server / SERVER-12141

Cannot create appropriate tag ranges with compound shard key

    • Type: Bug
    • Resolution: Done
    • Priority: Major - P3
    • Affects Version/s: 2.4.3
    • Component/s: Sharding
    • Labels:
      None
    • Environment:
      Ubuntu

      We're using 2.4.3.

      We have a compound shard key composed of a 'token' field, stored as an 8-digit integer, and an _id field that holds a hash string. To be clear, we are NOT using hashed shard keys.

      An example shard key might look like:

         { token : 11000001, _id : "3ad3e42367cc4580b81b0b5df235bf36" }
      

      With no tag ranges in place, the data seems to distribute pretty well. But we want to use tag-aware sharding.

      Assuming we have shards tagged 'A' and 'B', if we define tag ranges that look like:

         sh.addTagRange("test.mycoll", { token : MinKey }, { token : "11000000" }, "A")
         sh.addTagRange("test.mycoll", { token : "11000000" }, { token : "11999999" }, "A")
         sh.addTagRange("test.mycoll", { token : "12000000" }, { token : "12999999" }, "A")
         sh.addTagRange("test.mycoll", { token : "21000000" }, { token : "21999999" }, "B")
         ...
      

      All the data ends up on just one shard.
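
      One thing worth checking here: the token is described as an integer, while the range bounds are quoted strings. BSON orders values by type before value, and numbers sort below strings, so every integer token would fall into the first range (MinKey up to the string "11000000"), which is tagged 'A'. The following is a minimal sketch of that ordering in plain JavaScript, not MongoDB's actual implementation. The names MIN_KEY, MAX_KEY, compareValues, and tagFor are hypothetical, and numericRanges is an assumed variant of the ranges with unquoted bounds.

```javascript
// Simplified model of BSON cross-type ordering, limited to the types in
// this report: MinKey < numbers < strings < MaxKey.
// MIN_KEY / MAX_KEY are hypothetical stand-ins for the shell's MinKey / MaxKey.
const MIN_KEY = { bsonRank: 0 };
const MAX_KEY = { bsonRank: 3 };

function rank(v) {
  if (v === MIN_KEY || v === MAX_KEY) return v.bsonRank;
  if (typeof v === "number") return 1;
  if (typeof v === "string") return 2;
  throw new Error("type not covered by this sketch");
}

// Returns a negative, zero, or positive number, like a sort comparator.
function compareValues(a, b) {
  const ra = rank(a), rb = rank(b);
  if (ra !== rb) return ra - rb; // different types: the type rank decides
  if (a === b) return 0;         // equal values (also MinKey/MaxKey)
  return a < b ? -1 : 1;         // same type: compare the values
}

// Tag ranges include their lower bound and exclude their upper bound.
function tagFor(token, ranges) {
  const hit = ranges.find(
    (r) => compareValues(token, r.min) >= 0 && compareValues(token, r.max) < 0
  );
  return hit ? hit.tag : null;
}

// The ranges as written in the report: string bounds.
const stringRanges = [
  { min: MIN_KEY, max: "11000000", tag: "A" },
  { min: "11000000", max: "11999999", tag: "A" },
  { min: "12000000", max: "12999999", tag: "A" },
  { min: "21000000", max: "21999999", tag: "B" },
];

// Assumed variant: the same bounds as numbers, matching the token's type.
const numericRanges = [
  { min: MIN_KEY, max: 11000000, tag: "A" },
  { min: 11000000, max: 11999999, tag: "A" },
  { min: 12000000, max: 12999999, tag: "A" },
  { min: 21000000, max: 21999999, tag: "B" },
];

console.log(tagFor(21000001, stringRanges));  // A: any number sorts below any string
console.log(tagFor(21000001, numericRanges)); // B
```

      If this is what is happening, using unquoted integer bounds in the tag ranges may be enough for the balancer to separate the 'A' and 'B' tokens.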

      If we define tag ranges that look like:

         sh.addTagRange("test.mycoll", {token:MinKey, _id:MinKey}, {token:"11000000", _id:MaxKey}, "A")
         sh.addTagRange("test.mycoll", {token:"11000000", _id:MinKey}, {token:"11999999", _id:MaxKey}, "A")
         sh.addTagRange("test.mycoll", {token:"12000000", _id:MinKey}, {token:"12999999", _id:MaxKey}, "A")
         sh.addTagRange("test.mycoll", {token:"21000000", _id:MinKey}, {token:"21999999", _id:MaxKey}, "B")
         ...
      

      MongoDB complains (in the log) that the tag ranges are not valid.
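
      A possible reading of that complaint, assuming MongoDB rejects overlapping tag ranges and treats each range as lower-bound inclusive and upper-bound exclusive: the first range ends at {token:"11000000", _id:MaxKey}, which sorts above the second range's start at {token:"11000000", _id:MinKey}, so the two ranges overlap. The sketch below compares compound keys field by field and reports overlaps. It is an illustration in plain JavaScript, not MongoDB's code, and compareKeys and findOverlaps are hypothetical names.

```javascript
// Hypothetical stand-ins for MinKey / MaxKey; only strings are ranked
// between them because the bounds in the report are all strings.
const MIN_KEY = { bsonRank: 0 };
const MAX_KEY = { bsonRank: 2 };

function rank(v) {
  if (v === MIN_KEY || v === MAX_KEY) return v.bsonRank;
  if (typeof v === "string") return 1;
  throw new Error("type not covered by this sketch");
}

function compareValues(a, b) {
  const ra = rank(a), rb = rank(b);
  if (ra !== rb) return ra - rb;
  if (a === b) return 0;
  return a < b ? -1 : 1;
}

// Compound keys compare field by field, in shard-key order.
function compareKeys(a, b) {
  for (const field of ["token", "_id"]) {
    const c = compareValues(a[field], b[field]);
    if (c !== 0) return c;
  }
  return 0;
}

// Sort by lower bound, then flag any range that starts below the highest
// upper bound seen so far.
function findOverlaps(ranges) {
  const sorted = [...ranges].sort((x, y) => compareKeys(x.min, y.min));
  const overlaps = [];
  let widest = sorted[0];
  for (let i = 1; i < sorted.length; i++) {
    if (compareKeys(sorted[i].min, widest.max) < 0) {
      overlaps.push([widest, sorted[i]]);
    }
    if (compareKeys(sorted[i].max, widest.max) > 0) widest = sorted[i];
  }
  return overlaps;
}

// Builds the four ranges from the report with a chosen _id upper bound.
function buildRanges(upperId) {
  return [
    { min: { token: MIN_KEY, _id: MIN_KEY }, max: { token: "11000000", _id: upperId }, tag: "A" },
    { min: { token: "11000000", _id: MIN_KEY }, max: { token: "11999999", _id: upperId }, tag: "A" },
    { min: { token: "12000000", _id: MIN_KEY }, max: { token: "12999999", _id: upperId }, tag: "A" },
    { min: { token: "21000000", _id: MIN_KEY }, max: { token: "21999999", _id: upperId }, tag: "B" },
  ];
}

console.log(findOverlaps(buildRanges(MAX_KEY)).length); // 1: first and second range overlap
console.log(findOverlaps(buildRanges(MIN_KEY)).length); // 0: no overlap with MinKey upper bounds
```

      Under these assumptions, upper bounds that use _id:MinKey leave no overlap, because each range then ends exactly where the next one may begin.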

      If we use MinKey everywhere instead of MaxKey:

         sh.addTagRange("test.mycoll", {token:MinKey, _id:MinKey}, {token:"11000000", _id:MinKey}, "A")
         sh.addTagRange("test.mycoll", {token:"11000000", _id:MinKey}, {token:"11999999", _id:MinKey}, "A")
         sh.addTagRange("test.mycoll", {token:"12000000", _id:MinKey}, {token:"12999999", _id:MinKey}, "A")
         sh.addTagRange("test.mycoll", {token:"21000000", _id:MinKey}, {token:"21999999", _id:MinKey}, "B")
         ...
      

      The complaints go away but all the data still ends up on one shard.

      The token field is constructed in such a way that we can pretty much guarantee there will never be a token below 11000000 or above a certain upper bound. We put in ranges to handle those cases anyway.

      We have plenty of data and there are lots of chunks being created. The chunks seem to be getting split very nicely within the ranges.

      Ideally, the first example would work since for the purposes of shard tagging we do not care about the _id value. But it does not.

      Either we just can't figure out how to define the tag ranges appropriately, or MongoDB just doesn't respect tag ranges in conjunction with a compound shard key.

      Any help would be appreciated.

            Assignee:
            Unassigned
            Reporter:
            chris.coppick@tealium.com Chris Coppick