[SERVER-80914] cannot retrieve partial results from a sharded cluster with one zone down Created: 08/Sep/23  Updated: 22/Jan/24

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: 6.0.6
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Jonathan Janos Assignee: Backlog - Cluster Scalability
Resolution: Unresolved Votes: 0
Labels: cs-subteam1, cs-techdebt, sharding-nyc-subteam1
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Cluster Scalability
Operating System: ALL
Participants:
Story Points: 3

 Description   

A query that should be targeted returns no results when issued against a zone-sharded cluster in which one zone is down. Steps to reproduce (the exact commands I used are on GitHub):

  1. Launch a 2-shard cluster ("0-deployCluster.mlaunch.sh")
  2. Tag each shard with a zone (US and WORLD). ("1-addShardstoZone.sh")
  3. Define the zone ranges such that a 'metadata.site' value of 'site1' is mapped to the US zone and a 'metadata.site' value of 'site2' is mapped to the WORLD zone. ("2-defineZoneRanges.sh")
  4. Shard a collection using a compound shard key: {'metadata.site': 1, 'metadata.sensorID': 1}. ("3-shardCollection.sh")
  5. Insert test data. Two records will suffice: one containing 'metadata.site'='site1' and another containing 'metadata.site'='site2'. ("4-insertData.sh")
  6. Run the test scenario defined in "testSiteFailure.txt": kill shard02 (corresponding to site2/WORLD), connect to the mongos, and run some queries. Note that a query on {"metadata.site":"site1"} should be a targeted query and should succeed, but it does not. No results are returned.
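
For readers without access to the scripts, the steps above can be roughly sketched as the command documents sent to the mongos. This is an illustration under stated assumptions, not the contents of the referenced scripts: the namespace "test.sensors" and the shard name "shard01" are hypothetical (only shard02 = site2/WORLD is given above), and the zone ranges shown (all sensorID values for one site) are one plausible way to express step 3. The MinKey/MaxKey sentinels are passed in by the caller so that the sketch has no driver dependency.

```python
# Sketch of the commands behind steps 2-6. Names marked "assumed" are
# illustrative and do not come from the ticket's scripts.

DB = "test"        # assumed database name
COLL = "sensors"   # assumed collection name
NS = f"{DB}.{COLL}"

SHARD_KEY = {"metadata.site": 1, "metadata.sensorID": 1}  # from step 4

# shard02 <-> site2/WORLD is stated in step 6; shard01 <-> US is assumed.
ZONES = [("shard01", "US", "site1"), ("shard02", "WORLD", "site2")]


def setup_commands(min_key, max_key):
    """Return the admin commands for steps 2-4, in order.

    min_key / max_key stand for the BSON MinKey / MaxKey sentinels
    (with pymongo: bson.min_key.MinKey() and bson.max_key.MaxKey()).
    """
    cmds = []
    # Step 2: tag each shard with its zone.
    for shard, zone, _site in ZONES:
        cmds.append({"addShardToZone": shard, "zone": zone})
    # Step 3: map every sensorID of a given site to that site's zone.
    for _shard, zone, site in ZONES:
        cmds.append({
            "updateZoneKeyRange": NS,
            "min": {"metadata.site": site, "metadata.sensorID": min_key},
            "max": {"metadata.site": site, "metadata.sensorID": max_key},
            "zone": zone,
        })
    # Step 4: shard the collection on the compound key.
    cmds.append({"shardCollection": NS, "key": SHARD_KEY})
    return cmds


def prefix_query(site, allow_partial=False):
    """Build a find command that filters on the shard key prefix only (step 6)."""
    cmd = {"find": COLL, "filter": {"metadata.site": site}}
    if allow_partial:
        # Documented find option; the ticket does not say whether it was set.
        cmd["allowPartialResults"] = True
    return cmd
```

With pymongo, each setup command could be sent as `client.admin.command(cmd)` and the query as `client[DB].command(prefix_query("site1"))`. Whether setting `allowPartialResults` changes the outcome in this scenario is not stated in the ticket.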

I am working with ratika.gandhi@mongodb.com to understand why results are not being returned when the query uses a prefix of the compound shard key. Our documentation says this should work.
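
One way to check whether the prefix query is actually targeted is to look at the explain output from the mongos. The sketch below is a hedged illustration, not something taken from the test scenario: it assumes explain itself succeeds while the shard is down, that the mongos output has the usual `queryPlanner.winningPlan` shape with a `stage` field and a `shards` list, and it uses the hypothetical collection name "sensors".

```python
def explain_cmd(coll, query_filter):
    """Build an explain command for a find on `coll` (queryPlanner verbosity)."""
    return {
        "explain": {"find": coll, "filter": query_filter},
        "verbosity": "queryPlanner",
    }


def targeted_shards(explain_output):
    """List the shard names in the mongos winning plan (empty if absent)."""
    plan = explain_output.get("queryPlanner", {}).get("winningPlan", {})
    return [s.get("shardName") for s in plan.get("shards", [])]


def is_single_shard(explain_output):
    """True when the mongos reports a SINGLE_SHARD plan stage."""
    plan = explain_output.get("queryPlanner", {}).get("winningPlan", {})
    return plan.get("stage") == "SINGLE_SHARD"
```

With pymongo, the output could be obtained via `client["test"].command(explain_cmd("sensors", {"metadata.site": "site1"}))`. If the plan lists only the shard in the US zone, the query is targeted as expected and the missing results would have to come from somewhere other than routing.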

I tested this using MongoDB 6.0.6.



 Comments   
Comment by Jonathan Janos [ 30/Oct/23 ]

I do have a customer running into this. They are working specifically with native time series collections, and in that case it appears that no results come back at all when one of the zones is down, even for targeted queries containing the complete shard key. I'm not sure whether that's the same issue or an additional one. Adding michael.gargiulo@mongodb.com for visibility on time series.

Generated at Thu Feb 08 06:44:56 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.