[SERVER-78594] Make timeseries collection support analyzeShardKey and configureQueryAnalyzer commands Created: 30/Jun/23  Updated: 13/Sep/23  Resolved: 13/Sep/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Major - P3
Reporter: Cheahuychou Mao Assignee: Ratika Gandhi
Resolution: Duplicate Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
Assigned Teams:
Sharding NYC
Participants:

 Description   

A timeseries collection in MongoDB is a writable non-materialized view of an internal bucket collection. The analyzeShardKey and configureQueryAnalyzer commands currently do not work with timeseries collections because:

  • The analyzeShardKey command cannot calculate the cardinality, frequency and monotonicity metrics for a timeseries collection the same way it does for an ordinary collection, since the data is stored in bucket documents as shown in the following example.

    [
     	{
     		"_id" : ObjectId("649f39e0f61e0c02d1a008d0"),
     		"control" : {
     			"version" : 1,
     			"min" : {
     				"_id" : ObjectId("649f3a101400fcf278694868"),
     				"ts" : ISODate("2023-06-30T20:24:00Z")
     			},
     			"max" : {
     				"_id" : ObjectId("649f3a101400fcf2786948cb"),
     				"ts" : ISODate("2023-06-30T20:24:48.125Z")
     			}
     		},
     		"data" : {
     			"ts" : {
     				"0" : ISODate("2023-06-30T20:24:48.124Z"),
     				"1" : ISODate("2023-06-30T20:24:48.124Z"),
     				...
     				"99" : ISODate("2023-06-30T20:24:48.125Z")
     			},
     			"_id" : {
     				"0" : ObjectId("649f3a101400fcf278694868"),
     				"1" : ObjectId("649f3a101400fcf278694869"),
     				...
     				"99" : ObjectId("649f3a101400fcf2786948cb")
     			}
     		}
     	}
     ]
    

  • There are various restrictions on the shard key for a timeseries collection that the analyzeShardKey command needs to account for.
  • The configureQueryAnalyzer command needs to account for the fact that reads and writes for a timeseries collection sometimes show up with the view namespace and sometimes with the bucket namespace.
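
To illustrate the first and last points, here is a minimal Python sketch. It is not server code. It assumes the simplified, uncompressed bucket layout from the example above, modeled as plain dicts, and that the bucket collection is named system.buckets.<collection>. The function names (unpack_bucket, value_frequencies, view_namespace) and the meta_field parameter are hypothetical and exist only for this illustration.

```python
from collections import Counter

# Assumed prefix of the internal bucket collection name.
BUCKETS_PREFIX = "system.buckets."


def unpack_bucket(bucket, meta_field=None):
    """Yield one measurement dict per position stored in an uncompressed bucket.

    Each entry under "data" maps stringified positions ("0", "1", ...) to values,
    so a single bucket document holds many logical documents.
    """
    data = bucket["data"]
    positions = sorted({int(pos) for column in data.values() for pos in column})
    for pos in positions:
        key = str(pos)
        doc = {field: column[key] for field, column in data.items() if key in column}
        # Assumption: the bucket-level "meta" value applies to every measurement.
        if meta_field is not None and "meta" in bucket:
            doc[meta_field] = bucket["meta"]
        yield doc


def value_frequencies(buckets, field, meta_field=None):
    """Count how often each value of `field` occurs across unpacked measurements."""
    counts = Counter()
    for bucket in buckets:
        for doc in unpack_bucket(bucket, meta_field):
            if field in doc:
                counts[doc[field]] += 1
    return counts


def view_namespace(namespace):
    """Map a bucket namespace to its view namespace; leave other namespaces alone."""
    db, _, coll = namespace.partition(".")
    if coll.startswith(BUCKETS_PREFIX):
        coll = coll[len(BUCKETS_PREFIX):]
    return f"{db}.{coll}"
```

Counting bucket documents directly would report one document where there are many measurements, which is why cardinality and frequency have to be computed over the unpacked measurements. In the same way, sampled queries would need to be attributed to a single namespace regardless of whether they arrive on the view or on the bucket collection.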


 Comments   
Comment by Ratika Gandhi [ 17/Aug/23 ]

A customer has very limited options when choosing a shard key for a timeseries collection: the metaField, subfields of the metaField, the timeField, or a combination of these. So while finding the right shard key can still be tricky, there are only a few candidates to evaluate.
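
The restriction described in this comment can be sketched as a simple check. This is an illustration only, not the server's validation logic: the function name is_allowed_shard_key and its parameters are hypothetical, and the check covers only the rule stated above (any further rules, such as field ordering, are not modeled).

```python
def is_allowed_shard_key(shard_key_fields, time_field, meta_field=None):
    """Return True if every shard key field is the timeField, the metaField,
    or a dotted subfield of the metaField."""
    if not shard_key_fields:
        return False
    for field in shard_key_fields:
        if field == time_field:
            continue
        if meta_field is not None and (
            field == meta_field or field.startswith(meta_field + ".")
        ):
            continue
        # Any other field (for example a measurement field) is not allowed.
        return False
    return True
```

For example, with timeField "ts" and metaField "sensor", the candidate ["sensor.id", "ts"] passes this check while ["temperature"] does not.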

Generated at Thu Feb 08 06:38:45 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.