[DOCS-15803] Rename "cardinality" term for shard key considerations Created: 04/Jan/23  Updated: 29/Aug/23

Status: Ready for Work
Project: Documentation
Component/s: manual, Server
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Minor - P4
Reporter: Charlie Swanson Assignee: Joseph Dougherty
Resolution: Unresolved Votes: 0
Labels: query, sharding
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File Screen Shot 2023-02-01 at 11.30.04 AM.png    
Issue Links:
Related
is related to SERVER-72507 Rename the field "cardinality" in the... Closed
Participants:
Days since reply: 1 year, 1 week ago
Epic Link: DOCSP-11702

 Description   

I was reviewing the design document for an upcoming project "Add and expose metrics to make shard key selection easier", when I noticed we use the term "cardinality" to mean "number of unique values":
https://www.mongodb.com/docs/manual/core/sharding-choose-a-shard-key/#std-label-shard-key-range

This is mostly fine, and somewhat consistent with set theory (as in the wiki page), but are the shard key values a set? I would think they are more like a multi-set or vector, since values repeat.

We are in the process of building a new query optimizer that will estimate the "cardinality" of different query plan sub-segments. In that context, "cardinality" will mean "number of values", not "number of unique values". I think this is all going to get confusing.
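A minimal sketch of the two meanings, for illustration only. The sample values and variable names below are made up and are not taken from the server or from the design document:

```python
# Hypothetical shard key values, one per document (illustrative data only).
shard_key_values = ["us", "us", "eu", "us", "apac", "eu"]

# Sense used on the "choose a shard key" page: number of distinct values.
distinct_count = len(set(shard_key_values))

# Sense described for the query optimizer: number of values, repeats included.
total_count = len(shard_key_values)

print(f"distinct values: {distinct_count}, total values: {total_count}")
```

The same list yields two different numbers, which is why one unqualified word for both is likely to confuse readers.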

I'm open to suggestions here, but I propose that the "choose a shard key" page use "number of distinct values" instead of "cardinality".



 Comments   
Comment by Joseph Dougherty [ 01/Feb/23 ]

Thanks for opening this issue, charlie.swanson@mongodb.com! I like the idea of simplifying our terminology here a bit.
I think we should also consider:

  • other parts of the corpus may still use the "cardinality" terminology
  • there are plenty of supplementary materials that will continue to use this language (for example, Demystifying Sharding in MongoDB; see the attached screenshot)

Let's consider making this a discussion point at the next 6WR for this issue. I'd like to hear what folks think about this.

Thanks!
Joe

Generated at Thu Feb 08 08:13:55 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.