Type: Task
Resolution: Unresolved
Priority: Major - P3
Affects Version/s: None
Component/s: None
The boundary between the schema-design and query-optimization skills is not clear-cut.
There are cases where both perspectives need to be considered.
Because the interplay between the two skills is not well documented or supported, we need to consider how to organize our skills so that these cases are covered well. A major reorganization is out of scope for the current project, but we want to lay the groundwork for it by documenting several examples (prompts, ideally with datasets or at least a description of the relevant factors that are not immediately visible).
Example:
Let's say we have a collection of messages. Each message has a date.
The most frequent query filters messages by a given year plus other criteria.
A query that extracts the year from the date and compares it to "2025" is not optimized, because a predicate on a computed value cannot use a regular index on the date field.
A) Replacing that predicate with a range from "01-01-2025" (inclusive) to "01-01-2026" (exclusive) should be better, as it can use an index on the date. However, if the query has other range predicates and a sort order, it may still not be as fast as it could be.
B) Refactoring the data model to store the "year" as an additional field allows a simple equality match.
In this case, if the performance gain is crucial, it may be better to change the data model, but you then need to ensure that any code that changes the date in a document also recalculates the year.
Alternatively, the simple query change in A) may be good enough, with no need to pursue the best possible optimization.
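To make the trade-off concrete, here is a minimal sketch of the three query shapes and the extra write-side work that option B requires. It assumes a MongoDB-style query syntax expressed as plain Python dictionaries, dates stored in UTC, and the hypothetical field names `date` and `year`; none of these details are confirmed by this ticket, and the helper names are illustrative only.

```python
from datetime import datetime, timezone


def year_by_expression(year: int) -> dict:
    """Original shape: the year is computed per document from `date`,
    so a regular index on `date` cannot serve this predicate."""
    return {"$expr": {"$eq": [{"$year": "$date"}, year]}}


def year_by_range(year: int) -> dict:
    """Option A: half-open range, Jan 1 of `year` (inclusive) to
    Jan 1 of the following year (exclusive). Assumes UTC dates."""
    start = datetime(year, 1, 1, tzinfo=timezone.utc)
    end = datetime(year + 1, 1, 1, tzinfo=timezone.utc)
    return {"date": {"$gte": start, "$lt": end}}


def year_by_field(year: int) -> dict:
    """Option B: equality match on a stored `year` field (hypothetical name)."""
    return {"year": year}


def set_date_update(new_date: datetime) -> dict:
    """Option B's cost: every write that changes `date` must also
    rewrite `year`, otherwise the two fields drift apart."""
    return {"$set": {"date": new_date, "year": new_date.year}}
```

With a driver such as pymongo, these dictionaries would typically be passed as the filter to `find` and as the update document to `update_one`. The sketch only builds the documents, so it says nothing about actual query plans; those would need to be checked against real data and indexes.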