Type: Task
Resolution: Done
Priority: Major - P3
Affects Version/s: None
Component/s: None
Execution Team 2022-04-18, Execution Team 2022-05-02
Goal: determine whether persisting a change diff log for derived metadata has a significant performance cost, to inform whether to choose this implementation plan. A carefully shaped index should be a reasonable approximation of a change diff log.
Hypothetical Change Diff Log, clustered by timestamp
{
    timestamp: <>        <- clustered index
    nss/UUID: <>         <- collection identifier
    change: {
        count: <1,-1>    <- absence of DM type field == no change
        dataSize: <int>
    }
}
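A minimal sketch of how one entry of this hypothetical log might be built, assuming Python. The helper name `make_diff_entry` and the field name `ns` (standing in for nss/UUID) are illustrative assumptions, not part of the ticket. It only demonstrates the rule that a derived metadata field is omitted when it did not change.

```python
import time
from typing import Any, Dict, Optional


def make_diff_entry(ns: str, count: int = 0, data_size: int = 0,
                    timestamp: Optional[float] = None) -> Dict[str, Any]:
    """Build one hypothetical change diff log entry.

    A field is left out of `change` when its delta is zero, mirroring
    "absence of DM type field == no change".
    """
    if count not in (-1, 0, 1):
        raise ValueError("count delta must be 1 or -1 (0 means no change)")
    change: Dict[str, int] = {}
    if count != 0:
        change["count"] = count
    if data_size != 0:
        change["dataSize"] = data_size
    return {
        # Assumption: wall-clock time stands in for the clustered timestamp.
        "timestamp": time.time() if timestamp is None else timestamp,
        "ns": ns,
        "change": change,
    }
```

For example, `make_diff_entry("db.coll", data_size=-5)` would produce an entry whose `change` holds only `dataSize`.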
Collection
{
    _id: <>               <- leave blank
    monotonicField: 1     <- shall create index on this, always increases like a timestamp
    randomValueField: <>  <- shall represent derived metadata diff values
}
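The test documents could be generated along these lines. This is a sketch under assumptions: the generator name `generate_docs`, the starting value 1, and the value range for `randomValueField` are all hypothetical choices, since the ticket does not specify them.

```python
import itertools
import random
from typing import Dict, Iterator


def generate_docs(n: int, seed: int = 0) -> Iterator[Dict[str, int]]:
    """Yield n documents shaped like the test collection."""
    rng = random.Random(seed)       # seeded so runs are repeatable
    counter = itertools.count(1)    # strictly increasing, like a timestamp
    for _ in range(n):
        yield {
            # _id is deliberately left out, per the schema above.
            "monotonicField": next(counter),
            # Assumed range; stands in for a derived metadata diff value.
            "randomValueField": rng.randint(-1024, 1024),
        }
```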
Index on monotonicField
{
    monotonicFieldValue: <collection_docID>
}
Workload
{
    1 thread running inserts on the collection
    manually compare performance results with and without the proposed index
}
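The with/without comparison could be driven by a small harness like the one below. This is only a sketch: `time_inserts` and `relative_slowdown` are hypothetical names, and the insert function is passed in as a callable so the harness itself does not depend on any driver.

```python
import time
from typing import Callable, Dict


def time_inserts(insert_one: Callable[[Dict[str, int]], object],
                 num_docs: int) -> float:
    """Insert num_docs documents one at a time; return documents per second."""
    start = time.perf_counter()
    for i in range(num_docs):
        # Placeholder payload; randomValueField here is not actually random.
        insert_one({"monotonicField": i, "randomValueField": i % 7})
    elapsed = time.perf_counter() - start
    return num_docs / elapsed if elapsed > 0 else float("inf")


def relative_slowdown(baseline_rate: float, indexed_rate: float) -> float:
    """Fraction of baseline throughput lost when the index is present."""
    return (baseline_rate - indexed_rate) / baseline_rate
```

With PyMongo, one plausible setup would be to pass `collection.insert_one` as the callable, run once on a fresh collection, then run again on a fresh collection after `collection.create_index("monotonicField")`, and compare the two rates.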
1) I chose an insert workload so that every write adds an entry to the index, as if the index were a log that records an entry on every write.
2) I may experiment with multiple threads, say 3-4, depending on the single-thread results. The monotonicField would then be only mostly monotonically increasing, which arguably simulates out-of-order writes.
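The multi-thread effect in note 2 can be illustrated without a database. In the sketch below (all names hypothetical), each thread draws the next value from a shared counter and then "inserts" it outside the lock, so values can arrive slightly out of order, which is the behavior the note describes.

```python
import threading
from typing import List


def run_writers(num_threads: int, writes_per_thread: int) -> List[int]:
    """Return monotonicField values in the order the simulated inserts landed."""
    counter_lock = threading.Lock()
    next_value = 0
    arrival_order: List[int] = []

    def writer() -> None:
        nonlocal next_value
        for _ in range(writes_per_thread):
            with counter_lock:
                value = next_value
                next_value += 1
            # The simulated insert happens outside the lock, so another
            # thread may land a larger value first.
            arrival_order.append(value)

    threads = [threading.Thread(target=writer) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return arrival_order


def count_out_of_order(values: List[int]) -> int:
    """Count entries that arrive below the running maximum."""
    out_of_order, high = 0, -1
    for v in values:
        if v < high:
            out_of_order += 1
        else:
            high = v
    return out_of_order
```

Calling `count_out_of_order(run_writers(4, 250))` gives a rough sense of how far the sequence strays from strictly increasing; the number varies from run to run.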