- Type: Improvement
- Resolution: Unresolved
- Priority: Major - P3
- None
- Affects Version/s: None
- Component/s: Index Maintenance, Write Ops
- None
- Storage Execution
- Execution Team 2024-04-29, Execution Team 2024-05-13
Right now, the Collection layer gets a batch of inserts from the write ops and propagates the batch down to the RecordStore layer, but then passes records to the IndexCatalog one at a time. This has at least three downsides:
- It goes through all indexes on one document before going to the next. It is likely to be more CPU-friendly (cache, branch predictor, etc.) to go through all documents in the first index, then do the same for the second index, especially if they are different kinds of indexes and use different code paths.
- It reduces the opportunity to avoid duplicated work with write cursors if we do SERVER-55337. This may be most evident for wildcard indexes, where you really want to insert all of the keys for all documents in a given path before moving to the next path. Also, whenever multiple documents in the batch generate the same index key, those keys would be inserted right next to each other, which should be really fast.
- It prevents deduping the multikeyMetadataKeys that are common between documents in a batch. There is a pretty high likelihood of many keys being common, if not of all documents generating an identical set of keys.
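To make the difference in loop order concrete, here is a minimal toy sketch in C++. It is not server code: `Doc`, `ToyIndex`, `insertDocumentMajor`, and `insertIndexMajor` are hypothetical names invented for illustration, each document produces exactly one key per index, and "writing" is just appending to a vector. It only assumes that an index-major pass could sort the keys for one index and collapse the metadata keys shared across the batch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; not real MongoDB server types.
struct Doc {
    std::string key;                         // the single key this toy document generates
    std::vector<std::string> multikeyPaths;  // paths that would need a multikey metadata key
};

struct ToyIndex {
    std::string name;
    std::vector<std::string> writes;  // index keys, in the order they were written
    std::size_t metadataWrites;       // number of metadata-key writes issued
};

// Document-major order (the behavior described above): every index is
// visited for one document before moving on to the next document.
void insertDocumentMajor(std::vector<ToyIndex>& indexes, const std::vector<Doc>& batch) {
    for (const Doc& doc : batch) {
        for (ToyIndex& index : indexes) {
            index.writes.push_back(doc.key);
            // Metadata keys are written per document, so repeats are not collapsed.
            index.metadataWrites += doc.multikeyPaths.size();
        }
    }
}

// Index-major order (the proposed behavior): the whole batch is processed
// for one index before moving on to the next index.
void insertIndexMajor(std::vector<ToyIndex>& indexes, const std::vector<Doc>& batch) {
    for (ToyIndex& index : indexes) {
        std::vector<std::string> keys;
        std::set<std::string> metadataKeys;  // std::set drops duplicates across the batch
        for (const Doc& doc : batch) {
            keys.push_back(doc.key);
            metadataKeys.insert(doc.multikeyPaths.begin(), doc.multikeyPaths.end());
        }
        assert(keys.size() == batch.size());  // one key per document in this toy model

        // Sorting puts equal and neighboring keys next to each other, which is
        // the access pattern a write cursor would benefit from.
        std::sort(keys.begin(), keys.end());
        index.writes.insert(index.writes.end(), keys.begin(), keys.end());
        index.metadataWrites += metadataKeys.size();
    }
}
```

With a batch of three documents that share the paths `a.x` and `a.y`, the document-major version issues one metadata write per path per document, while the index-major version issues one per distinct path. A real implementation would have to deal with key generation, error handling, and storage-engine cursors, none of which are modeled here.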
is related to
- SERVER-55337 Use cursors for index writes at the SortedDataInterface/IndexCatalog layer (Backlog)
related to
- SERVER-55341 WiredTigerRecordStore should reserve contiguous RecordIds for batch insert (Closed)
- SERVER-81568 Batch Bulk inserts during the indexing collection scan (Closed)