[SERVER-83398] Unnecessary copy of the index keys when position the cursor Created: 17/Nov/23  Updated: 31/Jan/24

Status: Open
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Major - P3
Reporter: Chenhao Qu Assignee: Backlog - Storage Execution Team
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Assigned Teams:
Storage Execution
Sprint: Execution Team 2024-01-22
Participants:

 Description   

I found each time we move the index cursor, we copy the whole key:

    /**
     * This must be called after moving the cursor to update our cached position. It should not
     * be called after a restore that did not restore to original state since that does not
     * logically move the cursor until the following call to next().
     */
    void updatePosition(const char* newKeyData, size_t newKeySize) {
        // Store (a copy of) the new item data as the current key for this cursor.
        _key.resetFromBuffer(newKeyData, newKeySize);
 
        _eof = atOrPastEndPointAfterSeeking();
        if (_eof)
            return;
 
        updateIdAndTypeBits();
    }

This is unnecessary. We only need to cache the pointer if we still hold the cursor position. We should copy the data only when we yield the cursor.

In addition to that, if the key type is KeyStringEntry, we do another copy:

    boost::optional<KeyStringEntry> getKeyStringEntry() {
        if (_eof)
            return {};
 
        // Most keys will have a RecordId appended to the end, with the exception of the _id index
        // and timestamp unsafe unique indexes. The contract of this function is to always return a
        // KeyString with a RecordId, so append one if it does not exists already.
        if (_unique &&
            (_isIdIndex ||
             _key.getSize() ==
                 key_string::getKeySize(_key.getBuffer(), _key.getSize(), _ordering, _typeBits))) {
            // Create a copy of _key with a RecordId. Because _key is used during cursor restore(),
            // appending the RecordId would cause the cursor to be repositioned incorrectly.
            key_string::Builder keyWithRecordId(_key);
            keyWithRecordId.appendRecordId(_id);
            keyWithRecordId.setTypeBits(_typeBits);
 
            LOGV2_TRACE_CURSOR(20090,
                               "returning {keyWithRecordId} {id}",
                               "keyWithRecordId"_attr = keyWithRecordId,
                               "id"_attr = _id);
            return KeyStringEntry(keyWithRecordId.getValueCopy(), _id);
        }
 
        _key.setTypeBits(_typeBits);
 
        LOGV2_TRACE_CURSOR(20091, "returning {key} {id}", "key"_attr = _key, "id"_attr = _id);
        return KeyStringEntry(_key.getValueCopy(), _id);
    }

The necessity of this is also questionable.


Generated at Thu Feb 08 06:52:04 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.