[SERVER-73757] Calling storageSize() on ephemeral temporary record stores returns 0 Created: 08/Feb/23  Updated: 23/Feb/23

Status: Backlog
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Major - P3
Reporter: Adi Agrawal Assignee: Backlog - Storage Execution Team
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-74335 Spill to disk tests for $group must b... Closed
is related to SERVER-73717 When using in-memory storage engine s... Closed
is related to SERVER-73311 Report group spill stats in serverSta... Closed
Assigned Teams:
Storage Execution
Operating System: ALL
Steps To Reproduce:
  • create a temporary record store
  • insert some records
  • call storageSize()
Participants:

 Description   

When we create a temporary record store, the field _tracksSizeAdjustment is set to false

On inserting records, the method _changeNumRecordsAndDataSize is called to add the statistics. The method returns without modifying the _sizeInfo when _tracksSizeAdjustment is set to false.

When calling storageSize() to get the size, if storage engine is ephemeral, dataSize() is called which uses _sizeInfo.



 Comments   
Comment by Louis Williams [ 10/Feb/23 ]

When you say this are you referring specifically to the SizeStorer? The observation we made is that the $group spill stats seem to be incorrect only on inMemory. The reason is presumably as Adi observes above, which is that the storageSize() method uses WiredTigerUtil::getStatisticsValue() for non-ephemeral record stores.

Yeah, I think we're in agreement on the problem. We explicitly don't update the SizeStorer for these temporary record stores. The getStatisticsValue() call would not return useful information, so on inMemory, we return whatever is in the size storer. But in this case, it always returns zero.

Comment by David Storch [ 09/Feb/23 ]

 
What is the priority of this?

The impact that we're observing is that the execution stats exposed by SBE's $group implementation in explain and serverStatus are wrong when spilling occurs and the in-memory storage engine is in use. So it's a diagnostics problem, which might reduce its priority. I think the SE team is in a better position to evaluate how we want to prioritize fixes for inMemory specifically against other work. (I don't know the adoption of inMemory amongst our user base, for example.) I would add that if we're going to work on any improvement in this area, I would argue it would be more important for us to spend time on related ticket SERVER-73717.

No, no TemporaryRecordStores track size information.

When you say this are you referring specifically to the SizeStorer? The observation we made is that the $group spill stats seem to be incorrect only on inMemory. The reason is presumably as Adi observes above, which is that the storageSize() method uses WiredTigerUtil::getStatisticsValue() for non-ephemeral record stores.
 

 

Comment by Louis Williams [ 09/Feb/23 ]

it's only true for ephemeral temporary record stores. Is that correct?

No, no TemporaryRecordStores track size information.

Comment by Louis Williams [ 09/Feb/23 ]

StorEx can take this ticket. If I remember why we made this decision originally, it was in part because the SizeStorer has a slow memory leak: SERVER-33494. We don't delete size information for record stores once they have been dropped. So as part of this fix, we may need to evaluate the consequences of creating more stale entires in the SizeStorer.

What is the priority of this?

Comment by Adi Agrawal [ 08/Feb/23 ]

Thanks for catching this. It should apply only to ephemeral record stores.

This is because of how storageSize() is implemented. If the storage engine is in memory, then storageSize() calls dataSize() which checks the _sizeInfo. However, if the storage engine is not in memory, the size is obtained from
WiredTigerUtil::getStatisticsValue 
 

Comment by David Storch [ 08/Feb/23 ]

If it applies to non-ephemeral record stores, then how come the problem we observed around the $group spilling stats in SERVER-73311 only manifested for the inMemory storage engine?

Comment by Adi Agrawal [ 08/Feb/23 ]

I think it applies to all temporary record stores. The field _tracksSizeAdjustment is set to false regardless of if the record store is ephemeral. (https://github.com/10gen/mongo/blob/master/src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp/#L1811-L1812)

Comment by David Storch [ 08/Feb/23 ]

adi.agrawal@mongodb.com to clarify, this isn't true for all temporary record stores, it's only true for ephemeral temporary record stores. Is that correct?

louis.williams@mongodb.com is this something that your team could schedule a fix for, assuming you still believe that it is incorrect behavior?

Generated at Thu Feb 08 06:25:32 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.