[SERVER-73757] Calling storageSize() on ephemeral temporary record stores returns 0 Created: 08/Feb/23 Updated: 23/Feb/23 |
|
| Status: | Backlog |
| Project: | Core Server |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Bug | Priority: | Major - P3 |
| Reporter: | Adi Agrawal | Assignee: | Backlog - Storage Execution Team |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Assigned Teams: |
Storage Execution
|
||||||||||||||||
| Operating System: | ALL | ||||||||||||||||
| Steps To Reproduce: |
|
||||||||||||||||
| Participants: | |||||||||||||||||
| Description |
|
When we create a temporary record store, the field _tracksSizeAdjustment is set to false. On inserting records, the method _changeNumRecordsAndDataSize is called to add the statistics. The method returns without modifying the _sizeInfo when _tracksSizeAdjustment is set to false. When calling storageSize() to get the size, if storage engine is ephemeral, dataSize() is called which uses _sizeInfo. |
| Comments |
| Comment by Louis Williams [ 10/Feb/23 ] |
Yeah, I think we're in agreement on the problem. We explicitly don't update the SizeStorer for these temporary record stores. The getStatisticsValue() call would not return useful information, so on inMemory, we return whatever is in the size storer. But in this case, it always returns zero. |
| Comment by David Storch [ 09/Feb/23 ] |
The impact that we're observing is that the execution stats exposed by SBE's $group implementation in explain and serverStatus are wrong when spilling occurs and the in-memory storage engine is in use. So it's a diagnostics problem, which might reduce its priority. I think the SE team is in a better position to evaluate how we want to prioritize fixes for inMemory specifically against other work. (I don't know the adoption of inMemory amongst our user base, for example.) I would add that if we're going to work on any improvement in this area, I would argue it would be more important for us to spend time on related ticket
When you say this are you referring specifically to the SizeStorer? The observation we made is that the $group spill stats seem to be incorrect only on inMemory. The reason is presumably as Adi observes above, which is that the storageSize() method uses WiredTigerUtil::getStatisticsValue() for non-ephemeral record stores.
|
| Comment by Louis Williams [ 09/Feb/23 ] |
No, no TemporaryRecordStores track size information. |
| Comment by Louis Williams [ 09/Feb/23 ] |
|
StorEx can take this ticket. If I remember why we made this decision originally, it was in part because the SizeStorer has a slow memory leak: What is the priority of this? |
| Comment by Adi Agrawal [ 08/Feb/23 ] |
|
Thanks for catching this. It should apply only to ephemeral record stores. This is because of how storageSize() is implemented. If the storage engine is in memory, then storageSize() calls dataSize() which checks the _sizeInfo. However, if the storage engine is not in memory, the size is obtained from |
| Comment by David Storch [ 08/Feb/23 ] |
|
If it applies to non-ephemeral record stores, then how come the problem we observed around the $group spilling stats in |
| Comment by Adi Agrawal [ 08/Feb/23 ] |
|
|
| Comment by David Storch [ 08/Feb/23 ] |
|
adi.agrawal@mongodb.com to clarify, this isn't true for all temporary record stores, it's only true for ephemeral temporary record stores. Is that correct? louis.williams@mongodb.com is this something that your team could schedule a fix for, assuming you still believe that it is incorrect behavior? |