There is a 12% regression in the SortGroupWithBigDocuments test. As the name suggests, the test inserts big documents, and the operations graphed on Evergreen are sort and group.
The initial analysis is provided on the BF linked to this ticket. In summary, group and sort operations regress by about 12%, and this is caused by the changes made in
The FTDC data shows that the cache fill ratio is very low (< 1%), as is the dirty fill ratio (< 0.5%). Despite that, the repositioning process is triggered. The reason remains unknown; the current guess is that it is due to large pages in the cache. As a result, more pages (mainly dirty ones) are selected for forced eviction, and most of them are successfully evicted.
Since application threads (mainly in cursor next and search operations) trigger the repositioning process, the latency of those operations increases, which leads to this regression.
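For anyone re-checking the fill ratios quoted above, here is a minimal sketch of the arithmetic, assuming the ratios are taken relative to the configured cache size. The statistic names are the ones reported under the WiredTiger cache section of serverStatus; the helper name and the sample values are hypothetical and are not taken from the FTDC data on this ticket.

```python
# Statistic names as reported under serverStatus().wiredTiger.cache.
BYTES_IN_CACHE = "bytes currently in the cache"
DIRTY_BYTES = "tracked dirty bytes in the cache"
MAX_BYTES = "maximum bytes configured"


def cache_fill_ratios(cache_stats):
    """Return (fill_pct, dirty_pct) as percentages of the configured cache size."""
    max_bytes = cache_stats[MAX_BYTES]
    if max_bytes <= 0:
        raise ValueError("configured cache size must be positive")
    fill_pct = 100.0 * cache_stats[BYTES_IN_CACHE] / max_bytes
    dirty_pct = 100.0 * cache_stats[DIRTY_BYTES] / max_bytes
    return fill_pct, dirty_pct


# Hypothetical sample values, chosen only to illustrate the calculation.
sample = {
    MAX_BYTES: 16 * 1024**3,        # 16 GiB cache
    BYTES_IN_CACHE: 128 * 1024**2,  # 128 MiB resident
    DIRTY_BYTES: 64 * 1024**2,      # 64 MiB dirty
}
fill_pct, dirty_pct = cache_fill_ratios(sample)
print(f"fill={fill_pct:.2f}% dirty={dirty_pct:.2f}%")  # fill=0.78% dirty=0.39%
```

With ratios this low, eviction would not normally be expected to involve application threads, which is why the trigger for repositioning needs a separate explanation.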
- Does this affect any team outside of WT?
- How likely is it that this use case or problem will occur?
The regression is reproducible 100% of the time.
Acceptance Criteria (Definition of Done)
- Find out why the repositioning process is triggered even though the cache fill ratio is negligible.
- Restrict the repositioning process if doing so restores the lost performance.