It is time we had a proper mechanism to stress test Cedar in a staging environment, either the existing staging environment or a separate one. This will be a heavy investment, but well worth the effort for our current development cycle.
Things to think about:
- How big does the staging environment need to be?
- How will we realistically replicate Cedar load? There are various "services", such as buildlogger and test results, that compete for the same resources as load increases. Their data vary in shape, processing, frequency of writes and reads (gRPC versus HTTP), etc. This is made more complicated by the intentional segmentation of data (different storage for metadata and data, separation into small components/units that may only make sense when combined with other components, etc.).
- Insight into Evergreen's load would be helpful here, if it exists.
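
As a starting point for discussing the load mix, here is a rough Python sketch of how a weighted request plan could be described before any real client is written. Everything in it is an assumption for illustration: the workload names, the protocol assigned to each, the weights, and the payload sizes are placeholders, not measurements from Cedar or Evergreen. It sends no traffic; it only builds a reproducible plan that a load driver could later consume.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str           # hypothetical label for one kind of request
    protocol: str       # "grpc" or "http" (assumed, not taken from Cedar)
    weight: float       # relative share of total requests (placeholder)
    payload_bytes: int  # assumed typical payload size (placeholder)


# Placeholder mix; real numbers would have to come from production metrics.
MIX = [
    Workload("buildlogger-write", "grpc", 6.0, 64 * 1024),
    Workload("test-results-write", "grpc", 3.0, 4 * 1024),
    Workload("test-results-read", "http", 1.0, 512),
]


def plan_requests(mix, total, seed=0):
    """Return a list of `total` workloads sampled in proportion to their weights.

    A fixed seed makes the plan reproducible across stress-test runs.
    """
    rng = random.Random(seed)
    weights = [w.weight for w in mix]
    return rng.choices(mix, weights=weights, k=total)


def summarize(plan):
    """Map each workload name to (request count, total payload bytes)."""
    summary = {}
    for w in plan:
        count, size = summary.get(w.name, (0, 0))
        summary[w.name] = (count + 1, size + w.payload_bytes)
    return summary


if __name__ == "__main__":
    plan = plan_requests(MIX, total=1000, seed=42)
    for name, (count, size) in sorted(summarize(plan).items()):
        print(f"{name}: {count} requests, {size} bytes")
```

Keeping the mix as plain data means the weights can be swapped for measured values once we have insight into Evergreen's actual load, without touching whatever driver ends up issuing the requests.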