Run accuracy tests in Compass nightly vs cloud dev


    • Type: Task
    • Resolution: Done
    • Priority: Major - P3
    • No version
    • Affects Version/s: None
    • Component/s: GAI
    • None
    • 3
    • Iteration Minmi, Iteration Nodosaurus
    • Not Needed

      We'd like to know when prompt changes or other regressions affect the accuracy of the generative AI results. To do this, we'll run the accuracy tests that were recently added to Compass (scripts/ai-accuracy-tests.js) on a nightly basis. The run should fail when accuracy drops below a set threshold.
      Currently these tests might run synchronously; we might need to parallelize them if the run is really slow.
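
      A minimal sketch of how the threshold check and the parallelization could fit together. This is not the contents of scripts/ai-accuracy-tests.js: every name here (mapWithConcurrency, computeAccuracy, runAccuracySuite, runCase, the concurrency and threshold options) is hypothetical, and it assumes each test case resolves to a boolean pass/fail.

```javascript
// Hypothetical sketch: none of these names come from scripts/ai-accuracy-tests.js.

// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner repeatedly claims the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Fraction of cases that passed; an empty run counts as 0.
function computeAccuracy(outcomes) {
  if (outcomes.length === 0) return 0;
  return outcomes.filter(Boolean).length / outcomes.length;
}

// Run every case through `runCase` (assumed to resolve to true/false)
// and report whether accuracy fell below the threshold.
async function runAccuracySuite(cases, runCase, { concurrency, threshold }) {
  const outcomes = await mapWithConcurrency(cases, concurrency, runCase);
  const accuracy = computeAccuracy(outcomes);
  return { accuracy, failed: accuracy < threshold };
}
```

      A nightly job could call runAccuracySuite and set process.exitCode = 1 when failed is true, so the CI task is marked as failed. The concurrency limit is there because the generative AI backend is likely rate limited; the right value is an open question.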

              Assignee:
              Rhys Howell
              Reporter:
              Rhys Howell
              Votes:
              0
              Watchers:
              1

                Created:
                Updated:
                Resolved: