Run accuracy tests in Compass nightly vs cloud dev


    • Type: Task
    • Resolution: Done
    • Priority: Major - P3
    • No version
    • Affects Version/s: None
    • Component/s: GAI
    • None
    • 3
    • Iteration Minmi, Iteration Nodosaurus
    • Not Needed

      We'd like to know when prompt changes or other regressions affect the accuracy of the generative AI results. To do this, we'll run the accuracy tests that were recently added to Compass (scripts/ai-accuracy-tests.js) on a nightly basis. The nightly run should fail when accuracy drops below a set threshold.
      These tests may currently run synchronously, so we might need to parallelize them if the run is really slow.
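
      A minimal sketch of the two ideas above (a pass/fail threshold and bounded parallelism), to make the intent concrete. It does not reflect what scripts/ai-accuracy-tests.js actually contains: mapWithConcurrency, computeAccuracy, runNightly, the { passed } outcome shape, and the default threshold and concurrency values are all hypothetical.

```javascript
// Hypothetical sketch: none of these names come from scripts/ai-accuracy-tests.js.

// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index until none remain.
  async function runLane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}

// Fraction of outcomes with `passed === true`.
// An empty run counts as 0 so that it cannot pass the threshold by accident.
function computeAccuracy(outcomes) {
  if (outcomes.length === 0) return 0;
  const passedCount = outcomes.filter((outcome) => outcome.passed === true).length;
  return passedCount / outcomes.length;
}

// Run every test case through `runCase` and compare accuracy to the threshold.
// The defaults (0.8 and 4) are placeholders, not values taken from the ticket.
async function runNightly(testCases, runCase, options = {}) {
  const { minAccuracy = 0.8, concurrency = 4 } = options;
  const outcomes = await mapWithConcurrency(testCases, concurrency, runCase);
  const accuracy = computeAccuracy(outcomes);
  return { accuracy, passed: accuracy >= minAccuracy };
}
```

      A nightly job could call runNightly and set process.exitCode = 1 when passed is false, so the job is reported as failed. Capping concurrency rather than firing every request at once is a guess at what the AI backend would tolerate; the right limit would need to be measured.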

            Assignee:
            Rhys Howell
            Reporter:
            Rhys Howell
            Votes:
            0
            Watchers:
            1

              Created:
              Updated:
              Resolved: