  1. Compass
  2. COMPASS-7763

Evaluate GPT-4 model

    • Type: Investigation
    • Resolution: Done
    • Priority: Major - P3
    • No version
    • Affects Version/s: None
    • Component/s: GAI
    • 5
    • Iteration Zephyrosaurus
    • Not Needed
    • Developer Tools

      Test GPT-4 Turbo for query and aggregation generation in place of GPT-3.5 Turbo. We know that it is both more expensive and slower. It does, however, perform better on a number of benchmarks, which suggests it can create more elaborate pipelines with greater accuracy.

       

      https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo 

       

      `gpt-4-turbo-preview` is trained on data up to December 2023, while our current `gpt-3.5-turbo` only has data up to September 2021.

       

      To gauge the feasibility of switching models, we will run the generative AI accuracy tests several times with the new model and assess the following:

       

      Cost

      How much does running the new model cost at our usual token usage, compared with the current model? We can mostly quantify this without running the model, because the tokenizer is publicly available. However, we will not know the full cost until we have the token counts from the actual generations.
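
As a rough illustration of the comparison, the sketch below estimates per-generation cost from prompt and completion token counts. The `Pricing` type, the function names, and the per-1K-token prices are all hypothetical placeholders, not quoted rates; substitute the figures from the provider's pricing page and the token counts recorded by the accuracy tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per 1,000 tokens. The values used below are placeholders (assumptions)."""
    input_per_1k: float
    output_per_1k: float


def generation_cost(prompt_tokens: int, completion_tokens: int, pricing: Pricing) -> float:
    """Cost in USD of one generation, billed separately for input and output tokens."""
    return (
        (prompt_tokens / 1000) * pricing.input_per_1k
        + (completion_tokens / 1000) * pricing.output_per_1k
    )


def cost_ratio(prompt_tokens: int, completion_tokens: int,
               current: Pricing, candidate: Pricing) -> float:
    """How many times more expensive the candidate model is for the same token usage."""
    return (
        generation_cost(prompt_tokens, completion_tokens, candidate)
        / generation_cost(prompt_tokens, completion_tokens, current)
    )


# Placeholder prices for illustration only; replace with the published rates.
CURRENT_MODEL = Pricing(input_per_1k=0.0005, output_per_1k=0.0015)
CANDIDATE_MODEL = Pricing(input_per_1k=0.01, output_per_1k=0.03)

if __name__ == "__main__":
    ratio = cost_ratio(1000, 500, CURRENT_MODEL, CANDIDATE_MODEL)
    print(f"Candidate costs about {ratio:.1f}x the current model for this usage")
```

The ratio depends on the mix of prompt and completion tokens only when the two models price input and output in different proportions, so it is worth computing with the real token counts rather than a single blended figure.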

      Accuracy

      What percentage improvement on the accuracy tests does the newer model show? Does it support newer aggregation syntax?
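
A "% improvement" can be read two ways, so a minimal sketch of both may help when reporting results. The function names and the pass/total representation are assumptions for illustration; the accuracy test suite may report results in a different shape.

```python
from __future__ import annotations


def accuracy(passed: int, total: int) -> float:
    """Fraction of accuracy tests that passed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return passed / total


def improvement(baseline: float, candidate: float) -> tuple[float, float]:
    """Return (percentage-point change, relative percent change) of candidate over baseline."""
    if baseline <= 0:
        raise ValueError("baseline accuracy must be positive to compute a relative change")
    points = (candidate - baseline) * 100
    relative = (candidate - baseline) / baseline * 100
    return points, relative


if __name__ == "__main__":
    # Hypothetical numbers, not measured results.
    base = accuracy(30, 50)
    cand = accuracy(40, 50)
    points, relative = improvement(base, cand)
    print(f"{points:.1f} percentage points, {relative:.1f}% relative improvement")
```

Stating which of the two figures is being quoted avoids ambiguity, since the same pair of results gives different numbers for each.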

      Speed

      The new model should be less than 2x slower than the current one, at a minimum. To be measured with the updated accuracy test results.
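
One way to check this criterion, sketched under the assumption that it means the candidate's latency should stay under 2x the current model's, is to compare median per-generation latencies from the accuracy test runs. The function names and the choice of the median are illustrative assumptions.

```python
from statistics import median
from typing import Sequence


def slowdown_factor(current_latencies_s: Sequence[float],
                    candidate_latencies_s: Sequence[float]) -> float:
    """Ratio of median candidate latency to median current latency (1.0 means equal)."""
    return median(candidate_latencies_s) / median(current_latencies_s)


def within_speed_budget(current_latencies_s: Sequence[float],
                        candidate_latencies_s: Sequence[float],
                        max_slowdown: float = 2.0) -> bool:
    """True if the candidate is less than max_slowdown times slower than the current model."""
    return slowdown_factor(current_latencies_s, candidate_latencies_s) < max_slowdown


if __name__ == "__main__":
    # Hypothetical timings in seconds, not measured results.
    current = [1.0, 2.0, 3.0]
    candidate = [3.0, 3.0, 6.0]
    print(slowdown_factor(current, candidate), within_speed_budget(current, candidate))
```

The median is used here so that a few unusually slow requests do not dominate the comparison; a high percentile could be tracked alongside it if tail latency matters.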

      Availability

      Which regions is the model available in? Can we set up a failover region, as we planned for GPT-3.5?

      Fine-tuning capability (milestone 4 support)

      This may be less necessary if the training data (up to December 2023) already covers newer MongoDB syntax out of the box.

       

      Once we’ve looked at the results for these indicators, we will decide whether to use the newer model in our backend.

            Assignee:
            rhys.howell@mongodb.com Rhys Howell
            Reporter:
            rhys.howell@mongodb.com Rhys Howell
            Votes:
            0
            Watchers:
            1
