[COMPASS-7130] Update schema format from mongodb-schema to pass highest probability Created: 21/Aug/23  Updated: 10/Nov/23  Resolved: 07/Nov/23

Status: Closed
Project: Compass
Component/s: GAI, Schema
Affects Version/s: None
Fix Version/s: 1.41.0

Type: Task Priority: Major - P3
Reporter: Rhys Howell Assignee: Rhys Howell
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
Story Points: 5
Documentation Changes: Not Needed
Sprint: Iteration Minmi, Iteration Nodosaurus

 Description   

We now pass only one type in our prompt. Let's update `getSimplifiedSchema` in `mongodb-schema` so that the highest-probability type comes first in the types array.
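
A minimal sketch of the sort described above, assuming each field carries a `types` array whose entries have a `bsonType` and a `probability`. The interfaces and helper names here (`TypeEntry`, `FieldEntry`, `sortTypesByProbability`, `mostLikelyType`) are hypothetical and used only for illustration; the real `mongodb-schema` types and implementation may differ.

```typescript
// Hypothetical shapes for illustration only; not the actual mongodb-schema types.
interface TypeEntry {
  bsonType: string;
  probability: number; // assumed: share of sampled values that had this type
  fields?: Record<string, FieldEntry>; // assumed: subfields for document types
}

interface FieldEntry {
  types: TypeEntry[];
}

type Schema = Record<string, FieldEntry>;

// Returns a copy of the schema in which every `types` array is ordered
// from highest to lowest probability, recursing into nested fields.
function sortTypesByProbability(schema: Schema): Schema {
  const sorted: Schema = {};
  for (const [name, field] of Object.entries(schema)) {
    sorted[name] = {
      types: [...field.types] // copy so the input array is not mutated
        .sort((a, b) => b.probability - a.probability)
        .map((t) =>
          t.fields ? { ...t, fields: sortTypesByProbability(t.fields) } : t
        ),
    };
  }
  return sorted;
}

// With sorted types, a consumer that passes one type can take the first entry.
function mostLikelyType(field: FieldEntry): string | undefined {
  return field.types[0]?.bsonType;
}

const example: Schema = {
  age: {
    types: [
      { bsonType: 'String', probability: 0.1 },
      { bsonType: 'Int32', probability: 0.9 },
    ],
  },
};
const sortedExample = sortTypesByProbability(example);
```

Sorting at the source means a consumer that only passes one type can read `types[0]` without repeating the comparison itself.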

Old description (below):
This ticket involves a bit of an investigation.
We're currently returning a simplified schema generated in `mongodb-schema`: https://github.com/mongodb-js/mongodb-schema/blob/c15e4b70972163182a1189ccb64533d5fee154b4/src/schema-analyzer.ts#L580 
This schema representation uses a fair number of characters because it represents every field in an object-like format, with an array of types and any possible subfields nested inside.
We'd like the format to take up fewer tokens if possible. Additionally, we'd like the AI to better understand the schema.
This ticket involves checking whether a condensed format is interpreted better by the AI model, and using it if so.

For reference, Maurizio did some work on a more condensed schema representation in this branch, which is on the receiving end of the schema: https://github.com/10gen/compass-mongodb-com/compare/mql-only-poc. It probably makes sense to keep sending the schema to the server with some detail and then condense it into prompt shape on the backend, so that we have more freedom to change the prompt on the fly.
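
One possible shape for the backend condensing step, sketched under assumptions: the schema arrives with a `types` array per field (most likely type first), and the prompt wants one `path: type` line per field. The names `PromptTypeEntry`, `PromptFieldEntry`, and `condenseForPrompt` are hypothetical, and this is not the format used in the branch linked above.

```typescript
// Hypothetical input shapes for illustration only.
interface PromptTypeEntry {
  bsonType: string;
  fields?: Record<string, PromptFieldEntry>; // assumed: subfields for documents
}

interface PromptFieldEntry {
  types: PromptTypeEntry[]; // assumed: most likely type is first
}

// Flattens the detailed schema into compact "path: type" lines,
// keeping only the first type of each field.
function condenseForPrompt(
  schema: Record<string, PromptFieldEntry>,
  prefix = ''
): string[] {
  const lines: string[] = [];
  for (const [name, field] of Object.entries(schema)) {
    const first = field.types[0];
    if (!first) continue; // skip fields with no recorded type
    const path = prefix + name;
    lines.push(`${path}: ${first.bsonType}`);
    if (first.fields) {
      // Nested fields are emitted with a dotted path.
      lines.push(...condenseForPrompt(first.fields, `${path}.`));
    }
  }
  return lines;
}

const condensedExample = condenseForPrompt({
  name: { types: [{ bsonType: 'String' }] },
  address: {
    types: [
      {
        bsonType: 'Document',
        fields: { city: { types: [{ bsonType: 'String' }] } },
      },
    ],
  },
}).join('\n');
```

Keeping this step on the server side, as suggested above, would let the line format change without shipping a new client.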



 Comments   
Comment by Githook User [ 10/Nov/23 ]

Author:

{'name': 'Rhys', 'email': 'Anemy@users.noreply.github.com', 'username': 'Anemy'}

Message: chore(deps): bump mongodb-schema for simplified schema type sort COMPASS-7130 (#5069)
Branch: beta-releases
https://github.com/mongodb-js/compass/commit/c787db5509648c04a843991556d9d2f4af0e3db5

Comment by Githook User [ 07/Nov/23 ]

Author:

{'name': 'Rhys', 'email': 'Anemy@users.noreply.github.com', 'username': 'Anemy'}

Message: chore(deps): bump mongodb-schema for simplified schema type sort COMPASS-7130 (#5069)
Branch: 7321-dev
https://github.com/mongodb-js/compass/commit/c787db5509648c04a843991556d9d2f4af0e3db5

Comment by Githook User [ 06/Nov/23 ]

Author:

{'name': 'Rhys', 'email': 'Anemy@users.noreply.github.com', 'username': 'Anemy'}

Message: chore(deps): bump mongodb-schema for simplified schema type sort COMPASS-7130 (#5069)
Branch: main
https://github.com/mongodb-js/compass/commit/c787db5509648c04a843991556d9d2f4af0e3db5

Generated at Wed Feb 07 22:45:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.