[SERVER-11919] Add Tokenizer to JSON parser Created: 02/Dec/13  Updated: 10/May/22

Status: Backlog
Project: Core Server
Component/s: Internal Code
Affects Version/s: None
Fix Version/s: None

Type: Improvement Priority: Minor - P4
Reporter: Shaun Verch Assignee: DO NOT USE - Backlog - Platform Team
Resolution: Unresolved Votes: 1
Labels: move-sa, neweng
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Depends
is depended on by SERVER-11920 Use parseNumberFromString in the JSON... Backlog
Related
related to TOOLS-183 JSON parser needs better context, val... Accepted
is related to SERVER-5677 rewrite json parser Closed
Participants:

 Description   

The JSON parser currently reads directly from the JSON string buffer and converts to BSON as it goes, using recursive descent. This ticket proposes adding a tokenizer as an initial stage before the conversion to BSON. This may improve performance in some cases, since it could reduce the number of string comparisons needed to determine which "$<keyword>" or which "<constructor>()" is being used for extended JSON types. It may also facilitate better error messages.
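To make the proposal concrete, here is a minimal sketch of what a separate tokenizing stage could look like. It is not the server's code: TokenKind, Token, tokenText and tokenize are hypothetical names chosen for illustration, and the number and string rules are deliberately loose (validation is assumed to happen in a later stage).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for illustration; not the server's actual types.
enum class TokenKind {
    LBrace, RBrace, LBracket, RBracket, LParen, RParen, Colon, Comma,
    String, Number, Identifier, Error
};

// A token records where it sits in the input, so later stages know its
// exact length and error messages can point at a position.
struct Token {
    TokenKind kind;
    std::size_t offset;
    std::size_t length;
};

// Returns the slice of the input covered by a token.
std::string tokenText(const std::string& in, const Token& t) {
    return in.substr(t.offset, t.length);
}

// Splits the input into tokens. Stops after the first Error token.
std::vector<Token> tokenize(const std::string& in) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char c = static_cast<unsigned char>(in[i]);
        if (std::isspace(c)) { ++i; continue; }
        const std::size_t start = i;
        TokenKind kind = TokenKind::Error;
        switch (c) {
            case '{': kind = TokenKind::LBrace; ++i; break;
            case '}': kind = TokenKind::RBrace; ++i; break;
            case '[': kind = TokenKind::LBracket; ++i; break;
            case ']': kind = TokenKind::RBracket; ++i; break;
            case '(': kind = TokenKind::LParen; ++i; break;
            case ')': kind = TokenKind::RParen; ++i; break;
            case ':': kind = TokenKind::Colon; ++i; break;
            case ',': kind = TokenKind::Comma; ++i; break;
            case '"':
                ++i;  // opening quote
                while (i < in.size() && in[i] != '"') {
                    if (in[i] == '\\' && i + 1 < in.size()) ++i;  // skip escaped char
                    ++i;
                }
                if (i < in.size()) { ++i; kind = TokenKind::String; }  // closing quote
                break;  // otherwise unterminated: kind stays Error
            default:
                if (std::isdigit(c) || c == '-') {
                    // Loose scan: digits, '.', exponent marker, sign after exponent.
                    ++i;
                    while (i < in.size()) {
                        const unsigned char d = static_cast<unsigned char>(in[i]);
                        const bool signAfterExp = (d == '+' || d == '-') &&
                            (in[i - 1] == 'e' || in[i - 1] == 'E');
                        if (std::isdigit(d) || d == '.' || d == 'e' || d == 'E' || signAfterExp) ++i;
                        else break;
                    }
                    kind = TokenKind::Number;
                } else if (std::isalpha(c) || c == '_' || c == '$') {
                    // Bare words such as constructor names, true/false/null.
                    ++i;
                    while (i < in.size()) {
                        const unsigned char d = static_cast<unsigned char>(in[i]);
                        if (std::isalnum(d) || d == '_' || d == '$') ++i;
                        else break;
                    }
                    kind = TokenKind::Identifier;
                } else {
                    ++i;  // unknown character: emit a one-character Error token
                }
        }
        out.push_back(Token{kind, start, i - start});
        if (kind == TokenKind::Error) return out;
    }
    return out;
}
```

With tokens in hand, a later stage could classify each "$<keyword>" string or constructor identifier once (for example with a single table lookup) instead of comparing against each candidate during parsing, and the recorded offsets give error messages a position to report. Whether this actually helps performance would need to be measured.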

Because the JSON parser does not tokenize as a separate stage, it also cannot use parseNumberFromString, which is currently our standard way to read numbers out of strings but requires the length of the number to be known beforehand.
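The length requirement can be illustrated with a small sketch. Once a tokenizer has delimited a number as an (offset, length) pair, a parser that needs the exact extent of the number can be handed just that slice. The function below is a hypothetical stand-in built on std::stoll; it is not parseNumberFromString and makes no claim about that function's signature or behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a length-bounded integer parser.
// Parses exactly input[offset, offset + length) and fails unless the
// whole slice is a valid base-10 integer.
bool parseWholeSlice(const std::string& input,
                     std::size_t offset,
                     std::size_t length,
                     long long* out) {
    assert(offset <= input.size() && length <= input.size() - offset);
    const std::string slice = input.substr(offset, length);
    try {
        std::size_t consumed = 0;
        const long long value = std::stoll(slice, &consumed, 10);
        if (consumed != slice.size()) {
            return false;  // trailing characters inside the token
        }
        *out = value;
        return true;
    } catch (const std::invalid_argument&) {
        return false;  // no digits at all
    } catch (const std::out_of_range&) {
        return false;  // does not fit in long long
    }
}
```

Without a tokenizing stage the caller does not know where the number ends until it has already scanned it, which is the situation the description refers to.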



 Comments   
Comment by Steven Vannelli [ 10/May/22 ]

Moving this ticket to the Backlog and removing the "Backlog" fixVersion as per our latest policy for using fixVersions.
