Type: Improvement
Resolution: Fixed
Priority: Major - P3
Fix Version/s: 7.2.0-rc0
Affects Version/s: None
Component/s: None
Labels:
Assigned Teams: Service Arch
Backwards Compatibility: Fully Compatible
Sprint: Service Arch 2023-10-02, Service Arch 2023-10-16, Service Arch 2023-10-30, Service Arch 2023-11-13
Confidence Status: None
Work Order: 3
Aha! Reference: None
Tracking Level: None
Risk Status: None
Exec Notes: None
Goal Name(s): None
Goal Link: None

When we parse a BSON array in IDL, we check that all the field names are as expected using the general-purpose NumberParser class. For larger arrays this check can take significant time. For example, here is the parsing of an oplog entry using code from SERVER-81101 with NumberParser:

--------------------------------------------------------------------------------------
Benchmark                                            Time             CPU   Iterations
--------------------------------------------------------------------------------------
BM_ParseOplogEntryWithNoStatementId                204 ns          204 ns      2681374
BM_ParseOplogEntryWithOneStatementId               239 ns          239 ns      2939634
BM_ParseOplogEntryWithMultiStatementId/2           328 ns          328 ns      2138516
BM_ParseOplogEntryWithMultiStatementId/8           498 ns          498 ns      1405159
BM_ParseOplogEntryWithMultiStatementId/64         1985 ns         1985 ns       352795
BM_ParseOplogEntryWithMultiStatementId/512       14392 ns        14392 ns        48773
BM_ParseOplogEntryWithMultiStatementId/1000      27760 ns        27760 ns        25179

(The parameter is the number of entries in the statementID array; the first two benchmarks are not using arrays.)

Here is the same benchmark with all array field name checking removed:

--------------------------------------------------------------------------------------
Benchmark                                            Time             CPU   Iterations
--------------------------------------------------------------------------------------
BM_ParseOplogEntryWithNoStatementId                201 ns          201 ns      2642658
BM_ParseOplogEntryWithOneStatementId               236 ns          236 ns      2963950
BM_ParseOplogEntryWithMultiStatementId/2           287 ns          287 ns      2437392
BM_ParseOplogEntryWithMultiStatementId/8           399 ns          399 ns      1756477
BM_ParseOplogEntryWithMultiStatementId/64         1035 ns         1035 ns       676153
BM_ParseOplogEntryWithMultiStatementId/512        6088 ns         6088 ns       115098
BM_ParseOplogEntryWithMultiStatementId/1000      11527 ns        11526 ns        60909

And here is the same benchmark with the field name check done using the C++ std::from_chars function:

--------------------------------------------------------------------------------------
Benchmark                                            Time             CPU   Iterations
--------------------------------------------------------------------------------------
BM_ParseOplogEntryWithNoStatementId                204 ns          204 ns      2627978
BM_ParseOplogEntryWithOneStatementId               237 ns          237 ns      2951225
BM_ParseOplogEntryWithMultiStatementId/2           298 ns          298 ns      2350274
BM_ParseOplogEntryWithMultiStatementId/8           401 ns          401 ns      1743373
BM_ParseOplogEntryWithMultiStatementId/64         1138 ns         1138 ns       614838
BM_ParseOplogEntryWithMultiStatementId/512        7032 ns         7032 ns        99665
BM_ParseOplogEntryWithMultiStatementId/1000      13535 ns        13534 ns        51844
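
For illustration, a from_chars-based check of array field names might look roughly like the sketch below. This is not the code from the patch: the helper names isExpectedArrayIndex and hasSequentialFieldNames are hypothetical, the field names are passed as plain string views rather than BSON elements, and rejecting leading zeros is an assumption about what "as expected" means.

```cpp
#include <charconv>
#include <cstdint>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper (not the server's API): returns true if `name` is
// exactly the decimal form of `expected`, with no sign and no leading zeros.
bool isExpectedArrayIndex(std::string_view name, std::uint32_t expected) {
    // Assumption: names such as "" or "01" are not valid array indexes.
    if (name.empty() || (name.size() > 1 && name.front() == '0')) {
        return false;
    }
    std::uint32_t parsed = 0;
    const char* first = name.data();
    const char* last = first + name.size();
    auto [ptr, ec] = std::from_chars(first, last, parsed);
    // Require a successful parse that consumed the whole name.
    return ec == std::errc{} && ptr == last && parsed == expected;
}

// Hypothetical helper: checks that the names are "0", "1", "2", ... in order.
bool hasSequentialFieldNames(const std::vector<std::string_view>& names) {
    std::uint32_t expected = 0;
    for (std::string_view name : names) {
        if (!isExpectedArrayIndex(name, expected++)) {
            return false;
        }
    }
    return true;
}
```

std::from_chars does no locale handling, skips no whitespace, and does not allocate, which is plausibly why it is cheaper here than a general-purpose parser.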

I tried a few other things like encoding the expected field number and comparing that, and incrementing the expected field number represented as a string; they weren't faster than from_chars.
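
As a rough sketch of the "expected field number represented as a string" idea, the counter below keeps the expected index as decimal text and increments it in place, so each field name can be compared without parsing. The class name DecimalIndexCounter and the function namesMatchCounter are hypothetical, and this is not the code that was actually benchmarked.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical illustration: holds the next expected array index as decimal
// text, starting at "0".
class DecimalIndexCounter {
public:
    std::string_view str() const {
        return _digits;
    }

    // Adds one to the decimal text, carrying from the last digit leftwards.
    void increment() {
        for (std::size_t i = _digits.size(); i-- > 0;) {
            if (_digits[i] != '9') {
                ++_digits[i];
                return;
            }
            _digits[i] = '0';
        }
        // Every digit was '9', so the number grows by one digit.
        _digits.insert(_digits.begin(), '1');
    }

private:
    std::string _digits = "0";
};

// Hypothetical helper: compares each field name with the counter's text.
bool namesMatchCounter(const std::vector<std::string_view>& names) {
    DecimalIndexCounter counter;
    for (std::string_view name : names) {
        if (name != counter.str()) {
            return false;
        }
        counter.increment();
    }
    return true;
}
```

Most increments touch only the last digit, so the per-element cost is one string comparison plus a small update. The measurements above found that approaches like this were not faster than from_chars.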

These timings are on my Intel workstation so not too precise, but the differences are significant.
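
To put a number on the difference, the per-element cost of the check can be estimated by subtracting the no-check time from the with-check time and dividing by the array size. The sketch below does that for the 1000-element rows of the tables above. The function name perElementOverheadNs is hypothetical, and the result inherits the noise in the timings. It works out to roughly 16 ns per element for NumberParser and roughly 2 ns per element for from_chars.

```cpp
// Hypothetical helper: estimated cost of the field name check per array
// element, in nanoseconds, from two benchmark times for the same array size.
constexpr double perElementOverheadNs(double withCheckNs, double noCheckNs, int elements) {
    return (withCheckNs - noCheckNs) / elements;
}

// Times for BM_ParseOplogEntryWithMultiStatementId/1000, taken from the tables.
constexpr double kNumberParserNs = 27760.0;
constexpr double kNoCheckNs = 11527.0;
constexpr double kFromCharsNs = 13535.0;

constexpr double kNumberParserOverheadNs =
    perElementOverheadNs(kNumberParserNs, kNoCheckNs, 1000);
constexpr double kFromCharsOverheadNs =
    perElementOverheadNs(kFromCharsNs, kNoCheckNs, 1000);
```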

Ideally we wouldn't even have field names in BSON arrays but I think that ship has long since sailed.

causes: SERVER-82983 Fix ambiguity formatting DecimalCounter using libfmt in bsonelement.cpp (Closed)

Assignee: Patrick Freed
Reporter: Matthew Russotto
Participants: Billy Donahue, Githook User, Matthew Russotto, Patrick Freed
Votes: 0
Watchers: 6

Created: Sep 19 2023 02:12:16 PM UTC
Updated: Nov 09 2023 02:48:21 PM UTC
Resolved: Nov 06 2023 08:04:04 PM UTC
Confidence Status Last Update: 20/Sep/23 9:08 PM
