[SERVER-78480] Avoid unnecessary string copies in NumberParser Created: 27/Jun/23  Updated: 29/Oct/23  Resolved: 09/Oct/23

Status: Closed
Project: Core Server
Component/s: None
Affects Version/s: None
Fix Version/s: 7.2.0-rc0

Type: Improvement Priority: Major - P3
Reporter: Matt Boros Assignee: Patrick Freed
Resolution: Fixed Votes: 0
Labels: perf, techdebt
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Related
related to SERVER-62509 Write tests to stress ABT and Bonsai Closed
Assigned Teams:
Service Arch
Backwards Compatibility: Fully Compatible
Sprint: Service Arch 2023-10-02, Service Arch 2023-10-16
Participants:

 Description   

NumberParser is used in BSONElement, the SBE VM, and many _gen files. This class uses a helper function parseNumberFromStringHelper, where I see two unnecessary string copies: #1 and #2.

In an extreme case I came up with, parsing one million numbers from {$match: {a: 1, b: 1, c: 1, ...}} originally took 4 seconds, and with a few lines change to avoid a copy it took 2 seconds instead, and the patch was green.

We should also look at the Decimal128 constructors, they appear to make unnecessary copies as well.



 Comments   
Comment by Githook User [ 06/Oct/23 ]

Author:

{'name': 'Patrick Freed', 'email': 'patrick.freed@mongodb.com', 'username': 'patrickfreed'}

Message: SERVER-78480 Eliminate redundant copies in Decimal128 string parsing
Branch: master
https://github.com/mongodb/mongo/commit/e93490c661599645453e09d9bd9938df62cdb166

Comment by Patrick Freed [ 04/Oct/23 ]

The copies in the double path are actually necessary, since the underlying conversion function being used (strtod) requires a null-terminated string, which is not guaranteed for StringData::rawData. The newer from_chars doesn't have this requirement, but unfortunately it seems to be slower despite not having the copy.

------------------------------------------------------------------------------------------
Benchmark                                Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------------------
BM_NumberParserDouble10k/1            65.8 ns         65.8 ns     10633441 bytes_per_second=86.936M/s
BM_NumberParserDouble10k/8             574 ns          574 ns      1219148 bytes_per_second=96.4312M/s
BM_NumberParserDouble10k/64           4800 ns         4800 ns       145662 bytes_per_second=94.3676M/s
BM_NumberParserDouble10k/512         39810 ns        39809 ns        17591 bytes_per_second=90.0043M/s
BM_NumberParserDouble10k/4096       324067 ns       324058 ns         2159 bytes_per_second=89.0407M/s
BM_NumberParserDouble10k/32768     2606815 ns      2606747 ns          269 bytes_per_second=88.6063M/s
BM_NumberParserDouble10k/100000    7948380 ns      7948237 ns           88 bytes_per_second=88.6538M/s
BM_from_charsDouble10k/1               120 ns          120 ns      5847886 bytes_per_second=47.8316M/s
BM_from_charsDouble10k/8              1000 ns         1000 ns       700490 bytes_per_second=55.3387M/s
BM_from_charsDouble10k/64             8286 ns         8286 ns        84387 bytes_per_second=54.67M/s
BM_from_charsDouble10k/512           67177 ns        67176 ns        10415 bytes_per_second=53.337M/s
BM_from_charsDouble10k/4096         542507 ns       542457 ns         1290 bytes_per_second=53.192M/s
BM_from_charsDouble10k/32768       4346975 ns      4346905 ns          161 bytes_per_second=53.1353M/s
BM_from_charsDouble10k/100000     13271530 ns     13271182 ns           53 bytes_per_second=53.0956M/s
BM_strtodDouble10k/1                  55.4 ns         55.4 ns     12673668 bytes_per_second=103.33M/s
BM_strtodDouble10k/8                   492 ns          492 ns      1422592 bytes_per_second=112.489M/s
BM_strtodDouble10k/64                 4192 ns         4192 ns       166948 bytes_per_second=108.06M/s
BM_strtodDouble10k/512               34775 ns        34775 ns        20137 bytes_per_second=103.032M/s
BM_strtodDouble10k/4096             286776 ns       286770 ns         2442 bytes_per_second=100.618M/s
BM_strtodDouble10k/32768           2306171 ns      2306147 ns          304 bytes_per_second=100.156M/s
BM_strtodDouble10k/100000          7041471 ns      7041207 ns           99 bytes_per_second=100.074M/s

benchmark: https://github.com/patrickfreed/mongo/blob/parse-number-bm-old/src/mongo/base/parse_number_bm.cpp (the 10k refers to the min/max number being parsed, the parameter is the number of random numbers parsed)

The copies in the Decimal128 path were not necessary though, and removing them demonstrated some performance benefits, so I've gone ahead with just that.

Comment by Kyle Suarez [ 11/Jul/23 ]

We think that Service Arch owns this so punting this ticket over.

Generated at Thu Feb 08 06:38:28 UTC 2024 using Jira 9.7.1#970001-sha1:2222b88b221c4928ef0de3161136cc90c8356a66.