Type: Improvement
Resolution: Fixed
Priority: Unknown
Fix Version/s: 4.5
Affects Version/s: None
Component/s: None
Labels: None
Epic Link: Python BSON performance

The C extensions handle encoding of dict, RawBSONDocument, and any other mapping class. We can optimize these code paths for the common case, which is a standard dict.

For example, write_dict here https://github.com/mongodb/mongo-python-driver/blob/7e96249/bson/_cbsonmodule.c#L1510 checks for "_type_marker" and calls PyObject_IsInstance() on every mapping object, even when we already know that the object is a dict.

We should skip those checks if the object passes PyDict_Check or PyDict_CheckExact. We might be able to apply the same style of optimization in other parts of the encoding/decoding logic too.
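The fast path proposed above can be sketched in pure Python. This is only an illustration of the ordering of the checks, not the driver's C code: `classify_document`, `FakeRawDocument`, and the marker value are hypothetical names and values chosen for this sketch. The `type(obj) is dict` test plays the role of PyDict_CheckExact; PyDict_Check would also accept dict subclasses.

```python
from collections import OrderedDict
from collections.abc import Mapping


class FakeRawDocument:
    """Hypothetical stand-in for a type that identifies itself via _type_marker."""

    _type_marker = 101  # illustrative value, assumed for this sketch

    def __init__(self, raw: bytes):
        self.raw = raw


def classify_document(obj):
    """Return the name of the encoding path a document-like object would take."""
    # Fast path: an exact dict needs neither the attribute lookup
    # nor the isinstance() call below.
    if type(obj) is dict:
        return "dict-fast-path"
    # Slow path: only non-dict objects pay for the marker lookup...
    if getattr(obj, "_type_marker", None) is not None:
        return "raw-document"
    # ...and for the abstract-base-class check.
    if isinstance(obj, Mapping):
        return "generic-mapping"
    raise TypeError(f"cannot encode object of type {type(obj).__name__}")


if __name__ == "__main__":
    for candidate in ({"a": 1}, OrderedDict(a=1), FakeRawDocument(b"")):
        print(type(candidate).__name__, "->", classify_document(candidate))
```

Because the exact-type test comes first, the common case returns before any of the more expensive checks run, while RawBSONDocument-like objects and other mappings keep their existing handling.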

These inefficiencies could explain why encoding a deeply nested document is much slower than decoding the same document. We should expect encoding to always be faster than decoding, since the encoding side should allocate fewer Python objects. Note that TestDeepEncoding has lower throughput than TestDeepDecoding (see the attachments).
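One rough way to reproduce the comparison locally is sketched below. It is not the driver's benchmark suite: `make_nested` and `throughput` are hypothetical helpers, the document shape is an assumption (the actual TestDeepEncoding document is not shown here), and the demo assumes that the `bson` module importable on the machine is the one shipped with PyMongo, which provides `bson.encode` and `bson.decode`.

```python
import timeit


def make_nested(depth, leaf="value"):
    """Build a dict nested `depth` levels deep around a single leaf field."""
    doc = {"leaf": leaf}
    for _ in range(depth):
        doc = {"child": doc}
    return doc


def throughput(func, arg, repeat=5, number=100):
    """Return calls per second for func(arg), using the best of `repeat` runs."""
    best = min(timeit.repeat(lambda: func(arg), repeat=repeat, number=number))
    return number / best


if __name__ == "__main__":
    try:
        import bson  # assumed to be PyMongo's bson package
    except ImportError:
        print("PyMongo is not installed; skipping the comparison")
    else:
        doc = make_nested(50)
        data = bson.encode(doc)
        print("encode calls/s:", throughput(bson.encode, doc))
        print("decode calls/s:", throughput(bson.decode, data))
```

If the reasoning above holds, the encode figure should be at least as high as the decode figure once the dict fast path is in place.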

Screenshot 2023-07-06 at 4.27.45 PM.png (574 kB, Jul 10 2023 06:56:59 PM UTC)
screenshot-1.png (330 kB, Jul 11 2023 08:17:48 PM UTC)

is related to PYTHON-3824 Optimize BSON encoding of standard Python list and tuples (Closed)

Assignee: Iris Ho
Reporter: Shane Harvey
Votes: 0
Watchers: 2

Created: Jul 07 2023 06:54:02 PM UTC
Updated: Oct 29 2023 02:27:41 AM UTC
Resolved: Jul 11 2023 08:23:32 PM UTC
Confidence Status Last Update: 10/Jul/23 4:14 PM
