  Python Driver / PYTHON-4146

Use insert_many to upload GridFS chunks for better performance

    • Type: Task
    • Resolution: Fixed
    • Priority: Unknown
    • Fix Version/s: 4.7
    • Affects Version/s: None
    • Component/s: GridFS, Performance
    • Labels: None

      Consider using insert_many to upload GridFS chunks for better performance.

      Context

      While talking to james.kovacs@mongodb.com about CSHARP-4900, he mentioned that the .NET driver uses insert_many to insert into the chunks collection. That surprised me, since PyMongo uses insert_one and so pays one server round trip per chunk. Using insert_many could improve GridFS upload performance, and the difference could explain why our GridFS upload throughput is much lower than our download throughput.

      Pitfalls

      We need to take care not to buffer too much of the data stream at once, because we don't want to bloat memory usage. We'd probably want to limit each insert_many batch to less than the max OP_MSG message size (maxMessageSizeBytes, 48,000,000 bytes by default).
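
A rough sketch of the batching idea, not the change that actually landed in PyMongo: read the stream one chunk at a time and group the chunk documents into batches whose payload stays under a byte budget, so memory use is bounded by the budget rather than by the file size. The helper name iter_chunk_batches and the 16 MiB budget are hypothetical; the chunk document fields (files_id, n, data) and the 255 KiB default chunk size follow the GridFS spec. The sketch assumes stream.read(chunk_size) returns a full chunk until EOF, which holds for buffered files and BytesIO but not for every stream.

```python
import io
from typing import Any, BinaryIO, Dict, Iterator, List

DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size, in bytes
# Hypothetical budget (an assumption), kept well under the 48,000,000-byte
# default maxMessageSizeBytes so a batch fits in a single OP_MSG.
MAX_BATCH_BYTES = 16 * 1024 * 1024


def iter_chunk_batches(
    stream: BinaryIO,
    files_id: Any,
    chunk_size: int = DEFAULT_CHUNK_SIZE,
    max_batch_bytes: int = MAX_BATCH_BYTES,
) -> Iterator[List[Dict[str, Any]]]:
    """Yield lists of chunk documents whose payloads total at most max_batch_bytes.

    A single chunk larger than the budget is still yielded, alone in its batch.
    """
    batch: List[Dict[str, Any]] = []
    batch_bytes = 0
    n = 0  # chunk sequence number within the file
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        # Flush the current batch before it would exceed the byte budget.
        if batch and batch_bytes + len(data) > max_batch_bytes:
            yield batch
            batch, batch_bytes = [], 0
        # PyMongo encodes Python bytes as BSON binary data.
        batch.append({"files_id": files_id, "n": n, "data": data})
        batch_bytes += len(data)
        n += 1
    if batch:
        yield batch


if __name__ == "__main__":
    # Tiny sizes so the batching is visible without a server.
    demo = io.BytesIO(b"x" * 10)
    for docs in iter_chunk_batches(demo, "demo-id", chunk_size=3, max_batch_bytes=6):
        print([(d["n"], len(d["data"])) for d in docs])
```

Against a real deployment, each yielded batch would be passed to insert_many on the chunks collection (for example db.fs.chunks.insert_many(batch)) in place of one insert_one call per chunk. The budget counts only the chunk payloads, not the BSON overhead of each document, which is one reason to keep it comfortably below the message size limit.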

            Assignee:
            shane.harvey@mongodb.com Shane Harvey
            Reporter:
            shane.harvey@mongodb.com Shane Harvey
            Votes:
            0
            Watchers:
            2

              Created:
              Updated:
              Resolved: