  1. Python Driver
  2. PYTHON-2382

Memory leak in MongoDB

    • Type: Bug
    • Resolution: Fixed
    • Priority: Major - P3
    • Fix Version/s: 3.11.1
    • Affects Version/s: 3.11
    • Component/s: Codecs
    • Labels:
      None
    • Environment:
      pymongo 3.11.0
      pymongo.has_c() True
      bson.has_c() True
      MongoDB: version 4.4.1
      Platform: Windows

      There is a memory leak when using a custom codec. The memory leak seems similar to https://jira.mongodb.org/browse/PYTHON-1554

      When using no custom codecs, the memory leak does not occur (e.g. when using `type_registry = TypeRegistry()`).

      A minimal example reproducing the problem is:

      #%%
      import gc
      from typing import Any
      from pymongo import MongoClient
      from bson.codec_options import CodecOptions
      from bson.codec_options import TypeCodec, TypeRegistry
      
      #%% Define method to analyse the memory
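      def memory_report():
          """Print the number of live objects per type whose type name contains 'bson'."""
          gc.collect()  # drop unreachable objects first so only live ones are counted
          counts = {}
          for obj in gc.get_objects():
              obj_type = type(obj)
              if 'bson' in str(obj_type):
                  counts[obj_type] = counts.get(obj_type, 0) + 1
      
          print('memory report:')
          for obj_type, count in counts.items():
              print(f' {obj_type}: {count}')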
      
      
      #%% Create codec
      
      class MyClass:
          pass
      
      class MyCodec(TypeCodec):  # type: ignore
          """Pass-through codec that returns every value unchanged."""
      
          @property
          def python_type(self) -> Any:
              return MyClass
      
          def transform_python(self, value: Any) -> Any:
              return value
      
          @property
          def bson_type(self) -> Any:
              return dict
      
          def transform_bson(self, value: Any) -> Any:
              return value
          
      codec=MyCodec()
      type_registry = TypeRegistry([codec]) 
      codec_options = CodecOptions(type_registry=type_registry) 
      
      client = MongoClient('localhost:27017')
      database = client['my_database']
      collection = database['my_collection'].with_options(codec_options)
      
      for ii in range(3000):
          cursor = collection.find({})
          for data in cursor:
              pass
      
      memory_report()  # over multiple iterations, objects of type bson.codec_options.CodecOptions are leaking
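
A single report shows object counts at one point in time. To see whether the counts keep growing, it can help to compare them before and after repeated runs of the same workload. The following is a minimal sketch of that idea using only the standard library. The helper names `count_instances` and `instance_growth`, and the stand-in `FakeOptions` class with its two workloads, are hypothetical and are not part of pymongo or bson; they only illustrate the measurement.

```python
import gc
from typing import Callable, Dict


def count_instances(name_substring: str) -> Dict[str, int]:
    """Count live gc-tracked objects whose 'module.TypeName' contains the substring."""
    gc.collect()
    counts: Dict[str, int] = {}
    for obj in gc.get_objects():
        obj_type = type(obj)
        type_name = f"{obj_type.__module__}.{obj_type.__qualname__}"
        if name_substring in type_name:
            counts[type_name] = counts.get(type_name, 0) + 1
    return counts


def instance_growth(workload: Callable[[], None], name_substring: str,
                    rounds: int = 3) -> Dict[str, int]:
    """Run the workload `rounds` times and return the per-type increase in live objects."""
    workload()  # warm-up, so one-time caches are not mistaken for a leak
    before = count_instances(name_substring)
    for _ in range(rounds):
        workload()
    after = count_instances(name_substring)
    growth = {name: after[name] - before.get(name, 0) for name in after}
    return {name: delta for name, delta in growth.items() if delta > 0}


# Hypothetical stand-ins so the sketch runs without a MongoDB server.
class FakeOptions:
    """Stand-in for an object that may or may not be released."""


_kept_alive = []


def leaky_workload() -> None:
    _kept_alive.append(FakeOptions())  # reference is never dropped


def clean_workload() -> None:
    FakeOptions()  # released as soon as the call returns


print(instance_growth(clean_workload, "FakeOptions"))
print(instance_growth(leaky_workload, "FakeOptions", rounds=5))
```

Against the example above, the workload would be a function wrapping the `collection.find({})` loop and the substring would be `'bson'`; that variant needs a running MongoDB server and is not shown here.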
      

            Assignee:
            Prashant Mital (Inactive)
            Reporter:
            Pieter Eendebak
            Votes:
            0
            Watchers:
            3