The current documentation at http://api.mongodb.com/python/current/examples/bulk.html states: "A batch of documents can be inserted by passing a list to the insert_many() method. PyMongo will automatically split the batch into smaller sub-batches based on the maximum message size accepted by MongoDB, supporting very large bulk insert operations."
I have a simple generator that yields dictionaries (parsed from bytes with json.load), which I pass to the insert_many() method of pymongo's Collection class. For large inputs this ultimately raises a MemoryError, always at line 741 of https://github.com/mongodb/mongo-python-driver/blob/master/pymongo/collection.py.
I'm not an expert Python programmer, but after digging into the code a bit it seems that this line expands the entire generator into a list before the bulk insertion even starts, so the sizes of the individual documents are never taken into account when splitting the input into batches.
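To make the pattern concrete, here is a minimal self-contained sketch of the kind of generator I am using (pymongo itself is not invoked; the names `doc_generator` and `raw` are hypothetical). The `list()` call at the end mimics what the driver appears to do internally: it materializes every document at once, which is where the memory blow-up would occur for a large input.

```python
import json

def doc_generator(lines):
    """Yield one dict per JSON-encoded bytes object, one at a time."""
    for line in lines:
        yield json.loads(line)

# A tiny stand-in for a very large stream of raw JSON documents.
raw = [b'{"n": %d}' % i for i in range(3)]
gen = doc_generator(raw)

# What insert_many effectively does before batching: expand the
# whole generator into a list in memory (the behavior reported above).
docs = list(gen)
assert docs == [{"n": 0}, {"n": 1}, {"n": 2}]
```

With a genuinely large stream, the `list(gen)` step alone can exhaust memory regardless of how the driver later splits the list into sub-batches.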