[CSHARP-332] read large int array Created: 28/Sep/11 Updated: 02/Apr/15 Resolved: 29/Sep/11 |
|
| Status: | Closed |
| Project: | C# Driver |
| Component/s: | None |
| Affects Version/s: | 1.1 |
| Fix Version/s: | 1.3 |
| Type: | Improvement | Priority: | Major - P3 |
| Reporter: | Andrei Neagu | Assignee: | Robert Stam |
| Resolution: | Done | Votes: | 0 |
| Labels: | performance | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Environment: |
Windows 7 x64, MongoDB 2.0 |
||
| Attachments: |
|
| Description |
|
Reading or writing a large number of int values (20,000-30,000 or more) in a BsonArray takes a long time. The deserialization is done into a BsonDocument. |
| Comments |
| Comment by Robert Stam [ 29/Sep/11 ] |
|
See the Google Groups discussion for more information about the implications of creating an index on a very large array element. |
| Comment by Andrei Neagu [ 29/Sep/11 ] |
|
With 60% it should be way better. Please also have a look at http://groups.google.com/group/mongodb-user/browse_thread/thread/8cdc9d0fed6dfc23. Thanks. |
| Comment by Andrei Neagu [ 29/Sep/11 ] |
|
Can you run a test with this one? I didn't mention that there is an index on that object. |
| Comment by Robert Stam [ 29/Sep/11 ] |
|
Resolved issue for now. Will reopen if further discussion warrants. |
| Comment by Robert Stam [ 29/Sep/11 ] |
|
In the BSON specification, arrays are stored as pseudo-documents whose elements are named "0", "1", and so on. The C# driver ignores these names during deserialization, so one quick optimization was to skip the UTF-8 decoding of element names that were going to be ignored anyway. Making this one small change improved performance by about 60%: 200 iterations in 00:00:03.1001773. That was the only low-hanging fruit, though. Does this seem like enough of an improvement to you? |
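The idea behind this optimization can be sketched outside the driver. The following is a minimal Python illustration, not the C# driver's actual code: the function names `encode_int32_array` and `decode_int32_array` and the `decode_names` flag are hypothetical, and the sketch handles only int32 elements. It shows an array laid out as a document keyed "0", "1", and so on, and a reader that steps over each name's bytes without turning them into a string.

```python
import struct


def encode_int32_array(values):
    """Encode ints as a BSON array: a document whose keys are "0", "1", ..."""
    body = bytearray()
    for index, value in enumerate(values):
        body += b"\x10"                                # element type 0x10 = int32
        body += str(index).encode("utf-8") + b"\x00"   # element name as a cstring
        body += struct.pack("<i", value)               # little-endian int32 value
    # Layout: total length (int32, includes itself) + elements + 0x00 terminator.
    return struct.pack("<i", 4 + len(body) + 1) + bytes(body) + b"\x00"


def decode_int32_array(data, decode_names=False):
    """Decode an int32-only BSON array; names are decoded only on request."""
    (length,) = struct.unpack_from("<i", data, 0)
    pos = 4
    end = length - 1                                   # index of the terminator byte
    values, names = [], []
    while pos < end:
        if data[pos] != 0x10:
            raise ValueError("this sketch only handles int32 elements")
        pos += 1
        name_end = data.index(b"\x00", pos)            # find the end of the cstring
        if decode_names:
            # The costly step the optimization avoids: bytes -> string.
            names.append(data[pos:name_end].decode("utf-8"))
        pos = name_end + 1                             # skip the name either way
        (value,) = struct.unpack_from("<i", data, pos)
        values.append(value)
        pos += 4
    return values, names


encoded = encode_int32_array([10, 20, 30])
print(decode_int32_array(encoded))                     # names list stays empty
print(decode_int32_array(encoded, decode_names=True))  # names are materialized
```

With `decode_names=False` the reader still has to scan for each name's terminating zero byte, but it never allocates a string for it, which is the saving described in the comment above.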
| Comment by Robert Stam [ 29/Sep/11 ] |
|
If I run the test program without the debugger, it's considerably faster: 200 iterations in 00:00:04.9522832. Note: I replaced the values with slightly lower numbers. Each run produces different numbers (???) and these values are closer to the median. |
| Comment by Robert Stam [ 29/Sep/11 ] |
|
This is the output of the test program on my computer (your numbers will vary): 200 iterations in 00:00:05.8833365. Note that while 33 documents per second sounds slow, when you look at it as deserializing over 1 million array values per second it doesn't sound so slow anymore. Nonetheless, I will run this under a profiler and look for possible optimizations. |
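The arithmetic behind these two rates can be checked with a short Python sketch. The iteration count and elapsed time come from the output quoted in this comment; the array size is not stated there, so `ASSUMED_VALUES_PER_DOCUMENT` is an assumption taken from the upper end of the "20-30 k" range in the description.

```python
iterations = 200                     # from the quoted test output
elapsed_seconds = 5.8833365          # 00:00:05.8833365 from the quoted test output

# Assumption: the comment does not give the array size; 30,000 is the upper
# end of the range mentioned in the issue description.
ASSUMED_VALUES_PER_DOCUMENT = 30_000

docs_per_second = iterations / elapsed_seconds
values_per_second = docs_per_second * ASSUMED_VALUES_PER_DOCUMENT

print(f"{docs_per_second:.1f} documents per second")
print(f"{values_per_second:,.0f} array values per second")
```

Under that assumption the document rate lands in the low thirties per second while the per-value rate exceeds one million per second, which matches the framing in the comment.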
| Comment by Robert Stam [ 29/Sep/11 ] |
|
I've attached the test program I'm using to evaluate performance of reading documents with very large arrays of integers. |