[JAVA-245] DBCursor.toArray() should run decodes in thread-pool instead of serially Created: 28/Dec/10 Updated: 21/Sep/16 Resolved: 21/Sep/16 |
|
| Status: | Closed |
| Project: | Java Driver |
| Component/s: | Performance |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | New Feature | Priority: | Minor - P4 |
| Reporter: | Scott Hernandez (Inactive) | Assignee: | Unassigned |
| Resolution: | Won't Fix | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Description |
|
By using a thread-pool, or more than just a single thread, the decoding time can be greatly decreased. This means the results will be available to clients in a much shorter time. This could also be done for each batch the cursor retrieves; that might be enough to take care of the issue. |
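A minimal sketch of the per-batch idea, assuming each batch arrives as a list of raw byte arrays. `BatchDecodeSketch`, `decodeBatch`, and the `decoder` function are hypothetical names used for illustration only and are not part of the Java driver API; the decoder stands in for whatever turns BSON bytes into Java objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper, not a driver class: decodes one cursor batch on a pool.
final class BatchDecodeSketch {

    // Decodes every raw document in the batch concurrently and returns the
    // results in the same order as the input list.
    static <T> List<T> decodeBatch(List<byte[]> batch,
                                   Function<byte[], T> decoder,
                                   ExecutorService pool) {
        // One task per raw document; the decoder is a stand-in for BSON decoding.
        List<Callable<T>> tasks = new ArrayList<>(batch.size());
        for (byte[] raw : batch) {
            tasks.add(() -> decoder.apply(raw));
        }
        try {
            // invokeAll blocks until all tasks finish and returns the futures
            // in the same order as the submitted tasks.
            List<T> decoded = new ArrayList<>(batch.size());
            for (Future<T> f : pool.invokeAll(tasks)) {
                decoded.add(f.get());
            }
            return decoded;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while decoding", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("decode failed", e.getCause());
        }
    }
}
```

Whether this is actually faster depends on how expensive a single decode is compared with the overhead of scheduling a task, so it would need measuring against the serial path.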
| Comments |
| Comment by Eliot Horowitz (Inactive) [ 29/Dec/10 ] |
|
I get it. I've never seen a driver do something like this, though it makes sense. |
| Comment by Scott Hernandez (Inactive) [ 29/Dec/10 ] |
|
Right now, when doing batch processing of records in Java, one of the bottlenecks is the time it takes to decode (and encode) to/from BSON. By doing this in a thread-pool it seems like throughput would be pushed way up. It seems like this is going to be the case with any application that wants to reduce the time it takes to get back the results. Maybe I'm not explaining things well. If I get the time I can put together a fork with some examples. |
| Comment by Eliot Horowitz (Inactive) [ 29/Dec/10 ] |
|
I didn't mean it wouldn't have to be in the driver; I mean it's a pretty app-level type of thing to be in a low-level driver. I guess as an option I can be OK with it. |
| Comment by Scott Hernandez (Inactive) [ 29/Dec/10 ] |
|
Yeah, it should not be the default. It really needs to be done in the driver. It needs to happen when the BSON is being decoded into Java objects. There is no place other than the driver to do it. We could create a holder, like the java.util.concurrent.Future object, so that a factory could be used to do the decoding. The default implementation could just return a synchronous version, giving the same behavior that exists now. We could also provide an async version that runs some of them concurrently in a decoder pool (on multiple threads). Part of the issue now is that all decoding is done serially, even when you explicitly state that you don't want to use an iterator (like in toArray). On a multi-processor machine it would cut down on the total time to generate the list by using more CPU cores. |
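The holder-and-factory idea in the comment above could be sketched roughly as follows. All names here (`DecodeFactory`, `SyncDecodeFactory`, `PooledDecodeFactory`, `ToArraySketch`) are hypothetical and do not exist in the driver; the `decoder` function is a placeholder for the real BSON-to-object decoding. The synchronous factory is meant to mirror today's serial behavior, while the pooled one hands each decode to an executor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical factory: wraps the decode of one raw document in a Future holder.
interface DecodeFactory<T> {
    Future<T> decode(byte[] raw);
}

// Default behavior: decode on the calling thread and return a completed holder.
final class SyncDecodeFactory<T> implements DecodeFactory<T> {
    private final Function<byte[], T> decoder;

    SyncDecodeFactory(Function<byte[], T> decoder) {
        this.decoder = decoder;
    }

    @Override
    public Future<T> decode(byte[] raw) {
        return CompletableFuture.completedFuture(decoder.apply(raw));
    }
}

// Opt-in behavior: submit each decode to a pool so several can run at once.
final class PooledDecodeFactory<T> implements DecodeFactory<T> {
    private final Function<byte[], T> decoder;
    private final ExecutorService pool;

    PooledDecodeFactory(Function<byte[], T> decoder, ExecutorService pool) {
        this.decoder = decoder;
        this.pool = pool;
    }

    @Override
    public Future<T> decode(byte[] raw) {
        Callable<T> task = () -> decoder.apply(raw);
        return pool.submit(task);
    }
}

// Stand-in for a toArray-style call: request all decodes, then collect in order.
final class ToArraySketch {
    static <T> List<T> toArray(List<byte[]> rawDocs, DecodeFactory<T> factory) {
        List<Future<T>> holders = new ArrayList<>(rawDocs.size());
        for (byte[] raw : rawDocs) {
            holders.add(factory.decode(raw));
        }
        List<T> result = new ArrayList<>(holders.size());
        try {
            for (Future<T> holder : holders) {
                result.add(holder.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while decoding", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("decode failed", e.getCause());
        }
        return result;
    }
}
```

Because the caller only sees the `DecodeFactory` interface, the serial path stays the default and the pooled path remains an explicit option, which is the compromise the thread converges on.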
| Comment by Eliot Horowitz (Inactive) [ 28/Dec/10 ] |
|
Not sure that's something that makes sense for a driver to do. |