It would be nice to leverage the ever-growing GPU resources available in modern machines. GPUs make the most sense for heavy OLAP/analytical queries, but the biggest internal feature we'd need to add first is support for parallel query processing: dividing a single user request into N parts and executing them in parallel across multiple threads, subsystems, or even [remote] processes. Building on that, we could identify which parts naturally perform well on GPUs (things like aggregations, sorting, and grouping) and send that portion of the work to the GPUs using something like the CUDA API.