Subject: Re: [ublas] cublas again
From: Rutger ter Borg (rutger_at_[hidden])
Date: 2011-03-21 03:48:15
On 03/19/2011 10:35 PM, Andrey Asadchev wrote:
> Hello.
>
> I have incorporated cublas bindings on top of blas bindings, such that
> you can use blas and cublas at once:
[snip]
Hello Andrey,
Cool stuff! I've been pondering how to do GPU-based computation 
for a while now.
I think we could benefit greatly if we used NVIDIA's support for 
compiling C/C++ code directly for the GPU, through nvcc. I'm not much 
of an expert yet, but CUDA supports running "kernels", which are 
similar to functors.
Anyway, first things first, and IMHO that would be to choose the right 
computational model to get the most out of both the CPU and GPU. I think 
we should follow an asynchronous programming model to achieve that. To 
get an idea, here's an example:
boost::asio::io_service ios;
ublas::vector< double > vec1;
cuda::vector< double > vec2( ios );
// this operation calls nvidia's asynchronous copy operation
// or vec2.async_assign( ... );
cuda::async_copy( begin( vec1 ), begin( vec2 ), end( vec2 ), 
boost::bind( &copy_done, _1 ) );
.... do stuff on the CPU while the copy to the GPU is in progress ......
blas::gemm( .... );
.... do more stuff
void copy_done( const boost::system::error_code& error ) {
   if ( !error ) {
     ....
     bindings::cuda::async_run( some_kernel(), &kernel_done );
     ....
   }
}
void kernel_done( const boost::system::error_code& error ) {
    ...
}
In some other source file, some_kernel is compiled with nvcc and ends 
up as native GPU code:
struct some_kernel {
   void operator()() {
       ... not 100% of C/C++ supported yet
       ... cublas::gemm( .... );
   }
};
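
To make the completion-handler idea above concrete without a GPU, here is a minimal sketch using only the C++ standard library. Everything in it is an assumption for illustration: async_copy is a hypothetical stand-in for the cuda::async_copy proposed above (it is not an existing API), a worker thread plays the role of the device transfer, and the handler takes a plain bool instead of boost::system::error_code.

```cpp
#include <algorithm>
#include <functional>
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the proposed cuda::async_copy: copies src into
// dst on a worker thread, then calls the handler with an error flag
// (false means success). No GPU is involved in this sketch.
std::future<void> async_copy(const std::vector<double>& src,
                             std::vector<double>& dst,
                             std::function<void(bool)> handler) {
    return std::async(std::launch::async, [&src, &dst, handler] {
        bool error = dst.size() < src.size();
        if (!error) std::copy(src.begin(), src.end(), dst.begin());
        handler(error);
    });
}

// Starts the copy, does CPU work while it is in flight, then waits for
// the copy and its completion handler to finish.
double overlap_demo() {
    std::vector<double> host(1000, 1.0);
    std::vector<double> device(host.size(), 0.0);  // plays the role of GPU memory
    double device_sum = 0.0;

    std::future<void> pending = async_copy(host, device, [&](bool error) {
        // Completion handler: this is where a kernel launch would go.
        if (!error)
            device_sum = std::accumulate(device.begin(), device.end(), 0.0);
    });

    // CPU-side work that overlaps with the copy (only reads host).
    double cpu_sum = std::accumulate(host.begin(), host.end(), 0.0);

    pending.get();  // synchronizes with the worker thread
    return cpu_sum + device_sum;
}
```

Calling overlap_demo() runs the "copy" and the CPU-side sum concurrently, and the handler's result is only read after pending.get() returns. In a real binding the worker thread would presumably be replaced by an asynchronous device copy, with the handler dispatched through the io_service.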
The next question is how to get this rolling. Although the numeric 
bindings only bind to external numeric libraries, the cublas/blas 
algorithms might be a good fit for them. Containers (and the stuff 
shown above), on the other hand, might warrant a separate library, or 
an extension to uBLAS.
What do you guys think?
Cheers,
Rutger