Subject: Re: [ublas] Announcement of ViennaCL (linear algebra on GPUs)
From: Karl Rupp (rupp_at_[hidden])
Date: 2010-06-08 06:22:51
Hi Rutger,
> Awesome! Are you aware of the numeric bindings library? It contains bindings 
> to CUBLAS (levels 1 through 3) -- if you're looking for optimized kernels 
> for your NVidia card, that would be the best bet I guess.
Since we rely on OpenCL rather than NVidia CUDA, some porting effort 
would certainly be necessary. Including the PTX kernel binaries in a 
header-only library (oh gosh, I forgot to mention that - ViennaCL is 
header-only!) would, on the other hand, be quite a challenge...
The nice thing about OpenCL is the just-in-time compiler, which in 
principle allows the required compute kernels to be generated at runtime. 
In particular, a complicated operation like
gpu_vec1 = alpha * gpu_vec2 - beta * gpu_vec3 + gamma * gpu_vec4
(with GPU vectors gpu_vec1, ... , gpu_vec4)
can be completely wrapped in expression templates, which then generate a 
single compute kernel that eliminates all temporaries involved. But that's 
not a trivial thing... ;-)
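To give a rough idea of the mechanism, here is a minimal sketch of how 
expression templates could assemble the source of one fused kernel for the 
operation above. All names (Vec, Scalar, Scaled, Bin, make_kernel) are 
hypothetical and are not ViennaCL's actual interface. The sketch only 
builds the kernel source as a string; a real implementation would then 
hand it to the OpenCL runtime (clCreateProgramWithSource and 
clBuildProgram) and would also have to deal with operands that appear more 
than once.

```cpp
#include <cassert>
#include <string>

// CRTP base: marks a type as an expression node so that the operators
// below only apply to our own types.
template <class D>
struct Expr {
    const D& self() const { return static_cast<const D&>(*this); }
};

// A GPU vector operand; only its kernel-argument name matters for this sketch.
struct Vec : Expr<Vec> {
    std::string name;
    explicit Vec(const std::string& n) : name(n) {}
    std::string body() const { return name + "[i]"; }
    void append_params(std::string& p) const { p += ", __global const float* " + name; }
};

// A scalar operand (hypothetical helper, passed to the kernel by value).
struct Scalar {
    std::string name;
    explicit Scalar(const std::string& n) : name(n) {}
};

// Node for scalar * vector.
struct Scaled : Expr<Scaled> {
    Scalar s;
    Vec v;
    Scaled(const Scalar& s_, const Vec& v_) : s(s_), v(v_) {}
    std::string body() const { return s.name + " * " + v.body(); }
    void append_params(std::string& p) const {
        p += ", float " + s.name;
        v.append_params(p);
    }
};

// Node for lhs + rhs or lhs - rhs; parenthesised to keep precedence explicit.
template <class L, class R>
struct Bin : Expr< Bin<L, R> > {
    L lhs;
    char op;
    R rhs;
    Bin(const L& l, char o, const R& r) : lhs(l), op(o), rhs(r) {}
    std::string body() const { return "(" + lhs.body() + " " + op + " " + rhs.body() + ")"; }
    void append_params(std::string& p) const {
        lhs.append_params(p);
        rhs.append_params(p);
    }
};

inline Scaled operator*(const Scalar& s, const Vec& v) { return Scaled(s, v); }

template <class L, class R>
Bin<L, R> operator+(const Expr<L>& l, const Expr<R>& r) { return Bin<L, R>(l.self(), '+', r.self()); }

template <class L, class R>
Bin<L, R> operator-(const Expr<L>& l, const Expr<R>& r) { return Bin<L, R>(l.self(), '-', r.self()); }

// Emits OpenCL C source for "result = rhs" as one loop without temporaries.
// Note: an operand used twice would be listed twice in the parameter list;
// a real generator would have to deduplicate.
template <class E>
std::string make_kernel(const std::string& kernel_name, const Vec& result, const Expr<E>& rhs) {
    assert(!kernel_name.empty());
    std::string params = "__global float* " + result.name;
    rhs.self().append_params(params);
    return "__kernel void " + kernel_name + "(" + params + ", unsigned int n)\n"
           "{\n"
           "  for (unsigned int i = get_global_id(0); i < n; i += get_global_size(0))\n"
           "    " + result.name + "[i] = " + rhs.self().body() + ";\n"
           "}\n";
}
```

Calling make_kernel("fused", gpu_vec1, alpha * gpu_vec2 - beta * gpu_vec3 + gamma * gpu_vec4) 
with suitably named Vec and Scalar objects yields a kernel whose loop body 
is a single assignment to gpu_vec1[i], so no intermediate vectors are ever 
written to GPU memory.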
> I like the way you've implemented the copy operator. Does it matter what 
> kind of input iterator is used for copying vectors/matrices? 
At present, the input iterators for vectors have to provide a difference 
type in order to determine the required memory size on the GPU - this 
can be easily relaxed. Internally, the data is first copied to a temporary 
std::vector (so that sparse vectors and all other sorts of storage schemes 
are supported as well), and a pointer to the std::vector's data is then 
passed to the OpenCL memory copy routines. This is the safe way.
We also provide a fast_copy() operation, which requires that the 
.begin() input iterator points to the first element in a linear memory 
sequence, so that the temporary std::vector is not needed at all.
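The difference between the two copy paths can be sketched roughly as 
follows. This is only an illustration under stated assumptions: 
DeviceVector, write_buffer, safe_copy and fast_copy_sketch are placeholder 
names, host memory stands in for the GPU buffer, and std::memcpy stands in 
for the OpenCL memory copy routine (something like clEnqueueWriteBuffer in 
a real implementation).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <iterator>
#include <vector>

// Stand-in for a GPU vector; a real one would hold an OpenCL memory handle.
struct DeviceVector {
    std::vector<float> storage;
};

// Stand-in for the OpenCL memory copy routine: needs one linear block of floats.
inline void write_buffer(DeviceVector& dst, const float* src, std::size_t n) {
    assert(src != nullptr || n == 0);
    dst.storage.resize(n);
    if (n > 0) std::memcpy(dst.storage.data(), src, n * sizeof(float));
}

// Safe path: accepts any input iterator range, whatever the underlying
// storage scheme, by first gathering the values in a temporary std::vector.
template <class InputIt>
void safe_copy(InputIt first, InputIt last, DeviceVector& dst) {
    std::vector<float> staging(first, last);  // the extra host-side copy
    write_buffer(dst, staging.data(), staging.size());
}

// Fast path: the caller guarantees that [first, last) is one linear sequence
// of floats in memory, so the address of the first element can be passed on
// directly and no temporary is needed.
template <class RandomIt>
void fast_copy_sketch(RandomIt first, RandomIt last, DeviceVector& dst) {
    const std::size_t n = static_cast<std::size_t>(std::distance(first, last));
    write_buffer(dst, n > 0 ? &*first : nullptr, n);
}
```

The fast path saves one host-side copy, but it silently relies on the 
contiguity guarantee; handing it iterators of a node-based container would 
read unrelated memory, which is why the staging variant is the default.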
For matrices, we rely on the iterator1 and iterator2 types as provided 
by ublas types. A wrapper class is provided for sparse matrices of type 
std::vector< std::map<> >, so ViennaCL can also be used without ublas 
(e.g. if boost is not available).
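As an illustration of why such a wrapper is useful: a 
std::vector< std::map<> > stores each row in its own tree, so it has to be 
traversed row by row and flattened into linear arrays before anything can 
be sent to the GPU. The sketch below assumes a compressed-row layout as 
the target and uses hypothetical names (CsrArrays, SparseRows, flatten); 
it is not the wrapper class itself and says nothing about the format 
ViennaCL uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical linear layout: three flat arrays in compressed-row form.
struct CsrArrays {
    std::vector<unsigned int> row_start;  // rows + 1 entries; row r spans [row_start[r], row_start[r+1])
    std::vector<unsigned int> col_index;  // column of each stored entry
    std::vector<float> values;            // value of each stored entry
};

// One map per row, keyed by column index.
typedef std::vector< std::map<unsigned int, float> > SparseRows;

// Walks the rows in order; std::map already yields columns in ascending order.
inline CsrArrays flatten(const SparseRows& rows) {
    CsrArrays out;
    out.row_start.reserve(rows.size() + 1);
    out.row_start.push_back(0);
    for (std::size_t r = 0; r < rows.size(); ++r) {
        for (std::map<unsigned int, float>::const_iterator it = rows[r].begin();
             it != rows[r].end(); ++it) {
            out.col_index.push_back(it->first);
            out.values.push_back(it->second);
        }
        out.row_start.push_back(static_cast<unsigned int>(out.values.size()));
    }
    assert(out.row_start.size() == rows.size() + 1);
    return out;
}
```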
> If not, it
> would be possible to do just
>
> copy( bindings::begin(vec), bindings::end(vec), begin(gpu_vec) );
>
> and have it work for many types of vectors.
If the bindings provide a difference type, this should work out of the 
box, yes - but I haven't checked that yet.
> What's the license of ViennaCL?
It is an MIT (X11) license, so compatible with almost everything.
Best regards,
Karli