Ublas mailing page: Re: [ublas] Announcement of ViennaCL (linear algebra on GPUs)

Subject: Re: [ublas] Announcement of ViennaCL (linear algebra on GPUs)
From: Karl Rupp (rupp_at_[hidden])
Date: 2010-06-08 06:22:51

Next message: Anders Kabell Kristensen: "[ublas] [uBLAS] What's inside the uBLAS matrix multiplication algorithm?"
Previous message: Rutger ter Borg: "Re: [ublas] Announcement of ViennaCL (linear algebra on GPUs)"
In reply to: Rutger ter Borg: "Re: [ublas] Announcement of ViennaCL (linear algebra on GPUs)"

Hi Rutger,

> Awesome! Are you aware of the numeric bindings library? It contains bindings
> to CUBLAS (levels 1 through 3) -- if you're looking for optimized kernels
> for your NVidia card, that would be the best bet I guess.

Since we rely on OpenCL and not on NVidia CUDA, some porting effort
would certainly be necessary. Including the PTX kernel binaries in a
header-only library (oh gosh, I forgot to mention that - ViennaCL is
header-only!) is on the other hand quite a challenge...

The nice thing about OpenCL is the just-in-time compiler, which in
principle allows the required compute kernels to be generated at
runtime. In particular, a complicated operation like

gpu_vec1 = alpha * gpu_vec2 - beta * gpu_vec3 + gamma * gpu_vec4

(with GPU vectors gpu_vec1, ... , gpu_vec4)
can be captured entirely by expression templates, which then generate a
single compute kernel that eliminates all temporaries involved. But
that's not a trivial thing... ;-)
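
To make the "no temporaries" point concrete, here is a minimal CPU-only
sketch of the expression-template idea. It is not ViennaCL code: the
names Expr, Scaled, Sum and Vec are hypothetical, and the fused loop in
Vec::operator= merely stands in for the single compute kernel that an
OpenCL backend would have to generate.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU-side stand-ins; none of these names are ViennaCL API.

// CRTP base so that the operators below only match expression types.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Lazy node for "scalar * expression"; nothing is computed on construction.
template <typename E>
struct Scaled : Expr<Scaled<E> > {
    double s;
    const E& e;
    Scaled(double s_, const E& e_) : s(s_), e(e_) {}
    double operator[](std::size_t i) const { return s * e[i]; }
    std::size_t size() const { return e.size(); }
};

// Lazy node for "lhs + sign * rhs", where sign is +1 or -1.
template <typename L, typename R>
struct Sum : Expr<Sum<L, R> > {
    const L& l;
    const R& r;
    double sign;
    Sum(const L& l_, const R& r_, double sg) : l(l_), r(r_), sign(sg) {}
    double operator[](std::size_t i) const { return l[i] + sign * r[i]; }
    std::size_t size() const { return l.size(); }
};

struct Vec : Expr<Vec> {
    std::vector<double> data;
    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
    double operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // The only place where work happens: one fused loop over the whole
    // expression tree, so no intermediate vector is ever allocated.
    template <typename E>
    Vec& operator=(const Expr<E>& expr) {
        const E& e = expr.self();
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
        return *this;
    }
};

template <typename E>
Scaled<E> operator*(double s, const Expr<E>& e) {
    return Scaled<E>(s, e.self());
}

template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self(), 1.0);
}

template <typename L, typename R>
Sum<L, R> operator-(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self(), -1.0);
}
```

With Vec objects v1 ... v4 of equal size, the statement
v1 = alpha * v2 - beta * v3 + gamma * v4; builds a small tree of
Scaled and Sum nodes and evaluates it in a single pass. The nodes hold
references to temporaries, so the expression must be assigned within the
same statement and not stored for later use.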

> I like the way you've implemented the copy operator. Does it matter what
> kind of input iterator is used for copying vectors/matrices?

At present, the input iterators for vectors have to provide a difference
type in order to determine the required memory size on the GPU; this
requirement can easily be relaxed. Internally, the data is first copied
to a temporary std::vector (which also allows sparse vectors and all
other sorts of storage schemes), and a pointer to that std::vector is
then passed to the OpenCL memory copy routines. This is the safe way.
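
The staging approach described above can be sketched roughly as follows.
This is an illustration under stated assumptions, not the ViennaCL
implementation: DeviceBuffer, write_buffer and staged_copy are
hypothetical names, and a host-side byte array stands in for GPU memory
so that the example runs without OpenCL. The sketch also assumes
multi-pass (forward) iterators, since it measures the range before
reading it.

```cpp
#include <cstddef>
#include <cstring>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a GPU buffer; a real backend would hold an
// OpenCL memory object instead of host memory.
struct DeviceBuffer {
    std::vector<unsigned char> bytes;
};

// Stand-in for the OpenCL memory copy routine: it only accepts a raw
// host pointer plus a byte count, which is why contiguous data is needed.
inline void write_buffer(DeviceBuffer& dst, const void* src, std::size_t n_bytes) {
    dst.bytes.resize(n_bytes);
    if (n_bytes > 0) std::memcpy(dst.bytes.data(), src, n_bytes);
}

// "Safe" copy: works for any forward iterator with a difference type,
// at the price of one extra host-side copy into a staging vector.
template <typename ForwardIt>
void staged_copy(ForwardIt first, ForwardIt last, DeviceBuffer& dst) {
    typedef typename std::iterator_traits<ForwardIt>::value_type value_type;
    typedef typename std::iterator_traits<ForwardIt>::difference_type diff_type;

    // The difference type is what lets us size the staging area up front.
    const diff_type n = std::distance(first, last);

    std::vector<value_type> staging;
    staging.reserve(static_cast<std::size_t>(n));
    for (; first != last; ++first) staging.push_back(*first);

    // The staging vector is contiguous, so a raw pointer can be handed on.
    write_buffer(dst, staging.data(), staging.size() * sizeof(value_type));
}
```

Because the source range is only ever read through its iterators, this
works equally for std::list, reverse iterators, or any other container
whose elements are not laid out contiguously.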

We also provide a fast_copy() operation, which requires that the
.begin() input iterator points to the first element in a linear memory
sequence, so that the temporary std::vector is not needed at all.
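
A minimal sketch of what dropping the staging step amounts to, again
with a host-side byte array standing in for GPU memory. The names
DeviceBuffer and fast_copy_sketch are hypothetical and the real
fast_copy() may differ in signature and behaviour; the point is only
that the address of the first element is passed straight to the copy
routine, which is valid only if the range is contiguous.

```cpp
#include <cstddef>
#include <cstring>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a GPU buffer (host memory, no OpenCL needed).
struct DeviceBuffer {
    std::vector<unsigned char> bytes;
};

// Direct copy without a staging vector.
// Precondition (not checked here): [first, last) is one contiguous block
// of memory, as with std::vector or a plain array.
template <typename RandomIt>
void fast_copy_sketch(RandomIt first, RandomIt last, DeviceBuffer& dst) {
    typedef typename std::iterator_traits<RandomIt>::value_type value_type;

    const std::size_t n = static_cast<std::size_t>(last - first);
    dst.bytes.resize(n * sizeof(value_type));
    if (n == 0) return;  // never dereference an empty range

    // Address of the first element; the remaining n - 1 elements are
    // assumed to follow it directly in memory.
    const value_type* host = &*first;
    std::memcpy(dst.bytes.data(), host, n * sizeof(value_type));
}
```

Calling this with iterators of a std::list or a sparse container would
compile in some cases but read unrelated memory, which is why the
staged variant is the safe default.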

For matrices, we rely on the iterator1 and iterator2 types as provided
by the uBLAS types. A wrapper class is provided for sparse matrices of
type std::vector< std::map<> >, so ViennaCL can also be used without
uBLAS (e.g. if Boost is not available).
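
The interface of that wrapper class is not shown here, so the following
is only a sketch of why std::vector< std::map<> > is a convenient
host-side sparse format: the outer vector indexes rows and each map
keeps its column indices sorted, so the nonzeros can be flattened into
contiguous arrays in one ordered pass. The choice of a compressed sparse
row (CSR) layout and the names SparseRows, CsrArrays and flatten are
assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Row r, column c, value v is stored as rows[r][c] = v.
typedef std::vector< std::map<unsigned int, double> > SparseRows;

// Contiguous CSR arrays, i.e. the kind of flat storage a buffer copy
// needs. Using CSR here is an assumption, not taken from the email.
struct CsrArrays {
    std::vector<unsigned int> row_start;  // rows + 1 entries
    std::vector<unsigned int> col_index;  // one entry per nonzero
    std::vector<double> values;           // one entry per nonzero
};

inline CsrArrays flatten(const SparseRows& rows) {
    CsrArrays out;
    out.row_start.reserve(rows.size() + 1);
    out.row_start.push_back(0);
    for (std::size_t r = 0; r < rows.size(); ++r) {
        // std::map iterates in ascending key order, so columns come out sorted.
        for (std::map<unsigned int, double>::const_iterator it = rows[r].begin();
             it != rows[r].end(); ++it) {
            out.col_index.push_back(it->first);
            out.values.push_back(it->second);
        }
        // Entry r + 1 marks where row r ends and the next row begins.
        out.row_start.push_back(static_cast<unsigned int>(out.values.size()));
    }
    return out;
}
```

An empty row simply repeats the previous row_start entry, so no special
casing is needed for rows without nonzeros.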

> If not, it
> would be possible to do just
>
> copy( bindings::begin(vec), bindings::end(vec), begin(gpu_vec) );
>
> and have it work for many types of vectors.

If the bindings provide a difference type, this should work out of the
box, yes - but I haven't checked that yet.

> What's the license of ViennaCL?

It is an MIT (X11) license, so it is compatible with almost everything.

Best regards,
Karli
