Subject: Re: [ublas] Matrix multiplication performance
From: Oswin Krause (Oswin.Krause_at_[hidden])
Date: 2016-01-24 04:18:12
Hi,
I would still vote for rewriting uBLAS on top of BLAS bindings, while
providing a reasonable default implementation that also works well
without assumptions about memory. The main reason is that a fast gemm
implementation alone does not really improve things, given that BLAS
level 3 is quite a large beast.
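
To make the idea concrete, here is a minimal sketch (not taken from my
rewrite) of a gemm front end that forwards to a CBLAS binding when one
is available and otherwise falls back to a simple blocked loop. The
UBLAS_HAS_CBLAS macro and the free gemm() function are hypothetical
names used only for illustration:

// Sketch: dispatch to a CBLAS binding if configured, otherwise use a
// plain blocked fallback that assumes nothing about alignment or cache
// sizes beyond a fixed block size.
#include <cstddef>

#ifdef UBLAS_HAS_CBLAS     // hypothetical configuration macro
#include <cblas.h>
#endif

// C (m x n) += A (m x k) * B (k x n), row-major, double precision.
void gemm(std::size_t m, std::size_t n, std::size_t k,
          const double* A, const double* B, double* C)
{
#ifdef UBLAS_HAS_CBLAS
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                static_cast<int>(m), static_cast<int>(n),
                static_cast<int>(k),
                1.0, A, static_cast<int>(k),
                B, static_cast<int>(n),
                1.0, C, static_cast<int>(n));
#else
    const std::size_t bs = 64;   // arbitrary default block size
    for (std::size_t ii = 0; ii < m; ii += bs)
        for (std::size_t kk = 0; kk < k; kk += bs)
            for (std::size_t jj = 0; jj < n; jj += bs)
                for (std::size_t i = ii; i < ii + bs && i < m; ++i)
                    for (std::size_t l = kk; l < kk + bs && l < k; ++l) {
                        const double a = A[i * k + l];
                        for (std::size_t j = jj; j < jj + bs && j < n; ++j)
                            C[i * n + j] += a * B[l * n + j];
                    }
#endif
}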
I'm still willing to donate my partial uBLAS rewrite; unfortunately I am
a bit short on time to polish it (I just finished my PhD and have a huge
load of work on my desk). But if someone opened a git branch for it, I
could try to get the code ready (porting my implementation back to the
boost namespaces, etc.).
On 2016-01-23 18:53, palik imre wrote:
> Hi All,
> 
> what's next?  I mean what is the development process for ublas?
> 
> Now we have a C-like implementation that outperforms both the
> mainline and the branch version (axpy_prod).  What will we do with
> that?
> 
> As far as I see we have the following options:
> 
> 1) Create a C++ template-magic implementation out of it.  But for
> this, at the very least, we would need compile-time access to the
> target instruction set.  Any idea how to do that?
> 
> 2) Create a compiled library implementation out of it, and choose the
> implementation run-time based on the CPU capabilities.
> 
> 3) Include some good defaults/defines, and hope the user will use
> them.
> 
> 4) Don't include it, and do something completely different.
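
Regarding options 1 and 2: with GCC/Clang on x86 the target
instruction set is visible at compile time through predefined macros
such as __AVX__ or __SSE2__, and run-time selection can be done with
__builtin_cpu_supports(). A minimal sketch with placeholder kernel
names; a real AVX kernel would have to live in a translation unit built
with -mavx, or carry __attribute__((target("avx"))):

#include <cstdio>

// Placeholder micro-kernels used only for illustration.
static void gemm_kernel_avx()     { std::puts("avx kernel"); }
static void gemm_kernel_generic() { std::puts("generic kernel"); }

// Option 1: compile-time selection via predefined macros.
void gemm_compile_time()
{
#if defined(__AVX__)
    gemm_kernel_avx();       // taken only when this TU is built with -mavx
#else
    gemm_kernel_generic();
#endif
}

// Option 2: run-time selection based on CPU capabilities
// (GCC/Clang built-in, x86 only).
void gemm_run_time()
{
#if defined(__GNUC__)
    if (__builtin_cpu_supports("avx")) {
        gemm_kernel_avx();
        return;
    }
#endif
    gemm_kernel_generic();
}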
> 
> What do you think?
> 
> Cheers,
> 
> Imre
> 
> _______________________________________________
> ublas mailing list
> ublas_at_[hidden]
> http://listarchives.boost.org/mailman/listinfo.cgi/ublas
> Sent to: Oswin.Krause_at_[hidden]