$include_dir="/home/hyper-archives/ublas/include"; include("$include_dir/msg-header.inc") ?>
Subject: Re: [ublas] Matrix decompositions.
From: Nasos Iliopoulos (nasos_i_at_[hidden])
Date: 2013-05-28 11:41:18
Yes,
additionally with the boost transition to github we are planning to make 
contributing to uBlas much easier. We would like though to stress 
quality  and find a way to define performance requirements 
specifications. When the github repo (a fork of boostorg/ublas) is ready 
(I hope I can have it ready this weekend) we will include instructions 
on how to work with pull requests (to submit an algorithm or a small 
bugfix for example), or in cases where larger involvement is required 
maybe give access to the ublas development repo to ppl that want to 
contribute.
-Nasos
On 05/28/2013 10:59 AM, Salman Javaid wrote:
> Are there any plans to integrate Karl's QR Decomposition 
> implementation into Boost.uBLAS? I will be grateful for a response.
>
>
>
> Best Regards,
> Salman Javaid
>
>
> On Thu, May 16, 2013 at 7:06 AM, Karl Rupp <rupp_at_[hidden] 
> <mailto:rupp_at_[hidden]>> wrote:
>
>     Hi guys,
>
>     I have a working implementation of QR for uBLAS in ViennaCL for
>     about a year already:
>
>     https://github.com/viennacl/viennacl-dev/blob/master/viennacl/linalg/qr.hpp
>
>     Feel free to copy&paste and relicense as needed, I agree to
>     whatever is necessary to integrate into uBLAS if of interest.
>
>     There is a bit of duplication for the ViennaCL types (requiring
>     host<->GPU transfers) and the uBLAS types (operating in main RAM),
>     yet it gives an idea how things can be implemented. It can
>     certainly be improved here and there (more details on request),
>     yet it addresses most of the points raised by Oswin. And it's
>     faster than a standard LAPACK for sizes above ~1k times 1k.
>
>     I recommend extracting the Householder reflections into a nice
>     separate interface, since this functionality will also be needed
>     for other algorithms like SVD or GMRES. As a nice side effect, it
>     makes the implementation for QR more compact.
>
>     Generally, in order to get *any* reasonable performance, one
>     really needs to use matrix-matrix multiplications where possible
>     in order to avoid the memory bandwidth bottleneck.
>
>     Best regards,
>     Karli
>
>
>
>     On 05/16/2013 01:06 AM, oswin krause wrote:
>
>         Hi,
>
>         These are further good points!
>
>         I also came up with a few new ones(and tips!):
>
>         - QR needs pivoting. It's main usages are SVD and
>         pseudo-inverses. In
>         both cases the input does not necessary have full rank. Also
>         pivoting
>         helps for matrices with high condition numbers.
>
>         - For the same reasons H was not formed explicitly, Q should
>         not be
>         formed. Instead there should be a version of the algorithm
>         which does
>         only return the reflection vectors forming Q.
>
>         - For dense matrices at least, it is possible to do the QR
>         in-place by
>         storing the R part as lower triangular and the householder
>         transformation vectors in the upper triangular. (That is very
>         similar to
>         how LU is implemented).
>
>         - The best sources for algorithmic information are the LAPACK
>         working notes.
>         http://www.netlib.org/lapack/lawns/
>
>         In your case lawn114 sems to be the most relevant, even though it
>         assumes a fast BLAS3.
>
>         Greetings,
>         Oswin
>
>
>         On 16.05.2013 05:32, Nasos Iliopoulos wrote:
>
>             That's not a bad start.
>
>             I think Oswin covered a whole lot of items here, but a
>             more complete
>             algorithm needs to satisfy some or all of  the following:
>
>             - The algortihm should have a dispatch mechanism so that
>             optimized
>             versions for various matrix types can be provided.
>             (sparse, banded,
>             etc.). You don't necessarily need to provide them all to
>             start with.
>             - The algorithm should operate on matrix expressions
>             rather than
>             matrices (so it can be applied to for example subranges).
>             Static
>             dispatch or overload if for some reason this seems to
>             reduce performance.
>             - Const correctness is important. Try  using const
>             reference on
>             immutable types.
>             - Instead of 0.00025 provide a value based on a user
>             choice.If it is
>             hard coded by the user, the compiler will  probably
>             convert it into a
>             const value.
>             - Don't use ints for indexing, use either std::size_t, or
>             container::size_type. If you need a signed type (i.e. to
>             count for
>             differences on unsigned types) use ptrdiff_t. uBlas
>             containers provide
>             a difference_type typedef for that purpose (i.e.
>             matrix<double>::difference_type).
>             - use noalias(..) =  in every assignment that the lhs is
>             not a part of
>             rhs, or when the algebraic operation is mapped 1-1. (i.e.
>             A=2*A+B can
>             be written as noalias(A)=2*A+B, but A=prod(B,A)+D cannot
>             atm). This
>             provides more benefits than just avoiding temporaries.
>
>
>             The QR decomposition of a 100x100 matrix should take no
>             more than a
>             few miliseconds (or even less than a milisecond) to run.
>             A 1000x1000 should take around 1/3 to 1/10 of a sec.
>
>             Compile with:
>             g++ -O3 -DNDEBUG Main.cpp -o qrtest
>
>             Then you'll see that your code runs pretty fast, but it
>             doesn't scale
>             well as Oswin noted.
>
>             Best regards,
>             -Nasos
>
>
>
>
>             On 05/15/2013 10:12 PM, Salman Javaid wrote:
>
>                 Thank you, Oswin for the detailed response. I am going
>                 to update the
>                 code.
>
>                 David, Nasos, any advise on coding conventions? Or
>                 anything else that
>                 you can possible suggest? I will stand grateful.
>
>
>
>
>
>                 Best Regards,
>                 Salman Javaid
>
>
>                 On Tue, May 14, 2013 at 10:53 PM, oswin krause
>                 <oswin.krause_at_[hidden]
>                 <mailto:oswin.krause_at_[hidden]>
>                 <mailto:oswin.krause_at_[hidden]
>                 <mailto:oswin.krause_at_[hidden]>>> wrote:
>
>                     Hi,
>
>                     in the order I stumbled over the things:
>
>                     main.cpp
>                     line 44-54: you don't need a copy, instead you
>                 should use a
>                     combination of row/subrange.
>                     line 58-60: you should take a look at inner_prod
>                     line 63: 0.00025 is too big.
>                     line 66: You should never create H explicitly.
>                     line 67: because you formed H, this step is O(n^3)
>                 which makes
>                     the whole algorithm O(n^4). This can be done in O(n^2)
>                     line 73-79: same applies here.
>
>                     Greetings,
>                     Oswin
>
>                     On 14.05.2013 22:12, Salman Javaid wrote:
>
>                         Hello uBLAS Contributors:
>
>                                                              I have
>                     applied to GSoC 2013
>                         and pitched implementation of SVD
>                     factorization for uBLAS. In
>                         order to better prepare myself and to get my
>                     hands dirty at
>                         uBLAS, I ended up implementing QR
>                     Factorization employing
>                         Householder Reflections using uBLAS. This is
>                     only the first
>                         draft and will be needing significant
>                     improvement, e.g.,
>                         computation of QR decomposition of 100 * 100
>                     matrix takes around
>                         30 seconds. But I guess just to get familiar
>                     with code base, it
>                         was a good exercise. Over the next week or two
>                     I will be trying
>                         to optimize the code.
>
>
>                                                              I will be
>                     absolutely
>                         grateful if contributors can have a quick
>                     glance at the code,
>                         and point me to any improvements they can
>                     suggest. Particularly
>                         in speeding up matrix multiplication.
>
>                         I used Visual Studio 2010 to compile the code.
>                     I will try to get
>                         the code running on my Ubuntu machine in a
>                     couple of days hopefully.
>
>                         Here the header file:
>                     https://github.com/salmanjavaid/QR_Decomposition/blob/master/QR_Header.hpp
>
>
>                         The main file:
>                     https://github.com/salmanjavaid/QR_Decomposition/blob/master/Main.cpp
>
>
>                         Best Regards,
>                         Salman Javaid
>
>
>                         _______________________________________________
>                         ublas mailing list
>                     ublas_at_[hidden]
>                     <mailto:ublas_at_[hidden]>
>                      <mailto:ublas_at_[hidden]
>                     <mailto:ublas_at_[hidden]>>
>                     http://listarchives.boost.org/mailman/listinfo.cgi/ublas
>                         Sent to:Oswin.Krause_at_[hidden]
>                     <mailto:to%3AOswin.Krause_at_[hidden]>
>                      <mailto:Oswin.Krause_at_[hidden]
>                     <mailto:Oswin.Krause_at_[hidden]>>
>
>
>
>                     _______________________________________________
>                     ublas mailing list
>                 ublas_at_[hidden] <mailto:ublas_at_[hidden]>
>                 <mailto:ublas_at_[hidden]
>                 <mailto:ublas_at_[hidden]>>
>                 http://listarchives.boost.org/mailman/listinfo.cgi/ublas
>                     Sent to: Javaid.Salman_at_[hidden]
>                 <mailto:Javaid.Salman_at_[hidden]>
>                 <mailto:Javaid.Salman_at_[hidden]
>                 <mailto:Javaid.Salman_at_[hidden]>>
>
>
>
>
>
>                 _______________________________________________
>                 ublas mailing list
>                 ublas_at_[hidden] <mailto:ublas_at_[hidden]>
>                 http://listarchives.boost.org/mailman/listinfo.cgi/ublas
>                 Sent to:athanasios.iliopoulos.ctr.gr_at_[hidden]
>                 <mailto:to%3Aathanasios.iliopoulos.ctr.gr_at_[hidden]>
>
>
>
>
>             _______________________________________________
>             ublas mailing list
>             ublas_at_[hidden] <mailto:ublas_at_[hidden]>
>             http://listarchives.boost.org/mailman/listinfo.cgi/ublas
>             Sent to:Oswin.Krause_at_[hidden]
>             <mailto:to%3AOswin.Krause_at_[hidden]>
>
>
>
>
>         _______________________________________________
>         ublas mailing list
>         ublas_at_[hidden] <mailto:ublas_at_[hidden]>
>         http://listarchives.boost.org/mailman/listinfo.cgi/ublas
>         Sent to: rupp_at_[hidden] <mailto:rupp_at_[hidden]>
>
>
>     _______________________________________________
>     ublas mailing list
>     ublas_at_[hidden] <mailto:ublas_at_[hidden]>
>     http://listarchives.boost.org/mailman/listinfo.cgi/ublas
>     Sent to: Javaid.Salman_at_[hidden] <mailto:Javaid.Salman_at_[hidden]>
>
>
>
>
> _______________________________________________
> ublas mailing list
> ublas_at_[hidden]
> http://listarchives.boost.org/mailman/listinfo.cgi/ublas
> Sent to: athanasios.iliopoulos.ctr.gr_at_[hidden]