From: walter_at_[hidden]
Date: 2001-11-21 05:58:40


--- In boost_at_y..., Peter Schmitteckert (boost) <boost_at_s...> wrote:
> Salut,
>
> On Tuesday 20 November 2001 16:33, walter_at_g... wrote:
> [...]
>
> > > Are there any blocking techniques in the current ublas library?
> >
> > One of the objectives for ublas is to reduce the abstraction penalty
> > when using vector and matrix abstractions. Therefore we use Todd
> > Veldhuizen's expression template technique to eliminate temporaries
> > and fuse loops, and the Barton-Nackman trick to avoid virtual function
> > call overhead.
>
> Thanks for the clarification.
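
For anyone else following the thread, here is a minimal sketch of how the
two techniques fit together. The names are made up for illustration and are
not the actual ublas classes: the CRTP base class gives static dispatch in
the Barton-Nackman style, and the expression objects defer evaluation so
that an assignment like x = a + b + c runs as one fused loop, without
temporary vectors.

#include <cstddef>
#include <vector>

// Base class templated on the derived type (CRTP / Barton-Nackman):
// the concrete type is known at compile time, so no virtual calls.
template<class E>
struct vector_expression {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Lazy sum of two expressions; the addition happens in operator[].
template<class E1, class E2>
struct vector_sum : vector_expression<vector_sum<E1, E2> > {
    const E1& e1;
    const E2& e2;
    vector_sum(const E1& e1, const E2& e2) : e1(e1), e2(e2) {}
    std::size_t size() const { return e1.size(); }
    double operator[](std::size_t i) const { return e1[i] + e2[i]; }
};

template<class E1, class E2>
vector_sum<E1, E2> operator+(const vector_expression<E1>& e1,
                             const vector_expression<E2>& e2) {
    return vector_sum<E1, E2>(e1.self(), e2.self());
}

// A concrete vector: assigning any expression is a single loop, so
// x = a + b + c evaluates element by element without intermediates.
struct dense_vector : vector_expression<dense_vector> {
    std::vector<double> data;
    explicit dense_vector(std::size_t n) : data(n) {}
    std::size_t size() const { return data.size(); }
    double operator[](std::size_t i) const { return data[i]; }
    double& operator[](std::size_t i) { return data[i]; }
    template<class E>
    dense_vector& operator=(const vector_expression<E>& e) {
        for (std::size_t i = 0; i < size(); ++i)
            data[i] = e.self()[i];
        return *this;
    }
};
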
> In my problem a major part of the computing time is spent in
> matrix-matrix multiplications of various forms, with dimensions ranging
> from tiny up to ~1000, with typical sizes of 100..400.
 
Ok, we'll check the effect of blocked matrix multiply for these sizes.
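
To make concrete what I mean, here is a minimal sketch of a blocked
multiply, C += A * B, for square row-major matrices. The function name and
the tile size BS are my own placeholders, and BS would have to be tuned
per platform so that the tiles fit in cache:

#include <cstddef>

const std::size_t BS = 64; // hypothetical tile size, to be tuned

inline std::size_t min2(std::size_t a, std::size_t b) {
    return a < b ? a : b;
}

// C += A * B for n x n row-major matrices, computed tile by tile so
// that each tile of A, B and C is reused while it is still in cache.
void blocked_gemm(std::size_t n,
                  const double* A, const double* B, double* C) {
    for (std::size_t ii = 0; ii < n; ii += BS)
        for (std::size_t kk = 0; kk < n; kk += BS)
            for (std::size_t jj = 0; jj < n; jj += BS)
                for (std::size_t i = ii; i < min2(ii + BS, n); ++i)
                    for (std::size_t k = kk; k < min2(kk + BS, n); ++k) {
                        const double a_ik = A[i * n + k];
                        for (std::size_t j = jj; j < min2(jj + BS, n); ++j)
                            C[i * n + j] += a_ik * B[k * n + j];
                    }
}
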
 
> (All these small blocks are part
> of a big matrix, and the vectors are stored in a dyadic product of
> two vector spaces, i.e. represented by a list of matrices.)

You've lost me. Could you please explain or give a reference?
 
> > > I'm asking since I would need performance comparable
> > > to BLAS/ATLAS.
> >
> > Do you want to get netlib (reference) BLAS performance? Do you want
[snip]
> > to get ATLAS performance without any platform-specific optimization?
> > If we tune for a certain platform, which compiler/operating system
> > combination do you prefer?
>
> That's a pretty hard question. The program will at least run on
> Athlon PCs (Linux cluster), IBM RS6K (SP) Power2/3/4, HP PA, and
> SGI Origins.
 
We only share the Intel/Linux platform.

> I'd be happy if I could replace (specialize) a few routines of ublas
> with ATLAS or vendor-supplied BLAS routines (mainly _x_gemm) in order
> to perform benchmarks.
 
I think this is one of the next steps, as already discussed with Toon
Knapen. We'll look into it as well.
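
As a sketch of what such a specialization could look like (the wrapper
name is mine, not an agreed interface), a dense double-precision product
could be forwarded to ATLAS through the C BLAS interface, assuming
row-major contiguous storage and linking against -lcblas -latlas:

#include <cstddef>
#include <cblas.h> // ATLAS' C interface to the BLAS

// C = A * B for n x n row-major matrices, delegated to dgemm.
void prod_via_atlas(std::size_t n,
                    const double* A, const double* B, double* C) {
    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                (int) n, (int) n, (int) n,
                1.0, A, (int) n,
                B, (int) n,
                0.0, C, (int) n);
}
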
 
Regards
 
Joerg