$include_dir="/home/hyper-archives/ublas/include"; include("$include_dir/msg-header.inc") ?>
From: Gunter Winkler (guwi17_at_[hidden])
Date: 2005-03-13 16:55:41
Hello,
my results are a different than these of Paul.
Am Sonntag, 13. März 2005 10:46 schrieb Dima Sorkin:
> Because of these ( and similar cases ) examples,I think that use of
> expression templates in sparse operations, as it done now, is serious
> performance bug of ublas. I may, however , miss some point. Can
> someone explain my mistake ?
I don't think so, because each expression contains information about 
storage layout, sparsity and so on. This information can be used (and 
is used by most expressions). Of course it is a bug if the wrong 
algorithm is used. I think matrix_assign does still not work perfectly.
> Minor remarks:
> By the way it was undocumented in ublas help that compressed_matrix
> has direct constructor from coordinate_matrix. According to ublas
> help it can be constructed from matrix_expression only.
That correct. There is no constructor from COO. But its easy to convert 
the data. I could create such a constructor ...
> Second minor remark : in ublas documentation the arguments of sparse
> matrices constructors written in reverse order - nnzeros comes first.
Thanks. I should change this.
Now my results:
(compressed_matrix: cm1, cm2; mapped_matrix (was sparse_matrix) mm)
I first ran the expression constructor cm2(-cm1) with different sizes 
and number of nonzeros (now: cm1.nnz()). The result was O(n^2) where n 
is the size of the matrix.
2nd I ran an assign cm = mm
3rd I ran the product mm = prod(cm1, trans(cm2)) which gave O(n^2) where 
n is the size of the matrix. (I also timed cm3 = prod(cm1, trans(cm2)))
size    nnz     nnz     copy    assign  prod1   nnz_p1  prod2   nnz_p2
8000 100 100 0 6.26 13.67 104 14.79 104
8000 200 200 0 6.5 13.76 201 14.91 201
8000 400 400 0 6.76 14 401 15.1 401
8000 800 800 0 6.82 14.46 828 15.51 828
8000 1600 1600 0 6.51 15.24 1752 16.36 1752
8000 3200 3200 0.01 5.58 16.83 3977 17.73 3977
8000 6400 6400 0 3.83 19.09 9660 19.98 9660
8000 12800 12800 0 1.79 22.16 26934 23.03 26934
all times seem to be O(size^2) but nearly independent of nnz. Why? The 
iterator produce very much overhead. (The decreasing times for the 
assign are interesting ...)
I'll put more extended numbers on my page.
mfg
Gunter