Subject: [ublas] Performance woes affecting ublas
From: Rui Maciel (rui.maciel_at_[hidden])
Date: 2010-05-10 12:47:47
I've just finished migrating a small finite element application that I'm 
writing from ublas to eigen, and I have to say that I've seen an enormous 
difference in performance.  
I've migrated my code in the following two steps:  
The first one consisted in migrating the global stiffness matrix, global nodal 
force vector and solver (in effect, the part that dealt with the K*d=f 
equation) from ublas and custom code to eigen. In short, this migration 
consisted in replacing ublas' compressed_matrix with eigen's 
DynamicSparseMatrix and replacing ublas dense vector with an object of eigen's 
Matrix<double, Dynamic,1> class.  
As a result, this step alone led my small pet program to go from taking over 
6 minutes to run the analysis down to around 20 seconds.  Granted, 
I had implemented the solvers myself without much info concerning the inner 
workings of ublas' components, which means that they certainly suffered from 
performance problems.  Nonetheless, I've implemented 3 different solvers (Gauss 
factorization with partial pivoting, Cholesky decomposition and conjugate 
gradient method) and all three solvers took roughly the same amount of time to 
solve a given system, including the cg method, which is basically a series of 
algebraic operations.
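
To illustrate why the cg method amounts to "a series of algebraic operations", here is a minimal sketch of the textbook conjugate gradient iteration for K*d=f. It is not the solver discussed in this post: it uses plain std::vector instead of ublas or eigen types, a dense matrix as a stand-in for a sparse one, and the names multiply, dot and conjugate_gradient, the tolerance and the iteration cap are all assumptions made for illustration. It also assumes K is symmetric positive definite.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Vector = std::vector<double>;
using Matrix = std::vector<Vector>;  // dense, row-major stand-in for a sparse type

// Matrix-vector product y = A * x.
Vector multiply(const Matrix& A, const Vector& x) {
    Vector y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j)
            y[i] += A[i][j] * x[j];
    return y;
}

// Inner product of two vectors of equal length.
double dot(const Vector& a, const Vector& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += a[i] * b[i];
    return sum;
}

// Solves K*d = f for symmetric positive definite K, starting from d = 0.
// Each iteration costs one matrix-vector product, two dot products and
// three vector updates.
Vector conjugate_gradient(const Matrix& K, const Vector& f,
                          double tolerance = 1e-10,
                          std::size_t max_iterations = 1000) {
    Vector d(f.size(), 0.0);
    Vector r = f;  // residual f - K*d, which equals f because d starts at 0
    Vector p = r;  // initial search direction
    double rr = dot(r, r);
    for (std::size_t k = 0; k < max_iterations && std::sqrt(rr) > tolerance; ++k) {
        const Vector Kp = multiply(K, p);
        const double alpha = rr / dot(p, Kp);  // step length along p
        for (std::size_t i = 0; i < d.size(); ++i) {
            d[i] += alpha * p[i];
            r[i] -= alpha * Kp[i];
        }
        const double rr_new = dot(r, r);
        const double beta = rr_new / rr;       // direction update factor
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return d;
}
```

Since the loop body contains nothing but products and vector updates, the run time of such a solver is dominated by how fast the underlying library performs those basic operations.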
Having finished that step, I moved on to migrating the remaining ublas code to 
eigen.  The second part consisted of a handful of dense matrices which were 
basically subjected to a series of matrix assignments and multiplications, 
along with the inversion and the calculation of the determinant of a 3x3 
matrix.  This step cut the time it took to run the analysis from around 20 
seconds down to 5 seconds. 
So, summing things up, migrating from ublas and a set of hand-made solvers to 
eigen made it possible for my program to go from taking over 6 minutes to 
solve a simple problem to taking around 5 seconds to perform the same task.
Again, I acknowledge that my sloppy code certainly had a lot to do with the 
abysmal performance penalty experienced in the ublas version of my program.  
Nonetheless, this problem could be avoided at least in part if the 
documentation were improved in key areas, such as common gotchas associated 
with sparse matrices and the efficiency associated with basic operations.
Also, through my migration it was possible to notice that ublas is far 
from efficient even when used to perform simple tasks such as products between 
smallish dense matrices (from 3x3 to 81x6) and between dense matrices and 
dense vectors, an aspect of ublas whose tuning was supposed to be a focus.  
No matter how sloppy any code is, if your code takes a 3.4x performance 
penalty just for performing basic tasks such as products between small dense 
data types...  Well, that is a good sign that something isn't working right.  
I'm aware that there were no promises made regarding efficiency but a difference 
of this magnitude leaves a lot to be desired.
Rui Maciel