$include_dir="/home/hyper-archives/boost/include"; include("$include_dir/msg-header.inc") ?>
From: Leland Brown (lelandbrown_at_[hidden])
Date: 2006-06-09 02:50:32
John Phillips <phillips <at> delos.mps.ohio-state.edu> writes:
> The basis for chosing the units is
> instead attempting to maximize numeric stability, and so choosing units
> that keep things as close to order 1 as possible.
BTW, this post will be only of interest to those with a mathematics or 
numerical analysis background.  Others can safely skip it without missing 
anything.
This is a very interesting issue to me!  I'm not sure what you have in mind as 
far as keeping things close to order 1 improving numerical stability, but let 
me explain what comes to mind for me.  First, there may be a possibility of 
overflow or underflow with values significantly far from order 1, especially 
with single precision floating point values.  In my application, I use double 
precision for everything, so this doesn't concern me.
The second issue relates to things like inversion of ill-conditioned matrices, 
where a matrix may be nearly singular due to a large difference in the orders 
of magnitude of its eigenvalues.  Clearly this can cause severe numerical 
problems, and keeping things near order 1 helps prevent this.
In some cases FWIW I think this problem can be resolved in another way, or even 
seen as a symptom of a different problem, which can be addressed - as I'll 
explain.  And the discipline imposed by strongly-typed dimensional analysis may 
force the implementer of the algorithm to correct the problem.  (Of course, you 
can read this as "annoy the implementer because his mathematically correct 
algorithm won't compile," depending on your perspective :-) )
Consider, for example, a function that inverts a matrix A using an SVD, in 
order to avoid numerical problems.  Let's assume A is a symmetric, positive 
definite matrix composed of values having different physical dimensions - such 
as a covariance matrix of a set of parameters having different dimensions.  I 
don't use SVD (so forgive me if I display my ignorance); but as I understand, 
it would essentially find the eigenvalues and eigenvectors, and invert the 
eigenvalues; but for any very small eigenvalues it would replace the very large 
reciprocal value with a zero instead.  If we're solving Av=b, then loosly 
speaking the numerical problem is that many values of v solve the equation, 
some with very large coordinate values.  Geometrically, the SVD can be 
described as choosing from among these values of v the one closest to the 
origin.
What has always bugged me about this description is that "closest" requires a 
distance metric, like d=sqrt(x*x+y*y), but this is meaningless in a physical 
sense if the components x and y have different physical dimensions.  Sure, if 
you have numerical values in some particular units system, you can do the 
computation and get a number.  But what doesn't sit right with me is that if 
you change your units, you can get a different "closest" point!  So even 
without roundoff errors and the like, the answer that comes out of the 
algorithm will change depending on what physical units are chosen.  To me, it 
seems that a physical problem should have a particular solution regardless of 
the units chosen (aside from issues of computational accuracy).
Likewise, the eigenvalues of this matrix will have some sort of hybrid 
dimensions just like the distance metric above.  And the eigenvectors can vary 
based on choice of units, because even the notion of two vectors being 
orthogonal is dependent on the relationship among the units of the parameters 
defining the matrix.  Because of this, as you pointed out in another post, a 
strongly-typed dimensions library would not even be able to compile the SVD 
(unless all the elements of the matrix are of the same dimensionality).
And how do I set the threshold of what's a "very small" eigenvalue for the 
SVD?  Perhaps .000001?  In what units, since the matrix elements are of 
different dimensions?  Again, the meaning of my threshold changes depending on 
the relationship among my parameter units.
I used to think all this was just the price you pay for using an SVD.  But with 
the way I've constructed my matrix, it can be decomposed as:
   A = DCD
where D is a diagonal matrix of dimensioned quantities, and C is a 
dimensionless symmetric matrix.  It's easy, in fact, to find such a diagonal 
matrix D, with no a priori knowledge of the magnitude of the values, by taking 
the square roots of the diagonal elements of A.  Then C is just the correlation 
matrix corresponding to the covariance A, and its elements tend to be close to 
order 1.  Now the matrix to be inverted (C) is likely to avoid numerical 
problems caused by large differences in magnitude, and the solution will be 
independent of the units chosen - since they will just become multipliers on 
the elements of D.  The SVD/eigenvalue computation can be done without 
violating the strong typing, and the choice of units no longer impacts the 
numerical stability of the matrix inversion.
There may be a small penalty (such as calculating the square roots), but as 
compared to an SVD algorithm I think it's almost negligible.  I expect a 
similar technique could be found for cases of nondefinite matrices or non-
square matrices, and other types of matrix calculations.  In parts of the 
algorithm that are not "mixing" dimensions (as the distance formula above 
does) - where strongly-typed computations would compile - you're always adding 
apples to apples, so the choice of units won't influence numerical problems 
like summing values of very different magnitudes.  Most matrix multiplications 
should probably be type-safe, for instance, and thus independent of units as 
far as numerical stability.
These ideas would never have occurred to me when I was just manipulating 
matrices as set of numbers.  It wasn't until I started to think of them as 
strongly-typed physical quantities that this perspective emerged.  I think 
that's also part of the value of using a dimensional analysis library.
I hope what I've suggested here makes some sense.  There may be reasons it 
doesn't apply in certain situations.  But if anyone finds it helpful, it's 
something you can try.
Regards,
-- Leland