$include_dir="/home/hyper-archives/boost/include"; include("$include_dir/msg-header.inc") ?>
Subject: Re: [boost] [xint] Boost.XInt formal review
From: Jeffrey Lee Hellrung, Jr. (jhellrung_at_[hidden])
Date: 2011-03-02 16:46:33
On 3/2/2011 2:58 AM, Vladimir Prus wrote:
> Boosters,
>
> the formal review of the Boost.XInt library, by Chad Nelson, starts now and will
> run through March 11.
>
> The documentation for the current version is available at:
> 	
> 	http://www.oakcircle.com/xint_docs/
>
> The source code is available via Subversion at:
>
> 	http://svn.boost.org/svn/boost/sandbox/xint
[...]
This (like Mathias) is not a review, but rather a random list of 
comments upon a first pass through the documentation (and a little 
through the implementation, but only because the documentation proved 
insufficient to answer some questions).
I have two general and major comments.
1) I was hoping the bignum algorithms would be factored out and allowed 
to operate on, e.g., ranges of digits, rather than being coupled to the 
xint::integer_t data structure.  I know this had been mentioned before. 
  Chad, what are your thoughts on this?  I know the purpose of this 
library is to provide out-of-the-box bignum functionality with minimal 
hassle, but, to me, the algorithms should be part of the interface as 
well and decoupled from the xint::integer_t data structure.
2) In a similar vein, I would like to see the cryptographic 
functionality decoupled from the xint::integer_t data structure as well, 
and released separately.
Also, good things: I like the policy-based design, and I think it's the 
right direction.  The Boost.Parameter arity thing may get onerous (see 
comment below), and one thing you could do to work around it is to just 
accept a mpl::map of tags, since I'm guessing your not using 
Boost.Parameter's deduction functionality  (awesome and powerful, but 
unnecessary in this case, I think).
On to more specific comments...
Documentation Comments
----------------------
* http://www.oakcircle.com/xint_docs/
"Why would I use it? ... Because it's fast."
Do you have any benchmarks to back this up?  Fast relative to...?
"Why would I use it? ... Because it has the features you need."
Included in here is "[c]ryptographically-secure prime number 
generation".  This seems a little too domain-specific to be included in 
this library.  Is there any reason the prime number generation couldn't 
be factored out and, further, be bignum-implementation agnostic? 
Boost.Prime, for example?
* http://www.oakcircle.com/xint_docs/namespaceboost_1_1xint.html
integer_t<...> square (const integer_t<...> n)
The complexity is just O(n^2).  If you want to more precise, don't use 
O(...), just state the number of additions/subtractions and multiplications.
integer_t<...> pow (const integer_t<...> n, const integer_t<...> e)
I suspect this complexity is, really, O( # of bits in the result ), as 
the last one or two multiplies are what dominate.
I was also thinking about adding an overload where the power is 
specified as an unsigned long, but I don't know if it would actually be 
worth it...
integer_t<...> square_root (const integer_t<...> n)
You should mention that the algorithm is just Newton iteration with 
initial guess 2^floor(log2(n)/2) (where n is the input value).  Also, 
your complexity claim implies a bounded number of Newton iterations, 
independent of the inputs, as each loop through the while loop is O(n^2) 
(where n is the # of bits) due to the division...that may be the case, 
and if so, you might want to back that up with a reference or something, 
but I suspect you should have maybe a log(n) in there (where n is the # 
of bits), as Newton has "only" quadratic convergence.
I believe there is an O(n) square_root algorithm that would be 
asymptotically faster, but...I'd have to look it up...
int is_prime (const integer_t<...> n, callback_t callback=no_callback)
Given the documentation, I think a more descriptive name, such as 
is_probably_prime_rabin_miller, would be better.  Also, I'm not sure how 
small "infinitesimally" small is...it seems the wikipedia page only 
guarantees at most a 1/4^5 chance of a false positive (given the current 
implementation), but in practice it states the chance is much much less. 
  Perhaps you should allow the iteration count (the "k" from the 
wikipedia article) to be passed as a parameter as well (possibly 
defaulted), for those that want more control on their primality test.
But, similar to a comment above, I'm not sure any cryptographic or 
number theoretic features outside of basic bignum arithmetic and 
operations should be included in this library; I would like to see them 
factored out and independent of xint.
* 
http://www.oakcircle.com/xint_docs/structboost_1_1xint_1_1options_1_1threadsafe.html
"Note that even with this option, it is still your responsibility to 
ensure that only one thread accesses a particular integer object at a time."
You seem to imply that it is not thread safe under any circumstances for 
multiple threads to even have *read* access to an xint::integer_t 
object.  Is that the case, and if so, why?
* http://www.oakcircle.com/xint_docs/classboost_1_1xint_1_1integer__t.html
template<...> class boost::xint::integer_t<>
"You can specify several template parameters for the type. Most of these 
parameters must be specified via helper classes defined in the 
boost::xint::options namespace. See the list and descriptions there."
So that covers "most" of the template parameters...what about the rest?
Also, it would be helpful to document here what the default settings are.
integer_t<...> & operator+= (const integer_t<...> b)
integer_t<...> & operator-= (const integer_t<...> b)
Why are these parameters passed by value?
int sign (bool signed_zero=false) const
I think you should separate the 2 different meanings of signedness into 
2 separate functions.  There are performance considerations, of course, 
but it's also difficult to the casual reader to infer what x.sign(true) 
is suppose to mean without looking it up.  A second function called 
sign_with_signed_zero might be more appropriate.
* http://www.oakcircle.com/xint_docs/threadsafe.html
"When using copy_on_write, identical integer objects are allowed to 
share storage, which is more efficient for both CPU power and memory..."
I'm (still) not sold that copy-on-write gives any noticeable performance 
benefits in this library over move semantics except in contrived 
examples, so I think some kind of justification of the above claim would 
be justified.
I'm also not a fan of the integer_t::integer_t(const integer_t& b, bool 
force_thread_safety) constructor; isn't there already an "ensure_unique" 
member function already there (in the implementation) that ensures an 
integer_t object has sole ownership of its storage?  Indeed, looking 
more at some source files, I would guess this is _make_unique.  I would 
consider such a member function preferable to this constructor overload, 
especially given how the mysterious boolean parameter looks in client 
code (see integer_t::sign comment above).
* http://www.oakcircle.com/xint_docs/basic__types__and__includes_8hpp.html
"#define BOOST_PARAMETER_MAX_ARITY 6"
This is not guaranteed to actually have any effect.  Think about if 
Boost.Parameter gets included before any xint header does.  Better (and, 
yes, more onerous) to document that xint::integer_t uses Boost.Parameter 
and hence the user must set this macro to at least 6.
Miscellaneous Comments
----------------------
Why is digit_t in the xint::detail namespace, when it is expected to be 
part of the interface for purposes of supplying an allocator?
How did you create your documentation?  Your documentation generator may 
appreciate an acknowledgment in your documentation.
Also, regarding the complexity notation, it would probably be best to 
state the complexity in terms of some other letter than n, so as not to 
confuse n with the input value.  I know I did *not* do this in my 
comments above, as I was trying to follow your notation, but it easily 
gets confusing.  Or just use log(n) instead of n to mean the # of bits 
or digits.
That's it for now,
- Jeff