$include_dir="/home/hyper-archives/boost/include"; include("$include_dir/msg-header.inc") ?>
Subject: Re: [boost] Formal Review: Boost.Polygon starts today August 24, 2009
From: Barend Gehrels (barend_at_[hidden])
Date: 2009-08-31 18:33:50
>
> The formal review of the Boost.Polygon library by Lucanus Simonson 
> starts today, August 24, 2009 and will finish September 2, 2009.
>
First, congratulations Luke, and also Gyuszi, with having finished this 
library and having it for review.
As one of the authors of another Geometry Library, also aimed for Boost, 
I'll try to be as objective as possible. I'll not compare it too much to 
our library, though I'll give some performance measures compared with 
many other libraries, also ours (ggl).
When I started testing Boost.Polygon, in February or March, there was 
not yet any documentation, it was difficult to write the benchmarks, it 
didn't compile using MSVC 2005. That is all solved now, it is 
documented, it compiles using MSVC 2005 now, there are some positive 
reviews on design, implementation and the paper. So this is a good step 
forward.
In my domain (GIS) we, normally, only use floating point coordinates on 
arbitrary angle polygons. I'll not review the 45/90 polygons because I 
don't know them and I'm not interested in them.
Expectations: I think a templated Boost.Polygon library should at least:
1) support most of the basic polygon geometry algorithms
2) have a performance comparable to other libraries
3) support both integer and floating point coordinate types
1) the library supports many operations, but some basic polygon 
algorithms are missing:
- convex hull
- centroid
- is_convex
Those operations are found in many libraries and I think they should be 
supported by Boost.Polygon. It would also make the performance 
comparison (even) more interesting.
2) though some reviewers were satisfied about the performance, and the 
paper indicates it is well performing, our benchmark shows that it is in 
general slower or much slower than other libraries:
- the performance of the "contains" operation is perfect, fastest available
- the "area" operation is about a factor 1.5 slower than most other 
libraries. This is surprising because "area" is quite a simple 
operation, it should be possible to implement it with the same 
performance as e.g. CGAL or wykobi.
- the "intersection" operation is too slow compared with other 
libraries. The paper (referred to in the documentation) compares the 
intersection performance with GPC and CGAL, but does not give exact 
figures. It gives some plots about scalability using a very high number 
of points per polygon.
In normal usage, in nearly all cases, GPC outperforms Boost.Polygon with 
a high factor:
--> GPC: 9.0 seconds, GTL/Boost.Polygon: 92.2 seconds for the same 
operation (for 100 points per overlayed ellipse)
--> GPC: 13.7 seconds, GTL/Boost.Polygon: 115.3 seconds for the same 
operation (1000 points per overlayed ellipse, which is already large).
--> Also with 10000 points GPC is still much faster (factor 5).
--> Only in high-intersection-densities with a very high number of 
points GTL is faster than GPC.
If I was a user and license would allow me, I would definitely take GPC 
and not BP, because of performance and because of floating point...
--> compared with CGAL: in normal usage, GTL is indeed faster. But in 
intersection-dense environments CGAL scales better (this in contrary to 
the paper)
--> compared with GGL: in normal usage Boost.Polygon is a factor 81.1 
slower than GGL. While it was suggested that we should use BP for our 
boolean operations, and we evaluated this (resulting in this 
comparison), we're glad that we first compared it to other libraries 
because a user can wait for 1 second, but cannot wait for 92 seconds... 
for a less precise result... In all tests GGL is way faster.
--> for the complete overview, see 
http://trac.osgeo.org/ggl/wiki/Performance and 
http://trac.osgeo.org/ggl/wiki/IntersectionMore
Most of our benchmark results are contrary to the results of your paper. 
I think that if you write things like this, you should publish the 
benchmark sources because it is very sensitive information. People 
should be able to reproduce them. Also exact figures in the paper would 
help to interpret the results.
The Boost.Polygon library is specialized for boolean operations and the 
paper is all about performance. But in the end it is really not 
performing well...
3) Boost.Polygon supports integer but does not support double for 
boolean operations.
For the operations "area" and "contains", coordinates in "double" 
floating precision are supported. Also the boolean operations accept 
double coordinates, but crashes immediately. I think it should not 
crash, slightly better is to prevent floating point coordinates by 
giving a compilation error. Much better: floating point coordinates 
should be implemented. For me as GIS-user, a library without floating 
point support is unusable.
For the comparison I converted to integer coordinates (multiplying using 
a factor of 10000.0). But in the resulting summed area you can see that 
this is not working perfectly. The area differs, and where it should 
increase (star-ellipse with more and more points) it is decreasing... I 
tried other integer-double-conversion factors as well but that made 
things worse.
>
>
> - What is your evaluation of the design?
It is difficult to answer this question because I (with Bruno, and some 
folks of this list) designed another library in another way, it is hard 
to be objective here. I would prefer a geometry library, instead of a 
polygon library. Polygons come seldom alone. It would be useful to have 
other geometry types as well.
The 45/90 polygons could be implemented as specializations (special 
cases) of the normal polygons, overriding implementations. This library 
is designed the other way round, it started with 90/45 polygons and 
arbitrary polygons were added to be submitted as a boost library.
The diagram shows the polygon_90 as a refinement to a rectangle, the 
polygon_45 as a refinement to a polygon_90, the polygon as a refinement 
to a polygon_90, and a polygon_45_with_holes as a refinement to both a 
polgon_90_with_holes and a polygon_45. All this makes really no sense to 
me, I think it could and should be much more simple.
> - What is your evaluation of the implementation?
It is curious that Boost.Polygon went from Tag Dispatching to SFINAE 
last year, while we (GGL) went from SFINAE to Tag Dispatching. We're 
very glad with our change, while most compiler problems and warnings in 
Boost.Polygon origin from the SFINAE implementation. I'm still convinced 
that Tag Dispatching, suggested to us on this list by Dave Abrahams, is 
a very good choice in a geometry library implementation. We're happy 
with it and all compiler and compatability problems are vanished.
Things are copied from other boost libraries. File isotropy.hpp contains 
a nearly literal copy of enable_if. There is also an extraction of MPL. 
While this is an explicit choice, I don't think it is a good choice. The 
Boost Concept Check Library is not used to check concepts.
I didn't try everything, but I'm getting the feeling that the concepts 
are not implemented as they should be. The file polygon_set_concept.hpp, 
which should give an outline of the polygon set concept, is dependant on 
polygon_set_data.hpp (which implements that concept). That should be the 
other way round. Inside that concept file, points are often copied into 
a polygon_set_data temporary. That is not how concepts should be 
implemented, and is very bad for performance.
The file transform.hpp contains a large number of enumarions for 
transforming polygons up/down/west/east etc. I think this is not 
necessary, and besides that "west" / "east" gives the impression of 
spherical coordinate systems while this library is cartesian only. 
Left/right should already be better, but why not two offset numbers in 
two dimensions?
> - What is your evaluation of the documentation?
I didn't read all. It is looking good but contains several errors in the 
details. E.g. documentation of the polygon_90_with_holes_concept, 
function begin_holes has the description  "Returns the begin iterator 
over the range of coordinates that correspond to horizontal and vertical 
edges.". This is probably copy/paste error. Another example is the "void 
clean() const"  function, which cannot be const of course, indicating 
that this documentation is written / copied / pasted and not generated 
with e.g. Doxygen.
> - What is your evaluation of the potential usefulness of the library?
I think most people using geometry / polygons are using floating point 
coordinates. As these are not supported in the algorithms where this 
library is specialized on (booleans), I think it is useful for a limited 
public (VSLI?)
> - Did you try to use the library?  With what compiler?  Did you have any
> problems?
MSVC 2005, MSVC 2008, GCC 4.4 using MinGW. Everything compiled but the * 
notation for intersections didn't compile anymore (it did before)
> - How much effort did you put into your evaluation? A glance? A quick 
> reading? In-depth study?
An in-depth study to several aspects of the library
> - Are you knowledgeable about the problem domain?
I'm designing and implementing GIS / geometry libraries since 1992.
I hope that my review was as objective as can be expected from an 
alternative library writer.
I conclude with the remark that I've asked several times on this list 
and off-list for cooperation, I suggested that Luke could join us 
writing a combined Geometry Library for Boost. All these requests were 
never answered. I would still be in favour of one library.
If this library is accepted it will create a different situation for 
subsequent geometry libraries. In reviews it will be expected that these 
libraries work together or that one is built on top of the other. If 
that is the case, the base should be very very good. It is to the other 
reviewers to decide that this is the case.
Regards, Barend