Subject: Re: [boost] [http] Formal Review
From: Lee Clagett (forum_at_[hidden])
Date: 2015-08-12 18:27:54


On Wed, Aug 12, 2015 at 2:58 PM, Antony Polukhin <antoshkka_at_[hidden]>
wrote:

> Hi,
>
> This is my formal review for Boost.HTTP library.
>
> 1. Should Boost.Http be accepted into Boost?
>
> No. It has some big architectural issues and a lot of issues in code.
>
>
> 2. What is your evaluation of the design?
>
> That's the fatal problem of the library. It seems like the library attempts
> to get best from ASIO and cpp-netlib and unfortunately fails. Now let's get
> into details:
>
> * ASIO is a low level abstraction over protocol that does not take care
> about memory management of buffers (in most cases it's the responsibility
> of user to allocate and keep alive the required buffers).
> * cpp-netlib takes care of memory management, hides low-level stuff.
>
> Boost.Http is stuck somewhere in the middle: sometimes it allocates and
> extends buffers, sometimes not. Because of that the design does not seem
> solid. Any attempt to hide all the memory management inside the library
> will end in an another implementation of cpp-netlib. So it seems right to
> remove all the memory management from the library internals and make the
> library closer to ASIO.
>
>
This is one my complaints in my in-progress review. The documentation
doesn't explicitly state how the provided SequenceContainer is being used
during read/writes of the body. And SequenceContainer is a lax concept -
std::list models SequenceContainer, which means all read/writes to the body
require an internal copy to meet the concept correctly.

> Here are my recommendations for separation of flies and dishes:
>
> * Stick to the idea that http socket must ONLY manage connection without
> extending the ASIO's send/receive function signatures. That means that
> http1.0 and http1.1 sockets must be almost a typedef for tcp/ssl socket.
> Http2.0 socket must be very close to SSL socket. No additional methods
> (like async_read_trailers or async_write_response) must be provided.
>
> * It must be a user's responsibility to provide buffers for messages. Never
> extend buffers in socket-related methods.
>
>
* Provide helper classes that parse and generate http
>
> * Provide out-of-the-box completion conditions for async_read operation:
> http::completions::full, http::completions::headers,
> http::completions::body...
>
> Those bullets will allow to write following code (based on
>
> http://www.boost.org/doc/libs/1_58_0/doc/html/boost_asio/tutorial/tutdaytime3/src.html
> with minimal modifications):
>
> #include <ctime>
> #include <iostream>
> #include <string>
> #include <boost/bind.hpp>
> #include <boost/shared_ptr.hpp>
> #include <boost/enable_shared_from_this.hpp>
> #include <boost/asio.hpp>
>
> std::string make_daytime_string() {
> using namespace std; // For time_t, time and ctime;
> time_t now = time(0);
> return ctime(&now);
> }
>
> class http_connection : public
> boost::enable_shared_from_this<http_connection> {
> public:
> typedef boost::shared_ptr<http_connection> pointer;
>
> static pointer create(boost::asio::io_service& io_service) {
> return pointer(new http_connection(io_service));
> }
>
> http::socket& socket() {
> return socket_;
> }
>
> void start() {
> message_.resize(4 * 1024);
>
> boost::asio::async_read(socket_, boost::asio::buffer(message_),
> http::completions::full(message_), // read until all the whole
> request is in `message_`
> boost::bind(&http_connection::handle_read, shared_from_this(),
> boost::asio::placeholders::error,
> boost::asio::placeholders::bytes_transferred));
> }
>
> private:
> http_connection(boost::asio::io_service& io_service)
> : socket_(io_service)
> {}
>
> void handle_write(const boost::system::error_code& /*error*/,
> size_t /*bytes_transferred*/)
> {}
>
> void handle_read(const boost::system::error_code& error, size_t
> bytes_transferred) {
> assert(!error); // for shortness
> // `message_` contains the whole request, including headers and body
> // because http::completions::full was provided
>
> {
> boost::system::error_code e;
> http::view v(message_, e); // represent buffer as http message
> assert(!e); // no error occured during parsing
> assert(v.read_state() == http::read_state::empty);
> std::cerr << v.version(); // "HTTP1.1"
> std::cerr << v["Content-type"]; // "plain/text"
> std::cerr << v.body(); // string_ref("Hello! This is a body
> content. What time is it?")
> message_.clear(); // `v` is now invalid to use!
> }
>
> {
> http::generator resp(message_); // use `message_` for an
> output
> resp << make_daytime_string(); // append daytime to body
> resp["Content-Type"] = "plain/text"; // set some headers
> resp.version(http::version::HTTP_1_1); // explicitly set HTTP
> version
> resp.code(200); // "200 OK"
> resp.flush(); // explicitly flush to
> buffer. Done automatically in destructor
> }
>
> boost::asio::async_write(socket_, boost::asio::buffer(message_),
> boost::bind(&http_connection::handle_write, shared_from_this(),
> boost::asio::placeholders::error,
> boost::asio::placeholders::bytes_transferred));
> }
>
> http::socket socket_; // manages the connection only
> std::string message_;
> };
>
> class http_server {
> public:
> http_server(boost::asio::io_service& io_service)
> : acceptor_(io_service, http::endpoint(http::all_versions, tcp::v4(),
> 80))
> {
> start_accept();
> }
>
> private:
> void start_accept() {
> http_connection::pointer new_connection =
> http_connection::create(acceptor_.get_io_service());
>
> acceptor_.async_accept(new_connection->socket(),
> boost::bind(&http_server::handle_accept, this, new_connection,
> boost::asio::placeholders::error));
> }
>
> void handle_accept(http_connection::pointer new_connection,
> const boost::system::error_code& error)
> {
> if (!error) {
> if (new_connection->socket().version() == http::version::HTTP_2_0) {
> // do some more stuff
> }
>
> new_connection->start();
> }
>
> start_accept();
> }
>
> http::acceptor acceptor_;
> };
>
> int main() {
> try {
> boost::asio::io_service io_service;
> http_server server(io_service);
> io_service.run();
> } catch (std::exception& e) {
> std::cerr << e.what() << std::endl;
> }
>
> return 0;
> }
>
> This will also untie Boost.HTTP from ASIO. Now users could use
> http::generator/http::view to read/generate HTTP data. This significantly
> extends the use cases of your library (for example on a previous work we
> had our own network transfer implementation, but we'd gladly use a http
> parser and generator).
>
> How is your http::stream version asynchronous? It appears it would be
blocking, unless there is some other magic occurring. It would be more
concerning on the reads (the generation could be pre-computed), because if
the underlying stream didn't have the data for .version(), etc., I think it
would need to block. Or it would have to read the entire header into one
larger buffer then parse? I guess you could use `async_read_until(..., ,
...) functionality and keep buffering (until a max of course), then throw
that to the parser?

Anyway - I was thinking along the same lines at various points. Having a
function that pre-generates HTTP messages is very useful IMO, and should
likely be included. Designing a good parsing concept would be a bit more
work I think, but probably worth it too. I'm not sure how the author
intends to swap out parsers in the current design. Having a fixed parser
seems acceptable, but the author almost seemed to suggest that it could be
selectable somehow.

> Such code is also less intrusive: only minimal chages were required to
> change a tcp example into the http example. Users will appreciate that,
> especially in cases when ASIO is already used for HTTP.
>
> BTW, after that you could easily implement http::stream (that must be close
> to tcp::stream by functionality). http::stream would be a killer feature
> that would be widely used by utility programs (as I remember current
> Boost's regression testing requires it, but uses tcp::stream with a bunch
> of workarounds).
>
> Please, implement the HTTP2.0 protocol. There are some open source
> solutions that are licensed under the Boost license and are very close to
> full implementation of HTTP2.0. Just reuse the code and chat with the
> authors for better understanding of the problems.
>
>
> 3. What is your evaluation of the implementation?
>
> Not perfect.
>
> For example
>
> https://github.com/BoostGSoC14/boost.http/blob/master/include/boost/http/socket-inl.hpp#L352
> It seems that `std::vector<asio::const_buffer> buffers;
> buffers.reserve(nbuffer_pieces);` is meant to be here. There are more such
> places...
>
> It seems that some useful features (like custom allocators) are not
> supported by the library. It's better to be fixed.
>
> Some of the implementation issues are because of the library design. For
> example requesting multiple async operations
> (async_read_request+async_read_some+async_read_trailers) is performance
> hit. It's better to read all the data at once (just like in the example
> from above).
>
>
How much of a penalty do you think there is to this split design? It seems
likely that multiple handlers will have to be given to read_some (directly
or indirectly) for ASIO to read the header, etc. anyway.

>
> 4. What is your evaluation of the documentation?
>
> First code snippet in tutorial contains mostly ASIO's stuff and almost no
> HTTP stuff. That's wired... If bullets from above would be implemented,
> then I'd rather start with parsing and generating simple http messages.
> Then an example with sockets could be provided.
>
> Also please do not use coroutines at least in first example. coroutines is
> a new feature in ASIO, many users use old Boost versions that have no
> coroutines and some users are not familiar with them. So if you start with
> coroutines user will have to go to ASIO docs and dig into coroutines to
> understand the very first example. Example without coroutines will be more
> user friendly.
>
>
> 5. What is your evaluation of the potential usefulness of the library?
>
> Library is very useful and required. But in current state it must not be
> accepted into Boost.
>
>
> 6. Did you try to use the library? With what compiler? Did you have
> any problems?
>
> Have not tryed.
>
>
> 7. How much effort did you put into your evaluation? A glance? A quick
> reading? In-depth study?
>
> A few hours digging into docs, about 3 hours looking into the sources,
> about 3 hours formulising the issues and recommending solutions.
>
>
> 8. Are you knowledgeable about the problem domain?
>
> Not an expert, but used ASIO a lot along with cpp-netlib and some other
> libraries close to the proposed one.
>
>
> 8.3333(3) Review
>
> Please, reserve a bigger window when such complex library is proposed for
> review. It's hard to review it in 10 days.
>
>
> 8.6666(6) For Reviewers
>
> I urge other reviewers to take a look at the cpp-netlib. If you treat
> Boost.Http as an out-of-box solution, then cpp-netlib is more mature,
> user-friendly and solid. If you treat Boost.Http as a basic building block
> for HTTP (just like ASIO is a basic building block for networking), then
> it's better to stick to the ASIO's design and do not *partially* manage
> memory!
>
>
>
> P.S.: I hope that this library would be changed according to my
> recommendations and submitted for a new review.
>
> I agree. More explicit memory management would differentiate this from
existing libraries, otherwise boost should likely just review one of the
other major libraries.

Lee