$include_dir="/home/hyper-archives/boost-users/include"; include("$include_dir/msg-header.inc") ?>
Subject: Re: [Boost-users] Review Results for Proposed Boost.Process library
From: Boris Schaeling (boris_at_[hidden])
Date: 2011-03-08 18:45:34
On Tue, 08 Mar 2011 00:40:36 +0100, Marshall Clow <mclow.lists_at_[hidden]>  
wrote:
> [...]Thank you to everyone who participated, and especially to Boris and  
> those who wrote a review.
And thank you for having volunteered to be the review manager!
Now how do we proceed from here after the library has been rejected? Here  
are some comments from me:
There were a couple of ideas and new proposals how a process management  
library could look like. If you feel you'd like to step in and take over  
Boost.Process to give it another boost - drop me a mail and let's discuss  
how I can support you! I would very much welcome new developers actively  
working on the library!
In case there is not much happening until summer and Boost is accepted in  
this year's Google Summer of Code program again, I maybe offer to be  
available as a mentor for another Boost.Process GSoC project. In the short  
term I probably won't have time to work on Boost.Process. Nevertheless, I  
want to write down some conclusions I drew from the review on what I think  
should be done next.
Jeff had proposed an executor concept. I'm not sure if I understand  
everything about the executor concept but I can see a few advantages over  
the current design. For example the current design requires to use  
stream_ends to configure streams of a child process. A stream_ends object  
contains two handles in case the parent process wants to communicate with  
the child process. However there are use cases where you don't need two  
handles - for example when you want to redirect stdout of the child  
process to a file. Jeff's executor concept makes it possible to pass only  
one handle without a second dummy handle in a stream_ends object.
The stream_ends class wouldn't be required anymore and could be removed  
 from Boost.Process. A library user would still need to create a handle for  
the file to redirect to. But with the executor concept one could use  
boost::path. Boost.Process could support different types and the  
implementation would do the rest.
Many stream behaviors which are currently used to create stream_ends could  
be removed. Others can be redesigned to make them more useful - also  
outside of Boost.Process. This includes for example pipe, named_pipe and  
async_pipe. These could be functions which return a read and a write end  
(instead of a parent and a child end). I guess we still want a class for a  
pair of handles (stream_ends was introduced to make it clear what the  
parent and child end is; with std::pair you would need to remember).
With different classes like boost::path and a handle for pipes (whatever  
the class would be; some classes from Boost.Iostreams were proposed) to  
configure streams, the context class currently used by Boost.Process would  
turn into something hard to describe. There would need to be all kind of  
containers as different types could be used to configure streams. I think  
this also speaks for Jeff's idea. Instead of defining a context class  
which needs to support each and every possible combination of stream  
settings, providing overloaded functions seems to make more sense.
There was the requirement of reusing a context (to create multiple child  
processes with the same configuration). This would need to be given up  
with the executor concept (stream behaviors from context are used to  
create handles on demand; with the executor concept the library user would  
for example need to create pipes himself to pass read or write ends to  
child processes). I would be fine with this as I think a better design and  
API is more important than this requirement (so I agree with Jeremy that  
this is not worth the extra complexity).
When the executor is used to create a child process - whatever "use" means  
here - I wonder whether it should return a child object (currently done by  
Boost.Process) or a handle (eventually even a native handle?). With a  
handle a library user could decide himself what to do and whether he needs  
the features of the child class. The current implementation of the child  
class is in so far problematic as it does something extra on Windows only  
to match the behavior on POSIX (the handle of the child process is kept  
open to guarantee that the exit code can be fetched). Instead of a child  
class with methods, one could use a handle and free-standing functions.
There is another reason why the child class wouldn't be required anymore:  
It currently stores the stream ends the parent process uses to communicate  
with the child process. If we give up the context class and require  
library users to create handles themselves (via pipe, named_pipe and  
async_pipe) the parent process has already access to its "stream ends".  
The child class would only store a handle of the child process - but for  
that we don't need a child class of course. No process class wouldn't be  
required either. The methods wait() and terminate() would become  
free-standing functions.
Supporting asynchronous read/write operations would be as easy as it is:  
The handle can be used with Boost.Asio's classes. Whether we can reuse  
code from other libraries (Boost.Iostreams?) to support streams I don't  
know. Currently Boost.Process defines pistream and postream (but only  
because a close() method had to be added) which are based on a systembuf  
class (not sure if anything is available in other Boost libraries which  
could be reused here).
Even though I added support for an asynchronous wait() operation I argue  
now to remove that part (at least for POSIX platforms). The function is  
very easy to use because it blurs differences between systems. But this is  
more Java's than C++'s realm. Or to quote Artyom: "SIGCHLD is used for  
wait operation on the child to exit, it is its purpose." While one could  
try to implement an asynchronous wait() operation with SIGCHLD there are  
other problems. In my opinion a cleaner solution is to simply support  
signal handling. Here we give up the goal of being platform-independent.  
But if I look at all options I think having a signal handler for POSIX  
comes closest to what C++ developers and Boost users want. The signal  
handler from Dmitry Goncharov could be used for a start.
For developers on Windows support for an asynchronous wait() looks very  
different. The current implementation is based on WaitForMultipleObjects()  
which is called in a helper thread. This can be implemented as a  
Boost.Asio service (it wouldn't be the first one based on a helper  
thread). Boost.Asio doesn't ship a service based on  
WaitForMultipleObjects() (and there is currently no plan that such a  
service will be added to Boost.Asio) so it would need to be shipped with  
Boost.Process.
The Windows implementation of find_executable_in_path() has to be changed:  
It should use FindFirstFile()/FindNextFile() and PATHEXT.
If an utility function like shell() will be provided, it should use  
COMSPEC on Windows (and possibly something similar on POSIX instead of a  
hardcoded path to /bin/sh).
There was an interesting remark that create_child() (or an executor)  
should tell the parent process if the child process couldn't be created -  
even if fork() has been successful and there are already two processes. I  
guess we should indeed not write an error message to STDERR_FILENO but  
throw an exception. The user can then handle the error himself. In that  
case we need to help him though to find out whether the exception was  
thrown before fork() (then it's handled in the current process) or after  
fork() (then it's handled in the copy of the current process) - for  
example with two different exception classes.
Max proposed to use a DSEL. Whether I pass command line arguments to a  
process via an overloaded operator[] or overloaded functions like in  
Jeff's executor concept doesn't make a big difference to me in the end. I  
would probably prefer Jeff's proposal as I currently don't see what  
operator[] and operator% buy us. Max's proposal puts some emphasis on  
piping processes though which is not supported by the current draft. I  
guess it would make indeed sense to add support again (we had support in  
earlier versions) - and if it's only to think about how the library would  
need to evolve in the future. The biggest problem I have with Max's  
proposal is that it seems to be based on the assumption that declarative  
programming is automatically better than imperative programming. That may  
be true for certain use cases and some examples. However once you want to  
do something else not covered by the DSEL, you have to leave the DSEL  
behind. From the examples I've seen the DSEL seems to be optimized for  
piping processes. But changing the working directory of a child process,  
accessing a STARTUPINFO structure, setting up environment variables for a  
child process, terminating child processes or waiting for them and  
whatever other developers would like to do would all need to be done  
differently. I'm not sure whether a DSEL can be created which provides  
that much flexibility - especially as some complained that not even the  
current draft provides enough flexibility.
There were many more remarks about implementation details. But I think I  
would bore you if I wrote them all down. I stop for now - the email is  
long enough to give you lots of new opportunities to disagree. ;)
Boris