Subject: Re: [boost] Stacking iterators vs. dataflow
From: Sebastian Redl (sebastian.redl_at_[hidden])
Date: 2008-09-03 15:23:16


Phil Endecott wrote:
>
> Here's a practical example:
>
> cat email_with_attached_picture | decode_base64 | decode_jpeg |
> resize_image > /dev/framebuffer
>
> How can I convert that shell pipeline into C++? Naive approach:
>
> vector<byte> a = read_file("/path/to/email");
> vector<byte> b = decode_base64(a);
> vector<byte> c = decode_jpeg(b);
> vector<byte> d = resize_image(c);
> write_file("/dev/framebuffer",d);
>
> The problem with that is that I don't start to decode anything until
> I've read in the whole of the input. The system would be perceptibly
> faster if the decoding could start as soon as the first data were
> available.
>
> So I can use some sort of iterator adaptor stack or dataflow graph to
> process the data piece at a time. But it's important that I process
> it in pieces of the right size. Base64 decoding converts 4 input
> bytes into 3 output bytes, but it would be a bad idea to read the data
> from the file 4 bytes at a time; we should probably ask for BUFSZ
> bytes. libjpeg works in terms of lines, and you can ask it (at
> runtime, after it has read the file header) how many lines it suggests
> processing at a time (it's probably the height of the DCT blocks in
> the image). Obviously that corresponds to a variable number of bytes
> in the input.
>
> I would love to see how readers would approach this problem using the
> various existing and proposed libraries.
Since we're talking about systems that combine various approaches to
data processing, I should point out my IOChain library once more.
http://listarchives.boost.org/Archives/boost/2008/02/132953.php

It does something very similar to what you describe here.

Boost.Iostreams also does something like this with its filters.
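
To make the chunking point concrete, here is a minimal sketch of a single
pipeline stage that can be fed pieces of any size. It is not the IOChain or
Boost.Iostreams interface; Base64Decoder, feed and decode_in_chunks are
names made up for illustration, and only the standard library is assumed.
The stage keeps the leftover bits of an incomplete 4-character group between
calls, so the reader upstream is free to ask for BUFSZ bytes at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical streaming stage (illustration only, not a real library API).
// State survives between feed() calls, so chunk boundaries may fall anywhere.
class Base64Decoder {
public:
    // Decode one chunk of base64 text and append the decoded bytes to `out`.
    void feed(const std::string& chunk, std::vector<std::uint8_t>& out) {
        for (char ch : chunk) {
            const int v = value_of(ch);
            if (v < 0) continue;  // skips '=' padding, line breaks, other noise
            acc_ = (acc_ << 6) | static_cast<std::uint32_t>(v);
            bits_ += 6;
            if (bits_ >= 8) {
                bits_ -= 8;
                out.push_back(static_cast<std::uint8_t>((acc_ >> bits_) & 0xFFu));
                acc_ &= (1u << bits_) - 1u;  // keep only the bits not yet emitted
            }
        }
    }

private:
    // Map a character of the standard base64 alphabet to its 6-bit value,
    // or return -1 for anything outside the alphabet.
    static int value_of(char ch) {
        if (ch >= 'A' && ch <= 'Z') return ch - 'A';
        if (ch >= 'a' && ch <= 'z') return ch - 'a' + 26;
        if (ch >= '0' && ch <= '9') return ch - '0' + 52;
        if (ch == '+') return 62;
        if (ch == '/') return 63;
        return -1;
    }

    std::uint32_t acc_ = 0;  // pending bits carried over from earlier input
    int bits_ = 0;           // number of valid pending bits (0, 2, 4 or 6)
};

// Hypothetical driver: splits `encoded` into pieces of `chunk_size`
// characters, the way a file reader asking for BUFSZ bytes would deliver them.
std::string decode_in_chunks(const std::string& encoded, std::size_t chunk_size) {
    assert(chunk_size > 0);
    Base64Decoder decoder;
    std::vector<std::uint8_t> bytes;
    for (std::size_t pos = 0; pos < encoded.size(); pos += chunk_size) {
        decoder.feed(encoded.substr(pos, chunk_size), bytes);
    }
    return std::string(bytes.begin(), bytes.end());
}
```

Calling decode_in_chunks(text, 5) and decode_in_chunks(text, 4096) should
give the same bytes; the read size becomes a tuning knob rather than
something the decoder dictates. A JPEG stage would need the same kind of
carried-over state, just with scanlines instead of 4-character groups.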

Sebastian