There has been much discussion in the last couple of years
    concerning the STL, and the abstraction of a sequence (materialized
    by its concept of iterator) that it so elegantly uses.  Strangely
    enough, however, another major abstraction in the proposed standard
    library doesn't seem to get much attention, that of an abstract data
    sink and/or source, as materialized in Jerry Schwarz's
    streambuf.  This may be partially due to the fact that
    most people only use the formatting interface, and somehow only
    associate the streambuf with IO.  In fact, IO is only a
    special case of the more general abstraction of a data sink or
    source.
  
    In this article, I will concentrate on one particular variation of
    this abstraction, which I call a filtering streambuf.  In a
    filtering streambuf, the streambuf in question is not
    the ultimate sink or source, but simply processes the data and
    passes it on.  In this way, it acts somewhat like a pipe in UNIX,
    but within the process.  Anyone who has worked with UNIX is familiar
    with just how powerful this piping idiom can be.
  
Many different filtering streambufs are possible: on input, we can filter out comments, or merge continuation lines; on output, we can expand tabs, or insert timestamps. Ideally, we would like to write as little code as possible, which suggests some sort of generic class to handle the boilerplate, with only the actual filtering to be written for each use.

    One small note: the iostream library has evolved some in the
    standards committee, and different compilers will have different
    versions.  For convenience, I've based the following code on the
    original release which accompanied CFront.  This is partially
    because this corresponds to the version I use most often, and
    partially because it is more or less a lowest common denominator;
    implementations supporting the more modern features will generally
    also support the older ones.  I've also ignored namespaces, largely
    for the same reasons. You may find that you need some adaptation, to
    make my code work with your compiler.  (All of the code has been
    tested with Sun CC 4.1, g++ 2.7.2 and Microsoft Visual C++ 5.0; with
    g++ and the Microsoft compiler, the old iostream library,
    <iostream.h>, has been used.)
  
In this article, I will also try to explain more of the general principles of writing a streambuf. Judging from questions on the network, this is a little-known area of C++, so I cannot assume knowledge of even the basic principles.

    To begin with, the abstraction behind the streambuf is that of a
    data sink and source.  Istream and ostream take care of the
    formatting, and use a streambuf as the source or sink of individual
    characters.  The class derived from streambuf takes
    care of any buffering, and of getting the characters to or from
    their final source or destination.  Buffering within the streambuf
    is done in what are known as the get and the put areas; these will
    be explained as necessary in the implementation.
  
    An istream or ostream has a number of
    functions with which to interface with the streambuf.  For the most
    part, these functions work on the get or the put area (the buffers).
    If there is no room in the put area to output a character, the
    virtual function overflow is called.  If there are no
    characters in the get area when a read is attempted, the virtual
    function underflow is called.  The key to writing a
    streambuf is in overriding these two functions.  It is also
    generally necessary to override sync, and sometimes
    setbuf, and of course, if we wish to support random
    access, we also have to override seekpos and
    seekoff.  (We will not consider random access here.
    Since the default behavior of these two functions is to return an
    error, we can ignore them.)
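
    To make the role of overflow concrete, here is a minimal sketch.
    It assumes the standard library's std::streambuf rather than the
    classic <iostream.h> used elsewhere in this article, and the class
    RecordingStreambuf and the helper recordThroughOstream are
    hypothetical names invented for the illustration.  Since no put
    area is ever defined, every character written through the ostream
    ends up in overflow.

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical example class: an unbuffered sink that records every
// character it is given.  setp is never called, so there is no put
// area, and each character written arrives through overflow.
class RecordingStreambuf : public std::streambuf
{
public:
    const std::string&  recorded() const { return myRecorded ; }
protected:
    virtual int_type    overflow( int_type ch )
    {
        if ( ch == traits_type::eof() )
            return traits_type::not_eof( ch ) ;  // nothing to flush
        myRecorded += traits_type::to_char_type( ch ) ;
        return ch ;                              // anything but EOF: success
    }
private:
    std::string         myRecorded ;
} ;

// Usage helper: formats text through an ostream attached to the sink,
// and returns what the sink has seen.
std::string recordThroughOstream( const std::string& text )
{
    RecordingStreambuf  sink ;
    std::ostream        out( &sink ) ;
    out << text ;
    return sink.recorded() ;
}
```

    The ostream knows nothing about RecordingStreambuf; it only sees
    the streambuf interface, which is what makes the abstraction
    useful for things other than IO.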
  
If the streambuf is to handle input and output simultaneously, we will also have to think about synchronization issues. Filtering streambufs, however, are almost always unidirectional, and we will only consider unidirectional buffers in this article.

    For various reasons, output is slightly simpler than input, so we
    will start with it.  The template class will be called
    FilteringOutputStreambuf; it will be instantiated over
    a function-like object that can be called with a reference to a
    streambuf and a character to be output, and will return an int:
    EOF if the output failed, and any other value (typically
    the character written) if it succeeded.
  
First, the class definition. I'll describe the details as they are used in the functions later:
    template< class Inserter >
    class FilteringOutputStreambuf : public streambuf
    {
    public:
                            FilteringOutputStreambuf(
                                streambuf*     dest ,
                                Inserter       i ,
                                bool           deleteWhenFinished 
                                                    = false ) ;    
                            FilteringOutputStreambuf(
                                streambuf*     dest ,
                                bool           deleteWhenFinished 
                                                    = false ) ;
        virtual             ~FilteringOutputStreambuf() ;
        virtual int         overflow( int ch ) ;
        virtual int         underflow() ;
        virtual int         sync() ;
        virtual streambuf*  setbuf( char* p , int len ) ;
        inline Inserter&    inserter() ;
    private:
        streambuf*          myDest ;
        Inserter            myInserter ;
        bool                myDeleteWhenFinished ;
    } ;
  
    I'll begin with the easiest functions.  Although the recently
    adopted draft standard defines the default implementation of
    underflow to fail, this behavior was undefined in
    earlier implementations, so it is generally a good idea to override
    this function to simply return EOF -- any attempt
    to read from our output filtering streambuf should result in
    failure.
  
    We generally want to be able to look at every character, so any
    buffering is out.  To avoid buffering in the output streambuf, it is
    sufficient to never define a buffer.  As for the function
    setbuf, I generally just pass it directly on to the
    final streambuf; this is the class which should be doing the actual
    buffering anyway.  (Of course, since myDest is a pointer, I check
    for NULL before doing so.)
  
    Sync is also surprisingly simple.  Since we don't have
    any buffering, there is nothing to synchronize, so we just pass this
    one on to the final destination as well.
  
    The function inserter is just for convenience --
    I've never actually found a use for it on output, but its equivalent
    on input is sometimes useful, and I like to keep things orthogonal.
    It just returns a reference to the myInserter member.
  
    Which leaves overflow:
  
    template< class Inserter >
    int
    FilteringOutputStreambuf< Inserter >::overflow( int ch )
    {
        int                 result( EOF ) ;
        if ( ch == EOF )
            result = sync() ;
        else if ( myDest != NULL )
        {
            assert( ch >= 0 && ch <= UCHAR_MAX ) ;
            result = myInserter( *myDest , ch ) ;
        }
        return result ;
    }
  
    Although it wasn't ever specified, earlier implementations of
    iostream would flush the buffer if overflow was called with
    EOF, and some applications may count on this, so we
    want to do likewise.  Other than that, we ensure that myDest isn't
    NULL before calling myInserter, and
    that's it.
  
    Finally, there are the constructors and the destructor.  In fact,
    there isn't much to say about them either; they simply initialize
    the obvious member variables.  The only particularity is the Boolean
    flag to transfer ownership of the targeted streambuf: a convenience
    for the user, which also simplifies exception safety.  In
    the destructor, I generally call sync, but it isn't
    necessary.  And of course, if the user asked for it, I delete the
    final destination streambuf.
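
    Putting the pieces described so far together gives something like
    the following sketch.  It is an adaptation rather than the code
    that accompanies this article: it assumes std::streambuf from the
    standard library, folds the two constructors into one with a
    default argument, and uses a hypothetical UpcaseInserter and helper
    upcaseThroughFilter purely to have something to run.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

template< class Inserter >
class FilteringOutputStreambuf : public std::streambuf
{
public:
    FilteringOutputStreambuf( std::streambuf* dest ,
                              Inserter        i = Inserter() ,
                              bool            deleteWhenFinished = false )
        :   myDest( dest )
        ,   myInserter( i )
        ,   myDeleteWhenFinished( deleteWhenFinished )
    {
    }
    virtual ~FilteringOutputStreambuf()
    {
        sync() ;                        // not required, but harmless
        if ( myDeleteWhenFinished )
            delete myDest ;
    }
    Inserter&           inserter() { return myInserter ; }
protected:
    virtual int_type    overflow( int_type ch )
    {
        int_type            result = traits_type::eof() ;
        if ( ch == traits_type::eof() )         // historical "flush" request
            result = (sync() == 0 ? traits_type::not_eof( ch ) : ch) ;
        else if ( myDest != NULL )
            result = myInserter( *myDest , ch ) ;
        return result ;
    }
    virtual int_type    underflow() { return traits_type::eof() ; }
    virtual int         sync()
    {
        return myDest != NULL ? myDest->pubsync() : 0 ;
    }
    virtual std::streambuf* setbuf( char* p , std::streamsize len )
    {
        if ( myDest == NULL )
            return NULL ;
        return myDest->pubsetbuf( p , len ) ;   // buffering is the target's job
    }
private:
    std::streambuf*     myDest ;
    Inserter            myInserter ;
    bool                myDeleteWhenFinished ;
} ;

// Hypothetical inserter, for illustration only: upper-cases each character.
struct UpcaseInserter
{
    int operator()( std::streambuf& dst , int ch )
    {
        return dst.sputc( static_cast< char >( std::toupper( ch ) ) ) ;
    }
} ;

// Usage helper: writes text through the filter into a stringbuf.
std::string upcaseThroughFilter( const std::string& text )
{
    std::stringbuf      dest ;
    FilteringOutputStreambuf< UpcaseInserter >
                        filter( &dest ) ;
    std::ostream        out( &filter ) ;
    out << text << std::flush ;
    return dest.str() ;
}
```

    Note that no put area is ever set up, so overflow sees every
    character, exactly as described above.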
  
    A simple use of this class might be to systematically insert a time
    stamp at the start of every line.  (The operator()
    function is simpler than it looks; the only real work is in getting
    and formatting the time.)
  
    class TimeStampInserter
    {
    public:
                            TimeStampInserter()
                                :   myAtStartOfLine( true )
        {
        }
        int                 operator()( streambuf& dst , int ch )
        {
            bool                errorSeen( false ) ;
            if ( myAtStartOfLine && ch != '\n' )
            {
                time_t              t( time( NULL ) ) ;
                tm*                 time = localtime( &t ) ;
                char                buffer[ 128 ] ;
                int                 length(
                    strftime( buffer ,
                              sizeof( buffer ) ,
                              "%c: " ,
                              time ) ) ;
                assert( length > 0 ) ;
                if ( dst.sputn( buffer , length ) != length )
                    errorSeen = true ;
            }
            myAtStartOfLine = (ch == '\n') ;
            return errorSeen
                ?   EOF 
                :   dst.sputc( ch ) ;
        }
    private:
        bool                 myAtStartOfLine ;
    } ;

    (In case you're wondering: I don't normally write functions in the class definition like this, but it seems easier for exposition to put everything in one place.)

    In this case, there really isn't any reason to ever pass an
    Inserter argument to the FilteringOutputStreambuf
    constructor, since all newly constructed instances of the class are
    interchangeable.  If
    the filter didn't require state, you could write it as a function,
    and pass a pointer to the function as argument to the constructor of
    a
    FilteringOutputStreambuf< int (*)( streambuf& , int ) >.
    (I'd seriously consider using a typedef in
    this case.  The above not only confuses human readers; it has
    confused more than one compiler I've tried it with as well.)
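
    As an illustration of the function pointer variant, here is a
    sketch using such a typedef.  Everything in it is an assumption
    made for the example: the condensed FilteringOutputStreambuf is a
    stand-in written against std::streambuf (only overflow is
    provided), and InserterFnc, crlfInserter and toCrLf are
    hypothetical names.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Condensed stand-in for the filtering output streambuf; it assumes a
// non-null destination and provides only what this example needs.
template< class Inserter >
class FilteringOutputStreambuf : public std::streambuf
{
public:
    FilteringOutputStreambuf( std::streambuf* dest , Inserter i )
        :   myDest( dest )
        ,   myInserter( i )
    {
    }
protected:
    virtual int_type    overflow( int_type ch )
    {
        if ( ch == traits_type::eof() )
            return myDest->pubsync() == 0 ? traits_type::not_eof( ch ) : ch ;
        return myInserter( *myDest , ch ) ;
    }
private:
    std::streambuf*     myDest ;
    Inserter            myInserter ;
} ;

// The typedef keeps the instantiation readable.
typedef int (*InserterFnc)( std::streambuf& , int ) ;

// Hypothetical stateless filter: writes "\r\n" for every '\n'.
int crlfInserter( std::streambuf& dst , int ch )
{
    const int           eof = std::streambuf::traits_type::eof() ;
    if ( ch == '\n' && dst.sputc( '\r' ) == eof )
        return eof ;
    return dst.sputc( static_cast< char >( ch ) ) ;
}

// Usage helper: the function pointer is passed to the constructor.
std::string toCrLf( const std::string& text )
{
    std::stringbuf      dest ;
    FilteringOutputStreambuf< InserterFnc >
                        filter( &dest , &crlfInserter ) ;
    std::ostream        out( &filter ) ;
    out << text ;
    return dest.str() ;
}
```

    With a function pointer, the constructor argument is no longer
    optional: a default constructed pointer would not point anywhere
    useful.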
  
    Now we'll do the same thing for input; we'll call it
    FilteringInputStreambuf.  This class is slightly more
    complicated than the output one, because of the interface definition
    of underflow: underflow does not extract a
    character from the input stream, it simply ensures that there is a
    character in the buffer.  Which in turn means that we cannot ignore
    the issue of buffering completely.  Anyway, here's the class
    definition; the instantiation type must be callable with a reference
    to a streambuf, and return an int (either the character
    read or EOF):
  
    template< class Extractor >
    class FilteringInputStreambuf : public streambuf
    {
    public:
                            FilteringInputStreambuf(
                                streambuf*          source ,
                                Extractor           x ,
                                bool                deleteWhenFinished 
                                                        = false ) ;
                            FilteringInputStreambuf(
                                streambuf*          source ,
                                bool                deleteWhenFinished 
                                                        = false ) ;
        virtual             ~FilteringInputStreambuf() ;
        virtual int         overflow( int ) ;
        virtual int         underflow() ;
        virtual int         sync() ;
        virtual streambuf*  setbuf( char* p , int len ) ;
        inline Extractor&   extractor() ;
    private:
        streambuf*          mySource ;
        Extractor           myExtractor ;
        char                myBuffer ;
        bool                myDeleteWhenFinished ;
    } ;
  
    As with output, we'll do the easy parts first: overflow
    is just an error (return EOF), and setbuf
    is forwarded.  We could argue that sync should be
    either an error or forwarded, since it isn't supposed to do anything
    on an input stream.  In fact, in our case, synchronization does have
    a meaning, since any characters in our local buffer have been
    extracted from the real input streambuf, but have not been read.
    The function extractor just returns a reference to the corresponding
    data member.  Unlike the output side, this has a definite use: some
    of the filters may remove newline characters; in such cases, the
    extractor should maintain the correct line number from the source,
    and the user should access the extractor to obtain it for e.g. error
    messages.
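
    A sketch of what such an extractor might look like follows.  The
    class ContinuationExtractor, its lineNumber function and the
    helper drain are hypothetical, and the code assumes std::streambuf
    from the standard library.  Since an extractor is just a function
    object taking a streambuf, it can be exercised directly on a
    stringbuf, without any filtering streambuf around it.

```cpp
#include <cassert>
#include <cstdio>       // EOF
#include <sstream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical extractor: merges continuation lines (a backslash
// immediately followed by a newline is dropped), while counting the
// newlines actually consumed from the source.
class ContinuationExtractor
{
public:
                        ContinuationExtractor()
                            :   myLineNumber( 1 )
    {
    }
    int                 operator()( std::streambuf& src )
    {
        int                 ch = src.sbumpc() ;
        while ( ch == '\\' && src.sgetc() == '\n' )
        {
            src.sbumpc() ;              // swallow the newline
            ++ myLineNumber ;           // but still count it
            ch = src.sbumpc() ;
        }
        if ( ch == '\n' )
            ++ myLineNumber ;
        return ch ;
    }
    int                 lineNumber() const { return myLineNumber ; }
private:
    int                 myLineNumber ;
} ;

// Usage helper: drains text through the extractor, and returns the
// filtered text together with the source line number reached.
std::pair< std::string , int > drain( const std::string& text )
{
    std::stringbuf          src( text ) ;
    ContinuationExtractor   x ;
    std::string             result ;
    for ( int ch = x( src ) ; ch != EOF ; ch = x( src ) )
        result += static_cast< char >( ch ) ;
    return std::make_pair( result , x.lineNumber() ) ;
}
```

    In the example, the filtered text contains fewer newlines than the
    source, so a reader counting lines downstream would report the
    wrong line; the count kept in the extractor stays correct.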
  
    Which brings us to the question of buffering: we need a buffer
    of at least one character in order to correctly support the
    semantics of underflow.  These semantics are in turn defined
    to support non-extracting look-ahead, for parsing things
    like numbers, where you cannot know that you have finished before
    having seen a character you don't want.  To keep things simple, we
    maintain a one character buffer directly in the class:
    myBuffer.
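
    The look-ahead in question can be seen from the client side in the
    following sketch, which assumes the standard std::streambuf
    interface; readDigits and digitsThenNext are hypothetical helpers.
    The function sgetc peeks at the next character (calling underflow
    if the get area is empty), whereas sbumpc extracts it.

```cpp
#include <cassert>
#include <sstream>
#include <streambuf>
#include <string>

// Reads leading decimal digits from src, leaving the first non-digit
// in place for whoever reads next.
std::string readDigits( std::streambuf& src )
{
    std::string         digits ;
    // sgetc only looks; the character is consumed by sbumpc once we
    // know that we want it.
    while ( src.sgetc() >= '0' && src.sgetc() <= '9' )
        digits += static_cast< char >( src.sbumpc() ) ;
    return digits ;
}

// Usage helper: returns the digits, a '|' separator, and the character
// that readDigits left behind.  Assumes something follows the digits.
std::string digitsThenNext( const std::string& text )
{
    std::stringbuf      src( text ) ;
    std::string         result = readDigits( src ) ;
    result += '|' ;
    result += static_cast< char >( src.sbumpc() ) ;
    return result ;
}
```

    If underflow extracted the character, the '+' in an input like
    "123+4" would be lost to the next reader.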
  
    Which gives us enough information to write underflow:
  
    template< class Extractor >
    int
    FilteringInputStreambuf< Extractor >::underflow()
    {
        int                 result( EOF ) ;
        if ( gptr() < egptr() )
            result = static_cast< unsigned char >( *gptr() ) ;  // never negative, so never EOF
        else if ( mySource != NULL )
        {
            result = myExtractor( *mySource ) ;
            if ( result != EOF )
            {
                assert( result >= 0 && result <= UCHAR_MAX ) ;
                myBuffer = result ;
                setg( &myBuffer , &myBuffer , &myBuffer + 1 ) ;
            }
        }
        return result ;
    }
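
    Several points are worth mentioning.  If there is already a
    character in the buffer, underflow just returns it, without
    extracting it.  Otherwise, we use setg (a member function of
    streambuf) to set the pointers into the buffer, so that they
    designate the one character we have just read.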
    
    I generally define sync to resynchronize with the
    actual source:
  
    template< class Extractor >
    int
    FilteringInputStreambuf< Extractor >::sync()
    {
        int                 result( 0 ) ;
        if ( mySource != NULL )
        {
            if ( gptr() < egptr() )
            {
                result = mySource->sputbackc( *gptr() ) ;
                setg( NULL , NULL , NULL ) ;
            }
            if ( mySource->sync() == EOF )
                result = EOF ;
        }
        return result ;
    }

    If we have a character in our buffer, we send it back, and clear the buffer. Then we sync with the original source -- it may be a FilteringInputStreambuf as well.

    The constructors are just simple initialization; the destructor adds
    a call to sync, and deletion of the source if requested.
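
    For readers using a current compiler, the whole input class might
    look something like the sketch below.  It is an adaptation, not
    the code distributed with this article: it assumes std::streambuf,
    folds the two constructors into one, and omits setbuf.  The helper
    stripComments is hypothetical; the extractor is the
    UncommentExtractor discussed in this article, repeated so that the
    example is complete.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>       // EOF
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

template< class Extractor >
class FilteringInputStreambuf : public std::streambuf
{
public:
    FilteringInputStreambuf( std::streambuf* source ,
                             Extractor       x = Extractor() ,
                             bool            deleteWhenFinished = false )
        :   mySource( source )
        ,   myExtractor( x )
        ,   myDeleteWhenFinished( deleteWhenFinished )
    {
    }
    virtual ~FilteringInputStreambuf()
    {
        sync() ;
        if ( myDeleteWhenFinished )
            delete mySource ;
    }
    Extractor&          extractor() { return myExtractor ; }
protected:
    virtual int_type    overflow( int_type ) { return traits_type::eof() ; }
    virtual int_type    underflow()
    {
        int_type            result = traits_type::eof() ;
        if ( gptr() < egptr() )
            result = traits_type::to_int_type( *gptr() ) ;
        else if ( mySource != NULL )
        {
            result = myExtractor( *mySource ) ;
            if ( result != traits_type::eof() )
            {
                myBuffer = traits_type::to_char_type( result ) ;
                setg( &myBuffer , &myBuffer , &myBuffer + 1 ) ;
            }
        }
        return result ;
    }
    virtual int         sync()
    {
        int                 result = 0 ;
        if ( mySource != NULL )
        {
            if ( gptr() < egptr() )     // give back the unread character
            {
                if ( mySource->sputbackc( *gptr() ) == traits_type::eof() )
                    result = -1 ;
                setg( NULL , NULL , NULL ) ;
            }
            if ( mySource->pubsync() == -1 )
                result = -1 ;
        }
        return result ;
    }
private:
    std::streambuf*     mySource ;
    Extractor           myExtractor ;
    char                myBuffer ;
    bool                myDeleteWhenFinished ;
} ;

// Strips end-of-line comments, keeping the newline itself.
class UncommentExtractor
{
public:
    UncommentExtractor( char commentChar = '#' )
        :   myCommentChar( commentChar )
    {
    }
    int operator()( std::streambuf& src )
    {
        int                 ch = src.sbumpc() ;
        if ( ch == myCommentChar )
            while ( ch != EOF && ch != '\n' )
                ch = src.sbumpc() ;
        return ch ;
    }
private:
    char                myCommentChar ;
} ;

// Usage helper: reads everything that comes through the filter.
std::string stripComments( const std::string& text )
{
    std::stringbuf      source( text ) ;
    FilteringInputStreambuf< UncommentExtractor >
                        filter( &source ) ;
    std::istream        in( &filter ) ;
    std::string         result ;
    char                ch ;
    while ( in.get( ch ) )
        result += ch ;
    return result ;
}
```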
  
I tend to use this class much more often than the output one. One simple example is stripping end-of-line comments:

    class UncommentExtractor
    {
    public:
                            UncommentExtractor( char commentChar = '#' )
                                :   myCommentChar( commentChar )
        {
        }
        int                 operator()( streambuf& src )
        {
           int                 ch( src.sbumpc() ) ;
           if ( ch == myCommentChar )
           {
              while ( ch != EOF && ch != '\n' )
                 ch = src.sbumpc() ;
           }
           return ch ;
        }
    private:
        char                myCommentChar ;
    } ;
  
    With the above, we can already do everything necessary.  Still, it
    is often extra work to have to declare the streambuf and the istream
    or ostream separately.  So it is convenient to also define the
    corresponding template classes for istream and ostream.  Here's the
    class definition for FilteringIstream;
    FilteringOstream follows the same pattern:
  
    template< class Extractor >
    class FilteringIstream 
        :   private FilteringInputStreambuf< Extractor >
        ,   public istream
    {
    public:
                            FilteringIstream( istream& source ,
                                              Extractor x ) ;
                            FilteringIstream( istream& source ) ;
                            FilteringIstream( 
                                streambuf*          source ,
                                Extractor           x ,
                                bool                deleteWhenFinished
                                                        = false ) ;
                            FilteringIstream( 
                                streambuf*          source ,
                                bool                deleteWhenFinished
                                                        = false ) ;
        virtual             ~FilteringIstream() ;
        FilteringInputStreambuf< Extractor >*
                            rdbuf() ;
    } ;
  
    The somewhat unusual inheritance is a trick I learned from Dietmar
    Kühl; it serves to ensure that the streambuf is fully initialized
    before its address is passed to the constructor of istream, without
    having to allocate it dynamically.  It's also worth
    noting the constructors taking an istream&, instead
    of a streambuf*; again, just a convenience, but it
    means that you can pass cin directly as an argument,
    rather than having to use cin.rdbuf().  (The call to
    rdbuf is still there, of course, in the initialization
    list of the constructor.)  And if another istream is using the
    streambuf, you certainly don't want to delete it, so we drop that
    parameter completely.
  
    With all this, if you want to read standard input, ignoring end-of-line
    comments, all you need is the UncommentExtractor shown
    above, and the following definition:
  
    FilteringIstream< UncommentExtractor >
                        input( cin ) ;
  That's all there is to it.
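
    For completeness, here is a sketch of how the inheritance trick
    plays out with a current compiler.  It assumes the standard
    std::istream and std::streambuf; the streambuf is a condensed
    stand-in (underflow only, non-null source assumed), only the
    istream& constructor is shown, and the helper readAllFiltered is
    hypothetical.

```cpp
#include <cassert>
#include <cstdio>       // EOF
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Condensed stand-in for the filtering input streambuf.
template< class Extractor >
class FilteringInputStreambuf : public std::streambuf
{
public:
    FilteringInputStreambuf( std::streambuf* source ,
                             Extractor       x = Extractor() )
        :   mySource( source )
        ,   myExtractor( x )
    {
    }
protected:
    virtual int_type    underflow()
    {
        if ( gptr() < egptr() )
            return traits_type::to_int_type( *gptr() ) ;
        int_type            result = myExtractor( *mySource ) ;
        if ( result != traits_type::eof() )
        {
            myBuffer = traits_type::to_char_type( result ) ;
            setg( &myBuffer , &myBuffer , &myBuffer + 1 ) ;
        }
        return result ;
    }
private:
    std::streambuf*     mySource ;
    Extractor           myExtractor ;
    char                myBuffer ;
} ;

// The streambuf base is listed first, so it is constructed before its
// address (this, converted to the private base) is given to istream.
template< class Extractor >
class FilteringIstream
    :   private FilteringInputStreambuf< Extractor >
    ,   public std::istream
{
public:
    explicit            FilteringIstream( std::istream& source ,
                                          Extractor     x = Extractor() )
        :   FilteringInputStreambuf< Extractor >( source.rdbuf() , x )
        ,   std::istream( this )
    {
    }
} ;

// Strips end-of-line comments, keeping the newline itself.
class UncommentExtractor
{
public:
    UncommentExtractor( char commentChar = '#' )
        :   myCommentChar( commentChar )
    {
    }
    int operator()( std::streambuf& src )
    {
        int                 ch = src.sbumpc() ;
        if ( ch == myCommentChar )
            while ( ch != EOF && ch != '\n' )
                ch = src.sbumpc() ;
        return ch ;
    }
private:
    char                myCommentChar ;
} ;

// Usage helper: an istringstream stands in for cin.
std::string readAllFiltered( const std::string& text )
{
    std::istringstream  source( text ) ;
    FilteringIstream< UncommentExtractor >
                        input( source ) ;
    std::string         result ;
    char                ch ;
    while ( input.get( ch ) )
        result += ch ;
    return result ;
}
```

    One caveat with this arrangement: names that exist in both
    streambuf and istream (sync, for example) become ambiguous when
    used on the derived class, and must be qualified.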
As you have seen, creating your own streambufs can be a powerful idiom. And we've only scratched the surface of the possibilities. The complete code for all of the classes discussed in this article, along with a number of additional inserters and extractors, can be downloaded from this site, so you can try it yourself.
It would be unfair if I tried to take credit for the entire concept. First and foremost, if Jerry Schwarz hadn't come up with the original idea of separating the sinking and sourcing of the data from the formatting, none of this would have been possible. And most of what I know about iostream, I learned from contributors in the C++ newsgroups, particularly Steve Clamage, who has always taken the time to answer most of the serious questions posed there. More recently, people like Dietmar Kühl have been pursuing similar paths of research.
Finally, I owe particular thanks to the customer site at which I first applied this technique, the LTS division of Alcatel SEL, in Stuttgart, and to my boss there, Ömer Oskay. The freedom they gave me to pursue new ways of doing things was amazing, and while I like to think that it always paid off for them in the end, it certainly wasn't always obvious beforehand that it would. Without their confidence in me, most of this work would not have been possible.