Subject: Re: [Boost-users] [asio] Problem with async_read/async_read_some not reacting to incoming data
From: Bill Somerville (bill_at_[hidden])
Date: 2009-03-19 11:52:58
Stig Sandø wrote:
> Hi,
>
> We're rewriting a client with boost asio but have run into some
> problems when stress-testing the client.  The client fetches
> text and graphics from a server over one connection that is open at
> all times.  When the client is receiving large amounts of graphics data
> it will, after a while, suddenly stop receiving data, and eventually it
> will hit our timeouts; the last call issued is an
> async_read/async_read_some.  We are keeping the server well-fed with
> requests so there should be graphics forthcoming without pause.  The
> problem has been seen on win32, linux and darwin when testing on a
> gigabit net fetching raw 1080i graphics (4M for each field), and is
> most frequent on darwin.  This is naturally an absolute show-stopper
> for us.
>
> So we are at a bit of a loss as to what is going wrong and why
> async_read/async_read_some stops reacting in the middle of the
> fetch-queue, despite wireshark showing that the data is incoming.
> When using compression on the data the problem is harder to reproduce,
> which might suggest a race condition somewhere.  But our code is just
> using a single thread for io_service, and all async communication is
> triggered from this io-thread, which has a work object to keep the
> io_service spinning.  We're also making sure there is at most one
> async_read and one async_write in effect at a time, roughly similar to
> the chat_client sample.
I would be suspicious of the 'incoming_request' queue. Where is that
data being popped from the queue? If it is not popped from the context
of the io_service thread, then access to it is not thread safe.
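To illustrate the concern about the 'incoming_request' queue: if it really is touched from a second thread, one option is to have that thread hand the work back to the io thread with io_service::post, so the queue is only ever accessed there. The other is to guard the queue itself. Below is a minimal sketch of the second option, using only the standard library. GuardedQueue and its member names are hypothetical (the real queue type was in the snipped code), and std::mutex assumes a C++11 compiler; boost::mutex would play the same role otherwise.

```cpp
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the 'incoming_request' queue; the real type
// is not shown in the post. Every access takes the same mutex, so pushes
// from the io_service thread and pops from another thread cannot race.
class GuardedQueue {
public:
    // Append an item; safe to call from any thread.
    void push(std::string item) {
        std::lock_guard<std::mutex> lock(mutex_);
        items_.push_back(std::move(item));
    }

    // Move the oldest item into 'out' and return true,
    // or return false (leaving 'out' untouched) if the queue is empty.
    bool try_pop(std::string& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) {
            return false;
        }
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    // Number of queued items at the moment of the call.
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_.size();
    }

private:
    mutable std::mutex mutex_;
    std::deque<std::string> items_;
};
```

If everything, including the pops, already runs on the single io_service thread, no lock is needed and the queue is not the culprit.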
>
> Has anyone seen something similar, or does anyone have any input on how
> best to figure out what goes wrong?  Are there invariants that say you
> cannot read and write at the same time?
> Some symptoms are the same in each test.  When we get the last image
> from the socket the buffer size is zero afterwards, and the next
> async_read request is to transfer_at_least(1).  The async_read never
> calls the handler for completion of this byte, so Nagle would have
> kicked in.  It is also fairly hard to strip down to a small example
> using a mock server.
>
>
> I have included some stripped-down code below in case that might be
> helpful in spotting something that we can't see.
>
> Cheers,
> Stig
[snip ...]
HTH
-- Bill Somerville