From: Justin McManus (justin_at_[hidden])
Date: 2020-09-05 18:23:36
I have some code that works as intended, but it requires setting a
buffer_size parameter to zero on a std::ifstream pushed onto a filtering
chain, and I'd like to understand why, to ensure I'm not introducing a bug
or a hack.
I have essentially the following code:
--------------------------------------------------------------------------------------------------------
std::ifstream m_jf("json_filename", std::ios_base::in | std::ios_base::binary);
std::locale utf8_locale("en_US.UTF-8");
m_jf.imbue(utf8_locale);
boost::iostreams::filtering_istream m_inbuf;
m_inbuf.push(boost::iostreams::bzip2_decompressor());
m_inbuf.push(m_jf);
std::string m_line;
while (std::getline(m_inbuf, m_line)) {
    // Process the current line from the JSON file
}
--------------------------------------------------------------------------------------------------------
What I find is that the std::getline call will fail before the code has
reached the EOF. It will always fail at the same line in a given JSON file,
but it will fail on different lines in different JSON files. It's perfectly
reproducible.
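One way to narrow down a failure like this is to check which state bits are set on the stream once the loop exits, since a genuine end of input and an error inside the chain both make std::getline return false. The following is a minimal sketch using only the standard library; ReadReport and read_all_lines are hypothetical names introduced here for illustration and are not part of Boost or of the code above. It assumes, as I understand the standard behaviour, that an exception thrown inside the stream buffer is caught by std::getline and surfaces as badbit.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Boost or the original code): summarizes
// why a std::getline loop stopped.
struct ReadReport {
    std::size_t lines = 0;  // number of lines successfully read
    bool hit_eof = false;   // eofbit: the stream reached end of input
    bool hit_bad = false;   // badbit: the underlying buffer reported an error
};

// Reads every line from `in` and records the stream state afterwards.
// Works on any std::istream, so a filtering_istream could be passed as well.
inline ReadReport read_all_lines(std::istream& in) {
    ReadReport report;
    std::string line;
    while (std::getline(in, line)) {
        ++report.lines;
    }
    report.hit_eof = in.eof();
    report.hit_bad = in.bad();
    return report;
}
```

If hit_eof is false or hit_bad is true after the loop, the read stopped because of an error rather than because the data ran out. Calling in.exceptions(std::ios_base::badbit) before reading should make the stream rethrow the original exception instead, which may show what the chain is actually complaining about.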
However, if I change the two push calls to
m_inbuf.push(boost::iostreams::bzip2_decompressor(), 0);
m_inbuf.push(m_jf, 0);
then the problem goes away.
My question is: why does setting the buffer_size parameter to zero solve
the issue? What does this do, exactly? I found the suggestion to set the
buffer size this way in an old post from 2009, and it appears to work, but
I'd like a deeper understanding of what's happening under the hood. If the
buffer size is set to zero, what does the underlying implementation do, and
how might this influence whether std::getline fails before the EOF?
Thanks very much,
Justin
--
Justin McManus, Ph.D.
Principal Scientist
Lead Computational Biologist and Statistical Geneticist
Kallyope, Inc.
430 East 29th Street, Suite 1050
New York, NY 10016