Subject: Re: [boost] NuDB: A fast key/value insert-only database for SSD drives in C++11
From: Olaf van der Spek (ml_at_[hidden])
Date: 2017-03-22 10:50:55


On Wed, Mar 22, 2017 at 11:43 AM, Niall Douglas via Boost
<boost_at_[hidden]> wrote:
> Plucking an example at random, as it was too long ago that I examined your
> source: a classic mistake is to assume this is sequentially consistent:
>
> int keyfd, storefd;
> write(storefd, data)
> fsync(storefd)
> write(keyfd, key)
> fsync(keyfd)
>
> Here the programmer writes the value being stored, persists it, then
> writes the key pointing to the newly stored value, and persists that. Most
> programmers unfamiliar with filing systems will assume that the fsync to
> the storefd cannot happen after the fsync to the keyfd. They are wrong:
> that is a permitted reorder. fsyncs are only guaranteed to be sequentially
> consistent *on the same file descriptor*, not across different file
> descriptors.

Just curious, how is that permitted?

Isn't fsync() supposed to ensure data is on durable storage before it returns?
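
To make sure we are talking about the same thing, here is the quoted sequence written out as a rough C sketch. The helper names (write_all, store_then_index) are made up for illustration and are not taken from NuDB. Each fsync() return value is checked before the next step is issued. The sketch only spells out the call order; it does not settle whether the reorder described above is permitted.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes, retrying on short
 * writes and EINTR. Returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper mirroring the quoted sequence:
 * write value, fsync store, write key, fsync key file.
 * The key is only written after fsync(storefd) has returned 0.
 * Returns 0 on success, -1 on the first failing step. */
static int store_then_index(int storefd, int keyfd,
                            const void *data, size_t data_len,
                            const void *key, size_t key_len)
{
    if (write_all(storefd, data, data_len) != 0)
        return -1;
    if (fsync(storefd) != 0)
        return -1;
    if (write_all(keyfd, key, key_len) != 0)
        return -1;
    if (fsync(keyfd) != 0)
        return -1;
    return 0;
}
```

A caller would open the two files with open() and pass the descriptors in, treating a -1 return as "nothing can be assumed about either file".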