Subject: Re: [boost] Any interest in hashing algorithms SHA and/or FNV1a?
From: Niall Douglas (s_sourceforge_at_[hidden])
Date: 2013-11-13 13:18:01
On 13 Nov 2013 at 12:48, Jeff Flinn wrote:
> > Be aware that AFIO (currently in the peer review queue) is slowly
> > gaining an asynchronous batch hash engine. It's a bit different from
> > your normal hash engine, because it can do things like use SIMD to
> > process four SHA256 streams in parallel, and then use AFIO's
> > closure engine to parallelise that 4-SHA256 processing across
> > multiple cores i.e. achieve sixteen parallel SHA256 processing
> > streams. This lets one drop the normal 14.9 cycles/byte down to
> > around 1.4 cycles/byte amortised for SHA256 [1], a big win. The batch
> > hash engine uses a compile-time plugin system, so it can be
> > arbitrarily extended with other hash implementations if suitably
> > rewritten to fit.
> 
> Sounds interesting. Is the hash engine customizable to composite 
> algorithms? Particularly MD5/SHA1?
The API currently works by instantiating a hash_engine template with
the hash you want, e.g. hash_engine<SHA256>. The resulting class
instance has two member functions of interest. One creates new hashes
and returns a vector of fresh handles. The other is a batch function
which enqueues extra memory ranges to a hash handle and returns a
vector of futures representing the completed processing of each
memory range; passing it a null pointer block enqueues termination of
the hash. Each hash handle provides a future which becomes ready when
the hash is terminated.
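To make the shape of that interface concrete, here is a minimal sketch in standard C++11. It is not AFIO's code: the member names create() and batch(), the handle layout, and the fnv1a64 stand-in hash policy are all assumptions made for illustration, and it uses std::async rather than SIMD or AFIO's closure engine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <memory>
#include <utility>
#include <vector>

// Stand-in hash policy (FNV-1a, 64-bit). Hypothetical: not an AFIO plugin.
struct fnv1a64 {
    using result_type = std::uint64_t;
    result_type state = 14695981039346656037ull;  // FNV offset basis
    void update(const void* data, std::size_t len) {
        const unsigned char* p = static_cast<const unsigned char*>(data);
        for (std::size_t i = 0; i < len; ++i) {
            state ^= p[i];
            state *= 1099511628211ull;  // FNV prime
        }
    }
    result_type finish() const { return state; }
};

// Hypothetical engine: all names below are assumptions for illustration.
template <class Hash>
class hash_engine {
public:
    using result_type = typename Hash::result_type;
    using block = std::pair<const void*, std::size_t>;  // {pointer, length}

    struct shared {
        Hash hash;
        std::promise<result_type> done;
    };
    struct handle {
        std::future<result_type> result;  // ready once the hash is terminated
        std::shared_ptr<shared> s;        // running hash state
        std::shared_future<void> prev;    // last block enqueued on this handle
    };

    // Create n fresh hash handles.
    std::vector<handle> create(std::size_t n) {
        std::vector<handle> out(n);
        for (handle& h : out) {
            h.s = std::make_shared<shared>();
            h.result = h.s->done.get_future();
        }
        return out;
    }

    // Enqueue memory ranges on one handle; a null pointer block terminates
    // the hash. The caller must keep the memory alive until the matching
    // future is ready, and must not enqueue anything after termination.
    std::vector<std::shared_future<void>> batch(handle& h,
                                                const std::vector<block>& blocks) {
        std::vector<std::shared_future<void>> out;
        for (const block& b : blocks) {
            std::shared_ptr<shared> s = h.s;
            std::shared_future<void> prev = h.prev;
            h.prev = std::async(std::launch::async, [s, prev, b] {
                if (prev.valid()) prev.wait();  // keep blocks in order
                if (b.first) s->hash.update(b.first, b.second);
                else s->done.set_value(s->hash.finish());
            }).share();
            out.push_back(h.prev);
        }
        return out;
    }
};
```

With this sketch you would call create() once per batch of inputs, pass each handle's memory ranges to batch(), finish with a {nullptr, 0} block, and then wait on handle.result. Chaining each task on the previous future is the simplest way to keep ranges in order here; a real engine would schedule the work very differently.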
There is absolutely nothing stopping anyone from creating a 
hash_engine<MD5> and a hash_engine<SHA1> and feeding incoming memory 
ranges to both engines. Enqueuing ranges is instantaneous, as hashing 
occurs asynchronously. I would assume the MD5 engine would likely 
complete before the SHA1 engine, but that's easily coped with thanks 
to the futures.
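As a rough illustration of the composite case, the sketch below feeds the same memory ranges to two independent asynchronous hashers and collects both results through futures. It is an assumption-laden toy rather than AFIO code: 32-bit and 64-bit FNV-1a stand in for MD5 and SHA1 to keep it short, and hash_with_both and composite_digest are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <utility>
#include <vector>

using block = std::pair<const void*, std::size_t>;  // {pointer, length}

// Generic FNV-1a over a list of memory ranges (stand-in for MD5/SHA1).
template <class UInt>
UInt fnv1a(const std::vector<block>& blocks, UInt basis, UInt prime) {
    UInt h = basis;
    for (const block& b : blocks) {
        const unsigned char* p = static_cast<const unsigned char*>(b.first);
        for (std::size_t i = 0; i < b.second; ++i) {
            h ^= p[i];
            h *= prime;
        }
    }
    return h;
}

// Hypothetical result type holding both digests.
struct composite_digest {
    std::uint32_t first;
    std::uint64_t second;
};

// Hash the same ranges with two algorithms concurrently.
composite_digest hash_with_both(const std::vector<block>& blocks) {
    std::future<std::uint32_t> fa = std::async(std::launch::async, [&blocks] {
        return fnv1a<std::uint32_t>(blocks, 2166136261u, 16777619u);
    });
    std::future<std::uint64_t> fb = std::async(std::launch::async, [&blocks] {
        return fnv1a<std::uint64_t>(blocks, 14695981039346656037ull,
                                    1099511628211ull);
    });
    // Whichever hasher finishes first simply waits in its future;
    // get() blocks only until that particular result is ready.
    composite_digest d;
    d.first = fa.get();
    d.second = fb.get();
    return d;
}
```

The caller never needs to know which hasher finished first, which is the point about futures made above.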
> I'm blocked from using AFIO (on Windows) for my file processing as I need 
> to go through the BackupRead/BackupWrite APIs, which are explicitly 
> incompatible with overlapped I/O. I haven't looked into whether AFIO 
> could offer benefits with that restriction.
AFIO *always* opens all handles with backup semantics, because AFIO 
understands symlinks on Windows and treats them (nearly) identically 
to POSIX. If you go direct to the NT kernel API, the kernel provides 
a lovely function, NtQueryInformationFile, which when used with the 
FileAllInformation class returns all possible metadata about a file 
in a FILE_ALL_INFORMATION structure, and another function, 
NtSetInformationFile, which can set all possible metadata about a 
file. All you need to do above that is to query and store extended 
attributes and the security object, both of which are again trivial 
to do when using the NT kernel API directly.
My point is that it is relatively easy to write your own BackupRead 
and BackupWrite functions, if you skip the Win32 layer and go 
straight to the kernel. BTW I would be more than happy to accept any 
patches adding such a feature to AFIO because BackupRead and 
BackupWrite are seriously broken according to Microsoft themselves :)
Niall
-- Currently unemployed and looking for work. Work Portfolio: http://careers.stackoverflow.com/nialldouglas/