From: Joaquin M López Muñoz (joaquinlopezmunoz_at_[hidden])
Date: 2025-06-16 08:57:56


El 13/06/2025 a las 14:47, Kostas Savvidis via Boost escribió:
>
>> On 12 Jun 2025, at 21:56, Joaquin M López Muñoz via Boost <boost_at_[hidden]> wrote:
>>
>> For branchful lookup, a cost model for mixed
>> lookup would be
>>
>> P*M*(A*K) + (1-P)*M*C
>>
>> where P is the percentange of successful lookups and M is an
>> extra cost factor on top of the non-mixed approach due to branch
>> mispredictions.
>
> I agree this effect will affect measurements to the point that you
> will never be able to be 100% sure.
> Lets just agree on the obvious consensus: an extra 64bit multiplication
> cannot affect the running time of any program on modern CPUs.

I finally implemented the change here:

https://github.com/joaquintides/bloom/commit/b28e526b9e2994d5470adb6ae5024811ff9f7e1d

There's a measureable effect, at least on microbenchmarks, compare results
without * (before) and with * (after):

https://github.com/boostorg/boost_bloom_benchmarks/blob/alternative-hash-production/README.md

But, well, it is what it is.

> The multiplier is from THE ART OF COMPUTER PROGRAMMING by D. Knuth, volume II, page 108.

I've settled for two multipliers (one for 64-bit and another for 32-bit
mode) from
Steele and Vigna:

https://arxiv.org/pdf/2001.05304

Joaquin M Lopez Munoz