Boost mailing page: Re: [bloom] Benchmarks with Knuth multiplier-based hash production

From: Joaquin M López Muñoz (joaquinlopezmunoz_at_[hidden])
Date: 2025-06-09 08:31:28

Next message: Ivan Matek: "Re: [bloom] Benchmarks with Knuth multiplier-based hash production"
Previous message: Ivan Matek: "Re: [bloom] Benchmarks with Knuth multiplier-based hash production"
In reply to: Ivan Matek: "Re: [bloom] Benchmarks with Knuth multiplier-based hash production"
Next in thread: Ivan Matek: "Re: [bloom] Benchmarks with Knuth multiplier-based hash production"
Reply: Ivan Matek: "Re: [bloom] Benchmarks with Knuth multiplier-based hash production"

On 08/06/2025 at 17:13, Ivan Matek wrote:
>
>
> On Sat, Jun 7, 2025 at 8:05 PM Joaquin M López Muñoz via Boost
> <boost_at_[hidden]> wrote:
>
>
> Anyway, why don't you run it locally and play with the #pragmas?
>
>
> Because when I quickly go to benchmark something 9 hours later I am
> just quickly benchmarking something :)
> Also, ensuring reproducibility is a pain, e.g. I do not have an unused
> machine that I can SSH into, to avoid my browser use or a random
> background process messing with the benchmark, especially considering
> bloom uses L3 cache a lot.

Hey, thanks so much for running the benchmarks! Yes, variance hurts
analysis. I'm planning to move my GHA-based benchmarks to dedicated
machines so that results are more stable.

> Besides, I'm interested in results outside my local machine and GHA.
> You just have to compile this in release mode (note the repo branch):
>
> https://github.com/joaquintides/bloom/blob/feature/alternative-hash-production/benchmark/comparison_table.cpp
>
>
> Well, it was more complicated since I already have modular Boost on my
> machine, so I had to do some hacks to get CMakeLists.txt to work. Also,
> the benchmark did not have a CMakeLists.txt, and I used
> march=native, mtune=native instead of what your scripts do...
>
> But to quickly recap:
>
> 1. There seems to be no unrolling happening without me doing it with
> pragmas.
> 2. I have increased constants to reduce chance of noise affecting
> results:
> -  static const int           num_trials=10;
> -  static const milliseconds  min_time_per_trial(10);
> +  static const int           num_trials=20;
> +  static const milliseconds  min_time_per_trial(50);
> 3. I did this to make tables more aligned:
> -    "<table>\n"
> +    "<table style=\"font-family: monospace\">\n"
> 4. In terms of benchmark setup I would add 5% of "opposite"
> lookups (e.g. successful lookups mixed into the unsuccessful ones),
> since I presume the current setup does not penalize branchy code as
> realistic scenarios would (although it is possible real code might
> also have close to 100% of successes or failures). Just to be clear:
> I did not make this change.
> 5. I would suggest considering switching the benchmark repo to use
> native instead of mavx2
>
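
To make item 4 concrete, here is a minimal sketch of how such a mixed lookup stream could be built. It is not part of comparison_table.cpp: the function name make_mixed_lookups, its parameters and the use of std::uint64_t keys are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not in the benchmark repo): builds a stream of n
// lookups where minority_percent % of the queries come from `minority`
// (e.g. keys absent from the filter) and the rest from `majority`
// (e.g. keys present in the filter).
std::vector<std::uint64_t> make_mixed_lookups(
  const std::vector<std::uint64_t>& majority,
  const std::vector<std::uint64_t>& minority,
  std::size_t n, unsigned minority_percent, std::uint64_t seed)
{
  assert(!majority.empty() && !minority.empty());
  assert(minority_percent <= 100);

  // Integer arithmetic keeps the minority count exact.
  const std::size_t n_minority = n * minority_percent / 100;

  std::vector<std::uint64_t> out;
  out.reserve(n);
  for (std::size_t i = 0; i < n_minority; ++i)
    out.push_back(minority[i % minority.size()]);
  for (std::size_t i = n_minority; i < n; ++i)
    out.push_back(majority[i % majority.size()]);

  // Shuffle so that the opposite lookups are not clustered at the front,
  // which would be trivially predictable for the branch predictor.
  std::mt19937_64 rng(seed);
  std::shuffle(out.begin(), out.end(), rng);
  return out;
}
```

Passing the inserted keys as majority with minority_percent=5 would approximate a "mostly successful" workload, and swapping the two arguments a "mostly unsuccessful" one.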

So, unrolling does not happen; that is out of the way, thanks for
investigating.
I'll use -march=native as you suggest. As for the difference between the
original hash production scheme and the one proposed by Kostas (cells marked
with *), the numbers are not very conclusive, but it looks like Kostas's
approach incurs a slight degradation in execution time. I hope we can see
this more clearly with the upcoming GHA benchmarks on dedicated machines.
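
For readers wondering how num_trials and min_time_per_trial (quoted above) relate to noise, the following is a rough sketch of a repeated-trial timing loop. It is an illustration only and not necessarily how comparison_table.cpp measures: the name median_seconds_per_call and the choice of the median as the summary statistic are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical harness (not the benchmark's own code): in each of
// num_trials trials, calls f repeatedly until at least min_time_per_trial
// has elapsed, then returns the median time per call in seconds.
template<typename F>
double median_seconds_per_call(
  F f, int num_trials, std::chrono::milliseconds min_time_per_trial)
{
  using clock = std::chrono::steady_clock;
  assert(num_trials > 0);

  std::vector<double> trials;
  trials.reserve(static_cast<std::size_t>(num_trials));
  for (int i = 0; i < num_trials; ++i) {
    long long calls = 0;
    const auto start = clock::now();
    auto now = start;
    do {
      f();
      ++calls;
      // Reading the clock on every call adds overhead; f should do
      // enough work (e.g. a whole batch of lookups) to amortize it.
      now = clock::now();
    } while (now - start < min_time_per_trial);
    const std::chrono::duration<double> elapsed = now - start;
    trials.push_back(elapsed.count() / static_cast<double>(calls));
  }

  // The median is less sensitive than the mean to a single trial that was
  // disturbed by a background process.
  const auto mid = trials.begin() + trials.size() / 2;
  std::nth_element(trials.begin(), mid, trials.end());
  return *mid;
}
```

In such a scheme, longer trials average out short disturbances within a trial, and more trials make a robust statistic across trials more reliable, which is consistent with Ivan's change from 10/10 ms to 20/50 ms.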

Joaquin M Lopez Munoz
