From: Ivan Matek (libbooze_at_[hidden])
Date: 2025-06-10 12:52:09
One more thing I noticed and checked that I believe may be of interest, again
not specifically for this proposed Knuth multiplier change, but in general.
Some filters perform much worse on mixed lookup.
To be honest, I expected that for some of them (branch misprediction), but I
do not know enough about all the filters to say whether the slowdowns I have
observed are as expected.
Interestingly, some fast_ filters also perform badly on mixed lookup, although
naively I would assume SIMD code is branchless. I presume the issue is that
sometimes we operate on more than 256 bits, so there is branching...
What I find interesting besides this slowdown is that unordered_flat_set also
performs badly on mixed lookup, so bloom filters that do not degrade on mixed
lookup look much better than if we just compare the 100%/0% cases.
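
To make the mixed case concrete, here is a minimal sketch of how such a workload could be built and timed. It is not the attached benchmark code: make_probes, make_set and time_lookups are hypothetical names, std::unordered_set stands in for unordered_flat_set so that the sketch needs only the standard library, and the hit ratio is a parameter because the exact mix is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <unordered_set>
#include <vector>

// Hypothetical helper: a set holding the keys 0 .. n-1.
std::unordered_set<int> make_set(std::size_t n)
{
    std::unordered_set<int> s;
    s.reserve(n);
    for (std::size_t i = 0; i < n; ++i) s.insert(static_cast<int>(i));
    return s;
}

// Hypothetical helper: n probe keys, of which a fraction hit_ratio lies in
// 0 .. n-1 (present in make_set(n)) and the rest in n .. 2n-1 (absent).
// The keys are shuffled so the outcome of each lookup is not predictable
// from the previous one.
std::vector<int> make_probes(std::size_t n, double hit_ratio, unsigned seed)
{
    assert(hit_ratio >= 0.0 && hit_ratio <= 1.0);
    const std::size_t hits = static_cast<std::size_t>(n * hit_ratio);
    std::vector<int> probes;
    probes.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        const int key = static_cast<int>(i);
        probes.push_back(i < hits ? key : key + static_cast<int>(n));
    }
    std::mt19937 rng(seed);
    std::shuffle(probes.begin(), probes.end(), rng);
    return probes;
}

// Runs every probe against the container, stores the number of hits in
// `found` and returns the average time per lookup in nanoseconds.
// For a bloom filter the membership query would replace s.count(k).
template <class Set>
double time_lookups(const Set& s, const std::vector<int>& probes,
                    std::size_t& found)
{
    assert(!probes.empty());
    const auto t0 = std::chrono::steady_clock::now();
    std::size_t count = 0;
    for (int k : probes) count += s.count(k);
    const auto t1 = std::chrono::steady_clock::now();
    found = count;
    return std::chrono::duration<double, std::nano>(t1 - t0).count()
           / static_cast<double>(probes.size());
}
```

Calling time_lookups with probes built for hit ratios 1.0, 0.0 and 0.5 gives the successful, unsuccessful and mixed cases respectively. A single pass like this is noisy, so a real measurement would repeat it and keep the compiler from discarding the result.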
For example, if we round the times to 1 decimal digit and compare
(successful/unsuccessful/mixed lookup time)
unordered_flat_set
4.7/3.1/*9.4*
vs
filter<int, 1ul, block<unsigned long, 7ul>, 1ul, hash<int>, allocator<int>,
mcg_and_fastrange>
3.3/3.3/*3.3*
the comparison looks very different if we drop the mixed performance (the
bolded part).
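
One way to put a number on this difference is the ratio of the mixed time to the slower of the two pure cases. The sketch below is only an illustration: lookup_times and mixed_slowdown are hypothetical names, and this ratio is my own choice of metric, not necessarily the criterion the modified benchmark uses to report a slow mixed lookup.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical record of the three lookup timings reported per container.
struct lookup_times {
    double successful;
    double unsuccessful;
    double mixed;
};

// Ratio of the mixed time to the slower of the two pure cases.
// A value close to 1 means the mixed workload costs no more than the
// worse pure case; larger values indicate degradation on mixed lookup.
double mixed_slowdown(const lookup_times& t)
{
    const double worst_pure = std::max(t.successful, t.unsuccessful);
    assert(worst_pure > 0.0);
    return t.mixed / worst_pure;
}
```

With the unrounded unordered_flat_set timings from the output (4.68546, 3.10828, 9.3794) the ratio is about 2.0, while the rounded 3.3/3.3/3.3 filter gives 1.0. Comparing against the slower pure case rather than the average of the two is the conservative choice, since it only flags a container whose mixed time exceeds both pure times.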
Here is the output of my small test run for 1M items. I have also attached the
source in case somebody wants to run it on their machine; it is modified
benchmark code.
Note that I did not add the mixed numbers to the HTML output; they are printed
to cerr for cases where a slowdown is detected.
Usual disclaimer: my machine, only 1 compiler, etc.
Slow mixed lookup unordered_flat_set_filter<int>
FPR 0
insertion_time 17.5341
successful_lookup_time 4.68546
unsuccessful_lookup_time 3.10828
mixed_lookup_time 9.3794
Slow mixed lookup filter<int, 1ul, fast_multiblock32<11ul>, 0ul,
hash<int>, allocator<int>, mcg_and_fastrange>
FPR 0.1169
insertion_time 3.13323
successful_lookup_time 3.11569
unsuccessful_lookup_time 2.1553
mixed_lookup_time 9.8229
Slow mixed lookup filter<int, 1ul, fast_multiblock32<13ul>, 0ul,
hash<int>, allocator<int>, mcg_and_fastrange>
FPR 0.028
insertion_time 3.35237
successful_lookup_time 3.57547
unsuccessful_lookup_time 2.49646
mixed_lookup_time 10.173
Slow mixed lookup filter<int, 1ul, fast_multiblock32<11ul>, 0ul,
hash<int>, allocator<int>, fastrange_and_fixed_mcg>
FPR 0.1201
insertion_time 3.19758
successful_lookup_time 3.23826
unsuccessful_lookup_time 2.24502
mixed_lookup_time 9.93281
Slow mixed lookup filter<int, 1ul, fast_multiblock32<13ul>, 0ul,
hash<int>, allocator<int>, fastrange_and_fixed_mcg>
FPR 0.0319
insertion_time 3.32857
successful_lookup_time 3.33717
unsuccessful_lookup_time 2.60953
mixed_lookup_time 10.5385
Slow mixed lookup filter<int, 1ul, fast_multiblock32<11ul>, 1ul,
hash<int>, allocator<int>, mcg_and_fastrange>
FPR 0.084
insertion_time 3.33082
successful_lookup_time 3.55582
unsuccessful_lookup_time 2.28511
mixed_lookup_time 9.61687
Slow mixed lookup filter<int, 1ul, fast_multiblock64<11ul>, 0ul,
hash<int>, allocator<int>, mcg_and_fastrange>
FPR 0.0761
insertion_time 4.61749
successful_lookup_time 4.8703
unsuccessful_lookup_time 3.57926
mixed_lookup_time 11.4899
Slow mixed lookup filter<int, 1ul, fast_multiblock64<11ul>, 1ul,
hash<int>, allocator<int>, mcg_and_fastrange>
FPR 0.0629
insertion_time 4.47389
successful_lookup_time 4.73559
unsuccessful_lookup_time 3.62872
mixed_lookup_time 11.0488
Slow mixed lookup filter<int, 1ul, fast_multiblock32<13ul>, 1ul,
hash<int>, allocator<int>, mcg_and_fastrange>
FPR 0.0183
insertion_time 3.49621
successful_lookup_time 3.31442
unsuccessful_lookup_time 2.77045
mixed_lookup_time 9.50943
Slow mixed lookup filter<int, 1ul, fast_multiblock64<13ul>, 0ul,
hash<int>, allocator<int>, mcg_and_fastrange>
FPR 0.0154
insertion_time 5.32834
successful_lookup_time 6.67898
unsuccessful_lookup_time 4.46843
mixed_lookup_time 12.1819
Slow mixed lookup filter<int, 1ul, fast_multiblock64<14ul>, 1ul,
hash<int>, allocator<int>, mcg_and_fastrange>
FPR 0.0121
insertion_time 6.06002
successful_lookup_time 6.80858
unsuccessful_lookup_time 4.30832
mixed_lookup_time 12.2018
Slow mixed lookup filter<int, 1ul, fast_multiblock32<11ul>, 1ul,
hash<int>, allocator<int>, fastrange_and_fixed_mcg>
FPR 0.0914
insertion_time 3.14643
successful_lookup_time 3.08782
unsuccessful_lookup_time 2.14325
mixed_lookup_time 9.40797
Slow mixed lookup filter<int, 1ul, fast_multiblock64<11ul>, 0ul,
hash<int>, allocator<int>, fastrange_and_fixed_mcg>
FPR 0.0757
insertion_time 4.24599
successful_lookup_time 4.46389
unsuccessful_lookup_time 3.33413
mixed_lookup_time 11.2218
Slow mixed lookup filter<int, 1ul, fast_multiblock64<11ul>, 1ul,
hash<int>, allocator<int>, fastrange_and_fixed_mcg>
FPR 0.0638
insertion_time 4.29631
successful_lookup_time 4.46474
unsuccessful_lookup_time 3.33839
mixed_lookup_time 11.0688
Slow mixed lookup filter<int, 1ul, fast_multiblock32<13ul>, 1ul,
hash<int>, allocator<int>, fastrange_and_fixed_mcg>
FPR 0.0191
insertion_time 3.28589
successful_lookup_time 3.28611
unsuccessful_lookup_time 2.50493
mixed_lookup_time 9.61269
Slow mixed lookup filter<int, 1ul, fast_multiblock64<13ul>, 0ul,
hash<int>, allocator<int>, fastrange_and_fixed_mcg>
FPR 0.0145
insertion_time 5.298
successful_lookup_time 5.89132
unsuccessful_lookup_time 5.36365
mixed_lookup_time 15.9068
Slow mixed lookup filter<int, 1ul, fast_multiblock64<14ul>, 1ul,
hash<int>, allocator<int>, fastrange_and_fixed_mcg>
FPR 0.0125
insertion_time 6.79997
successful_lookup_time 6.54907
unsuccessful_lookup_time 5.04036
mixed_lookup_time 13.1906