AHash is designed to prevent an adversary that does not know the key from being able to create hash collisions or partial collisions.
If you are a cryptographer and would like to help review aHash's algorithm, please post a comment here.
In short, this is achieved by ensuring that:
finish
call at the end of the hash is designed to not expose individual bits of the internal state.More details are available on the wiki.
Cryptographic hashes are designed to make is nearly impossible to find two items that collide when the attacker has full control over the input. This has several implications:
For a HashMap the requirements are different.
u32
or similar, and to be effective the bucket that it should be hashed to needs to be computed in just a few CPU cycles.From a cryptography point of view, a hashmap needs something closer to a block cypher. Where the input can be quickly mixed in a way that cannot be reversed without knowing a key.
It is not designed to be. Attempting to use aHash as a secure hash will likely fail to hold up for several reasons:
There are efforts to build a secure hash function that uses AES-NI for acceleration, but aHash is not one of them.
AHash uses a number of tricks.
One trick is taking advantage of specialization. If aHash is compiled on nightly it will take advantage of specialized hash implementations for strings, slices, and primitives.
Another is taking advantage of hardware instructions. When it is available aHash uses AES rounds using the AES-NI instruction. AES-NI is very fast (on an intel i7-6700 it is as fast as a 64 bit multiplication.) and handles 16 bytes of input at a time, while being a very strong permutation.
This is obviously much faster than most standard approaches to hashing, and does a better job of scrambling data than most non-secure hashes.
On an intel i7-6700 compiled on nightly Rust with flags -C opt-level=3 -C target-cpu=native -C codegen-units=1
:
Input | SipHash 1-3 time | FnvHash time | FxHash time | aHash time | aHash Fallback* |
---|---|---|---|---|---|
u8 | 9.3271 ns | 0.808 ns | 0.594 ns | 0.7704 ns | 0.7664 ns |
u16 | 9.5139 ns | 0.803 ns | 0.594 ns | 0.7653 ns | 0.7704 ns |
u32 | 9.1196 ns | 1.4424 ns | 0.594 ns | 0.7637 ns | 0.7712 ns |
u64 | 10.854 ns | 3.0484 ns | 0.628 ns | 0.7788 ns | 0.7888 ns |
u128 | 12.465 ns | 7.0728 ns | 0.799 ns | 0.6174 ns | 0.6250 ns |
1 byte string | 11.745 ns | 2.4743 ns | 2.4000 ns | 1.4921 ns | 1.5861 ns |
3 byte string | 12.066 ns | 3.5221 ns | 2.9253 ns | 1.4745 ns | 1.8518 ns |
4 byte string | 11.634 ns | 4.0770 ns | 1.8818 ns | 1.5206 ns | 1.8924 ns |
7 byte string | 14.762 ns | 5.9780 ns | 3.2282 ns | 1.5207 ns | 1.8933 ns |
8 byte string | 13.442 ns | 4.0535 ns | 2.9422 ns | 1.6262 ns | 1.8929 ns |
15 byte string | 16.880 ns | 8.3434 ns | 4.6070 ns | 1.6265 ns | 1.7965 ns |
16 byte string | 15.155 ns | 7.5796 ns | 3.2619 ns | 1.6262 ns | 1.8011 ns |
24 byte string | 16.521 ns | 12.492 ns | 3.5424 ns | 1.6266 ns | 2.8311 ns |
68 byte string | 24.598 ns | 50.715 ns | 5.8312 ns | 4.8282 ns | 5.4824 ns |
132 byte string | 39.224 ns | 119.96 ns | 11.777 ns | 6.5087 ns | 9.1459 ns |
1024 byte string | 254.00 ns | 1087.3 ns | 156.41 ns | 25.402 ns | 54.566 ns |
As you can see above aHash like FxHash
provides a large speedup over SipHash-1-3
which is already nearly twice as fast as SipHash-2-4
.
Rust's HashMap by default uses SipHash-1-3
because faster hash functions such as FxHash
are predictable and vulnerable to denial of service attacks. While aHash
has both very strong scrambling and very high performance.
AHash performs well when dealing with large inputs because aHash reads 8 or 16 bytes at a time. (depending on availability of AES-NI)
Because of this, and its optimized logic, aHash
is able to outperform FxHash
with strings. It also provides especially good performance dealing with unaligned input. (Notice the big performance gaps between 3 vs 4, 7 vs 8 and 15 vs 16 in FxHash
above)
Hardware AES instructions are built into Intel processors built after 2010 and AMD processors after 2012. It is also available on many other CPUs should in eventually be able to get aHash to work. However, only X86 and X86-64 are the only supported architectures at the moment, as currently they are the only architectures for which Rust provides an intrinsic.
aHash also uses sse2
and sse3
instructions. X86 processors that have aesni
also have these instruction sets.