Adaptive Hashing: Faster Hash Functions Perhaps with Fewer Collisions

Published: 7 May 2024

Abstract

Hash tables are ubiquitous, and the choice of hash function, which maps a key to a bucket, is key for their performance. We argue that the predominant approach of fixing the hash function for the lifetime of the hash table is suboptimal and propose adapting it to the current set of keys. In the prevailing view, good hash functions spread the keys "randomly" and are fast to evaluate. General-purpose ones (e.g. Murmur) are designed to do both while remaining agnostic to the distribution of the keys, which limits their bucketing ability and wastes computation. When these shortcomings are recognised, the user of the hash table may specify a hash function more tailored to the expected key distribution, but doing so almost always introduces an unbounded risk in case their assumptions do not bear out in practice. At the other, fully key-aware end of the spectrum, Perfect Hashing algorithms can discover hash functions to bucket a given set of keys optimally, but they are costly to run and require the keys to be known and fixed ahead of time. Our main conceptual contribution is that adapting the hash table's hash function to the keys online is necessary for the best performance as adaptivity allows for better bucketing of keys and faster hash functions. We instantiate the idea of online adaptation with minimal overhead and no change to the hash table API. The experiments show that the adaptive approach marries the common-case performance of weak hash functions with the robustness of general-purpose ones.
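The abstract does not spell out the adaptation mechanism, so the following is only a minimal sketch of the general idea, not the paper's implementation. It assumes integer keys, a chained table of fixed size, a hypothetical chain-length threshold `MAX_CHAIN` as the trigger, and a simple multiplicative mixer standing in for a general-purpose hash. The class and method names are invented for illustration.

```python
class AdaptiveTable:
    """Toy chained hash table: starts with a cheap hash and switches to a
    stronger one when a bucket chain grows too long (assumed policy)."""

    MAX_CHAIN = 4  # hypothetical threshold, not taken from the paper

    def __init__(self, n_buckets=16):
        self.n_buckets = n_buckets
        self.buckets = [[] for _ in range(n_buckets)]
        self.hash_fn = self._weak_hash
        self.adapted = False

    @staticmethod
    def _weak_hash(key):
        # Cheap hash: the integer key itself, so low bits pick the bucket.
        return key

    @staticmethod
    def _strong_hash(key):
        # Stand-in for a general-purpose hash: 64-bit multiplicative mixing,
        # keeping the high bits, which depend on all input bits.
        x = (key * 0x9E3779B97F4A7C15) & 0xFFFFFFFFFFFFFFFF
        return x >> 32

    def _index(self, key):
        return self.hash_fn(key) % self.n_buckets

    def put(self, key, value):
        chain = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)
                return
        chain.append((key, value))
        # Adapt once, when the cheap hash visibly buckets the keys badly.
        if not self.adapted and len(chain) > self.MAX_CHAIN:
            self._adapt()

    def get(self, key, default=None):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default

    def _adapt(self):
        # Switch to the stronger hash and redistribute the existing entries.
        self.hash_fn = self._strong_hash
        self.adapted = True
        old = self.buckets
        self.buckets = [[] for _ in range(self.n_buckets)]
        for chain in old:
            for k, v in chain:
                self.buckets[self._index(k)].append((k, v))
```

With sequential keys the cheap hash already spreads entries evenly and no switch occurs; with keys that are all multiples of the bucket count, every key lands in one bucket, the threshold is crossed, and the table rehashes with the stronger function. Callers see the same `put`/`get` interface either way, which mirrors the abstract's point that adaptation needs no API change.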

Authors

Gábor Melis

Venue

European Lisp Symposium 2024
