The Hash Map core type offers an ordered key-value map with a very simple API.
The Hash Map is ordered by order of insertion.
The core Hash Map type is provided in a single file library fio_hashmap.h
, which allows it's functions to be inlined for maximum performance (similarly to macros).
The Hash Map defaults to uint64_t
keys, meaning that matching is performed using a simple integer comparison. However, this could be changed by defining macros before including the library file for increased collision protection.
Much like the Array in the introduction to the simple core types, Hash Map containers can be placed on the stack as well as allocated dynamically.
The Core Hash uses the fio_hash_s
type.
The data in in the fio_hash_s
type should be considered opaque and shouldn't be accessed directly. Functional access should be preferred.
The fio_hash_s
utilizes two arrays using internal data types that contain duplicates of the Hash key data, maximizing memory locality for Hash Map operations and minimizing cache misses to increase performance.
typedef struct fio_hash_data_ordered_s {
FIO_HASH_KEY_TYPE key; /* another copy for memory cache locality */
void *obj;
} fio_hash_data_ordered_s;
typedef struct fio_hash_data_s {
FIO_HASH_KEY_TYPE key; /* another copy for memory cache locality */
struct fio_hash_data_ordered_s *obj;
} fio_hash_data_s;
/* the information in the Hash Map structure should be considered READ ONLY. */
struct fio_hash_s {
uintptr_t count;
uintptr_t capa;
uintptr_t pos;
uintptr_t mask;
fio_hash_data_ordered_s *ordered;
fio_hash_data_s *map;
};
fio_hash_new
void fio_hash_new(fio_hash_s *hash)
Allocates and initializes internal data and resources.
It's important to call this function before using the Hash Map.
Lazy initialization of the Hash Map is possible by initializing the fio_hash_s
to 0 (the default for static variables).
fio_hash_free
void fio_hash_free(fio_hash_s *hash)
Deallocates any internal resources.
It is critical that this function is called to prevent memory leaks. However, this function is not enough to prevent memory leaks - custom keys (if any) and object data should also be deallocated (see also FIO_HASH_FOR_FREE
).
fio_hash_insert
void *fio_hash_insert(fio_hash_s *hash,
FIO_HASH_KEY_TYPE key,
void *obj)`
Inserts an object to the Hash Map Table, rehashing if required, returning the old object if it exists.
Set obj
to NULL to remove an existing object (the existing object will be returned).
The FIO_HASH_KEY_TYPE
defaults to uint64_t
, this could be changed by defining macros before including the library file for increased collision protection.
fio_hash_find
void *fio_hash_find(fio_hash_s *hash, FIO_HASH_KEY_TYPE key)
Locates an object in the Hash Map Table according to the hash key value.
fio_hash_count
size_t fio_hash_count(const fio_hash_s *hash)
Returns the number of elements currently in the Hash Table.
fio_hash_capa
size_t fio_hash_capa(const fio_hash_s *hash)
Returns a temporary theoretical Hash map capacity.
This could be used for testing performance and memory consumption.
fio_hash_compact
size_t fio_hash_compact(fio_hash_s *hash)
Attempts to minimize memory usage by removing empty spaces caused by deleted items (freeing their custom keys, if any) and rehashing the Hash Map.
Returns the updated hash map capacity.
fio_hash_rehash
void fio_hash_rehash(fio_hash_s *hash)
Forces a rehashing of the hash, increasing memory consumption as well as minimizing internal collisions (possibly improving seek times).
This function is called automatically when needed, it's unlikely that you should want to call the function yourself.
fio_hash_each
size_t fio_hash_each(fio_hash_s *hash,
const size_t start_at,
int (*task)(FIO_HASH_KEY_TYPE key,
void *obj,
void *arg),
void *arg);
Iteration using a callback for each entry in the Hash Table.
The callback task function must accept the hash key, the entry data and an opaque user pointer:
int example_task(FIO_HASH_KEY_TYPE key, void *obj, void *arg) {return 0;}
If the callback returns -1, the loop is broken. Any other value is ignored.
Returns the relative "stop" position, i.e., the number of items processed + the starting point.
FIO_HASH_FOR_LOOP
FIO_HASH_FOR_LOOP(hash, i)
A macro for a for
loop that iterates over all the hashed objects (in
order).
hash
a pointer to the hash table variable and i
is a temporary variable
name to be created for iteration.
i->key
is the key and i->obj
is the hashed data.
The i
variable can be names differently (i.e. FIO_HASH_FOR_LOOP(hash, pos)
for pos->key
and pos->obj
).
FIO_HASH_FOR_FREE
FIO_HASH_FOR_FREE(hash, i)
A macro for a for
loop that will iterate over all the hashed objects (in
order) and empties the hash, later calling fio_hash_free
to free the hash.
hash
a pointer to the hash table variable and i
is a temporary variable
name to be created for iteration.
i->key
is the key and i->obj
is the hashed data.
Free the objects and the Hash Map container manually (if required). Custom keys will be freed automatically when using this macro.
FIO_HASH_FOR_EMPTY
FIO_HASH_FOR_EMPTY(hash, i)
A macro for a for
loop that iterates over all the hashed objects (in
order) and empties the hash.
This will also reallocate the map's memory (to zero out the data), so if this
is performed before calling fio_hash_free
, use FIO_HASH_FOR_FREE
instead.
hash
a pointer to the hash table variable and i
is a temporary variable
name to be created for iteration.
i->key
is the key and i->obj
is the hashed data.
Free the objects and the Hash Map container manually (if required). Custom keys will be freed automatically when using this macro.
The Hash Map is collision resistant as long as it's keys are truly unique.
If there's a chance that the default uint64_t
key type will not be be able to uniquely identify a key, the following macros should all be defined, before including the fio_hashmap.h
file in the .c
file, allowing the default key system to be replaced:
FIO_HASH_KEY_TYPE
This macro sets the type used for keys.
FIO_HASH_KEY_INVALID
Empty slots in the Hash Map are initialized so all their bytes are zero.
This macro should signify a static key that has all it's byte set to zero, making it an invalid key (it cannot be used, and objects placed in that slot will be lost).
FIO_HASH_KEY2UINT(key)
This macro should convert the key to a unique unsigned number.
This, in effect, should return the hash value for the key and cannot be zero.
The value is used to determine the location of the key in the map (prior to any collisions) and a good hash will minimize collisions.
FIO_HASH_COMPARE_KEYS(k1, k2)
This macro should compare two keys, excluding their hash values (which were compared using the FIO_HASH_KEY2UINT
macro).
FIO_HASH_KEY_ISINVALID(key)
Should evaluate as true if the key is an invalid key (all it's bytes set to zero).
FIO_HASH_KEY_COPY(key)
Keys might contain temporary data (such as strings). To allow the Hash Map to test the key even after the temporary data is out of scope, ac opy needs to be created.
This macro should return a FIO_HASH_KEY_TYPE object that contains only persistent data. This could achieved by allocating some of the data using malloc
.
FIO_HASH_KEY_DESTROY(key)
When the Hash Map is re-hashed, old keys belonging to removed objects are cleared away and need to be destroyed.
This macro allows dynamically allocated memory to be freed (this is the complement of FIO_HASH_KEY_COPY
).
Note: This macro must not end a statement (shouldn't use the ;
marker) or code blocks ({}
). For multiple actions consider using inline functions.
If the FIO_HASH_KEY_COPY
macro allocated memory dynamically or if there's a need to iterate over the values in the Hash Map before freeing the Hash Map (perhaps to free the object's memory), the FIO_HASH_FOR_FREE
macro can be used to iterate over the Hash Map, free all the keys and free the Hash Map resources (it ends by calling fio_hash_free
).
Note: These macros are localized to the specific C file in which they were defined and a specific C file can't include the fio_hashmap.h
more than once. If a few different approaches are required, it can be performed by using different C files and offering function wrappers for the Hash Map functions (wrap fio_hash_find
, fio_hash_insert
, etc').
In this example collisions are forced by setting the hash
and string length to be equal for all keys, this demonstrates how defining these macros can secure the hash against String key collisions.
(another example can be found in the pubsub.c
source code)
#define _GNU_SOURCE
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
/* the hash key type for string keys */
typedef struct {
size_t hash;
size_t len;
char *str;
} fio_hash_key_s;
/* strdup is usually available... but just in case it isn't */
static inline char *my_strdup(char *str, size_t len) {
char *ret = malloc(len + 1);
if(!ret)
exit(-1);
ret[len] = 0;
memcpy(ret, str, len);
return ret;
}
/* define the macro to set the key type */
#define FIO_HASH_KEY_TYPE fio_hash_key_s
/* the macro that returns the key's hash value */
#define FIO_HASH_KEY2UINT(key) ((key).hash)
/* Compare the keys using length testing and `memcmp` */
#define FIO_HASH_COMPARE_KEYS(k1, k2) \
((k1).str == (k2).str || \
((k1).len == (k2).len && !memcmp((k1).str, (k2).str, (k2).len)))
/* an "all bytes are zero" invalid key */
#define FIO_HASH_KEY_INVALID ((fio_hash_key_s){.hash = 0})
/* tests if a key is the invalid key */
#define FIO_HASH_KEY_ISINVALID(key) ((key).str == NULL)
/* creates a persistent copy of the key's string */
#define FIO_HASH_KEY_COPY(key) \
((fio_hash_key_s){.hash = (key).hash, \
.len = (key).len, \
.str = my_strdup((key).str, (key).len)})
/* frees the allocated string, remove the `fprintf` in production */
#define FIO_HASH_KEY_DESTROY(key) \
(fprintf(stderr, "freeing %s\n", (key).str), free((key).str))
#include "fio_hashmap.h"
int main(void) {
fio_hash_s hash;
fio_hash_key_s key1 = {.hash = 1, .len = 5, .str = "hello"};
fio_hash_key_s key1_copy = {.hash = 1, .len = 5, .str = "hello"};
fio_hash_key_s key2 = {.hash = 1, .len = 5, .str = "Hello"};
fio_hash_key_s key3 = {.hash = 1, .len = 5, .str = "Hell0"};
fio_hash_new(&hash);
fio_hash_insert(&hash, key1, key1.str);
key1.str = "oops";
if (fio_hash_find(&hash, key1))
fprintf(stderr,
"ERROR: string comparison should have failed, instead got: %s\n",
(char *)fio_hash_find(&hash, key1));
else if (fio_hash_find(&hash, key1_copy))
fprintf(stderr, "Hash string comparison passed for %s\n",
(char *)fio_hash_find(&hash, key1_copy));
fio_hash_insert(&hash, key2, key2.str);
fio_hash_insert(&hash, key3, key3.str);
fio_hash_insert(&hash, key2, NULL); /* deletes the key2 object */
fio_hash_rehash(&hash); /* forces the unused key to be destroyed */
fprintf(stderr, "Did we free %s?\n", key2.str);
FIO_HASH_FOR_EMPTY(&hash, i) { (void)i->obj; }
if (fio_hash_find(&hash, key1_copy))
fprintf(stderr,
"ERROR: string comparison should have failed, instead got: %s\n",
(char *)fio_hash_find(&hash, key1));
fprintf(stderr, "reinserting stuff\n");
fio_hash_insert(&hash, key2, key2.str);
fio_hash_insert(&hash, key3, key3.str);
FIO_HASH_FOR_FREE(&hash, i) { (void)i->obj; }
}