Boost C++ Libraries
...one of the most highly regarded and expertly designed C++ library projects in the world. — Herb Sutter and Andrei Alexandrescu , C++ Coding Standards
Class template unordered_set
boost::unordered_set — An unordered associative container that stores unique values.
The elements are organized into buckets. Keys with the same hash code are stored in the same bucket.
The number of buckets can be automatically increased by a call to insert, or as the result of calling rehash.
unordered_set public types
typedef typename allocator_type :: pointer pointer ;
value_type* if allocator_type::pointer is not defined.
typedef typename allocator_type :: const_pointer const_pointer ;
boost::pointer_to_other<pointer, value_type>::type if allocator_type::const_pointer is not defined.
typedef implementation - defined size_type ;
An unsigned integral type.
size_type can represent any non-negative value of difference_type.
typedef implementation - defined difference_type ;
A signed integral type.
Is identical to the difference type of iterator and const_iterator.
typedef implementation - defined iterator ;
A constant iterator whose value type is value_type.
The iterator category is at least a forward iterator.
Convertible to const_iterator.
typedef implementation - defined const_iterator ;
typedef implementation - defined local_iterator ;
An iterator with the same value type, difference type and pointer and reference type as iterator.
A local_iterator object can be used to iterate through a single bucket.
typedef implementation - defined const_local_iterator ;
A constant iterator with the same value type, difference type and pointer and reference type as const_iterator.
A const_local_iterator object can be used to iterate through a single bucket.
typedef implementation - defined node_type ;
See node_handle_set for details.
typedef implementation - defined insert_return_type ;
unordered_set public construct/copy/destruct
Constructs an empty container using hasher() as the hash function, key_equal() as the key equality predicate, allocator_type() as the allocator and a maximum load factor of 1.0.
Constructs an empty container with at least n buckets, using hf as the hash function, eq as the key equality predicate, a as the allocator and a maximum load factor of 1.0.
Constructs an empty container with at least n buckets, using hf as the hash function, eq as the key equality predicate, a as the allocator and a maximum load factor of 1.0 and inserts the elements from [f, l) into it.
The copy constructor. Copies the contained elements, hash function, predicate, maximum load factor and allocator.
If Allocator::select_on_container_copy_construction exists and has the right signature, the allocator will be constructed from its result.
The move constructor.
Constructs an empty container, using allocator a .
Constructs an container, copying x 's contained elements, hash function, predicate, maximum load factor, but using allocator a .
Construct a container moving x 's contained elements, and having the hash function, predicate and maximum load factor, but using allocate a .
Constructs an empty container with at least n buckets, using hf as the hash function, eq as the key equality predicate, a as the allocator and a maximum load factor of 1.0 and inserts the elements from il into it.
Constructs an empty container with at least n buckets, using hf as the hash function, the default hash function and key equality predicate, a as the allocator and a maximum load factor of 1.0.
Constructs an empty container with at least n buckets, using hf as the hash function, the default key equality predicate, a as the allocator and a maximum load factor of 1.0.
Constructs an empty container with at least n buckets, using a as the allocator, with the default hash function and key equality predicate and a maximum load factor of 1.0 and inserts the elements from [f, l) into it.
Constructs an empty container with at least n buckets, using hf as the hash function, a as the allocator, with the default key equality predicate and a maximum load factor of 1.0 and inserts the elements from [f, l) into it.
The assignment operator. Copies the contained elements, hash function, predicate and maximum load factor but not the allocator.
If Alloc::propagate_on_container_copy_assignment exists and Alloc::propagate_on_container_copy_assignment::value is true, the allocator is overwritten, if not the copied elements are created using the existing allocator.
The move assignment operator.
If Alloc::propagate_on_container_move_assignment exists and Alloc::propagate_on_container_move_assignment::value is true, the allocator is overwritten, if not the moved elements are created using the existing allocator.
Assign from values in initializer list. All existing elements are either overwritten by the new elements or destroyed.
unordered_set size and capacity
Unordered_set iterators, unordered_set modifiers.
Inserts an object, constructed with the arguments args , in the container if and only if there is no element in the container with an equivalent value.
hint is a suggestion to where the element should be inserted.
Inserts obj in the container if and only if there is no element in the container with an equivalent value.
Inserts a range of elements into the container. Elements are inserted if and only if there is no element in the container with an equivalent value.
Removes the element pointed to by position .
Removes an element with key equivalent to k .
If nh is empty, has no affect.
Otherwise inserts the element owned by nh if and only if there is no element in the container with an equivalent value.
If there is already an element in the container with an equivalent value has no effect on nh (i.e. nh still contains the node.)
Erase the element pointed to by position .
Erase all elements with key equivalent to k .
Erases the elements in the range from first to last .
Erases all elements in the container.
Swaps the contents of the container with the parameter.
If Allocator::propagate_on_container_swap is declared and Allocator::propagate_on_container_swap::value is true then the containers' allocators are swapped. Otherwise, swapping with unequal allocators results in undefined behavior.
Unordered_set lookup, unordered_set bucket interface, unordered_set hash policy.
Changes the number of buckets so that there at least n buckets, and so that the load factor is less than the maximum load factor.
Invalidates iterators, and changes the order of elements. Pointers and references to elements are not invalidated.
unordered_set Equality Comparisons
Return true if x.size() == y.size and for every element in x , there is an element in y with the same for the same key, with an equal value (using operator== to compare the value types).
Return false if x.size() == y.size and for every element in x , there is an element in y with the same for the same key, with an equal value (using operator== to compare the value types).
Swaps the contents of x and y .
- Stack Overflow Public questions & answers
- Stack Overflow for Teams Where developers & technologists share private knowledge with coworkers
- Talent Build your employer brand
- Advertising Reach developers & technologists worldwide
- About the company
Collectives™ on Stack Overflow
Find centralized, trusted content and collaborate around the technologies you use most.
Q&A for work
Connect and share knowledge within a single location that is structured and easy to search.
How to use boost hash_value in std unordered_set?
How can I use boost::hash_value<std::pair<>> in std::unordered_set<std::pair<>> without defining a functor?
The following doesn't work:
The execution aborts.
I think the above doesn't work because hasher() doesn't exist. I am using g++ -std=c++14
- Here is mcve: godbolt.org/z/nYsr5M4e1 – Marek R Jun 23, 2021 at 17:43
2 Answers 2
This seems to work, according to my tests:
- Let me rephrase my question. If I have an existing hash function, then how do I use it in STL associative containers without enclosing it in a new functor? – user3689963 Jun 24, 2021 at 6:01
- @user3689963 You don't. Neither does this answer: using doesn't introduce new types and the standard hash<> specialization gets inlined (same as for std::hash<> ) – sehe Jun 24, 2021 at 10:07
You can do it with a lambda:
- 3 Works only since c++20 where lambda are defult constructible. – rafix07 Jun 23, 2021 at 17:47
- @rafix07 Oh yes... Well, maybe the OP can use it. – Paul Sanders Jun 23, 2021 at 17:48
- @rafix07 you ahve been faster: wandbox.org/permlink/3fmBWivrTSBGsOAE – Marek R Jun 23, 2021 at 17:48
Sign up or log in, post as a guest.
Required, but never shown
Not the answer you're looking for? Browse other questions tagged c++ boost unordered-set or ask your own question .
- The Overflow Blog
- Can Stack Overflow save the day?
- Let’s talk large language models (Ep. 546)
- Featured on Meta
- We've added a "Necessary cookies only" option to the cookie consent popup
- The Stack Exchange reputation system: What's working? What's not?
- Launching the CI/CD and R Collectives and community editing features for...
- Staging Ground Beta 1 Recap, and Reviewers needed for Beta 2
- Temporary policy: ChatGPT is banned
Hot Network Questions
- What is dependency grammar and what are the possible relationships?
- Linearize conditional constraint
- Is there any reason 3d printing aircraft is hard?
- Portable Alternatives to Traditional Keyboard/Mouse Input
- Why does cat with no argument read from standard input?
- Moon's equation of the centre discrepancy
- A metric characterization of the real line
- How can I heat my buildings without fire in a low-fantasy setting?
- Two Blinkers, A Pair of Boats, etc
- When to claim check dated in one year but received the next
- Change the background color of the active windows
- Is "throw in an ape" an expression?
- Next word in this progression
- How can I check if this airline ticket is genuine?
- Can I interpret `< file1 cmd1` as `cmd1 file1`?
- Ro Laren's nose ridges
- Trying to remember a short film about an assembly line AI becoming self-aware
- What do you do after your article has been published?
- "Mainframe" with Z80
- Create a simple Latex macro which expands the format to sequence
- What kind of screw has a wide flange with a smaller head above?
- Is “senior year” a direct object or something else in “I played my senior year”? What about “perfect game” in “I threw a perfect game”?
- Explain Like I'm 5 How Oath Spells Work (D&D 5e)
- Where can I create nice looking graphics for a paper?
Chapter 15. Boost.Unordered
Boost.Unordered provides the classes boost::unordered_set , boost::unordered_multiset , boost::unordered_map , and boost::unordered_multimap . These classes are identical to the hash containers that were added to the standard library with C++11. Thus, you can ignore the containers from Boost.Unordered if you work with a development environment supporting C++11.
boost::unordered_set can be replaced with std::unordered_set in Example 15.1 . boost::unordered_set doesn’t differ from std::unordered_set .
Example 15.2 uses boost::unordered_map to store the names and the number of legs for several animals. Once again, boost::unordered_map could be replaced with std::unordered_map .
In Example 15.3 elements of type animal are stored in a container of type boost::unordered_set . Because the hash function of boost::unordered_set doesn’t know the class animal , hash values can’t be automatically calculated for elements of this type. That’s why a hash function must be defined – otherwise the example can’t be compiled.
The name of the hash function to define is hash_value() . It must expect as its sole parameter an object of the type the hash value should be calculated for. The type of the return value of hash_value() must be std::size_t .
The function hash_value() is automatically called when the hash value has to be calculated for an object. This function is defined for various types in the Boost libraries, including std::string . For user-defined types like animal , it must be defined by the developer.
Usually, the definition of hash_value() is rather simple: Hash values are created by accessing the member variables of an object one after another. This is done with the function boost::hash_combine() , which is provided by Boost.Hash and defined in boost/functional/hash.hpp . You don’t have to include this header file if you use Boost.Unordered because all containers from this library access Boost.Hash to calculate hash values.
In addition to defining hash_value() , you need to make sure two objects can be compared using == . That’s why the operator operator== is overloaded for animal in Example 15.3 .
The hash containers from the C++11 standard library use a hash function from the header file functional . The hash containers from Boost.Unordered expect the hash function hash_value() . Whether you use Boost.Hash within hash_value() doesn’t matter. Boost.Hash makes sense because functions like boost::hash_combine() make it easier to calculate hash values from multiple member variables step by step. However, this is only an implementation detail of hash_value() . Apart from the different hash functions used, the hash containers from Boost.Unordered and the standard library are basically equivalent.
Unordered set is an associative container that contains a set of unique objects of type Key. Search, insertion, and removal have average constant-time complexity.
Internally, the elements are not sorted in any particular order, but organized into buckets. Which bucket an element is placed into depends entirely on the hash of its value. This allows fast access to individual elements, since once a hash is computed, it refers to the exact bucket the element is placed into.
Container elements may not be modified (even by non const iterators) since modification could change an element's hash and corrupt the container.
std::unordered_set meets the requirements of Container , AllocatorAwareContainer , UnorderedAssociativeContainer .
[ edit ] Iterator invalidation
[ edit ] notes.
- The swap functions do not invalidate any of the iterators inside the container, but they do invalidate the iterator marking the end of the swap region.
- References and pointers to data stored in the container are only invalidated by erasing that element, even when the corresponding iterator is invalidated.
- After container move assignment, unless elementwise move assignment is forced by incompatible allocators, references, pointers, and iterators (other than the end iterator) to moved-from container remain valid, but refer to elements that are now in * this .
[ edit ] Template parameters
[ edit ] member types, [ edit ] member functions, [ edit ] non-member functions.
The member types iterator and const_iterator may be aliases to the same type. This means defining a pair of function overloads using the two types as parameter types may violate the One Definition Rule . Since iterator is convertible to const_iterator , a single function with a const_iterator as parameter type will work instead.
[ edit ] Example
- Todo with reason
- Todo no example
- Recent changes
- Offline version
- What links here
- Related changes
- Upload file
- Special pages
- Printable version
- Permanent link
- Page information
- In other languages
- This page was last modified on 17 March 2023, at 08:03.
- This page has been accessed 1,498,748 times.
- About cppreference.com
Use pair as a key in std::unordered_set in C++
This post will discuss how to use std::pair as a key in std::unordered_set in C++.
In the previous post , we have seen that the default value for the second template parameter of std::set is std::less , which will delegate to operator< . Since operator< is defined for pairs, std::set performs a lexicographical comparison on two pair objects to define order. This is the reason the following declaration works in C++:
On the other hand, std::unordered_set throws a compilation error when std::pair is used as a key.
This is because unordered containers like std::unordered_set and std::unordered_map uses std::hash for computing hash value for its keys and there is no standard specialization of std::hash for std::pair in the C++ library.
To use pair as a key in a std::unordered_set , we can follow any of the following approaches:
1. Using std::hash function
We can define the specialization for std::hash that works with std::pair .
Download Run Code
Output: four: 4 one: 1 three: 3 two: 2
The above code uses XOR as a hash combination function for simplicity. XOR should be avoided as XOR is symmetric (x ^ y == y ^ x) and maps identical values to zero (x ^ x == y ^ y == 0) . This would end up with far too many collisions in the real world. For production code, we should shift/rotate one of the hashes before XORing.
2. Using boost Library
Since there is no specialization of std::hash for std::pair in the standard library, a good alternative is to use the boost::hash from Boost.Functional . We can use it to hash integers, floats, pointers, strings, arrays, pairs, and standard containers.
Output: three: 3 four: 4 one: 1 two: 2
That’s all about using pair as a key in std::unordered_set in C++.
Rate this post
Average rating 4.9 /5. Vote count: 20
No votes so far! Be the first to rate this post.
We are sorry that this post was not useful for you!
Tell us how we can improve this post?
Custom Hash Functions for C++ Unordered Containers
For simplicity, I’ll be “using namespace std” throughout this post. In production code, one should refrain from such pollution of namespace, though.
Motivation and Background
C++ unordered containers (e.g. unordered_map , unordered_set , etc.) uses “hashing” to store objects. The STL provides hash functions for commonly used types, like string and numeric values such as int , meaning that we won’t have to provide any hash functions explicitly when creating an unordered container instance:
However, when we want to hold a more complex type, or even a user-defined class, C++ unordered containers would fail to find a hash function for itself to use. This is when custom hash functions come in. We will have to explicitly define a hash function for the type (e.g. pair<string, int> ) that we want to use as a template argument for some unordered container.
Implementation Idea: Exclusive-Or
According to the book “A Tour of C++” by Bjarne Stroustrup, “A hash function is often provided as a function object”. And creating a new hash function by “combining existing functions using exclusive-or ( ^ ) is simple and often very effective”.
Intuitively, this means that if we want to hash a pair of string and integer, we could use the following “new” hash function:
The next thing we need to do is to pass that as a template argument when creating our unordered container. For an unordered_set , that will be the second argument. For an unordered_map , that’ll be the third argument after the key type and the value type. Note that we only need the hashing function for a map’s “key”, not the value.
Method 1: Function Object with Struct
Perhaps the most commonly used method (and also my preferred method) is to create a function object (functor) using a struct, and implement its operator() function. This function is to be qualified as const , and takes a const reference to the custom type and returns a size_t . Defining a function object outside as a standalone struct also gives it the flexibility of “generic programming” with templates. Meaning that it can work with pairs of different template arguments, not just strings and ints.
Putting this all together, we have:
Method 2: Custom Lambda Hash Function
Alternatively, we can also implement our custom hash function as a lambda. Some may find this inline implementation more elegant. A drawback is that C++11 lambdas do not support templates. Fortunately, we have “generic lambdas” since C++14 (as seen below).
Another drawback is that we need to use decltype of the lambda as the extra custom hash function argument for our unordered containers, and also provide constructor arguments in parentheses after the declared variable name. Both unordered_set and unordered_map have a constructor that takes an initial number of buckets and a hashing object as inputs. Here, I just arbitrarily let the initial buckets be 10.
Method 3: Defining a Specialization of std::hash()
In this method, we add a specialization of standard-library’s hash function to the namespace std . This is quite similar to the first method, except now we must use hash as the name of our function object. We also have to specify the template class for which we are defining the specialization. The immediate advantage is that we do not need to pass in any extra template arguments to our unordered container when declaring an instance.
Here is an example of this method, suppose we want to be able to hash a user-defined class XYZ , which has a member called value :
Note that the unordered containers also need the ability to compare two different keys using == . In previous examples, I have been using std::pair , and the STL already defines operator==() for pairs, so I didn’t have to define my own equality logic. However, if we are using our own user-defined class, then we need to define the equality operator. For example, here’s how we can define operator==() for our custom class Record :
Note that operator== can access private members “id” of “another” Record object, because it is a “member function”. Member functions can access private members of its class, not only for this , but also for any instance of the same class. To read more on this topic, consider this StackOverflow thread .
With Record ’s equality operator defined, we can proceed to define our custom hash function for Record . Let’s use the third method for this example.
Now, we can finally use Record as the key to our unordered container, simply as follows:
Conclusions and Further Readings
In this short blog post, I introduced various ways to create custom functions for unordered containers that hold relatively complex STL types or user-defined class objects. If you’d like to read more on this topic, I recommend this blog post by Mark Nelson and also this page at Cppreference.com.
Building an intelligent emacs.
This post introduces the combination of Emacs and LSP, and how you can make your own editor “smarter” by using the same idea of communications between an editor client and multiple language serv...
Native Emojis in Emacs
This short post will help you set up emoji support in Emacs. First, let’s see the results: Emacs showing emojis Step one, you must have an emoji font installed. I really adore the new fluent de...
Git Gutter in Emacs
When I first started programming using Visual Studio Code, I’ve benefited much from the git gutter indicators that show added/deleted/modified code blocks that haven’t been committed by git. This a...
A Tour of C++ - Reading Notes (Part 1/2)
A Tour of C++ - Reading Notes (Part 2/2)
- ACCU Online
- ACCU 2022 Bristol
- ACCU 2021 Online
- ACCU Autumn 2019 Belfast
- ACCU 2019 Bristol
- ACCU 2018 Bristol
- ACCU 2017 Bristol
- ACCU 2016 Bristol
- ACCU 2015 Bristol
- ACCU 2014 Bristol
- All Conferences
- Journals Overview
- Overload by Issue
- Overload by Article
- Overload by Author
- C Vu by Issue
- C Vu by Article
- C Vu by Author
- Reviews Overview
- Reviews by Date
- Reviews by Title
- Reviews by Author
- Reviews by Reviewer
- Reviews by Publisher
- Reviews by Rating
- Local Groups
Advancing the State of the Art for std::unordered_map Implementations
By Joaquín M López Muñoz
Overload, 30(170):7-9, August 2022
Unordered maps can be implemented in various ways. Joaquín M López Muñoz presents a new, fast version.
Several Boost authors have embarked on a project [ Boost-1 ] to improve the performance of Boost.Unordered’s implementation of std::unordered_map (and multimap , set and multiset variants), and to extend its portfolio of available containers to offer faster, non-standard alternatives based on open addressing.
The first goal of the project has been completed in time for Boost 1.80 (launching in August 2022). We describe here the technical innovations introduced in boost::unordered_map that makes it the fastest implementation of std::unordered_map on the market.
Closed vs. open addressing
On a first approximation, hash table implementations fall on either of two general classes:
- Open addressing [ Wikipedia-2 ] (or closed hashing ) stores at most one element in each bucket (sometimes called a slot ). When an element goes to an already occupied slot, some probing mechanism is used to locate an available slot, preferrably close to the original one.
Recent, high-performance hash tables use open addressing and leverage on its inherently better cache locality and on widely available SIMD [ Wikpedia-3 ] operations. Closed addressing provides some functional advantages, though, and remains relevant as the required foundation for the implementation of std::unodered_map .
Restrictions on the implementation of std::unordered_map
The standardization of C++ unordered associative containers is based on Matt Austern’s 2003 N1456 paper [ Austern03 ]. Back in the day, open-addressing approaches were not regarded as sufficiently mature, so closed addressing was taken as the safe implementation of choice. Even though the C++ standard does not explicitly require that closed addressing must be used, the assumption that this is the case leaks through the public interface of std::unordered_map :
- A bucket API is provided.
- Pointer stability implies that the container is node-based. In C++17, this implication was made explicit with the introduction of extract capabilities.
- Users can control the container load factor.
- Requirements on the hash function are very lax (open addressing depends on high-quality hash functions with the ability to spread keys widely across the space of std::size_t values.)
As a result, all standard library implementations use some form of closed addressing for the internal structure of their std::unordered_map (and related containers).
Coming as an additional difficulty, there are two complexity requirements:
- iterator increment must be (amortized) constant time,
- erase must be constant time on average,
that rule out the textbook implementation of closed addressing (see N2023 [ López-Muñoz06 ] for details). To cope with this problem, standard libraries depart from the textbook layout in ways that introduce speed and memory penalties: for instance, Figure 2 shows how libstdc++-v3 and libc++ layouts look.
To provide constant iterator increment, all nodes are linked together, which in its turn forces two adjustments to the data structure:
- Buckets point to the node before the first one in the bucket so as to preserve constant-time erasure.
- To detect the end of a bucket, the element hash value is added as a data member of the node itself (libstdc++-v3 opts for on-the-fly hash calculation under some circumstances).
Visual Studio standard library (formerly from Dinkumware) uses an entirely different approach to circumvent the problem, but the general outcome is that resulting data structures perform significantly worse than the textbook layout in terms of speed, memory consumption, or both.
Boost.Unordered 1.80 data layout
The new data layout used by Boost.Unordered goes back to the textbook approach (see Figure 3).
Unlike the rest of standard library implementations, nodes are not linked across the container but only within each bucket. This makes constant-time erase trivially implementable, but leaves unsolved the problem of constant-time iterator increment: to achieve it, we introduce so-called bucket groups (top of the diagram). Each bucket group consists of a 32/64-bit bucket occupancy mask plus next and prev pointers linking non-empty bucket groups together. Iteration across buckets resorts to a combination of bit manipulation operations on the bitmasks plus group traversal through next pointers, which is not only constant time but also very lightweight in terms of execution time and of memory overhead (4 bits per bucket).
When inserting or looking for an element, hash table implementations need to map the element hash value into the array of buckets (or slots in the open-addressing case). There are two general approaches in common use:
- Bucket array sizes follow a sequence of prime numbers p , and mapping is of the form h → h mod p .
- Bucket array sizes follow a power-of-two sequence 2 n , and mapping takes n bits from h . Typically it is the n least significant bits that are used, but in some cases, like when h is postprocessed to improve its uniformity via multiplication by a well-chosen constant m (such as defined by Fibonacci hashing [ Wikipedia-4 ]), it is best to take the n most significant bits, that is, h → ( h × m ) >> ( N − n ), where N is the bitwidth of std::size_t and >> is the usual C++ right shift operation.
We use the modulo by a prime approach because it produces very good spreading even if hash values are not uniformly distributed. In modern CPUs, however, modulo is an expensive operation involving integer division; compilers, on the other hand, know how to perform modulo by a constant much more efficiently, so one possible optimization is to keep a table of pointers to functions f p : h → h mod p . This technique replaces expensive modulo calculation with a table jump plus a modulo-by-a-constant operation.
In Boost.Unordered 1.80, we have gone a step further. Daniel Lemire et al. [ Lemire19 ] show how to calculate h mod p as an operation involving some shifts and multiplications by p and a pre-computed c value acting as a sort of reciprocal of p . We have used this work to implement hash mapping as h → fastmod( h , p , c ) (some details omitted). Note that, even though fastmod is generally faster than modulo by a constant, most performance gains actually come from the fact that we are eliminating the table jump needed to select f p , which prevented code inlining.
Time and memory performance of Boost 1.80 boost::unordered_map
We are providing some benchmark results [ Boost-2 ] of the boost::unordered_map against libstdc++-v3, libc++ and Visual Studio standard library for insertion, lookup and erasure scenarios. boost::unordered_map is mostly faster across the board, and in some cases significantly so. There are three factors contributing to this performance advantage:
- the very reduced memory footprint improves cache utilization,
- fast modulo is used,
- the new layout incurs one less pointer indirection than libstdc++-v3 and libc++ to access the elements of a bucket.
As for memory consumption, let N be the number of elements in a container with B buckets: the memory overheads (that is, memory allocated minus memory used strictly for the elements themselves) of the different implementations on 64-bit architectures are in Table 1.
Which hash container to choose
Opting for closed-addressing (which, in the realm of C++, is almost synonymous with using an implementation of std::unordered_map ) or choosing a speed-oriented, open-addressing container is in practice not a clear-cut decision. Some factors favoring one or the other option are listed:
- The code uses some specific parts of its API-like node extraction, the bucket interface or the ability to set the maximum load factor, which are generally not available in open-addressing containers.
- Pointer stability and/or non-moveability of values required (though some open-addressing alternatives support these at the expense of reduced performance).
- Constant-time iterator increment required.
- Hash functions used are only mid-quality (open addressing requires that the hash function have very good key-spreading properties).
- Equivalent key support, i.e. unordered_multimap / unordered_multiset , required. We do not know of any open-addressing container supporting equivalent keys.
- Performance is the main concern.
- Existing code can be adapted to a basically more stringent API and more demanding requirements on the element type (like moveability).
- Hash functions are of good quality (or the default ones from the container provider are used).
If you decide to use std::unordered_map , Boost.Unordered 1.80 now gives you the fastest, fully-conformant implementation on the market.
There are some further areas of improvement to boost::unordered_map that we will investigate post Boost 1.80:
- Reduce the memory overhead of the new layout from 4 bits to 3 bits per bucket.
- Speed up performance for equivalent key variants ( unordered_multimap / unordered_multiset ).
In parallel, we are working on the future boost::unordered_flat_map , our proposal for a top-speed, open-addressing container beyond the limitations imposed by std::unordered_map interface. Your feedback on our current and future work is very welcome.
The Boost.Unordered evolution project is being carried out by Peter Dimov, Christian Mazakas and the author. This work is funded by The C++ Alliance ( https://cppalliance.org/ ).
[Austern03] ‘A Proposal to Add Hash Tables to the Standard Library (revision 4)’: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1456.html
[Boost-1] ‘Development Plan for Boost.Unordered’: https://pdimov.github.io/articles/unordered_dev_plan.html
[Boost-2] ‘Benchmarks’: https://www.boost.org/doc/libs/develop/libs/unordered/doc/html/unordered.html#benchmarks
[GNU] ‘Hash Code’ in The GNU C++ Library Manual , available at https://gcc.gnu.org/onlinedocs/libstdc++/manual/unordered_associative.html#containers.unordered.cache
[Lemire19] Daniel Lemire, Owen Kaser and Nathan Kurz, ‘Faster Remainder by Direct Computation: Applications to ompilers and Sofware Libraries’ from Software: Practice and Experience 49(6), available at https://arxiv.org/abs/1902.01961
[López-Muñoz06] ‘erase (iterator) for unordered containers should not return an iterator’: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2023.pdf
[Wikipedia-1] ‘Separate chaining’ in the topic ‘Hash table’: https://en.wikipedia.org/wiki/Hash_table#Separate_chaining
[Wikipedia-2] ‘Open addressing’ in the topic ‘Hash table’: https://en.wikipedia.org/wiki/Hash_table#Open_addressing
[Wikipedia-3] ‘Single instruction, multiple data’: https://en.wikipedia.org/wiki/Single_instruction,_multiple_data
[Wikipedia-4] ‘Fibonacci hashing’ in the topic ‘Hash table’ https://en.wikipedia.org/wiki/Hash_function#Fibonacci_hashing
This article was first published on Joaquín’s blog Bannalia: trivial notes on themes diverse ( http://bannalia.blogspot.com/2022/06/advancing-state-of-art-for.html?m=1 ) on 18 June 2022.
Joaquín M López Muñoz is a telecommunications engineer freelancing in product/innovation/technological consultancy for telco, TV, and IoT. He is the author of three Boost libraries (MultiIndex, Flyweight, PolyCollection) and has made some minor contributions to the standard, such as N3657 (heterogeneous lookup).
Copyright (c) 2018-2022 ACCU; all rights reserved.
Template by Bootstrapious .
Ported to Hugo by DevCows .
Customized for ACCU by Jim and Bob.
Hosting provided by Bytemark .
The elements are organized into buckets. Keys with the same hash code are stored in the same bucket. The number of buckets can be automatically increased by a
unordered_set();. Constructs an empty container using hasher() as the hash function, key_equal() as the key equality predicate, allocator_type() as the
This seems to work, according to my tests: #include <iostream> #include <unordered_set> #include <boost/functional/hash.hpp> typedef
C++ boost::unordered_set · 1. insert( value) - Inserts the value into the container. · 2. find(value) - Since the value is stored with respect to hash value, this
Boost.Unordered provides the classes boost::unordered_set , boost::unordered_multiset , boost::unordered_map , and boost::unordered_multimap .
Unordered set is an associative container that contains a set of unique objects of type Key. Search, insertion, and removal have average constant-time
Since there is no specialization of std::hash for std::pair in the standard library, a good alternative is to use the boost::hash from Boost.Functional . We can
Constructs an container, copying x's contained elements, hash function, predicate, maximum load factor, but using allocator a. 7. ~unordered_set();. Notes: The
For an unordered_set , that will be the second argument. For an unordered_map , that'll be the third argument after the key type and the value
Which hash container to choose ... If you decide to use std::unordered_map , Boost.Unordered 1.80 now gives you the fastest, fully-conformant implementation on