Discuss internals of a ConcurrentHashmap (CHM) in Java

Upasana | May 25, 2019 | 4 min read | 3,251 views | Multithreading and Concurrency


In Java 1.7, A ConcurrentHashMap is a hashmap supporting full concurrency of retrieval via volatile reads of segments and tables without locking, and adjustable expected concurrency for updates. All the operations in this class are thread-safe, although the retrieval operations does not depend on locking mechanism (non-blocking). And there is not any support for locking the entire table, in a way that prevents all access. The allowed concurrency among update operations is guided by the optional concurrencyLevel constructor argument (default is16), which is used as a hint for internal sizing.

hashmap internals

ConcurrentHashMap is similar in implementation to that of HashMap, with resizable array of hash buckets, each consisting of List of HashEntry elements. Instead of a single collection lock, ConcurrentHashMap uses a fixed pool of locks that form a partition over the collection of buckets.

HashEntry class takes advantage of final and volatile variables to reflect the changes to other threads without acquiring the expensive lock for read operations.

The table inside ConcurrentHashMap is divided among Segments (which extends Reentrant Lock), each of which itself is a concurrently readable hash table. Each segment requires uses single lock to consistently update its elements flushing all the changes to main memory.

put() method holds the bucket lock for the duration of its execution and doesn’t necessarily block other threads from calling get() operations on the map. It firstly searches the appropriate hash chain for the given key and if found, then it simply updates the volatile value field. Otherwise it creates a new HashEntry object and inserts it at the head of the list.

Iterator returned by the ConcurrentHashMap is fail-safe but weakly consistent. keySet().iterator() returns the iterator for the set of hash keys backed by the original map. The iterator is a “weakly consistent” iterator that will never throw ConcurrentModificationException, and guarantees to traverse elements as they existed upon construction of the iterator, and may (but is not guaranteed to) reflect any modifications subsequent to construction.

Re-sizing happens dynamically inside the map whenever required in order to maintain an upper bound on hash collision. Increase in number of buckets leads to rehashing the existing values. This is achieved by recursively acquiring lock over each bucket and then rehashing the elements from each bucket to new larger hash table.

Important facts about ConcurrentHashMap

ConcurrentHashMap Properties
Property Value

Time Complexity for Put, get, remove and containsKey

O(1) when no collision, O(1 + log k) when k elements are present in one bucket

CHM allows concurrent reads from different threads

Yes

CHM allows concurrent writes from different threads

Yes

Default Concurrency Level

16

Frequent Questions

Is this possible for 2 threads to update the ConcurrentHashMap at the same moment?

Yes, its possible to have 2 parallel threads writing to the CHM at the same time, infact in the default implementation of CHM, atmost 16 threads can write & read in parallel. But in worst case if the two objects lie in the same segment, then parallel write would not be possible.

Can multiple threads read from a given Hashtable concurrently?

No, get() method of hash table is synchronized (even for synchronized HashMap). So only one thread can get value from it at any given point in time. Full concurrency for reads is possible only in ConcurrentHashMap via the use of volatile.

What is difference between size and capacity of HashMap/ArrayList

Size defines the actual number of elements contained in the collection, while capacity at any given point in time defines the number of items that a collection can hold without growing itself.

What is Load Factor in HashMap Context?

The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. Default initial capacity of the HashMap takes is 16 and load factor is 0.75f (i.e 75% of current map size). The load factor represents at what level the HashMap capacity should be doubled.

By what amount ConcurrentHashMap/hashMap grows when its capacity is reached?

Whenever threshold (defined by load factor with default value of 0.75) of HashMap reached, it increases its size to double. Signed Left shift operator is used to double the capacity of hashmap, as shown in code below

newCap = oldCap << 1

Which is same as

newCap = oldCap x 2^1 = oldCap x 2

Why HashMap capacity is always expressed in power of 2?

HashMap always maintains its internal capacity in power of 2s, if you supply a different initial capacity then it will automatically round it nearest power of two using the below code.

        /**
         * Returns a power of two size for the given target capacity.
         */
        static final int tableSizeFor(int cap) {
            int n = cap - 1;
            n |= n >>> 1;
            n |= n >>> 2;
            n |= n >>> 4;
            n |= n >>> 8;
            n |= n >>> 16;
            return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
        }

This is done to make get(key) calls faster by utilizing the below bitwise hack to calculate appropriate bucket location for a given key. We know that modulo of power of two can be expressed in terms of below bitwise operation. Please note that modulus operator (%) is around 10x slower than the below bitwise hack, which matters for a function like HashMap’s get(key)

x % 2 inpower n == x & (2 inpower n - 1).

for example,

x % 2 == x & 1
x % 4 == x & 3
x % 8 == x & 7

But to utilize the above performance tweak, x must be power of 2. Thus HashMap’s internal capacity is always in power of 2.


Multithreading and Concurrency:
  1. Java 8 Parallel Stream custom ThreadPool
  2. Java Concurrency Interview Questions
  3. ConcurrentModificationException in Java
  4. What is purpose of Collections.unmodifiableCollection
  5. Removing elements while iterating over a Java Collection
  6. ThreadLocal with examples in Java
  7. Design an Immutable class that has an java.util.Date member
See all articles in Multithreading and Concurrency
Top articles in this category:
  1. Factorial of a large number in Java BigInteger
  2. Diamond Problem of Inheritance in Java 8
  3. Custom Thread pool implementation in Java
  4. can we write a java method that swaps two integers
  5. Difference between HashMap and ConcurrentHashMap
  6. Removing elements while iterating over a Java Collection
  7. How will you implement your custom threadsafe Semaphore in Java

Recommended books for interview preparation:

Find more on this topic:
Buy interview books

Java & Microservices interview refresher for experienced developers.