import java.util.*;
import java.util.stream.LongStream;
public class Problem2 {
public static final int SINGLE_BUCKET_SIZE = 10_000;
public static final int TOTAL_BUCKETS = 400;
final int[] buckets = new int[TOTAL_BUCKETS];
Set<Integer> bucketsWithProblem = new HashSet<>();
Map<Integer, BitSet> bucketData = new HashMap<>();
public void countItemsInEachBucket(long input) {
int bucketId = (int) (input / SINGLE_BUCKET_SIZE);
buckets[bucketId]++;
}
public void findBucketsWithLesserItems() {
for (int i = 0; i < buckets.length; i++) {
if (buckets[i] < SINGLE_BUCKET_SIZE) {
bucketsWithProblem.add(i);
System.out.println("Bucket has a problem: " + i + ", total items =" + buckets[i] + ", missing items = " + (SINGLE_BUCKET_SIZE  buckets[i]));
}
}
}
public void locateMissingNumbers(long input) {
int bucketId = (int) (input / SINGLE_BUCKET_SIZE);
if (bucketsWithProblem.contains(bucketId)) {
bucketData.computeIfAbsent(bucketId, BitSet::new).set((int) (input % SINGLE_BUCKET_SIZE));
}
}
public void printMissingNumbers() {
bucketData.forEach((integer, bitSet) > {
int nextClearBit = 0, lastClearBit = 0;
do {
nextClearBit = bitSet.nextClearBit(lastClearBit);
if (nextClearBit < SINGLE_BUCKET_SIZE) {
int missingNumber = integer * SINGLE_BUCKET_SIZE + nextClearBit;
System.out.println("missing number = " + missingNumber);
} else break;
lastClearBit = nextClearBit + 1;
} while (lastClearBit < SINGLE_BUCKET_SIZE);
});
}
public static void main(String[] args) {
HashSet<Long> missingNumbers = new HashSet<>(10);
missingNumbers.add(10L);
missingNumbers.add(100L);
missingNumbers.add(900L);
missingNumbers.add(15L);
missingNumbers.add(3999L);
Problem2 problem2 = new Problem2();
LongStream.range(0, 4_000_000L).filter(value > !missingNumbers.contains(value)).forEach(problem2::countItemsInEachBucket);
System.out.println("Pass 1 is complete now");
problem2.findBucketsWithLesserItems();
System.out.println("Pass 2 is complete now");
LongStream.range(0, 4_000_000L).filter(value > !missingNumbers.contains(value)).forEach(problem2::locateMissingNumbers);
System.out.println("Pass 3 is complete now");
problem2.printMissingNumbers();
System.out.println("Pass 4 is complete now");
}
}
Find missing numbers in 4 billion unique numbers with 50MB RAM
Carvia Tech  December 08, 2019  2 min read  24 views
4 billion numbers can not fit into 50MB RAM. Even if we take a bitset of 4 billion bits, it will need 4 GB RAM. So we need a multipass solution.
Solution

To solve this limited memory problem, we will create buckets of integers, each bucket covering 10,000 integers.

So first bucket will cover 09999, second bucket will cover 10,00019,999 and so on.

To cover 4 x 10^{9} integers, we will need 4x10^{9} / 10^{4} = 4x10^{5} buckets

We will store count of numbers (not actual numbers) in each bucket, that way we will save lot of memory at the cost of extra passes. If no number is missing in a bucket, then the number of integers in that bucket would be 10, 000

After processing all 4 billion integers, we will make a second pass to figure out the missing numbers in those buckets that have less than 10,000 items.
Time Complexity of approach: 3xO(n)
For simplicity purpose, i am providing code with much lower values, so that it can be easily tested and understood.
Top articles in this category:
 can we write a java method that swaps two integers
 How will you calculate factorial of a large number in Java
 Set precision and scale for a double value in java
 How to implement Thread pool in Java without executor framework
 Given a collection of 1 million integers, All ranging between 1 to 9, how would you sort them in Big O(n) time
 Generate Random Numbers in a range using Java 8
 How to find if the Linked List contains any Cycle/Loop
Find more on this topic:
Core Java
Core Java  OOP Concepts, Garbage Collection, Multithreading, Collections Framework, Java 8 Features, Lambda Functions, Streams.
Last updated 1 week ago
Recommended books for interview preparation:
Similar Posts
 Explain Java Exception Class Hierarchy
 Http download using Java NIO FileChannel
 CRC32 checksum calculation Java NIO
 Set precision and scale for a double value in java
 Difference between HashMap, LinkedHashMap and TreeMap
 What is difference between ExecutorService submit and execute method
 What is left shift right shift and unsigned rght shift operator in Java
 What happens when wait() & notify() method are called
 can we write a java method that swaps two integers
 Find missing numbers in 4 billion unique numbers with 50MB RAM
Provide email address to subscribe to this blog.