Phase 01 · Fundamentals

HashMap

A hash table maps keys to values using a hash function, providing average O(1) insertion, deletion, and lookup. It's the go-to tool for reducing O(n²) brute force solutions to O(n).

~20 min read
Beginner–Intermediate
Python · JavaScript · Java

Overview

A HashMap (also called Hash Table, Dictionary, or Object depending on the language) stores key-value pairs. A hash function converts the key into an array index, allowing for near-instant retrieval. This is the structure behind Python's dict, JavaScript's Map, and Java's HashMap.

The fundamental trade-off: you trade memory (the backing array) for time — lookups that would require scanning an entire list now resolve in constant time.

💡

Interview Insight

The single most impactful pattern you can learn: whenever you see an O(n²) brute force with two nested loops, ask "can I use a HashMap to remember what I've seen?" This collapses it to O(n) in most cases.

How Hashing Works

When you insert a key, the HashMap:

  1. Runs the key through a hash function → produces an integer
  2. Takes that integer mod capacity → produces a bucket index
  3. Stores the value in that bucket

On lookup, the exact same steps happen — hash the key, find the bucket, return the value. No scanning required.

Python — simplified internals
class SimpleHashMap:
    def __init__(self, capacity=16):
        self.capacity = capacity
        self.buckets = [[] for _ in range(capacity)]

    def _hash(self, key) -> int:
        return hash(key) % self.capacity  # bucket index

    def put(self, key, value):
        idx = self._hash(key)
        bucket = self.buckets[idx]
        for i, (k, v) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # update
                return
        bucket.append((key, value))         # insert

    def get(self, key):
        idx = self._hash(key)
        for k, v in self.buckets[idx]:
            if k == key:
                return v
        return None

Collision Handling

Two keys can hash to the same bucket — this is a collision. There are two primary strategies:

Chaining (most common)

Each bucket holds a linked list (or array) of all entries that hash there. Python's dict uses a variation of this. Worst-case degrades to O(n) if all keys collide into one bucket.

Open Addressing

When a collision occurs, probe for the next empty slot using a deterministic sequence (linear, quadratic, or double hashing). Java's HashMap uses chaining by default but switches to tree nodes (red-black tree) when a bucket exceeds 8 entries — O(log n) worst case instead of O(n).

Load Factor Matters

When the number of entries exceeds ~70–75% of capacity (load factor), the map rehashes into a larger array. This is O(n) but amortized O(1) per insertion. Java's default load factor is 0.75; Python's is ⅔.

Time Complexity

OperationAverageWorstNotes
InsertO(1)O(n)Worst: all keys collide
LookupO(1)O(n)Worst: full bucket scan
DeleteO(1)O(n)Same as lookup
IterationO(n)O(n)Must visit all buckets

Space complexity: O(n) — proportional to number of entries, plus overhead of the backing array.

Core Patterns

Frequency Count

Count occurrences of elements — the foundation of anagram detection, majority element, and histogram problems.

Python
from collections import Counter

def is_anagram(s: str, t: str) -> bool:
    return Counter(s) == Counter(t)

# Manual version:
def is_anagram_manual(s: str, t: str) -> bool:
    if len(s) != len(t):
        return False
    count = {}
    for c in s:
        count[c] = count.get(c, 0) + 1
    for c in t:
        count[c] = count.get(c, 0) - 1
        if count[c] < 0:
            return False
    return True

Two-Pass Lookup (Classic Two Sum)

Store what you've seen in one pass, then look up complements. Reduces O(n²) brute force to O(n).

Python
def two_sum(nums: list, target: int) -> list:
    seen = {}  # value → index
    for i, n in enumerate(nums):
        complement = target - n
        if complement in seen:
            return [seen[complement], i]
        seen[n] = i
    return []

# Time: O(n)  |  Space: O(n)

Grouping by Key

Group items that share a property — anagram groups, items by category, nodes by level.

Python
from collections import defaultdict

def group_anagrams(words: list) -> list:
    groups = defaultdict(list)
    for word in words:
        key = tuple(sorted(word))  # canonical form
        groups[key].append(word)
    return list(groups.values())

# Input:  ["eat","tea","tan","ate","nat","bat"]
# Output: [["eat","tea","ate"], ["tan","nat"], ["bat"]]

Common Interview Problems

  • Two Sum — store complement lookup, O(n)
  • Valid Anagram — character frequency count
  • Group Anagrams — sort chars as key, group words
  • Longest Consecutive Sequence — store all in set, expand from sequence start
  • Subarray Sum Equals K — prefix sums stored in map
  • Top K Frequent Elements — count then bucket sort or heap
  • LRU Cache — HashMap + doubly linked list

Language Specifics

Python: use dict or collections.defaultdict. JavaScript: prefer Map over plain objects (preserves insertion order, any key type). Java: use HashMap<K,V>; use LinkedHashMap if order matters.