DSA · Topic 2 of 21

Hash Maps & Sets

How hash maps work internally

A hash map stores key-value pairs in an array of buckets. When you call map.set(key, value), three things happen:

  1. Hash the key — a hash function converts the key into a non-negative integer.
  2. Find the bucket — index = hash(key) % buckets.length.
  3. Store the entry — the key-value pair lands in that bucket.

Collisions occur when two keys hash to the same index. The two standard resolution strategies:

  • Separate chaining — each bucket holds a linked list of entries. On lookup the engine walks that list comparing keys. Used by most production hash maps.
  • Open addressing — on collision, probe to the next empty slot (linear, quadratic, or double hashing). Faster cache behaviour, harder to implement correctly.

Load factor = entries / buckets. When it exceeds a threshold (commonly ~0.75), the map rehashes: it allocates a 2× larger bucket array and reinserts every entry. A single rehash costs O(n), but doubling makes rehashes geometrically rarer as the map grows, so the amortised cost per operation stays O(1).
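
The mechanics above can be sketched as a toy separate-chaining map. This is an illustration only: hashString, the starting bucket count of 8, and the 0.75 threshold are simplified choices, not how any particular engine implements Map.

```typescript
type Entry<V> = { key: string; value: V }

class ChainedHashMap<V> {
  private buckets: Entry<V>[][] = Array.from({ length: 8 }, () => [])
  private count = 0

  private hashString(key: string): number {
    let h = 0
    for (let i = 0; i < key.length; i++) {
      h = (h * 31 + key.charCodeAt(i)) >>> 0   // keep it a non-negative 32-bit int
    }
    return h
  }

  private bucketFor(key: string): Entry<V>[] {
    return this.buckets[this.hashString(key) % this.buckets.length]
  }

  set(key: string, value: V): void {
    const bucket = this.bucketFor(key)
    for (const e of bucket) {
      if (e.key === key) { e.value = value; return }   // overwrite existing key
    }
    bucket.push({ key, value })
    this.count++
    if (this.count / this.buckets.length > 0.75) this.rehash()
  }

  get(key: string): V | undefined {
    for (const e of this.bucketFor(key)) {
      if (e.key === key) return e.value   // walk the chain comparing keys
    }
    return undefined
  }

  private rehash(): void {
    const old = this.buckets
    this.buckets = Array.from({ length: old.length * 2 }, () => [])
    for (const bucket of old) {
      for (const e of bucket) this.bucketFor(e.key).push(e)  // reinsert into wider array
    }
  }
}
```

Note that rehash recomputes every bucket index, because hash(key) % buckets.length changes when buckets.length doubles.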

Worst case O(n): if every key collides into the same bucket (pathological input or a deliberate hash-flooding attack), every operation degrades to an O(n) list traversal. In practice JavaScript engines use randomised hash seeds to make flooding impractical, and for interview analysis you can treat hash map operations as O(1) average.


JS Map vs Object

  • Key types — Map: any value, including numbers, objects, and functions. Object: strings and Symbols only.
  • Insertion order — Map: guaranteed preserved. Object: preserved for string keys in modern JS.
  • Size — Map: .size property, O(1). Object: Object.keys(o).length, O(n).
  • Prototype pollution — Map: none. Object: inherited keys like toString can conflict.
  • Iteration — Map: for...of, .forEach, .entries(). Object: Object.keys / values / entries.

Rule of thumb: use Map when keys are dynamic, non-string, or order matters. Use a plain object for static configuration or when you are certain keys are clean strings.
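A quick illustration of the key-type difference (variable names are mine): a Map keeps the number 1 and the string '1' as distinct keys, while a plain object coerces every key to a string.

```typescript
const m = new Map<unknown, string>()
m.set(1, 'number key')
m.set('1', 'string key')      // a distinct entry — not the same key as the number 1

const o: Record<string, string> = {}
o[1] = 'coerced'              // the property key becomes the string "1"

// m.size is 2; the object holds a single "1" property.
```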


Frequency counting pattern

The most common hash map operation in interviews — count how many times each element appears.

function buildFreqMap<T>(arr: T[]): Map<T, number> {
  const freq = new Map<T, number>()
  for (const x of arr) {
    freq.set(x, (freq.get(x) ?? 0) + 1)
  }
  return freq
}

freq.get(x) ?? 0 handles the first occurrence cleanly — Map.get returns undefined for unseen keys, and undefined ?? 0 evaluates to 0. Never use || 0 here; it would incorrectly reset a legitimate value of 0.
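A two-line demonstration of the pitfall (variable names are mine):

```typescript
// Why ?? and not ||: 0 is falsy, so || would discard a legitimately stored 0.
const scores = new Map<string, number>()
scores.set('alice', 0)

const viaNullish = scores.get('alice') ?? -1   // 0 — only undefined/null fall through
const viaOr = scores.get('alice') || -1        // -1 — the stored 0 is lost
```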


Two Sum (LC 1)

Problem: Return indices of the two numbers that sum to target.

Brute force: O(n²) — try every pair.

Hash map: for each element, check whether its complement (target - num) was already seen. One pass, O(n).

function twoSum(nums: number[], target: number): number[] {
  const seen = new Map<number, number>()  // value → index

  for (let i = 0; i < nums.length; i++) {
    const complement = target - nums[i]
    if (seen.has(complement)) {
      return [seen.get(complement)!, i]
    }
    seen.set(nums[i], i)
  }
  return []
}
// Time: O(n)  Space: O(n)

Pattern: “find a pair satisfying some condition” → store what you need to see later in a map as you scan forward.


Group Anagrams (LC 49)

Problem: Given an array of strings, group the anagrams together.

Key insight: all anagrams share the same sorted character sequence. Use that sorted string as a canonical map key.

function groupAnagrams(strs: string[]): string[][] {
  const groups = new Map<string, string[]>()

  for (const s of strs) {
    const key = s.split('').sort().join('')  // "eat" → "aet"
    if (!groups.has(key)) groups.set(key, [])
    groups.get(key)!.push(s)
  }

  return Array.from(groups.values())
}
// Time: O(n · k log k)  Space: O(n · k)  — k = max string length

Alternative that avoids the sort (O(k) per string instead of O(k log k)): build a 26-slot frequency signature and use it as the key.

function groupAnagramsFast(strs: string[]): string[][] {
  const groups = new Map<string, string[]>()

  for (const s of strs) {
    const count = new Array(26).fill(0)
    for (const c of s) count[c.charCodeAt(0) - 97]++
    const key = count.join(',')          // "1,0,0,...,1,0,..." — unique per anagram class
    if (!groups.has(key)) groups.set(key, [])
    groups.get(key)!.push(s)
  }

  return Array.from(groups.values())
}
// Time: O(n · k)  Space: O(n · k)

Top K Frequent Elements (LC 347)

Problem: Return the k most frequent elements. Must be better than O(n log n).

Approach: frequency map + bucket sort. Because any element’s frequency is at most n, index into an array of n+1 buckets by frequency — then scan from the top.

function topKFrequent(nums: number[], k: number): number[] {
  // Step 1: count frequencies
  const freq = new Map<number, number>()
  for (const n of nums) freq.set(n, (freq.get(n) ?? 0) + 1)

  // Step 2: bucket sort — buckets[i] holds all numbers with frequency i
  const buckets: number[][] = Array.from({ length: nums.length + 1 }, () => [])
  for (const [num, count] of freq) {
    buckets[count].push(num)
  }

  // Step 3: collect top k scanning from highest frequency
  const result: number[] = []
  for (let i = buckets.length - 1; i >= 0 && result.length < k; i--) {
    result.push(...buckets[i])
  }
  return result.slice(0, k)
}
// Time: O(n)  Space: O(n)

Longest Consecutive Sequence (LC 128)

Problem: Given an unsorted array, find the length of the longest run of consecutive integers. O(n) required.

Key insight: load all numbers into a Set. Only start a count when n - 1 is not in the set — that means n is a sequence start. Then extend right as far as possible.

function longestConsecutive(nums: number[]): number {
  const numSet = new Set(nums)
  let best = 0

  for (const n of numSet) {
    if (numSet.has(n - 1)) continue   // not a sequence start — skip

    let length = 1
    while (numSet.has(n + length)) length++
    best = Math.max(best, length)
  }
  return best
}
// Time: O(n)  Space: O(n)

Why O(n) despite the nested loop: the while loop runs only when n is a sequence start. Each number is visited by the while loop at most once across the entire outer loop, so total iterations are bounded by n.


Valid Anagram (LC 242)

Problem: Determine if t is an anagram of s.

Approach — character count array: increment for each character in s, decrement for t. If all slots are zero, they are anagrams.

function isAnagram(s: string, t: string): boolean {
  if (s.length !== t.length) return false

  const count = new Array(26).fill(0)
  const a = 'a'.charCodeAt(0)

  for (let i = 0; i < s.length; i++) {
    count[s.charCodeAt(i) - a]++
    count[t.charCodeAt(i) - a]--
  }

  return count.every(c => c === 0)
}
// Time: O(n)  Space: O(1) — fixed 26-element array

For Unicode strings (emoji, accented characters), the 26-slot array breaks. Use a Map<string, number> instead — same logic, handles any character.
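A sketch of that Map-based variant (the function name is mine; Array.from splits by code point, so surrogate-pair characters such as emoji are counted correctly):

```typescript
function isAnagramUnicode(s: string, t: string): boolean {
  const sChars = Array.from(s)   // splits by code point, not UTF-16 unit
  const tChars = Array.from(t)
  if (sChars.length !== tChars.length) return false

  const count = new Map<string, number>()
  for (const c of sChars) count.set(c, (count.get(c) ?? 0) + 1)
  for (const c of tChars) {
    const n = count.get(c)
    if (!n) return false          // character unseen, or its count is exhausted
    count.set(c, n - 1)
  }
  return true
}
```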


LRU Cache (LC 146)

Problem: Design a cache with O(1) get and put. Evict the least recently used item when at capacity.

Key insight: JavaScript’s Map preserves insertion order. To mark an item as recently used, delete it and re-insert it — it moves to the tail. The head is always the least recently used entry.

class LRUCache {
  private capacity: number
  private cache: Map<number, number>   // ordered LRU → MRU

  constructor(capacity: number) {
    this.capacity = capacity
    this.cache = new Map()
  }

  get(key: number): number {
    if (!this.cache.has(key)) return -1
    const val = this.cache.get(key)!
    this.cache.delete(key)
    this.cache.set(key, val)           // move to MRU position
    return val
  }

  put(key: number, value: number): void {
    if (this.cache.has(key)) {
      this.cache.delete(key)           // remove stale entry
    } else if (this.cache.size >= this.capacity) {
      const lruKey = this.cache.keys().next().value!  // head = LRU
      this.cache.delete(lruKey)
    }
    this.cache.set(key, value)         // insert at MRU position
  }
}
// get: O(1)  put: O(1)  Space: O(capacity)

If the interviewer bans Map insertion-order tricks, implement with a doubly linked list + Map: the DLL maintains LRU→MRU order; the Map gives O(1) node lookup by key. Both get and put become: look up node in map, splice it from the list, reinsert at tail.
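One possible shape for that design (class and method names are illustrative; sentinel head and tail nodes keep the splicing free of null checks):

```typescript
class DLLNode {
  key = 0; value = 0
  prev: DLLNode | null = null
  next: DLLNode | null = null
}

class LRUCacheDLL {
  private map = new Map<number, DLLNode>()
  private head = new DLLNode()   // sentinel: head.next is the LRU entry
  private tail = new DLLNode()   // sentinel: tail.prev is the MRU entry

  constructor(private capacity: number) {
    this.head.next = this.tail
    this.tail.prev = this.head
  }

  private remove(node: DLLNode): void {
    node.prev!.next = node.next
    node.next!.prev = node.prev
  }

  private insertAtTail(node: DLLNode): void {
    node.prev = this.tail.prev
    node.next = this.tail
    this.tail.prev!.next = node
    this.tail.prev = node
  }

  get(key: number): number {
    const node = this.map.get(key)
    if (!node) return -1
    this.remove(node)            // splice out of current position…
    this.insertAtTail(node)      // …and reinsert at the MRU end
    return node.value
  }

  put(key: number, value: number): void {
    const existing = this.map.get(key)
    if (existing) this.remove(existing)
    else if (this.map.size >= this.capacity) {
      const lru = this.head.next!          // head.next is least recently used
      this.remove(lru)
      this.map.delete(lru.key)
    }
    const node = existing ?? new DLLNode()
    node.key = key
    node.value = value
    this.insertAtTail(node)
    this.map.set(key, node)
  }
}
```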


Subarray Sum Equals K (LC 560)

Problem: Count the number of subarrays that sum to k.

Key insight: prefix sums. The sum of elements from index i to j equals prefixSum[j+1] - prefixSum[i]. We want pairs where that difference equals k, i.e., prefixSum[i] == prefixSum[j+1] - k.

Maintain a running prefix sum and a frequency map of every prefix sum seen so far. For each new prefix sum, look up how many earlier prefix sums equal currentSum - k.

function subarraySum(nums: number[], k: number): number {
  const prefixCount = new Map<number, number>()
  prefixCount.set(0, 1)    // the empty prefix has sum 0

  let prefixSum = 0
  let count = 0

  for (const n of nums) {
    prefixSum += n
    count += prefixCount.get(prefixSum - k) ?? 0
    prefixCount.set(prefixSum, (prefixCount.get(prefixSum) ?? 0) + 1)
  }

  return count
}
// Time: O(n)  Space: O(n)

The prefixCount.set(0, 1) initialisation is critical — it handles subarrays that start at index 0. Without it, a subarray from the very beginning with sum k would be missed.


Hash set for visited / seen

Sets are the right tool when you need O(1) membership checks without storing associated values.

Deduplication:

const unique = [...new Set(arr)]

Cycle detection in linked list:

function hasCycle(head: ListNode | null): boolean {
  const seen = new Set<ListNode>()
  let node = head
  while (node) {
    if (seen.has(node)) return true
    seen.add(node)
    node = node.next
  }
  return false
}
// Time: O(n)  Space: O(n)
// Floyd's two-pointer trick gives O(1) space but this is clearer
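For reference, the Floyd's two-pointer version mentioned in the comment (a minimal ListNode type is included so the sketch stands alone): a fast pointer advances two steps per iteration and a slow pointer one; they meet if and only if the list has a cycle.

```typescript
interface ListNode { val: number; next: ListNode | null }

function hasCycleFloyd(head: ListNode | null): boolean {
  let slow = head
  let fast = head
  while (fast && fast.next) {
    slow = slow!.next          // slow trails fast, so it is non-null here
    fast = fast.next.next
    if (slow === fast) return true   // pointers meet inside the cycle
  }
  return false                 // fast reached the end — no cycle
}
// Time: O(n)  Space: O(1)
```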

Grid / BFS visited tracking:

const visited = new Set<string>()
// encode (row, col) as a string key
visited.add(`${row},${col}`)
if (visited.has(`${nextRow},${nextCol}`)) continue
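A complete example of the pattern (countRegion and the sample grid are illustrative, not from a specific problem): BFS over same-valued neighbours, tracking visited cells with string keys.

```typescript
function countRegion(grid: number[][], startRow: number, startCol: number): number {
  const visited = new Set<string>()
  const queue: [number, number][] = [[startRow, startCol]]
  visited.add(`${startRow},${startCol}`)
  let size = 0

  while (queue.length > 0) {
    const [row, col] = queue.shift()!
    size++
    for (const [dr, dc] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
      const r = row + dr, c = col + dc
      if (r < 0 || r >= grid.length || c < 0 || c >= grid[0].length) continue
      if (grid[r][c] !== grid[startRow][startCol]) continue   // different value
      if (visited.has(`${r},${c}`)) continue                  // already enqueued
      visited.add(`${r},${c}`)
      queue.push([r, c])
    }
  }
  return size   // number of connected same-valued cells
}
```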

Complexity reference

  • map.get(key) — average O(1), worst O(n) (collision chain traversal)
  • map.set(key, val) — average O(1) amortised, worst O(n) (includes rehash amortisation)
  • map.has(key) — average O(1), worst O(n)
  • map.delete(key) — average O(1), worst O(n)
  • map.size — O(1) (stored as a property)
  • Full iteration — O(n)
  • set.has(val) — average O(1), worst O(n)
  • set.add(val) — average O(1) amortised, worst O(n)

When to reach for a hash map

The single most valuable pattern-recognition rule in FAANG interviews:

“If you find yourself writing a nested loop to search an array, pause and ask: can I precompute a map in one pass and answer each inner query in O(1)?”

  • Find a pair summing to X → Map: store complement → index
  • Count occurrences → Map: element → frequency
  • Group by shared property → Map: property → list
  • “Has this been seen before?” → Set
  • O(1) lookup by non-index key → Map
  • Deduplicate → Set
  • Cache computed results → Map

Nearly every O(n²) two-loop solution that drops to O(n) in interviews does so via a hash map (the main exceptions are two-pointer and sliding-window techniques). Internalise this reflex.
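
The reflex in miniature (the hasDuplicate names are mine): the same question answered by a nested loop and by a Set.

```typescript
function hasDuplicateQuadratic(arr: number[]): boolean {
  for (let i = 0; i < arr.length; i++) {
    for (let j = i + 1; j < arr.length; j++) {
      if (arr[i] === arr[j]) return true   // O(n²) pairwise search
    }
  }
  return false
}

function hasDuplicateLinear(arr: number[]): boolean {
  const seen = new Set<number>()
  for (const x of arr) {
    if (seen.has(x)) return true   // each inner "query" is now O(1)
    seen.add(x)
  }
  return false
}
```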