DSA: Implement an LRU Cache (Real Interview Pattern)
The Least Recently Used (LRU) cache is perhaps the most iconic data structure design question in the software engineering interview circuit. It sits at the perfect intersection of low-level pointer manipulation and high-level system design. While many candidates can describe the concept—discarding the oldest items first when the cache reaches capacity—implementing it under the pressure of a 45-minute interview requires a deep understanding of how different data structures can be composed to achieve optimal performance.
In real-world production systems, the LRU pattern is everywhere. It is the backbone of the MySQL Buffer Pool, the logic behind Redis eviction policies, and the mechanism operating systems use for page replacement. As a senior engineer, I look for candidates who don't just know the "what," but understand the "why" behind the specific combination of a Hash Map and a Doubly Linked List. This pairing is the standard way to achieve the expected $O(1)$ time complexity for both access and updates.
Theory and Core Concepts
To implement an LRU cache, we must satisfy two primary requirements: fast lookups and fast updates. A simple array or list allows for $O(1)$ additions at the end, but finding an element or removing one from the middle takes $O(n)$ time. Conversely, a standard Hash Map provides $O(1)$ lookups but lacks any inherent sense of order, making it impossible to track which item was "least recently used."
The "aha!" moment comes when we combine them. We use a Hash Map to store keys and their corresponding locations in a Doubly Linked List. The Doubly Linked List maintains the chronological order of access.
Pattern Visualization
When a get or put operation occurs, the cache must treat that specific key as the "most recently used." In our Doubly Linked List, this means moving the corresponding node to the "head" of the list. If the cache exceeds its capacity during a put operation, we simply remove the node at the "tail," as it represents the least recently used item.
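As a plain-text sketch with hypothetical keys: after put(1), put(2), put(3), and then get(1) on a capacity-3 cache, the list looks like this:

```
head (dummy) <-> [1] <-> [3] <-> [2] <-> tail (dummy)
   most recently used       least recently used
```

If a fourth key were inserted now, node [2] would be unlinked and its key deleted from the map.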
Implementation
The following Python implementation uses dummy head and tail nodes. This is a common "pro-tip" in interviews to avoid constant null-pointer checks, making the code cleaner and less error-prone.
```python
class Node:
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.prev = None
        self.next = None


class LRUCache:
    def __init__(self, capacity: int):
        self.cap = capacity
        self.cache = {}  # Map key to Node
        # Dummy nodes for boundary safety
        self.head = Node(0, 0)
        self.tail = Node(0, 0)
        self.head.next = self.tail
        self.tail.prev = self.head

    def _remove(self, node):
        """Remove an existing node from the linked list."""
        prev_node = node.prev
        next_node = node.next
        prev_node.next = next_node
        next_node.prev = prev_node

    def _add(self, node):
        """Add a new node right after the head (most recent)."""
        temp = self.head.next
        self.head.next = node
        node.prev = self.head
        node.next = temp
        temp.prev = node

    def get(self, key: int) -> int:
        # Time Complexity: O(1)
        if key in self.cache:
            node = self.cache[key]
            self._remove(node)
            self._add(node)
            return node.value
        return -1

    def put(self, key: int, value: int) -> None:
        # Time Complexity: O(1)
        if key in self.cache:
            self._remove(self.cache[key])
        new_node = Node(key, value)
        self._add(new_node)
        self.cache[key] = new_node
        if len(self.cache) > self.cap:
            # Evict the least recently used (node before tail)
            lru = self.tail.prev
            self._remove(lru)
            del self.cache[lru.key]

# Space Complexity: O(capacity)
```
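A quick walkthrough with a capacity of 2 shows the eviction order in action:

```python
cache = LRUCache(2)
cache.put(1, 1)
cache.put(2, 2)
print(cache.get(1))  # 1 -- key 1 becomes most recently used
cache.put(3, 3)      # capacity exceeded: evicts key 2, the LRU entry
print(cache.get(2))  # -1 (evicted)
print(cache.get(3))  # 3
```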
Complexity Analysis Table
| Approach | Get Complexity | Put Complexity | Space Complexity | Pros/Cons |
|---|---|---|---|---|
| Array/List | $O(n)$ | $O(1)$ | $O(n)$ | Easy to implement; slow lookups. |
| Hash Map only | $O(1)$ | $O(1)$ | $O(n)$ | No order tracking; cannot evict LRU. |
| DLL + Hash Map | $O(1)$ | $O(1)$ | $O(n)$ | Optimal performance; complex pointers. |
| OrderedDict (Python) | $O(1)$ | $O(1)$ | $O(n)$ | Built-in; often disallowed in interviews. |
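For completeness, here is a minimal sketch of the OrderedDict approach from the last row. It works because OrderedDict is internally a hash map fused with a doubly linked list, which is exactly the structure we built by hand; many interviewers disallow it for that reason.

```python
from collections import OrderedDict

class OrderedDictLRU:
    def __init__(self, capacity: int):
        self.cap = capacity
        self.data = OrderedDict()

    def get(self, key: int) -> int:
        if key not in self.data:
            return -1
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key: int, value: int) -> None:
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.cap:
            self.data.popitem(last=False)  # evict the LRU entry
```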
Common Patterns and Variations
Interviewers often add twists to the standard LRU. One common variation is the LFU (Least Frequently Used) cache, which requires tracking access counts. Another is adding a Time-to-Live (TTL), where items expire after a certain duration regardless of access.
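Since LFU is the most common follow-up, here is one minimal sketch of the $O(1)$ approach, using frequency buckets (each an OrderedDict so that ties break by recency). The class and field names are illustrative, not a standard API:

```python
from collections import defaultdict, OrderedDict

class LFUCache:
    def __init__(self, capacity: int):
        self.cap = capacity
        self.vals = {}    # key -> value
        self.freqs = {}   # key -> access count
        self.buckets = defaultdict(OrderedDict)  # count -> keys in LRU order
        self.min_freq = 0

    def _touch(self, key):
        # Promote a key from its current frequency bucket to the next.
        f = self.freqs[key]
        del self.buckets[f][key]
        if not self.buckets[f] and self.min_freq == f:
            self.min_freq = f + 1
        self.freqs[key] = f + 1
        self.buckets[f + 1][key] = None

    def get(self, key: int) -> int:
        if key not in self.vals:
            return -1
        self._touch(key)
        return self.vals[key]

    def put(self, key: int, value: int) -> None:
        if self.cap <= 0:
            return
        if key in self.vals:
            self.vals[key] = value
            self._touch(key)
            return
        if len(self.vals) >= self.cap:
            # Evict the least recent key among the least frequent ones.
            evicted, _ = self.buckets[self.min_freq].popitem(last=False)
            del self.vals[evicted]
            del self.freqs[evicted]
        self.vals[key] = value
        self.freqs[key] = 1
        self.buckets[1][key] = None
        self.min_freq = 1
```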
Variation: LRU with Update Logic
Sometimes an interviewer asks you to update an existing key's value without refreshing its "recency" when the value is unchanged. The standard pattern, however, is to move the node to the head on every put or get.
Variation: Concurrency
In a real-world multi-threaded environment (like a Java web server), you would need to wrap these operations in a ReentrantLock or use a ConcurrentHashMap combined with synchronized blocks on the DLL nodes to prevent race conditions during pointer updates.
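In Python, the coarse-grained equivalent is to guard every operation with a single threading.Lock. The wrapper below is an illustrative sketch around the LRUCache defined above, not a production design:

```python
import threading

class ThreadSafeLRU:
    def __init__(self, capacity: int):
        self._cache = LRUCache(capacity)
        self._lock = threading.Lock()

    def get(self, key: int) -> int:
        # The dict lookup and the pointer rewiring happen atomically,
        # so no thread can observe a half-relinked list.
        with self._lock:
            return self._cache.get(key)

    def put(self, key: int, value: int) -> None:
        with self._lock:
            self._cache.put(key, value)
```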
Practice Problems
To master this pattern, start with the standard LRU cache and then attempt the LFU cache. The jump from LRU to LFU is significant and often appears in Senior or Staff-level interviews.
This Week's Interview Challenge
Problem: The Time-Aware LRU Cache
Design a data structure that follows the constraints of a Least Recently Used (LRU) cache, but with an added expiration policy. Each key-value pair is inserted with a ttl (time-to-live) in milliseconds.
A key is considered invalid if:
- It is the least recently used and the cache is over capacity.
- The current time has exceeded the time it was inserted plus its `ttl`.
Examples:
put(1, "A", 1000)at time 0.get(1)at time 500 returns "A".get(1)at time 1500 returns -1 (expired).
Constraints:
- `get` and `put` should still aim for $O(1)$ average time complexity.
- You may assume a helper function `get_current_time()` exists.
- Capacity will be between 1 and 1000.
Hints:
- Think about whether you should store the expiration timestamp inside the `Node`.
- Do you need to clean up expired nodes immediately, or can you do it "lazily" during `get` calls?
- How does the `capacity` eviction interact with the `ttl` eviction? (Hint: check expiration first.)
Template:
```python
class TimeAwareLRU:
    def __init__(self, capacity: int):
        pass

    def get(self, key: int, current_time: int) -> int:
        # Your code here
        pass

    def put(self, key: int, value: int, ttl: int, current_time: int) -> None:
        # Your code here
        pass
```

Submit your solution and check back next week for the detailed answer!
Conclusion
The LRU cache is a quintessential DSA problem because it forces you to manage state across two different data structures simultaneously. When practicing, focus on the pointer logic of the Doubly Linked List—drawing it out on paper before coding is a strategy used by even the most seasoned engineers. Mastering this pattern demonstrates to interviewers that you understand the trade-offs of memory locality, time complexity, and robust object-oriented design.