The Road LRU

Berhane Cole
7 min read · Sep 22, 2021
“Mappe zur Wandmalerei, Hamburger Kunstverein (Jahn 33)”, Blinky Palermo, 1973

For me, crafting and implementing algorithms can be nerve-wracking but ultimately thrilling exercises. Having a background in carpentry and craftwork, I find some kinship between the techniques that go into those trades and the artful implementation of algorithms. Like the best stonemasons, engineers’ skills are assessed not only by what they build but by how they build it: how they utilize resources and optimize their labor. Ground zero for technique is the artful implementation of data structures and the algorithms that dictate their behavior. The set of optimizing instructions we will tackle today is the LRU cache algorithm.

Cache Replacement Policies

The LRU in LRU cache refers to the eviction policy that governs storage in the cache. LRU is one way to optimize a cache among several other cache replacement policies. A way to think about it is to recall the simpler data structures, the stack and the queue. If we take either of these data structures with a fixed size and stipulate that when that size is reached we expel (or evict) the appropriate element to make space for the new item, we have a cache replacement policy. A queue is a FIFO data structure, so as data begins to overflow, the oldest item still extant is retrieved and expelled. A stack works oppositely: as a LIFO structure, the top of the stack, i.e. the most recently inserted item, is retrieved and expelled to make room for the new item when the cache reaches overflow.

To illustrate these eviction policies, let’s track data structures with a size of five, starting with a fixed-size stack:

and next the fixed size queue:
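To make the fixed-size behavior concrete in code, here is a minimal sketch of both eviction styles, using plain arrays as the backing store; the class names and method names are my own illustrative choices, not anything prescribed by the post:

```
// A fixed-size stack: when full, the most recently inserted item (the top) is evicted.
class BoundedStack<T> {
  private items: T[] = [];
  constructor(private capacity: number) {}

  push(item: T): void {
    if (this.items.length === this.capacity) {
      this.items.pop(); // LIFO eviction: expel the top
    }
    this.items.push(item);
  }
}

// A fixed-size queue: when full, the oldest item (the front) is evicted.
class BoundedQueue<T> {
  private items: T[] = [];
  constructor(private capacity: number) {}

  enqueue(item: T): void {
    if (this.items.length === this.capacity) {
      this.items.shift(); // FIFO eviction: expel the oldest
    }
    this.items.push(item);
  }
}
```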

As illustrated, cache replacement policies are the logic that governs cache storage, more specifically, the rules for what to expel once the cache reaches max capacity. LIFO and FIFO are built into the construction of stacks and queues, so given a fixed size, expulsion is straightforward. Intermediate and above cache replacement policies, such as the LRU cache, bake a bit more logic into the circumstances that direct expulsion and into how the cache organizes itself around the replacement policy.

LRU Cache Fundamentals

In an LRU cache, the element that is evicted at max capacity is, as the name suggests, the least recently used one. “Least recently used” is a bit clumsy to reason about, but what it means is the item that was least recently touched. “Touching” an item, for the sake of this problem, is accessing it in one of two ways: getting or setting. When an item is set, it becomes the most recently “touched,” so it is placed at the back of the line for expulsion, much like the visualization for our fixed-size queue. The distinct aspect of the LRU cache is that accessing an item and getting its value also counts as “touching” it, so that element must likewise be moved to the back of the line, since it was most recently “touched.”
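As a quick trace of that behavior, suppose we have a hypothetical LRUCache class (sketched later in this post) with a capacity of two; the names and API here are my own illustrative assumptions:

```
const cache = new LRUCache<string, number>(2);

cache.set("a", 1); // cache holds: a
cache.set("b", 2); // cache holds: a, b ("a" is now least recently used)
cache.get("a");    // touching "a" promotes it; "b" is now least recently used
cache.set("c", 3); // capacity reached: "b" is evicted, not "a"

cache.get("b");    // null — "b" was the least recently touched item
```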

This logic is reasonable enough, but the LRU cache presents one final feature that differentiates it: O(1) time complexity for setting, getting, and removing. This is where the optimization aspect of cache replacement policies comes into view. There are several non-optimal ways to implement a cache that operates similarly to an LRU cache without the time benefits, but these time complexity constraints force artful use of data structures. One avenue to building a proper LRU cache is implementing a doubly linked list to handle the storage and to work in tandem with the rules the LRU cache sets for expulsion.

Doubly Linked List

A linked list is a data structure that stores data linearly. Linked lists circumvent some of the time complexity shortcomings of arrays because the data is not grouped in a way that holds direct reference to an item’s placement within the structure. With linked lists there are no indices; rather, the list has a head and a tail, and each node has a next property that points to the next node and, in the case of a doubly linked list, a previous property that points to the previous node.

When approaching the LRU cache problem, we opt to implement a doubly linked list for storage because, to achieve O(1) time for insertion and deletion, we want to simply adjust pointers instead of splicing an array, which involves re-indexing the array. Since we need both the previous and next pointers to accomplish this, we cannot use a singly linked list, which only provides pointers in one direction.
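Here is a minimal sketch of the node shape and the constant-time unlink operation that the doubly linked list makes possible; the names are illustrative, not taken from the post:

```
// A node in a doubly linked list: pointers in both directions.
class ListNode<K, V> {
  prev: ListNode<K, V> | null = null;
  next: ListNode<K, V> | null = null;
  constructor(public key: K, public value: V) {}
}

// Removing a node is O(1): rewire its neighbors to point at each other.
// No traversal and no re-indexing, unlike splicing an array.
function unlink<K, V>(node: ListNode<K, V>): void {
  if (node.prev) node.prev.next = node.next;
  if (node.next) node.next.prev = node.prev;
  node.prev = null;
  node.next = null;
}
```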

The LRU cache is a great example of tradeoffs with regard to complexity. What we gain in time complexity we pay for in space complexity. Within our LRU cache class, we require a doubly linked list class that contains rules for our two pointers and all the methods for insertion, deletion, and access. Additionally, for O(1) access, I introduced an array that holds references to all extant keys and values within the cache, with entries deleted when nodes are expelled.

While we achieve our time complexity goals, the LRU cache class is a beast, and its implementation highlights some considerations that must be weighed in development.

Pseudocoding the Problem

When approaching this problem, I had to painstakingly pseudocode it out to understand the flow and rules for the cache. Below are the guidelines for my implementation of the cache.

LRU cache lifecycle:
1. new LRUcache(limit) instantiates a new cache.
2. The argument of the class instantiation is the size of the cache.
3. To set key-value pairs in the cache, you invoke the set method, which takes two parameters: a key and a value.
4. If the number of keys overtakes the size of the cache, space in the cache must be freed up.
5. The variables that figure into freeing up space in the cache are whether a key has been retrieved recently and the order in which keys were inserted.
6. Retrieval with the get method counts as a last use.
7. Put another way, setting and getting figure equally into the order of priority for removal. All that matters is that the key/value pair least recently engaged is removed from the cache.
8. With this in mind, when set is invoked it needs to check whether the size limit has been reached. If it has, remove the least recently engaged item, then insert the newly set item into the cache along with its place on the chopping block.
9. When get is called, it retrieves the value set for the key passed as the method’s argument. If the key is not found in the cache, it returns null. If the value is found, it returns the value and resets the placement of that key/value pair on the chopping block.
10. The doubly linked list comes into play as the chopping block.
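To ground these guidelines, here is a compact sketch of how they might translate into code, assuming a Map keyed by cache keys for O(1) lookup alongside the doubly linked list. This is my own illustrative reading of the lifecycle above, not the author’s exact implementation:

```
class LRUNode<K, V> {
  prev: LRUNode<K, V> | null = null;
  next: LRUNode<K, V> | null = null;
  constructor(public key: K, public value: V) {}
}

class LRUCache<K, V> {
  private lookup = new Map<K, LRUNode<K, V>>(); // O(1) key -> node access
  private head: LRUNode<K, V> | null = null;    // most recently used
  private tail: LRUNode<K, V> | null = null;    // least recently used (the "chopping block")

  constructor(private limit: number) {}

  get(key: K): V | null {
    const node = this.lookup.get(key);
    if (!node) return null;   // not in cache
    this.promote(node);       // getting counts as a "touch"
    return node.value;
  }

  set(key: K, value: V): void {
    const existing = this.lookup.get(key);
    if (existing) {
      existing.value = value;
      this.promote(existing); // setting also counts as a "touch"
      return;
    }
    if (this.lookup.size === this.limit) this.evict(); // free space at the tail
    const node = new LRUNode(key, value);
    this.lookup.set(key, node);
    this.addToHead(node);
  }

  // Detach a node and reinsert it at the head (most recently used).
  private promote(node: LRUNode<K, V>): void {
    if (node === this.head) return;
    this.unlink(node);
    this.addToHead(node);
  }

  // Remove the tail node: the least recently used item.
  private evict(): void {
    if (!this.tail) return;
    const victim = this.tail;
    this.unlink(victim);
    this.lookup.delete(victim.key);
  }

  private addToHead(node: LRUNode<K, V>): void {
    node.prev = null;
    node.next = this.head;
    if (this.head) this.head.prev = node;
    this.head = node;
    if (!this.tail) this.tail = node;
  }

  private unlink(node: LRUNode<K, V>): void {
    if (node.prev) node.prev.next = node.next;
    if (node.next) node.next.prev = node.prev;
    if (node === this.head) this.head = node.next;
    if (node === this.tail) this.tail = node.prev;
    node.prev = null;
    node.next = null;
  }
}
```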

Visualizing the Problem

As with other algorithms and data structures, it is crucial to whiteboard and chart out the behavior of the cache to fully understand its life cycle. To start off, we fill the cache to capacity. If items are only set and never accessed, the least recently touched items are the oldest items in the cache, much like a FIFO data structure:

Once we introduce getting into the mix, the least recently touched aspect of the cache comes into play:

Every time we get an item, we must promote it in order to keep it from being evicted prematurely. That promotion means changing the pointers of the accessed node’s preceding and succeeding nodes to point to each other instead of to the accessed node, then reinserting the accessed node at the head of the list (given that we are evicting from the tail).
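A small trace, assuming the LRUCache sketch above with the head as most recently used and eviction from the tail, shows the pointer rewiring (keys only; the variable name is hypothetical):

```
const recent = new LRUCache<string, string>(5);
["a", "b", "c", "d", "e"].forEach((k) => recent.set(k, k.toUpperCase()));
// List state: head -> e <-> d <-> c <-> b <-> a <- tail

recent.get("c");
// Promotion: "d" and "b" are rewired to point at each other,
// then "c" is reinserted at the head:
// List state: head -> c <-> e <-> d <-> b <-> a <- tail
```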

This promotion aspect can present a conceptual difficulty when approaching the LRU cache. Setting and removing from the tail are pretty elementary, and their implementations are simple enough with a doubly linked list. However, making sure key/value pairs stay correlated and the appropriate nodes are moved accordingly takes some telegraphing to understand how the cache is operating. To overcome this, it is beneficial to invest in understanding doubly linked list traversal.

The Point

In computing, resources are limited, so cache replacement policies are implemented to improve performance by only maintaining data that adheres to some caching priority. With LRU, that priority is determined by an item’s last use. Parts of the LRU cache, like its space complexity, are ‘expensive’, but the algorithm makes for speedy insertion and deletion, which may be worth it when considering scale.

What implementing the LRU cache afforded me was the opportunity to test the tools I have honed in my nascent coding career. Mastery comes from practice, and implementing this LRU cache provided obstacles that deepened my understanding of data structures and algorithms more broadly. If we stick to the conceit that software engineering is a craft, getting more comfortable addressing conceptually challenging prompts such as implementing an LRU cache is crucial to growth as an engineer. I find problems like the LRU cache compelling, and they further cement my desire to continue down this road of software engineering as a métier.
