MLWhiz | AI Unwrapped

How Recommendation Systems Learned to Think

Recsys Series Part 2: From collaborative filtering breakthroughs to generative AI agents that can chat about your preferences and explain their reasoning

Rahul Agarwal
Oct 04, 2025

This is Post 2 in my comprehensive RecSys series. In Post 1, we looked at the fundamental techniques—collaborative filtering, content-based filtering, and matrix factorization. But how did we get here? And where are we heading?

Picture this: It’s 1994, and the entire web has maybe 10,000 websites. A small team at the University of Minnesota launches something called GroupLens to help people find interesting Usenet articles. The system could barely handle a few thousand users, but it introduced a revolutionary idea: computers could predict what you’d like based on what similar people enjoyed.

Fast forward 30 years and we are here.

Netflix serves 230 million subscribers with AI-powered recommendations processing billions of interactions in real-time. Spotify’s Discover Weekly creates 40 million personalized playlists every single week. And now, ChatGPT-style models are starting to generate recommendations through natural conversation.

In this post, we’ll trace the complete evolution of recommendation systems, understand not just what happened but why it happened, and see how to implement the key innovations yourself. This historical perspective will give you the context to understand why certain approaches dominate different scenarios—knowledge that’s crucial for building effective systems today.

Ready to journey through 30 years of RecSys evolution? Let’s dive in.


The Pre-Internet Era: When Computers First Learned to Recommend

Grundy (1979): The First Digital Librarian

Okay, I will start with some history. It still blows my mind that Elaine Rich created what might be the world’s first recommender system, Grundy, back in 1979. I mean, how awesome is it that someone thought about this problem in 1979, when computers barely existed? Forget computers—even data was a thing of the future!

This “computer librarian” would interview users about their reading preferences and then classify them into stereotypes like “mystery lover” or “sci-fi fan.”

Grundy’s approach was dead simple:

  1. Ask users direct questions about their preferences

  2. Classify them into predefined categories

  3. Recommend books from those categories
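To make that flow concrete, here’s a toy sketch of the interview-then-classify-then-recommend loop. The questions, stereotypes, and book lists are invented for illustration—this is not Rich’s actual implementation:

```python
# Toy sketch of Grundy's stereotype flow (illustrative data, not Rich's system).
STEREOTYPE_BOOKS = {
    "mystery lover": ["The Big Sleep", "Gone Girl"],
    "sci-fi fan": ["Dune", "Neuromancer"],
}

def classify(answers: dict) -> str:
    # Step 2: map interview answers onto a predefined stereotype (made-up rule).
    return "sci-fi fan" if answers.get("likes_space_travel") else "mystery lover"

def recommend(answers: dict) -> list[str]:
    # Step 3: recommend books attached to that stereotype.
    return STEREOTYPE_BOOKS[classify(answers)]

# Step 1: ask the user directly, then recommend.
print(recommend({"likes_space_travel": True}))  # ['Dune', 'Neuromancer']
```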

Sure, it was basic, but Grundy solved a fundamental problem that still bugs us today: the cold start problem. When you have zero data about a new user, how do you make recommendations? Grundy’s solution: just ask them directly.

Grundy proved that explicitly asking about preferences could bootstrap recommendation systems. This insight is still everywhere today—Netflix’s thumbs up/down ratings, Spotify’s music taste onboarding, and TikTok’s initial “what interests you?” questions all trace back to Grundy’s core insight that sometimes you just need to ask users what they want.


Information Retrieval and the Rise of Content-Based Filtering

While Grundy focused on user characteristics, researchers were working in parallel on analyzing item characteristics. Back in the 1960s, Gerard Salton introduced the Vector Space Model, which basically said “hey, what if we represent text documents as numerical vectors in a high-dimensional space?” Interestingly, Salton never actually wrote the paper he’s most famous for—there’s literally a paper called “The Most Influential Paper Gerard Salton Never Wrote”. But the vector space model was an extraordinary idea that still matters today, every time we represent documents as embedding vectors.

The trick to making this work was figuring out the weight of each term in the vector. This led to Term Frequency-Inverse Document Frequency (TF-IDF). Don’t let the fancy name scare you—it’s actually pretty intuitive:

  • Term Frequency (TF): How often does a word appear in a document?

  • Inverse Document Frequency (IDF): How rare is this word across all documents?

Words that appear frequently in a specific document but rarely elsewhere get high scores; these are the words that really define what that document is about.
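To see the weighting in action, here’s a minimal from-scratch TF-IDF sketch over three made-up movie descriptions, using the common log-scaled IDF variant (in practice you’d typically reach for a library like scikit-learn instead):

```python
import math
from collections import Counter

docs = [
    "the matrix is a sci-fi action movie",
    "blade runner is a sci-fi classic",
    "the notebook is a romantic movie",
]
tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

# Document frequency: how many documents does each term appear in?
df = Counter(term for doc in tokenized for term in set(doc))

def tfidf(doc):
    tf = Counter(doc)
    # TF = term count / doc length; IDF = log(total docs / docs containing the term)
    return {term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()}

vectors = [tfidf(doc) for doc in tokenized]

# Terms unique to one document (e.g. "matrix", "action") score highest; terms
# present in every document (e.g. "is", "a") get IDF = log(1) = 0 and drop out.
print(sorted(vectors[0].items(), key=lambda kv: -kv[1])[:3])
```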

This content-based approach worked well for its time, but it had a fundamental limitation: it could only recommend items similar to what you’d already consumed. If you only watched action movies, you’d never discover you might love romantic comedies. The system was trapped in what we call a “content bubble.”


Tapestry: The Birth of “Collaborative Filtering”

The big conceptual leap from content-based analysis to leveraging user communities happened at Xerox PARC in 1992. Researchers built the “Tapestry” system to help users manage the crazy flow of electronic documents and emails. Now a user could create queries like, “Show me all documents that my colleague ‘dave’ has marked as ‘important.’”

Tapestry coined the term “collaborative filtering” and introduced the idea that you could create filters based on other people’s actions. Pretty revolutionary stuff for 1992!

Tapestry proved that collective intelligence could work in digital systems. The insight that “people who agreed in the past will probably agree again in the future” became one of the foundational pillars of RecSys.


GroupLens: Automating the Wisdom of Crowds

The first system to actually automate collaborative filtering was GroupLens, built in 1994 at the University of Minnesota; this is where most of the RecSys literature really begins. It was designed to help users navigate the flood of articles on Usenet newsgroups, and it turned collaborative filtering from a manual process into an algorithmic one.

Instead of manually deciding whose opinions to trust, GroupLens would automatically find users with similar tastes and use their preferences to make recommendations.

This became known as User-Based Collaborative Filtering:

  1. Find users similar to you

  2. Look at what they liked that you haven’t seen

  3. Recommend those items
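Here’s a minimal sketch of that three-step loop over a toy ratings matrix, using cosine similarity to find neighbors. The data and weighting are illustrative, not GroupLens’s exact algorithm:

```python
import numpy as np

# Rows = users, columns = items; 0 means "not rated".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
], dtype=float)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

def recommend_for(user_idx, k=1):
    target = ratings[user_idx]
    # 1. Find users similar to the target user.
    sims = np.array([cosine(target, other) for other in ratings])
    sims[user_idx] = 0.0                 # ignore self-similarity
    # 2. Score items by similarity-weighted ratings from those neighbors.
    scores = sims @ ratings
    # 3. Only recommend items the target user hasn't seen yet.
    scores[target > 0] = -np.inf
    return np.argsort(scores)[::-1][:k]

print(recommend_for(0))  # [2] — the one item user 0 hasn't rated, liked by a similar user
```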


The Scalability Problem and Amazon’s Solution

But there was a problem. As the web exploded, user-based collaborative filtering became painfully slow. Finding similar users for millions of people in real-time? Not happening.

GroupLens worked great for thousands of users, but by 2000, Amazon had millions. Storing a user-user similarity matrix grows as O(n²): with 10 million users that’s on the order of 10¹⁴ pairs, which works out to hundreds of terabytes even at a few bytes per similarity score.

This scalability nightmare led to a brilliant innovation from Amazon. They flipped the problem on its head with item-based collaborative filtering (Item-Item CF).

Instead of “Users like you also liked...”, they said “Users who bought The Matrix also bought Blade Runner.”

For most platforms, items << users. Amazon in 2000 had millions of users but maybe hundreds of thousands of products. The genius move was to pre-compute item similarities offline; at recommendation time, you just do fast lookups.
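Here’s a minimal sketch of that offline/online split on the same kind of toy ratings matrix (again illustrative, not Amazon’s production pipeline): compute the item-item similarities once, and serving a recommendation becomes a simple lookup.

```python
import numpy as np

# Rows = users, columns = items; 0 means "not purchased/rated".
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
], dtype=float)

# --- Offline phase: pre-compute the item-item cosine similarity matrix once. ---
item_vectors = ratings.T                              # each row is one item's ratings
norms = np.linalg.norm(item_vectors, axis=1, keepdims=True) + 1e-9
item_sim = (item_vectors @ item_vectors.T) / (norms * norms.T)
np.fill_diagonal(item_sim, 0.0)                       # an item shouldn't recommend itself

# --- Online phase: "users who bought X also bought..." is just a lookup + sort. ---
def also_bought(item_idx, k=2):
    return np.argsort(item_sim[item_idx])[::-1][:k]

print(also_bought(0))  # the items most similar to item 0
```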


The Matrix Factorization Era: Catalyzed by the Netflix Prize (2006-2009)
