Opinion

How recommendation engines work

A simple guide to how Google, Facebook, Amazon and other platforms make accurate product recommendations for users.

You go to the local supermarket, and you buy a pack of noodles, a crate of eggs, some spicy Cameroon pepper, and 2 kilos of frozen chicken. Then you proceed to check out your purchases. From everything you bought, it can be safely assumed that you're trying to beat the Indomie adverts at their own game. But there's one key thing missing, so the cashier says. 'Sir/Madam, won't you buy some fresh groundnut oil? We have several brands in stock'. The cashier goes ahead to list some of them and maybe their prices. You may or may not buy one eventually.

Nevertheless, the cashier has made a fairly accurate recommendation of something you ought to be interested in based on your purchases. In the simplest of terms, this is what a recommendation engine does.

Recommendation engines are a popular form of artificial intelligence found in many commercial applications. They are behind the movie recommendations that you get on Netflix, which has said that its proprietary recommendation engine is worth $1 billion per year. Imagine that!

A bunch of complex code in a computer somewhere is worth five times the dividends the Federal Government of Nigeria hopes to earn from Nigeria LNG Limited (NLNG) this year.

It's the magic behind your YouTube feed. The more makeup videos you watch, the more makeup videos are recommended for your viewing pleasure. If you made a volte-face in your viewing habits and started watching more Nollywood movies online, the recommendations would shift accordingly.

A typical recommendation engine takes information, in the form of data from you and other users, and processes it to produce recommendations. Think of it this way - if you buy a smartphone online and 100 previous users bought that smartphone and a headset, a recommendation engine would likely suggest that headset for you.
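
For the curious, the smartphone-and-headset example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real platform does it: the function name `recommend_co_purchases` and the order history are made up for this example.

```python
from collections import Counter

def recommend_co_purchases(orders, item, top_n=1):
    """Suggest the items most often bought in the same basket as `item`."""
    co_counts = Counter()
    for basket in orders:
        if item in basket:
            # Count every other item that shared a basket with `item`.
            co_counts.update(other for other in basket if other != item)
    return [name for name, _ in co_counts.most_common(top_n)]

# Hypothetical order history: 100 shoppers bought the phone with a headset,
# and 5 bought it with a phone case.
orders = [{"smartphone", "headset"}] * 100 + [{"smartphone", "phone case"}] * 5

print(recommend_co_purchases(orders, "smartphone"))  # ['headset']
```

The engine knows nothing about headsets. It simply counts what other buyers did and plays the odds.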

Clustering Algorithms — the secret sauce of recommendation engines

Recommendation engines are partly driven by clustering algorithms. A clustering algorithm, as the name implies, groups items together based on their similarities. It is commonly used in market segmentation to group customers according to their preferences, based on data.

Let me explain using our perceptive shop cashier from the first paragraph. Let's call him Ade, and the name of the supermarket where he works is Sunny Bright Stores.

Every day, Sunny Bright receives 100 customers. 50 of those customers are middle-aged women who mostly buy groceries. 25 customers are young women who occasionally buy groceries, but also buy a lot of sanitary pads, toiletries, and beauty products. The remaining 25 are mostly young men who buy a lot of condoms, snacks, and groceries.

These 3 groups are 3 separate clusters because they have several things in common, such as age, sex, and general nature of purchases. When a new customer walks into Sunny Bright Stores, Ade (like a typical recommendation engine using pre-existing clusters) can guess what the person is likely to buy. A lot of times, he is fairly accurate.

When the person comes to his desk to pay, Ade can make recommendations about what the person might be interested in, based on what the relevant cluster is usually interested in. This is what happened 'behind the scenes' when he recommended a bottle of groundnut oil to the customer in the first paragraph.
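
Ade's mental process can be sketched roughly in Python. This is a simplification under stated assumptions: the clusters are taken as already formed, each one is summarised only by how its members split their spending across three invented categories (age and sex are left out), and every number and product list below is hypothetical.

```python
import math

# Hypothetical clusters. Each centroid is the average share of a basket spent
# on (groceries, beauty products, snacks). The figures are invented.
CLUSTERS = {
    "middle-aged women": {
        "centroid": (0.9, 0.05, 0.05),
        "popular": ["groundnut oil", "rice"],
    },
    "young women": {
        "centroid": (0.3, 0.6, 0.1),
        "popular": ["sanitary pads", "face cream"],
    },
    "young men": {
        "centroid": (0.4, 0.0, 0.6),
        "popular": ["snacks", "soft drinks"],
    },
}

def nearest_cluster(basket_shares):
    """Return the cluster whose centroid is closest to the customer's basket."""
    return min(
        CLUSTERS,
        key=lambda name: math.dist(basket_shares, CLUSTERS[name]["centroid"]),
    )

def recommend(basket_shares, already_bought):
    """Suggest the matched cluster's popular items the customer has not bought."""
    cluster = nearest_cluster(basket_shares)
    return [item for item in CLUSTERS[cluster]["popular"] if item not in already_bought]

# A basket made up entirely of groceries, like the one in the first paragraph.
basket = {"noodles", "eggs", "pepper", "frozen chicken"}
print(recommend((1.0, 0.0, 0.0), basket))  # ['groundnut oil', 'rice']
```

A real clustering algorithm would also discover the groups from the data rather than have them written in by hand. The sketch only covers the second half of the job: matching a new customer to an existing cluster.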

The building blocks of these clusters come from data. This data comes in several forms. When you browse any webpage, you leave a digital footprint that is matched to your IP address and can be stored to create a unique profile of you. This is implicit data. When you subscribe to a channel, post a comment, rate an app on your app store, or put items in your online shopping cart (with no intention of actually checking them out), you're providing explicit data about your preferences.

The combination of implicit and explicit data is used by algorithms to create a composite profile of you. We all have profiles on the online platforms we frequently use. The more you use them, the smarter the recommendation engine gets about your preferences.
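
As a rough illustration of how such a profile might be assembled, here is a small Python sketch. The signal names and their weights are assumptions made for this example. The only idea carried over from the text is that implicit and explicit signals are combined into one profile that shifts as your behaviour changes.

```python
from collections import defaultdict

# Hypothetical weights: explicit signals are assumed to say more about you
# than implicit ones. Real platforms choose and tune such numbers themselves.
SIGNAL_WEIGHTS = {
    "view": 1.0,       # implicit: you watched or browsed something
    "comment": 3.0,    # explicit: you posted a comment
    "subscribe": 5.0,  # explicit: you subscribed to a channel
}

def build_profile(events):
    """Turn (signal, topic) events into each topic's share of total interest."""
    scores = defaultdict(float)
    for signal, topic in events:
        scores[topic] += SIGNAL_WEIGHTS[signal]
    total = sum(scores.values())
    return {topic: score / total for topic, score in scores.items()}

events = [("view", "makeup"), ("view", "makeup"), ("subscribe", "makeup")]
before = build_profile(events)

# A change in habits: four Nollywood views and a comment shift the profile.
events += [("view", "nollywood")] * 4 + [("comment", "nollywood")]
after = build_profile(events)

print(before)  # {'makeup': 1.0}
print(after)   # {'makeup': 0.5, 'nollywood': 0.5}
```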

It's like Ade, our shop clerk. Maybe the first time, you told him that you weren't interested in cooking oil because you had a stockpile at home. Give that same reply on your second visit, and he's not likely to ask a third time. In the same vein, if you were to repeatedly go against the recommendations of the algorithm, it would collect this data to build your unique profile and get smarter about you.

Recommendation engines are, in essence, a bet on human psychology. We are creatures of habit who tend to repeat previous choices. One in three people who purchase something on Amazon do so based on a recommendation. That's $93 billion from just suggesting the next best thing.

Recommendation engines are great but they're not perfect

Not everything about recommendation engines is beautiful. The technology, though efficient, has its drawbacks. Because it recommends what most people prefer, it effectively pushes forward the most popular product, not the best product.

Ever wonder why a movie or a song gets so much hype, and when you listen to it, you're like, 'Is this what you all were raving about?' That's the challenge recommendation engines face.

The fact that a hundred people bought a specific headset alongside a smartphone, and that Amazon recommends that one to me, doesn't mean it is the best headset on the market. They probably bought it because it was cheaper or had better brand recognition.

This kind of situation is known as popularity bias. The more popular something is, the more exposure it gets. The less popular something is, the more it gets buried. You can compare this with how search engines rank webpages. The articles on page 1 are partly there because they are the best optimized for search engines. The more optimized they are, the more visits they get. The more visits they get, the higher the algorithm ranks them, because it believes that the content there is relevant based on traffic. There might be webpages on page 10 of Google Search that are better in quality, but because they are not popular, they might never see the light of page 1, to use that expression.
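
The self-reinforcing loop is easy to see in a toy simulation. The Python sketch below is not how any search engine actually ranks pages. It assumes, purely for illustration, that pages are ranked by clicks alone and that each round the first, second, and third positions earn 10, 3, and 1 new clicks.

```python
def simulate_feedback_loop(clicks, rounds, clicks_by_position=(10, 3, 1)):
    """Rank pages by clicks, then reward each page according to its position."""
    clicks = dict(clicks)  # Work on a copy so the caller's data is untouched.
    for _ in range(rounds):
        ranking = sorted(clicks, key=clicks.get, reverse=True)
        for position, page in enumerate(ranking):
            if position < len(clicks_by_position):
                clicks[page] += clicks_by_position[position]
    return clicks

# Page A starts with a one-click lead over two equally matched rivals.
start = {"page A": 11, "page B": 10, "page C": 10}

print(simulate_feedback_loop(start, rounds=5))
# {'page A': 61, 'page B': 25, 'page C': 15}
```

After only five rounds, a one-click head start has grown into a lead of 36 clicks, and nothing in the loop ever asked which page was actually better.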

Because recommendation engines are susceptible to popularity bias, they can easily be manipulated by malicious activity. In May, millions of Indians went to the Play Store to give negative reviews of the Chinese app, TikTok. Most people knew it was fallout from the larger diplomatic face-off between India and China. However, the Play Store algorithm that recommends apps to download doesn't follow the news. That kind of activity could trick the algorithm into lowering the ranking of the affected app. In that instance, Google had to intervene by removing the poor reviews to ensure that the algorithm wouldn't de-rank TikTok based on those reviews.

To sum up the issue at stake, recommendation engines can create popularity bubbles. Certain forms of content are promoted to a group of people because that content is popular and matches their tastes, while other forms of content stay hidden from them simply because they don't match their preferences or are not popular.

This challenge, however, creates an opportunity for meaningful human intervention: intentionally bursting these bubbles, or injecting new and unrelated content into them. Take Ade as an example. Maybe Ade shouldn't stop at recommending groundnut oil to me just because of what I purchased and because I fall into the cluster of those who frequently buy groceries. He could go further and recommend a book that just came out. It's in no way connected to what I just bought, but it breaks the self-reinforcing loop of recommendations and opens me and other customers up to other possibilities.
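
One simple way to picture this is to reserve a slot in every list of recommendations for something from outside the bubble. The Python sketch below is a hypothetical illustration. The function `with_wildcard`, the product names, and the choice to pick the outsider at random are all assumptions, and a human curator could just as easily fill that slot by hand.

```python
import random

def with_wildcard(ranked, catalogue, slots=3, rng=None):
    """Fill the last slot with an item the engine would not have suggested."""
    rng = rng or random.Random()
    outside = [item for item in catalogue if item not in ranked]
    if not outside:
        # Nothing lies outside the bubble, so fall back to the usual list.
        return ranked[:slots]
    return ranked[: slots - 1] + [rng.choice(outside)]

# What the engine would normally suggest to a grocery shopper.
ranked = ["groundnut oil", "rice", "tomato paste"]
# Everything the store sells, including items unrelated to groceries.
catalogue = ranked + ["new novel", "umbrella"]

print(with_wildcard(ranked, catalogue))
# e.g. ['groundnut oil', 'rice', 'new novel']
```

The top suggestions still come from the engine. Only the last slot is given over to a surprise.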

Measures like this are already in place in certain settings. For instance, some apps on the Play Store carry the mark 'Editor's Choice' to signify that they are there not because the recommendation engine ranked them, but because someone behind the scenes intentionally recommended them.

We may be creatures of habit with the tendency to repeat previous choices, but every once in a while, we tend to buck the trend. This is the next phase of the evolution of the recommendation engine – to spring surprises on us that further expand our horizons.