The Science Behind the Netflix Algorithms That Decide What You'll Watch Next

Photo: Cody Pickens

If you liked 1960s Star Trek, the first non-Trek title that Netflix is likely to suggest to you is the original Mission: Impossible series (the one with the cool Lalo Schifrin soundtrack). Streaming the latest Doctor Who is likely to net you the supernatural TV drama Being Human (the UK version). Watch From Dusk Till Dawn and 300 and say hello to a new row on your homepage: Visually Striking Violent Action & Adventure. Trying to understand the invisible array of algorithms that power your Netflix suggestions has long been a favorite sport, but what's actually going on in that galaxy of big data, those billions and billions of ratings stars? Turns out there are 800 Netflix engineers working behind the scenes at their Silicon Valley HQ. The company estimates that 75 percent of viewer activity is driven by recommendation. This summer it's unveiling a profile feature enabling family members to demarcate their preferences with individual queues. In March the company shipped its 4 billionth DVD, but in the first quarter of 2013 alone, it streamed more than 4 billion hours. We spoke with Netflix's recommendation dynamos—Carlos Gomez-Uribe, VP of product innovation and personalization algorithms (right), and Xavier Amatriain, engineering director—about how they control what you watch.

So what's actually lurking beneath that Star Trek-Mission: Impossible recommendation?

Carlos Gomez-Uribe: By looking at the metadata, you can find all kinds of similarities between shows. Were they created at roughly the same time? Do they tend to get the same ratings? You can also look at user behavior—browsing, playing, searching. Sometimes what's similar depends on who you're talking about. Take director Pedro Almodóvar. You might have four very different movies by Almodóvar. But he's such a strong voice that, by himself, he makes those videos similar to one another. For a different director—say, Spielberg—that might not be the case.
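To make the metadata idea concrete, here is a minimal sketch of one common way to score similarity between titles: compare their sets of descriptive tags and measure the overlap (Jaccard similarity). This is an illustration only, not Netflix's method. The tags, the `TAGS` table, and the `jaccard` and `most_similar` helpers are all hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Fraction of tags two titles share: |a & b| / |a | b|, from 0.0 to 1.0."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical hand-assigned tags, invented for illustration (not real Netflix metadata).
TAGS = {
    "Star Trek": {"1960s", "tv-series", "ensemble-cast", "sci-fi"},
    "Mission: Impossible": {"1960s", "tv-series", "ensemble-cast", "espionage"},
    "Hot Tub Time Machine": {"2010s", "movie", "comedy", "sci-fi"},
}


def most_similar(title: str, tags: dict = TAGS) -> str:
    """Return the other title whose tag set overlaps most with `title`."""
    others = (t for t in tags if t != title)
    return max(others, key=lambda t: jaccard(tags[title], tags[t]))


print(most_similar("Star Trek"))
```

With these made-up tags, Star Trek and Mission: Impossible share three of five distinct tags, while Star Trek and Hot Tub Time Machine share only one of seven, so era and format outweigh genre. A real system would weight tags differently for different people, which is the point Gomez-Uribe makes about Almodóvar and Spielberg.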

"many people tell us they watch foreign movies and documentaries, but in practice, that doesn't happen."


Who identifies show and movie characteristics for Netflix?

Xavier Amatriain: We have more than 40 people hand-tagging TV shows and movies for us. These are typically freelancers who do this to supplement their income. All of our analysts are TV and film buffs, and many have some experience working in the entertainment industry. They obviously have personal tastes, but their job as an analyst is to be objective, and we train them to work that way.


How has recommendation changed now that Netflix is focused on streaming?

Amatriain: When we were a DVD-by-mail company and people gave us a rating, they were expressing a thought process. You added something to your queue because you wanted to watch it a few days later; there was a cost in your decision and a delayed reward. With instant streaming, you start playing something, you don't like it, you just switch. Users don't really perceive the benefit of giving explicit feedback, so they invest less effort.


So predicted ratings, the cornerstone of the Netflix prize, have become less important?

Gomez-Uribe: Testing has shown that the predicted ratings aren't actually super-useful, while what you're actually playing is. We're going from focusing exclusively on ratings and rating predictions to depending on a more complex ecosystem of algorithms.


Does Netflix keep track of my viewing?

Amatriain: We know what you played, searched for, or rated, as well as the time, date, and device. We even track user interactions such as browsing or scrolling behavior. All that data is fed into several algorithms, each optimized for a different purpose. In a broad sense, most of our algorithms are based on the assumption that similar viewing patterns represent similar user tastes. We can use the behavior of similar users to infer your preferences.
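The assumption Amatriain describes, that similar viewing patterns signal similar tastes, is the basis of what is generally called collaborative filtering. The sketch below shows a bare-bones user-based version under simplifying assumptions: each viewer is reduced to the set of titles they played, and unseen titles are scored by how much the viewers who played them overlap with you. The viewers, play histories, and `recommend` function are hypothetical, and nothing here reflects Netflix's production algorithms.

```python
from collections import Counter

# Hypothetical play histories, invented for illustration.
PLAYS = {
    "you": {"Doctor Who", "Star Trek"},
    "viewer_a": {"Doctor Who", "Star Trek", "Being Human"},
    "viewer_b": {"Doctor Who", "Being Human", "Mission: Impossible"},
    "viewer_c": {"300", "From Dusk Till Dawn"},
}


def overlap(a: set, b: set) -> float:
    """Jaccard overlap between two play histories, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user: str, plays: dict = PLAYS) -> list:
    """Rank titles the user has not played, weighted by the similarity of who played them."""
    seen = plays[user]
    scores = Counter()
    for other, history in plays.items():
        if other == user:
            continue
        weight = overlap(seen, history)
        if weight == 0:
            continue  # viewers with nothing in common contribute nothing
        for title in history - seen:
            scores[title] += weight
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("you"))
```

In this toy data, Being Human ranks first because both overlapping viewers played it, and the viewer with no titles in common with you has no influence at all.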


So if I'm viewing on my iPad at midnight, do I see different recommendations than I would on my TV at 8 pm?

Amatriain: We have been working for some time on introducing context into recommendations. We have data that suggests there is different viewing behavior depending on the day of the week, the time of day, the device, and sometimes even the location. But implementing contextual recommendations has practical challenges that we are currently working on. We hope to be using it in the near future.


Why do I see so many three- or even two-star movies in my recommendations?

Gomez-Uribe: People rate movies like Schindler's List high, as opposed to one of the silly comedies I watch, like Hot Tub Time Machine. If you give users recommendations that are all four- or five-star videos, that doesn't mean they'll actually want to watch that video on a Wednesday night after a long day at work. Viewing behavior is the most important data we have.

Amatriain: We know that many of the ratings are aspirational rather than reflecting your daily activity.


We can't hide from you.

Gomez-Uribe: A lot of people tell us they often watch foreign movies or documentaries. But in practice, that doesn't happen very much.


Does the real estate of rows affect viewing behavior?

Gomez-Uribe: Placement matters. The closer to the first position on a row a title is, the more likely it will get played. The higher up on the page a row is, the more likely it is to generate a play.


How does your recommendation work differently from that of other companies?

Amatriain: Almost everything we do is a recommendation. I was at eBay last week, and they told me that 90 percent of what people buy there comes from search. We're the opposite. Recommendation is huge, and our search feature is what people do when we're not able to show them what to watch.


Are there limits to algorithmic recommendation?

Gomez-Uribe: I watched Tell No One, the French thriller, over a year ago. I've been trying to find similar movies. The person on the content team who acquired it said it's the only one like it in the world.
