Who or what determines what’s in your Facebook News Feed? It’s a complex algorithm that aims to put what interests you most at the top of the queue. Increasingly, Facebook is focused on trying to determine what content is fake, junk, or misleading–and sending it to the bottom. But the purveyors of this content are a determined adversary. Miles speaks with their foe at Facebook, the Director of Analytics for the News Feed, Dan Zigmond. It’s an interview we did for our upcoming series on “junk news” for the PBS NewsHour.
Miles O’Brien: Hello and welcome to another edition of Miles To Go; I’m Miles O’Brien. So do you ever wonder who, or what, determines what you see in your Facebook newsfeed? We’ve all been thinking a lot about this lately as Facebook faces an intense spotlight of scrutiny amid the rise of fake news and its impact on our political process.
And for that matter, its role in the polarization and tribalization of the world…
For the past year, producer Cameron Hickey and I have been investigating Facebook and fake news in general – although we prefer to use the less politically charged term “junk news.”
As part of this project for the PBS NewsHour – the series will be rolling out in coming weeks, stay tuned for that – we got some really unusual access inside Facebook headquarters in Menlo Park, California.
While we were there, we spent some time with the Director of Analytics for the News Feed, Dan Zigmond…
Dan is a Buddhist who practices “intermittent fasting,” meaning he does not eat anything for about 15 hours each day.
He is the perfect person, I think, to explain how we are fed information by Facebook.
Miles O’Brien: You did a nice succinct primer on how it works. Give us the 101 primer on how the News Feed works and how things that we see are selected?
Dan Zigmond: Sure. If you take all of the stories that have been posted by your friends, by pages that you follow and then the other stories that maybe your friends have commented on or liked, gather all that together. For the average person, there’s about 2,000 of those. So, it’s a lot of content that you could see. But most of us going to the News Feed each day are probably only going to scroll through maybe around 200 of those stories.
So one of our jobs on the News Feed is to try to figure out how to rank those 2,000 stories so that it’s likely that the ones you’re most interested in are sort of towards the top. Because if you’re not in that first 200, it’s very unlikely that people are actually going to see the story.
So, a big part of it is figuring out what the right ranking is and there’s lots of ways that we can do that. You know, for example, you might say, maybe you should just rank it from 1 to 2,000 based on which of the stories you’re most likely to click “like” on. Or, you might want to rank it on like what are the ones that you are most likely to comment on, or maybe what are the ones that are likely to make you spend the most time on Facebook.
And you might think like at first glance, at first thought, all of those rankings would be about the same, but in fact, they end up completely different. If you rank by time, you end up putting a lot of video towards the top because video is something that takes a lot of time to view. If you rank by likes, you end up putting a lot of public content on the top, news stories, professional, celebrity kinds of pieces of content.
If you rank by comments, you might see more baby photos and wedding photos and things like that. So, the choice you make of how you rank these has a big impact on what people are actually seeing.
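The point about rankings diverging can be illustrated with a small Python sketch. Everything in it is hypothetical: the `Story` fields, the three example stories, and the numbers are invented for illustration and are not Facebook's data or code. It only shows that sorting the same inventory by three different predicted signals can produce three different orderings.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str
    p_like: float            # hypothetical predicted probability of a like
    p_comment: float         # hypothetical predicted probability of a comment
    expected_seconds: float  # hypothetical predicted viewing time


# Invented inventory; a real feed would have roughly 2,000 candidates.
stories = [
    Story("Celebrity news article", p_like=0.30, p_comment=0.02, expected_seconds=20.0),
    Story("Friend's baby photo", p_like=0.20, p_comment=0.15, expected_seconds=5.0),
    Story("Long cooking video", p_like=0.05, p_comment=0.01, expected_seconds=90.0),
]


def rank_by(candidates, key):
    """Return story titles sorted from highest to lowest on one signal."""
    return [s.title for s in sorted(candidates, key=key, reverse=True)]


by_likes = rank_by(stories, lambda s: s.p_like)
by_comments = rank_by(stories, lambda s: s.p_comment)
by_time = rank_by(stories, lambda s: s.expected_seconds)

print(by_likes)
print(by_comments)
print(by_time)
```

With these made-up numbers, the news article leads the like-based ranking, the baby photo leads the comment-based ranking, and the video leads the time-based ranking.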
Miles O’Brien: So, all these criteria, 2,000 in the inventory, how do you decide which one... I mean, those things are... you’re at a point where you’re talking about how they’re kind of at odds with each other?
Dan Zigmond: Right. So, these different metrics, these different ways of constructing a ranking algorithm are sort of at odds. And so, in fact, we can’t just choose one of them.
We can’t say, “Okay, we’re just going to rank by time or we’re just going to rank by likes”, because we’d end up with a News Feed that was just skewed very far towards one kind of content.
So, really what it comes down to is how you blend all these different signals of quality, of relevance together, so that the most interesting things really do end up at the top and the things that people are less likely to want to see in their feed end up towards the bottom. It’s a constant struggle to try to figure out what’s the right weighting to give to each of these things. And over time, we shifted between maybe a little more of a weight on time or a little more of a weight on commenting and that’s a lot of what we’re doing on News Feed.
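One common way to blend several signals is a weighted sum, and the following Python sketch assumes that form purely for illustration. The interview does not say how Facebook actually combines its signals, and the signal names (`p_like`, `p_comment`, `time_norm`) and weight values here are hypothetical.

```python
def blended_score(signals, weights):
    """Weighted sum of per-story signals (an assumed, simplified blend)."""
    return sum(weights[name] * signals[name] for name in weights)


# Hypothetical per-story predictions, each scaled to the range 0..1.
video = {"p_like": 0.05, "p_comment": 0.01, "time_norm": 0.90}
baby_photo = {"p_like": 0.20, "p_comment": 0.15, "time_norm": 0.05}

# Two hypothetical weightings: one leaning on time, one leaning on comments.
time_heavy = {"p_like": 1.0, "p_comment": 1.0, "time_norm": 2.0}
comment_heavy = {"p_like": 1.0, "p_comment": 5.0, "time_norm": 0.5}

for label, weights in [("time-heavy", time_heavy), ("comment-heavy", comment_heavy)]:
    scores = {
        "video": blended_score(video, weights),
        "baby_photo": blended_score(baby_photo, weights),
    }
    winner = max(scores, key=scores.get)
    print(label, "->", winner)
```

Shifting weight from time to comments flips which story comes out on top, which is the kind of tradeoff described above.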
Miles O’Brien: So, you’ve got time, commenting, likes, shares, I’m sure there’s a few others.
Dan Zigmond: Yeah.
Miles O’Brien: What is ideally best for business at Facebook?
Dan Zigmond: You know, in the long run what’s best for business is to make the News Feed that’s most relevant to people.
So, I mean probably, in very short-term terms, it would be to prioritize time. But over the long run, if that means people are seeing a lot of things on Facebook that maybe they could see in lots of other places because it’s maybe a similar video content that they could get elsewhere, that might not be the best thing. Over the long run, it might be better to show people more of the kinds of friends and family content that they can’t get anywhere else.
Our priority really, I mean what we’re really trying to accomplish is giving people the feed that’s going to make them like Facebook and get the most value out of it. And so, the feed that’s really right for them.
Miles O’Brien: It’s really a lot of variables at once.
Dan Zigmond: It’s a lot of variables at once and there are a lot of factors. So, we also think about which of these ranking algorithms is going to favor, maybe, content that people don’t want to see in their feed. Things like clickbait for example which used to be a much bigger problem on News Feed. In some ways you can think of it as an attempt to game the algorithm and get higher than people would really want it to be if they can sort of sit back and think about it.
Miles O’Brien: I want to talk about clickbait obviously, it’s a big part of why we’re here. But I am curious listening to this — Is it possible to tweak the algorithm in an individual fashion? So, if there is an individual that really just loves videos, they get videos or does it have to be across the board?
Dan Zigmond: So the way the algorithm works is really, for each individual and each individual story, we make a set of predictions. So, let’s say for yourself, what’s the likelihood for this story that you would comment or you would like, or if it’s a video that you would watch that video.
So, if you’re someone who watches a lot of video, then there’s going to be a lot of videos for which your likelihood of watching is high. If I’m someone who just doesn’t like to watch video on Facebook, then that number is going to be low for almost every video.
Although the algorithm itself is not necessarily personalized because these models are making different predictions for us, there’s going to be more video in your feed and less video in my feed.
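A minimal Python sketch of that idea: one shared ranking function, fed different per-person predictions, yields different feeds. The user profiles, story IDs, and probabilities are all invented for illustration.

```python
def top_stories(predicted_engagement, n):
    """Return the n story IDs with the highest predicted engagement."""
    ranked = sorted(predicted_engagement, key=predicted_engagement.get, reverse=True)
    return ranked[:n]


# Hypothetical predictions for the same four stories, for two different people.
video_fan = {"video_a": 0.80, "video_b": 0.70, "photo_a": 0.30, "status_a": 0.20}
video_skeptic = {"video_a": 0.05, "video_b": 0.04, "photo_a": 0.60, "status_a": 0.50}

# The function is identical for both people; only the predictions differ.
print(top_stories(video_fan, 2))
print(top_stories(video_skeptic, 2))
```

The ranking code is the same in both calls, yet the first feed is all video and the second contains none.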
Miles O’Brien: So, the algorithm itself is blended as you speak and it’s very important then that you understand a lot about me in order to tailor that algorithm to my individual desires, right? That’s what makes it custom?
Dan Zigmond: That’s right. So over time, we learn about each person’s kind of preferences and their likelihood to do different kinds of things on the site. It may be that you’re someone who just really loves baby photos, and so you tend to click like on every single one. That’s then over time going to lead to us ranking those a little bit higher for you and so you seeing more and more of those photos.
Miles O’Brien: This is a kind of machine learning, it’s just understanding a pattern of usage, is that how it goes?
Dan Zigmond: Yeah. So, these underlying models are all machine learning models that are trained on the data that we collect on users every day, and then get better and better at predicting what those users are going to do in the future.
Miles O’Brien: Okay. So, you don’t publish all these little nuances in specific fashion, but there are people out there who spend a lot of time trying to figure out how to game it. Tell us a little bit... has that been the case since day one pretty much?
Dan Zigmond: I mean probably not literally from day one when it was just pictures of Harvard college students. But as soon as it became an important part of our kind of public conversation and an important place for people to share content more broadly, then people started trying to game it and figuring out how to get their content towards the top.
And so, part of the job in ranking is a little bit of a cat and mouse game of the people who are trying to game it and show people content that they might not really want to see. So, trick us into thinking people want to see this content, and us trying to figure out what are the kind of authentic signals that we can find that are telling us what content people really want to see.
Miles O’Brien: I know it’s hard to get too specific on this for a lot of proprietary and security reasons. But can you give us an idea of how you go about it?
Dan Zigmond: Well, we do lots of things. First of all, we can figure out, are the predictions we’re making actually correct? So, let’s say, we think you like baby photos and we predict that you are very likely to actually click “like” on the baby photos. But then we can compare that to the reality of when you see one, do you actually click “like” on it. And so, we can basically calibrate our algorithms or evaluate our algorithms against what’s really happening with our users.
We also do surveys. So, we will occasionally survey users and say, “Do you want to see the story in your feed?” I can’t remember the exact wording that we use. And so, just get that subjective, did they think this was the right story to show them? We also often will have panels where we bring in kind of professional raters of content and have them look at this content and say, whether they think this is high quality content or relevant content and compare that to what our models are saying we think this content is.
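The first check Dan mentions, comparing predictions with what people actually do, can be sketched as a simple calibration measure in Python. This is an assumed, simplified metric (mean predicted probability minus observed rate), not a description of Facebook's evaluation tooling, and the data is invented.

```python
def calibration_gap(predicted_probs, outcomes):
    """Mean predicted probability minus the observed rate of the action.

    A value near zero suggests the predictions are well calibrated on
    average; a large positive value suggests over-prediction.
    """
    if not predicted_probs or len(predicted_probs) != len(outcomes):
        raise ValueError("need two non-empty sequences of equal length")
    mean_predicted = sum(predicted_probs) / len(predicted_probs)
    observed_rate = sum(outcomes) / len(outcomes)
    return mean_predicted - observed_rate


# Hypothetical case: the model was 90% sure of a like on four baby photos,
# but the person actually liked only two of them (1 = liked, 0 = did not).
predicted = [0.9, 0.9, 0.9, 0.9]
liked = [1, 1, 0, 0]

gap = calibration_gap(predicted, liked)
print(f"calibration gap: {gap:.2f}")
```

Here the gap is about 0.4, meaning the hypothetical model overestimates how often this person likes baby photos.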
Miles O’Brien: So, the algorithm is never done?
Dan Zigmond: The algorithm is never done. We’re constantly refining it. We probably at any given moment have dozens of ranking experiments that we’re doing which are sometimes small, sometimes larger tweaks to these weights, sometimes new signals that we’re introducing to see if we can make it just a little bit better.
Miles O’Brien: How does misinformation complicate your task?
Dan Zigmond: Well, it complicates the task a lot. Because we are looking at all these different signals of engagement, one of the issues we face is that misinformation is often very engaging. So, people don’t create a lot of fake boring stories.
In general, if they’re going to go through the trouble of creating a fake story, it’s about something really interesting and exciting. And so those stories can get disproportionate amounts of engagement. People might click more than they would click on, like, a more kind of honest story.
They might even like it or share it or comment on it. Sometimes people even comment on it to say that they think it’s misinformation. And so, trying to understand that kind of engagement and separate what you might call bad engagement from good engagement really complicates the process of building this algorithm.
Miles O’Brien: There’s that expression, “Truth is stranger than fiction.” This is the opposite: fiction is actually stranger.
Dan Zigmond: Yes, often fiction is stranger than the truth or at least a little more exciting.
Miles O’Brien: You sort of need a truth detector and unless it’s like an outright factual mistake, that gets really difficult as well because there’s a lot of shades of gray. There can be misleading headlines and yet an otherwise factual story, et cetera.
Dan Zigmond: Yes. I mean my team for example, the data science team here, we don’t really try to get into determining truth and falsehood. I mean, for one thing,
we are operating in languages that none of us understand and couldn’t possibly read, and dealing with topics that we couldn’t possibly try to evaluate. So, we are really looking for these signals, these universal signals that we can pick up that tell us whether this is something that’s authentically engaging to people or something that’s trying to game the system.
Miles O’Brien: Can you share some of those signals or is that like giving away the store?
Dan Zigmond: So one very concrete thing that we do is we do work with third-party fact checkers and when they give us information that there’s a hoax, then that’s obviously a very clear signal and we use that in the algorithm, but there are also more subtle signals. We find that if the people who share a story are generally people who haven’t actually clicked on it and read it, then that’s what we call sort of uninformed sharing, versus if people who share the story are all people who have actually read the story, which we call “informed sharing.” That’s a pretty good sign.
If the fact that you clicked on the story actually makes you less likely to share it, then that’s a signal that there was something about the story that, if it wasn’t misleading, was at least sort of disappointing or not what you thought it would be, and so that can be a sign that the story is inauthentic.
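The two sharing signals described here can be sketched in Python from a per-viewer event log. The log format (`clicked` and `shared` flags per viewer) and the sample data are assumptions made for illustration; the interview does not describe how these signals are actually computed.

```python
def informed_share_fraction(events):
    """Fraction of sharers who clicked through before sharing."""
    sharers = [e for e in events if e["shared"]]
    if not sharers:
        return None  # nobody shared, so the signal is undefined
    return sum(1 for e in sharers if e["clicked"]) / len(sharers)


def share_rate(events, clicked):
    """Share rate among viewers who did (or did not) click through."""
    group = [e for e in events if e["clicked"] == clicked]
    if not group:
        return None
    return sum(1 for e in group if e["shared"]) / len(group)


# Hypothetical log for one story: eight viewers, one record each.
events = [
    {"clicked": False, "shared": True},
    {"clicked": False, "shared": True},
    {"clicked": False, "shared": True},
    {"clicked": False, "shared": False},
    {"clicked": True, "shared": True},
    {"clicked": True, "shared": False},
    {"clicked": True, "shared": False},
    {"clicked": True, "shared": False},
]

print(informed_share_fraction(events))
print(share_rate(events, clicked=True), share_rate(events, clicked=False))
```

In this invented log, most sharers never clicked through, and people who did click were less likely to share than people who did not, which matches the pattern described as a warning sign.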
Miles O’Brien: Maybe there’s a lot of people just sharing headlines. The headlines are great but they don’t bother reading the story, which probably doesn’t support the headline, right?
Dan Zigmond: Exactly. So, you do have the situations with a headline that’s misleading or exaggerated. And then when people click through they realize that and they sort of lose interest. And so the fact that most of the people sharing are not people who clicked through is a good indication that something like that might be going on.
Miles O’Brien: What I’ve realized here is that a lot of what you do is psychology. There’s a certain component here of human nature, it’s like, “Wow! I got to share this.” It’s like, “I want to do this right this minute.” Can you build an algorithm that anticipates that?
Dan Zigmond: Well, as I said, one of the risks of any ranking algorithm like this that’s based on engagement is that engagement can be misleading. And we probably have all had that experience a little bit like eating junk food or something like that, where what you do is not what you might think you want to do if you could kind of step back and be a little more thoughtful about it.
And so yeah, trying to make sure that we are not just catering to this like instant gratification, the need to just keep clicking, clicking, clicking, that we’re actually delivering things that people if they could step back and think about what they’d like to read today would be the set of things they’d like to see. That’s a big part of our challenge.
Miles O’Brien: And that goes back to that question I posed. I just want to ask it one more time. Clicking is good for business, in a sense, right? I mean essentially, right? You want people to click. Click through stories, click through ads, et cetera. So, it does kind of fly in the face of the short-term profit, right?
Dan Zigmond: But we’re very focused on the long term. I do think in the long run we want Facebook to be a place where people are finding relevant and engaging things and where they feel they’re getting authentic true information. Because in the long run, if they don’t think that, there are other places they can go spend their time. And so, we worry much more about ensuring that it is that platform for authentic communication and less about just like exactly what’s happening day by day in terms of how much time or how much they’re clicking.
Miles O’Brien: So, you guys don’t feel a lot of like quarterly report kind of pressure then?
Dan Zigmond: We have a pretty good division of labor between the folks that are out there selling ads and worrying about that and those of us who are building the core application and ensuring that that’s just a really compelling place for people to spend their time. And I think that enables us to really be focused on the long term.
Miles O’Brien: Essentially you were talking about... yeah, I said day one, and you said ‘Harvard pictures,’ which is after all how it all began. Obviously what it is today is so hard to even... from point A to point B is pretty astounding. How much of a sense of responsibility is felt inside this building, given the fact that two billion people in multiple countries use this as a public square? Does that change the game significantly?
Dan Zigmond: Oh absolutely. I mean, many of us including myself have worked in the startup world where you have no users or a handful of users and you can kind of do what you want. You can change it completely from one day to the next and you don’t feel any great sense of responsibility. You’re absolutely right, we are working on something that two billion people all around the world are using as a source of information, as a way of connecting with family and friends and it’s a huge responsibility, and something that I think about all the time.
There’s never really been a time in human history where someone like me could have any impact on what two billion people were seeing. And so, we spend a lot of time thinking about that trust and trying to be as thoughtful and deliberate as we can about these changes we’re making to ensure that we are honoring that trust and that we are giving people the best experience we possibly could.
Miles O’Brien: This is unprecedented in every way isn’t it?
Dan Zigmond: It is and we are really trying to be thoughtful stewards of that.
Miles O’Brien: You said responsibility quite a bit. Is the company doing enough do you think?
Dan Zigmond: We could never do enough. I think we are doing as much as we can. I come in and work on this every day but it’s never going to be enough. We could always do better and I think all of us want to just keep doing better.
Miles O’Brien: As you start trying to dial things in such a way to eliminate the misinformation as best as you can, is it frustrating this kind of cat and mouse component?
Dan Zigmond: It can be and it’s not always clear what the right set of tradeoffs are. We worry about, on the one hand, we don’t want too much misinformation; on the other hand, we really don’t want to be taking things away from people that they want to see, that they want to share. If you’ve got some article that you’re really excited about and you want to share it with your friends, we’re wary of getting in the middle of that and saying, “Maybe that’s not something that they should be seeing.” So, it’s a tricky balance and we’re always trying to find that right point where we’re being responsible but never censoring anybody.
Miles O’Brien: It’s tough because I suppose you’d be tempted just to take some stuff down and delete it.
Dan Zigmond: Well, of course things that are extreme, that violate our community standards or basic terms of service. I mean, we do take things down that are harmful, hateful, pornography, things like that. But there’s a huge gray area, and I guess one of the things I’ve come to appreciate is really a lot of it is in that gray area where we have to be very careful and thoughtful.
Miles O’Brien: So, better in — the company would be better to put it at item 2,000 than delete it?
Dan Zigmond: Yes. I think, when we use ranking as a way of kind of reducing something’s distribution, it means that the people really who most want to see it still have it there. They can still find it. It’s not being taken out of the public square. But it’s not being sort of pushed on people who may actually not want to be seeing that kind of material.
Miles O’Brien: If you don’t do that I guess everybody has the same size megaphone and that’s not — it’s not historically — the fringe stuff has been at the fringe historically. And now, what you’re doing is by using an algorithm putting it at the fringe virtually I suppose?
Dan Zigmond: I think that’s right. I think we’re trying to ensure that that fringe content isn’t getting disproportionate distribution because it has found some way of gaming this system of engagement.
Miles O’Brien: Facebook says it is a technology company… not a media enterprise, and as such, it doesn’t want to step into the role of being an editorial arbiter. But it is trying to refine its algorithm to put the stuff on the fringe… On the fringe… Which is to say at the bottom of your News Feed. The term is downranking and Dan Zigmond explained it.
Dan Zigmond: I talked about this ranking algorithm where we’ve got signals like time and comments and likes and all kinds of other things. Like, is the person who shared this a good friend of yours, or is it someone that you almost never interact with, is it from a news site that maybe you spent a lot of time reading or one that you hardly ever see. So, that gets you some ranking. If we find a piece of content that we have some reason to suspect is misinformation, maybe because a third-party fact checker has flagged it, then essentially what we do is we take the usual ranking score and we multiply it by some fraction, so that it falls down in that ranking. What that means is that, if this was a piece of content that was already... I mean, let’s say it was number one on your list.
Even after that downranking, it still may be in that top 100 or 200 and so you may still see it. So, it’s not that nobody will see it. But if it was maybe something a little more marginal, maybe from a distant friend or long lost cousin that you don’t actually spend that much time engaging with, and it was towards the bottom of that list, then it may well then drop below the line where you’re very unlikely to see it. And in that way, we end up reducing the distribution of this misinformation without flat out banning it from the system.
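The downranking step described here, multiplying a flagged story's score by a fraction, can be sketched in a few lines of Python. The scores, the story IDs, the cutoff of four visible stories, and the `demotion_factor` of 0.2 are all hypothetical; the interview does not state the actual fraction used.

```python
def apply_downrank(scores, flagged, demotion_factor=0.2):
    """Multiply the score of each flagged story by a fraction; leave others alone."""
    return {
        story_id: score * demotion_factor if story_id in flagged else score
        for story_id, score in scores.items()
    }


def visible(scores, cutoff):
    """Story IDs a person is likely to scroll through: the top `cutoff` by score."""
    return sorted(scores, key=scores.get, reverse=True)[:cutoff]


# Hypothetical ranking scores for one person's feed.
scores = {
    "top_hoax": 10.0,
    "friend_post": 4.0,
    "page_post": 3.0,
    "marginal_hoax": 2.5,
    "other_post": 1.0,
}
flagged = {"top_hoax", "marginal_hoax"}  # e.g. flagged by a fact checker

before = visible(scores, cutoff=4)
after = visible(apply_downrank(scores, flagged), cutoff=4)
print(before)
print(after)
```

With these numbers, the story that started at number one is demoted but stays above the cutoff, while the marginal one falls below it. Nothing is deleted; both flagged stories remain in the inventory.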
Miles O’Brien: So is it working? You’ve been fiddling with it quite a bit of late.
Dan Zigmond: I think it’s making a lot of progress. I think we’re doing a lot better. I don’t think we’re seeing nearly as much misinformation on the site as we’ve seen before. But of course we’d rather have no misinformation on the site and so we’re not done. And I think there is — we’re still looking for first of all better signals of what is misinformation and then always refining these ranking changes to ensure that we’re getting that down ranking exactly right.
Miles O’Brien: Is there any way you can quantify and give me some metric for how you know it is working?
Dan Zigmond: I think we find that once a third-party fact checker has flagged a piece of content as misinformation in our system, the distribution drops by about 80%. We think that’s pretty good. Some of the things that maybe aren’t as good as we’d like is that it takes longer for content to be flagged than we’d like and sometimes it does get a lot of distribution in those early days when it’s first out there. And then we’re always refining: is 80% the right reduction, should it be reduced by 90%, trying to ensure that we’re getting that balance exactly right too. That’s a never ending process.
Miles O’Brien: The process of fact checking sounds great but it does take time. That’s really the rub here as everything spreads instantly before anyone has a chance to weigh in on what’s factual and what’s not.
There’s a famous quote you’re probably familiar with: “a lie can travel halfway around the world while the truth is still putting on its shoes.”
It’s widely attributed to Mark Twain, and I thought he coined it as a matter of fact. But after some careful fact checking of my own, it’s now clear to me, Twain did not invent the phrase.
It was most likely first said by the famous English satirist Jonathan Swift in the early 18th century.
Factual errors, fake news, rumors, and lies have always been with us… What’s changed is the instantaneous global reach.
The truth can’t even get out of bed in time to stop a lie anymore.
Please check out our series on junk news appearing on the PBS NewsHour in the coming weeks… and sign up for my weekly newsletter on all things science and technology… we do spend a lot of time checking facts by the way…
MilesOBrien.com is the place to go to learn about all this.
I’m Miles O’Brien. This has been Miles To Go. Thanks for listening.
Banner image credit: Lawrence Jackson.