What is machine learning? How does it work? What are these artificially intelligent algorithms useful for? Considering they are used by Amazon, Google, Netflix, Facebook and many other companies we interact with on a daily basis, what are the benefits and drawbacks? Thanks to a listener suggestion, we decided to delve deeper on the subject. Miles O’Brien Productions team members Brian Truglio and Fedor Kossakovski are joined by producer and coder Cameron Hickey to hash it out on this special edition of Miles To Go.
The first Hash It Out special episode is Episode 10: Whose Best Interest – Can Facebook’s Business Model Be Repaired? Cameron is also featured in Episode 17: The Software We Wrote to Understand Junk News – with Producer Cameron Hickey.
Brian Truglio: Hello and welcome to the second episode of Hash It Out, hosted by the Miles To Go podcast series. I’m Brian Truglio, senior editor for Miles O’Brien Productions.
Fedor Kossakovski: I’m Fedor Kossakovski. I do writing, web stuff for Miles, and some producing.
Brian Truglio: Each episode, we take a science and tech topic and dive a little deeper on it, usually related to something we are reporting on or working on in long form. This is our second episode of our 27 part series on the Coriolis effect! Psyche… Thank you to everyone; we know that at least one person listened to our first episode, which was on Facebook and the problems stemming from its business model. Fedor, do you want to read the tweet that we received from our at-least-one listener?
Fedor Kossakovski: We got a tweet from Steven Gammon, @sevengammon, but he or she, they have disappeared off Twitter. It has been almost a month, so maybe it's a bot, I don't know.
Brian Truglio: It’s got to be the Russians.
Fedor Kossakovski: Ironically, if it is a bot, they really very nicely asked us to look into AI algorithms, they phrased it that way, I believe, and they were asking: what is AI, can we do a deep dive, why is it important, how does it work? All the implications. Which we thought was a great thing to go off of after looking at Facebook, where really a lot of the underlying issues we were starting to see were from the AI components of Facebook.
Brian Truglio: As you said, you hear a lot about the dangers or evils of AI algorithms, but what is an AI algorithm, right? And this happens a lot, especially in science and tech: we get used to terms, we maybe start vilifying them, and forget, you know, what they actually are.
Fedor Kossakovski: People get so excited about up-and-coming terms. Like, I remember when blockchain, I mean it still is exploding right now, but companies were just appending blockchain to their company names and their stocks were rising. Like the Long Island Iced Tea Company put blockchain in its name or something like that, and their stock jumped like 20 percent. But they weren't even using it for anything.
Brian Truglio: And of course for my generation there was the dotcom boom, that early period of time when people were like: how much money can we make off the internet? So everything was dot com. Well, it's too bad that Steven disappeared, because now we don't know where to send the free Hash It Out toaster that he gets. All right, so before we delve into the topic: Fedor, have you seen the movie The Graduate, by any chance?
Fedor Kossakovski: I actually have not.
Brian Truglio: 1967 classic directed by Mike Nichols, you basically can’t get out of film school without seeing it.
Fedor Kossakovski: Well, good thing I didn’t do film school then.
Brian Truglio: So there’s a very famous scene in that movie where one of the adults approaches Benjamin who’s the main character.
Fedor Kossakovski: Is that Dustin Hoffman?
Brian Truglio: Ben, Dustin Hoffman, at his graduation party early on in the film, and he says, "I have one word for you: plastics." Very famous line, so even if you haven't seen the movie you probably know that line. Fedor, I have two words for you: machine learning. So that is the 2018 version of The Graduate. Okay. And the reason that I bring that up in such dramatic fashion is I turned to a book called The Master Algorithm by Pedro Domingos, I hope I'm pronouncing his name correctly, a professor at the University of Washington. And his thesis is that machine learning is actually so important to us that it will usher in kind of a new era of technical advancement, because in a way we've reached the limits of human computation, let's call it. The internet has opened up access to troves and troves of big data, and there is so much data, in fact, that in the lifetime of a single human it may not even be possible to analyze it all. So we do need to turn to something else to do that analyzing, and that something else is machine learning algorithms. I know that Steven was asking us about AI, or artificial intelligence, algorithms. However, most of the time, what we're actually talking about when we ask things like "what is Facebook doing with all my data?" is really machine learning algorithms. And just to be clear, machine learning is technically considered a subset of artificial intelligence. Our next episode will be on artificial intelligence, which is a little bit bigger topic, but machine learning is basically one pathway to achieving artificial intelligence. So let's break it down into the first question here, which is: what is an algorithm versus a program? Because we hear that word algorithm used all the time, and it sounds a lot nicer than computer program, but the difference is pretty basic. An algorithm is actually just a kind of blueprint for a program. It's really just the logical flow.
If anybody has done any computer programming work in school, you know the first thing you tend to do is a kind of a flowchart, right? And a program is basically taking that blueprint and putting it into an actual programming language. You know, for me, when I was in school it would have been BASIC. And my programs would have consisted of "make the computer repeat your name 50,000 times on the screen" or something like that. You probably encountered a little more complex programming.
Fedor Kossakovski: Yeah, I did a little C++ and some MATLAB. I also actually only ever printed my name too, and failed out of those classes, but…
Brian Truglio: So let me give you a simple example. Okay. And this is an example, I'm actually even simplifying the example that Domingos gives in The Master Algorithm, but we'll call this the Tylenol diagnostic. Okay. So I want to know: should I take Tylenol? And there's two conditions I need to meet: I need to have a fever and a headache. So the algorithm for my diagnostic is: if I have a fever and I have a headache, then I should take Tylenol. And we are not sponsored by Tylenol. Now we can abstract that further into its logical components and it looks like: if A and B, then C, okay? Fever is A, headache is B, C is Tylenol. And so that very, very simple example is basically one of the logical pieces that make up an algorithm. So in a computer you have many, many of these logic gates, and they are in a one or a zero position. They are essentially on or off. Okay, so in this logical example, in the computer I'm going to look at two of these gates, and if A is on and B is on, then I turn on C. So if fever is on and headache is on, then I turn on Tylenol. Very basic example. So to get back to our first question, what is an algorithm versus a program: algorithm, think blueprint; program, think take the blueprint and put it into a programming language. Okay. Simple as that. So what is machine learning? It is algorithms that actually create other algorithms, the natural way of course. It's organic computing. Think about the Industrial Revolution as being something that automated manual work, okay, and then the Information Revolution did the same for mental work, let's say. But machine learning is something that automates automation itself. So think of it that way; hopefully that doesn't blow your mind. Computers enabled the internet, which then created a flood of data, which then created an additional problem of limitless choice, and machine learning helps us solve that problem of limitless choice. Let me take the example of an actual bookstore.
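Brian's "if A and B then C" rule is simple enough to write out directly. A minimal sketch in Python (the function name is ours, just for illustration):

```python
def should_take_tylenol(fever: bool, headache: bool) -> bool:
    # "If A and B then C": the recommendation switches on
    # only when both conditions are on.
    return fever and headache

print(should_take_tylenol(True, True))   # -> True: both symptoms, take it
print(should_take_tylenol(True, False))  # -> False: fever alone, don't
```

This hand-written rule is the "blueprint turned into a programming language" step; machine learning, as described below, is about producing a rule like this automatically.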
OK, I know you may not remember what one looks like, Fedor; in my generation, though, I spent a lot of time in them.
Fedor Kossakovski: A physical store? I don’t understand what you mean.
Brian Truglio: When you go to an actual bookstore, you do not have limitless choice; obviously the bookstore has already chosen what books they're going to sell. There's a lot of them there, but when you go in, you know, they've already pared down your selection, and you really are then just paring that down a little further to find what you like and make your selection. When you put that bookstore online, now you have access to every single book in publication. How do you go into that bookstore and make a decision, right? Where do you start if you don't have a specific book you're already looking for? How do you browse?
Fedor Kossakovski: Yeah I would assume most people go off of, like, New York Times bestseller lists or something, like the popular books right?
Brian Truglio: That’s one way to do it. And interestingly enough, Domingos talks about two machine learning algorithms for two different companies, one is Amazon and one is Netflix, and each algorithm operates in a very different way, and their goals are really determined by their business models. So Amazon, being the limitless choice online bookstore, uses its machine learning algorithm to figure out what kind of books you like and then match them up against the hits, which are the more expensive books. When we look at the algorithm for Netflix, their business model is a subscription business model, and if their machine learning algorithm pushed you just toward the hits, it would be too expensive for them to license all the hits all the time. Instead, Netflix has hundreds of thousands of movies, many of them that you haven’t heard of before, and so their machine learning algorithm really pushes you out into the more eclectic or esoteric films that they have in their collection. They’re trying to get you to plumb the depths of their hundred-thousand-movie collection, which is a more successful business model for them. So to bring that back to the real world example: if you entered an actual bookstore and were met by the Amazon machine learning algorithm, it would take you to the shelf that has the Times bestsellers or the popular books. If you were met by the Netflix algorithm, it would be trying to say: oh, you liked books A, B, and C, so what about this book that not many people have heard about? It covers, you know, all the same things that you like.
Fedor Kossakovski: How much of both Amazon and Netflix’s business comes from these algorithms?
Brian Truglio: It’s actually pretty significant. A third of Amazon’s business comes from its recommendations. And for Netflix it’s even more important: three quarters of their business they estimate comes from their recommendation algorithm essentially.
Fedor Kossakovski: So people just watch one hit and then start watching everything that pops up after, all the lesser known stuff, as they’re probably bingeing.
Brian Truglio: Exactly. It’s designed to keep you bingeing because you’re paying a flat fee for their service so it’s only valuable if it continually has content that you can explore and like.
Fedor Kossakovski: And I heard the CEO of Netflix say at a conference once that their main competitor is sleep. Isn’t that evil?
Brian Truglio: It is evil. I sometimes wish, if parallel universes did exist and we could communicate between them, then I could create a parallel universe where all I did was watch Netflix. But it somehow would have to report back to me.
Fedor Kossakovski: In detail. Give you reviews.
Brian Truglio: So, there you go, Rick and Morty, free episode. OK, so that’s just a little bit about how machine learning is deployed in the real world. When we say an algorithm is creating other algorithms, what do we mean here exactly? An algorithm has an input and an output, that’s traditionally how we think about it, right? The data goes in, the algorithm does its work, and out comes the result. So the data we’re putting in is: oh, I have a fever and I have a headache. And the algorithm is then spitting out the result: oh, you should take Tylenol. Right. That’s in our simple example. Now machine learning turns this around. You put in the data and you put in the result. And then out comes the algorithm that turns one into the other. OK. So essentially we’re telling it that I have a fever and a headache, and we’re telling it that when I have those two things the algorithm needs to recommend that I take Tylenol. And now the machine itself has to figure out what algorithm would give me that result. And so what it does is it goes through a bunch of options, tries a whole bunch of different things, until it gets the inputs to match the outputs, and then it has essentially machine learned, it has created its new algorithm. So in this case we put in fever, we put in headache, and we get out Tylenol, and the machine learning part of it says: oh, the relationship between the two is that you have to have both a fever and a headache, and that’s how I get the recommendation for Tylenol. Once we’ve created this simple machine learning model and we have the right result (we put in the data and we put in the answers), we then take our algorithm, deploy it in the real world, and test it out. And say, OK, it figured out this relationship for this simple set of data; let’s test it on a larger set of data and see how it does. And this is the process by which you develop and perfect your machine learning algorithm.
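One toy way to make that inversion concrete is to hand the computer the labeled examples and let it search over candidate rules until one reproduces all the answers. This is our own minimal sketch, not Domingos’s code, and the candidate list is deliberately tiny:

```python
# Labeled examples: (fever, headache) -> should take Tylenol?
examples = [
    ((True, True), True),
    ((True, False), False),
    ((False, True), False),
    ((False, False), False),
]

# The hypothesis space: candidate rules the "learner" may pick from.
candidates = {
    "A and B": lambda a, b: a and b,
    "A or B":  lambda a, b: a or b,
    "A only":  lambda a, b: a,
    "B only":  lambda a, b: b,
}

def learn(examples):
    # Data and answers go in; the rule that maps one to the other comes out.
    for name, rule in candidates.items():
        if all(rule(*inputs) == answer for inputs, answer in examples):
            return name

print(learn(examples))  # -> "A and B"
```

Real machine learning systems search vastly larger hypothesis spaces with cleverer methods than brute force, but the input/output flip is the same.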
And then once it starts giving you the answers that you want, 90 percent of the time, 99 percent of the time, now it’s kind of ready for primetime and ready to test out in a bigger way. So here we just have this Tylenol diagnostic model, but basically you could apply this to data sets in hospitals, to data sets in patients’ medical histories, and see if you can actually get it to correctly diagnose things. And we’re going to talk to Cameron Hickey a bit more about that later on. Fedor, can you give me a good example, maybe a somewhat simple example, of how machine learning actually works? We looked under the hood of a very simple one, the Tylenol diagnoser, but, you know, something more complex.
Fedor Kossakovski: Right. I think a good example that I found, that was a little technical but I’ll try to break it down, was this YouTube video by 3Blue1Brown called “But what *is* a neural network?” The example they use that I thought was very good was the example of a machine learning algorithm learning how to identify handwritten numbers.
Brian Truglio: Interesting.
Fedor Kossakovski: If I write a number in a 16 by 16 little square, even if it’s the number three, that seems so simple for you: if you look at it, you can identify the number three. It can be squat and short, it can be long and skinny, it can be kind of blocky, right? It can be all these different kinds, but your brain still recognizes it instantly as a number three. That is much harder for a computer to do. That’s what they’re using to train, you know, when you’re signing in and they ask you to identify stop signs or whatever: they are using this training data you were talking about. But to really understand what’s going on: you were talking previously about how computer programs work. They have simple linear questions, going through a flowchart; maybe there’s some branching and some looping and whatnot, but it’s really a simple yes-or-no kind of thing as it goes through the flowchart. The way this machine learning program that can, for example, identify numbers works is a little different. Basically, it mimics more of how our own brain is structured. Instead of being a linear process, it’s a more interconnected process that mimics how our neurons communicate: they have a lot of inputs and a lot of outputs. They are connected to many things, and as some task is given to it and it is rewarded for approaching the right answer, kind of like our brain gets a reward for getting the right answer or not dying, that connection is strengthened. Very similar here. For example, you have a 16 by 16 grid.
Brian Truglio: By my calculation that gives you 256 pixels and let’s think of each pixel as a point of data.
Fedor Kossakovski: Exactly so.
Brian Truglio: And you have to convert that to one of 10 options: 0 1 2 3 4 5 6 7 8 9.
Fedor Kossakovski: Correct. Exactly. So you have your input which is 256 data points of black and white.
Brian Truglio: So basically you draw your number in this tiny 16 by 16 box.
Fedor Kossakovski: Which is tiny.
Brian Truglio: And then the computer looks at that box and looks at which pixels you’ve coloured over. So really the only information it has is whether that pixel was coloured over or not.
Fedor Kossakovski: Correct.
Brian Truglio: So you have 256 points of either black or white depending on how you draw your numbers. And then it has to get to 0 through 9, from that.
Fedor Kossakovski: Right, each pixel value is connected by one of those logic gates like you were talking about. It is not just connected to one, kind of: OK, if this pixel is highlighted, it’s probably a 4. No, it can’t really work that way, because you could write the number anywhere, in any way. So really, each pixel is connected to another layer of these logic gates, and there could be thousands of these. And then there’s another layer, and it depends on how many layers you want to do, and then the last layer is just ten values, right? If we’re looking at it like a black box, what we want to happen is: you give this image to the computer, it kind of parses it out, sends it through some sort of networked chain, and then at the end it pops out a number for you. So if we’re thinking about actually writing a program to do this, the first layer might be trying to find edges, right? So you’re trying to find which pixels are highlighted but have blank ones next to them. Right. So that’s how you identify edges. Maybe the next layer, after collecting all the information from the previous layer, is trying to identify where the loops are, because then you can figure out–.
Brian Truglio: Some numbers have loops.
Fedor Kossakovski: Exactly. Yeah. Or like three lines versus one line, that would be a 1 versus a 4, let’s say. And so you’re just giving it data that you already know the answer to, right? You feed it numbers that maybe you yourself wrote a bunch of different ways, and then you say: that’s a four, that’s a three. Kind of like, I mean, it’s like teaching a child. It’s very similar. And you basically reward the computer for identifying correctly. So at first, stuff is happening almost at random, and you are telling it if it’s right or not. And if it’s right, it strengthens what it was doing, and that’s just by adding weights: this path was very good for getting the right answer, so let’s strengthen it, put more weight on it. And if not, OK, let’s kind of curb it back.
Brian Truglio: And so every time it’s going through, if I understand it correctly, it’s weighting its different analyses. So it’s saying: I see a straight line and I see a loop, I wonder if that’s an 8. And then it says no, and then the next time through it might put those two together and say: maybe that’s a 9. And if you say yes, a straight line and a loop are a 9, then it remembers: OK, in my weighted analysis, where I was giving a lot of weight to a straight line and a lot of weight to a single loop, that leads me to a 9. So it remembers that that weighting is a correct path to get to a 9.
Fedor Kossakovski: It’s kind of like evolution, right? It’s just fast evolution that’s kind of directed. So it’s not really evolution, because evolution is nondirectional; it just fills the niche that it needs to fill. But this is kind of like a directed evolution, where you have a goal that is being evolved towards by the computer. The catch is, though, when you look back into those layers, maybe you think it makes sense for the first layer to be identifying edges and the second layer to be identifying loops, but it can be doing something completely different. Maybe it notices a different way, identifying loops first and then edges, or maybe it doesn’t even look at the numbers the way we would. Like maybe there are some other kinds of hints it is gathering and has codified. And another interesting component is, if you erase everything and you set up the system again, you revert all the values back to baseline and you run it again, you’re going to get a different program, right? Much like if you can imagine running evolution again: if you rewound the clock for the Earth back to when life first arose and ran it again, it’s very unlikely that humans would appear again. Or, I mean, I actually don’t know the mathematics of that, I’m just guessing, you know; it seems unlikely that we would have all the same species in the same kind of arrangement as we do today. Very similarly, this is the same kind of situation. And so I think because of that it’s very powerful, because you are allowing the computer to figure out how to use itself better and find these algorithms that work better. But it’s also a little scary, because it does feel a little organic.
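The picture Fedor lays out, 256 pixel inputs feeding layers of weighted connections that end in ten digit scores, with the weights nudged toward whatever produced the right answer, can be sketched as a tiny network. This is purely our illustration with one fake training image, not the 3Blue1Brown code, and it trains only the output layer to keep the sketch short:

```python
import random

random.seed(0)

IN, HIDDEN, OUT = 256, 16, 10  # 16x16 pixels in, ten digit scores out
LR = 0.5                       # learning rate: how hard each nudge is

# Starting weights are random ("at first, stuff is happening almost at
# random"). First-layer weights are kept positive to keep the toy stable.
w1 = [[random.uniform(0.0, 0.1) for _ in range(IN)] for _ in range(HIDDEN)]
w2 = [[random.uniform(-0.1, 0.1) for _ in range(HIDDEN)] for _ in range(OUT)]

def forward(pixels):
    # Layer 1: each hidden unit takes a weighted sum of all 256 pixels
    # and switches on only if that sum is positive.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pixels))) for row in w1]
    # Layer 2: ten scores, one per digit 0-9.
    scores = [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    return hidden, scores

def train_step(pixels, target_digit):
    hidden, scores = forward(pixels)
    for digit in range(OUT):
        want = 1.0 if digit == target_digit else 0.0
        err = want - scores[digit]
        # Strengthen connections that pushed toward the right answer,
        # curb the ones that pushed away (the weighting Brian describes).
        for j in range(HIDDEN):
            w2[digit][j] += LR * err * hidden[j]

# One fake "handwritten" image: a single pixel on, labeled as a 3.
image = [0.0] * IN
image[42] = 1.0
for _ in range(200):
    train_step(image, target_digit=3)

_, scores = forward(image)
print(scores.index(max(scores)))  # the digit the network now predicts
```

A real digit recognizer would also train the first layer (that is what backpropagation does) and use thousands of real labeled images, but the reward-and-reweight loop is the same idea.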
Brian Truglio: To circle this all the way back to the beginning, the whole reason we started down this path is because the reporting that we’ve been doing on junk news and you should check out our four part PBS NewsHour series that we did on this topic. They’ve actually set up a separate page that collects all four of our reports.
Fedor Kossakovski: Yeah, they have a tag for us of junk news. If you just go on their website and search junk news, it will pop right up.
Brian Truglio: So in order to understand more about how Facebook uses machine learning and how this figures into some of the challenges they’ve had in the last couple of years, we talked to Cameron Hickey. Cameron Hickey coproduced our reporting on junk news with Miles O’Brien. He’s a member of the MOBProd team as a cameraman as well as a journalist. Before Cameron was a journalist, he was a coder and he used his coding to create a very unique tool with which to investigate the junk news sources on Facebook. And we caught up with him to ask him a little bit more about how Facebook deploys machine learning. Hello, Cameron?
Cameron Hickey: Hey good to be here.
Fedor Kossakovski: Hey Cam thanks for joining us.
Brian Truglio: So how would you define machine learning and is machine learning when we’re looking at junk news in the Facebook arena, the kind of things that Facebook is doing with our social data, does that fall under the machine learning category?
Cameron Hickey: I think certain aspects of what Facebook does fall under the category of machine learning. The way I think about machine learning generally is training a computer to make predictions. So the most common way of doing that is what’s called supervised machine learning. So you take a bunch of data that you’ve labeled in some way, right? So say you’ve got a bunch of pictures and you said these are all pictures of cars, and you feed all those pictures to a computer. The computer uses fancy math, let’s say, a complex algorithm, to try and identify what it thinks are the similarities between all those pictures that you’ve told it are all cars. Now the next time you give it a picture and you haven’t told it what it is, it can say: does this look like a car or not, based on what it already knows? So that’s supervised machine learning. There’s unsupervised machine learning, in which you give a computer a bunch of unstructured data and it starts to identify emerging patterns from that. But in the end, either way, what a computer is doing is looking at a bunch of data and trying to extract patterns from it that it can use to make predictions in the future. So for example, in the case of Facebook, what they are doing now, as it pertains to the work that we’ve been studying, is trying to find ways to get rid of problematic content in the News Feed, right? So one kind of problematic content you could imagine would be clickbait. So what they’ve actually done is they’ve paid a bunch of people to look at a bunch of headlines and say: this is clickbait, this is not clickbait. They then took, you know, tens or hundreds of thousands of headlines, fed them into a machine learning model, and trained it to recognize the difference between headlines that are clickbait and headlines that are not clickbait. And it now uses that to predict, in the future, how likely a particular headline is to be clickbait.
And because they’ve determined that’s something they don’t want on their platform, they penalize a headline that has clickbait, so it appears lower in your feed or not at all.
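Cameron’s clickbait example follows the supervised pattern exactly: labeled headlines in, a predictor out. Here is a toy bag-of-words version in pure Python; the six hand-labeled headlines stand in for the hundreds of thousands of human-labeled ones he mentions, and nothing here reflects Facebook’s actual model:

```python
import math
from collections import Counter

# Tiny labeled training set: (headline, is_clickbait).
labeled = [
    ("you won't believe what happened next", True),
    ("ten shocking secrets doctors hate", True),
    ("this one weird trick will amaze you", True),
    ("senate passes annual budget bill", False),
    ("local council approves new bridge", False),
    ("university publishes climate study", False),
]

def train(labeled):
    # Count how often each word shows up under each label.
    counts = {True: Counter(), False: Counter()}
    for headline, is_clickbait in labeled:
        counts[is_clickbait].update(headline.split())
    return counts

def predict(counts, headline):
    # Naive-Bayes-style score: words seen mostly in clickbait headlines
    # push the score up, newsy words push it down. The +1 smooths over
    # words the model has never seen.
    score = 0.0
    for word in headline.split():
        score += math.log((counts[True][word] + 1) / (counts[False][word] + 1))
    return score > 0

counts = train(labeled)
print(predict(counts, "you won't believe this weird secret"))  # -> True
print(predict(counts, "council passes budget study"))          # -> False
```

A production system would use far richer features than raw word counts, but the train-then-predict shape is the same.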
Fedor Kossakovski: So the goal that they’re giving this machine learning algorithm, or this specific section of their machine learning enterprise within Facebook, is to down rank or remove entirely problematic content. But I assume they’re also using it to drive engagement? What else are they using it for?
Cameron Hickey: So they’re using it for everything, right? Every time you interact with something on Facebook, that’s a piece of training data for the algorithm. Every time something appears in your feed, think of all the things you could do with it, right? So the first thing you could do is you could stop scrolling for a minute to look at it, and they can see that, right? Whether or not you press any buttons, just stopping to look at it is meaningful to them, especially, for example, if it’s a video that autoplays. If you choose to click like on something. If you choose to post a comment on something. If you choose to share something. All of those behaviors are teaching Facebook: given a piece of content, this is what I’m likely to do with that content. So if hundreds and hundreds of pictures are shared in your feed and the only ones that you ever like are pictures of animals, then it is going to learn you are more likely to like pictures of animals than, say, people or buildings or whatever. And so then it will combine what it understands pictures to contain with your behavior. And so each individual user is custom training its machine learning model to produce new behaviors in the future for that individual user. You are customizing your experience yourself through the use of machine learning.
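The per-user training loop Cameron describes, where every like nudges up the score of similar content, can be caricatured in a few lines. The topic tags are made up for illustration; nothing here reflects Facebook’s actual ranking code:

```python
from collections import defaultdict

# Each user's learned topic weights start flat and move with behavior.
weights = defaultdict(float)

def record_like(topics):
    # A like is a training signal: strengthen every topic on that post.
    for topic in topics:
        weights[topic] += 1.0

def rank_feed(posts):
    # Score each post by the user's learned weights, best first.
    return sorted(posts,
                  key=lambda p: sum(weights[t] for t in p["topics"]),
                  reverse=True)

# This user only ever likes animal pictures...
record_like(["animals"])
record_like(["animals", "outdoors"])

feed = [
    {"id": "skyline", "topics": ["buildings"]},
    {"id": "puppy",   "topics": ["animals"]},
    {"id": "hike",    "topics": ["outdoors"]},
]
print([p["id"] for p in rank_feed(feed)])  # -> ['puppy', 'hike', 'skyline']
```

So animal posts rise to the top of this user’s feed, which is the “customizing your experience yourself” effect in miniature.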
Fedor Kossakovski: Are they kind of almost building a Facebook for each person. Is it kind of like if I only like animals, I could have a whole animal Facebook. It’d be a very different experience. Does that play out you think from visiting Facebook and seeing how they do the rankings?
Cameron Hickey: Yeah, absolutely. I mean, this is, in great detail, what they explained to us. Right. Every single individual has a totally customized experience on Facebook. So say we both have the same friend, in this case Brian, right?
Fedor Kossakovski: I would say Brian might be one of our mutual friends.
Brian Truglio: Aw, thanks guys.
Cameron Hickey: But there’s a difference between you and me, Fedor: I have kids and you don’t. So I might be much more likely to click like on pictures that Brian shares of his kid than you might, because I naturally do that a lot. So over time, the next time Brian shares a picture of his kids, I might see it and you might not. Right. Same picture, same mutual friend, but the version of Facebook that we see will be completely different.
Brian Truglio: In a way we each have a machine learning algorithm that’s kind of attached to each unique login on Facebook. And this thing is kind of traveling along with us. What is the ultimate goal of this algorithm? Is it eyeball hours is that what they’re trying to maximize?
Cameron Hickey: Well, Facebook would tell you that the ultimate goal is to produce the best quality experience so that you keep coming back. So it may be less hours in a given day if it means more days in a given week or more days in a given year. Obviously their business model is built on advertising, so the more ads you see, the more money they’re making. But for the most part, the people that we actually talked to at Facebook were really focused on something different, which is how to make the experience itself the best for users. And so from their perspective, what they’re doing is trying to make Facebook something that you like, you enjoy, and you keep coming back to, which serves the other business goals, which those people aren’t a part of.
Brian Truglio: Where do you see the most risk when you’re applying machine learning to the kind of social data that Facebook collects.
Cameron Hickey: So I think there’s a whole bunch of risks. The first one, if we start thinking about problematic content, right: if Facebook tries to train its algorithm to identify and down rank problematic content, then it’s impossible to know whether or not nonproblematic content gets accidentally flagged as well, or whether the biases encoded in the original training will carry through to the eventual predictions. So think of the people who are doing all of the coding, saying this is problematic content and this is not, you know, not just clickbait but all the other kinds of problematic content. If they all have political biases, those biases might be translated into the predictive algorithm. So it won’t be that the algorithm itself is, let’s say, more conservative than average or more liberal than average; it’ll be the people that trained it, and it will just be expressing that bias. And this is a big issue when you think about the concern a lot of people have, especially on the right, about what feels like censorship, like the suppression of conservative political content. That’s certainly one concern: that the biases that trained the machine learning algorithm will be embedded in all the future predictions that it makes. Another concern is that what we may be training the algorithm to do and what’s actually good for us may not be the same. So we are frequently interested in stuff that engages us. But that stuff that engages us might be crappy content. And if we think about how we’re training our own version of the Facebook algorithm to feed us new content, we could be training it in ways that are actually detrimental to us, right? So the more junk we like, the more likely it is to want to show us junk in the future. Right. And if you ask somebody, do you want to see a lot of junk, most people would say no, but their behavior may be different.
Right and so there’s a risk there that we’re training it to do something that actually isn’t good for us.
Brian Truglio: So it’s like do we train it for the ideal or do we train it for what we really want?
Fedor Kossakovski: Right and who decides that right? If it’s a company deciding that, they’re probably going to go for profit motivating issues. But you know if there is an oversight of some sort which I think we’ve talked about before if there is a independent overseeing organization they might say you can’t optimize for just engagement without having some kind of other protections in there that would discourage junk news from proliferating. What are your thoughts on that?
Cameron Hickey: What Facebook has actually said to us it’s doing is trying to combine these. So they recognize this problem that I just described: if you interview somebody, they say they don’t want to see junk, but then if you actually monitor what they do, they might like a lot of junk, right? So they’ve taken what I think might be considered sort of a paternalistic attitude and said: we’re going to engineer this so some of the behaviors that you demonstrate are ones that are going to be reflected ultimately in the training that the algorithm has, and therefore what you see on the site. But in addition, we’re going to make some other decisions for you based on what we believe everybody wants. Right. So whether or not you click on clickbait stuff, they’ve decided they’re going to down rank clickbait stuff, right? They’ve started to go in other directions; at least in terms of what you can see in the news releases that they have made, they have said they’re also trying to target divisive content. Now, divisive content’s a really complicated–.
Brian Truglio: Abstract, yeah.
Cameron Hickey: Concept, right. So what is divisive for me might be important political speech for you. And so how they define that isn’t entirely clear, but they have a stated intention to reduce the amount of divisive content, because what they believe is that people want to enjoy their time on Facebook so that they want to come back the next day. And whether or not they click on that divisive content because it makes them angry, they probably don’t like that that’s what they have on Facebook, so if you reduce that stuff then they’ll be more likely to want to come back the next day. We won’t know for a long time whether or not they’re actually doing that or having an impact, and it’s really important to say there’s no way to tell in an objective way from the outside what, if anything, they’re doing. Because, as I just said, everyone’s Facebook feed is completely different. So if Facebook says we’re going to completely down-rank junk news sources and totally elevate high-quality trusted news sources, that may be true for me but it certainly didn’t appear to be true for my grandmother. Right. So you know, each of us is going to have a different experience of Facebook, so none of us can see all the versions of Facebook. Only they can, and they are probably reluctant to spend too much time delving into what every individual’s Facebook feed looks like because that feels like a violation of privacy. So that gets at this bigger risk: the more power you hand over to these algorithms that we train to give us something that seems to have a good output based on some metric, the more impossible it is to dive inside and actually understand what that really means.
Someone took a bunch of health data from a large medical system, you know, a network of hospitals, and they fed all that data into a machine learning algorithm, and what they found was that this algorithm could very effectively predict the likelihood that certain people, based on certain characteristics, would develop some terminal disease. It might have been cancer, I don’t remember exactly what. But they couldn’t figure out what exactly the characteristics were. Right. They couldn’t figure out exactly what the equation was. So it was only by taking that person’s entire medical history, feeding it into a computer, and getting out a prediction that the computer could predict whether this person is likely to develop cancer. But you can’t always look under the hood of the algorithm and say, what are all the exact factors that produced this prediction?
Fedor Kossakovski: Right and I assume that makes it really difficult to, 1) regulate, if that becomes an issue, and 2) it also makes it kind of difficult for the people actually using these algorithms to then if they you know something goes wrong: do you start over? Do you retrain? Is it–because it’s like the middle layers of these machine learning neural nets, it’s kind of like a black box. You don’t really know how everything is weighted and what are the important parameters for all this data that’s coming in. What happens when you do stumble upon some kind of bias that’s encoded, do you just throw everything out the window and start over? Can you go in and change it?
Cameron Hickey: That’s a big challenge that everyone’s facing, right. Each circumstance is going to be completely different. In the case of something like Facebook, they’re continuously training it with all of the data that we have, and they’re able to tweak and modify their algorithms for presentation on a daily basis. And they’re also running experiments constantly. When we were there, what they told us was they’re literally running thousands of experiments concurrently on all of us, on all of the users. Right. So if they’re interested in how changing the color of a button impacts somebody’s likelihood to click on it, then somebody at Facebook who’s running that experiment will essentially take a cohort of Facebook users, let’s say point one percent of Facebook users, which is still millions of people, and run that experiment on them. And then they’ll do the same thing with another cohort: they can make that button red for one group and green for another, run it on millions of people in two different categories, and see what the outcome is. And so they’re doing things like that continuously to try and optimize and figure out and protect against the kinds of concerns that we might have otherwise.
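A common way engineers carve out a deterministic 0.1 percent cohort for an experiment like the button test Cameron describes is hash-based bucketing. Facebook’s actual experimentation system isn’t public, so everything below, the function name, the hashing scheme, and the 0.1 percent default, is an illustrative sketch:

```python
import hashlib

def assign_variant(user_id, experiment, variants, fraction=0.001):
    """Deterministically assign a user to an experiment arm.

    Only `fraction` of users enter the experiment at all; everyone
    else keeps the control experience (returned as None).
    """
    # Hash the user and experiment names together, so each experiment
    # draws an independent random-looking sample of the user base.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 1_000_000

    if bucket >= fraction * 1_000_000:
        return None  # not enrolled; sees the default experience

    # Enrolled users are split evenly across the variants.
    return variants[bucket % len(variants)]
```

Because the assignment is a pure function of the user and experiment IDs, the same user always lands in the same arm on every page load, which is what lets the experimenter compare outcomes between the red-button and green-button groups over time.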
Brian Truglio: You said Facebook is doing this on their own. Is there a way to audit these kinds of machine learning algorithms? Is there a way to run the tests that they’re doing but run them kind of independently and maybe try to I don’t know push extremes or this kind of thing? Do you see any hope in that or is it just hopelessly complex?
Cameron Hickey: Well for me this kind of diverts away from machine learning itself. I think that there are definitely ways to audit the practices of these companies, the way that they operate. I think that it’s challenging to audit the algorithms directly because the algorithms are built on billions of points of data, trillions of points of data probably. And so you can’t really audit on that scale, auditing is sort of like taking a snapshot and looking at it and evaluating it manually in some way right. But you can audit the approaches that they use. Right so you can say how did you decide how to model this algorithm? What were the inputs that you used initially to train it? And you can evaluate that using logic and common sense and a critical eye that’s maybe not as idealistic as some of the people inside of Facebook are.
Fedor Kossakovski: I’m sorry. Cameron common sense? I don’t know. I’ve never heard of this. Is this a machine learning algorithm as well?
Cameron Hickey: I’m certainly working on that one yes.
Brian Truglio: Just to go back to your example of the hospital feeding in a large data set of patients: you’re saying they fed all this information in, and the machine learning algorithm was then able to predict which people would get cancer, but the problem is that people weren’t able to actually look inside the algorithm to see what combination of factors in a patient’s history drove the prediction?
Cameron Hickey: Well, just imagine your medical history, especially if you’re older, has thousands of data points in it. Right. And so a machine learning algorithm can look at all of those and compare all of those to everyone else’s, and then also look at the outcomes. And you could pull out any one person and say what was all the data related to them, but if there were 10 things that were important for that person and 10 different things that were important for somebody else, and regardless the algorithm was able to predict that both of these people were going to develop cancer, then you can’t make sense of it. But the algorithm still was successful in predicting it, right. And the way this is often done is you actually take half of the people and you do this, right: you take half of the data, half of those medical records for example, you train the algorithm on that half, and then you take the other half of the data and you test it with that. Because you know who developed cancer.
Brian Truglio: Right, but you don’t tell the algorithm the outcomes for that other half, and then you see how it did. But in the same sense, like you said, you might not be able to look under the hood; still, if you apply it to the other half and your success rate is 90 percent or higher, then you can say, hey, this algorithm might be ready for primetime. Right, I mean, in other words, you don’t even need to know what’s inside the box, but you can start feeding new patient information into it and it could help you as a doctor. Is that right?
Cameron Hickey: Oh absolutely. I mean, that’s the idea. I believe, in fact, that it was IBM’s Watson that did this. That’s the tremendous potential of these kinds of algorithms. But the unintended consequences are always a concern. Sticking with that same example: if you feed a bunch of some person’s medical data into that same algorithm and it predicts you have a very high likelihood of developing cancer in five years, then what does that do to that person? It’s the same as getting a false positive on, you know, some other kind of cancer screening, right. It causes undue stress on that person. You might even have some kind of medical intervention. Right. So say, oh, it looks like you have a 60 percent chance of developing breast cancer, why don’t we do a mastectomy tomorrow. Yeah, well, that’s a pretty easy decision to make based on an algorithm.
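The half-and-half procedure Cameron describes is a standard holdout evaluation: train on one half of the records, then measure prediction accuracy on the half the algorithm never saw. Here is a minimal sketch in plain Python; the function names and the 50/50 split are illustrative, and real pipelines typically use a library such as scikit-learn:

```python
import random

def train_test_split(records, test_fraction=0.5, seed=42):
    """Shuffle the records and hold out a fraction for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def accuracy(model, test_set):
    """Fraction of held-out outcomes the model predicts correctly.

    Each record is a (features, outcome) pair; the model only ever
    sees the features, never the true outcome.
    """
    hits = sum(1 for features, outcome in test_set
               if model(features) == outcome)
    return hits / len(test_set)
```

The key point is that the test half acts as the stand-in for future patients: if accuracy holds up on data the model was never trained on, that is the evidence it "might be ready for primetime," even when nobody can explain which factors inside the model drive each prediction.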
Brian Truglio: There are people making that decision based on DNA now.
Cameron Hickey: Oh absolutely! And let’s be clear: for the most part, the way that we assemble DNA to make these predictions is machine learning. We’re using machine learning to reconstruct a whole bunch of fragments of DNA into coherent strands.
Brian Truglio: Because DNA is essentially a giant data set in and of itself right. Tell us about your tool NewsTracker and how it works.
Cameron Hickey: NewsTracker started out as a way to monitor junk news sites. I collected a bunch of junk news sites that other people had identified. And then I found all of their Facebook pages and then I built basically a simple way to extract all the content from all those Facebook pages. And very quickly I discovered that people who liked multiple junk news Facebook pages were likely to like a bunch of other junk news pages. So I built a system whereby I could collect new junk news pages that were popular with all the people that had liked all the ones that I was already tracking and started adding them to the system. Then I built a couple of other features on top of it. I built a scoring system–not a machine learning based scoring system a sort of cruder algorithmic scoring system–to say here’s all the things that I’ve recognized are common with junk news. Let me give a point score to each one of those and any time a new source has one of those elements I will add a score to that and say this is more likely to be junk. That’s helped me to sort the content by how likely it is to be junk and then look at it and analyze it. What the tool also does is it connects the different parts of this ecosystem, the pages on Facebook that are sharing the content and the websites where the content is frequently published so that we can identify the networks of this content. So you can see here’s a Facebook page it shares content from these five domain names exclusively. So probably all five of those domain names are connected to whoever owns this Facebook page. Then we can also look at a given domain name and say this domain appears to be shared from these 18 Facebook pages. Well we can start to guess that maybe all 18 of those Facebook pages are owned by the same person. So seeing those network connections between things facilitates investigating the networks of misinformation to try and figure out who’s behind this. 
Is this some independent junk news purveyor, like the guy we profiled, Cyrus Massoumi, or is this a company? I’ve identified a number of different companies that produce many different channels of junk: some that are focused on right-wing politics, some on left-wing politics, some on cooking or, you know, lifestyle content. So it’s a mechanism by which to identify all of these patterns.
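The cruder point-scoring system Cameron describes, adding points for each junk-like trait a source exhibits, can be sketched in a few lines. The specific signals and weights below are invented for illustration; NewsTracker’s actual rules have not been published:

```python
# Each rule is a (description, predicate, points) triple. Predicates
# inspect a dict describing a source. All three rules are hypothetical.
JUNK_SIGNALS = [
    ("all-caps headline",
     lambda s: s["headline"].isupper(), 2),
    ("clickbait phrase",
     lambda s: any(p in s["headline"].lower()
                   for p in ("you won't believe", "shocking")), 3),
    ("recently registered domain",
     lambda s: s.get("domain_age_days", 9999) < 90, 2),
]

def junk_score(source):
    """Sum the points for every signal the source trips.

    Higher scores mean the source is more likely to be junk, which
    lets an analyst sort sources for manual review.
    """
    return sum(points for _, check, points in JUNK_SIGNALS if check(source))
```

The appeal of this kind of hand-built scorer over a machine learning model is transparency: you can always say exactly which rules fired and why a source ranked where it did.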
Fedor Kossakovski: Is this open to the public? Can people use NewsTracker? What are your plans for NewsTracker?
Cameron Hickey: So right now NewsTracker is not available to the public. Very soon, the goal is to take NewsTracker to the Shorenstein Center at Harvard’s Kennedy School of Government and make it a tool that is used by a project that’s in development right now called the Information Disorder Lab. The Information Disorder Lab will be a newsroom-like environment with a team of researchers and journalists who are tracking misinformation across social media, using NewsTracker as one tool, using other social monitoring systems as other tools, and using more traditional investigative techniques, to try and identify specific pieces of misinformation, trends in misinformation, larger narratives of misinformation, and then generate alerts: an Information Disorder Wire, I think it’s going to be called the I.D. Wire, that newsrooms can subscribe to so that they can be made aware of today’s misinformation trends or today’s specific piece of misinformation. Eventually, I believe the goal will also be to make some form of that information available to the platforms themselves, so that the next time a piece of misinformation begins to trend and our lab identifies it, the platforms can be alerted and do their best to mitigate its spread across social media.
Brian Truglio: Do you have any plans to make it an app, something for individuals to use, or is there risk with that?
Cameron Hickey: There’s a whole variety of challenges that we’re just starting to get into. One challenge is that making the information public serves to convey the sources and methods to the bad actors, right. So you don’t want to make everyone aware of exactly what you’re doing and who you’re tracking, or they’ll be more likely to change things up more quickly and become harder to track. So that’s one risk. The other one is there’s a lot of research, that Facebook has done, that academics have done, and even anecdotally I can say, showing that having the information that something is junk doesn’t necessarily change people’s likelihood to engage with it. Being told something has been debunked: you know, for some of us, if we see, oh, that looks dubious, and then you read in the comments somebody posted a link debunking it on Snopes, we’re like, oh yeah, that’s debunked. But for a lot of people, something being debunked by Snopes means nothing. It actually might be reinforcing. So I don’t think we’ve hit on exactly the right way to expose this research, this data, this platform to the public yet. I wish for it to be made available, certainly to other journalists that aren’t necessarily going to be subscribers to the system, so that people can research it or leverage the tools to do further investigation themselves. But I’m not sure that there is, at the moment, tremendous value for the general public.
Brian Truglio: Seems like the bigger problem there is we don’t agree on reality anymore. Right. I mean like one reality.
Fedor Kossakovski: Well, I mean, it’s because everyone has a different Facebook. How are you supposed to agree on reality when everyone has a different Facebook?
Cameron Hickey: Which really says, you know, we could either append a junk score to every post on Facebook, right, Facebook could do that if they want. Or Facebook could just say, we’re going to use our junk score to be less likely to show you this junk; we’re going to use some editorial decision-making to reduce the likelihood that you see it. And I think the latter is more likely to have an impact than showing that information to the public. It doesn’t feel very good to think about it that way, right. It doesn’t feel good to think Facebook is just going to be making these decisions for us. But let’s be fair: they’re making these decisions for us every day, all the time, already. So I don’t think it’s that different.
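Mechanically, using a junk score to down-rank rather than label is simple: discount each post’s engagement score by a penalty before sorting the feed. The field names and the 0.5 penalty weight below are invented for illustration; Facebook’s real ranking model is far more complex and not public:

```python
def rank_feed(posts, junk_penalty=0.5):
    """Order posts by engagement, discounted by their junk score.

    posts: list of dicts with 'engagement' and 'junk_score' keys.
    A higher junk_penalty pushes junk further down the feed without
    ever showing the user an explicit label.
    """
    def adjusted(post):
        return post["engagement"] - junk_penalty * post["junk_score"]

    return sorted(posts, key=adjusted, reverse=True)
```

Tuning `junk_penalty` is exactly the kind of editorial decision the conversation is about: at zero the feed is pure engagement, and as it grows, the platform is quietly deciding what you see.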
Brian Truglio: Can I just jump back for a second? Can you explain how Russian information services were able to piggyback on our tendency toward junk news, let’s say, or a tendency toward divisive news? How were they able to kind of piggyback on that and sway people’s opinions?
Cameron Hickey: Well, so we don’t know if they swayed people’s opinions or not. That’s impossible to measure. But they certainly attempted to engage with people, and they were successful at that. Right. And so I think that Russian trolls, you know, Russian contractors at the Internet Research Agency or wherever else, recognized what was popular on Facebook the same way anyone else can. You can use social monitoring tools to see what gets popular, what is trending, what kind of thing works compared to something else. And they ran experiments, you know. Just like Facebook runs experiments on its users, they ran experiments too. They generated 10 different ads with 10 different headlines and targeted them at 10 different groups, and then saw which groups were most likely to respond to which kinds of ads. And then they made more ads like that. Right. And so they focused on creating content, and then they bought ads to get that content in front of users, and they followed the exact model that, you know, lots of people have been using for a long time around hyper-partisan content. I mean, hyper-partisan content predates Facebook. It’s been there in newspapers and magazines and on television and cable news for the longest time. So this was just the newest iteration of it, one that was super cheap and super easy to produce. They recognized that it isn’t very easy for foreign intelligence disinformation operatives to broadcast a television signal into our homes, but it turns out it’s super easy and super cheap to broadcast on Facebook. So they experimented. They found what worked. They measured the effectiveness of it, and then they just kept on doing it. And I believe they are probably still doing it right this minute.
Brian Truglio: So is there hope for the future? Are we doomed?
Cameron Hickey: I’m a journalist, right, we’re journalists, so I believe that journalism is a critical piece of the solution. I built this tool in order to investigate this phenomenon, to identify the patterns and report on them, and I think that the more investigating that we do, the more reporting that we do, the more we will push this content out of the mainstream. And that reporting will also hold the platforms accountable. And when you hold the platforms accountable, they’re more likely to make changes. I think that everything that Facebook has done to change its platform has come as a result of reporting exposing the things happening on its platform. So I think that it’s really important to keep up the pressure. Right. It’s really important to try and identify new ways that bad actors can exploit these platforms, report on them, expose them, so that Facebook is forced to reckon with it, so that YouTube is forced to reckon with it. Right, it’s not just Facebook. I was just chatting with a guy from Medium who’s trying to figure out how to solve the same problem on Medium, because the same Russian disinformation actors that were on Twitter were on Medium. They just did an analysis and discovered that, right. They’re everywhere. They’re on Pinterest, they’re on Instagram, they’re on WhatsApp. So holding these platforms accountable and holding these bad actors accountable, I believe, is one important way to try and address this problem. The other one is expecting these platforms to shift from this platform-only mentality to some hybrid entity that recognizes its editorial responsibility. Right. Because Facebook already makes editorial decisions about what kind of content it wants on the platform. It does not want hate content. It does not want pornography. It does not want radicalizing content. They’ve said, we don’t like those things, and if we see it, boom, it’s kicked off the platform, right.
You can’t upload porn to Facebook and have it last there for more than a minute, probably. So making some more nuanced editorial decisions about divisive content, I think, might be really important. And there’s a lot of criteria, the kind of criteria I’ve been defining with NewsTracker, the criteria a lot of other people have been identifying, that you can use to objectively identify these other kinds of problematic content. They are doing it with clickbait already. They can do it with a lot of other things. And I think that they can take a bigger stand on that, and eventually we’ll see potential dividends paid. Of course, as I said earlier, there’s a bunch of risks with that, right. There’s the risk of censorship that we always have to be aware of.
Fedor Kossakovski: Fascinating stuff, Cameron, thank you so much for clearing up machine learning and how it applies to Facebook for us.
Cameron Hickey: My pleasure.
Brian Truglio: Thanks Cameron.
Fedor Kossakovski: As Brian mentioned earlier we’re gonna do a more general AI discussion next. Not just machine learning and not just Facebook, but we’ll still be touching on those I’m sure. If you rate and review this Miles To Go podcast, that would greatly help all of us out.
Brian Truglio: Please like us.
Fedor Kossakovski: Please like us, subscribe, I don’t know, do all of it.
Brian Truglio: We like you!
Fedor Kossakovski: And we would like to thank again Steven Gammon, the now disappeared Steven Gammon.
Brian Truglio: Whoever you are, Steven!
Fedor Kossakovski: Thanks for sending that AI topic for us on Twitter. And if you want us to hash anything out you can tweet at us. I’m at @SciFedor.
Brian Truglio: I’m just @btruglio.
Brian Truglio: Absolutely. And coming up on our next episode is artificial intelligence. So check that out as well. Thanks very much.
Fedor Kossakovski: Thank you.