Mapping Misinformation and Russian Influence Online – with Data Journalist Jonathan Albright


Top US intelligence agencies agree that Russia meddled in the 2016 US Presidential election using an organized campaign of online trolling and misinformation. The details of exactly how are harder to uncover. Jonathan Albright, data journalist and Research Director at Columbia University’s Tow Center for Digital Journalism, studies information flow in networks. Recently, he has been mapping how Russian propaganda spreads on the web. I sit down with Albright on this episode of Miles To Go.



Miles O’Brien: Hello and welcome to another edition of Miles To Go. I’m Miles O’Brien.

More today related to our big series for the PBS NewsHour on misinformation–what we are calling junk news.

I’ve been sharing some of the more interesting interviews that we did in the course of this year-long investigation. Details on when those stories will air can be found on my webpage. While you’re there, sign up for the email newsletter and you’ll never miss any of our work.

And as long as you’re on the web, we sure would love your comments and ratings on our podcast.

My guest this week is a leading researcher in the world of misinformation. Jonathan Albright is Research Director at the Tow Center for Digital Journalism at Columbia University.

He has spent a lot of time mapping and trying to understand the ecosystem that delivers us information online, true and otherwise.

It took him right into the heart of the Russian campaign to influence the presidential election in 2016. His work made it impossible for Facebook to downplay its role in spreading lies and rumors that have a political motivation.

Miles O’Brien: What brought you to this whole idea of misinformation and its role in our lives?

Jonathan Albright: I’ve been doing this type of work for years, since my PhD. My PhD was on Twitter. It was on information flows. It was on use of hashtags and especially alternative use of hashtags to create kind of side news channels or parallel news channels by groups that are marginalized or underrepresented. So, I think when I moved into my recent research, what I started to get into was looking at this from a kind of ecosystem-level perspective and a holistic perspective, and looking at this from a network level. So, rather than kind of critiquing one platform or blaming one individual platform, my first dive into this research was really jumping in and looking at how all of these actors are connected and kind of the thematic flows and the connections across the entire network of what were identified at the time as “fake news sites.”

And a better way to put those would be hyperpartisan, or sites that consistently spread misinformation and rumors, don’t verify stories, and kind of put out information quickly without context or supporting evidence. So, when I first got into this, I was kind of looking at the big-picture perspective, I would say, from a network level.

Miles O’Brien: Was there a moment or a particularly egregious story that got you?

Jonathan Albright: I think that it was cumulative. I think that it built up over time and I saw kind of patterns and phenomena in this election, going all the way back to early 2015, that indicated that there was a lot going on. I mean, elections are always this way, but I think that just the length and the duration of the campaign period and the lead-up to it was kind of unusual.

Anecdotally, I saw friends and people that I knew posting things that were insensitive and things that maybe I wouldn’t normally see them post, kind of outrage porn is what I call it. So, I think that I started collecting evidence and data all the way back in 2015, and I guess that it all came together after the election result, I would say, is when I actually started applying some of these datasets.

Miles O’Brien: There’s a lot of money in anger, isn’t there?

Jonathan Albright: Well, absolutely. So, I mean, I think that one of the things that I pointed out a few months ago in a New York Times piece when they asked what can platforms like Facebook do to help solve this, and my response is, “You know, it wasn’t really censorship or regulation, it was more — there are pretty simple fixes.” All of these sharing mechanisms on Facebook, every single one of them are based on emotion. They’re based on anger, they’re based on rage, LOL, I mean they basically have a set of emojis that you can use to share. And they’ve also weighted them higher than the regular like in terms of the algorithm.

So, there’s a prioritization of sharing through emotion and especially one of the most prolific types of ways to get content out there is to create anger or rage and create a response so people — there’s a backlash of kind of — a share backlash.

Miles O’Brien: So, tell us a little bit more about that big initial dive and what you are looking for and what you are after. What was —

Jonathan Albright: I would say that it was more exploratory than empirical. What I did was I took a network and mapped out the kind of major players and the linking patterns within that network. So, I didn’t develop a hypothesis. This was definitely an exploratory network analysis, but it did show some very, very interesting patterns that are still kind of being confirmed to this day. So, one of the things that I saw in that network was that YouTube actually was the most linked-into entity in the network, the broader network of just over 110 sites that were consistently spreading misinformation throughout the election.

So, I was very surprised to see that YouTube and not Facebook or Twitter was actually the hub of a lot of these campaigns to manipulate or to deceive.

Miles O’Brien: We focused maybe too much on text, it’s about video, huh?

Jonathan Albright: Absolutely. This is over and over again. We’re focusing on sites like Twitter that kind of put out a lot of information, but people spend huge amounts of time on YouTube. And in the linking patterns, this doesn’t necessarily translate directly to traffic, or maybe you would have to do separate studies to validate this, but it did show that these sites are over and over again linking to YouTube channels. They’re linking to video, to content, so it’s a pattern that has been kind of underrepresented in research. YouTube is a tough one to study because there’s so much content and there are so many videos that it’s not your kind of traditional big data study. It’s more that you have to filter out and kind of find exactly what you are looking for on YouTube in order to be able to study it, I think, from an academic perspective.

Miles O’Brien: So, the focus on Twitter is a mistake, maybe?

Jonathan Albright: I think that Twitter is inflated and I think there’s a kind of circular feedback cycle where Twitter gets placed as important or hugely important or vital in this process when they’re actually just one of many platforms. So, Twitter, I think, one of the biggest problems with the contextualization of Twitter as kind of an election influence tool or a political influence tool is it targets journalists and influencers.

Most people in the US don’t use Twitter, I mean especially daily. So where Twitter’s role in this is really, especially during the election, was to set the news cycles, to set the news agenda and to really — it’s manipulating a couple of things and they’re not typically voters or regular people. The role of Twitter in misinformation is really to deceive and to skew the news coverage cycle by tricking journalists, influencers, politicians and also there’s a kind of a part in there where they’re using Twitter bots to kind of trick algorithms which actually are in — I consider them a non-human audience.

I hope that wasn’t too much. I mean — yeah.

Miles O’Brien: So, Twitter is to the internet what the cable news networks are to TV: a small number of people actually watching or paying attention, but the influence is important, right?

Jonathan Albright: More or less. It’s more nuanced than that but I think that each one of these platforms really, you know, from this point on, I’m a big proponent of making sure that each one of these platforms is put into its place in the kind of respective media ecosystem and not considered this catch all for everything bad about the internet or everything bad about misinformation in politics.

Each one of them has a specific audience. They play a very specific role. Even platforms as big as Facebook really — they shape this process a very specific way.

Miles O’Brien: So, when you say you are examining a network, what do you mean by network? What is the network?

Jonathan Albright: So, a network is really the idea of understanding the links between different actors at scale. So, if you can take websites and links and domains and you can map those out and lay those out onto what would amount to a plane, so as to understand the shape of kind of the organism that’s producing or directing traffic and information flows, that would be the best way. So links, hyperlinks and URLs shared. So, a network would be…for like a political propaganda network, you know, you could take and you could capture the hyperlinks that all the pages have, so their internal hyperlinks, their external hyperlinks, if certain news sites are linking to other stories, if you could capture that, and this is what I did capture. You can capture things like embedded YouTube videos. You can capture some of the traffic going to videos that don’t go to YouTube but maybe are served by Amazon.

So there are all sorts of ways to understand the links in between the different actors at a structural level, not just websites but also kind of website apps, media and embedded tweets, and then lay this out to look at it kind of top down rather than analyzing individual pieces of content or kind of fact checking. So, this is a different type. It’s more of a spatial analysis and I would say it’s more exploratory than empirical. You can’t go back and compare one thing versus the other, kind of numerically. Rather, it tells you the next places to look or kind of the nodes of activity that then you can use to go in and dig further.
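As a rough illustration of the link mapping Albright describes, the sketch below collects hyperlinks and embedded frames from a handful of pages, reduces them to domain-to-domain edges, and ranks domains by how many distinct sites link into them. It is not his actual pipeline: the `LinkCollector` class, the helper names, and the `.example` sample pages are all hypothetical, and a real study would need crawling, URL normalization, and far more data.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects link targets from anchors and embedded iframes (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.targets.append(attrs["href"])
        elif tag == "iframe" and attrs.get("src"):
            # Embedded video players typically arrive as iframes.
            self.targets.append(attrs["src"])


def domain_of(url):
    """Return the host of a URL without a leading 'www.' (empty for relative links)."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def build_edges(pages):
    """pages maps a page URL to its HTML; returns a Counter of (source, target) domain pairs."""
    edges = Counter()
    for page_url, html in pages.items():
        parser = LinkCollector()
        parser.feed(html)
        src = domain_of(page_url)
        for target in parser.targets:
            dst = domain_of(target)
            if dst and dst != src:  # skip relative and same-site links
                edges[(src, dst)] += 1
    return edges


def most_linked_into(edges):
    """Rank domains by the number of distinct source domains linking to them."""
    indegree = Counter(dst for (_src, dst) in edges)
    return indegree.most_common()


# Made-up sample pages, for illustration only.
sample_pages = {
    "http://site-a.example/story": (
        '<a href="https://www.youtube.com/watch?v=1">clip</a>'
        '<a href="/about">about</a>'
    ),
    "http://site-b.example/post": (
        '<iframe src="https://www.youtube.com/embed/2"></iframe>'
        '<a href="http://site-a.example/story">source</a>'
    ),
}
ranking = most_linked_into(build_edges(sample_pages))
print(ranking)
```

Counting distinct linking sites rather than raw link totals keeps one site with hundreds of links to the same target from dominating the ranking. As Albright notes, such a ranking shows potential pathways and where to dig next, not actual traffic.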

Miles O’Brien: Do you begin with a piece of suspected fake news or a site?

Jonathan Albright: Yeah, I would say that the kind of the list or the seeding is important. So, the thing about networks that make them, I think, a little bit more resilient than a lot of people would realize is that outliers in networks often tend to kind of fall outside of the larger network or the clusters.

So, I think that one of the things is, I guess, that the seeding is important, so identifying a list of actors that are for a specific purpose. So, even if these sites aren’t Russian per se, sites that repeatedly have kind of shared or been involved in producing stories that have been systematically observed as unreliable, as fake, as false, as outrageous, and then taking those and using those, and kind of compiling data from those different resources to expand that larger network and to see all of the actors in between and all of the resources that they actually share.
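The seeding step can be pictured as a breadth-first expansion from a list of known sites. The sketch below assumes the outgoing links have already been collected into a plain dictionary. The function names, the `max_hops` and `min_seeds` parameters, and the `.example` domains are hypothetical choices for illustration, not details from Albright's study.

```python
from collections import Counter, deque


def expand_from_seeds(seeds, outlinks, max_hops=2):
    """Breadth-first expansion from seed domains.

    outlinks maps a domain to the domains it links to. Returns a dict of
    domain -> number of hops from the nearest seed, up to max_hops.
    """
    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        if distance[current] >= max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in outlinks.get(current, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)
    return distance


def shared_resources(seeds, outlinks, min_seeds=2):
    """Domains that at least min_seeds distinct seed sites link to."""
    counts = Counter()
    for seed in seeds:
        for target in set(outlinks.get(seed, ())):
            counts[target] += 1
    return {domain: n for domain, n in counts.items() if n >= min_seeds}


# Made-up link data, for illustration only.
sample_outlinks = {
    "seed1.example": ["youtube.com", "hub.example"],
    "seed2.example": ["youtube.com"],
    "hub.example": ["far.example"],
    "far.example": ["toofar.example"],
}
sample_seeds = ["seed1.example", "seed2.example"]

reached = expand_from_seeds(sample_seeds, sample_outlinks, max_hops=2)
shared = shared_resources(sample_seeds, sample_outlinks)
print(reached)
print(shared)
```

A hop limit keeps the expansion from wandering across the whole web, and the shared-resource count surfaces the domains that several seed sites have in common.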

Miles O’Brien: So basically, what you did was sort of visualize the pathways of misinformation?

Jonathan Albright: Absolutely. So yeah, absolutely, I mean it’s a good way to say it. I mean, I visualize the potential pathways.

So, some of these could be active with traffic. It’s likely that people would click and go between links. It’s likely that people would have viewed tweets that are embedded inside of stories. So yeah, absolutely, it’s a way to visualize and to understand this from, I guess, more of a structural thematic perspective.

Jonathan Albright: Last year, I did a couple of very large scale ecosystem level partisan news media networks and what it showed was, things like YouTube. There were also — it showed some very interesting patterns in how things were getting linked to.

So, Facebook is included in that group, but I think that when things started to come back around that there were specific actors and there were groups of accounts and Facebook and Twitter accounts that were involved in a kind of campaign, then I went back actually into CrowdTangle to find the reach of those and to kind of estimate the stated reach of those. So, what I did for CrowdTangle was actually more compiling and kind of making calculations, I guess, based on CrowdTangle’s estimates of how many posts were shared by these accounts and how many people and how many Facebook users that these posts by the IRA accounts that we had that CrowdTangle had data on actually, putting that all together and then totaling that up I guess.

Miles O’Brien: Now, when Jonathan is talking about the IRA, it has nothing to do with Northern Ireland. He is talking about the Internet Research Agency, a Russian company based in St. Petersburg that amounts to a den of state-supported trolls.

Albright was using a powerful social analytics tool called CrowdTangle, which Facebook acquired not long after the election.

Miles O’Brien: So, just so I get this straight, so you visualize the network?

Jonathan Albright: Sure

Miles O’Brien: And then you dive in to specific places and you use the CrowdTangle tool? Help people understand what CrowdTangle is and does.

Jonathan Albright: The best way to put it is it’s a social analytics tool.

So, it has a few different functions. It’s a dashboard in some cases to see which stories — how stories are performing, how stories are performing against kind of an average on Facebook, how certain stories are performing versus other news outlets. So, it’s more of a competitive analytics tool, but CrowdTangle also has things that it calls intelligence. So, you can go back and you can look at video views over certain periods of time for, I guess, what it considers influential accounts. So, if accounts get a certain number of followers, if a Facebook account reaches a minimum threshold of followers, then CrowdTangle will automatically index that site kind of into its database.

You can use CrowdTangle a little bit in retrospect to look at video views, posts, the interactions around single posts kind of over time. So, I guess — I use the more retrospective, backdated kind of version of CrowdTangle than the kind of real time social analytics like performance tool I guess.

Miles O’Brien: So you’re using it backwards.

Miles O’Brien: Give me an idea of the kinds of – just a little more understanding on the kinds of things you can see. What becomes evident to you when you take these transactions and run it through CrowdTangle?

Jonathan Albright: Well, I mean, so one of the things you can see is you can see growth in followers over time. You can see anomalies. You can see like hugely performing posts on certain days. You can see how things are shared. So, in terms of the, I guess, CrowdTangle data that I used for the Facebook. There were anomalies in there. First of all, the number of total users that these posts were shared to, it was just enormous, I mean it was astronomical.

It was in the hundreds of millions just for a handful of these IRA Facebook pages. So this was around…

Miles O’Brien: So hundreds of millions, what?

Jonathan Albright: Well, it was hundreds of millions of users or accounts or profiles that these were shared to. So, one of the–

Miles O’Brien: Real people or bots or–

Jonathan Albright: So, CrowdTangle doesn’t necessarily, and this would be hard to do even if you were human, figure out which of these accounts are real. On Facebook, I would say, it’s less likely to be bots, so I guess that’s one benefit of using Facebook. It does have probably fewer bots than Twitter overall.

But it gives you a sense of the scale of the activity. Even if these weren’t necessarily real people, there are other factors that play into this, like algorithms that use number of shares to kind of rank certain things.

This absolutely would have an effect on the promotion of certain posts or the kind of acceleration of these posts in the Facebook Newsfeed. I mean, if these are getting that much attention, whether it’s human actors or not, it’s still going to have a promotional effect on these posts in things like the individual Facebook Newsfeeds.

Miles O’Brien: Were you surprised at the reach…

Jonathan Albright: Absolutely.

Miles O’Brien: Yeah. That’s an astounding number.

Jonathan Albright: This is around the time where it was back in the tens of millions. I did that data analysis back at the time when the claim was 10 million. The stated number of posts by Facebook has gone from 10 million to essentially 150. 150 was kind of the ending point, and there was still more work to do.

Miles O’Brien: 150 million?

Jonathan Albright: Yes.

Miles O’Brien: That’s a staggering number.

Jonathan Albright: Yeah, it’s 15, it’s a multiple of — yeah.

Miles O’Brien: These guys are good.

Jonathan Albright: Yeah.

Miles O’Brien: I mean, when you look at it, were you like, “Wow, that’s rocket science,” or was it like, “Oh, this is obvious. They just employed obvious techniques which appealed to human nature and they’ve done a good job”?

Jonathan Albright: The effort was clearly good. This isn’t some kind of random amateur eMarketing. They’re linking every single analytics tool together. So what they’ve done is they’ve linked the Facebook Pixel IDs to things like Google Analytics and Google Analytics tags. So everything that is done when people go back to some of these websites that have been set up, BlackMattersUS would be one of them, I think that these sites have — you can clearly see that they are using very systematic tracking campaigns, link tracking, pixel tracking, background data processing and these things are all linked together in kind of dashboards and measurements.

They’re not only using these platforms to push out content and to push out kind of anger, they’re also using the tools and all of the analytics and metrics, these free things that these platforms have offered to watch and observe and to measure. So there’s really two parts. It’s staggering.
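One way to see this kind of linked tracking from the outside is to pull tracker identifiers out of page source and check which sites share them. The sketch below is an assumption-laden illustration, not Albright's method: it looks for the classic `UA-…` Google Analytics property ID format and the `fbq('init', …)` call from Facebook's standard pixel snippet, and the regular expressions, function names, and sample markup are all hypothetical.

```python
import re
from collections import defaultdict

# Classic Google Analytics property IDs look like UA-1234567-1.
GA_ID = re.compile(r"\bUA-\d{4,10}-\d{1,4}\b")
# The standard Facebook pixel snippet initializes with fbq('init', '<pixel id>').
FB_PIXEL = re.compile(r"fbq\(\s*['\"]init['\"]\s*,\s*['\"](\d{5,20})['\"]")


def tracking_ids(html):
    """Return the set of (kind, id) tracker identifiers found in a page's HTML."""
    ids = {("ga", match) for match in GA_ID.findall(html)}
    ids |= {("fb_pixel", match) for match in FB_PIXEL.findall(html)}
    return ids


def sites_sharing_ids(site_html):
    """Map each tracker ID seen on more than one site to the set of sites using it."""
    by_id = defaultdict(set)
    for site, html in site_html.items():
        for tracker in tracking_ids(html):
            by_id[tracker].add(site)
    return {tracker: sites for tracker, sites in by_id.items() if len(sites) > 1}


# Made-up markup, for illustration only.
sample_site_html = {
    "one.example": "<script>ga('create', 'UA-1234567-1', 'auto'); fbq('init', '111122223333');</script>",
    "two.example": "<script>ga('create', 'UA-1234567-1', 'auto');</script>",
    "three.example": "<script>fbq('init', '999988887777');</script>",
}
shared_trackers = sites_sharing_ids(sample_site_html)
print(shared_trackers)
```

A shared identifier suggests, but does not prove, that the same operator is measuring several sites through one dashboard.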

Miles O’Brien: They’re using every tool in the toolbox.

Jonathan Albright: Absolutely.

Miles O’Brien: Now, you ran through a series of tools that are used and it’s a lot of jargon for the average person. If you could just describe the kinds of tools without using the term, you know what I mean, the kinds…

Jonathan Albright: If you have a website you have to have a Google Analytics dashboard and ID. So what it does is it allows you to see the traffic patterns on your website, which pages people go to, which things people click on, time spent, conversions, products purchased.

But when you start linking that to a tiny pixel that you can input from Facebook, then you can see larger things, such as when people visit your propaganda page on Facebook and then go to other sites across the internet. The thing about these campaign tracking tools is that they’re pervasive. It doesn’t stop at Facebook. So these things actually combine Google Analytics and Facebook and kind of a lot of other platform tools, kind of bundle them together. And then, once you’ve reached that kind of scale, you can see what people that like your propaganda page are buying in stores, or where they’re going on the rest of the internet, where they’re checking in. It gives you a much bigger perspective on lifestyle and even kind of information-seeking patterns.

Miles O’Brien: I assume you have not overlooked the irony of the fact that this great tool that Silicon Valley developed is being turned around against us all by the Russians, right?

Jonathan Albright: Absolutely. It’s true. I think that these platforms have developed just unbelievably powerful analytics tools and it really comes down to who is allowed access to these. Is there any verification of accounts? And I’m not talking about just ads here, I’m not talking about paid placement. I mean the fact that anyone can set up a Facebook page and basically use all of the Facebook page tracking tools, create a Facebook pixel and target everyone that has checked into a Black Lives Matter protest in a zip code over a certain amount of time. You can do that in an hour. You could set up an account and basically target certain people with certain interests who are at a protest, who have recently checked in to an event, in an hour. It’s a self-service dashboard that has really not been policed. It hasn’t been safeguarded in any way to stop these types of profiling and targeting.

Miles O’Brien: It makes me think that Big Brother was patty-cake.

Jonathan Albright: Well, yeah it’s — I guess this is why I have been relentless or adamant about putting in some type of oversight, not regulatory oversight but more oversight into who’s allowed access to these tools or some type of logical system to stop the — because if these things are just freely available or open, we’re going to get to the point where it’s going to become so obscured that we don’t know who’s targeting us, we don’t know how, we don’t know with which data or what data about us, or about our families.

Facebook, just for example, if you like a page, in some cases, that opens up your entire friend network, anyone who is a friend with you to targeting from that account that’s targeting you.

So if you like a Russian propaganda page, there is literally a selection in Facebook’s advertising dashboard that says, “Target friends of people who like this page.” So you are opening up a vulnerability to your entire social network.

Miles O’Brien: So you hinted slightly at the possibility of some way of reining this in, but is that possible?

Jonathan Albright: There’s a lot of different components of this. I think that one of the places to look at this would be data privacy and to look at affording users rights on these platforms, not just for data portability but for control over who is targeting them, or transparency in understanding not just who is targeting them, which I think is kind of coming, but also how they’re being targeted, and I think this is a newer push.

So, companies will provide, in some cases, the data that they use or the genres of kind of advertising that they’re bundling into your targeting package where people that like this thing like this, so this is why you’re — you can go to Facebook’s tool and see which categories that you fall into as a user, which type of operating system or things that you buy or like.

But I think that we still don’t know the process or the mechanism in which we’re being targeted. So if this is obscure then — if the formula is not given to us then we’re just left with a bag of flour and some eggs.

Miles O’Brien: We do not own our data, do we?

Jonathan Albright: Yeah, and even if we did, we don’t understand the ways that the data is used against us and that kind of — it’s weaponized.

Miles O’Brien: Our own data?

Jonathan Albright: Absolutely.

Miles O’Brien: If we did just that one thing and said, “Your data is yours. You can take it, remove it, own it, provide access as you see fit.” Would that change things?

Jonathan Albright: I think it would be a start. I mean, in a lot of ways, for companies like Facebook, it’s in their interest to provide more transparency, because once people start to lose trust in platforms like Facebook, they’re going to lose users. I mean, there’s going to be a drop-off. So once that trust starts to erode, if you think trust is important in journalism and media, trust is very important in the role of social platforms and how people use them. If people don’t trust Facebook, I mean the drop-off is going to be enormous. It’s going to hurt them not only in terms of revenue but in terms of things people share, because people won’t share certain information or talk to their families or communicate on Facebook or Facebook’s tools if they don’t trust what’s happening or what’s going on behind the scenes at the company.

Miles O’Brien: Is there much evidence people are losing faith in Facebook?

Jonathan Albright: There’s a little bit of a backlash. I mean, some of this is probably created by media, by narratives created by the media, because media and Facebook have long been in opposition, right? I mean, I would say that the mass media and Facebook have been at odds. Facebook has pulled a lot of audience and kind of split the, well, talk about fractured trust. Facebook has really gotten into, platforms like Facebook, it’s not just Facebook.

Social platforms have really broken the trust layer that used to exist between news organizations and audiences because they’ve kind of wedged themselves in between and become the arbiter of not only truth but also audiences in the reach.

I think that we need to kind of take a step back and figure out not only what’s happened in the election and the propaganda stuff but also what things went wrong and which safeguards were lacking in these social networking platforms.

They’re not just social, this exists all across, I think, technology in general or modern technology. We need to understand what things will help and which things might be good in theory that actually could hinder other — fact-checking is one and the kind of the quick push towards fact checking, in some cases has very much backfired. I think that understanding and really systematically thinking this through and bringing platforms, news organizations, people in government all together with researchers will definitely help solve some of these huge questions.

Miles O’Brien: So let’s go back at the 150 million which is mind-boggling, what could we say with any certainty about how it might have truly impacted the election?

Jonathan Albright: I think that people often tend to kind of misframe the role of some of these campaigns, these propaganda campaigns. One of the things that’s clear is that it wasn’t all about getting people to vote for Trump. A lot of this was to fuel and seed discourse and kind of distrust and almost disgust with all the candidates. And also, to probably very much geographically target people that might have voted for Hillary to just not vote at all, or they found themselves so much at a loss and just exposed to information time and time again of the system being broken, of corruption, of all the candidates being bad, no one’s a good choice.

I think that that actually probably had a larger effect when done at a very targeted geographic level in battleground states and cities. I think that this didn’t need to be an operation that was successful in every single zip code, every single state. This was very much something that only needed to work in just small areas and at certain time periods.

Miles O’Brien: So it was more of a broadside on democracy in general and maybe specifically targeted at Hillary?

Jonathan Albright: It really was. So a good example is when you look at a lot of the IRA tweets. There’s kind of shifts in the themes in how these and what they’re tweeting about and how they are operating.

In the spring of 2016, they kind of come in, there’s a little bit of defense of Trump, so it’s odd, they’re kind of counter-tweeting, “Well, Trump actually isn’t a racist.” And there’s kind of been this shift, you can see it coming into the French elections. So you kind of come into this thing where they turn very much issue-driven, so they turn anti-immigration, and this can be linked with France and the French elections. So it turns kind of anti-Muslim or anti-immigration and then, from the data that I’ve seen, there’s a big turn coming into September when they started really turning their attention to Hillary. So it’s corruption, untrustworthy, it turns very negative, but it is only focused on her and so there’s not as much promotion of Trump as you’d think.

Most of the issues that these things were pushing on these accounts, and most of them weren’t bots, they actually were, you know, operated clearly by humans, were undercutting the process more and kind of breaking the system rather than trying to win. So they won essentially by kind of disrupting and pulling the rug out from underneath the kind of process that we use to get people to vote.

Miles O’Brien: In your view is it just obvious they moved the needle or do we — is it like trying to say —

Jonathan Albright: Trying to connect this to a causal mechanism quantitatively is going to be extremely tough because there’s so much subjective matter. I mean, there’s so many factors in here that could play in that might not play in that are very, very hard to sit down and kind of put into one study that would say — maybe it could be done on a case study level so in kind of the battlegrounds or the zip codes or the specific regions that pushed him over.

So those kind of final states that the Midwest — I mean, maybe at a case study level it would be more productive to go in, talk to the people, combine that with a media kind of study, a system study, so looking at the tweets, looking at the Facebook posts and kind of then putting that all together. That would be very expensive, very time consuming.

Miles O’Brien: So, it’s hard to connect the dots in a very specific way, but there’s a lot of smoke, so probably there’s a fire burning somewhere. Could we accurately say the IRA was a big player in the election without us really knowing it?

Jonathan Albright: They were a player. I don’t want to overstate the impact of that because I think that there are many other, probably, campaigns like the IRA, whether or not they are Russian.

I think that the component on the fringe right is probably every bit as big as, if not bigger than, the IRA. In many cases, what the IRA did was mirror and kind of piggyback on things that were already set in motion by kind of very, very fringe organizations that push forward a lot of immigrant, racial, ethnic, kind of ethno-nationalist narratives, so I think that in many cases they just jumped onto things that were already in motion. They did probably help it move along.

Miles O’Brien: So respond to the criticism that it’s easy to overinflate all of this because you made a few mentions about how it’s — there are fewer bots on Facebook. But just give me a paragraph on why you think the critique that this is kind of way overblown because of the factor of bots is in play.

Jonathan Albright: I agree with the critique that some of this isn’t humans and there are bots, which I have mentioned before. Where I disagree with the critique is that there is zero transparency and we, especially journalists and researchers, are left with no evidence. We’re left with not even breadcrumbs, really, to put these things together to understand and validate these statements made about the reach of known state-funded propaganda campaigns. So I mean, we don’t know. So having any data and using that, I think, is better than being left completely in the dark. And by putting that data out, I don’t know if this necessarily was the move to increase that number by Facebook.

So when that data went out in the post and I shared it in various data locations, within 24 hours, a reporter had contacted Facebook and Facebook updated their public statement and added Instagram as a bullet point.

That’s the kind of effect that just a little bit of sunlight on these platforms has.

So they actually went in on a Friday night and added a bullet point and said, “Oh, by the way, Instagram was part of this campaign.” I mean, that was huge. That was just one minor… that was one not-minor effect of putting this data out.

Miles O’Brien: But really, one word meant a lot, didn’t it?

Jonathan Albright: Yes.

Miles O’Brien: Let’s talk a little bit about transparency here. Is Facebook transparent?

Jonathan Albright: No.

Miles O’Brien: Tell us about that, because, first of all, when you did the initial CrowdTangle search, you had access to some data which you no longer have access to, walk us through that whole —

Jonathan Albright: I mean, it really was — what happened was, is that they had a lot of content cached in CrowdTangle and I just happened to notice that some of the pages were cached. So —

Miles O’Brien: So these are just old pages that are just out there. They’re not active?

Jonathan Albright: Most deleted pages on Facebook would not be accessible at all via CrowdTangle. So time and time again, accounts that were very suspicious that hadn’t been confirmed returned no data on CrowdTangle, so nothing. So, what’s odd is that for some of the known IRA pages, and this is right around the time the Mueller investigation started, the cache was still there, which technically means they haven’t been deleted.

They were just inactive. But earlier propaganda pages I considered highly suspect often returned no data, no results whatsoever.

Miles O’Brien: But you were able — you had access to this cached information by what means?

Jonathan Albright: Just by their tool. It gave the numbers, all the share numbers. So it gave the total number of shares via Facebook. It showed where the accounts were being shared. So how many times that they received the angry emoji, how many times they received the LOL emoji, the Like button. I think it gave a lot of analytics insight but it also gave timestamps and the text on each post.

Miles O’Brien: That’s a huge amount of data. And technically, is going after that data the way you did a violation of the terms of service?

Jonathan Albright: No. I mean, it wouldn’t be, because it was provided to me at the time. I have the API logs. It’s not like I broke in there.

Miles O’Brien: Where are these inactive cached pages now? Are they available for you?

Jonathan Albright: No. They’re not available to anyone. They cut off access.

Miles O’Brien: What happened?

Jonathan Albright: I knew what happened, which is why I gathered as much data as possible before that story went out. They essentially called it a glitch and broke the tool that was used to access that cache.

Miles O’Brien: What are they hiding?

Jonathan Albright: So technically, I think that they considered it private data, but this had no personally identifying information then. I mean, these posts were all from known publicly — kind of publicly known accounts from a foreign agency. So it’s not like people’s personal information or comments from or replies from individual Americans were in this dataset. This was just information being put out by those accounts.

I thought it was a very important set of data for accountability and for transparency. But Facebook obviously has other reasons to not share that data. What those exact reasons are, I don’t know. But within — in less than a week, the tool was — the access was taken down for everyone.

Miles O’Brien: It’s an interesting dilemma that we’re faced with. We have a private corporation which has a huge amount of influence on our discourse. And yet, not the typical accountability that someone with that level of — permeating so many aspects probably should have, right? It’s a private corporation. They’re entitled to keep their secrets, proprietary information, and yet, it’s become the public square. How do you —

Jonathan Albright: It’s a huge conflict I mean.

So I mean, like I said, what we consider the public sphere, the civil sphere, is really being played out on various fragmented private spaces. It is technically our data but the data is also — the data that we provide, the communication and the insights from that data are mostly their property, especially the insights, right? I mean, the analytics, how many people things are shared to? It’s really their property and it’s really at their mercy. They define the terms by which we understand impact. I mean, how many likes something got, for example, or how many retweets? I mean, they’re literally defining the terms by which we understand the processes of human communication.

Miles O’Brien: Well, they’re sort of de facto makers of law, aren’t they, in a way?

Jonathan Albright: I wouldn’t really — in terms of censorship and in terms of ranking and prioritizing things, I mean there have been ongoing claims for years that Facebook dark-censors posts or that, you know, when people post pictures of certain things on Instagram or Facebook, Facebook down-ranks them and never tells anyone or just silently censors.

There are always conspiracies, but I think that — there has to be some more transparency into the tools, the ways and the mechanisms that platforms like Facebook use to measure, and the kind of data they collect to target. I mean, it just — it really comes down to, “Do we need a public social media? We have public media. Do we need a public social media?” Now, that’s the biggest, probably, firestorm of controversy that would ever be created, but it’s a question.

Miles O’Brien: Well, given where we exchange discourse, maybe there’s an argument to be made now. I don’t know.

Jonathan Albright: Do we need a PBS of a social media?

Miles O’Brien: You kind of implied that there are people pulling levers in a grassy knoll kind of way, manipulating us in a very deliberate way whereas we — I think we’ve been led to believe, “Well, we just built this algorithm. Holy cow! We just couldn’t control it.” Is it a combination of both, maybe, or what’s going on?

Jonathan Albright: It’s a combination of both. Really, the key point in the manipulative propaganda effort is that every tool and every mechanism was exploited and it exists between platforms. So, most things that are found on Facebook can often be found in so many cases on Pinterest. It’s been re-shared to Instagram. It’s been shared from Instagram back into Facebook.

Again, they’re using every kind of eMarketing, sneaky technique that even normal businesses use to really make sure this has reached and has spread as far as it can possibly spread and they have the tools to measure this as well.

I think that it’s a combination of lack of oversight, lack of safeguards, a kind of utopian vision of Silicon Valley engineering the perfect platform that can connect everyone. In many cases, has polarization become worse?

I don’t know. If these platforms are causing worse political polarization than ever before, does some of this have to do with the fact that more people are now exposed to each other, more groups, and more tribes, and necessarily kind of being — they’re kind of intersecting, whereas never before would this happen? We just have so many people that are joining the internet. We have so many people that are online. I think maybe some of the polarization and the kind of effects of what we’re seeing are really just the exposure of groups that, traditionally, without such densely interconnected social media would never have come in contact with one another.

Miles O’Brien: Everybody’s got a megaphone.

Jonathan Albright: Exactly.

Miles O’Brien: It seems to me though that as long as we have an economy that is — the currency of the realm is attention, our eyeballs, none of this is going to change.

Jonathan Albright: I don’t think so.

Miles O’Brien: You don’t?

Jonathan Albright: No, and I mean this is a fair point just to really critique regular media.

Miles O’Brien: You disagree with the idea that the attention economy is the fundamental problem and unless we address that, nothing changes?

Jonathan Albright: I mean, I do agree, and so this is where you can’t just blame Facebook and you can’t just blame Twitter. So I mean, this goes all the way back. A lot of the problems that have been created go all the way back to clickbait headlines. They go back to Web 2.0. I think that we need to go back and just not fault everyone but just understand what’s gone wrong because clickbait has been — there was a while where clickbait was just — stories were so ridiculous just to pull you in to get that view, to get that. So we need to redefine the metrics of attention. We need to redefine, what does the page visit mean?

I mean a visit, just the word “visit,” implies that you’re going to leave. I mean, we don’t leave anymore. We don’t log out of Facebook. We don’t have sessions anymore. We dive into these feeds. It’s ambient. It’s really ambient media. We jump into feeds and information when we have time. So sessions and page visits and bounce rates, I mean, these things don’t really apply to modern — we’re seeing a fracturing of the way that the internet is monetized. This might be another argument for publicly funded social media.

Miles O’Brien: So, we’re doomed?

Jonathan Albright: I don’t know if we’re doomed but I think there is a lot to consider and there is a lot to think about. And it needs attention I think at the highest levels of government. I mean, if these things are going to be worked out, you can’t just let this entity exist as kind of Silicon Valley — this needs to be a national conversation and I think that is happening already.

But we need to be more proactive rather than reactive about dealing with this because this is only going to get worse in the future if we don’t deal with the kinds of problems that have been created, data privacy, attention metrics, I mean, just from every single kind of standpoint of how we communicate.

Miles O’Brien: Well, paint the dystopian future then, if nothing changes.

Jonathan Albright: Well, one of the things that I pointed out is that if we don’t fix some of the structural problems in platforms, especially things like identifying and verifying which ads are placed. Again, they’re not just ads: verifying and understanding how people are being targeted and letting them understand which data is used to target them. If we don’t give people these types of affordances in these platforms, it’s just going to be someone else. It’s not going to be — Russia’s the bad guy now. Will it be China? Will it be North Korea? Were they involved? So we have no idea.

If nothing is transparent then we’re always left in this reactive process where we’re chasing behind what just happened and trying to fix that while something else has already jumped ahead of us. Being more proactive about laws, about data privacy, about transparency. I’m not sure it’s all regulation but I mean it gets to the point where if one platform like Facebook, which owns Instagram, which owns WhatsApp, which basically controls three or four out of the top apps. Facebook basically controls the majority of users in the world. I mean, you have to take into consideration the effects this will have on kind of democracy and especially voting processes.

Miles O’Brien: Does Facebook owe us transparency?

Jonathan Albright: Absolutely, at least more transparency. I’ll be very disappointed if something isn’t moved forward for providing access to the public or data transparency, say, in a year or by the coming midterms. But I think that right now, enough light has been shed and just the point of getting some of this out will provoke enough conversation with a range of different actors, foundations, government, politics, people at Facebook, sympathetic employees.

Not everyone at Facebook is a terrible person. There are very concerned employees, I know. But I think that — I’ll be disappointed in a year if we’re at the same point or at a worse stage.

Miles O’Brien: We’ll come back and ask.

Miles O’Brien: And we promise to do just that.

Jonathan Albright, thank you for your time, and thank you for untangling the web for us just a little bit.

Again, watch for our series on the PBS NewsHour on the subject–it’ll roll out this month, at long last.

Producer Cameron Hickey built his own way of analyzing the web and in particular misinformation. He calls it NewsTracker and he was able to glean a lot of really interesting things using it. We’ll detail that in future podcasts as well as on the NewsHour.

Everything you need to know is on the website, Sign up for the newsletter while you’re there–it’s free and we won’t spam you, I promise. And you’ll never get trolled by us.

So, stay in touch! Thanks for listening. I’m Miles O’Brien and this has been Miles To Go.
