This episode focuses on how best to create mechanisms for outside scrutiny of technology platforms.
The first segment is with Brandon Silverman, the founder and former CEO of CrowdTangle, an analytics toolset acquired by Facebook in 2016 that permitted academics, journalists and others to inspect how information spreads on the platform.
And the second segment is a panel provided courtesy of the non-partisan policy organization the German Marshall Fund of the United States. On June 15, GMF hosted Opening the Black Box: Auditing Algorithms for Accountable Tech, featuring Anna Lenhart, Senior Technology Policy Advisor to Rep. Lori Trahan, a Democrat from Massachusetts; Deborah Raji, a fellow at the Mozilla Foundation and a PhD Candidate in Computer Science at UC Berkeley; and Mona Sloane, a sociologist affiliated with NYU and the University of Tübingen AI center. The panel was moderated by Ellen P. Goodman, a Professor at Rutgers Law School and a Visiting Senior Fellow at The German Marshall Fund of the United States.
The below transcript has been lightly edited.
Today’s show focuses on how best to create mechanisms for outside scrutiny of technology platforms. We’ve got two segments. The first is with Brandon Silverman, the founder and former CEO of CrowdTangle, an analytics tool set acquired by Facebook in 2016 that permitted academics, journalists, and others to inspect how information spreads on the platform.
And the second segment is a panel provided courtesy of the nonpartisan policy organization, the German Marshall Fund of the United States. On June 15th, GMF hosted Opening the Black Box: Auditing Algorithms for Accountable Tech, featuring Anna Lenhart, a senior technology policy advisor for Representative Lori Trahan, a Democrat from Massachusetts; Deborah Raji, a fellow at the Mozilla Foundation and a PhD candidate in computer science at UC Berkeley; and Mona Sloane, a sociologist affiliated with NYU. The panel was moderated by Ellen P. Goodman, a professor at Rutgers Law School and a visiting senior fellow at the German Marshall Fund of the United States. First up, my interview with Brandon Silverman.
My name is Brandon Silverman. I have no title and no affiliation at the moment, but I was one of the co-founders and CEO of CrowdTangle up until about six months ago. Now I’m mostly a dad and then also a volunteer advocate for transparency.
For the handful of my readers… or listeners who do not know what CrowdTangle is or who you are, can you just give the quickest possible history of CrowdTangle and your role in it?
I, along with one of my good friends from college, started CrowdTangle in 2011. And we were a social analytics tool. Originally, we were actually trying to be a community organizing tool, but that failed miserably, and we ended up building a social analytics tool. And we ended up getting acquired by Facebook in late 2016, so we joined Facebook in 2017.
And what we did was make it easy for newsrooms to be able to see what was happening on social media, mostly to help them be able to tell their stories and find an audience on Facebook. But over time, we became one of the main ways that people could look inside the black box of Facebook and see the biggest stories, the latest trends, who might be responsible for a certain narrative, who posted it first, et cetera. It was a powerful way to look at organic content on the platform.
And we went not just from working with newsrooms, but over time to working with nonprofits, human rights activists, academics, researchers, et cetera. And I left about six months ago. CrowdTangle is still around, but the team that used to support it and run it is no longer there. Yeah, that’s one part of the history.
What I would encourage my listeners to do is go and read much of the excellent reporting on CrowdTangle and on the consternation it created inside the company about transparency and that sort of thing. I won’t ask you, Brandon, to rehash all of that, but there’s certainly a history there that’s worth my listeners looking into. But let’s look ahead on some level. Let’s look ahead to what perhaps can be done to make not just Facebook but perhaps all of the social media and information ecosystem slightly more transparent. At least that’s the dream. There’s a, I guess, consensus at this point among researchers, civil society groups, other advocates that… and lawmakers, I would add, in most Western governments that some more insight, some more peek under the hood of these social platforms would help us to address the harms. You clearly believe that. How do you think of that? Why are transparency and researcher access so important?
I think there’s a few angles on it, but I think in some ways the biggest one is that it has been 10 years… It’s been over 10 years since social media has been a fairly prominent part of society and the world and countries all over the globe, and we still know way too little about how it impacts society. And that includes everything from the wellbeing of individuals to is it a net good or net bad for liberal democracy? Does it enhance the diversity of speech? Or does it entrench existing powers? There are all these really critical and important questions, and we just don’t know the answer to a lot of them.
And instead, what we have is a lot of observations, anecdotes, or in some cases, leaked documents from certain platforms, obviously with the Facebook Papers. But we’re trying to cobble together all these different things to understand more about what it’s doing to us as a society and as individuals from a lot of places that just aren’t giving us a fully robust and comprehensive picture. And so I think there’s this growing, I think, realization that we should be trying to understand this stuff better, and it’s not going to happen through voluntary efforts from the platforms. And so there’s a lot of regulators and lawmakers and academics, civil society folks trying to figure out how to create responsible ways to require data from the platforms in a way that we can make progress on understanding some of this stuff.
And then the only thing I’ll add is through… I spent 10 years leading CrowdTangle and I got to see firsthand some of the power that making this data available in responsible ways can have. And so we had examples of everything from election protection groups, helping protect elections in places like Ethiopia and Sri Lanka and Brazil and Myanmar. In the US, we had human rights activists helping prevent real world violence. We had, in some cases, ways in which civil society could hold platforms accountable for what they said they were doing but weren’t always living up to. We saw firsthand what that could look like, and I think that’s partly why I got… how I became radicalized on the idea of the power of transparency. Yeah, I think those are, for me, the highest level things of…
And maybe I’ll just add one more. I could go on for a while about this, but I think one of the other things is it is a very convenient talking point. In the moments of like the most intense public scrutiny of a lot of platforms and large tech companies, they will frequently roll out how much they believe in transparency, but there’s very little accountability or awareness of how much are they really doing? It was like when Jack Dorsey resigned from Twitter and he had his resignation tweet thread, literally the very last line in the very last tweet was, “I always wanted Twitter to be the most transparent company in the world.” And just over and over you see that, but then the degree to which any of them live up to it just completely varies, sometimes it ebbs and flows, sometimes it doesn’t exist at all, and so I think there also just needs to be more rigor and more industry norms around something everybody clearly believes in at once but we just haven’t made enough progress on.
Really quickly, when you think about the existing big social platforms, whether it’s Facebook or YouTube or TikTok or Twitter, how do you rank them in terms of their current status with regard to being open to outside research or outside scrutiny?
I would say a couple things. First is transparency can mean a lot of different things, and so I think in some ways it’s useful to be specific: what sort of transparency are we talking about? The one I certainly feel like I know the best and have spent the most time in is data sharing and letting the outside world have some degree of access to some of the metrics and data and content on the platform. I think one is it’s always important to parse out how complicated and diverse it is.
But secondly, I think Twitter is the best, and has historically been the best. And they’ve been so much better at it than the rest of the networks that, for a long time, there would be all of these public discussions about social media, and what did the research show? And what they really meant was what did the research show about Twitter? Because it was the only place that anybody could get any data.
But that being said, there are also things about Twitter that make it easier. There is far less privacy sensitive data relative to a platform like Facebook that requires people to put in their actual information and private photos and all that stuff. Twitter’s the best, but there are also ways in which it’s easier for them. I think the worst would probably be… TikTok does essentially no functional transparency at all, really. YouTube I think is a little better than TikTok, but not a ton. And they’ve historically been pretty closed off to outside research. And then Reddit, I actually think is pretty good. We had a really robust partnership with Reddit, and so I would put them in the Twitter category.
And then I think Facebook falls somewhere in the middle. They’ve done some generally industry leading stuff, but I think when I talk about Facebook, I think the one important asterisk on any conversation around them around transparency is they’re unique. There are three billion people who use their platform, I think it is. Twitter is a fraction of the size. Facebook is also, in a lot of developing countries, essentially the internet, and so I think Facebook also just has a particular level of responsibility that none of the other platforms have right now. And so I would put the responsibility they have higher than all the rest, and so the expectations are equally as high.
You mentioned that one of the things you’re doing is helping to advocate around transparency measures that are being pursued in Europe and in the US. And on some level, in Europe especially, lawmakers intend to push the hand of platforms and to require that they’re open to outside scrutiny. And there’s provisions in the DSA, there are people who are thinking about how to do that in the context of GDPR. And I know you’ve been a part of those conversations. I guess, from this point looking forward, what do you think is the timeline here? When are the first research projects going to kick off that potentially are minted under this new capacity that Europe is going to create?
I [inaudible 00:10:29] you and are talking right now on Thursday morning, June 16th, and actually this morning there was a huge announcement about they finally released the official language for the code of practice on disinformation, which is one of the three pillars of data sharing and transparency that are going to be coming under the umbrella of the DSA more broadly. I think partly the beginning of the timeline is going to be when the DSA itself also finally gets released and officially codified, but within the code of practice, which mandates some of this stuff, there’s essentially a seven month timeline by which platforms have to have begun meeting some of the new requirements that are built in [inaudible 00:11:07] whichever one’s signatory have agreed to. To my understanding, there’s language in there that says also, if you have anything available now, you should start making that available now. But if there are new metrics and new data points that you haven’t collected, you have seven months to essentially start getting those in order and start making them available.
One of the things coming out of all this work is the development and funding of an entirely new agency that essentially is going to oversee a lot of the hard work. For instance, one of the big challenges in all this stuff is how do you vet which researchers get access to the particularly privacy sensitive stuff? In the process of trying to write some of this stuff, there were attempts to codify who should and shouldn’t get access and put that in the bill. And it was so complicated that ultimately, they realized that they need an entire agency to actually run that entire thing. They’re like, “Well, let’s start and fund an agency.” That’s going to slow down that piece of it because they have to start it and hire it and yada, yada. But ultimately, that body will do it. And I think it was probably the right path forward as well.
I think you’ll start to see some research in the next year, but I think it’s going to be a slow process for some of the really more in depth longitudinal stuff to get out. But there was language that was more CrowdTangle-y in terms of public data and things like that. And there’s a universe in which you start to see some more work around that in a shorter timeframe.
Let’s talk a little bit about the US context as well. I know you’ve been involved in some of the discussions around the Platform Accountability and Transparency Act, which would also establish an office, a Platform Accountability and Transparency Office within the Federal Trade Commission that would also, as I understand it, work with the National Science Foundation and maybe address some of these issues about who gets access? And what projects get approved? And what are the potential risks? And how do you mitigate those? What’s the status of this particular proposal in the Senate? And what do you think are its chances?
Yeah, so I’m not a professional politician so I will put out my predictions, but there are people who watch this stuff a lot closer than I do. I think the biggest challenge right now in any legislation in Congress is just the midterms. The closer we get to the midterms, the likelier that meaningful stuff stops happening, especially I think if you have one party who feels like they’re probably going to coast into new majorities if they don’t rock the ship too much. I think it is a challenge to the midterms.
I think in terms of PATA as a standalone bill, if I… My sense right now is it’s probably not going to get introduced formally given how little time is left. And I think the likeliest path at this point is actually that pieces of PATA are added to another piece of legislation that is able to make it through. And there’s a lot of talk right now about Amy Klobuchar’s antitrust bill. And I wouldn’t be surprised if they try and attach some of this transparency stuff to that, if it gets passed.
One of the ironic but challenging things about transparency… And I talked to a lot of different congressional members and Senate members over the last few months. And everyone supports transparency, but in some ways that’s also a challenge for it because it’s not a particularly galvanizing political lightning rod where people… Nobody has constituents who are beating their door down around more data sharing, and so it means getting on the docket is hard just relative to other people’s priorities. We had a lot of people who supported it, but it was just I have other things that are more critical to me. But the good news is maybe that makes it easy to attach to another bill, and so I think that right now for getting something through this cycle is the likeliest path.
But one last thing I’ll just say, though, is I generally believe there’s a lot of value to even working on this legislation even if it doesn’t get passed this cycle. There was a lot of interest and awareness generated through the hearings. I think there are new funders thinking about getting more involved in the space. I think there are genuine challenges in getting the legislation right at a detailed level. And there were moments when I was helping write some of the actual legislative text, and I was like, I probably shouldn’t be writing legislative text, but I…
And then also there’s, I still think, negotiations to be had among different parts of civil society around this stuff. In the Senate hearing, you could literally watch myself and Jim Harper from AEI who’s way more protective of, or thoughtful about the First Amendment concerns. We were literally almost negotiating how to balance those in real time during the hearing. And I think there’s more work to be done on just figuring out what the trade offs are and where the privacy community is comfortable, where the First Amendment community is comfortable, et cetera. And so I think even, in some ways, having more time to get some of that stuff right is also not the worst thing in the world.
As I understand it, you were also involved in or at least an advisor to some of the individuals that worked with the European Digital Media Observatory to put together a kind of… well, a proposed code of conduct around independent researcher access. That was a very detailed document. It even had templated forms and ideas about particular protocols for how approvals might work. It made me wonder, though, a little bit about how industry still gets a say in what types of projects get done or don’t get done. Even in the idea for the Platform Accountability and Transparency Act, there’s still some notion there, of course, that industry has input into and influence on what types of research projects get approved. How do you avoid industry capture in this? How do you avoid the companies not permitting certain projects because they think they’re not in their interest to pursue?
Yeah. And I should maybe just say out of the gate, for the code of conduct, the real person who was the real champion and did the real hard work around this stuff was Rebekah Tromble who’s at GW and helped shepherd that entire process and was a massive effort and lift but a huge contribution to the whole space. And so I helped a little bit, but the level of work some other people did around that was gargantuan, especially Rebekah.
I think your question is a good one and also partly why this stuff is so hard. And I actually worry. In a lot of ways, there’s a huge power imbalance at the moment. The status quo is one in which the platforms can decide completely on their own what they do and don’t want to do, and so you have some entities doing a little bit and dipping their toes in the water but maybe not giving it the resources it really needs, then you have others who are just completely stonewalling anything, which is why I think ultimately hoping for more robust voluntary efforts is not going to be the answer. Or for, in American political parlance, the market’s not going to solve this, and so what you need is some mandatory requirements.
I think in some ways, if you have those mandatory requirements, some of the platforms would actually be glad that that happens because I think one of the challenges we had at CrowdTangle is we were an example of a voluntary transparency effort that Facebook was doing that almost no other platform was doing. And one of the challenges was, well, why are we putting ourselves out on a limb when nobody else is? And so I think if you can create a level playing field where everybody has to be held to the same bar, in some ways I think some platforms would be like, “Great, we’ve been waiting for that. We support that.”
But to your question around regulatory capture, I’m not sure I totally know the answer to that question, to be honest. I think probably the best answer is whatever regimes are set up need to have a handful of the following characteristics. One is they need to be flexible. We can’t enshrine commandments that are written in stone right now on how to do this because the industry is just changing too quickly, and so, one, I think you need to build in some sort of flexibility into whatever systems are set up. I think two is you need to be mindful of some of the totally reasonable concerns that the platforms have around… GDPR mandates some privacy things that make it really fricking hard to give some of the data you want to external people. And simply waving that away is not a sustainable, long term answer. There are at times genuine trade secrets and feasibility issues, but I think ultimately what you need to do is find a balance where some of the decision making is just taken out of the hands of the companies because the voluntary stuff has just not yielded the data we need over the last 10 years.
In some ways, what I worry about more than industry capture is industry stonewalling. It is very hard to write in specific language exactly what data you want and need because the platforms are so different. What you might need two years from now might be vastly different from what you need right now, and so there is… I think one of the things we’re just going to have to watch out for in the EU, and then hopefully at some point in the US and other places, is are platforms genuinely trying to meet the spirit of the regulations, or are they simply meeting the letter? And if you meet the letter, there are ways in which some of this can be way less useful than I think what everyone is hoping for.
And that’s not to mention if they want, going through the courts to litigate every single possible privacy nuance, every single trade secret, potential conflict, every single feasibility one. And so I think this is going to… it’s going to take a while to figure this out. Some of it will absolutely be litigated here in the US. It is very difficult to require data sharing without bringing up First Amendment issues given the state of the judicial system and the courts, which would consider that stuff compelled speech if it weren’t written really well. One of the big questions is how much the platforms are going to actually fight against this stuff. And they’re going to at some point, and so it’s just going to be part of the next chapter in the story.
I do want to talk about the temporal dimension of this. Some people have pointed out, well, it’d be great to have all this access to data and to see how these platforms work, but the reality is you’re going to take a snapshot in time. A lot of the time, a lot of research studies will take an approach that is time bounded in some way, and maybe the platform’s completely different by the time your results come out. Maybe the algorithm’s already been changed or the interface has already been changed or some other variable has been adjusted such that the results aren’t even relevant anymore. How do you think these regimes can prepare for that?
Yeah. I could not agree more, so I’m giving jazz hands over here for non-video formats. But yeah, so I think the answer is a couple things. First is you need several different types of transparency. You need several different layers of data that you are providing to different audiences and different types of data and different structures for who gets access and when.
I think about it in some ways like a funnel. At the top, you can have essentially the widest possible audience. It’s an inverse pyramid. At the top, it’s the widest. And that is data that you can share with essentially the public. And so we have versions of that right now that are mostly in reports. Aggregated statistics, Facebook and Google and some others do these basic community standard enforcement reports where they say, “We took down X number of this and Y number of that.” That’s data sharing, and that’s available to the public writ large. And there’s essentially no privacy concerns.
CrowdTangle was a layer down. Most of it was not available to the public writ large, but it was available to essentially any organization working in the general public interest space, from newsrooms to human rights organizations to economics. That data had a little bit of privacy risk, but not much. And what we could also do with that data, because we weren’t looking at the entire corpus of everything happening on the platform, we were just looking at a sliver of particularly important influential accounts and content, we could deliver it immediately. It was real time. And so if you were 24 hours out from a major election and you wanted to see if people were spreading voter suppression content, or if there was a shooting somewhere and you wanted to go and see if any of the violating content was being passed around, you could do all of it in real time.
And there’s an incredibly important role for that particular type of data sharing. It’s for all the reasons you just mentioned as well as like a ton more, but the platforms change constantly, content comes and goes insanely quickly, so even if you have data sharing that’s, oh, once a quarter, we do something, or even once a week, you will… Even that constitutes essentially a historical look back for how quickly these platforms move. And then at the very bottom of this funnel, for the most narrow portion at the bottom, is you have super privacy-sensitive data sets that might come from anywhere across the entire corpus of a platform where they’re in a position to share it. That is the sort of thing where I think you get into more longitudinal, more robust, oftentimes peer-reviewed, the entire life cycle of doing more traditional academic research, and a lot of that’s going to take time. In some cases, that’s fine, because I think… In some cases, they might be looking at particular aspects, dynamics of a platform, that don’t change as much. But also, there might be things that they also look at, that by the time it comes out, are less relevant than I think they hope. I think the main answer is you need a variety of different types of data sharing and you [inaudible 00:24:56] combine all those together.
Does that create a strange incentive for the platforms to stay ahead of external researchers? I’m just thinking about, for instance, Facebook made a huge tranche of data available to researchers, including Rebekah Tromble and others, around the 2020 election, and we haven’t yet seen any real publications come out of that project because of the pace of the research and how long it takes to go through and deal with the massive amount of information and to go through peer review and to go through the various processes. We’ve got another midterm election coming up here and we don’t have the benefit of those learnings in the public domain. It’s quite likely that Facebook has done its own research. It’s looked at the numbers itself and it’s already implemented a variety of changes that would address whatever results may come from that particular consortium. I guess, I don’t know, I’m just imagining this dynamic where the platforms are incented, maybe in a good way, but also maybe in a slightly defeating way to stay ahead of whatever results might come from these transparency efforts.
I am not in the consortium that’s involved directly in the US 2020 research, but my understanding is that some of it is the natural life cycle of doing academic research, but some of the reason we haven’t seen anything yet is also because of all the problems we’re trying to solve through this legislation, that there are also ways in which Facebook was just not particularly well set up, staffed, resourced. There was regulatory stuff they felt nervous about, because it wasn’t clear in legislation. I think that the pace of that particular project, I think, is partly natural academic stuff, but I think a bunch of it is actually because we haven’t solved and mandated some of these requirements through legislation. And so, Facebook just didn’t have all the systems in place that they should. Listen, I think one of the benefits of transparency is that even if no research ever comes out of it, the mere fact that they know they’re going to have to be more transparent should actually encourage a bunch of behavioral changes internally around this stuff.
And actually, that’s why it’s such a powerful mechanism. If it turns out that because they know research is coming, they change a bunch of stuff, I don’t know, maybe that’s not the end of the world.
Pardon me for, I guess, pushing a thumb into some of these issues just to think them through a little bit, but I guess there’s one last one I’ll bring up, which is potential downside of transparency. By the way, I’ve spilled a lot of ink on Tech Policy Press and in this podcast talking about the benefits of it, so keep that level set on that. I am very much a fan of these proposals and very much hope that we will see some of this legislation pass and see, of course, the European access come soon enough.
But one of the things that people are so scared about when it comes to social media platforms, of course, is their ability to impact populations at scale, potentially to make changes that have far-reaching consequences and change political outcomes and change the nature of discourse. Is there any fear that you have based on either any of the work you’re doing around transparency legislation or legal frameworks at the moment, or just your experience of having worked inside of a large social media platform, that to some extent, more transparency might also mean more common knowledge about the ways in which social media can be used to do some of those things people are afraid of?
I have many fears, so I will attempt to go through all of them. I think there needs to be more discussion about this, because there are a lot of downsides. There are a lot of potential downsides. Honestly, one of the challenges is getting this right and trying to avoid as many of them as possible, but I will just start going through them and tell me when to stop before we run out of time. I think one is simply writing this legislation and getting it through in a way that is… In the US, [inaudible 00:29:09] constitutional is really hard. There are real First Amendment challenges to requiring private companies to share data. The way the First Amendment is interpreted by a lot of people, especially on the conservative side, is that the First Amendment protects both people and corporations’ ability to speak as well as their ability not to speak if they don’t want to.
And so, when you require them to say things they don’t want to say, that’s considered compelled speech. There has to be an overarching public interest reason to do that, and what that interest is is entirely up to, ultimately, the justices that sit on the Supreme Court, and there are very differing opinions on what level of public interest outweighs a corporation’s right to not speak. One is that we could just write a piece of legislation that gets struck down, and then that could set back this entire thing for a long time. One is, I think… There’s one set of harms that are big just on making progress, and if you don’t do it well, it could set back all the efforts. There’re similar challenges around GDPR in Europe. I think a second set of harms is that… I think there’s a real… One of the things you want to avoid is malicious uses of data that I think don’t accomplish the things that are the reasons those of us who believe in transparency are in it.
One of those is enabling more surveillance at a government level. One of the questions is, can government access this data? Let’s say they can. Does that outweigh or totally negate the benefits you get from also giving it to academics and researchers, and how do you balance that trade off? I think a third one I think a lot about is that transparency’s not going to solve everything. There is no solution to everything. With social media and the internet around, we are going to be negotiating and debating and figuring out how to make it as healthy and constructive as possible for as long as we’re all around. There will never be any solution to anything. We’re just going to constantly be trying to make it better. When it comes to making more data available, there is going to be a lot of research and data that’s just immediately politicized.
In some cases, it will be misleading political uses of this stuff, but suddenly have the halo of more robust data instead of either leaked documents or anecdotal stuff. Data can always be politicized, misused, misinterpreted, misled. The more of it you make available, the more likely people are going to be able to do that. Now, I also think… For me, the counterweight to that is the answer to bad analysis is just more analysis. And so, the bigger the ecosystem [inaudible 00:31:37] people who can look and debate and negotiate can mitigate that, but there will be totally annoying and frustrating uses of data, and I think that’s something we see in almost every field.
I think there’s also a case to be made that… The privacy community, members of the privacy community who I really respect, part of their overall thesis is that these companies shouldn’t be collecting this data at all, let alone be making it available to anybody else, even if it’s in privacy safe ways. If you enshrine this, what you’re actually doing is giving a stamp of approval to the fact that they even collect a lot of this stuff that a bunch of people, smart people in society, think they should never collect in the first place. Are you locking in a business model through some of these efforts? I think the regulatory agencies and bodies themselves could also be politicized as well as the data itself. And so, how do you try and avoid that? Yeah, I could probably go on, but I think these are real and some of these will a hundred percent happen.
In the next six months, assuming we don’t see something happen in Congress, what do you think should be the priority for advocates of transparency in the US?
I think one of the interesting dynamics of this space is how much everyone loves transparency and everyone agrees with it. It’s almost like you don’t need to make the case for it, but I think the bigger gap is making the case about why platforms and other entities who talk about how much they believe in it aren’t doing it as much as they should.
I think finding ways to hold platforms more accountable for the delta between what they talk about and what they actually deliver is, I think, a huge opportunity. I know there’s actually some people working on this as well, including everything from transparency scorecards to other things that I think are going to help inform the public. But no, I’m a fundamentally optimistic person. I actually think there’s been a lot of phenomenal progress in this whole space over the last six months. It feels to me like the tide is flowing in a very clear direction. It’s really just a question of how slow it is and how hard it is and how many mistakes we make along the way.
An optimistic note’s a good place to end, so Brandon Silverman, thank you very much.
Thanks, Justin. Appreciate it.
Just as there is a focus on the necessity of external insight into social media platforms, there is increasing awareness of the potential harms and discrimination that can result from algorithmic decision-making on all technology platforms. That was the premise for the panel discussion “Opening the Black Box: Auditing Algorithms for Accountable Tech” hosted by the German Marshall Fund on Wednesday, June 15th. Here’s Rutgers Law professor and German Marshall Fund visiting senior fellow, Ellen P. Goodman, to introduce the discussion.
Ellen P. Goodman:
We’re really excited about this topic, and I want to introduce our panelists. We’ll start with some Q and A, and then we’re really going to open it up for the panel to ask questions. Also, we’ll save the last 10 minutes for the audience to ask questions and we’ll put that in the Q and A. We have Anna Lenhart, who’s an information scientist and senior tech policy advisor to Representative Lori Trahan in the House of Representatives, and she represents the tech quarter outside of Boston. Deb Raji is a computer scientist and Mozilla fellow working on AI accountability mechanisms.
Mona Sloane is a sociologist at NYU, working on technology and inequality. I want everyone to notice the interdisciplinary nature of this panel. I’m a lawyer, and I think this highlights the interdisciplinary nature of algorithmic audits and how, when we think about algorithmic audits, we really need to think of them not only in terms of interrogating the tech, the code, the parameters of algorithms and algorithm systems, but also the socio-technical context in which they operate, including the humans who are either in the loop or not in the loop, in implementing algorithms and AI systems.
Let’s kick this off, Deb, with general insights about what are algorithmic audits, and if you can say a little bit about the different kinds of audits, which are not always clearly highlighted in the policy space.
Yeah. That makes a lot of sense as a starting point for this discussion, for sure. Yeah, I think that the term algorithmic audits or the term audit used in the algorithmic accountability context has been purposefully loose just because of the range of participants that identify as auditors or those engaging in oversight. In a recent paper with Dan Ho at Stanford Law School, we actually dug a little bit more into the range of audit systems that are existing and how this connects with the way that auditing has emerged in the algorithmic space. I think one big taxonomy that we’ve anchored to is the reality that there’s at least two main categories of audit. There’s the internal audits, which are executed by internal stakeholders or stakeholders that have some kind of contractual obligation to the audit target.
These are people, maybe a consultant or a contractor that’s doing an internal audit on behalf of that organization. They are very focused. Internal audits are very focused on compliance objectives and very much aligned with defining the criteria for the audit based off of the organizational and institutional criteria. That’s very different from an external, what we call an external audit, which tend to be third party institutions, and definitely have no contractual relationship with the audit target, and their objectives are externally defined. Their expectations for this system could be based off of an external standard or a proprietary standard, but definitely not anchored to anything defined by the institution itself, and also not focused on compliance as much as focused on whatever organizational objectives these auditors have. It’s often in protection of a group that they represent.
That, for us, was a really helpful dichotomy of internal versus external oversight, because it defined a lot of the relationship between the audit target and the auditor, and that really helped us with defining all other aspects of the audit and the different methodologies used by the two different groups and the different goals of the two different groups. That’s really the taxonomy that we’ve anchored to at this point. Yeah, an algorithmic audit, in general terms, would be any kind of inspection or external oversight or independent oversight of the deployment of an algorithmic system. At least in my work, I’m very interested in algorithmic deployment, so meaning that the system is either intended to be deployed or integrated into an actual ecosystem or it’s already out there and it’s this post hoc situation where we’re looking at the system after it’s already been deployed.
The goal of the audit is to assess or evaluate how the system behaves in deployment versus the expectations of that system held by the auditor. The definition that I’ve been using lately as well has been, it’s not just necessarily about the evaluation, but it’s also about taking that evaluation result and integrating it into some larger process of accountability, so making sure that the evaluation is not just an assessment or an inspection and you leave it there, but also trying to integrate that assessment into a broader accountability frame where we’re actually changing the system, or there’s some kind of consequence as a result of that assessment. I think that, for me, it’s what shifts it from just purely QA assessment to an audit where there’re consequences as a result of what that evaluation tells us. And so, it can actually factor into accountability outcomes for the organizations involved.
Ellen P. Goodman:
Yeah, that’s so important. I really appreciate your pointing out that there’s an intentional vagueness around algorithmic audit. I think one of the reasons is that it’s almost a desperate “Hail Mary” that we don’t know how to regulate [inaudible 00:40:58] some of these systems or we don’t want to. And so, let’s just leave it to audits and hope that achieves substantive goals. Let me turn to Mona about some of those substantive goals, if you can talk a little bit. You’ve done really interesting work, especially on hiring algorithms, how algorithmic audits might fit into some of our substantive goals for algorithmic systems.
Yeah. Thank you, Ellen. Thanks, Deb, for that very wonderful setting the scene. I think what is really important, to underline what Deb already has said, is to think about audits as a tool for broader accountability agendas and processes, and thinking about those in terms of not just the whole life cycle of one system or one product, but maybe even the general genealogy of a whole industry, perhaps even. What are we even trying to do here? And so, the important bit here, I think, is that we develop an understanding, a shared understanding, of audits that are pushing beyond just inspections. That’s something that’s really interesting for me as a sociologist, bringing to the table an understanding of what is actually the social process that underpins an inspection.
That is very often a concern for safety, the fact that we’re measuring something against some sort of standard. Now, that’s very complicated, and Deb said that. That’s very complicated when we’re looking at that in the context of AI systems or algorithmic systems, because they constantly evolve. What’s really key here is to find a way to bring together the premise on which a system is built or based with how it technically works and how the interplay between the two of them actually can disproportionately disadvantage, oppress, and harm certain populations. In other words, it is insufficient to just think about the audit in the traditional sense, the traditional social process, where it’s just an inspection about a set standard, but bring it together with the idea. What idea is the system actually materializing?
And so, for example, in the context of hiring, which is work that I’m doing with [inaudible 00:43:20] at NYU, we can think about certain ideas of how, for example, or what serves as a proxy for job performance, and is that idea actually grounded in actual science or is that an idea that is pseudoscience, essentially? What are we actually putting at scale here? We can actually meaningfully bring that together by way of looking at these constructs and looking at how these constructs are operationalized, and then assess how this actually performs or how they actually perform in the wild. That is actually an external audit that can be done. I’m happy to talk more about that. It gets a little into the weeds and geeky, but I think that’s the way in which I would think about audits, AI audits, as pushing beyond inspections.
Ellen P. Goodman:
Thanks, Mona. I want Anna to jump in here and talk a little bit about how we can think about these purposes of audits and the role of audits in the policy context.
Yeah. Thank you. Thank you, everyone. It’s great to be here with people I respect so much on these issues. First, I think algorithm audits are a really important part of a regulatory toolkit for all the reasons already mentioned. They provide a much needed way to give oversight to an industry that’s incredibly consequential for human and civil rights, but also is evolving quite fast, faster than we can get new legal statute written, so they’re a great way to tackle this. Bigger picture, the way I’ve been thinking about algorithm audits is as a hub and spoke model, where at the center, you’ve got these core frameworks. You’ve got the NIST framework, you’ve got the provisions in the Algorithmic Accountability Act, these really strong risk-based frameworks for assessing risks in algorithms, and also ways to mitigate them. But then from there…
Ellen P. Goodman:
Anna, I just want to stop you for a second. Just in case some people aren’t familiar with the NIST framework and the Algorithmic Accountability Act, can you just give us a few sentences [crosstalk 00:45:19]?
Yeah. The NIST framework actually came out of the 2020 National Defense Authorization Act directed [inaudible 00:45:28] which sits in the Commerce Department to start putting together a framework for… I think specifically the one they’ve been working on right now has been discriminatory bias in automated decision systems, but my understanding is they’re going to do more like this. It’s great. They’re not regulation. They are very much just standards, but even that’s a little squishy, but they do have a lot of input from stakeholder groups, which I know Deb and Mona are very aware of. That’s a great thing to have. And then, the Algorithmic Accountability Act is Senators Wyden and Booker and Representative Clarke’s bill that offers a very detailed way to do algorithm audits and directs the FTC to actually do a rulemaking process, to spell that out even further, and really just would mandate that all companies covered by the FTC, which is a pretty broad jurisdiction, but does not include government social services automated decision systems, which I know [inaudible 00:46:25] done a lot of work on as have I, but still is quite broad.
It would mandate that all those companies have to do these assessments. It doesn’t go much further than that, but it is a great starting… You have to do them, they have to be good, they have to meet this standard.
Ellen P. Goodman:
To use Deb’s framework, those would be internal audits?
Yeah, you’re right. Exactly. Yeah. The FTC can, at any time, check and make sure you’re doing them, but that’s really where it stops. And so, that’s the center. And then, I think policy makers need to think about the scopes, the specific context, and that’s really where [inaudible 00:47:00] Trahan has been doing a fair amount of work. First, we put out an ed tech staff draft last summer, and it looks at… AI’s being used in the classroom right now. It’s being used to make critical, what I would consider critical decisions, so predicting if someone’s cheating on a high profile exam, predicting if someone’s going to have future success in AP classes. These are the types of things I want, I think most of us would, and especially if they’re being used with taxpayer dollars. But what else do we want to require here?
I would argue if a company is making a claim that their AI is going to contribute to learning outcomes, I think they should have to be monitoring that too. I would like to see a little bit of disclosure and assessment on how they’re doing that. We’ve been doing oversight of curriculum for a long time. Why wouldn’t you do that with ed tech? That’s where I think you take this core, you add a little bit of additional disclosure. And then additionally, who needs to see it? Getting to the internal-external audit. I would argue some task force at the FTC, Department of Ed, probably should be able to do maybe either random audits or audits of the largest platforms. I think there’s a couple ways you could structure that, but they should be able to take a look. And then additionally, I think some kind of summary statement of what’s in these audits probably needs to be provided to educators, especially when you consider the way education policy kind of happens in this country. You’ve got the federal level and then a lot of it, really the decision of what gets used in the classroom is done on the local level. So there needs to be some sort of communication happening there. So that’s just like one sort of context to think about moving off of the core. And then the other one, the context I’ve done a lot of work on has been social media.
So in March she introduced the digital services oversight [inaudible 00:48:45], which includes a whole provision for risk assessments, risk mitigation, and independent audits within that risk mitigation framework. And it is broad. It’s broader than algorithms, right? It’s thinking about the whole safety processes of these companies and their products and product design, but it does specifically call out the fact that these products contain algorithms, right? Algorithms used to flag hate speech, election disinfo, child pornography. You’ve got algorithms, obviously deciding personal recommendations. Things like how the news feed is functioning. That relies on a lot of personal data, so there is also a privacy data rights element there, which is interesting. And then of course there is ad targeting, which we know can be discriminatory. So you’ve got a lot of algorithms in there that the bill calls out for them to be audited.
And then again, that’s where you’d want to turn to the NIST frameworks, to the provisions in the algorithm. [inaudible 00:49:41] you’re not repeating those, you’re just moving them into this context. And then it’s also worth mentioning that that’s a place where an independent audit is really important, right? Because you are talking about content. And that means that the government’s probably not going to be the best auditor or at least shouldn’t be the only auditor, right? That’s a place we really do want a true independent audit. So we do include language there for the large platforms to have that.
So it’s just really interesting and I think important to think about the hub and the spoke and to really ask the important questions of additional disclosures. Where do we draw bright lines? Right? So where do we say XYZ tech cannot be used in XYZ situation, facial recognition in the classroom. Where do we draw bright lines? And then who do we think that’s [inaudible 00:50:25].
Ellen P. Goodman:
I just have one follow-up and then I know Mona has a question too. Sort of as a policymaker, when you’re imagining audit systems, to what extent are you really drawing on the legacy and history of, sort of, financial audits, and the development of independent and basically international consensus on what those audit standards should be?
Yeah. I mean, I think it’s a really interesting model. I know Mona has done more work on this. I’ll let her fill in. But I think what I will say is, what I see when I look at that space is sort of this independent audit space, and “independent” is maybe not the best word, because industry associations are part of it, but it’s outside of government. And then you’ve got government setting the rules and there’s this constant back and forth, right? So you’ve got the American Institute of Certified Public Accountants, which oversees the CPA, which most Americans, even if you’re not in that space, you understand that CPA means you’ve taken a lot of tests. You’ve taken a lot of classes, you know how to follow the laws but there’s laws, right? And that’s really important.
So, the laws are put in place, you now have this certified auditor that knows how to follow them, but then you’ve got this back and forth where, come the Enron incidents of the early 2000s, Congress has to respond. And they respond with the Sarbanes-Oxley Act and they add additional mandates. And additional rules around the way audits are done. I think it’s really safe for us to assume we’re going to have that back and forth here too and I think that’s okay. And that’s what we want. Just start doing them. Let’s see what happens, let’s course correct as needed. So I think that [inaudible 00:52:04] flow is what I really appreciate about the financial sector and trying to learn as much as we can from that.
Mona Sloane:
Yeah. Thank you. I have the typical conference thing: a comment and a question. So I just want to underscore what Anna just said and also look back to Deb’s opening remarks, which is that, again, as a sociologist, I’m very interested in understanding what are the existing social and professional practices that we can meaningfully focus on to integrate these compliance and new audit cultures, really, without them having to be top down. Because we already know from social science research, that social change is more likely to occur when we expand meaning rather than rapidly change that. So I think that’s very important and I really firmly believe that as we move forward with this audit agenda, we need to more forcefully integrate the professions and professional associations and think about that. And I think that’s later on the agenda for this conversation.
But I want to pick up on something that Anna also just mentioned, which is the question, how do we actually meaningfully make information from these audits available as part of this larger agenda of social change that we’re after? And a question I have a little bit for Deb here is can we think about interoperability here? What are the kind of languages, what are the kind of socio-technical bits of information that we might want to standardize for audits so that we have a whole landscape of audits that are actually accessible, not just to regulators and respective industries, but perhaps the public? Because the ultimate goal, I think in part, is to make this a democratic effort. And I think that legibility is really important here. And we talk about this a lot, right? We all want the outcomes of these audits to be publicly available, but what does that actually mean? So I don’t know, Deb, maybe you’ve thoughts on this, if not, I’m sorry for putting you on the spot here.
Oh no, it’s fine. Actually, this is the crux of my research right now. And I think early on as well was this question of, well, even if we… I think there’s two tasks with an audit. There’s one, you want to make sure that you’re doing a very valid evaluation. Setting up the expectations, articulating the expectations, formalizing those expectations, and then being able to sort of examine the deployed system and make that comparison and formalize that gap between what the system is doing and what the auditor expected it to do. That’s like one really big, important task.
The second important task is related to the second part of the definition I was mentioning earlier. Well, how do you take this evaluation and this assessment, and actually have that feed into an accountability outcome? So how does that feed into either like a product change or a product recall? Or how does that feed into a public campaign? Or how does that inform standards? Or how does that inform a regulatory change? And I think that second question is super important and it has been the focus of a lot of my research for the last year. I think I’ve learned a lot from looking at other audit systems.
So other industries where audits are quite common. So this is like transportation, medicine. You mentioned finance where I definitely think there’s been a lot of back and forth in terms of what’s happened in that industry. And it’s really shifted the validity of the audits, but also how much the audits factor into accountability outcomes. So you mentioned SOX audits, which are post Enron. There were a bunch of rules that were introduced, and a lot of those rules were actually not necessarily just about requirements in terms of how you evaluate the quality of the financial reporting from these systems. But a lot of those rules were actually around consequences from these audits. So a lot of those rules were transparency rules.
One was, now if you’re an auditor and you do an audit report of like a financial audit report, you have to submit that to the EDGAR database. And the SEC is now monitoring what the outcomes of your report are. And that EDGAR database isn’t publicly available but it’s available upon request. And there’s like a mediated transparency regime that they have in finance, where the audit results are accessible in a way and through a process, a vetting process of, if you’re an actor that’s been approved under certain criteria, then you can see how well these systems are doing.
For publicly traded companies, now they have to make their financial reports public. And there’s all of these rules that were brought in. And what we realized was, those rules were not just rules around… And I think this is an interesting contrast in my perspective to the conversation right now in the algorithmic auditing space. So if you look at, like, the ICO, the Information Commissioner’s Office in the UK, and their auditability guidelines. Or even if you look at the Algorithmic Accountability Act, there’s a lot of focus on what is in the audit. Are we going to audit for bias? Even if you look at New York’s hiring, the new bill that came out of New York City Council recently, the focus is on, like this is the bias audit, how do we get the details of the audit content itself?
And I think that is very valid and definitely a locus of discussion. But when you look at other industries and where regulation has really shifted accountability outcomes, when I say accountability outcomes, I mean, bad actors get punished, good actors get rewarded. A lot of those interventions are actually not so much about the details of the evaluation but also around all these other factors that relate to the consequences of the audit and mandating a response time, for example. That’s another feature that we noticed across different industries is, they’ll mandate that the company has to respond by a particular time, or there’s other consequences or additional fines.
They’ll mandate making the audit report public or visible in some way. They also mandate other things in terms of not just accountability on the corporate response side, but also setting up structures of accountability for the auditors themselves. Like you mentioned, we have regulations and guidelines and best practice restrictions for auditors in the finance space and in the medical device space, why don’t we have anything like that for algorithmic auditors? Anyone can call themselves an algorithmic auditor right now. And that could satisfy the Digital Services Act mandate to hire an independent audit. You can just take anyone and call them an auditor and satisfy that requirement.
So it’s part of ensuring accountability to make sure that we have some validation, certification, accreditation process for auditors as well. So I think there’s a lot of these details that are now revealing themselves in these other audit systems and emerging as like important considerations for us in the algorithmic accountability context. Yeah. And I’m happy to list more but I don’t want to take up too much time as well.
Me and Dan have a paper coming up at AIES where we pretty much look at all these other audit systems and we highlight some of the patterns that we notice, and we try to connect that to practices that happen in other communities. I’ll say the sort of top three practices is one is like, there’s some oversight board in the sense of like in finance, in transportation, there’s some kind of professional society or professional standards for auditors. And that’s something that we have not yet established or set up in this algorithmic accountability space. Another second point that was pretty resonant is the existence of an incident reporting database where we don’t really have any way for harms discovery to happen in the algorithmic context.
If someone has a complaint about an algorithmic deployment, there’s no avenue for them to communicate with regulators or even communicate with a broader ecosystem of auditors. So the existence of like an incident reporting database and similarly like registering audits in some kind of accessible database, similar to EDGAR, that’s not something that we do. We don’t really have those transparency regimes. And then the final point that we had made was the point I was making around some of these post audit measures of legislation right now. I think the DSA does this a little bit, where the companies have to respond within, I think it was 90 days or 180 days or something. But we haven’t really fleshed out the details of what does it actually mean? What kind of corporate responses are we expecting to these audits?
And are there ways in which we can enforce that through a regulatory mandate of like… The company actually has to pay attention or respond in this particular way to an audit outcome. So that’s the third thing. And I think like to your point, I do think transparency does help a lot with that. Especially in our space where people do pay attention, especially online platform audits in particular, the users are the public, right? So when there’s an audit of, let’s say like Facebook’s algorithm, and let’s say the audit results were made public without the interference of Facebook, for example, either to quell the severity of the audit results or to censor them in some way, let’s say you actually got full access to the audit results.
Even if Facebook did not take direct action in response to that audit report, the public does a really good job. The public, and also institutions that mobilize the public, like the ACLU, they do a really good job coordinating these campaigns that end up leading to accountability outcomes regardless. So I do think your point around making things public and thinking about post audit actions is totally in line with the kind of research we’re doing and the results we have there as well. So yeah, all this to say that there’s things happening in other audit spaces that we can definitely learn from, to go from just a really good evaluation to an evaluation that actually holds weight in terms of broader accountability outcomes.
Ellen P. Goodman:
Wow! Deb, I’ve been-
Sorry. That was a lot of words.
Ellen P. Goodman:
There’s so much there to talk about. I think just doing my moderator job, I’m going to lift up just a couple of things, maybe one thing from what you said, and one thing from what Anna said, and then Mona, I want you to reflect and respond and then we can take it from there. Okay. So Deb, when you talk about whether or not the audit object has participated or not, I mean, in some ways, unless it’s a sort of scraping exercise or a sock puppet account or GDPR data, unless those are the sources or are the inputs for the audit, there’s going to have to be some cooperation. And I think, Anna has laid out different audiences that might have different amounts of access to the underlying data and the systems.
And which is something that is interesting to pursue. And you can see the beginnings of that in the AI Act in the EU and what the UK is doing. Anna, you also in your hub and spoke template. I think, another way to think about that is, we can think of an overarching regulation that deals with AI audits, and then we can think of the verticals. And this goes to qualifications that would be needed to conduct the audit. And so, if we think about AI or algorithmic audits at large, it would actually be very difficult to think about what qualifications would be necessary because there are obviously technical qualifications.
But then they also might be, and people are starting to say, “Everybody needs to bring on philosophers and arts and science, humanities majors,” because you have to be able to understand, first of all, surfacing harms discovery, which is a great term. Who is in the best position to audit that that has been done appropriately? When I think about the Frances Haugen testimony when she said, algorithmic harms were discovered, they were surfaced and then the corporate structure and the incentive structure was such that they were batted away, that is an object for audit, right? Is to see how that works. And that’s a very different qualification than someone who’s looking at code.
And then the verticals, if we’re talking about healthcare, education, policing, right? So understanding those systems and the human-technical interface, it might be different in each of those verticals. And so I sort of throw this over to you, Mona, as this big socio-technical stew, when we think about qualifications and also just where governance responsibility sits, either in the company or outside, for an external audit, for policymakers or civil society. How do we think about those things?
Mona Sloane:
Yeah, thanks, Ellen. That’s a big question and I just want to offer my thoughts, and I’d be very curious to hear from Anna and Deb as well. So I think first and foremost, we need to have a shared baseline understanding of what an audit should do. What is the very basic social process that we’re actually trying to implement here? Is it for safety? What is the idea behind it? And there’s a ton of… Not a ton, but there’s a fair bit of balls in the air with regards to AI audits specifically. So I think it’s very important to establish that first because, as you said, Ellen, if we talk about stakeholder involvement and different kinds of expertise, I think we do need to have a shared understanding of what an audit is and what it is not, right?
It’s probably not an impact assessment, which I would think is something that happens perhaps pre-deployment, right? And it’s maybe more internal. So I do think that’s a lot of work that we still need to do with getting on the same page here. Then I do think that we’ll end up in a situation, and I don’t think that is a bad thing, where the actual audits, the way in which they are being deployed, are vertical specific, and not only vertical specific but professional practice specific. The way in which a specific, let’s say, risk assessment system in the emergency room is used by nurses will be very different from the way in which a computer vision technology system is used to assist radiologists, right?
So there’s different ways in which these systems, A, work purely on a technical basis, but also are slotted into existing social and professional practices in how their meanings are interpreted, right? And we need to be very careful not to impose any kind of assumptions around how they’re being used, which could water down actual good audits, right? Basically buying into the idea that one AI system is used by nurses in this one very prescribed way, which could be very different from the way in which it is actually being used. And the notion of professional discretion plays a really important role here. And once we have a baseline definition of what audits should do, we then need to get to a place where we think about how we actually enact those in these verticals and with these different professional practices in mind.
And I think because AI systems are contextual, AI audits will be too. Which is why I asked this question about interoperability of outputs or outcomes of audits to Deb, right? Because we then need to loop back to the macro level. We need to get to a circular kind of process here. And I think that’s where the rubber hits the road, ultimately.
Deborah Raji:
I don’t know if I can respond quickly. Yeah. I was going to say, I think that that’s definitely an approach to making sure that there’s… Because I think the other thing, the other value, especially about third party and external audits is the fact that you have more eyes on the system, right? You have more perspectives looking at the same artifact and that just reveals different harms and different issues. You have effectively evaluators asking new questions about the system that really challenge the narratives brought forth by the companies about how well that system is working and what that system is doing. And I think that is inherently the value of audit.
So the idea of sharing information about the audit result with a broader ecosystem of people, or engaging a greater diversity of participants in the audit practice, makes a lot of sense as a way to really increase the number of perspectives analyzing the artifact and then really help to identify more issues by looking at it from different angles. I think the challenge, the practical challenge of making the auditing practice more accessible, or releasing audit results for outside scrutiny, is that not all auditors … and this is really something that we learned looking at other communities as well.
Not all auditors actually necessarily have the best intent. There have been situations in other industries of individuals representing competitors within the same industry operating or standing in as auditors, and then not responsibly handling the information provided to them, and scrutinizing the system with the purpose of dealing with a competitor unethically. So there’s been cases of that. And I think as a result of that, there’s a wariness in government of just making it completely open to the public or making everyone eligible to participate as an auditor.
And it seems like the approach that has been taken in at least a few industries has just been having some oversight over the auditor population themselves. To say not everyone can call themselves an auditor, especially if an auditor gets privileged access to a particular product, there’s some kind of vetting process. We have to make sure that you’re actually independent of the company and any competitors of that company. We have to actually make sure that you’re qualified to ask the questions, that you’re in a position to ask, or that would be beneficial for accountability for you to ask.
I think that intervention, I’m not sure if it’s the ideal one, but it’s the current standard around just having vetted access or having some kind of … Because I think your vision is a more absolute … I mean, I’m curious to hear what your thoughts are too of dampening the vision from a more absolute, everyone in the public has access to this and has an ability to contribute to this versus a more restricted space of after some vetting, after some certification, then those that are qualified will have access.
I think that to play devil’s advocate on my own point, the downside of having an oversight board and having a vetting process is, well, how do you determine what these qualifications are? Are there details of those qualifications that might exclude the parties that need to participate in this the most? Especially if they’re coming from a marginalized population, it might not be as easy for them to get certification as another group. I think there is nuance to that proposal of having some auditor oversight or auditor practice oversight. But I’m curious what you think about just restricting the vision of absolute transparency or absolute interoperability to something that is a little bit more gated or vetted or guarded.
Ellen P. Goodman:
Well, I know, I think Mona has thoughts on that, but I just want to pivot a little bit and move from qualifications to a question actually that we got in the Q&A that’s also a question I have, and I think has been raised by all of you, which is we recognize that algorithms, many algorithms at least are a tremendously dynamic process. When Elon Musk said, “I’m going to get in there and we’re going to open up the code for Twitter.” There was appreciation in the transparency community followed by, “Well, wait a minute, what does that even mean?” The code? Which code? One second’s code or the next second’s code?
Who can understand it? Right? So who is it useful for? But just sticking on this dynamism point, we’ve got dynamic processes and then we’ve got products which are deployed and then jiggered and re-optimized and redeployed, and humans in the loop who may be overriding the algorithm. The question was, Mona mentioned that audits are not just about thinking of the life cycle of one product or one system. How do you think about drawing the boundaries of where an audit begins and ends, especially for identifying unintended or unexpected impacts? And maybe Anna, you can start, because the Algorithmic Accountability Act definitely has a view about this. Maybe we can start with you.
Anna Lenhart:
I’ll talk more about this [inaudible 01:15:11] view on it, but they’re similar, right? It’s this idea of an ongoing … this isn’t a one-time thing: “We did our audit. We’re good,” right? These are ongoing: have you as a company or product team put in practices so that you are ongoingly monitoring the risk, whether it be discrimination, whether it be accessibility, or various other harms? Have you assessed your risks and then have you put in practices? And as new risks emerge, are you putting in practices, and then are you doing scenario planning? Right? So the way the Digital Services Oversight and Safety Act handles this is you have your risk assessment, and then one of the top mitigation things you need to do and document is scenario planning, right?
Think about how your product’s going to be misused or how it could go wrong, and document what you’re doing right now to prepare for that. In the social media context, obviously we’re thinking about influence operations, elections, things like that, but you can imagine it in every other sector as well. So I do definitely think it’s not meant to be a compliance checkbox tool. It really is meant to be like an active, ongoing thing. I think that’s in the Algorithmic Accountability Act as well: put these processes in place. And I think you see it in companies as well. So you’ve got like Arthur AI and tools of that nature, where they’re supposed to be kind of ongoingly monitoring various things. And so you can discuss kind of putting those types of processes in place. And I know there are other tools like that.
So I think that’s what we’re trying to push for in this legislation. Now, obviously legislation is like legal text. The details get kind of worked out further down the line. I’ll just really quickly circle back to this discussion of expertise verticals. I think we see this with CPAs as well, right? Like a lot of CPAs are experts in nonprofit finances or corporate taxes and corporate disclosures. So I think it’s totally reasonable to expect that.
And I actually think one area that doesn’t get discussed a lot, so I’ll say it to this community, is the [inaudible 01:17:13] bills are moving pretty quickly. And for reference they used to work for [inaudible 01:17:18]. Self-preferencing is basically an algorithm deciding that one set needs to be favored over another. So Amazon’s algorithm favoring Amazon Basics products for the buy box. In order to comply with these new self-preferencing laws, I think it’s reasonable to assume these types of audits, at least internal, and if they’re smart, and these big tech companies are smart, they will turn to external ones. I just mean to put that on your radar too, that there are so many subfields where we can see this.
Ellen P. Goodman:
Thanks. Mona did you want to come in on the qualifications point?
Mona Sloane:
Just to underscore that, I think that expertise is extremely important, and it is important that people who … Individuals or communities of practice who represent lived experience have some accountability relationship to the community that they’re representing. Why am I flagging this? Deb and I both talked about harms, and I think what we really need to keep in mind is that the basic social problem, or the basic problem that we have with algorithmic harm, is we aren’t necessarily able to see it until it occurs. At least that’s what we hear from the tech community. But we know that it’s more likely to occur along the existing fault lines of social stratification and the intersections thereof, which means that those who are affected the most have the least resources to actually flag the harm.
So we need a process by which we can anticipate these kinds of harms so that the labor of flagging harm is not on the side of those who are experiencing the harm, but those who are causing it. And that is something that in my book needs to be firmly on the side of impact assessment, algorithmic impact assessment. So pre-deployment, we need to move beyond the social logic of using the public, using markets as a lab for testing how the algorithm will behave. That’s one thing. So that’s different from audit, right? Audit happens once it’s a form of product that is in the market, that’s being deployed and so on. With regards to when audits happen, I think there’s different kinds of decisions that can be made here. One is timeline. Is there an update? Is there a new feature?
Is there a new market that’s being tapped into? There’s different kinds of timelines that could be developed here. With regards to professions, I think accreditation might be something interesting to think about here. The other thing that is really important to think about is, and I would say that as a sociologist, what is actually the social assumption and the calculative model that we’re dealing with here? We’re talking so much about the technology, right? But we can get at harm and we can get at disproportionate impact if we develop a model whereby we ask people, how do you think you’re going to make money off this? What’s your business model? What is the basic economic or business idea here? And that’s a very straightforward question that can help surface potential harms, right? If your basic idea is that you’re going to use personality as a proxy for job fit, that already points you in a very good direction with regards to what harms could possibly occur. So I think we need to work a little bit more on that side, of both algorithmic impacts and audits.
Ellen P. Goodman:
And transparency. Some of that is, yeah. All right. Let me move to another question, which takes us to Europe. In so much of AI governance, Europe is out ahead, and the things we see being proposed in Congress are really sort of catch-up to where they are. And so this question is about the DSA, somewhat directed at you, Deb: what is your opinion on the DSA? The structure is kind of like the Algorithmic Accountability Act, where you’re doing self-assessment; the platforms in this case are doing self-assessments, risk assessments, assessments of their risk mitigation strategies. And then these audits are reported to the EU, which then looks at them. What’s your opinion about this? And would this even be possible, this EU audit, without knowing what kind of data the platforms themselves have?
I’ll just say for myself, I think it’s very sort of partial and baby steps, maybe a necessary first baby step, but what’s missing from this, first of all, is any kind of standards by which the companies are going to audit themselves. Also, kind of standardized reporting so that they are comparable and can be easily kind of digested by outsiders. It reminds me, it’s a little bit like a transparency report plus. And so, Mona, the point about where does transparency and transparency reporting and auditing, accountability auditing, begin? The platforms all have a lot of transparency reporting, but I’m sure we’ve all been among critics to poke holes in them and say, well, why aren’t you telling us this, that, or the other thing? I think this is a little bit susceptible to the same kinds of problems, but Deb, what do you think?
Deborah Raji:
Yeah, I have a lot of thoughts about the DSA. I think a lot of what your comments were in reference to is Article 28, which is the internal audit requirement. And I think, for me, it really is the analogy to the transparency reporting conversation happening in the States right now, of: these companies need to just reveal information about how well their systems work. I think I’m always very skeptical around just how far you can go with … depending on internal accounts of performance or risk. I think on one hand, providing strict guidance in terms of the format and the requirements of such internal audit, like report outcomes, makes a lot of sense in terms of forcing these companies to reveal certain types of information. But there’s already kind of been precedent, especially with Facebook, of them really just miscommunicating or intentionally obscuring some of the details of their system through identifying loopholes in what they can or can’t say, and thus reporting incomplete accounts of the level of risk of their system.
So I found that’s … And you kind of rightfully pointed out, if there’s stricter guidelines around what they should be saying, if there’s some level of oversight, then that might become easier. But I think that’s the big risk of just purely depending on internal accounts: these companies are not always fully honest or complete in their reporting of what’s going on inside their systems. I think there’s an Article 31, and I’m sure whoever posted this is probably aware of the level of debate around it, where it’s talking about providing external auditors with access to the systems to come up with independent accounts of how well that system is working. For obvious reasons, it seems like a great idea of like, yes, we don’t have to depend on the internal accounts, we can also bring in external accounts for certain cases or allow researchers to come in and engage.
I think there’s already been an open letter from different groups, such as AlgorithmWatch, and I think Human Rights Watch as well. There’s been a couple groups, Access Now might have also commented on this, of how restrictive the current qualifications section of that article is, where it’s just restricted to academic researchers. Whereas you have a lot of civil society groups, you have a lot of law firms, you have a lot of just other entities that participate in the external audit process. So they found that article was restrictive. Although the idea of access for an external party as part of an independent oversight mechanism, I think, is something that a lot of external auditors found to be an exciting prospect, and a good counterbalance in addition to this sort of internal reporting out, a caveat in the DSA.
So I think that’s a lot of my opinion about the DSA is that … I think it was one of the first instances I saw of third party audit, external auditors being given … In this case specifically academic researchers being given an avenue to access the system, which is really one of the biggest roadblocks for external auditing. But I think there’s definitely a lot more that could be discussed with respect to the details of what those qualifications were for those external auditors and expanding that definition of participants to really encompass those working on this. I think the other thing as well is with the internal accountability clauses, closing some loopholes that companies have already proven that they would take advantage of. So in the DSA, there’s some loopholes around not reporting things that might compromise proprietary knowledge or trade secrets.
And that’s definitely a loophole that companies are going to take. So I think there’s definitely been a lot of discourse on that end as well of with internal audits and depending or relying on reporting out in terms of these transparency reports, how can we make it so that there’s integrity to those reports and we can actually depend on them. The final point I’m going to make about the DSA and sort of conversations I’ve had about that in general is there seems to be a lot of confidence in regulatory capacity, to one, assess these transparency reports, and two, maybe even mediate or facilitate their own kind of external investigations into the performance of these systems.
And I’m a little bit more skeptical than the DSA working group about the capacity that regulators have to do some of that work themselves versus relying on maybe a broader ecosystem of audit, like third party, external audit participants to engage in that practice. So Mona mentioned a couple times this idea of, well, if a company is submitting their internal risk assessment to the regulator, why not make that public, or why not make that accessible to a third party upon request, like a vetted third party upon request? Those are ideas that I think are worth definitely thinking a little bit more about: why depend completely on the regulator to sort of vet the quality of these internal audits when you could actually make those accessible? And that’s sort of another way to increase third party participation.
Ellen P. Goodman:
Thank you, Deb. So we have four minutes and I know Anna has something to say.
Anna Lenhart:
If you’re following the DSA, please look at the Digital Services [inaudible 01:28:32] Safety Act. Please, please, please, please. It is an attempt at a U.S. approach to this problem and really strengthens a lot of the holes that Deb has just pointed out, mostly because in the United States, just within our First Amendment context, this is going to have to be our approach, right? Like deep transparency with independent audits, separation from government, which is just realistically the approach we’re going to have to strive for. And even then it’s going to be a trip. So we definitely took the approach of the similar audit, we looked at the language, we added some additional things, and then mandating that all the large covered platforms, and there’s some back and forth on if you should go even further than that, get an independent audit.
FTC can do rulemaking on what that looks like, what counts as independence. And the other thing is that the FTC and [inaudible 01:29:23] do rulemaking, both on how those auditors should get secure access to the data so they can actually get raw data. So if they want to audit the algorithms that are used through all those processes, they’d be able to get through [inaudible 01:29:35] that they need to do that. Along with 40 pages where we spell out, essentially, what’s in Article 31, which is researcher and civil society access to data in a way that is secure, limits law enforcement access, I mean, there’s so many issues that come up, especially in a country that doesn’t have a comprehensive privacy law. But the idea is to have kind of these multiple types of checks on what the platforms are doing.
So both this independent auditor that’s looking over the risk assessment, and then this set of civil society and researchers. The other thing I’ll mention is that the audits do go back to the FTC, who then summarizes and makes a public version available. And this approach is … The trade secrets is one thing, I think, that gets talked about a lot, and the way DSOSA is written, it trusts the FTC, who is also in charge of the markets and competition, to actually determine where the trade secret is, not the companies. But also national security is the other thing too. So you have to remember, right, these platforms are used to weaponize disinfo, especially Russia, China, and troll bots. So it’s a legitimate national security concern, and so there are some things you wouldn’t want to make public for that reason too. And we worked really closely with a lot of national security experts. And Adam Schiff is one of the co-leads on the bill, to think about that as well, which I think gets lost. Please look at it.
Ellen P. Goodman:
Yeah. I mean, Congresswoman [inaudible 01:30:58], this bill is I think the last word on this topic so far in the U.S. So we’ll put a link to the bill when we send this out. Mona, do you want the last word?
Mona Sloane:
Oh my God. The last word as devil’s advocate: I’m going to ask the big “so what” question. Which is, we acknowledge that these kinds of systems have become social infrastructure, right? They kind of really profoundly affect how we organize and hold together society. What if we find that one system or a part of a system so profoundly infringes on civil rights and so on, what do we do? Are we going to … Can we shut it down? And the reason why I’m asking is, when we have a restaurant that doesn’t comply with food safety standards, we can close the restaurant. Can we close down an AI system, an algorithmic system or a part of it? So I would really encourage policymakers, industry, my community, researchers, to think about, once we’ve got the audit figured out, what’s on the other side, right? How can we actually act upon it?
Ellen P. Goodman:
That gets into the substantive law for each of these areas.
Anna Lenhart:
I don’t know if this community has seen the bipartisan staff draft privacy bill that’s been circulating. It would mandate algorithm audits, and it also, to Mona’s most recent point, would mandate that you follow discrimination law in data processing. It’s bold. Look at it for sure. It’s bipartisan and somewhat bicameral. So it definitely works with these areas. Thanks.
Ellen P. Goodman:
All right. We have lots of things to look at. Lots of big questions to ponder, and I want to thank all of you so much for joining us. Take care, everybody.
Justin Hendrix is CEO and Editor of Tech Policy Press, a new nonprofit media venture concerned with the intersection of technology and democracy. Previously, he was Executive Director of NYC Media Lab. He spent over a decade at The Economist in roles including Vice President, Business Development & Innovation. He is an associate research scientist and adjunct professor at NYU Tandon School of Engineering. Opinions expressed here are his own.