At Twitter, data scientists like Rani Nelken Director of Engineering, Twitter Cortex Applied ML Research San Francisco, use AI to try to make you vote Democrat and Accept Immigrants.

Scientists Increasingly Can’t Explain How AI Works

AI researchers are warning developers to focus more on how and why a system produces certain results than the fact that the system can accurately and rapidly produce them.

What’s your favorite ice cream flavor? You might say vanilla or chocolate, and if I asked why, you’d probably say it’s because it tastes good. But why does it taste good, and why do you still want to try other flavors sometimes? Rarely do we ever question the basic decisions we make in our everyday lives, but if we did, we might realize that we can’t pinpoint the exact reasons for our preferences, emotions, and desires at any given moment. There’s a similar problem in artificial intelligence: The people who develop AI are increasingly having problems explaining how it works and determining why it has the outputs it has. Deep neural networks (DNN)—made up of layers and layers of processing systems trained on human-created data to mimic the neural networks of our brains—often seem to mirror not just human intelligence but also human inexplicability.

Most AI systems are black box models, which are systems that are viewed only in terms of their inputs and outputs. Scientists do not attempt to decipher the “black box,” or the opaque processes that the system undertakes, as long as they receive the outputs they are looking for. For example, if I gave a black box AI model data about every single ice cream flavor, and demographic data about economic, social, and lifestyle factors for millions of people, it could probably guess what your favorite ice cream flavor is or where your favorite ice cream store is, even if it wasn’t programmed with that intention.
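To make the "inputs and outputs only" framing concrete, here is a minimal Python sketch. Everything in it is hypothetical: `opaque_flavor_model` stands in for a trained model whose internals we pretend we cannot read, and the feature names are invented for illustration. All an outside observer can do is query the model and watch how the answer changes.

```python
# Hypothetical stand-in for a trained model. In a real black box the body of
# this function would be millions of learned weights, not readable rules.
def opaque_flavor_model(person: dict) -> str:
    score = 0.0
    score += 0.8 if person["age"] < 30 else -0.4
    score += 0.6 if person["lives_in_city"] else -0.2
    score += 0.5 if person["buys_coffee_daily"] else -0.5
    return "pistachio" if score > 0.5 else "vanilla"


def probe(model, person: dict) -> dict:
    """Flip one boolean input at a time and record whether the output changes."""
    baseline = model(person)
    changes = {}
    for key, value in person.items():
        if isinstance(value, bool):
            altered = {**person, key: not value}
            changes[key] = model(altered) != baseline
    return changes


# Invented example input; the feature names are assumptions, not real data.
person = {"age": 42, "lives_in_city": True, "buys_coffee_daily": True}
prediction = opaque_flavor_model(person)
sensitivity = probe(opaque_flavor_model, person)
print(prediction, sensitivity)
```

Probing like this shows which inputs the output is sensitive to for one person, but not why the model weighs them that way, which is the gap the researchers quoted here are describing.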

These types of AI systems notoriously have issues because the data they are trained on are often inherently biased, mimicking the racial and gender biases that exist within our society. The haphazard deployment of them leads to situations where, to use just one example, Black people are disproportionately misidentified by facial recognition technology. It becomes difficult to fix these systems in part because their developers often cannot fully explain how they work, which makes accountability difficult. As AI systems become more complex and humans become less able to understand them, AI experts and researchers are warning developers to take a step back and focus more on how and why a system produces certain results than the fact that the system can accurately and rapidly produce them.

“If all we have is a ‘black box’, it is impossible to understand causes of failure and improve system safety,” Roman V. Yampolskiy, a professor of computer science at the University of Louisville, wrote in his paper titled “Unexplainability and Incomprehensibility of Artificial Intelligence.” “Additionally, if we grow accustomed to accepting AI’s answers without an explanation, essentially treating it as an Oracle system, we would not be able to tell if it begins providing wrong or manipulative answers.”


Black box models can be extremely powerful, which is how many scientists and companies justify sacrificing explainability for accuracy. AI systems have been used for autonomous cars, customer service chatbots, and diagnosing disease, and can perform some tasks better than humans can. For example, a machine that can store one trillion items, such as digits, letters, and words, can process and compute information far faster than humans, who on average hold about seven items in short-term memory. Deep learning models include generative adversarial networks (GANs), which are often used to train generative AI models such as image generators. GANs pit two neural networks against each other: a generator that produces candidate outputs and a discriminator that tries to tell those outputs apart from real examples. Each improves in response to the other until the generator becomes very good at its task. The issue is that this creates models that their developers simply can’t explain.

“I think in a lot of cases, people look to these black box models as a response to lack of resources. It would be very convenient to have an automated system that can produce the kinds of outputs they’re looking for from the kind of inputs they have,” Emily M. Bender, a professor of linguistics at the University of Washington, told Motherboard. “If you have a dataset consisting of such inputs and outputs, it’s always possible to train a black box system that can produce outputs of the right type—but often much, much harder to evaluate whether they are correct. Furthermore, there are lots of cases where it’s impossible to make a system where the outputs would be reliably correct because the inputs just don’t contain enough information.”

When we put our trust in a system simply because it gives us answers that fit what we are looking for, we fail to ask key questions: Are these responses reliable, or do they just tell us what we want to hear? Whom do the results ultimately benefit? And who is responsible if it causes harm?

“If business leaders and data scientists don’t understand why and how AI calculates the outputs it does, that creates potential risk for the business. A lack of explainability limits AI’s potential value, by inhibiting the development and trust in the AI tools that companies deploy,” Beena Ammanath, executive director of the Deloitte AI Institute, told Motherboard.

These people do not exist. Image: Arxiv

“The risks are that the system may be making decisions using values we disagree with, such as biased (e.g. racist or sexist) decisions. Another risk is that the system may be making a very bad decision, but we cannot intervene because we do not understand its reasoning,” Jeff Clune, an Associate Professor of Computer Science at the University of British Columbia, told Motherboard.

AI systems are already deeply entrenched with bias and constantly reproduce it in their output without developers understanding how. In a groundbreaking 2018 study called “Gender Shades,” researchers Joy Buolamwini and Timnit Gebru found that popular commercial facial analysis systems classified the gender of lighter-skinned males most accurately and had the highest error rates for darker-skinned females. Facial recognition systems, which are skewed against people of color and have been used for everything from housing to policing, deepen pre-existing racial biases by determining who is more likely to get a house or be identified as a criminal, for example. Predictive AI systems can also guess a person’s race based on X-rays and CT scans, but scientists have no idea why or how this is the case. Black and female patients are less likely to receive an accurate diagnosis from automated systems that analyze medical images, and we’re not sure why. These are just a few examples of how viewing AI-generated results as concrete data without understanding the system’s potential biases creates rippling societal consequences.
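An audit that breaks results down by subgroup, in the spirit of the study described above, can be sketched in a few lines of Python. The records below are invented for illustration and are not the Gender Shades data; the group labels and the `error_rates` helper are likewise hypothetical. The point is only that a single overall error rate can hide large gaps between subgroups.

```python
from collections import defaultdict

# Hypothetical evaluation records: (subgroup, true_label, predicted_label).
# Invented numbers for illustration only, not results from any real system.
records = (
    [("lighter-skinned male", "male", "male")] * 4
    + [("lighter-skinned female", "female", "female")] * 3
    + [("lighter-skinned female", "female", "male")] * 1
    + [("darker-skinned male", "male", "male")] * 3
    + [("darker-skinned male", "male", "female")] * 1
    + [("darker-skinned female", "female", "female")] * 2
    + [("darker-skinned female", "female", "male")] * 2
)


def error_rates(records):
    """Return the fraction of wrong predictions for each subgroup."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


overall = sum(1 for _, truth, predicted in records if truth != predicted) / len(records)
by_group = error_rates(records)

print(f"overall error rate: {overall:.2f}")
for group, rate in by_group.items():
    print(f"{group}: {rate:.2f}")
```

In this made-up sample the overall error rate is 25 percent, while the per-group rates range from 0 to 50 percent. Reporting only the aggregate would conceal that spread.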

At the same time, some experts argue that simply shifting to open and interpretable AI models—while allowing greater visibility of these processes—would result in systems that are less effective.

“There are many tasks right now where black box approaches are far and away better than interpretable models,” Clune said. “The gap can be large enough that black box models are the only option, assuming that an application is not feasible unless the capabilities of the model are good enough. Even in cases where the interpretable models are more competitive, there is usually a tradeoff between capability and interpretability. People are working on closing that gap, but I suspect it will remain for the foreseeable future, and potentially will always exist.”

Though there is already a subfield of AI known as Explainable AI (XAI), the techniques it promotes are often reductive and fail to capture the true breadth of a system’s processes, and AI developers are not incentivized to adopt them. Because AI systems have become so complex, blanket explanations only widen the power differential between AI systems and their creators, and between AI developers and their users. In other words, adding explainability after an AI system is already in place is harder than designing for it from the start.

“Maybe the answer is to abandon the illusion of explanation, and instead focus on more rigorously testing the reliability, biases, and performance of models, as we try to do with humans,” Clune said.

In recent years there’s been a small but real push by some in the industry to develop “white-box models,” which are more transparent and whose outputs can be better explained. (It’s worth mentioning that the white-box/black-box terminology is itself part of a long history of racially coded terms in science; researchers have pushed to change “blacklist” to “blocklist,” for example.) White-box models, nonetheless, are a relatively new branch of AI research that seeks to make AI more explainable.

AI researchers say giving users who are impacted by a certain system a bigger role in participating in the development process is an important first step in making AI systems that are more transparent and adequately represent user needs.

“A lot of the explanations that people treat as explanations really aren’t. They are reductive, they are written to the interests of the developers and what the developers think are important to explain, rather than what the user needs,” Os Keyes, a Ph.D. candidate at the University of Washington’s Department of Human Centered Design & Engineering, told Motherboard. “Arguably, I’d say there are two big changes to AI that would be necessary to change this state of affairs, and the first is that, ultimately, this is in part a problem of the massive gap in practice between developers and actual users.”

“Broader participation in not just building the system, but also asking, what questions are interesting, what things need to be possible for this to really be explainable,” Keyes added. “That would make a massive difference.”

Ammanath agrees that some of the best practices in fostering explainability include tailoring explanations and reporting to the people who will engage with or be impacted by the automated systems. Along the same line, she said, developers need to first identify the needs and priorities of the people who will be most affected.

A more challenging problem is that many AI systems are designed for the concept of universalism, the idea that “[a] system is good if it works everywhere for everyone at all times,” Keyes explained. “But the problem is that that’s not how reality works, different people are going to need different explanations of different things. If we really want AI to be more explainable, we actually have to really, fundamentally change how we imagine and how developers imagine AI.”

In other words, if you build explainable AI with a one-size-fits-all design process, “you end up with something where it has explanations that only make sense to one group of people who are involved in the system in practice,” said Keyes. “The internal change is a [much] broader set of involvement in deciding explainable to whom, and what does explainable mean.”

Keyes’ call for more localized AI systems and their concern about the universality of AI models echo what researchers have been warning about for the past few years. In a 2021 paper co-authored by Bender and Gebru, who was terminated from Google for publishing this research, the authors argue that training AI models with big data makes it difficult to audit for embedded biases. They wrote that big data also fails to represent populations that have less access to the internet and “overrepresents younger users and those from developed countries.”


“If we orient knowledge and AI around big data, then we’re always going to bias towards those who have the resources to spin up a thousand servers, or those who have the resources to, you know, get a billion images and train them,” said Keyes. “There’s something fundamentally, I’d say undemocratic, but I’d also say just badly incentivized in that.”

“The question first is, what are the conditions under which AI is developed? Who gets to decide when it’s deployed? And with what reasoning? Because if we can’t answer that, then all good intentions in the world around how do we live with that [AI] are all screwed,” they added. “[I]f we’re not participating in those conversations, then it’s a losing game. All you can do is have something that works for people with power, and silences the people who don’t.”

Debiasing the datasets that AI systems are trained on is near impossible in a society whose Internet reflects inherent, continuous human bias. Besides using smaller datasets, in which developers can have more control in deciding what appears in them, experts say a solution is to design with bias in mind, rather than feign impartiality.

“The approach I currently think is the best is to have the system learn to do what we want it to,” Clune said. “That means it tries to do what we ask it to (for example, generate pictures of CEOs) and if it does not do what we like (e.g. generating all white males), we give it negative feedback and it learns and tries again. We repeat that process until it is doing something we approve of (e.g. returning a set of pictures of CEOs that represents the diversity in the world we want to reflect). This is called ‘reinforcement learning through human feedback’, because the system is effectively using trial and error learning to bring its outputs in line with our values. It is far from perfect, and much more research and innovation is required to improve things.”
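The trial-and-error loop Clune describes can be illustrated with a toy Python sketch. This is a deliberately simplified assumption, not how production systems are built: real reinforcement learning from human feedback typically trains a separate reward model on human preference comparisons and then optimizes a large model against it. Here the "system" chooses between just two hypothetical outputs, `human_feedback` is a hard-coded stand-in for a human rater, and the update is a basic gradient-bandit rule.

```python
import math
import random

# Two hypothetical outputs the toy system can produce.
OUTPUTS = ["homogeneous set of CEO pictures", "diverse set of CEO pictures"]


def human_feedback(output: str) -> float:
    """Hypothetical stand-in for a human rater: +1 approve, -1 disapprove."""
    return 1.0 if output.startswith("diverse") else -1.0


def softmax(preferences):
    """Turn raw preference scores into probabilities that sum to 1."""
    peak = max(preferences)
    exps = [math.exp(p - peak) for p in preferences]
    total = sum(exps)
    return [e / total for e in exps]


def train(rounds: int = 200, learning_rate: float = 0.5, seed: int = 0):
    rng = random.Random(seed)
    preferences = [0.0, 0.0]  # the system starts with no preference
    for _ in range(rounds):
        probs = softmax(preferences)
        choice = rng.choices(range(len(OUTPUTS)), weights=probs)[0]
        reward = human_feedback(OUTPUTS[choice])
        # Gradient-bandit update: raise the preference for a rewarded choice,
        # lower it for a penalized one, and shift the alternative the other way.
        for i in range(len(preferences)):
            indicator = 1.0 if i == choice else 0.0
            preferences[i] += learning_rate * reward * (indicator - probs[i])
    return softmax(preferences)


final_probs = train()
print(dict(zip(OUTPUTS, final_probs)))
```

After repeated feedback the toy system picks the approved output almost every time. It also shows the limitation Clune points to: the system only learns whatever the rater rewards, so the values encoded in the feedback matter as much as the learning rule.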

As we continue to negotiate where and how AI should be used, there are many things to consider before we start letting AI hire people or decide who to give loans to.

“I think it is absolutely critical to start by keeping in mind that what gets called ‘AI’ isn’t any kind of autonomous agent, or intelligence, or thinking entity,” Bender said. “These are tools, which can serve specific purposes. As with any other tools, we should be asking: How well do they work? How suited are they to the task at hand? Who are they designed to be used by? And: How can their use reinforce or disrupt systems of oppression?”

BBC tries to understand politics by creating fake, computer-generated stereotypical Americans

This image released by the BBC shows London-based reporter Marianna Spring, who illustrated how disinformation spreads on sites like Facebook, Twitter and TikTok despite efforts to stop it, and how that impacts American politics. (Robert Timothy/BBC via AP)

NEW YORK (AP) — Larry, a 71-year-old retired insurance broker and Donald Trump fan from Alabama, wouldn’t be likely to run into the liberal Emma, a 25-year-old graphic designer from New York City, on social media — even if they were both real.

Each is a figment of BBC reporter Marianna Spring’s imagination. She created five fake Americans and opened social media accounts for them, part of an attempt to illustrate how disinformation spreads on sites like Facebook, Twitter and TikTok despite efforts to stop it, and how that impacts American politics.

That’s also left Spring and the BBC vulnerable to charges that the project is ethically suspect in using false information to uncover false information.

“We’re doing it with very good intentions because it’s important to understand what is going on,” Spring said. In the world of disinformation, “the U.S. is the key battleground,” she said.

Spring’s reporting has appeared on BBC’s newscasts and website, as well as the weekly podcast “Americast,” the British view of news from the United States. She began the project in August with the midterm election campaign in mind but hopes to keep it going through 2024.

Spring worked with the Pew Research Center in the U.S. to set up five archetypes. Besides the very conservative Larry and very liberal Emma, there’s Britney, a more populist conservative from Texas; Gabriela, a largely apolitical independent from Miami; and Michael, a Black teacher from Milwaukee who’s a moderate Democrat.


With computer-generated photos, she set up accounts on Instagram, Facebook, Twitter, YouTube and TikTok. The accounts are passive, meaning her “people” don’t have friends or make public comments.

Spring, who uses five different phones labeled with each name, tends to the accounts to fill out their “personalities.” For instance, Emma is a lesbian who follows LGBTQ groups, is an atheist, takes an active interest in women’s issues and abortion rights, supports the legalization of marijuana and follows The New York Times and NPR.

These “traits” are the bait, essentially, to see how the social media companies’ algorithms kick in and what material is sent their way.

Through what she followed and liked, Britney was revealed as anti-vax and critical of big business, so she has been sent into several rabbit holes, Spring said. The account has received material, some with violent rhetoric, from groups falsely claiming Donald Trump won the 2020 election. She’s also been invited to join in with people who claim the Mar-a-Lago raid was “proof” Trump won and the state was out to get him, and groups that support conspiracy theorist Alex Jones.

Despite efforts by social media companies to combat disinformation, Spring said there’s still a considerable amount getting through, mostly from a far-right perspective.

Gabriela, the non-aligned Latina mom who’s mostly expressed interest in music, fashion and how to save money while shopping, doesn’t follow political groups. But it’s far more likely that Republican-aligned material will show up in her feed.

“The best thing you can do is understand how this works,” Spring said. “It makes us more aware of how we’re being targeted.”

Most major social media companies prohibit impersonator accounts. Violators can be kicked off for creating them, although many evade the rules.

Journalists have used several approaches to probe how the tech giants operate. For a story last year, the Wall Street Journal created more than 100 automated accounts to see how TikTok steered users in different directions. The nonprofit newsroom the Markup set up a panel of 1,200 people who agreed to have their web browsers studied for details on how Facebook and YouTube operated.

“My job is to investigate misinformation and I’m setting up fake accounts,” Spring said. “The irony is not lost on me.”

She’s obviously creative, said Aly Colon, a journalism ethics professor at Washington & Lee University. But what Spring called ironic disturbs him and other experts who believe there are above-board ways to report on this issue.

“By creating these false identities, she violates what I believe is a fairly clear ethical standard in journalism,” said Bob Steele, retired ethics expert for the Poynter Institute. “We should not pretend that we are someone other than ourselves, with very few exceptions.”

Spring said she believes the level of public interest in how these social media companies operate outweighs the deception involved.

The BBC said the investigation was created in accordance with its strict editorial guidelines.

“We take ethics extremely seriously and numerous processes are in place to ensure that our activity does not affect anyone else,” the network said. “Our coverage is transparent and clearly states that the investigation does not offer exhaustive insight into what every U.S. voter could be seeing on social media, but instead provides a snapshot of the important issues associated with the spread of online disinformation.”

The BBC experiment can be valuable, but only shows part of how algorithms work, a mystery that largely evades people outside of the tech companies, said Samuel Woolley, director of the propaganda research lab in the Center for Media Engagement at the University of Texas.

Algorithms also take cues from comments that people make on social media or in their interactions with friends — both things that BBC’s fake Americans don’t do, he said.

“It’s like a journalist’s version of a field experiment,” Woolley said. “It’s running an experiment on a system but it’s pretty limited in its rigor.”

From Spring’s perspective, if you want to see how an influence operation works, “you need to be on the front lines.”

Since launching the five accounts, Spring said she logs on every few days to update each of them and see what they’re being fed.

“I try to make it as realistic as possible,” she said. “I have these five personalities that I have to inhabit at any given time.”