Yet Another AI Blind Spot: Biased Images

Published Jun 14, 2023, 9:00 AM

As pressure mounts on lawmakers to regulate artificial intelligence, another problem area of the technology is emerging: AI-generated images. Early research shows these images can be biased and perpetuate stereotypes. Bloomberg reporters Dina Bass and Leonardo Nicoletti dug deep into the data that powers this technology, and they join this episode to talk about how AI image generation works—and whether it’s possible to train the models to produce better results.

Read more: Humans Are Biased. Generative AI Is Even Worse

Listen to The Big Take podcast every weekday and subscribe to our daily newsletter: https://bloom.bg/3F3EJAK 

Have questions or comments for Wes and the team? Reach us at bigtake@bloomberg.net.

Okay, make a photo of a CEO, and enter. All right, results. Oh, it made four different pictures, and all of them appear to be light skinned men in suits. Okay, do that again, photo of a CEO. Oh, and four more people who look like men to me. This is my very first time using an AI image generator, through a free website on my laptop. Essentially, it's an AI model like ChatGPT, but instead of creating AI generated text, it produces AI generated images. I type into a search box what I want to see, and a few seconds later it produces pictures that the model believes are what I'm asking for. All right, let's do this again. Photo of a physician, and that looks to me like men, but in lab coats and scrubs. Oh, and you know they're doctors because they all have stethoscopes around their necks. I requested images for several different jobs and repeated the requests at least ten times for each one. The results were eye opening. Almost all the images of CEOs and doctors the model produced, at least when I was using it, appeared to me to be men. All nurses and almost all teachers appeared to be women. By the way, I'm saying appeared to be women or men because these images are of people who don't actually exist, and so identifiers like gender, race, and ethnicity are subjective, and we'll talk more about that a bit later. Most images of attorneys the model produced looked to me to be light skinned men. Images of scientists appeared to be more diverse, but they also mostly looked like men. And this is really weird: when the model did generate pictures of attorneys or scientists who looked to me like women, they were very often shown dressed in traditional men's clothing, like business suits with neckties, as though the model couldn't fully accept the concept of a woman in those professions. AI sometimes generates a distorted version of reality that doesn't look like the world we live in, and it can perpetuate gender and racial stereotypes. This matters because, as we know, AI is fast working its way into our lives, and AI generated images that can make us believe something artificial is actually real may be especially influential and potentially harmful.

So we're looking at a situation where we're generating more and more content via AI, more and more of these synthetic images. Those images become a part of the body of images, the body of work that's on the internet, and they are more biased than reality. And then in the future those images get fed back into future AI systems, so that you end up in this nasty cycle where the bias is getting worse and worse and being fed back into future systems, which are then less diverse.

So there was a recent Europol report that suggested that by twenty twenty six, ninety percent of all online content could be artificially generated. What happens when ninety percent of all online images are images reinforcing those stereotypes?

Bloomberg's Dina Bass and Leonardo Nicoletti dug deep into the data to find out why the results look like this and what can be done to fix the shortcomings of this rapidly emerging technology. I'm Wes Kasova. Today on The Big Take: you can't trust your eyes when it comes to AI. Leo, maybe you can start by just giving us an overview of what you found in this investigation.

This investigation essentially looks at generative AI, which is a new type of AI. It's like ChatGPT, where you ask it a question and it just answers you or gives you the information you need. In our case, we used Stable Diffusion, which is similar to ChatGPT, but instead of text-to-text, which means using text to generate more text, it's text-to-image, where you ask it a question or give it a description, and then it will generate an image for you of what you're looking for. And, you know, that gives us lots of possibilities and opens lots of doors for work, for design, for artistic advertising, lots of purposes. What we found is that it also has very strong biases against people of color and women in general. And so what we wanted to show through this piece is really how biased generative AI is, and specifically generative AI that creates images and visual representations. And, you know, to what extent are these biases ingrained in this technology? And, you know, what are the potential implications of that?

This is significant because it's different from the kind of facial recognition that we already know about, is that right?

It is.

So somewhere around twenty eighteen, we started finding out that facial recognition software had significant racial and gender biases. And what that software is: you have a picture, an image, and the AI scans it and tries to predict what's in it. You know, what am I looking at? Is it a black cat? Is it a cheeseburger? Is it a white woman? What am I looking at?

In twenty eighteen, a couple of researchers, Joy Buolamwini, Timnit Gebru, and then Deborah Raji, combined to do some work called Gender Shades, where they took a bunch of the popular facial recognition programs, ran tests on them, and found that their performance was significantly worse on people of color and significantly worse on women of color. So we've known that that's an issue, and it's cropped up in real world scenarios. There have been situations in the US where black men have been mistakenly arrested because they were flagged by facial recognition software, and it turns out it was some completely different person. So we know that's a problem. Generative AI is a new type of AI, and it's a new wrinkle. So instead of AI that scans existing pictures, it's creating new ones, and that, we found, also has significant racial and gender biases. So the additional, you know, significant issue that this raises is we're now using artificial intelligence to create massive volumes of new content that is then put out into the world for use. That new content is demonstrating racial and gender bias, and we're adding to a body of content out there, using it for reports, using it for clip art, for presentations, and it is significantly biased.

And Leo, how did you go about finding this bias in this new form of generative AI?

As a half reporter but also half former scientist, academic, and coder, the fact that many of these generative AI models are open source was actually very useful, for researchers in general but also reporters, because it gives anybody the possibility to download the generative AI model, in our case Stable Diffusion, and ask it to generate images. And so what I did is I simply went on the Hugging Face platform, which is this really interesting and very useful platform that has come out recently that hosts all of these models, including open source versions of GPT, for example, and Stable Diffusion, and I downloaded the model, and then I wrote some code to basically iterate through a series of very well known high paying and low paying jobs and also different criminalized activities, and just ask the model a very simple question: can you generate a color photograph of blank? And blank is a judge, an engineer, a janitor, a housekeeper, a fast food worker, for professions, for example. And for criminalized activities, we looked at three of them, so blank would be a terrorist, a drug dealer, or an inmate. I let my computer run for actually an entire month, because it's very computationally heavy to generate thousands of images for each of those keywords. So that was the first step.
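To make the workflow Leo describes concrete, here's a minimal sketch of that kind of generation loop in Python using the open source diffusers library. The checkpoint name, prompt template, keyword list, and image counts are illustrative assumptions, not Bloomberg's actual code or settings.

```python
# Minimal sketch: download an open Stable Diffusion checkpoint from Hugging Face
# and repeatedly prompt it for each keyword, saving every generated image.
# Checkpoint ID, prompt wording, and counts are assumptions for illustration.
import os
import torch
from diffusers import StableDiffusionPipeline

KEYWORDS = ["judge", "engineer", "janitor", "housekeeper", "fast food worker",
            "terrorist", "drug dealer", "inmate"]
IMAGES_PER_KEYWORD = 300  # the real experiment generated thousands per keyword

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",   # assumed open checkpoint
    torch_dtype=torch.float16,
).to("cuda")

for keyword in KEYWORDS:
    out_dir = os.path.join("generated", keyword.replace(" ", "_"))
    os.makedirs(out_dir, exist_ok=True)
    for i in range(IMAGES_PER_KEYWORD):
        prompt = f"a color photograph of a {keyword}"
        image = pipe(prompt).images[0]   # one synthetic image per call
        image.save(os.path.join(out_dir, f"{i:05d}.png"))
```

Each call runs dozens of denoising steps on a GPU, which is why generating thousands of images per keyword can plausibly take weeks of compute, as described above.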

So you would just say to it, make me a picture of a CEO and then see what it came up with.

Exactly. But the idea was to do that exact same thing thousands of times, so that instead of, you know, having anecdotal evidence that the AI might be biased, we would actually gather a database of images of the same thing over and over and over. Basically that would allow us as reporters and as data scientists to then analyze those thousands of images and actually find a pattern across those images. So that's exactly what we did.

And what is the pattern that you found when you typed in CEO, when you typed in fast food worker, and all of the other things you mentioned, and then asked it to show you pictures thousands of times? What did it turn up?

So the pattern is a very stark pattern. It's that for high paying professions, the generative AI model is overwhelmingly generating pictures of white men, and for low paying professions it's overwhelmingly generating pictures of women and darker skinned people. So in our analysis we couldn't really talk about race, because race is very hard to quantify in images, especially when you have images of fake people, essentially, who can't really self identify. So you can't say this is a Black person or this is an Asian person. But what you can do is rigorous scientific analysis, where you do things like average all the pixels of a person's skin across all of the images of one profession. And doing that, for example, what we found is a pattern of darker skinned subjects being overrepresented in low paying professions and lighter skinned subjects being overrepresented in high paying professions. And the same goes for criminalized activities, where you have darker skin tones constantly and systematically being represented in criminalized activities.
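For a sense of what that pixel-averaging step might look like, here's a rough sketch that masks roughly skin-colored pixels with a simple color-range heuristic and averages their lightness per profession. The real analysis was more careful than this (face detection, established skin tone scales); the mask thresholds and the directory layout reused from the sketch above are illustrative assumptions.

```python
# Rough sketch: for each profession's images, average the luma of pixels that
# fall inside a crude YCbCr skin-color range, then compare across professions.
import os
import numpy as np
from PIL import Image

def mean_skin_lightness(path):
    """Average luma of pixels inside a common (crude) skin-tone range."""
    ycbcr = np.asarray(Image.open(path).convert("YCbCr"), dtype=np.float32)
    y, cb, cr = ycbcr[..., 0], ycbcr[..., 1], ycbcr[..., 2]
    skin = (cb > 77) & (cb < 127) & (cr > 133) & (cr < 173)
    return float(y[skin].mean()) if skin.any() else None

for profession in sorted(os.listdir("generated")):
    folder = os.path.join("generated", profession)
    values = [v for f in os.listdir(folder)
              if (v := mean_skin_lightness(os.path.join(folder, f))) is not None]
    if values:
        # Higher average luma means a lighter average skin tone for that keyword.
        print(f"{profession}: mean skin lightness {np.mean(values):.1f}")
```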

Dina, Leo was talking about Stable Diffusion. Exactly what is that and how does it work?

So Stable Diffusion is a text-to-image program that is open source. It's distributed by a company called Stability AI, and the version that we used is, as Leo mentioned, hosted on Hugging Face, which is basically a repository of open source AI models. So some of your listeners may have heard of GitHub, which is a repository of programming code. Hugging Face tries to be sort of like a version of that for AI models. And a lot of your listeners may have actually heard of a different image generation program, which is OpenAI's DALL-E. DALL-E 2, the second version of it, came out in wider distribution last year, around July. It was announced a bit earlier than that, and that was also very popular and attracted a lot of attention. Stable Diffusion followed that and came out as an open source version, and because it was open source, it's been very widely used. In order to use the OpenAI version for, you know, commercial applications, you have to work with OpenAI.

You have to pay for that, and so it's a little different.

I just want to talk for a minute about why we did not look at OpenAI's DALL-E, and that's because it's not open source, so we can't tell what is in the training data for DALL-E in a way that we can for Stable Diffusion, and there are greater limits on what you can do with it. It was sort of difficult to look at the bias there.

Dina, by open source, what exactly do you mean?

So it's basically the opposite of proprietary software. It's freely distributed, it's openly distributed. Anyone can download it, use it, and in the case of AI models, you have greater freedom to play with it, to tweak different parts of the AI model to what you need.

You can see exactly the code or the data that is going behind an AI model, and you can see the different versions of the model over time, and that's very important for people who are trying to improve these things, because you can basically have some sort of version control, so you can fork it. The previous version used to be like this, and now we've improved it, and now we can see clearly the difference between the new version and the previous version. Actually, for this story, we did interview prominent academics within this field, and they've all really stressed this point that one of the only ways to address the problem of bias is to start by having open source models, because then those models can be taken by other academics or other organizations that are also transparent, and whatever they do to them to quote unquote improve them is now again made transparent, made very publicly known, and available to yet more academics to improve upon.
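As a small illustration of that version-control point, an open model hosted in a git-style repository on Hugging Face can be pinned to a specific revision, so an audit can be rerun against a later release and compared. This is a hedged sketch, assuming the diffusers library; the "main" revision string is a placeholder, not an actual commit hash.

```python
# Sketch: pin an open model to a specific repository revision so the same
# prompts can be rerun and compared against a newer release of the model.
from diffusers import StableDiffusionPipeline

baseline = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",
    revision="main",   # replace with an exact commit hash for reproducibility
)
# Rerunning the same prompt set against a later revision makes any change in
# the model's outputs, including its biases, directly comparable.
```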

Again, there's also greater auditability. The reason that we were able to run this experiment on Stable Diffusion is that it's open source. So, you know, we obviously found some significant problems, but there is that auditability. You don't have that with OpenAI's DALL-E. And again, OpenAI has said that they're taking steps to address representation and make sure that the outputs are representative, but you kind of have to trust them, because you don't know what they're doing.

After the break: what's the data set behind these AI generated images? We know from everything we've been hearing about ChatGPT, now GPT-4, that it takes as its source enormous amounts of data that exists on the Internet. What is the source material for generative AI when it comes to images?

So, the source material for most generative AI models, these so called large language models, is basically the entire Internet. In simple terms, it's everything that's been posted on the Internet in the past ten to fifteen years. The way that works is that there is a data set called LAION, which basically collected URLs to images and their accompanying text from all over the Internet over the past fifteen years.
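To give a rough sense of what a data set of URLs and captions looks like in practice, here's a small, hypothetical sketch of inspecting one LAION-style metadata shard with pandas. The file path is a placeholder, and the URL and TEXT column names are an assumption about how the image links and scraped captions are stored in a given release.

```python
# Sketch: peek at a LAION-style metadata shard of image URLs paired with the
# alt text scraped alongside them. Path and column names are assumptions.
import pandas as pd

shard = pd.read_parquet("laion_metadata_shard_0000.parquet")  # assumed local file
print(shard[["URL", "TEXT"]].head())   # image link plus its scraped caption

# Even simple keyword filters on captions hint at what a model trained on this
# data will come to associate with a term.
ceo_rows = shard[shard["TEXT"].str.contains("ceo", case=False, na=False)]
print(len(ceo_rows), "captions mention 'CEO'")
```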

When you're training on data from across the entire Internet, as most of us know, there's a fair amount of unsavory stuff out there on the Internet. And, you know, there's been some academic work done on the earlier version of this data set, this LAION data set, that found pornography, violence, and, again, racial and gender bias. When certain terms that were associated with certain races were used, it was much more likely to bring up an image that was sexualized.

So there are a lot of problems within that data set, and it is an openly available, open source data set, and so the viewpoint of the people behind it is, look, you know, you should use this for academic work. If you're using this in a commercial product, you've got to actually take some responsibility for the content, and we've made some not-safe-for-work filters. There are steps you can take. But, you know, when you're training a model on a large volume of data from across the entire Internet, there is a lot of unsavory stuff in there, and there are way, way too many images in this data set for anybody to go through it and make sure that they're cleaning it up.

Dina, what does Stability AI say about your findings about this bias in their data?

We reached out to Stability AI and explained what we were finding, and they sent us an email statement from a spokesperson saying, quote, all AI models have inherent biases that are representative of the data sets they're trained on, and by open sourcing our models, we aim to support the AI community and collaborate to improve bias evaluation techniques and develop solutions beyond the basic prompt modification. The company also told us that, you know, they have sort of an initiative to develop some open source models that will be trained on data sets that are specific to different countries and cultures, and so part of the argument the company was making is that the open source nature of what they're doing will enable them to address some of these issues by getting more and more data that is more diverse than what they currently have.

Dina, we can see why bias would be so harmful, especially when it comes to images, which are very powerful. What are some of the real world downsides that we see with the possibility of fake images, of bias in images, being proliferated all over the world?

There is an issue of deep fakes, things that are meant to mislead people, misinformation that you can't tell is AI generated. With the specific issue of bias, there's a number of issues that crop up here. One is a representation one. So if we're going to start using all of these synthetic generated images for brochures, for advertisements, for marketing materials, and we're already seeing that, what happens when the marketing materials have all the CEOs be white men? Doesn't that worsen the situation that we already have? You know, one of the things that we found in this experiment was that the bias in Stable Diffusion was actually worse than the real world. So we know that there are fewer female CEOs, but the number of female CEOs that were being generated in these experiments was even lower than the real world. So we're looking at a situation where we're generating more and more content via AI, more and more of these synthetic images. Those images become a part of the body of images, the body of work that's on the internet, and they are more biased than reality. And then in the future those images get fed back into future AI systems, so that you end up in this nasty cycle where the bias is getting worse and worse and being fed back into future systems, which are then less diverse.

So there was a recent Europol report that suggested that by twenty twenty six, ninety percent of all online content could be artificially generated. What happens when ninety percent of all online images are images reinforcing those stereotypes? One of the main impacts is that it can really affect people's mental health and how they project themselves into the world and, you know, what kind of jobs they see themselves doing in life. So that's a really big issue that can definitely be reinforced by this problem.

When we come back: how can artificial intelligence become more intelligent? Dina, earlier Leo said that these open source models have one advantage, which is that everybody is able to kind of work on them and improve them. And if you, say, had an advertising agency making a brochure and you asked it to create a CEO and it's a white male, can't you say, no, that's not the image I'm looking for? That there's a certain amount of responsibility on people who are generating these images not to just simply accept what the generative AI bot spits out?

When we talk about AI bias, a lot of the quote unquote blame for it gets put on the data sets. There needs to also be accountability from users at all levels, and that includes the people that are creating the models, the researchers that are working on the models, who have their own biases that get kind of imprinted on these models, and it includes the people that are using them.

At the end of the day, it's not totally clear to me that you can currently use these models that effectively to even specify in that way and get the output you want.

So do you know what can actually be done to fix this? We talked earlier about how there's a lot of work being done to improve these models.

One of the things that needs to be done is increased diversification of the data set, a way to get data from other countries, other cultures, and there needs to be, to be clear, a way to do that that's ethical.

There have been projects or companies that have tried to source a more diverse set of data, but they've done it in unethical ways. They've tried to get images of people, and they've done it without consent. This sort of cropped up in the facial recognition era, when people were trying to fix those systems. There's also a question about the largeness of all of these models. So the current trend in AI is that bigger is better, that the only way to do these kinds of foundational models is to have the sum total of the Internet dumped into the training data. There are people working on ways to do better, smaller models, in which case you have greater control over what is in the data set and you can do things that are more targeted. If we move to optimizing the technology so that you don't just have to add more volume in order to have a better performing algorithm or model, that could help as well.

Leo, as somebody who is deep in this data and watching how it's developing very rapidly, what are you watching for as this keeps unfolding?

One of the most interesting developments is really the open source versus closed source models, and, you know, which are going to become the status quo, because it's not clear right now. It's very easy to use closed source models in some ways because they have better user interfaces, and they're marketed better, and, you know, it's for profit, so they have all these ways to kind of go really mainstream. But at the same time, the open source models are being used by millions of people, not just people, you know, that are using them as developers or researchers. And then we also see private companies adopting open source models as opposed to closed source models, because they actually recognize the fact that they can build on top of those models within their systems. It's very unclear, and it will be interesting to see if, five or ten years from now, generative AI has become completely a closed source thing, because it's easier to regulate, you can just regulate private companies and tell them what to do, or it's become an open source thing, because there's more transparency and it's easier to see if things are getting better or not.

Leo, Dina, thanks so much for coming on the show.

Thank you, Wes, thank you for having us.

Thanks for listening to us here at The Big Take. It's a daily podcast from Bloomberg and iHeartRadio. For more shows from iHeartRadio, visit the iHeartRadio app, Apple Podcasts, or wherever you listen. And we'd love to hear from you. Email us questions or comments to Big Take at Bloomberg dot net. The supervising producer of The Big Take is Vicki Vergolina. Our senior producer is Kathryn Fink. Federica Romaniello is our producer. Our associate producer is Zaynab Siddiqui. Raphael Amsili is our engineer. Our original music was composed by Leo Sidran. I'm Wes Kasova. We'll be back tomorrow with another Big Take.