
Algorithmic Bias in AI: What It Is and How to Fix It

By IBM Technology

Summary

Topics Covered

  • Bad Data Fuels Bias Feedback Loops
  • Design Flaws Embed Human Biases
  • Proxy Data Masks Discrimination
  • Resume AI Favored Males
  • Diverse Teams Prevent Hidden Biases

Full Transcript

The more we use AI algorithms to discover patterns, to generate insights and just to help us make decisions, the more we should be concerned with the impact of algorithmic bias.

What is it and how can we minimize it?

Well, let's find out.

Algorithmic bias causes machine learning algorithms to produce unfair or discriminatory outcomes, which can lead to harmful decisions and actions.

It's something we want to avoid.

So let's take a look at the causes of algorithmic bias, some real-world examples, and mitigation strategies. Let's get started with causes.

Now algorithmic bias is not necessarily caused by the AI algorithms themselves, but by how data is collected and coded.

We can think of this in terms of four different causes.

Now, the most obvious is biases that occur in the actual training dataset itself.

Essentially, we're talking about bad data.

That's data that is non-representative, lacks information, or in some other way misrepresents the ground truth.

It can also be data that is incorrectly classified, causing the algorithm to misunderstand what the data represents, and a little bad data can go a long way.

AI systems that generate biased results may use those results as input data for further decision making, which creates a feedback loop that can reinforce the bias over and over again.
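
To make that concrete, here's a minimal sketch in Python, with entirely made-up approval numbers, of how a model that learns from its own past decisions can widen an initial gap between two groups round after round:

```python
# A minimal sketch of a bias feedback loop; all numbers are hypothetical.
# Each round, the model's own decisions are appended to the training
# data, so an initial gap between groups keeps widening.

historical = {"group_a": [1] * 70 + [0] * 30,   # 70% approved so far
              "group_b": [1] * 50 + [0] * 50}   # 50% approved so far

def approval_rate(labels):
    return sum(labels) / len(labels)

for round_num in range(5):
    for group, labels in historical.items():
        rate = approval_rate(labels)
        # The model nudges decisions toward whatever pattern it already
        # sees: high-approval groups get approved slightly more often.
        new_rate = min(1.0, rate * 1.05) if rate > 0.6 else rate * 0.95
        approvals = int(new_rate * 100)
        # Its own decisions become the next round's training data.
        labels.extend([1] * approvals + [0] * (100 - approvals))
    print(round_num,
          round(approval_rate(historical["group_a"]), 3),
          round(approval_rate(historical["group_b"]), 3))
```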

Now, another cause of algorithmic bias is really related to algorithmic design.

So this is talking about programming errors, such as an AI designer unfairly weighting factors in the decision-making process.

Designers can unknowingly transfer some of their own biases into the system, or developers might embed the algorithm with subjective rules based on their own conscious or unconscious biases.

Poor algorithmic design can also lead to correlation bias, such as an algorithm that infers a causal relationship between increased shark attacks and higher ice cream sales.

Hey, they're both higher in the summer, but that's correlation, not causation, and an example of a model failing to consider other factors in the data that may be more important.
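
If you want to see why that trap is so easy to fall into, here's a short simulation sketch, assuming NumPy is available, where temperature is the hidden common cause: the raw correlation between ice cream sales and shark attacks looks strong, but it largely vanishes once temperature is controlled for:

```python
# Sketch: temperature is the hidden common cause of both series.
import numpy as np

rng = np.random.default_rng(0)
temp = rng.uniform(10, 35, 1000)                 # daily temperature
ice_cream = 20 * temp + rng.normal(0, 50, 1000)  # sales rise with heat
sharks = 0.3 * temp + rng.normal(0, 2, 1000)     # attacks rise with heat

# The raw correlation looks impressively strong...
print(np.corrcoef(ice_cream, sharks)[0, 1])

# ...but regress temperature out of both series and correlate the
# residuals (a partial correlation), and it largely disappears.
def residuals(y, x):
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

print(np.corrcoef(residuals(ice_cream, temp),
                  residuals(sharks, temp))[0, 1])
```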

We can also have biases in proxy data as well.

What's proxy data?

Well, that's data used as a stand in for attributes not available in the ground truth data.

So things like race or gender.

And that could be because they're in some way protected, or they're just plain unavailable.

For example, zip codes often serve as proxies for socioeconomic status, which might unfairly disadvantage particular demographic groups when evaluating applications or opportunities.
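
One simple way to catch a proxy, sketched below with pandas and hypothetical column names, is to check how strongly the candidate feature pins down the protected attribute:

```python
# Hypothetical columns: does "zip_code" effectively reveal "group"?
import pandas as pd

df = pd.DataFrame({
    "zip_code": ["10001", "10001", "10002", "10002", "10002", "10003"],
    "group":    ["A",     "A",     "B",     "B",     "B",     "A"],
})

# Per-zip group composition: if each zip is dominated by one group,
# a model can rediscover the protected attribute through the proxy.
composition = pd.crosstab(df["zip_code"], df["group"], normalize="index")
print(composition)
print("average dominant-group share per zip:",
      composition.max(axis=1).mean())
```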

And there are biases in evaluation as well.

How we interpret the results from an algorithm.

So even if the algorithm is completely neutral and completely data-driven, how an individual or a business applies the algorithm's output can lead to unfair outcomes, depending on how they understand those outputs.

Now there are a bunch of real world examples of algorithmic bias.

Look, I'm not here to name and shame, but let's briefly discuss a few high profile ones, like biases that have occurred in recruitment.

Now an IT company built an algorithm, and that algorithm could review resumes.

Unfortunately, they discovered this algorithm systematically discriminated against female job applicants.

Why?

Well, the developers training the hiring algorithm used resumes from past hires, and it turns out that those past hires were predominantly male.

As a result, the algorithm favored keywords and characteristics found in men's resumes.

For example, the algorithm downgraded resumes that included the word "women's," as in "women's rugby team," and it favored the kinds of words that men tend to use more often, such as "captured" and "executed."
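
Here's a deliberately tiny illustration of how that happens, not the actual recruitment system: a text classifier trained on four hypothetical resumes, labeled with skewed past hiring decisions, learns a negative weight for the token "women" and a positive one for "executed":

```python
# Toy illustration only; resumes and labels are made up.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

resumes = [
    "executed project plan captured market share",    # hired
    "captured requirements executed delivery",        # hired
    "led women's rugby team managed budget",          # rejected
    "women's chess club president organized events",  # rejected
]
hired = [1, 1, 0, 0]

vec = CountVectorizer()
X = vec.fit_transform(resumes)
clf = LogisticRegression().fit(X, hired)

# The classifier penalizes "women" and rewards "executed" purely
# because of the skewed historical labels.
weights = dict(zip(vec.get_feature_names_out(), clf.coef_[0]))
print(weights["women"], weights["executed"])
```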

Now, algorithms, they also play a big role in guiding decisions in the area of finance.

In the financial services sector, algorithmic bias can have severe consequences for people's livelihoods, as historical data can contain all sorts of demographic biases affecting things like creditworthiness and loan approvals.

For example, a study from the University of California, Berkeley showed that an AI system for mortgages routinely charged minority borrowers higher rates for the same loans when compared to white borrowers.

And look, I could go on: AI image generators, for example, where generated images of people in specialized professions showed biases related to gender and age; or how bias in pricing led to ride-sharing algorithms charging more for drop-offs in neighborhoods with high nonwhite populations; or how policing algorithms in Colombia reflected social biases that misrepresented forecasted criminal activity.

But how about we instead use the remaining time to figure out the steps we can take to reduce algorithmic bias?

Now, mitigating bias from AI systems starts with AI governance.

Meaning the guardrails that make sure AI tools and systems are safe and ethical.

So let's take a look at four ways to do that across the system lifecycle.

And first up is diverse and representative data.

Machine learning is only as good as the data that trains it.

Data fed into machine learning models must be representative of all groups of people and reflective of the actual demographics of society.

Unlike, say, a training dataset filled with only male resumes. But good, representative data is just the start.
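
As a quick gut check, and only as a sketch with hypothetical figures, you can compare a training set's demographic mix against reference population shares and flag anything badly under-represented:

```python
# Representativeness gut check; all figures are hypothetical.
from collections import Counter

training_labels = ["male"] * 820 + ["female"] * 180
population_share = {"male": 0.49, "female": 0.51}

counts = Counter(training_labels)
total = sum(counts.values())
for group, expected in population_share.items():
    observed = counts[group] / total
    # Flag any group whose share falls well below its population share.
    flag = "  <-- under-represented" if observed < 0.8 * expected else ""
    print(f"{group}: dataset {observed:.0%} vs population {expected:.0%}{flag}")
```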

There should also be a system for ongoing bias detection that can catch and correct potential biases before they create problems. Now, that could be through initiatives like impact assessments, algorithmic auditing, and causation tests.

Remember sharks and ice cream?
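
As one example of what an algorithmic audit can compute, here's a sketch of the disparate impact ratio, using hypothetical selection counts; under the informal "four-fifths rule" often cited in US employment contexts, a ratio below 0.8 is a red flag:

```python
# Disparate impact ratio with hypothetical audit counts: each group's
# selection rate divided by the highest group's rate.
def disparate_impact(selected: dict, totals: dict) -> dict:
    rates = {g: selected[g] / totals[g] for g in totals}
    top = max(rates.values())
    return {g: rate / top for g, rate in rates.items()}

ratios = disparate_impact(selected={"group_a": 60, "group_b": 30},
                          totals={"group_a": 100, "group_b": 100})
for group, ratio in ratios.items():
    # Under the informal four-fifths rule, below 0.8 is a red flag.
    print(group, round(ratio, 2), "FLAG" if ratio < 0.8 else "ok")
```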

Now, this is where human-in-the-loop processes can help, where recommendations are reviewed by humans before a decision is made final.

Now, AI algorithms can often be something of a black box, making it difficult to understand how they arrive at their outcomes.

So with transparent AI, systems document and do their best to explain the underlying algorithm's methodology.

Now, to be clear, this is still an emerging field, but advances are being made in AI interpretability, which goes some way to explaining how algorithms arrive at their outcomes.

And then finally, we have inclusive AI, which means developing AI systems where the developers, the data scientists, and the machine learning engineers are varied racially, economically, by education level, by gender, by job description, and by all sorts of other demographic metrics.

This will bring different perspectives to help identify and mitigate biases that might otherwise go unnoticed.

The fact is that algorithmic bias has many causes, and as AI becomes more prevalent in decision making, the importance of detecting and mitigating these biases only grows.

If you have any questions, please drop us a line below.

And if you want to see more videos like this in the future, please like and subscribe.

Thanks for watching.
