How to Fix Bias in Big Data and Artificial Intelligence

Big data analytics and machine learning are on the rise and set for massive further growth over the coming years. The results of a survey conducted jointly by MIT Technology Review and Google Cloud showed that 60 percent of respondents have already implemented a machine learning strategy in their organization. Furthermore, Deloitte predicts that spending on machine learning (ML) and AI will nearly quintuple from $12 billion in 2017 to $57.6 billion in 2021.

Amid this popularity, a growing concern is that algorithms are only as good as the data fed into them. The old adage “garbage in, garbage out” applies to AI and ML as much as it does to any other computing-based system. Developers, along with companies that depend on data and algorithms, therefore face the challenge of ensuring that data remains free of bias.

The extent of this challenge shouldn’t be underestimated. When Microsoft unleashed its conversational chatbot Tay onto Twitter in 2016, it took only 24 hours for the bot to start spewing out racist and pro-Hitler tweets. While this is an extreme example, it illustrates the risk that exists.

Subtle bias is everywhere in our society, so it’s only logical that a machine will take what exists and magnify it. For example, in studying machine-based language processing, researchers found that female names were more closely associated with family-based terms, whereas male names correlated more closely with career-based words.

How AI Bias Can Impact Real-World Decisions

Ultimately, societal bias in data can influence real-world scenarios where algorithms are used to drive decision-making. Consider the examples below.

Amazon reportedly ditched an AI recruiting program late last year because the algorithm was biased in favor of selecting men for technical roles. This was because the tool was trained on ten years of Amazon’s hiring data, and the vast majority of hires were men.

Insurance companies that use algorithms to determine premiums could introduce bias. ProPublica has already reported that minorities generally pay higher insurance premiums than white people, and pointed to the use of algorithms as a cause of this bias.

Algorithms are used by ad-serving platforms such as Google’s AdSense. One experiment showed that there were biases inherent in the job advertisements served to men and women, with women less likely to be shown ads for jobs paying more than $200,000 per year.
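
A simple way to check for disparities like those in the examples above is to compare outcome rates across groups. The following is a minimal Python sketch, not a method taken from any of the reports mentioned. The counts are hypothetical, the function names are invented for illustration, and the 0.8 threshold is the commonly cited "four-fifths" rule of thumb, used here as an assumption.

```python
def selection_rate(selected: int, total: int) -> float:
    """Share of a group that received the favourable outcome."""
    if total <= 0:
        raise ValueError("total must be positive")
    return selected / total


def disparate_impact_ratio(rate_group: float, rate_reference: float) -> float:
    """Ratio of one group's selection rate to the reference group's rate."""
    if rate_reference == 0:
        raise ValueError("reference rate must be non-zero")
    return rate_group / rate_reference


# Hypothetical counts, for illustration only.
women_rate = selection_rate(selected=30, total=200)
men_rate = selection_rate(selected=60, total=200)
ratio = disparate_impact_ratio(women_rate, men_rate)

FOUR_FIFTHS = 0.8  # rule-of-thumb threshold (an assumption, not from the article)
print(f"ratio={ratio:.2f}, flagged={ratio < FOUR_FIFTHS}")
```

A ratio well below 1 does not prove discrimination on its own, but it is a signal that the data or the model deserves a closer look.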

Policies, frameworks, and legislation designed to prevent discrimination and bias now exist in many companies and countries. These usually cover the provision of goods and services, as well as internal processes such as job advertisement, hiring, and promotion. Companies therefore have a vested interest in ensuring that the data and algorithmic solutions they deploy are free of bias. Failing to do so could lead to litigation, loss of brand reputation, and erosion of employee trust.

How to Overcome Bias in Data and AI

Fortunately, there are ways to reduce, or even eliminate, the bias that arises from the use of AI and big data by adjusting the data and algorithms involved.

Either the underlying datasets or the algorithm itself can be adjusted to account for potential bias. For example, taking out all gender data may prevent gender bias from occurring. One study found that it was possible to adjust an algorithm using a framework called Reducing Bias Amplification. This framework takes a verb that is subject to heavy bias, such as cooking, and constrains the algorithm so that it does not create more bias than already exists in the initial data set.
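
To make the idea of amplification concrete, the sketch below compares how skewed a model's predictions are with how skewed its training data was. It is a minimal illustration under stated assumptions, not the framework from the study: the labels are hypothetical, and `gender_share` and `amplification` are names invented here.

```python
from collections import Counter
from typing import List, Tuple

Pair = Tuple[str, str]  # (activity, gender) label for one example


def gender_share(pairs: List[Pair], activity: str, gender: str) -> float:
    """Fraction of examples of `activity` that are labelled with `gender`."""
    counts = Counter(g for a, g in pairs if a == activity)
    total = sum(counts.values())
    if total == 0:
        raise ValueError(f"no examples for activity {activity!r}")
    return counts[gender] / total


def amplification(train: List[Pair], predicted: List[Pair],
                  activity: str, gender: str) -> float:
    """How much more skewed the predictions are than the training data."""
    return (gender_share(predicted, activity, gender)
            - gender_share(train, activity, gender))


# Hypothetical labels, for illustration only.
train = [("cooking", "woman")] * 60 + [("cooking", "man")] * 40
predicted = [("cooking", "woman")] * 75 + [("cooking", "man")] * 25

gap = amplification(train, predicted, "cooking", "woman")
# A positive value means the model magnified the skew in its training data.
print(f"amplification: {gap:+.2f}")
```

A constraint of the kind described above would require this gap to stay at or below a small margin, so the model reproduces no more skew than the data already contains.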

While this may work to an extent, the danger is that humans meddling in the data and algorithms may inadvertently introduce further bias. The human doing the intervening is also likely to have subconscious biases.

Automated Predictions on Encrypted Data

In her TED Talk, MIT graduate student Joy Buolamwini outlines how facial recognition algorithms have used data sets based on the faces of white people. This meant that when she tried to test these algorithms on herself in the course of her own research, as a black woman, her face was unrecognizable to the algorithm. This led her to create the Algorithmic Justice League, with one of the goals being to ensure that developers are using broad and inclusive datasets.
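
One practical step toward broader datasets is to audit how each group is represented before training. The sketch below is a minimal illustration: the group names, counts, and the 10% threshold are all hypothetical, and the helper names are invented here.

```python
from collections import Counter
from typing import Dict, Iterable, List


def group_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Share of the dataset that belongs to each group."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("dataset is empty")
    return {group: n / total for group, n in counts.items()}


def underrepresented(shares: Dict[str, float], min_share: float) -> List[str]:
    """Groups whose share falls below the chosen minimum."""
    return sorted(g for g, s in shares.items() if s < min_share)


# Hypothetical group labels attached to a face dataset, for illustration only.
labels = (["group_a"] * 70 + ["group_b"] * 20
          + ["group_c"] * 7 + ["group_d"] * 3)

shares = group_shares(labels)
print(underrepresented(shares, min_share=0.10))  # ['group_c', 'group_d']
```

The right threshold depends on the application; the point is to make gaps in coverage visible before a model is trained on them.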

One technological approach to bias in big data processing comes from Endor, an MIT-backed project that combines AI and Social Physics. Social Physics is a way of understanding human behavior that analyzes Big Data to build a predictive, computational theory of that behavior.

Endor’s platform uses mathematical tools to model the behavior of human crowds. It can process large quantities of data and create effective, automated predictions by leveraging these models and its proprietary AI.

Endor’s solution to the bias quandary lies in its ability to run predictions on data while filtering out existing biases in the original data (e.g., gender bias). For instance, if an e-commerce company wants to know whom to target next week for a new product, but 90% of its database is women, it can request predictions that exclude the biased (gender-specific) data yet use all other available data on the prospects, with the aim of producing predictions free of that bias.

The platform also uses a technology that can compute on and analyze encrypted data without ever having to decrypt it. This makes accurate predictions scalable and accessible, and lets businesses large and small gain insights from their sensitive data in a secure way.
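
The article does not say which technique the platform uses to compute on encrypted data. As a general illustration of the idea, the sketch below uses a toy version of the Paillier cryptosystem, which allows two encrypted numbers to be added without decrypting either one. This is an assumption chosen for illustration, not a description of Endor's technology, and the tiny primes offer no real security. It needs Python 3.8 or later for the modular inverse.

```python
import math
import secrets

# Tiny primes keep the arithmetic readable; they provide no real security.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # modular inverse of LAM modulo N


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N using generator g = N + 1."""
    if not 0 <= m < N:
        raise ValueError("message out of range")
    while True:
        r = secrets.randbelow(N - 1) + 1  # random value in 1..N-1
        if math.gcd(r, N) == 1:
            break
    # (N + 1)^m mod N^2 equals 1 + m*N, so no exponentiation is needed here.
    return (1 + m * N) * pow(r, N, N_SQ) % N_SQ


def decrypt(c: int) -> int:
    """Recover the plaintext from a ciphertext."""
    return (pow(c, LAM, N_SQ) - 1) // N * MU % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts yields an encryption of the plaintext sum."""
    return c1 * c2 % N_SQ


total = add_encrypted(encrypt(1200), encrypt(345))
print(decrypt(total))  # 1545
```

Whoever performs `add_encrypted` never sees 1200 or 345, only ciphertexts; only the holder of the private values `LAM` and `MU` can read the result.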

Decentralizing the Learning Process

If machines ultimately become intelligent enough to learn to overcome bias, then that learning could be replicated in other machines to create an overall positive snowball effect, similar to what happened with Microsoft’s Tay bot but in reverse. SingularityNET could be the catalyst that achieves this.

The project is using blockchain to decentralize AI algorithms so that one AI machine can learn from another. SingularityNET is the brainchild of AI pioneer Ben Goertzel, who helped develop the Sophia robot. A beta version is expected to be released later this year.

Final Thoughts

Many countries have made significant strides in reducing discrimination toward consumers and employees through policy and legislative frameworks. Now, it’s important that developers and companies incorporate algorithms in a way that doesn’t exacerbate existing human biases in employment processes, the availability of financial services, insurance, and consumer targeting. Emerging technologies that build on existing progress in reducing bias and discrimination are more likely to gain widespread adoption, survive long into the future, and help further human progress.

The Thrive Global Community welcomes voices from many spheres. We publish pieces written by outside contributors with a wide range of opinions, which don’t necessarily reflect our own.