When Big Data Becomes Bad Data


A recent ProPublica analysis of The Princeton Review’s prices for online SAT tutoring shows that customers in areas with a high density of Asian residents are often charged more. When presented with this finding, The Princeton Review called it an “incidental” result of its geographic pricing scheme. The case illustrates how even a seemingly neutral price model can lead to inadvertent bias — bias that’s hard for consumers to detect and even harder to challenge or prove.

[pullquote cite="Lauren Kirchner" type="right"]Corporations are increasingly relying on algorithms to make business decisions, and that raises new legal questions.[/pullquote]

Over the past several decades, an important tool for assessing and addressing discrimination has been the “Disparate Impact” theory. Attorneys have used this idea to successfully challenge policies that have a discriminatory effect on certain groups of people, whether or not the entity that crafted the policy was motivated by an intent to discriminate. It’s been deployed in lawsuits involving employment decisions, housing and credit. Going forward, the question is whether the theory can be applied to bias that results from new technologies that use algorithms.

One unexpected effect of the company's geographic approach to pricing is that Asians are almost twice as likely to be offered a higher price than non-Asians, an analysis by ProPublica shows.

The term “disparate impact” was first used in the 1971 Supreme Court case Griggs v. Duke Power Company. The Court ruled that, under Title VII of the Civil Rights Act, it was illegal for the company to use intelligence test scores and high school diplomas — factors which were shown to disproportionately favor white applicants and substantially disqualify people of color — to make hiring or promotion decisions, whether or not the company intended the tests to discriminate. A key aspect of the Griggs decision was that the power company couldn’t prove their intelligence tests or diploma requirements were actually relevant to the jobs they were hiring for.

In the years since, several disparate impact cases have made their way to the Supreme Court and lower courts, most having to do with employment discrimination. This June, the Supreme Court’s decision in Texas Dept. of Housing and Community Affairs v. Inclusive Communities Project, Inc. affirmed the use of the disparate impact theory to fight housing discrimination. The Inclusive Communities Project had used a statistical analysis of housing patterns to show that a tax credit program effectively segregated Texans by race. Sorelle Friedler, a computer science researcher at Haverford College and a fellow at Data & Society, called the Court’s decision “huge,” both “in favor of civil rights…and in favor of statistics.”

So how will the courts address algorithmic bias? From retail to real estate, from employment to criminal justice, the use of data mining, scoring software and predictive analytics programs is proliferating at an exponential rate. Software that makes decisions based on data like a person’s ZIP code can reflect, or even amplify, the results of historical or institutional discrimination. “[A]n algorithm is only as good as the data it works with,” Solon Barocas and Andrew Selbst write in their article “Big Data’s Disparate Impact,” forthcoming in the California Law Review. “Even in situations where data miners are extremely careful, they can still effect discriminatory results with models that, quite unintentionally, pick out proxy variables for protected classes.”
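The proxy-variable mechanism Barocas and Selbst describe can be made concrete with a toy example. The data and prices below are entirely hypothetical (they are not ProPublica's figures); the point is only that a pricing rule which never sees a customer's group membership can still produce a group disparity when ZIP code correlates with that membership:

```python
# Hypothetical illustration: a pricing rule that uses only ZIP code
# can still yield group disparities when ZIP acts as a proxy.
from collections import defaultdict

# Invented customers: (zip_code, group). The rule below sees only zip_code.
customers = [
    ("11355", "asian"), ("11355", "asian"), ("11355", "non-asian"),
    ("73301", "non-asian"), ("73301", "non-asian"), ("73301", "asian"),
]

# "Geographic" pricing: charge more in certain ZIPs.
HIGH_PRICE_ZIPS = {"11355"}

def quote_price(zip_code: str) -> int:
    return 8400 if zip_code in HIGH_PRICE_ZIPS else 6600

# Average quoted price per group, even though the rule never saw the group.
totals, counts = defaultdict(int), defaultdict(int)
for zip_code, group in customers:
    totals[group] += quote_price(zip_code)
    counts[group] += 1

for group in sorted(totals):
    print(group, totals[group] / counts[group])  # asian 7800.0, non-asian 7200.0
```

Because group membership is unevenly distributed across ZIP codes, the "neutral" geographic rule quotes one group a higher average price — no intent to discriminate required.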

It’s troubling enough when Flickr’s auto-tagging of online photos labels pictures of black men as “animal” or “ape,” or when researchers determine that Google search results for black-sounding names are more likely to be accompanied by ads about criminal activity than search results for white-sounding names. But what about when big data is used to determine a person’s credit score, ability to get hired, or even the length of a prison sentence?

Because disparate impact theory is results-oriented, it would seem to be a good way to challenge algorithmic bias in court. A plaintiff would only need to demonstrate bias in the results, without having to prove that a program was conceived with bias as its goal. But there is little legal precedent. Barocas and Selbst argue in their article that expanding disparate impact theory to challenge discriminatory data-mining in court “will be difficult technically, difficult legally, and difficult politically.”

There still exists “a large legal difference between whether there is explicit legal discrimination or implicit discrimination,” said Friedler, the computer science researcher. “My opinion is that, because more decisions are being made by algorithms, that these distinctions are being blurred.”
