AI has a big data problem. Here’s how to fix it

Artificial intelligence has, quite literally, a big data problem – one that the COVID-19 crisis has made impossible to ignore.

For businesses, governments, and individuals alike, the global pandemic has effectively redefined "normal" life; but while most of us have now adjusted to the change, the same cannot be said of AI systems, which base their predictions on what the past used to look like.

Speaking at the CogX 2020 conference, British mathematician David Barber said: "The deployment of AI systems is currently clunky. Typically, you go out there, collect your data set, label it, train the system and then deploy it. And that's it – you don't revisit the deployed system. But that's not good if the environment is changing."

Barber was referring to supervised machine learning, which he called today's "classical paradigm" in AI, and which consists of teaching algorithms by example. In a supervised model, an AI system is fed a large dataset that has been previously labeled by humans, and which is used to train the technology to recognize patterns and make predictions.

You could train an algorithm to automate lending decisions at a bank, for example, based on individuals' incomes or credit scores. Cue COVID-19, along with a whole new set of banking patterns, and the AI system is likely to be at a loss to decide who gets the cash.

Similarly, a few months into the COVID-19 crisis, a US researcher pointed out that algorithms, despite all the training data they have been fed, wouldn't be all that helpful in understanding the nature of the outbreak or its spread across the globe.

Because of the lack of training data about past coronaviruses, the research explains, most of the predictions generated by AI tools were found to be unreliable, and results were often skewed away from the severity of the crisis.

Meanwhile, in healthtech, the makers of AI health tools struggled to update their algorithms due to a lack of relevant data about the virus, resulting in many "symptom finder" chatbots being a little off the mark.

With data from a pre-COVID environment no longer matching the real world, supervised algorithms are running out of examples to base their predictions on. And to make matters worse, AI systems don't flag their uncertainties to their human operators.

"The AI won't tell you when it actually isn't confident about the accuracy of its prediction and needs a human to come in," said Barber. "There are many uncertainties in these systems. So it is important that the AI can alert the human when it is not confident about its decision."
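To make the idea of an AI that "alerts the human" concrete, here is a minimal sketch of one common way to do it: compare the model's top class probability against a threshold and defer when it falls short. The `Decision` type, the `decide` function, the 0.8 threshold and the example probabilities are all illustrative assumptions, not something described by Barber.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    label: Optional[str]  # None means "no automatic decision was made"
    confidence: float     # probability of the most likely label
    needs_human: bool     # True when the case should go to an operator


def decide(probabilities: dict[str, float], threshold: float = 0.8) -> Decision:
    """Return the top label, or defer to a human when confidence is too low.

    `probabilities` maps each candidate label to the model's estimated
    probability. The threshold is a hypothetical value chosen for the example.
    """
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    if confidence < threshold:
        # Not confident enough: flag the case instead of guessing.
        return Decision(label=None, confidence=confidence, needs_human=True)
    return Decision(label=label, confidence=confidence, needs_human=False)


# Hypothetical lending example: one clear case, one borderline case.
confident = decide({"approve": 0.93, "reject": 0.07})
unsure = decide({"approve": 0.55, "reject": 0.45})
print(confident)
print(unsure)
```

A raw probability is only a rough proxy for confidence, particularly when the environment has shifted away from the training data, so a threshold like this is a starting point rather than a guarantee.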

This is what Barber described as an "AI co-worker situation", where humans and machines would interact to make sure that gaps aren't left unfilled. In fact, it is a method within artificial intelligence that is slowly emerging as a particularly efficient one.

Dubbed "active learning", it consists of establishing a teacher-learner relationship between AI systems and human operators. Instead of feeding the algorithm a huge labeled dataset, and letting it draw conclusions – often in a less-than-transparent way – active learning lets the AI system do the bulk of data labeling on its own, and crucially, ask questions when it has a doubt.

The process involves a small pool of human-labeled data, called the seed, which is used to train the algorithm. The AI system is then presented with a larger set of unlabeled data, which the algorithm annotates by itself, based on its training – before integrating the newly labeled data back into the seed.

When the tool isn't confident about a particular label, it can ask for help from a human operator in the form of a query. The choices made by human experts are then fed back into the system, to improve the overall learning process. 
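The seed, self-labeling and query steps described above can be sketched as a short loop. This is a toy illustration under stated assumptions, not a production method: the "model" is a one-dimensional nearest-centroid classifier, confidence is measured as the margin between the two closest centroids, and `human_oracle` is a stand-in for a real human operator. All names, data and the `min_margin` value are hypothetical.

```python
from statistics import mean


def train(seed):
    """Compute one centroid per label from (value, label) pairs."""
    by_label = {}
    for value, label in seed:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Return (label, margin) for a value; assumes at least two labels.

    The margin is how much closer the value is to the nearest centroid
    than to the runner-up. A small margin means low confidence.
    """
    ranked = sorted(centroids.items(), key=lambda item: abs(value - item[1]))
    (best_label, best_centroid), (_, second_centroid) = ranked[0], ranked[1]
    margin = abs(value - second_centroid) - abs(value - best_centroid)
    return best_label, margin


def active_learning(seed, pool, ask_human, min_margin=2.0):
    """Label the pool, querying the human only for low-confidence items."""
    seed = list(seed)
    queries = 0
    for value in pool:
        centroids = train(seed)           # retrain on everything labeled so far
        label, margin = predict(centroids, value)
        if margin < min_margin:           # not confident: send a query
            label = ask_human(value)
            queries += 1
        seed.append((value, label))       # fold the new label back into the seed
    return seed, queries


def human_oracle(value):
    """Stand-in for a human expert (hypothetical ground-truth rule)."""
    return "high" if value >= 5 else "low"


seed = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
pool = [0.5, 9.5, 5.2, 4.6, 7.0, 3.0]
labeled, queries = active_learning(seed, pool, human_oracle)
print(f"{queries} of {len(pool)} items needed a human")
```

With this toy data, only the two points near the boundary (5.2 and 4.6) trigger a query; the other four are labeled by the algorithm itself and added to the seed.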

The immediate appeal of active learning lies in the much smaller volume of labeled data that is needed to train the system.
