The first rule of data science

“The first rule of data science is: don’t ask how to define data science.” So says Josh Bloom, a UC Berkeley professor of astronomy and a lead principal investigator (PI) at the Berkeley Institute for Data Science (BIDS). If this approach seems problematic, that’s because it is—data science is more of an emerging interdisciplinary philosophy, a wide-ranging modus operandi that entails a cultural shift in the academic community. The term means something different to every data scientist, and in a time when all researchers create, contribute to, and share information that describes how we live and interact with our surroundings in unprecedented detail, all researchers are data scientists.

We live in a digitized world in which massive amounts of data are harvested daily to inform actions and policies for the future. We build sophisticated systems to collect, organize, analyze, and share data. We each have unlimited access to huge amounts of information and the tools to interpret it. We are more aware than ever how molecules and cells move, how inflation fluctuates, and how the flu travels, all in real time. We can efficiently distribute bus stations and plan transit schedules. With the right tools, we can predict how proteins misfold in our brains, or what our galaxy might look like in a thousand years. In a society driven by data, knowledge is a commodity that is created and shared transparently all over the world. It connects causes with effects, familiar places with distant locales, the past with the future, people with one another.

In this rapidly changing world, universities are faced with the challenge of adapting to increasingly data-driven research agendas. At UC Berkeley and elsewhere, scientists and administrators are working together to reshape how we do research and ultimately restructure the culture of academia.

More than ever, researchers in all disciplines find themselves wading through more and more kinds of data. Frequently, there is no standard system for storing, organizing, or analyzing this data. Data often never leaves the lab; the students graduate, the computers are upgraded, and records are simply lost. This makes research in the social, physical, and life sciences difficult to reproduce and develop further. To make matters worse, it’s no easy task to build tools for general scientific computing and data analysis. Doing so requires a set of skills researchers must largely learn independently, and a timeframe that extends beyond the length of the average PhD.

Historically, no single practice described the simultaneous use of so many different skill sets and bases of knowledge. However, in recent years data science has emerged as the field that exists at the intersection of math and statistics knowledge, expertise in a science discipline, and so-called “hacking skills,” or computer programming ability. While these skills are changing the way that science is practiced, they’re also changing other aspects of society, such as business and technology startups. In a world where rapidly advancing technology is forcibly changing data science practices, universities are struggling to keep up, often losing good researchers to industries that place a high value on their computational skills.

Despite its increasing importance and relevance, it’s almost impossible to pin down what data science actually is. Data scientists hate doing it. Bloom describes data science as a context-dependent way of thinking about and working around data—a set of skills derived from statistics, computer science, and physical and social sciences. Cathryn Carson, the associate dean of social science who is heavily involved in BIDS and the new social science Data Laboratory (D-Lab), is more interested in how we can use the idea of data science to do more interesting science. This involves bringing people from different areas of expertise together to work on multifaceted problems. “It’s a kind of social engineering,” Carson says.

 
