6 Steps to Effective Data Preparation for Quality Conclusions

Data preparation is usually the most time-consuming part of a data analysis project. To get good results, follow the six steps here, starting with Understand the Business Needs, Get to Know the Data, and Wrangle, Munge, and Mash Up.

Garbage in, garbage out. In this age of big and unstructured data analytics, good data preparation is a must if you want to avoid invalid results or being blocked from analyses that would benefit your business. You may also need to dedicate up to 80% of the time of a data analysis initiative to preparing data properly. So, to optimize results, follow the six steps below.

Step 1 – Understand the Business Needs

Ask! Get the ultimate beneficiaries of your data preparation to tell you what business insights or knowledge they want from the data available. Check that enterprise goals translate into appropriate business questions and key performance indicators (KPIs), which can then be mapped onto the data and analytics to be used. Don’t get sucked into a “proof of concept” project without a valid, useful business benefit.


Step 2 – Get to Know the Data

Understand where the data will be accessed from, and whether it falls into the category of simple, diversified, big, or complex data. These categories are determined by the overall volume of data and the number of tables. The data you need may be in Excel files, in a data warehouse, or in a CRM system. You’ll need the right credentials to access the data, and the right software and hardware resources to process it.
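
As a rough illustration of that categorization, here is a minimal Python sketch. The article gives no numeric cut-offs and does not spell out which combination maps to which label, so the thresholds, the `SourceProfile` type, and the mapping itself (large volume plus many tables read as "complex", and so on) are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical cut-offs: the article does not give numeric thresholds.
ROW_THRESHOLD = 10_000_000
TABLE_THRESHOLD = 10


@dataclass
class SourceProfile:
    """Basic facts about one data source (hypothetical helper type)."""
    name: str
    total_rows: int
    table_count: int


def categorize(profile, row_threshold=ROW_THRESHOLD, table_threshold=TABLE_THRESHOLD):
    """Label a source by volume and table count (assumed mapping)."""
    large_volume = profile.total_rows >= row_threshold
    many_tables = profile.table_count >= table_threshold
    if large_volume and many_tables:
        return "complex"
    if large_volume:
        return "big"
    if many_tables:
        return "diversified"
    return "simple"


if __name__ == "__main__":
    # Made-up sources, used only to exercise the function.
    sources = [
        SourceProfile("sales_workbook", total_rows=20_000, table_count=3),
        SourceProfile("crm_export", total_rows=500_000, table_count=40),
        SourceProfile("warehouse", total_rows=2_000_000_000, table_count=120),
    ]
    for source in sources:
        print(source.name, "->", categorize(source))
```

Both thresholds are parameters, so they can be tuned to whatever your own hardware and software handle comfortably.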

Time to take out the garbage. Identify or amend your data sources to ensure they are complete, accurate, and current.
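
The three checks named above (complete, accurate, current) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `audit_records` function, the field names, the validation rule, and the one-year freshness window are hypothetical and would be replaced by rules that fit your own sources.

```python
from datetime import date, timedelta


def audit_records(records, required_fields, validators, date_field,
                  max_age_days, today=None):
    """Return the indexes of records that fail each of the three checks.

    records         -- list of dicts, one per row
    required_fields -- fields that must be present and non-empty (complete)
    validators      -- {field: function returning True if valid} (accurate)
    date_field      -- field holding a datetime.date of last update (current)
    max_age_days    -- rows older than this many days count as stale
    """
    today = today or date.today()
    cutoff = today - timedelta(days=max_age_days)
    report = {"incomplete": [], "inaccurate": [], "stale": []}
    for index, record in enumerate(records):
        # Complete: every required field has a non-empty value.
        if any(record.get(field) in (None, "") for field in required_fields):
            report["incomplete"].append(index)
        # Accurate: every value that is present passes its rule.
        if any(record.get(field) is not None and not check(record[field])
               for field, check in validators.items()):
            report["inaccurate"].append(index)
        # Current: the row was updated on or after the cutoff date.
        updated = record.get(date_field)
        if updated is None or updated < cutoff:
            report["stale"].append(index)
    return report


if __name__ == "__main__":
    # Made-up rows, used only to exercise the checks.
    rows = [
        {"customer_id": "C1", "amount": 120.0, "updated": date(2017, 2, 20)},
        {"customer_id": "", "amount": 80.0, "updated": date(2017, 2, 25)},
        {"customer_id": "C3", "amount": -5.0, "updated": date(2015, 1, 1)},
    ]
    print(audit_records(
        rows,
        required_fields=["customer_id", "amount"],
        validators={"amount": lambda value: value >= 0},
        date_field="updated",
        max_age_days=365,
        today=date(2017, 3, 1),
    ))
```

The function only reports problem rows and does not change anything, which leaves the decision to fix, drop, or re-source a record with whoever owns the data.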


