A to Z of Analytics
- by 7wData
Digital Transformation, Big Data, Analytics, IoT, Mobility and Cloud are the hottest terms around, yet they cause a lot of confusion even in mature organizations. This is an effort to simplify the area.
Analytics has taken the world by storm, and it is the powerhouse behind the digital transformation happening in every industry.
Today everybody is generating tons of data: we as consumers leave digital footprints on social media, IoT sensors generate millions of records, and mobile phones are in use from morning until we sleep. All of these data formats are stored in Big Data platforms. But storing this data alone will not take us anywhere unless analytics is applied to it. Hence it is extremely important to close the loop with insights from analytics.
Here is my version of A to Z for Analytics:
Artificial Intelligence: AI is the capability of a machine to imitate intelligent human behavior. BMW, Tesla and Google are using AI for self-driving cars. AI should be applied to tough real-world problems, from climate modeling to disease analysis, for the betterment of humanity.
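Boosting and Bagging: techniques used to generate more accurate models by ensembling multiple models together. Bagging trains models independently on random resamples of the data and combines their predictions; boosting trains models one after another, each focusing on the errors of the previous ones.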
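As a rough illustration of bagging, the sketch below trains several very simple models on bootstrap resamples and lets them vote. The one-threshold "stump" model, the toy data and all function names are assumptions made up for this example, not part of any particular library.

```python
import random
from collections import Counter

def fit_stump(points):
    """Pick the threshold t that best fits the rule 'predict 1 when x > t'."""
    best_threshold, best_correct = None, -1
    for threshold, _ in points:
        correct = sum((x > threshold) == bool(label) for x, label in points)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

def train_bagging(points, n_models=25, seed=0):
    """Fit one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(points) for _ in points]
        models.append(fit_stump(sample))
    return models

def bagged_predict(thresholds, x):
    """Combine the stumps by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (x, label) pairs, label 1 for larger x.
data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
models = train_bagging(data)
print(bagged_predict(models, 0), bagged_predict(models, 10))
```

An odd number of models is used so that the vote cannot tie.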
CRISP-DM: the Cross-Industry Standard Process for Data Mining. It was developed by a consortium of companies including SPSS, Teradata, Daimler and NCR Corporation in 1997 to bring order to the development of analytics models. The six major steps are business understanding, data understanding, data preparation, modeling, evaluation and deployment.
Data preparation: In analytics deployments, more than 60% of the time is spent on data preparation. The general rule is garbage in, garbage out. Hence it is important to cleanse and normalize the data and make it available for consumption by the model.
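A minimal sketch of what "cleanse and normalize" can mean in practice, assuming missing values appear as None or empty strings and that min-max scaling to the range 0 to 1 is the desired normalization. The function name and sample values are hypothetical.

```python
def clean_and_normalize(values):
    """Drop missing entries, then min-max scale the rest to [0, 1]."""
    cleaned = [float(v) for v in values if v is not None and v != ""]
    if not cleaned:
        return []
    lo, hi = min(cleaned), max(cleaned)
    if hi == lo:
        # All values identical: there is no spread to scale.
        return [0.0 for _ in cleaned]
    return [(v - lo) / (hi - lo) for v in cleaned]

# Hypothetical raw column with gaps and a number stored as text.
raw = [10, None, 20, "", 30, "40"]
normalized = clean_and_normalize(raw)
print(normalized)
```

Real projects usually need more than this (outlier handling, imputation instead of dropping rows), but the shape of the step is the same.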
Ensembling: the technique of combining two or more algorithms to get more robust predictions. It is like combining all the marks we obtain in exams to arrive at a final overall score. Random Forest is one such example, combining multiple decision trees.
Feature selection: Simply put, this means keeping only those features or variables in the data that are really relevant and removing the irrelevant ones. This typically improves model accuracy.
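One simple way to do this, sketched below, is to keep only the features whose correlation with the target exceeds a cutoff. The 0.5 cutoff, the column names and the data are assumptions for illustration; correlation filtering is only one of many selection methods.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def select_features(features, target, min_abs_corr=0.5):
    """Keep features whose absolute correlation with the target meets the cutoff."""
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= min_abs_corr]

# Hypothetical data: only price moves together with sales.
features = {
    "price": [10, 12, 14, 16, 18],
    "store_id": [7, 7, 7, 7, 7],
    "noise": [3, 1, 4, 1, 5],
}
sales = [100, 90, 80, 70, 60]
selected = select_features(features, sales)
print(selected)
```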
Gini Coefficient: used to measure the predictive power of a model, for example a credit-scoring model that predicts who will repay and who will default on a loan.
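In credit scoring the Gini coefficient is commonly computed from the area under the ROC curve as Gini = 2 × AUC − 1, so 0 means no better than random and 1 means perfect separation. The sketch below assumes that definition; the scores and default flags are invented for the example.

```python
def auc(scores, defaulted):
    """Share of (defaulter, non-defaulter) pairs where the defaulter scores higher."""
    pos = [s for s, d in zip(scores, defaulted) if d == 1]
    neg = [s for s, d in zip(scores, defaulted) if d == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))

def gini(scores, defaulted):
    """Gini coefficient derived from AUC."""
    return 2 * auc(scores, defaulted) - 1

# Hypothetical predicted default risk and actual outcomes (1 = defaulted).
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
defaulted = [1, 1, 0, 1, 0, 0]
print(round(gini(scores, defaulted), 3))
```

Here 8 of the 9 defaulter/non-defaulter pairs are ranked correctly, giving an AUC of 8/9 and a Gini of 7/9.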
Histogram: a graphical representation of the distribution of a set of numeric data, usually a vertical bar graph, used in exploratory analytics and in the data preparation step.
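Behind the bar graph, a histogram is just a count of values per equal-width bin. A small sketch, assuming four bins and a made-up list of ages, with text bars standing in for the chart:

```python
def histogram(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # No spread: everything lands in the first bin.
        return [len(values)] + [0] * (n_bins - 1)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical customer ages.
ages = [21, 22, 25, 31, 33, 38, 39, 40, 47, 59]
counts = histogram(ages, 4)
for i, count in enumerate(counts):
    print(f"bin {i}: {'#' * count}")
```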
Independent Variable: the variable that is changed or controlled in a scientific experiment to test its effect on the dependent variable, such as the effect of a price increase on sales.