UNIT 1 Flashcards
It is an approach that offers new techniques to solve problems
Data Science and Analytics
What are the roles in analytics?
Collector/Data Steward, Data Engineer, Business Analyst, Modeler/Data Scientist
Also known as a data scientist; models algorithms and makes sure the data are correct
Modeler/Data Scientist
They are the business experts in the field of data science
Business Analyst
It oversees all roles in the data field
Project Manager
Hacking skills and substantive expertise
Danger zone
Substantive expertise and math and statistics knowledge
Traditional research
Hacking skills and math and statistics knowledge
Machine Learning
Due to its interdisciplinary nature, it requires an intersection of abilities: hacking skills, math and statistics knowledge, and substantive expertise in a field of science
Data science
It is necessary for working with massive amounts of electronic data that must be acquired, cleaned, and manipulated
Hacking skills
It allows a data scientist to choose appropriate methods and tools in order to extract insight from data
Math and statistics knowledge
In a scientific field, it is crucial for generating motivating questions and hypotheses and interpreting results
Substantive expertise
It lies at the intersection of knowledge of math and statistics with substantive expertise in a scientific field
Traditional research
Stems from combining hacking skills with math and statistics knowledge, but does not require scientific motivation
Machine Learning
Hacking skills combined with substantive scientific expertise without rigorous methods can beget incorrect analyses
Danger zone
Scope: Macro
Data Science
Goal: To ask the right questions
Data Science
Major Fields: Machine learning, AI, Search engine engineering, corporate analytics
Data Science
Using Big Data: Yes
Data Science and Analytics
Scope: Micro
Data Analytics
Goal: To find actionable data
Data Analytics
Major Fields: Healthcare, gaming, travel, industries with immediate data needs
Data Analytics
It is the mother of invention
Necessity
History: Report Writing; Goal is automation
1970s
History: Centralized System; Goal is to have Enterprise Resource Planning or a Management Information System
1980s
History: Business Intelligence; Goal is apps for everyone. Applications for personal use were invented and shared
1990s
History: Internet Data and Mining
2000s
History: Big data and data science used for real-time analysis
2010s
T/F: The needs of the industry, as demanded by the fast-moving realities of the present time, also drive the evolution of analytics
TRUE
T/F: The value in the data “haystack” is guided not by your knowledge of the domain but by the tools or techniques
FALSE
T/F: Finding that value, the combination of all the skill sets that you need, is data science
FALSE
Evolution: Describes historical data; Helps understand how things are going
Descriptive
Evolution: Helps understand unique drivers; Segmentation, Statistical & Sensitivity analysis
Diagnostic
Evolution: Forecast future performance, events, and results
Predictive
Evolution: Analysis that suggests a prescribed action
Prescriptive
Evolution: Proactive action; Learn at scale; Reason with purpose; Interact naturally
Cognitive
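Worked example for the Descriptive and Predictive cards above: a minimal sketch, not taken from the course material. The sales figures and the function names describe and forecast_next are hypothetical, and a straight-line trend is only one of many possible forecasting methods.

```python
from statistics import mean

# Hypothetical monthly sales figures, used only for illustration.
sales = [10.0, 12.0, 14.0, 16.0]


def describe(history):
    """Descriptive: summarize what has already happened."""
    return {"mean": mean(history), "min": min(history), "max": max(history)}


def forecast_next(history):
    """Predictive: extend a least-squares straight line one step ahead.

    Needs at least two observations, otherwise the slope is undefined.
    """
    n = len(history)
    xs = range(n)  # time index 0, 1, ..., n-1
    x_bar, y_bar = mean(xs), mean(history)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n  # value of the fitted line at the next time step


print(describe(sales))
print(forecast_next(sales))
```

The first call only looks backward at the recorded values, while the second uses the same history to estimate a value that has not been observed yet.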
Medical image analysis, Machine learning in disease diagnosis, Genetics and Genomics, Drug development, Virtual assistance for customer support
Data Science and Analytics in Healthcare
Finding useful patterns in data; it is the process of knowledge discovery, machine learning, and predictive analytics
Data Mining
Which of the following are NOT data mining?
Descriptive statistics, Exploratory visualization, Dimensional slicing, Hypothesis testing, Queries
It is a type of learning model in data mining, also known as directed data mining. The model generalizes the relationship between the input and the output
Supervised
It is a type of learning model in data mining, also known as undirected data mining. The objective of this class of data mining techniques is to find patterns in data based on the relationships between the data points themselves
Unsupervised
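Worked example for the Supervised and Unsupervised cards above: a minimal sketch under stated assumptions, not the course's own method. The toy data and the helper names predict_nearest and two_means are hypothetical; 1-nearest-neighbor and a two-center k-means loop stand in for the supervised and unsupervised families.

```python
# Toy 1-D data: hypothetical values, chosen only for illustration.
labeled = [(1.0, "low"), (1.2, "low"), (7.9, "high"), (8.3, "high")]
unlabeled = [1.1, 0.9, 8.0, 8.4, 1.3, 7.7]


def predict_nearest(train, x):
    """Supervised: learn from (input, output) pairs.

    Returns the label of the closest labeled example (1-nearest neighbor).
    """
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label


def two_means(points, iterations=10):
    """Unsupervised: no labels, group points by their distance to each other.

    Runs a tiny k-means loop with two centers on 1-D data.
    """
    centers = [min(points), max(points)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups


print(predict_nearest(labeled, 7.0))  # high
print(two_means(unlabeled))
```

The supervised function needs known outputs ("low", "high") to generalize from, while the unsupervised one finds the two groups using only the data points themselves.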
What are the groups of learning models in data mining?
Classification, Regression, Clustering, Anomaly Detection, Time Series Forecasting, Association, Text and Sentiment Analysis
What are the steps of data mining in the CRISP-DM framework?
Business understanding, Data understanding, Data preparation, Modeling, Testing and Evaluation, Deployment