9 Service Analytics Flashcards
Which phenomena contribute to the formation of huge amounts of data, also called big data?
Internet of Things -> every electronic object generates data
Industry 4.0 -> A lot more data gets “harvested” in all industries
What are the four “Vs” of big data?
variety
volume
velocity
veracity (truthfulness)
Why is Big Data such a big thing?
People expect to gain new insights by analysing this data
- find structures
-> these new insights are expected to enable new business models or reveal ways to make processes more efficient
Name the two levels at which big data will be exploited and give examples
Centralized level: Central data model and intelligence
- transportation systems, warehouses
Autonomous/ decentralized intelligence
- Self-managing traffic lights
- packages will find their way automatically
Name 8 (business) analytics methods, group them into "basic" and "advanced", and sort them by their degree of intelligence/competitive advantage
Basic: Standard reporting, Ad hoc reporting, Drill down, Alerts
Advanced: Forecasting, Simulation, Predictive modeling, Optimization
Name three kinds of analytics
Descriptive
Predictive
Prescriptive
Which departments of a company are more likely to rely on data and analytics?
with increasing value:
Customer, HR, Strategic, Operational, Financial
Distinguish supervised from unsupervised machine learning
supervised:
- we need to train the model
-> a set of known problems/answers is needed for training
- typical tasks: regression or classification
- Example: Tell a child to sort cars into sports cars and SUVs after explaining the characteristics of each
unsupervised:
- identify previously unknown patterns
-> the outcome might be a structure we haven’t thought of yet
- typical tasks: clustering or association rules
- Example: Tell a child to sort cars but don’t give any criteria
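The car-sorting contrast above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the car data, the feature choice (horsepower, ground clearance), the nearest-centroid classifier, and the naive 2-means initialisation are all assumptions made for the example. Features are left unscaled, so horsepower dominates the distances.

```python
# Hypothetical car data: (horsepower, ground clearance in cm) plus a known label.
labeled = [
    ((400, 11), "sports car"),
    ((450, 10), "sports car"),
    ((180, 21), "SUV"),
    ((200, 22), "SUV"),
]


def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(c) / len(points) for c in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Supervised: learn one centroid per known label, then classify new cars.
def train(examples):
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(pts) for label, pts in by_label.items()}


def predict(model, features):
    return min(model, key=lambda label: sq_dist(model[label], features))


# Unsupervised: no labels are given; 2-means groups the cars on its own.
def two_means(points, iterations=10):
    centers = [points[0], points[-1]]  # naive initialisation (assumption)
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            idx = min((0, 1), key=lambda i: sq_dist(centers[i], p))
            groups[idx].append(p)
        centers = [centroid(g) if g else centers[i] for i, g in enumerate(groups)]
    return groups


model = train(labeled)
print(predict(model, (420, 12)))
print(two_means([features for features, _ in labeled]))
```

The supervised part needs the labels to build its model; the unsupervised part only sees the raw points and returns two groups without names.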
Definition: Machine learning
Algorithms that can learn from and make predictions on data
Definition: Data Mining/ Knowledge discovery in databases
The nontrivial process of identifying valid, novel, potentially useful, and understandable patterns in data
Draw the CRISP-DM circle
Business understanding <-> Data understanding
Data understanding -> Data preparation
Data preparation <-> Modeling
Modeling -> Evaluation -> Deployment
Evaluation -> Business understanding
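As a memory aid, the circle can be written down as a small directed graph. This is only a sketch: the two back-and-forth links (business understanding/data understanding and data preparation/modeling) follow the commonly drawn CRISP-DM diagram and are an assumption where the card does not show an arrow; the helper `reachable` is a hypothetical name.

```python
# CRISP-DM phases as a directed graph: phase -> phases that can follow it.
CRISP_DM = {
    "Business understanding": ["Data understanding"],
    "Data understanding": ["Business understanding", "Data preparation"],
    "Data preparation": ["Modeling"],
    "Modeling": ["Data preparation", "Evaluation"],
    "Evaluation": ["Deployment", "Business understanding"],
    "Deployment": [],
}


def reachable(graph, start):
    """Return every phase that can be reached from `start` (depth-first)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


print(sorted(reachable(CRISP_DM, "Business understanding")))
```

The edge from Evaluation back to Business understanding is what makes the process a cycle rather than a one-way pipeline.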
Describe the term Data Science using a Venn Diagram
Computer Science + Statistics = Art zone, solving problems that never appear
Computer Science + Business Application = Danger zone, automation of gut feeling
Statistics + Business Application = Theory zone, solution without implementation
All three disciplines are needed!
Briefly explain multivariate regression
Is it used for supervised or unsupervised learning?
Predict the value of a response variable with the help of its correlation with other variables
- find the right few (~ 2-4) variables that explain the response variable realistically
- not too many -> overfitting
-> used for supervised learning
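A minimal sketch of a regression with two explanatory variables, fitted by ordinary least squares via the normal equations. The data are made up (generated exactly from y = 1 + 2*x1 + 3*x2), and the function names `fit_linear` and `predict_response` are hypothetical. The solver assumes the explanatory variables are not perfectly collinear.

```python
def fit_linear(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    X = [[1.0, *row] for row in rows]  # leading 1.0 is the intercept column
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * t for x, t in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    # Assumes X^T X is invertible (no perfectly collinear variables).
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def predict_response(coeffs, row):
    """Intercept plus the weighted sum of the explanatory variables."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], row))


# Hypothetical training data, generated from y = 1 + 2*x1 + 3*x2.
rows = [(1, 1), (2, 1), (1, 2), (3, 4), (4, 2)]
targets = [6, 8, 9, 19, 15]

coeffs = fit_linear(rows, targets)
print([round(c, 6) for c in coeffs])
print(round(predict_response(coeffs, (2, 2)), 6))
```

The known responses in `targets` are what make this supervised: the model is fitted to answers that are already available.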
Briefly describe the method “classification” and name techniques for it
One variable indicates class membership. The other variables are used to predict it.
Techniques: Naive Bayes, k-Nearest Neighbor, Decision Trees (Quinlan’s ID3, C4.5), Logistic Regression, Neural Networks
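One of the listed techniques, k-Nearest Neighbor, fits in a few lines. This is a sketch with made-up two-dimensional points and class names "A" and "B"; the function name `knn_classify` and the default k=3 are assumptions for the example.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority class among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (features, class membership).
training = [
    ((1.0, 1.0), "A"),
    ((1.2, 0.8), "A"),
    ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"),
    ((5.2, 4.9), "B"),
    ((4.8, 5.1), "B"),
]

print(knn_classify(training, (1.1, 1.0)))  # prints A
print(knn_classify(training, (5.0, 4.8)))  # prints B
```

The class label plays the role of the variable that indicates membership; the coordinates are the other variables used to predict it.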
Name two applications of unsupervised learning
Clustering/ Segmentation
Association rules
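For association rules, the two standard measures are support and confidence. A minimal sketch, assuming a made-up list of shopping baskets; the function names are hypothetical.

```python
# Hypothetical shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Share of baskets with the antecedent that also contain the consequent."""
    return support(antecedent | consequent, baskets) / support(antecedent, baskets)


# Rule "bread -> butter": how often does butter appear when bread does?
print(support({"bread", "butter"}, transactions))
print(confidence({"bread"}, {"butter"}, transactions))
```

No labels are involved: the rule is discovered from co-occurrence in the data alone, which is what makes this unsupervised.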