Data Science Flashcards

1
Q

What is data science?

A

The extraction of knowledge from data

2
Q

Data analytics vs. data science

A
  • Data analytics analyzes existing data to draw insights about the past
  • Data science combines data with math and statistics to make predictions
3
Q

What can Machine Learning determine?

A
  1. Is this A or B? —> Classification
  2. Is this weird? —> Anomaly detection
  3. How much/how many? —> Regression analysis
  4. How is it organized? —> Unsupervised learning (e.g. clustering)
  5. What should I do next? —> Reinforcement learning
4
Q

Data scientist vs. data analyst vs. data engineer

A
  • A data scientist analyzes and interprets data to find insights and patterns and to make predictions
  • A data analyst explores data and builds visualizations, focusing on insights into past data
  • A data engineer manages and organizes data, maintaining databases and data pipelines
5
Q

Can you prove a hypothesis?

A

No. You can never prove that a hypothesis is true; you can only fail to reject it.
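
A minimal sketch of what "fail to reject" looks like in practice, assuming a coin-flip experiment, a null hypothesis that the coin is fair, and a significance level of 0.05. The function names `binomial_two_sided_p` and `decide` and the constant `ALPHA` are illustrative, not part of the card.

```python
from math import comb

ALPHA = 0.05  # assumed significance level


def binomial_two_sided_p(k: int, n: int, p0: float = 0.5) -> float:
    """Exact two-sided p-value for k successes in n trials under H0: p = p0.

    Sums the probability of every outcome that is no more likely
    than the observed one.
    """
    probs = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in probs if p <= observed * (1 + 1e-9))


def decide(k: int, n: int) -> str:
    """The test never returns 'H0 is true'; it only rejects or fails to reject."""
    p_value = binomial_two_sided_p(k, n)
    return "reject H0" if p_value < ALPHA else "fail to reject H0"


print(decide(6, 10))   # 6 heads in 10 flips
print(decide(10, 10))  # 10 heads in 10 flips
```

Six heads in ten flips is compatible with a fair coin, so the test fails to reject; that is not evidence that the coin is exactly fair, only that the data do not contradict it.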

6
Q

What is PCA?

A

Principal Component Analysis: an unsupervised learning method that uses patterns (correlations) present in high-dimensional data (data with many input variables) to reduce the complexity of the data while retaining most of the information.
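
A rough sketch of the idea, restricted to two dimensions so that the eigen-decomposition of the covariance matrix can be written in closed form with the standard library only. The helper names `pca_2d` and `project` are hypothetical; real work would normally rely on a linear algebra library, and the sketch assumes the data are not all identical points.

```python
import math


def _means(points):
    n = len(points)
    return sum(x for x, _ in points) / n, sum(y for _, y in points) / n


def pca_2d(points):
    """Return (first principal direction, share of total variance it explains)."""
    n = len(points)
    mean_x, mean_y = _means(points)
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]] of the centred data.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    largest = (sxx + syy) / 2 + math.sqrt(((sxx - syy) / 2) ** 2 + sxy**2)
    # Matching eigenvector; fall back to an axis when the variables are uncorrelated.
    if sxy != 0:
        vx, vy = largest - syy, sxy
    elif sxx >= syy:
        vx, vy = 1.0, 0.0
    else:
        vx, vy = 0.0, 1.0
    norm = math.hypot(vx, vy)
    return (vx / norm, vy / norm), largest / (sxx + syy)


def project(points, direction):
    """Reduce each 2-D point to one coordinate along the principal direction."""
    mean_x, mean_y = _means(points)
    return [(x - mean_x) * direction[0] + (y - mean_y) * direction[1]
            for x, y in points]


points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # lies exactly on y = 2x
direction, explained = pca_2d(points)
scores = project(points, direction)
print(direction, explained, scores)
```

Because the sample points lie on a single line, one component carries all the variance, and the two original variables can be replaced by the single coordinate in `scores` without losing information.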

7
Q

In a decision tree, what are the end nodes called?

A

Leaves.

8
Q

History of ML/AI

A
  • 1st generation: The Backend - Large Datasets (Fraud detection, search algos, SCM)
  • 2nd generation: The human side - data about humans (rec. systems, social media, commerce + ads)
  • 3rd generation: Modern Machine Learning - pattern recognition (speech recog., computer vision, translation)
9
Q

What is pruning? (to prune = to trim back)

A

Pruning removes unnecessary splits. It replaces parts of the tree that encode strict, rigid decision boundaries with smoother ones that generalise better, which reduces tree complexity (the number of splits in the tree).

A simple yet highly effective pruning method is to go through each node in the tree and evaluate the effect of removing it on the cost function. If it doesn’t change much, then prune away!
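
The node-by-node method described above can be sketched as follows. This is an illustration under assumptions, not a library implementation: the tree is a nested dict (a hypothetical layout in which every node stores a fallback `prediction`), the cost function is the number of misclassified examples in a held-out set, and the names `predict`, `cost`, `count_splits` and `prune` are made up for this example.

```python
def predict(node, x):
    """Walk from the given node down to a leaf and return its prediction."""
    while "feature" in node:
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["prediction"]


def cost(tree, data):
    """Cost function: number of misclassified (x, label) pairs."""
    return sum(predict(tree, x) != label for x, label in data)


def count_splits(node):
    """Tree complexity = number of splits (internal nodes)."""
    if "feature" not in node:
        return 0
    return 1 + count_splits(node["left"]) + count_splits(node["right"])


def prune(tree, node, data, tolerance=0):
    """Bottom-up: collapse a split into a leaf if the cost rises by at most `tolerance`."""
    if "feature" not in node:
        return
    prune(tree, node["left"], data, tolerance)
    prune(tree, node["right"], data, tolerance)
    before = cost(tree, data)
    saved = dict(node)
    for key in ("feature", "threshold", "left", "right"):
        del node[key]  # the node is now a leaf using its stored prediction
    if cost(tree, data) - before > tolerance:
        node.update(saved)  # removing the split hurt too much, so restore it


tree = {
    "feature": 0, "threshold": 5, "prediction": "A",
    "left": {  # this split was fitted to noise
        "feature": 1, "threshold": 2, "prediction": "A",
        "left": {"prediction": "A"},
        "right": {"prediction": "B"},
    },
    "right": {"prediction": "B"},
}
validation = [((1, 1), "A"), ((2, 3), "A"), ((3, 1), "A"),
              ((7, 1), "B"), ((8, 3), "B")]

print(count_splits(tree), cost(tree, validation))  # before pruning
prune(tree, tree, validation)
print(count_splits(tree), cost(tree, validation))  # after pruning
```

In this toy example the inner split is removed because dropping it does not increase the validation cost, while the root split is kept because collapsing it would misclassify both "B" examples.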

10
Q
A