Use Case and Evaluation Flashcards

2
Q

What is a Data Science Use Case (DSUC)?

A

A scenario or project that creates value specifically through data-driven insights.

3
Q

Why is identifying DSUCs important?

A

It helps organizations increase gain, reduce risk, and decrease effort.

4
Q

What are the key steps in identifying a DSUC?

A

Define the problem, collect ideas, structure the ideas, define success, and assess potential risks.

5
Q

What are operational-related DSUCs?

A

Use cases focused on optimizing operations, predicting failures, and improving product quality.

6
Q

What are fraud-related DSUCs?

A

Use cases detecting unauthorized access, fraudulent behavior, and security threats.

7
Q

What are customer-related DSUCs?

A

Use cases focused on improving customer experience, predicting churn, and optimizing marketing strategies.

8
Q

What are the two types of evaluation for DSUCs?

A

Model-centric evaluation and business-centric evaluation.

9
Q

What is model-centric evaluation?

A

Evaluating the predictive model’s performance using metrics like accuracy, precision, and recall.

10
Q

What is business-centric evaluation?

A

Evaluating the impact of a model on business KPIs such as revenue, customer retention, and operational efficiency.

11
Q

What is the Machine Learning Canvas?

A

A structured framework used to define, plan, and evaluate machine learning projects.

12
Q

What are the key components of the Machine Learning Canvas?

A

Prediction Task, Decisions, Value Proposition, Data Collection, Data Sources, Impact Simulation, Making Predictions, Building Models, Features, and Monitoring.

13
Q

What is customer churn?

A

The rate at which customers stop doing business with a company over a certain period.

14
Q

Why is customer churn important to businesses?

A

Reducing churn helps retain valuable customers and improves profitability.

15
Q

What type of machine learning task is customer churn prediction?

A

A supervised learning binary classification problem.

16
Q

What features are used in churn prediction models?

A

Customer demographics, purchase history, subscription details, engagement levels, and payment history.

17
Q

What data sources are used for churn analysis?

A

CRM databases, payment records, and website analytics.

18
Q

How is model performance evaluated in churn prediction?

A

Using metrics such as accuracy, precision, recall, and F1-score.

19
Q

What are the key steps in the machine learning workflow?

A

Feature extraction, data splitting, model training, evaluation, and deployment.
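
The five steps can be sketched end to end on a toy churn problem. This is only an illustration: the records, the single `logins` feature, and the threshold "model" are all made up, and a real project would use a proper learning algorithm and a shuffled split.

```python
# Hypothetical raw records: logins in the last 30 days and whether the customer churned.
records = [
    {"logins": 1, "churned": 1}, {"logins": 14, "churned": 0},
    {"logins": 2, "churned": 1}, {"logins": 9, "churned": 0},
    {"logins": 0, "churned": 1}, {"logins": 11, "churned": 0},
    {"logins": 3, "churned": 1}, {"logins": 8, "churned": 0},
    {"logins": 4, "churned": 1}, {"logins": 12, "churned": 0},
]

# 1. Feature extraction: turn raw records into inputs X and labels y.
X = [r["logins"] for r in records]
y = [r["churned"] for r in records]

# 2. Data splitting: first 70% for training, last 30% held out for testing.
cut = len(X) * 7 // 10
X_train, y_train = X[:cut], y[:cut]
X_test, y_test = X[cut:], y[cut:]

# 3. Model training: pick the login threshold with the best training accuracy.
#    The toy model predicts churn (1) when logins < threshold.
def accuracy(threshold, xs, ys):
    preds = [1 if x < threshold else 0 for x in xs]
    return sum(p == t for p, t in zip(preds, ys)) / len(ys)

best_threshold = max(range(0, 16), key=lambda th: accuracy(th, X_train, y_train))

# 4. Evaluation: score the chosen threshold on data it has not seen.
train_accuracy = accuracy(best_threshold, X_train, y_train)
test_accuracy = accuracy(best_threshold, X_test, y_test)

# 5. Deployment: expose the trained model as a function other code can call.
def predict_churn(logins):
    return 1 if logins < best_threshold else 0

print(best_threshold, train_accuracy, round(test_accuracy, 2))
```

On this toy data the test accuracy comes out lower than the training accuracy, which is the reason the evaluation step uses held-out data.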

20
Q

What is overfitting in machine learning?

A

When a model performs well on training data but poorly on unseen data because it has memorized the training examples, including their noise, instead of learning patterns that generalize.

21
Q

How can overfitting be prevented?

A

Using techniques like regularization, cross-validation, and reducing model complexity.
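
As a rough sketch of the cross-validation idea, the helper below (a hypothetical `k_fold_indices`, not a library function) splits sample indices into k folds so that every sample is used for validation exactly once. It assumes the data has already been shuffled.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k roughly equal, contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(n_samples=10, k=3))
for train_idx, val_idx in folds:
    print("train:", train_idx, "validate:", val_idx)
```

A model would be trained on each `train_idx` and scored on the matching `val_idx`; a large gap between training and validation scores is a sign of overfitting.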

22
Q

What is accuracy in model evaluation?

A

The proportion of correctly classified instances out of all predictions ((TP + TN) / (TP + TN + FP + FN)).

23
Q

What are the limitations of accuracy?

A

It does not account for class imbalance, so it can be misleading when one class is rare. In fraud detection, for example, a model that never flags fraud can still score a very high accuracy.
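
A minimal sketch of the problem, using made-up numbers (1,000 transactions, 10 of them fraudulent) and a trivial classifier that never predicts fraud:

```python
# Hypothetical labels: 1 = fraud, 0 = legitimate. Only 10 of 1,000 are fraud.
y_true = [1] * 10 + [0] * 990

# A trivial "model" that never flags anything as fraud.
y_pred = [0] * 1000

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / len(y_true)

# Recall on the fraud class: the share of real fraud cases that were caught.
caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fraud_recall = caught / sum(y_true)

print(f"accuracy = {accuracy:.2f}")          # accuracy = 0.99
print(f"fraud recall = {fraud_recall:.2f}")  # fraud recall = 0.00
```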

24
Q

What is a confusion matrix?

A

A table used to evaluate classification models by displaying true positives, false positives, true negatives, and false negatives.
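
The four cells can be counted directly from true and predicted labels. The function name `confusion_counts` and the sample labels below are illustrative only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # true positive
        elif p == positive and t != positive:
            fp += 1  # false positive (Type I error)
        elif p != positive and t != positive:
            tn += 1  # true negative
        else:
            fn += 1  # false negative (Type II error)
    return tp, fp, tn, fn

# Hypothetical labels: 1 = churned, 0 = stayed.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print("              predicted 1  predicted 0")
print(f"actual 1      {tp:^11}  {fn:^11}")
print(f"actual 0      {fp:^11}  {tn:^11}")
```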

25
Q

What is a Type I error (False Positive)?

A

Incorrectly classifying a negative instance as positive.

26
Q

What is a Type II error (False Negative)?

A

Incorrectly classifying a positive instance as negative.

27
Q

What is precision in classification?

A

The proportion of true positives among all predicted positives (TP / (TP + FP)).

28
Q

What is recall in classification?

A

The proportion of actual positives that were correctly predicted (TP / (TP + FN)).

29
Q

What is the F1-score?

A

The harmonic mean of precision and recall, balancing both metrics (2 × precision × recall / (precision + recall)).
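
A small worked example of precision, recall, and F1 computed from confusion-matrix counts. The counts are invented, and returning 0.0 when a denominator is zero is one common convention, assumed here for simplicity.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical counts: 30 true positives, 10 false positives, 20 false negatives.
precision, recall, f1 = precision_recall_f1(tp=30, fp=10, fn=20)
print(f"precision = {precision:.2f}")  # precision = 0.75
print(f"recall    = {recall:.2f}")     # recall    = 0.60
print(f"F1        = {f1:.2f}")         # F1        = 0.67
```

Because the harmonic mean is pulled toward the smaller value, F1 stays low unless both precision and recall are reasonably high.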

30
Q

What are techniques for improving model performance?

A

Dimensionality reduction, hyperparameter tuning, and ensemble methods.

31
Q

What is dimensionality reduction?

A

Reducing the number of features in a dataset to remove redundant or irrelevant information.

32
Q

What are common dimensionality reduction techniques?

A

Principal Component Analysis (PCA) and feature selection methods.
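
As one simple example of a feature selection method, the sketch below drops features whose variance is at or below a threshold, since a column that never changes carries no information. The helper name `select_by_variance` and the data are hypothetical. PCA is not shown here.

```python
from statistics import pvariance

def select_by_variance(rows, threshold=0.0):
    """Keep only the columns whose population variance exceeds the threshold."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > threshold]
    reduced = [[row[i] for i in keep] for row in rows]
    return keep, reduced

# Hypothetical customers: age, a constant flag, and average monthly spend.
rows = [
    [25, 1, 3.0],
    [40, 1, 7.5],
    [31, 1, 1.2],
    [58, 1, 9.9],
]

keep, reduced = select_by_variance(rows)
print(keep)     # [0, 2]
print(reduced)  # [[25, 3.0], [40, 7.5], [31, 1.2], [58, 9.9]]
```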

33
Q

What is hyperparameter tuning?

A

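Optimizing a model's configuration settings, which are chosen before training rather than learned from the data (for example, learning rate or tree depth), to improve performance.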
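
A minimal grid-search sketch. The `validation_score` function is a stand-in for "train a model with these settings and score it on validation data"; its formula and the grid values are invented purely so the example runs.

```python
from itertools import product

def validation_score(learning_rate, max_depth):
    # Stand-in for training and validating a real model.
    # This toy formula peaks at learning_rate=0.1, max_depth=4.
    return 1.0 - abs(learning_rate - 0.1) - 0.05 * abs(max_depth - 4)

# Hypothetical search grid.
grid = {"learning_rate": [0.01, 0.1, 0.5], "max_depth": [2, 4, 8]}

best_params, best_score = None, float("-inf")
for lr, depth in product(grid["learning_rate"], grid["max_depth"]):
    score = validation_score(lr, depth)
    if score > best_score:
        best_params = {"learning_rate": lr, "max_depth": depth}
        best_score = score

print(best_params, best_score)
```

Grid search tries every combination, so its cost grows quickly with the number of hyperparameters; random search is a common alternative.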

34
Q

What are ensemble methods?

A

Techniques that combine multiple models to improve predictive accuracy, such as bagging and boosting.
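
The combining step can be illustrated with a simple majority vote over three classifiers. This shows only how predictions are merged; it does not implement the bootstrap sampling of bagging or the sequential reweighting of boosting, and the model outputs are invented.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model prediction lists into one list by majority vote."""
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical predictions from three models on the same five customers.
model_a = [1, 0, 1, 1, 0]
model_b = [1, 1, 0, 1, 0]
model_c = [0, 0, 1, 1, 1]

ensemble = majority_vote([model_a, model_b, model_c])
print(ensemble)  # [1, 0, 1, 1, 0]
```

Using an odd number of models avoids ties for binary labels.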

35
Q

What is live evaluation in machine learning?

A

Continuously tracking model performance on real-world data to detect drift and degradation.
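
One possible way to track this is a rolling window over recent predictions whose true outcomes have become known. The class name, window size, and alert threshold below are assumptions for illustration, not a standard API.

```python
from collections import deque

class RollingAccuracyMonitor:
    """Track accuracy over the most recent predictions with known outcomes."""

    def __init__(self, window_size, alert_threshold):
        self.outcomes = deque(maxlen=window_size)  # True where prediction was correct
        self.alert_threshold = alert_threshold

    def record(self, prediction, actual):
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def degraded(self):
        # Only alert once the window is full, to avoid noisy early readings.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy() < self.alert_threshold

monitor = RollingAccuracyMonitor(window_size=5, alert_threshold=0.8)

for _ in range(5):              # five correct predictions
    monitor.record(prediction=1, actual=1)
healthy_flag = monitor.degraded()

for _ in range(3):              # then three wrong ones push older results out
    monitor.record(prediction=1, actual=0)
degraded_flag = monitor.degraded()

print(healthy_flag, degraded_flag, monitor.accuracy())  # False True 0.4
```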

36
Q

What is Return on Investment (ROI) in data science?

A

The financial benefit gained from implementing a data science solution relative to its cost.
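
A common way to express ROI is net benefit divided by cost; the card does not specify a formula, so this convention and the figures below are assumptions.

```python
def roi(benefit, cost):
    """Return ROI as a fraction: (benefit - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost

# Hypothetical project: 100,000 spent, 150,000 gained through retained customers.
project_roi = roi(benefit=150_000, cost=100_000)
print(f"ROI = {project_roi:.0%}")  # ROI = 50%
```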

37
Q

Why is monitoring machine learning models important?

A

To ensure that model predictions remain accurate and aligned with business objectives.