Use Case and Evaluation Flashcards
What is a Data Science Use Case (DSUC)?
A scenario or project that creates business value through data-driven insights.
Why is identifying DSUCs important?
It helps organizations increase gain, reduce risk, and decrease effort.
What are the key steps in identifying a DSUC?
Define the problem, collect ideas, structure the ideas, define success, and assess potential risks.
What are operational-related DSUCs?
Use cases focused on optimizing operations, predicting failures, and improving product quality.
What are fraud-related DSUCs?
Use cases detecting unauthorized access, fraudulent behavior, and security threats.
What are customer-related DSUCs?
Use cases focused on improving customer experience, predicting churn, and optimizing marketing strategies.
What are the two types of evaluation for DSUCs?
Model-centric evaluation and business-centric evaluation.
What is model-centric evaluation?
Evaluating the predictive model’s performance using metrics like accuracy, precision, and recall.
What is business-centric evaluation?
Evaluating the impact of a model on business KPIs such as revenue, customer retention, and operational efficiency.
What is the Machine Learning Canvas?
A structured framework used to define, plan, and evaluate machine learning projects.
What are the key components of the Machine Learning Canvas?
Prediction Task, Decisions, Value Proposition, Data Collection, Data Sources, Impact Simulation, Making Predictions, Building Models, Features, and Monitoring.
What is customer churn?
The rate at which customers stop doing business with a company over a certain period.
Why is customer churn important to businesses?
Reducing churn helps retain valuable customers and improves profitability.
What type of machine learning task is customer churn prediction?
A supervised learning binary classification problem.
What features are used in churn prediction models?
Customer demographics, purchase history, subscription details, engagement levels, and payment history.
What data sources are used for churn analysis?
CRM databases, payment records, and website analytics.
How is model performance evaluated in churn prediction?
Using metrics such as accuracy, precision, recall, and F1-score.
What are the key steps in the machine learning workflow?
Feature extraction, data splitting, model training, evaluation, and deployment.
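The data-splitting step above can be sketched in a few lines. A minimal sketch with stdlib Python only; the integer records and the 80/20 split ratio are hypothetical stand-ins for a real dataset:

```python
import random

# stand-in records; in practice these would be feature rows for each customer
data = list(range(20))

random.seed(0)          # fixed seed so the split is reproducible
random.shuffle(data)    # shuffle before splitting to avoid ordering bias

split = int(0.8 * len(data))          # 80% train, 20% held-out test
train, test = data[:split], data[split:]
```

Keeping the test set untouched until final evaluation is what makes its score an honest estimate of performance on unseen data.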
What is overfitting in machine learning?
When a model performs well on training data but poorly on unseen data due to memorization.
How can overfitting be prevented?
Using techniques like regularization, cross-validation, and reducing model complexity.
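Of the techniques above, cross-validation is easy to sketch: split the data into k folds and rotate which fold is held out for validation. A minimal sketch with hypothetical index data, ignoring shuffling and uneven fold sizes:

```python
def k_fold_indices(n, k):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    fold = n // k
    idx = list(range(n))
    for i in range(k):
        val = idx[i * fold : (i + 1) * fold]          # held-out fold
        train = idx[: i * fold] + idx[(i + 1) * fold :]  # everything else
        yield train, val

splits = list(k_fold_indices(10, 5))  # 5 folds over 10 samples
```

Every sample appears in exactly one validation fold, so averaging the k validation scores gives a more stable estimate of generalization than a single train/test split.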
What is accuracy in model evaluation?
The proportion of correctly classified instances out of all predictions.
What are the limitations of accuracy?
It does not account for class imbalance: in problems such as fraud detection, a model that always predicts the majority class can achieve high accuracy while missing every fraud case.
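The accuracy pitfall is easy to demonstrate. A minimal sketch with a made-up 95/5 class imbalance, assuming 1 marks a fraudulent transaction:

```python
# 95 legitimate (0) and 5 fraudulent (1) transactions: a 95/5 imbalance
y_true = [0] * 95 + [1] * 5

# a useless model that always predicts "legitimate"
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
print(accuracy, caught)  # 0.95 accuracy, yet zero fraud cases caught
```

Recall (and hence the F1-score) on the fraud class is 0 here, which is why imbalanced problems are evaluated with precision and recall rather than accuracy alone.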
What is a confusion matrix?
A table used to evaluate classification models by displaying true positives, false positives, true negatives, and false negatives.
What is a Type I error (False Positive)?
Incorrectly classifying a negative instance as positive.
What is a Type II error (False Negative)?
Incorrectly classifying a positive instance as negative.
What is precision in classification?
The proportion of true positives among all predicted positives (TP / (TP + FP)).
What is recall in classification?
The proportion of actual positives that were correctly predicted (TP / (TP + FN)).
What is the F1-score?
The harmonic mean of precision and recall, balancing both metrics.
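The confusion-matrix counts are all that is needed to compute the three metrics above. A minimal sketch on hypothetical predictions:

```python
# hypothetical true labels and model predictions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# confusion-matrix counts
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

precision = tp / (tp + fp)                              # TP / (TP + FP)
recall = tp / (tp + fn)                                 # TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)      # harmonic mean
print(precision, recall, f1)  # 0.75 0.75 0.75
```

The harmonic mean punishes imbalance between the two: if either precision or recall drops toward zero, the F1-score drops with it.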
What are techniques for improving model performance?
Dimensionality reduction, hyperparameter tuning, and ensemble methods.
What is dimensionality reduction?
Reducing the number of features in a dataset to remove redundant or irrelevant information.
What are common dimensionality reduction techniques?
Principal Component Analysis (PCA) and feature selection methods.
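PCA finds the directions of greatest variance by eigendecomposing the covariance matrix. A hand-rolled sketch for the 2D case on toy, perfectly correlated data, using the closed-form eigenvalues of a 2x2 symmetric matrix; real projects would use a library such as `sklearn.decomposition.PCA`:

```python
import math

# toy 2D data lying exactly on the line y = x
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
n = len(points)

# center the data
mx = sum(x for x, _ in points) / n
my = sum(y for _, y in points) / n
dev = [(x - mx, y - my) for x, y in points]

# sample covariance matrix [[a, b], [b, c]]
a = sum(dx * dx for dx, _ in dev) / (n - 1)
b = sum(dx * dy for dx, dy in dev) / (n - 1)
c = sum(dy * dy for _, dy in dev) / (n - 1)

# closed-form eigenvalues of a 2x2 symmetric matrix
disc = math.sqrt((a - c) ** 2 + 4 * b * b)
lam1 = (a + c + disc) / 2   # variance along the first principal component
lam2 = (a + c - disc) / 2   # variance along the second

explained = lam1 / (lam1 + lam2)
print(explained)  # 1.0: one component captures all the variance
```

Because the two features are perfectly correlated, reducing from 2D to 1D here loses no information, which is exactly the redundancy PCA is meant to remove.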
What is hyperparameter tuning?
Optimizing the configuration settings of a model to improve performance.
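The simplest form of hyperparameter tuning is a grid search: evaluate every candidate value on validation data and keep the best. A minimal sketch tuning a classification threshold, with hypothetical scores, labels, and grid values:

```python
# hypothetical model scores and true labels for a validation set
scores = [0.1, 0.4, 0.35, 0.8, 0.65, 0.2, 0.9, 0.5]
labels = [0, 0, 0, 1, 1, 0, 1, 1]

def accuracy_at(threshold):
    """Validation accuracy when scores >= threshold are predicted positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)

# grid search: try each candidate, keep the one with the best validation score
grid = [0.3, 0.5, 0.7]
best = max(grid, key=accuracy_at)
print(best, accuracy_at(best))  # 0.5 1.0
```

The same loop generalizes to real hyperparameters (tree depth, regularization strength, learning rate); libraries such as scikit-learn wrap it with cross-validation in `GridSearchCV`.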
What are ensemble methods?
Techniques that combine multiple models to improve predictive accuracy, such as bagging and boosting.
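The core idea behind ensembles can be shown with majority voting, the aggregation step used in bagging. A minimal sketch with hypothetical predictions from three base classifiers:

```python
from collections import Counter

# hypothetical predictions from three base classifiers on five samples
model_preds = [
    [1, 0, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1],
]

# majority vote: each sample gets the label most base models agree on
ensemble = [Counter(col).most_common(1)[0][0] for col in zip(*model_preds)]
print(ensemble)  # [1, 0, 1, 1, 0]
```

As long as the base models make somewhat independent errors, the vote can be right even when individual models are wrong, which is why bagging tends to reduce variance; boosting instead trains models sequentially, each focusing on the previous models' mistakes.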
What is live evaluation in machine learning?
Continuously tracking model performance on real-world data to detect drift and degradation.
What is Return on Investment (ROI) in data science?
The financial benefit gained from implementing a data science solution relative to its cost.
Why is monitoring machine learning models important?
To ensure that model predictions remain accurate and aligned with business objectives.