Chapter 3 Flashcards

1
Q

What is the role of a data engineer in AI/ML products?

A

To power the data flow needed for product success and maintain the ETL pipeline

ETL stands for Extract, Transform, Load, a process for data integration.

2
Q

What does ETL stand for?

A

Extract, Transform, Load
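
As a study aid, the three ETL steps can be sketched in a few lines of Python. This is a minimal illustration only: the CSV content, the `payments` table, and the function names are all made up for the example, and a real pipeline would read from and write to actual source and target systems.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a source system.
RAW_CSV = "user_id,amount\n1,10.50\n2,not_a_number\n3,4.25\n"

def extract(text):
    """Extract: read rows from the raw CSV source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast types and drop rows that fail validation."""
    clean = []
    for row in rows:
        try:
            clean.append((int(row["user_id"]), float(row["amount"])))
        except ValueError:
            continue  # skip malformed records
    return clean

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (user_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database standing in for a warehouse
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute("SELECT user_id, amount FROM payments ORDER BY user_id").fetchall()
```

The malformed second row is dropped during the transform step, so only two rows reach the target table.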

3
Q

How often are ETL pipelines generally updated?

A

In batches and not in real time

4
Q

What is a data pipeline that is updated continuously used for?

A

To provide real-time insights for dashboards used by internal business users

5
Q

What is MLOps?

A

A practice that combines machine learning and operations to maintain AI systems

6
Q

What does IaaS stand for?

A

Infrastructure as a Service

7
Q

Why is strategizing and planning for AI adoption crucial?

A

To avoid technical debt and ensure sustainable implementation

8
Q

What is model decay?

A

The decline in model performance over time due to changes in underlying data
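
One simple way to watch for model decay is to compare a model's recent error against the error it had at launch. The sketch below assumes mean absolute error as the metric and a 20% relative tolerance; both choices, the function names, and the sample numbers are illustrative assumptions rather than a standard.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def has_decayed(baseline_error, recent_error, tolerance=0.2):
    """Flag decay when recent error exceeds the baseline by more than `tolerance` (relative)."""
    return recent_error > baseline_error * (1 + tolerance)

# Error measured at launch versus error on more recent data (made-up numbers).
baseline = mean_absolute_error([10, 12, 14], [11, 12, 13])
recent = mean_absolute_error([20, 22, 24], [15, 16, 17])
decayed = has_decayed(baseline, recent)
```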

9
Q

What is one deployment strategy involving a new model alongside an existing one?

A

Shadow deployment
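
In a shadow deployment the new model receives the same inputs as the existing one, but only the existing model's answers are returned to users. A minimal sketch, with both models replaced by hypothetical stand-in functions:

```python
def current_model(x):
    return 2 * x          # stand-in for the model already in production

def candidate_model(x):
    return 2 * x + 1      # stand-in for the new model under evaluation

shadow_log = []

def serve(x):
    """Answer with the current model; record the candidate's output on the side."""
    live = current_model(x)
    shadow_log.append({"input": x, "live": live, "shadow": candidate_model(x)})
    return live  # users only ever see the current model's answer

responses = [serve(x) for x in (1, 2, 3)]
```

The log can then be reviewed offline to see how the candidate would have behaved on real traffic.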

10
Q

In A/B testing, what is the primary goal?

A

To compare the performance of two slightly different models
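
An A/B test needs two things: a stable way to split users between the two models, and a metric to compare. The sketch below assumes a hash-based split and a simple mean comparison; the outcome numbers are made up, and a real test would also check statistical significance.

```python
import hashlib
from statistics import mean

def assign_variant(user_id):
    """Deterministically split users into two groups using a hash of their ID."""
    digest = hashlib.sha256(str(user_id).encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

def compare(outcomes):
    """outcomes: list of (variant, metric) pairs -> mean metric per variant."""
    return {v: mean(m for variant, m in outcomes if variant == v) for v in ("A", "B")}

# Made-up outcome data, for illustration only.
results = compare([("A", 0.10), ("A", 0.14), ("B", 0.20), ("B", 0.18)])
```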

11
Q

What is a gradual deployment strategy that tests new models on subsets of users called?

A

Canary deployment
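
A canary rollout can be sketched as a routing rule that sends a small, stable share of users to the new model and widens that share in stages. The bucket scheme, model names, and percentages below are illustrative assumptions.

```python
import hashlib

def bucket(user_id):
    """Map a user ID to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(str(user_id).encode()).hexdigest()
    return int(digest, 16) % 100

def route(user_id, canary_percent):
    """Send roughly `canary_percent` percent of users to the new model."""
    return "new_model" if bucket(user_id) < canary_percent else "existing_model"

# Share of 1,000 hypothetical users on the new model at each rollout stage.
share_by_stage = {
    pct: sum(route(u, pct) == "new_model" for u in range(1000)) / 1000
    for pct in (0, 5, 25, 100)
}
```

Because each user always lands in the same bucket, anyone included at 5% stays included when the rollout widens to 25%.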

12
Q

What platform does Databricks offer for managing the ML life cycle?

A

MLflow

13
Q

What is the purpose of Google’s AI Platform?

A

To deploy production-level ML pipelines

14
Q

What is Uber’s ML management tool called?

A

Michelangelo

15
Q

What does Meta’s ML platform aim to achieve?

A

Reusability of ML algorithms and easy access to past projects

16
Q

What service does Amazon provide for building and deploying ML models?

A

Amazon SageMaker

17
Q

What tools did Airbnb use to orchestrate their ML platform?

A

Zipline, Redspot, DeepThought

18
Q

What is the promise of AI rooted in?

A

Quantifying prediction and optimization

19
Q

What percentage of Amazon’s sales come from their recommendation engine?

A

35%

20
Q

What is a smart strategy for implementing AI/ML projects?

A

Start small, apply to a clear business goal, and track effectiveness

21
Q

What is essential for justifying investment in AI projects?

A

Communicating the strength and capabilities of AI

22
Q

In the context of AI/ML projects, how do we learn?

A

Through iteration

23
Q

What is the importance of iteration in learning?

A

Iteration builds confidence through successful task completion.

24
Q

How does GE utilize AI for customer benefit?

A

GE offers cost savings to its customers.

25
Q

What role does Highmark play in preventing future bottlenecks?

A

Highmark predicts fraud.

26
Q

How did Amazon benefit from machine learning?

A

Amazon grew its revenues through ML.

27
Q

What is the significance of AI in the context of the industrial revolution?

A

AI promises benefits to both companies and consumers.

28
Q

What are the stages of the NPD cycle for AI/ML products?

A

Stages include discovery, define, design, implementation, marketing, training, and launch.

29
Q

What is the focus during the discovery stage of NPD?

A

Identifying the market need and why AI should address it.

30
Q

What is defined in the define stage of NPD?

A

Product requirements and screening ideas from the discovery stage.

31
Q

What does the design stage of NPD involve?

A

Creating mockups and defining UI/UX elements.

32
Q

What is the purpose of the implementation phase in NPD?

A

Materializing the planned product and achieving performance expectations.

33
Q

What is a key consideration in marketing AI products?

A

Balancing communication about AI capabilities without overselling.

34
Q

What is the focus of the training phase in NPD?

A

Training users and managing expectations regarding product performance.

35
Q

What happens during the launch phase of NPD?

A

Officially releasing the product and assessing its performance against original metrics.

36
Q

What is the Naive Bayes algorithm used for?

A

It’s used for classification problems by treating each feature as independent.
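
The independence assumption can be written as a formula. For a class $y$ and features $x_1, \dots, x_n$ (symbols introduced here for illustration), Naive Bayes scores each class by multiplying the class prior by the per-feature likelihoods, then picks the class with the highest score:

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```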

37
Q

What does the Support Vector Machine (SVM) algorithm do?

A

It finds a boundary that separates the data into two classes, then uses that boundary to classify future data points.

38
Q

What is linear regression used for?

A

Predicting future data points using one or more variables.
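
For a single input variable, linear regression reduces to fitting a slope and an intercept by least squares. A minimal sketch with made-up data points (the function name `fit_line` is an assumption for this example):

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
prediction = slope * 5 + intercept  # predict the value at x = 5
```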

39
Q

What does logistic regression predict?

A

A future binary categorical state.
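
Logistic regression produces a probability between 0 and 1 and then applies a threshold to get the binary state. The sketch below shows only the prediction step; the weight and bias are hypothetical values that a real model would learn from training data.

```python
import math

def predict_proba(x, weight, bias):
    """Squash the linear score through the sigmoid to get a probability."""
    return 1 / (1 + math.exp(-(weight * x + bias)))

def predict(x, weight, bias, threshold=0.5):
    """Turn the probability into a binary class (1 or 0)."""
    return 1 if predict_proba(x, weight, bias) >= threshold else 0

# Hypothetical learned parameters, for illustration only.
WEIGHT, BIAS = 1.5, -3.0
classes = [predict(x, WEIGHT, BIAS) for x in (0, 2, 4)]
```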

40
Q

What is the function of decision trees in ML?

A

They predict both categorical and numerical values using a flowchart-like structure.

41
Q

How does the random forest algorithm work?

A

It creates multiple decision trees from random samples of the data and combines their predictions (averaging for numerical values, majority vote for categories).

42
Q

What is K-Nearest Neighbors (KNN) used for?

A

Predicting future values based on the characteristics of neighboring data points.
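
The idea behind KNN fits in a few lines: find the k training points closest to the query and take the most common label among them. A minimal sketch for two-dimensional points, using made-up data and k = 3:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of ((x, y), label). Return the majority label among the k nearest points."""
    def squared_distance(item):
        (px, py), _ = item
        return (px - query[0]) ** 2 + (py - query[1]) ** 2

    nearest = sorted(train, key=squared_distance)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Two made-up groups of points.
train = [((1, 1), "low"), ((1, 2), "low"), ((2, 1), "low"),
         ((8, 8), "high"), ((8, 9), "high"), ((9, 8), "high")]
near_low = knn_predict(train, (2, 2))
near_high = knn_predict(train, (7, 7))
```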

43
Q

What does clustering aim to achieve in ML?

A

Finding patterns or clusters in data without supervision.
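
k-means is one common clustering method, used here only as an example since the card does not name a specific algorithm. The sketch groups one-dimensional values around two starting centers, with no labels involved; the data and starting centers are made up.

```python
def k_means_1d(values, centers, iterations=10):
    """Alternate between assigning values to the nearest center and re-centering."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

centers, clusters = k_means_1d([1.0, 1.5, 2.0, 9.0, 9.5, 10.0], centers=[0.0, 5.0])
```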

44
Q

What is the purpose of Principal Component Analysis (PCA)?

A

Reducing the number of dimensions in large datasets while preserving as much information as possible.

45
Q

What do deep learning models mimic?

A

The way the human brain processes information through layers.

46
Q

What is the goal of the implementation phase in the NPD process?

A

Achieving optimal performance based on the defined metrics.

47
Q

What are neural networks primarily used for?

A

Neural networks form the models behind AI/ML products.

48
Q

What is the most important factor for AI/ML products?

A

Data accessibility.

49
Q

What types of data might you initially start with for model training?

A

Third-party data or public data.

50
Q

Why is partnering with customers important in AI/ML product development?

A

It helps build a product that can be successful with real-world data.

51
Q

What is a potential risk of using pristine datasets for model training?

A

The model may perform poorly with real-world data it hasn’t seen before.

52
Q

Why is having a variety of data crucial for model training?

A

To ensure good model performance, usability, and ethical outcomes.

53
Q

What is iterative hyperparameter tuning?

A

It involves repeatedly retraining a model with adjusted hyperparameter values to improve its performance.
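
A small example of the tuning loop: try several values of one hyperparameter, score each on held-out validation data, and keep the best. The sketch assumes a one-dimensional KNN classifier with k as the hyperparameter; the data, including one deliberately noisy training label, is made up.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, validation, k):
    """Fraction of validation points classified correctly for a given k."""
    hits = sum(knn_predict(train, x, k) == label for x, label in validation)
    return hits / len(validation)

# Made-up data; (2.4, "b") is a deliberately noisy training point.
train = [(1, "a"), (2, "a"), (2.4, "b"), (3, "a"),
         (4, "b"), (7, "b"), (8, "b"), (9, "b")]
validation = [(2.5, "a"), (1.5, "a"), (6, "b"), (8.5, "b")]

# One pass of the tuning loop: score each candidate value of k.
scores = {k: accuracy(train, validation, k) for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
```

With k = 1 the noisy point misleads the model, and with k = 7 the larger class dominates, so the middle values score best on this data.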

54
Q

What informs ML engineers on how to tune hyperparameters?

A

Performance metrics and benchmarks from the define phase of the NPD process.

55
Q

What are hyperparameters?

A

Settings chosen before training that control how a model learns and, in turn, how well it performs.

56
Q

What is an example of a hyperparameter in a decision tree model?

A

Maximum depth allowed for the decision tree.

57
Q

What is the coefficient of determination also known as?

A

R-squared.
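
R-squared compares a model's squared errors with those of a baseline that always predicts the mean: 1 minus the ratio of the two. A minimal sketch with made-up values:

```python
def r_squared(actual, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

perfect = r_squared([1, 2, 3], [1, 2, 3])     # predictions match exactly
mean_only = r_squared([1, 2, 3], [2, 2, 2])   # always predicting the mean
```

A perfect fit scores 1, while a model no better than predicting the mean scores 0.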

58
Q

What was the R-squared value for the OLS regression model tested?

59
Q

What R-squared value did the random forest model achieve?

60
Q

What hyperparameter was used in the KNN model?

A

6 neighbors.

61
Q

What score did the KNN model achieve?

62
Q

What phenomenon occurs when a model performs exceptionally well on training data but poorly on new data?

A

Overfitting.
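
An extreme case of overfitting is a model that simply memorizes its training examples. The sketch below uses a hypothetical lookup-table "model" and made-up data where the true rule is "label 1 when x is at least 5"; the memorizer is perfect on training data and no better than guessing on new inputs.

```python
def fit_memorizer(pairs, default=0):
    """'Train' by storing every example; unseen inputs fall back to a default label."""
    table = dict(pairs)
    return lambda x: table.get(x, default)

def accuracy(model, pairs):
    """Fraction of (input, label) pairs the model gets right."""
    return sum(model(x) == label for x, label in pairs) / len(pairs)

# True rule in this made-up data: label is 1 when x >= 5, otherwise 0.
train = [(1, 0), (3, 0), (6, 1), (8, 1)]
test = [(2, 0), (4, 0), (7, 1), (9, 1)]

model = fit_memorizer(train)
train_score = accuracy(model, train)
test_score = accuracy(model, test)
```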

63
Q

What should you be suspicious of when a model gets very close to a perfect score?

A

That the model may not generalize well to new datasets.

64
Q

What should AI/ML enthusiasts look for in model performance over time?

A

Incremental improvement in performance.

65
Q

What is the next step after comprehensive data training and model adjustment?

A

Moving forward to deployment.

66
Q

Fill in the blank: The process of ideating your product, choosing the right model, and gauging performance is ______.

A

[collaborative].