Section 8 AI and ML Flashcards

1
Q

What is Artificial Intelligence (AI)?

A

AI is a broad field focused on building intelligent systems that can perform tasks typically requiring human intelligence, such as perception, reasoning, learning, problem-solving, and decision-making.

2
Q

What are some key use cases of AI?

A

Use cases include computer vision (self-driving cars, facial recognition), fraud detection, and intelligent document processing (IDP).

3
Q

What are the main layers of an AI system?

A

1. Data Layer (collecting vast amounts of data)
2. Machine Learning Framework Layer (defining ML frameworks and algorithms)
3. Model Layer (training the AI model)
4. Application Layer (serving the model to users)

4
Q

What is Machine Learning (ML)?

A

ML is a subset of AI where machines learn from data to improve performance on tasks without explicit programming.

5
Q

What are common ML tasks?

A

Regression (predicting continuous values) and classification (categorizing data points into groups).

6
Q

What is the difference between AI and ML?

A

AI is a broad field that includes ML, while ML is a method within AI that enables computers to learn from data without explicit rules.

7
Q

What is Deep Learning?

A

A subset of ML that uses artificial neural networks with multiple hidden layers to process complex data patterns.

8
Q

How does Deep Learning work?

A

Deep learning models use layers of neurons (input, hidden, and output) to process data, learn patterns, and adjust connections to improve predictions.

9
Q

What are Neural Networks?

A

Neural networks are AI models inspired by the human brain, consisting of interconnected neurons that process and learn from data.

10
Q

What is Generative AI (GenAI)?

A

A subset of deep learning where models generate new content (e.g., text, images) by learning from large datasets.

11
Q

What is a Foundation Model?

A

A large, pre-trained AI model that can be adapted for various tasks, such as GPT models for text generation.

12
Q

What are Transformer Models?

A

A deep learning architecture that uses self-attention to process entire sequences in parallel, enabling advanced NLP applications such as ChatGPT.

13
Q

What are Multi-Modal Models?

A

AI models that process multiple types of inputs (e.g., text, images, audio) and generate diverse outputs.

14
Q

How does Generative AI differ from Traditional AI?

A

Traditional AI classifies or predicts based on existing data, while GenAI creates new content, such as text or images.

15
Q

What is GPT?

A

Generative Pre-trained Transformer, a model that generates human-like text or computer code based on input prompts.

16
Q

What is BERT?

A

Bidirectional Encoder Representations from Transformers, a language model that reads text in both directions to capture context, useful for language understanding tasks such as text classification and question answering.

17
Q

What is RNN?

A

Recurrent Neural Network, a neural network for processing sequential data such as time series and speech (for example, in speech recognition).

18
Q

What is ResNet?

A

Residual Network, a deep convolutional neural network (CNN) used for image recognition tasks like object detection and facial recognition.

19
Q

What is SVM?

A

Support Vector Machine, an ML algorithm used for classification and regression tasks.

20
Q

What is WaveNet?

A

A model used to generate raw audio waveforms, commonly used in speech synthesis.

21
Q

What is GAN?

A

Generative Adversarial Network, a model for generating synthetic data like images, videos, or sounds that resemble training data.

22
Q

What is XGBoost?

A

Extreme Gradient Boosting, an implementation of gradient boosting used for regression and classification tasks.

23
Q

What is labeled data?

A

Labeled data includes both input features and output labels, allowing for supervised learning.

24
Q

What is an example of labeled data?

A

Images of animals labeled as ‘dog’ or ‘cat’.

25
Q

What is unlabeled data?

A

Unlabeled data has input features but no output labels, so it is typically used with unsupervised learning.

26
Q

What is an example of unlabeled data?

A

A collection of images without any labels, where the algorithm must find patterns.

27
Q

What is structured data?

A

Data that is organized in a structured format, often in rows and columns, such as tabular data.

28
Q

What is an example of structured data?

A

A customer database with columns like Customer_ID, Name, Age, and Purchase_Amount.

29
Q

What is an example of time series data?

A

Stock prices recorded over time.

30
Q

What is unstructured data?

A

Data that does not follow a specific structure, often text-heavy or multimedia content.

31
Q

What is an example of unstructured data?

A

Text reviews, social media posts, or images.

32
Q

Why is having good training data important?

A

Poor-quality data (garbage in) leads to poor model performance (garbage out).

33
Q

What is supervised learning?

A

A type of learning where an algorithm maps inputs to known outputs using labeled data.

34
Q

What is unsupervised learning?

A

A type of learning where an algorithm finds patterns in unlabeled data without explicit labels.

35
Q

What is supervised learning?

A

Supervised learning is a type of machine learning where a model learns to map inputs to outputs using labeled data.

36
Q

Why is labeled data important in supervised learning?

A

Labeled data allows the model to learn the correct output for given inputs, making supervised learning powerful but sometimes difficult due to the challenge of obtaining large labeled datasets.

37
Q

What is regression in supervised learning?

A

Regression is used to predict a continuous numeric value based on input data, such as predicting weight based on height.

38
Q

What is an example of linear regression?

A

Predicting a person’s weight based on their height using a straight line that best fits the trend in the data.
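
As a rough illustration of fitting that straight line, here is a minimal least-squares sketch. The height and weight values are made up for the example, and `fit_line` is a hypothetical helper name, not part of any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative (invented) data: heights in cm, weights in kg.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [50, 58, 66, 74, 82]

slope, intercept = fit_line(heights_cm, weights_kg)
predicted_weight = slope * 175 + intercept  # prediction for a 175 cm person
```

Real data will not fall exactly on a line, so the fitted line only approximates the trend.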

39
Q

What is classification in supervised learning?

A

Classification is used to predict a categorical label (e.g., classifying animals as dogs, cats, or giraffes based on height and weight).

40
Q

What is the key difference between regression and classification?

A

Regression predicts a continuous value (e.g., house price), while classification predicts a category (e.g., spam or not spam).

41
Q

What are examples of regression use cases?

A

Predicting house prices, stock market trends, and weather forecasting.

42
Q

What are examples of classification use cases?

A

Email spam detection, image recognition, fraud detection, and medical diagnostics.

43
Q

What are the three main data splits in supervised learning?

A

Training set (60-80%), Validation set (10-20%), Test set (10-20%).
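
One possible way to produce such a split is sketched below. The 70/15/15 fractions are just one choice within the ranges above, and `split_dataset` is a hypothetical helper name.

```python
import random


def split_dataset(rows, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle the rows, then cut them into train / validation / test parts."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed keeps the split reproducible
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    train = rows[:n_train]
    validation = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]  # whatever remains is the test set
    return train, validation, test


train_set, validation_set, test_set = split_dataset(range(100))
```

Shuffling before cutting avoids accidentally putting, say, only the newest records in the test set; time series data would usually be split by time instead.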

44
Q

Why is a validation set used?

A

To fine-tune the model and optimize hyperparameters before final testing.

45
Q

What is feature engineering?

A

Feature engineering is the process of transforming raw data into meaningful features to improve model performance.

46
Q

What is an example of feature engineering?

A

Converting a birthdate column into an age column to make it more useful for machine learning models.
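
A small sketch of that conversion, assuming the birthdate is already a `datetime.date` and that age is measured against a fixed reference date. The function name `age_from_birthdate` is made up for the example.

```python
from datetime import date


def age_from_birthdate(birthdate, reference_date):
    """Derive an age-in-years feature from a raw birthdate column value."""
    # Has this year's birthday already happened by the reference date?
    had_birthday = (reference_date.month, reference_date.day) >= (
        birthdate.month,
        birthdate.day,
    )
    return reference_date.year - birthdate.year - (0 if had_birthday else 1)


age_day_before = age_from_birthdate(date(1990, 6, 15), date(2024, 6, 14))
age_on_birthday = age_from_birthdate(date(1990, 6, 15), date(2024, 6, 15))
```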

47
Q

What are the three main types of feature engineering?

A

1. Feature Extraction (e.g., deriving age from birthdate)
2. Feature Selection (e.g., choosing important variables)
3. Feature Transformation (e.g., normalizing data for better performance)

48
Q

What is the difference between binary and multi-class classification?

A

Binary classification predicts two categories (e.g., spam or not spam), while multi-class classification predicts more than two categories (e.g., mammal, bird, reptile).

49
Q

What is multi-label classification?

A

Multi-label classification allows multiple categories per input (e.g., a movie can be both ‘action’ and ‘comedy’).

50
Q

What is unsupervised learning?

A

Unsupervised learning is a type of machine learning where the algorithm finds patterns and structures in unlabeled data without explicit supervision.

51
Q

What are common techniques in unsupervised learning?

A

Clustering, association rule learning, and anomaly detection.

52
Q

What is clustering in unsupervised learning, and which algorithm is commonly used for it?

A

Clustering is the process of grouping data points based on similarities, such as customer segmentation.

One commonly used algorithm is K-means clustering.

53
Q

What is an example of clustering?

A

Grouping customers based on purchasing behaviors to create targeted marketing campaigns.
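
To make the idea concrete, here is a minimal K-means sketch on a single feature. The monthly spend figures are invented, and the evenly spaced starting centroids are a simplifying assumption (real implementations usually initialize randomly or with k-means++).

```python
def kmeans_1d(values, k, max_iters=100):
    """Tiny K-means on one numeric feature; returns (centroids, clusters)."""
    lo, hi = min(values), max(values)
    if k == 1:
        centroids = [sum(values) / len(values)]
    else:
        # Assumption: start with centroids evenly spaced between min and max.
        centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = []
    for _ in range(max_iters):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = new_centroids
    return centroids, clusters


monthly_spend = [10, 12, 11, 95, 100, 105]  # invented customer spend values
centroids, clusters = kmeans_1d(monthly_spend, k=2)
```

Here the two groups could be read as "low spenders" and "high spenders", each a candidate for its own campaign.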

54
Q

What is association rule learning, and which algorithm is commonly used for it?

A

Association rule learning identifies relationships between items, such as frequently bought-together products in a supermarket.

One commonly used algorithm is Apriori.

55
Q

What is an example of association rule learning?

A

The Apriori algorithm finds that people who buy bread often also buy butter, helping supermarkets optimize product placement.
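
The sketch below is not a full Apriori implementation. It only computes support and confidence, the two measures Apriori relies on, for the bread and butter rule. The transactions are invented for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the share that also contain the consequent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Invented shopping baskets.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "butter", "eggs"},
]

bread_butter_support = support(baskets, {"bread", "butter"})
bread_to_butter_confidence = confidence(baskets, {"bread"}, {"butter"})
```

A rule such as "bread implies butter" is typically kept only if both numbers clear thresholds chosen by the analyst.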

56
Q

What is anomaly detection, and which algorithms are commonly used for it?

A

Anomaly detection is the process of identifying data points that differ significantly from normal patterns, often used in fraud detection.

Popular methods include isolation forests, one-class SVM, and autoencoders.

57
Q

What is an example of anomaly detection?

A

Flagging unusual transactions (outliers) that differ from normal patterns, helping identify potential fraud.
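
As a simple stand-in for the heavier methods named earlier (isolation forests, one-class SVM, autoencoders), the sketch below flags outliers with a modified z-score based on the median absolute deviation. The transaction amounts and the 3.5 threshold are assumptions made for the example.

```python
from statistics import median


def find_anomalies(amounts, threshold=3.5):
    """Return amounts whose modified z-score (median/MAD based) exceeds the threshold."""
    center = median(amounts)
    mad = median([abs(a - center) for a in amounts])  # median absolute deviation
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    # 0.6745 rescales MAD so the score is comparable to a standard z-score.
    return [a for a in amounts if abs(0.6745 * (a - center) / mad) > threshold]


transaction_amounts = [20, 22, 19, 21, 23, 20, 500]  # invented values
suspicious = find_anomalies(transaction_amounts)
```

The median is used instead of the mean because a single extreme value would otherwise distort the baseline it is compared against.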

58
Q

What is semi-supervised learning?

A

Semi-supervised learning combines a small amount of labeled data with a large amount of unlabeled data to improve learning efficiency.

59
Q

What is pseudo-labeling in semi-supervised learning?

A

Pseudo-labeling is the process where a model trained on labeled data assigns labels to unlabeled data, which is then used for further training.

60
Q

What is the benefit of semi-supervised learning?

A

It reduces the cost of labeling large datasets while still achieving high model accuracy by leveraging both labeled and unlabeled data.

61
Q

What is self-supervised learning?

A

Self-supervised learning is a type of machine learning where a model generates its own pseudo-labels from unlabeled data to solve supervised learning tasks.

62
Q

How does self-supervised learning differ from unsupervised learning?

A

Unlike unsupervised learning, self-supervised learning generates labels from the data itself, enabling it to solve tasks typically handled by supervised learning.

63
Q

What is an example of self-supervised learning?

A

Language models use self-supervised learning by predicting words from their context: GPT predicts the next word, while BERT predicts masked (missing) words. In doing so they learn grammar, structure, and meaning without human-labeled data.

64
Q

What are pretext tasks in self-supervised learning?

A

Pretext tasks are simple, self-generated tasks that a model solves to learn patterns in data, such as predicting missing words or the next word in a sentence.

65
Q

What are examples of pretext tasks?

A

Examples include predicting the next word in a sentence, filling in missing words, reconstructing occluded images, or predicting future frames in a video.

66
Q

What is reinforcement learning?

A

A type of machine learning where an agent learns to make decisions by performing actions in an environment and maximizing cumulative reward.

67
Q

What is an agent in reinforcement learning?

A

The learner or decision-maker in the environment.

68
Q

What is the environment in reinforcement learning?

A

The external system that the agent interacts with.

69
Q

What are actions in reinforcement learning?

A

Choices made by the agent, such as moving up, down, left, or right in a maze.

70
Q

What is a reward in reinforcement learning?

A

Feedback provided by the environment based on the agent’s actions.

71
Q

What is an example of a reward system in reinforcement learning?

A

-1 for a step, -10 for hitting a wall, and +100 for reaching the exit in a maze.

72
Q

What is the state in reinforcement learning?

A

The current situation of the environment, which the agent observes before taking an action.

73
Q

What is the policy in reinforcement learning?

A

The strategy used by the agent to determine actions based on the current state.

74
Q

How does reinforcement learning improve over time?

A

Through many simulations, the agent learns from past mistakes and updates its policy to maximize cumulative rewards.

75
Q

What are the key steps in reinforcement learning?

A

Observe the state, choose an action, transition to a new state, receive a reward, and update the policy.
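
The loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm (chosen here as an illustration, not named in the cards). The environment is a hypothetical five-cell corridor that reuses the maze rewards from the earlier card: -1 per step, -10 for hitting the wall, +100 for reaching the exit. All hyperparameter values are assumptions.

```python
import random

N_STATES = 5                  # corridor cells 0..4; cell 4 is the exit
ACTIONS = ["left", "right"]


def step(state, action):
    """Environment: return (next_state, reward, done) for one move."""
    if action == "left":
        if state == 0:
            return state, -10, False   # bumped into the wall
        return state - 1, -1, False
    next_state = state + 1
    if next_state == N_STATES - 1:
        return next_state, 100, True   # reached the exit
    return next_state, -1, False


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state = 0
        for _ in range(200):                    # cap on steps per episode
            # Observe the state and choose an action (epsilon-greedy).
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            # Transition to a new state and receive a reward.
            next_state, reward, done = step(state, ACTIONS[a])
            # Update the value estimates that define the policy.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best-known action for every non-terminal state."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]


learned_policy = greedy_policy(train())
```

After enough episodes the learned policy heads straight for the exit from every cell.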

76
Q

What are some applications of reinforcement learning?

A

Gaming (chess, Go), robotics (navigation, object manipulation), finance (portfolio management), healthcare (treatment optimization), and autonomous vehicles (path planning).

77
Q

Why does reinforcement learning require multiple iterations?

A

The agent improves by trial and error, refining its policy over many attempts.

78
Q

What is Reinforcement Learning from Human Feedback (RLHF)?

A

A technique that incorporates human feedback into reinforcement learning to align AI models with human goals, wants, and needs.

79
Q

What are the four key steps in RLHF?

A

1. Data collection
2. Supervised fine-tuning
3. Building a separate reward model
4. Optimizing the language model with the reward-based model

See the diagram on the AWS page:

https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback/

80
Q

What happens during the data collection phase in RLHF?

A

A set of human-generated prompts and ideal responses are gathered to train the model.

81
Q

How is the reward model trained in RLHF?

A

Humans rank different AI-generated responses, helping the model learn human preferences.

82
Q

What is the role of the reward model in RLHF?

A

It serves as an automated evaluator of AI-generated responses, replacing the need for continuous human judgment.

83
Q

What does the AWS diagram of the three-step Reinforcement Learning from Human Feedback (RLHF) process show?

https://aws.amazon.com/what-is/reinforcement-learning-from-human-feedback/

A

Reinforcement Learning from Human Feedback (RLHF):

Step 1: Supervised Fine-Tuning (SFT)
Goal: Train a base LLM (Large Language Model) using human demonstration data.

Process:

Collect human-labeled prompts and responses.

Fine-tune the base model using supervised learning to align with human-like responses.

Step 2: Training a Reward Model (RM)
Goal: Develop a model that evaluates AI-generated responses based on human preference.

Process:

A fine-tuned model (SFT) generates multiple responses to the same prompt.

Humans rank these responses to indicate preference.

A separate reward model (RM) is trained based on human ranking.

Step 3: Optimizing Policy using Proximal Policy Optimization (PPO)
Goal: Improve the model’s response generation by optimizing the policy using the reward model.

Process:

The fine-tuned model (SFT) generates responses to new prompts.

The reward model (RM) evaluates and assigns rewards.

The model updates its policy through reinforcement learning using PPO (Proximal Policy Optimization).

This iterative process ensures that the AI generates more human-aligned, contextually appropriate responses over time.

84
Q

What is overfitting in machine learning?

A

Overfitting occurs when a model performs well on training data but poorly on evaluation data because it memorizes noise instead of learning the underlying pattern.

85
Q

What is underfitting in machine learning?

A

Underfitting happens when a model performs poorly on both training and evaluation data because it is too simple to capture the underlying patterns in the data.

86
Q

What is bias in machine learning?

A

Bias is the systematic error between the model's predicted values and the actual values, often caused by overly simple or incorrect assumptions in the model.

87
Q

What is variance in machine learning?

A

Variance represents how much a model’s predictions change when trained on different datasets. High variance indicates overfitting.

88
Q

What causes high bias in a model?

A

High bias is caused by overly simplistic models that fail to capture the complexity of the data, leading to underfitting.

89
Q

What causes high variance in a model?

A

High variance is caused by overly complex models that fit training data too closely and fail to generalize well to new data.

90
Q

How can bias be reduced in a model?

A

Bias can be reduced by using a more complex model, adding more relevant features, or improving the data quality.

91
Q

How can variance be reduced in a model?

A

Variance can be reduced by simplifying the model, using fewer features, or increasing training data size.

92
Q

What is the ideal balance in model fitting?

A

A balanced model has low bias and low variance, meaning it generalizes well to unseen data.

93
Q

What does a high-bias, low-variance model indicate?

A

It indicates underfitting, where the model is too simple and fails to capture data patterns.

94
Q

What does a low-bias, high-variance model indicate?

A

It indicates overfitting, where the model memorizes training data but performs poorly on unseen data.

95
Q

What does a high-bias, high-variance model indicate?

A

It indicates a poor model that neither captures patterns well nor generalizes properly.

96
Q

What does a low-bias, low-variance model indicate?

A

It indicates a well-balanced model that effectively captures data patterns and generalizes well.

97
Q

How can we detect overfitting?

A

Overfitting can be detected if the model has high accuracy on training data but significantly lower accuracy on test data.

98
Q

How can we detect underfitting?

A

Underfitting can be detected if the model has low accuracy on both training and test data.
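
A rough rule-of-thumb version of these two checks is sketched below. The thresholds (`min_accuracy=0.70`, `max_gap=0.10`) are arbitrary assumptions for illustration. What counts as low accuracy or a large gap depends on the problem.

```python
def diagnose_fit(train_accuracy, test_accuracy, min_accuracy=0.70, max_gap=0.10):
    """Label a model as underfitting, overfitting, or a good fit from two accuracies."""
    if train_accuracy < min_accuracy and test_accuracy < min_accuracy:
        return "underfitting"   # poor on both training and test data
    if train_accuracy - test_accuracy > max_gap:
        return "overfitting"    # strong on training data, much weaker on test data
    return "good fit"


overfit_case = diagnose_fit(0.99, 0.70)
underfit_case = diagnose_fit(0.55, 0.53)
balanced_case = diagnose_fit(0.92, 0.90)
```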

99
Q

What is the bias-variance tradeoff?

A

The bias-variance tradeoff refers to the challenge of balancing bias and variance to achieve optimal model performance.

100
Q

What type of metrics are used for classification problems?

A

The confusion matrix and the metrics derived from it (accuracy, precision, recall, F1 score), plus AUC-ROC.

101
Q

What is a confusion matrix?

A

A table used to evaluate classification models by comparing actual vs. predicted values.

102
Q

What is precision in classification?

A

Precision = True Positives / (True Positives + False Positives). Measures how many predicted positives were actually correct.

103
Q

When is precision more important?

A

When false positives are costly.

104
Q

What is recall in classification?

A

Recall = True Positives / (True Positives + False Negatives). Measures how many actual positives were correctly identified.

105
Q

When is recall more important?

A

When false negatives are costly.

106
Q

What is the F1 score?

A

A metric that balances precision and recall: F1 = 2 * (Precision * Recall) / (Precision + Recall).

107
Q

What is accuracy?

A

Accuracy = (True Positives + True Negatives) / Total Predictions. It can be misleading on imbalanced datasets.
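
The four formulas from the preceding cards can be computed directly from confusion matrix counts. The counts below are invented, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and accuracy from confusion matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * (precision * recall) / (precision + recall)
    else:
        f1 = 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Invented, imbalanced example: only 12 of 100 cases are actually positive.
metrics = classification_metrics(tp=8, fp=2, fn=4, tn=86)
```

In this example accuracy is 0.94 even though a third of the real positives are missed, which is why accuracy alone is a weak guide on imbalanced data.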

108
Q

What is AUC-ROC?

A

The area under the ROC curve, which plots the true positive rate against the false positive rate across classification thresholds.

109
Q

What does an AUC-ROC score of 1 mean?

A

The model is perfect in distinguishing classes.

110
Q

What does an AUC-ROC score of 0.5 mean?

A

The model performs no better than random chance.
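
One way to see where scores of 1 and 0.5 come from: AUC-ROC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it that way on invented labels and scores. It assumes both classes are present.

```python
def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


mixed = auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
perfect = auc_roc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
uninformative = auc_roc([0, 0, 1, 1], [0.5, 0.5, 0.5, 0.5])
```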

111
Q

What are different regression evaluation metrics?

A

MAE (Mean Absolute Error), MAPE (Mean Absolute Percentage Error), RMSE (Root Mean Squared Error), and R-squared.

112
Q

What is MAE (Mean Absolute Error)?

A

The average absolute difference between actual and predicted values.

113
Q

What is MAPE (Mean Absolute Percentage Error)?

A

Measures prediction error as a percentage of actual values.

114
Q

What is RMSE (Root Mean Squared Error)?

A

The square root of the average squared error. Squaring the errors before averaging penalizes larger errors more heavily.

115
Q

What is R-squared?

A

A measure of how well input features explain the variance in the target variable. Closer to 1 means a better model.
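
The four regression metrics can be sketched in a few lines. The actual and predicted values are invented, and the MAPE calculation assumes no actual value is zero.

```python
import math


def regression_metrics(actual, predicted):
    """Compute MAE, MAPE (in percent), RMSE, and R-squared."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    mean_actual = sum(actual) / n
    ss_res = sum(e ** 2 for e in errors)                   # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total variation
    r_squared = 1 - ss_res / ss_tot
    return {"mae": mae, "mape": mape, "rmse": rmse, "r_squared": r_squared}


# Invented values where every prediction is off by exactly 10.
scores = regression_metrics([100, 200, 300, 400], [110, 190, 310, 390])
```

Because every error has the same size here, MAE and RMSE agree. RMSE grows faster than MAE once a few errors are much larger than the rest.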

116
Q

What is a balanced dataset?

A

A dataset where each category has an equal or similar number of instances.

117
Q

What is inferencing?

A

Inferencing is when a model makes predictions based on new data.

118
Q

What are the types of inferencing?

A

Real-time inferencing, batch inferencing, and edge inferencing.

119
Q

What is real-time inferencing?

A

Inferencing where predictions are made instantly, prioritizing speed over perfect accuracy.

120
Q

Where is real-time inferencing commonly used?

A

Chatbots, recommendation systems, fraud detection, self-driving cars.

121
Q

What is batch inferencing?

A

Inferencing where large amounts of data are processed at once, prioritizing accuracy over speed.

122
Q

Where is batch inferencing commonly used?

A

Data analysis, report generation, medical imaging, financial forecasting.

123
Q

What is edge inferencing?

A

Inferencing done on edge devices with limited computing power, often in areas with poor internet connectivity.

124
Q

What are the benefits of edge inferencing?

A

Low latency, offline capability, reduced cloud dependency.

125
Q

What is a limitation of edge inferencing?

A

Limited computing power, making it difficult to run large models like LLMs.

126
Q

What is an alternative to running LLMs on edge devices?

A

Hosting the model on a remote server and accessing it via API calls.

127
Q

What is the trade-off of using a remote LLM instead of local inference?

A

Higher latency and requires an internet connection, but allows for more powerful models.

128
Q

What are Small Language Models (SLMs)?

A

Compact AI models optimized for edge devices with low computational power.

129
Q

What are key trade-offs in inferencing?

A

Speed vs. accuracy, compute power vs. latency, online vs. offline capability.

130
Q

What are different stages in a machine learning project?

A

1️⃣ Identify Business Problem →
2️⃣ Frame as ML Problem →
3️⃣ Collect & Prepare Data →
4️⃣ Feature Engineering →
5️⃣ Model Training →
6️⃣ Hyperparameter Tuning →
7️⃣ Model Evaluation →

If Business Goals Not Met 🔄 Go Back to Data Collection / Feature Engineering

If Business Goals Met → Proceed

8️⃣ Model Testing →
9️⃣ Deployment →

🔄 Model Monitoring & Debugging →

🔄 Retrain with New Data → Repeat Process

This cycle ensures continuous improvement and adaptation of the ML model

131
Q

Why is feature engineering important in ML?

A

It transforms data into useful features for better model performance.

132
Q

What happens if the business goals are not met in ML?

A

Enhance data, perform data augmentation, or improve features.

133
Q

What is the purpose of monitoring and debugging an ML model?

A

To ensure the model remains accurate and adapts to changes.

134
Q

What is data augmentation?

A

The process of enhancing a dataset by adding more data points or variations.

135
Q

Why is exploratory data analysis (EDA) important?

A

To understand data structure, visualize relationships, and compute statistics.

136
Q

What is a hyperparameter?

A

A setting that defines the model structure or the learning process and is chosen before training begins, rather than learned from the data.

137
Q

What are examples of hyperparameters?

A

Learning rate, batch size, number of epochs, regularization.

138
Q

What does the learning rate control?

A

The size of each update step applied to the model weights, i.e., how quickly the model learns from the data.

139
Q

What is the effect of a high learning rate?

A

Faster convergence but risk of overshooting optimal solution.

140
Q

What is the effect of a low learning rate?

A

More precise convergence but slower training.

141
Q

What does batch size control?

A

Number of training examples used per iteration to update model weights.

142
Q

What is the effect of a small batch size?

A

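Noisier weight updates and more update steps per epoch, so training takes longer, though the noise can sometimes help generalization.

143
Q

What is the effect of a large batch size?

A

Faster training and smoother gradient estimates, but it needs more memory and may generalize less well.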

144
Q

What does number of epochs control?

A

How many times the model iterates over the entire training dataset.
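
To show where learning rate, batch size, and number of epochs each enter a training loop, here is a minimal mini-batch gradient descent sketch that fits a single weight `w` in `y = w * x`. The data and the hyperparameter values are assumptions chosen for the example.

```python
def train_linear(xs, ys, learning_rate=0.01, batch_size=2, epochs=200):
    """Fit y = w * x by mini-batch gradient descent on mean squared error."""
    w = 0.0
    updates = 0
    for _ in range(epochs):                           # one epoch = one full pass
        for start in range(0, len(xs), batch_size):   # one batch = one weight update
            batch_x = xs[start:start + batch_size]
            batch_y = ys[start:start + batch_size]
            # Gradient of the mean squared error with respect to w.
            grad = sum(2 * (w * x - y) * x for x, y in zip(batch_x, batch_y)) / len(batch_x)
            w -= learning_rate * grad                 # learning rate scales the step
            updates += 1
    return w, updates


xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]   # generated from y = 2 * x

w_good, n_updates = train_linear(xs, ys)
w_diverged, _ = train_linear(xs, ys, learning_rate=0.3, epochs=20)
```

With the defaults the weight settles near 2. With the larger learning rate each step overshoots by more than it corrects, so the weight moves away from 2 instead.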

145
Q

What happens if epochs are too few?

A

Underfitting – model does not learn enough from the data.

146
Q

What happens if epochs are too many?

A

Overfitting – model learns training data too well but fails on new data.

147
Q

What is regularization used for?

A

Penalizing model complexity to adjust the balance between a simple and a complex model.

148
Q

How does regularization affect overfitting?

A

Increasing regularization reduces overfitting.

149
Q

What is overfitting?

A

Model performs well on training data but poorly on new data.

150
Q

What causes overfitting?

A

Small training data, too many epochs, overly complex model.

151
Q

How can overfitting be prevented?

A

Increase training data size, early stopping, data augmentation, regularization.

152
Q

What is the best way to prevent overfitting?

A

Increase the training data size.

153
Q

What is hyperparameter tuning?

A

Finding the best hyperparameter values to optimize model performance.

154
Q

What are methods for hyperparameter tuning?

A

Grid search, random search, and Amazon SageMaker Automatic Model Tuning (AMT).
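
Grid search simply tries every combination of candidate values and keeps the best one. In the sketch below, `validation_loss` is a made-up stand-in for "train a model with these settings and measure its validation error". The candidate values are also assumptions.

```python
from itertools import product


def validation_loss(learning_rate, regularization):
    # Hypothetical scoring function: a real one would train and evaluate a model.
    return (learning_rate - 0.1) ** 2 + (regularization - 0.01) ** 2


def grid_search(grid):
    """Evaluate every combination in the grid and return the lowest-loss one."""
    names = list(grid)
    best_params, best_loss = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        loss = validation_loss(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


search_space = {
    "learning_rate": [0.01, 0.1, 1.0],
    "regularization": [0.001, 0.01, 0.1],
}
best_params, best_loss = grid_search(search_space)
```

Random search samples combinations instead of enumerating all of them, which scales better as the number of hyperparameters grows.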

155
Q

Why is hyperparameter tuning important?

A

Improves accuracy, reduces overfitting, enhances generalization.

156
Q

When is machine learning NOT appropriate?

A

When a problem has a deterministic solution that can be computed exactly using traditional programming.

157
Q

What is an example of a deterministic problem?

A

Calculating the probability of drawing a specific card from a known deck.

158
Q

Why not use ML for deterministic problems?

A

ML provides approximations, while deterministic code gives exact answers.
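
The card-drawing example shows why: a few lines of ordinary code give the exact answer, with no training data or model involved. The function name `draw_probability` is made up for this sketch, which assumes a standard 52-card deck.

```python
from fractions import Fraction


def draw_probability(matching_cards, deck_size=52):
    """Exact probability of drawing one of `matching_cards` cards from the deck."""
    return Fraction(matching_cards, deck_size)


ace_of_spades = draw_probability(1)   # one specific card
any_ace = draw_probability(4)         # any of the four aces
```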

159
Q

What should you consider before applying ML?

A

Whether the problem requires exact solutions or can tolerate approximation.