AI / ML Flashcards

1
Q

____ is a broad field for the development of intelligent systems capable of performing tasks that typically require human intelligence.

A

AI

2
Q

Identify the AI component described below:

Collect vast amounts of data.

a) Data Layer
b) ML Framework or Algorithm Layer
c) Model Layer
d) Application Layer

A

data layer

3
Q

Identify the AI component described below:

Data scientists and engineers work together to understand use cases, requirements, and the frameworks that can solve them.

a) Data Layer
b) ML Framework or Algorithm Layer
c) Model Layer
d) Application Layer

A

ML framework or algorithm layer

4
Q

Identify the AI component described below:

Implement a model and train it: this is where we define the structure, the parameters and functions, and the optimizer function.

a) Data Layer
b) ML Framework or Algorithm Layer
c) Model Layer
d) Application Layer

A

model layer

5
Q

Identify the AI component described below:

How to serve the model and its capabilities to your users.

a) Data Layer
b) ML Framework or Algorithm Layer
c) Model Layer
d) Application Layer

A

application layer

6
Q

____ is a type of AI for building methods that allow machines to learn.

A

Machine Learning

7
Q

With Machine Learning, ____ is leveraged to improve computer performance on a set of tasks.

A

data

8
Q

With Machine Learning, you make ____ based on data used to train the model.

A

predictions

9
Q

True/False: In machine learning, no explicit programming rules are written; instead, data is given to the algorithm.

A

True

10
Q

What function does the Transformer Model provide?

A

Ability to process a sentence as a whole instead of word by word. Faster and more efficient text processing (less training time).

11
Q

What is a transformer-based LLM?

A

Powerful models that can understand and generate human-like text. They are trained on vast amounts of text data from the internet, books and other sources, where they learn patterns and relationships between words and phrases.

12
Q

A ____ model can take in a mix of audio, image and text and output a mix of video, image and text.

A

multi-modal

13
Q

Recap differences between AI vs ML vs DL vs Gen AI:

Sometimes we know “if this happens, then do that” (AI)
Sometimes we’ve seen a lot of similar things before, and we classify them (ML)
Sometimes we haven’t seen something before, but we have “learned” a lot of similar concepts, so we can make a decision (DL)
Sometimes, we get creative and based on what we’ve learned, we can generate content (GenAI)

A
14
Q

GPT (Generative Pre-trained Transformer) - generates human-like text or computer code based on input prompts. (for language)

A
15
Q

BERT (Bidirectional Encoder Representations from Transformers) - a transformer-based language model that reads the text in both directions, so each word is interpreted using the context on both sides. This makes it well suited to language-understanding tasks. (for language)

A
16
Q

RNN (Recurrent Neural Network) - meant for sequential data such as time-series or text, useful in speech recognition, time-series prediction.

A
17
Q

ResNet (Residual Network) - Deep Convolutional Neural Network (CNN) used for image recognition tasks, object detection and facial recognition. (for images)

A
18
Q

SVM (Support Vector Machine) - ML algorithm for classification and regression.

A
19
Q

WaveNet - model to generate raw audio waveform, used in Speech Synthesis. (for audio)

A
20
Q

GAN (Generative Adversarial Network) - models used to generate synthetic data such as images, videos or sounds that resemble the training data. (for data augmentation)

A
21
Q

XGBoost (Extreme Gradient Boosting) - an implementation of gradient boosting

A
22
Q

____ data includes both input features and corresponding output labels.

A

Labeled: for example, images of animals where each image is labeled with the animal type

23
Q

With labeled data, you are able to use ____ learning where the model is trained to map inputs to known outputs.

A

supervised

24
Q

____ data includes only input features without any output labels.

A

Unlabeled: for example an image of cats and dogs with no label identifying what is cat and what is dog.

25
Q

With unlabeled data, you are able to use ____ learning where the model tries to find patterns or structures in the data.

A

unsupervised

26
Q

____ data is organized in a structured format, often in rows and columns.

A

structured; can be tabular data in rows and columns, or time series data (data points collected or recorded at successive points in time)
There are other types of structured data.

27
Q

____ data doesn’t follow a specific structure and is often text-heavy or multimedia content.

A

Unstructured

28
Q

Examples of unstructured ____ data can include articles, social media posts, or customer reviews.

A

text

29
Q

An example of unstructured ____ data is data in the form of images, which can vary widely in format and content.

A

image

30
Q

Supervised Learning needs ____ data; it's very powerful, but labeling millions of data points is difficult.

A

labeled

31
Q

With a Regression ML algorithm using supervised learning, we can make numerical ____ based on input data.

A

predictions

32
Q

With a Regression ML algorithm, the output variable is ____, meaning it can take any value within a range.

A

continuous

33
Q

A use case for a ____ ML algorithm is when the goal is to predict a quantity or real value.

A

Regression

34
Q

With a Classification ML algorithm using supervised learning, we try to predict the ____ label of input data.

A

categorical

35
Q

With a Classification ML algorithm, the output variable is ____, meaning it falls into a specific category or class.

A

discrete

36
Q

A use case for a ____ ML algorithm is a scenario where decisions or predictions need to be made between distinct categories (fraud detection, image classification, etc.)

A

Classification

37
Q

Examples of the supervised learning classification algorithm can include binary, multiclass and multi-label classification. Identify each below:

a) classify emails as “spam” or “not spam”
b) classify animals in a zoo as “mammal”, “bird”, “reptile”
c) assign multiple labels to a movie, like “action” and “comedy”

A

a) binary classification
b) multiclass classification
c) multi-label classification

38
Q

With supervised learning for an ML algorithm, typically you use 60%-80% of your dataset for ____.

A

training

39
Q

With supervised learning for an ML algorithm, typically you use 10%-20% of your dataset for ____.

A

validation; used to tune model parameters and validate performance

40
Q

With supervised learning for an ML algorithm, typically you use 10%-20% of your dataset for ____.

A

testing; used to evaluate the final model performance
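
The three splits from the cards above can be sketched in a few lines. This is only an illustration, assuming an 80/10/10 split; the helper name `split_dataset` and its parameters are made up for this example and are not from any library.

```python
import random

def split_dataset(rows, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle rows, then cut them into training, validation and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]                 # used to fit the model
    val = shuffled[n_train:n_train + n_val]    # used to tune and validate
    test = shuffled[n_train + n_val:]          # used once, for the final evaluation
    return train, val, test

train, val, test = split_dataset(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```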

41
Q

____ engineering is the process of using domain knowledge to select and transform raw data into meaningful features.

A

Feature; an example would be converting a birth date to an age
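
As a rough sketch of the birth date example, the derived feature could be computed like this. The function name `age_on` is hypothetical and the dates are made-up sample values.

```python
from datetime import date

def age_on(birth_date, today):
    """Derive an 'age in whole years' feature from a raw birth date."""
    years = today.year - birth_date.year
    # Subtract one year if the birthday has not happened yet this year.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

print(age_on(date(1990, 6, 15), date(2024, 6, 14)))  # 33
print(age_on(date(1990, 6, 15), date(2024, 6, 15)))  # 34
```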

42
Q

Identify the techniques of feature engineering used with supervised learning:

Extracting useful information from raw data, such as deriving age from date of birth.

a) feature extraction
b) feature selection
c) feature transformation

A

feature extraction

43
Q

Identify the techniques of feature engineering used with supervised learning:

Selecting a subset of relevant features, like choosing important predictors in a regression model.

a) feature extraction
b) feature selection
c) feature transformation

A

feature selection

44
Q

Identify the techniques of feature engineering used with supervised learning:

Transforming data for better model performance, such as normalizing numerical data.

a) feature extraction
b) feature selection
c) feature transformation

A

feature transformation

45
Q

Feature Engineering on Structured Data: Predicting house prices based on features like size, location, and number of rooms.

Deriving new features like “price per square foot” is:

a) feature extraction
b) feature selection
c) feature transformation

A

feature extraction

46
Q

Feature Engineering on Structured Data: Predicting house prices based on features like size, location, and number of rooms.

Identifying and retaining important features such as location or number of bedrooms:

a) feature extraction
b) feature selection
c) feature transformation

A

feature selection

47
Q

Feature Engineering on Structured Data: Predicting house prices based on features like size, location, and number of rooms.

Normalizing features to ensure they are on a similar scale, which helps algorithms like gradient descent converge faster:

a) feature extraction
b) feature selection
c) feature transformation

A

feature transformation
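
A small sketch of the house-price cards above, assuming made-up sample rows and min-max scaling as the normalization method (other scalers exist; the cards do not name one). The helper `min_max` is hypothetical.

```python
# Hypothetical sample rows; the values are invented for illustration.
houses = [
    {"price": 300000, "sqft": 1500, "rooms": 3},
    {"price": 450000, "sqft": 2000, "rooms": 4},
    {"price": 200000, "sqft": 1000, "rooms": 2},
]

def min_max(values):
    """Rescale values to the 0..1 range (one common way to normalize)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Feature extraction: derive a new feature from existing columns.
price_per_sqft = [h["price"] / h["sqft"] for h in houses]

# Feature transformation: put "sqft" on a 0..1 scale.
sqft_scaled = min_max([h["sqft"] for h in houses])

print(price_per_sqft)  # [200.0, 225.0, 200.0]
print(sqft_scaled)     # [0.5, 1.0, 0.0]
```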

48
Q

Identify the feature engineering task on Unstructured Data:

Converting text into numerical features using techniques like TF-IDF or word embeddings.

a) text data
b) image data

A

text data
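
To make the TF-IDF idea concrete, here is a minimal sketch using the plain textbook weighting (term frequency times log of inverse document frequency). Real libraries usually apply smoothing and normalization on top of this, so their numbers will differ; the function name `tf_idf` is just for this example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

vectors = tf_idf(["the cat sat", "the dog barked"])
print(vectors[0]["the"])  # 0.0, because "the" appears in every document
print(vectors[0]["cat"] > 0)  # True, "cat" is specific to the first document
```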

49
Q

Identify the feature engineering task on Unstructured Data:

Extracting features such as edges or textures using techniques like convolutional neural networks (CNN)

a) text data
b) image data

A

image data

50
Q

The goal of ____ learning is to discover inherent patterns, structures, or relationships within the unlabeled input data.

A

unsupervised

51
Q

The Unsupervised ML algorithm must uncover and create the groups itself, but humans still put labels on the ____ groups.

A

output

52
Q

Which unsupervised learning technique is used to group similar data points together into clusters based on their features?

a) Clustering
b) Association Rule Learning
c) Anomaly Detection

A

clustering
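
One possible illustration of clustering is k-means, shown here on one-dimensional points. The card does not name an algorithm, so k-means is an assumption, as are the sample points and starting centroids.

```python
def k_means_1d(points, centroids, iterations=10):
    """Tiny k-means on 1-D data: assign to nearest centroid, then recompute."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

centroids, clusters = k_means_1d([1, 2, 3, 10, 11, 12], [0, 5])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1, 2, 3], [10, 11, 12]]
```

Note that the algorithm only produces the two groups; a human would still have to decide what each group means.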

53
Q

Which unsupervised learning technique is used when you want to find relationships in the data (using the apriori algorithm)?

a) Clustering
b) Association Rule Learning
c) Anomaly Detection

A

association rule learning

54
Q

Which unsupervised learning technique could be used to flag potentially fraudulent transactions for further investigation?

a) Clustering
b) Association Rule Learning
c) Anomaly Detection

A

anomaly detection
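
A very simple way to picture anomaly detection is a z-score rule: flag any value that sits far from the mean. This is only one of many possible techniques, and the threshold of 2.0 and the transaction amounts below are assumptions for the example.

```python
import statistics

def flag_anomalies(amounts, threshold=2.0):
    """Return the amounts more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(amounts)
    stdev = statistics.pstdev(amounts)
    if stdev == 0:
        return []
    return [a for a in amounts if abs(a - mean) / stdev > threshold]

transactions = [20, 22, 19, 21, 20, 23, 18, 500]
print(flag_anomalies(transactions))  # [500]
```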

55
Q

____ learning is when you use a small amount of labeled data and a large amount of unlabeled data to train systems.

A

semi-supervised

56
Q

With semi-supervised learning, where there is a small amount of labeled and large amount of unlabeled data, the model itself labels the remaining unlabeled data. This is called ____.

A

pseudo-labeling

57
Q

After the completion of pseudo-labeling in semi-supervised learning, the entire model is ____ on the resulting data mix without being explicitly programmed.

A

re-trained
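
The semi-supervised loop from the last three cards (train on the labeled data, pseudo-label the rest, re-train on the mix) might look like this in miniature. The nearest-centroid "model", the function names and the data are all invented for illustration.

```python
def fit_centroids(labeled):
    """'Train' a toy model: the mean feature value for each label."""
    sums = {}
    for x, label in labeled:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + x, count + 1)
    return {label: total / count for label, (total, count) in sums.items()}

def predict(model, x):
    """Predict the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))

labeled = [(1.0, "small"), (2.0, "small"), (9.0, "large"), (10.0, "large")]
unlabeled = [1.5, 3.0, 8.0, 11.0]

model = fit_centroids(labeled)                         # 1) train on the small labeled set
pseudo = [(x, predict(model, x)) for x in unlabeled]   # 2) the model pseudo-labels the rest
model = fit_centroids(labeled + pseudo)                # 3) re-train on the resulting mix

print(pseudo)  # [(1.5, 'small'), (3.0, 'small'), (8.0, 'large'), (11.0, 'large')]
print(model)   # {'small': 1.875, 'large': 9.5}
```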

58
Q

____ learning is when you have a model generate pseudo-labels for its own data without having humans label any data first.

A

self-supervised; widely used in NLP and image recognition tasks

59
Q

____ learning is a type of ML where an agent learns to make decisions by performing actions in an environment to maximize cumulative rewards.

A

Reinforcement; (remember robot max example)

60
Q

Reinforcement Learning from ____ uses human feedback to help ML models to self-learn more efficiently.

A

Human Feedback

61
Q

True/False: RLHF does not significantly enhance model performance.

A

False: it does enhance

62
Q

What are the four steps to RLHF?

A

1) data collection - a set of human-generated prompts and responses is created
2) supervised fine-tuning of a language model - fine-tune an existing model with internal knowledge; then the model creates responses for the human-generated prompts
3) build a separate reward model - humans indicate which responses they prefer from same prompt; reward model can now estimate how a human would respond
4) optimize the language model with the reward-based model - use the reward model as a reward function for RL

63
Q

When a model has poor performance on evaluation data, but performs well on training data, it can be described as ____.

A

overfitting

64
Q

When a model has poor performance on training data, it is said to be ____.

A

underfitting

65
Q

A ____ model is neither overfitting nor underfitting.

A

balanced

66
Q

____ is the difference or error between predicted and actual value.

A

Bias

67
Q

A ____ bias is when the model doesn’t closely match the training data; considered underfitting.

A

high

68
Q

What are two ways you can reduce bias?

A

use a more complex model or increase the number of features

69
Q

_____ is how much the performance of a model changes if trained on a different dataset which has a similar distribution.

A

Variance

70
Q

If your model has a ____ variance, it is very sensitive to changes in the training data and is considered overfitting.

A

high

71
Q

What are two ways you can reduce variance?

A

feature selection (fewer, more important features)
split into training and test data sets multiple times

72
Q

Which model evaluation metrics are used to evaluate the accuracy of a binary classification?

A

precision, recall, F1, accuracy

73
Q

The confusion matrix is made up of ____ value and ____ value.

A

actual and predicted

74
Q

The purpose of the confusion matrix is to evaluate the ____ of a model that does ____.

A

performance
classifications

75
Q

True/False: A confusion matrix can be multi-dimensional.

A

True

76
Q

Identify the best metric for the confusion matrix scenario below:

When false positives are costly

a) precision
b) recall
c) F1 score
d) accuracy

A

precision

77
Q

Identify the best metric for the confusion matrix scenario below:

When false negatives are costly

a) precision
b) recall
c) F1 score
d) accuracy

A

recall

78
Q

Identify the best metric for the confusion matrix scenario below:

When you want a balance between precision and recall, especially in imbalanced datasets

a) precision
b) recall
c) F1 score
d) accuracy

A

F1 score

79
Q

Identify the best metric for the confusion matrix scenario below:

When dealing with balanced datasets

a) precision
b) recall
c) F1 score
d) accuracy

A

accuracy

80
Q

Identify the confusion matrix value:

When actual value is positive and predicted value is positive.

a) true positive
b) false negative
c) false positive
d) true negative

A

true positive

81
Q

Identify the confusion matrix value:

When actual value is positive and predicted value is negative.

a) true positive
b) false negative
c) false positive
d) true negative

A

false negative

82
Q

Identify the confusion matrix value:

When actual value is negative and predicted value is negative.

a) true positive
b) false negative
c) false positive
d) true negative

A

true negative

83
Q

Identify the confusion matrix value:

When actual value is negative and predicted value is positive.

a) true positive
b) false negative
c) false positive
d) true negative

A

false positive
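
The four confusion matrix values and the four classification metrics fit together as shown in this sketch. The labels below are made-up sample data (1 = positive, 0 = negative), and the function names are only for this example.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives/negatives for binary labels (1 = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fn, fp, tn

def classification_metrics(tp, fn, fp, tn):
    precision = tp / (tp + fp)   # hurt by false positives
    recall = tp / (tp + fn)      # hurt by false negatives
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return precision, recall, f1, accuracy

actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

counts = confusion_counts(actual, predicted)
print(counts)                            # (3, 1, 1, 5)
print(classification_metrics(*counts))   # (0.75, 0.75, 0.75, 0.8)
```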

84
Q

The ____ metric has a value from 0 to 1 (perfect model) and uses sensitivity (true positive rate) and 1-specificity (false positive rate).

A

AUC-ROC - Area Under the Curve of the Receiver Operating Characteristic
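
One way to build intuition for this metric: the area under the ROC curve equals the chance that a randomly chosen positive example gets a higher score than a randomly chosen negative one. The sketch below computes it that way by comparing every positive/negative pair; the scores are made-up sample values.

```python
def auc_roc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

print(auc_roc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(auc_roc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9]))   # 1.0 (perfect ranking)
```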

85
Q

When trying to choose your model for binary classification, it's best to use the ____ metric to select the best model.

A

AUC-ROC

86
Q

The following regression metrics are used to give the ____ of a regression to see if it is acceptable:

Mean Absolute Error (MAE)
Mean Absolute Percentage Error (MAPE)
Root mean squared error (RMSE)
R2 (R Squared)

A

quality
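
For reference, the four regression metrics listed above can be computed by hand as in this sketch. The actual and predicted values are invented sample data, and the function name is only for this example.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, MAPE (in percent), RMSE and R squared."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mape = 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                   # unexplained variation
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total variation
    r2 = 1 - ss_res / ss_tot
    return mae, mape, rmse, r2

mae, mape, rmse, r2 = regression_metrics([100, 200, 300, 400], [110, 190, 310, 390])
print(mae, round(mape, 2), rmse, round(r2, 3))  # 10.0 5.21 10.0 0.992
```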

87
Q

MAE, MAPE, RMSE, R2 are all used for evaluating models that predict a ____ value (i.e., regressions)

A

continuous

88
Q

Precision, recall, F1 and accuracy metrics are all used for evaluating a ____.

A

classification

89
Q

____ is when a model is making a prediction on new data.

A

Inferencing

90
Q

____ inferencing is for use cases where computers have to make decisions quickly as data arrives. Speed is preferred over perfect accuracy. Example: chatbots

A

Real Time

91
Q

____ inferencing is for use cases where a large amount of data is analyzed all at once. Often used for data analysis, when the speed of the results is not a concern but accuracy is.

A

Batch

92
Q

Which language model characteristics are described below for inferencing on edge devices:
Very low latency, low compute footprint, offline capability, local inference

a) small language model (SLM) on edge device
b) large language model (LLM) on remote server

A

small language model (SLM)

93
Q

Which language model characteristics are described below for inferencing on edge devices:
More powerful model, higher latency, must be online to be accessed

a) small language model (SLM) on edge device
b) large language model (LLM) on remote server

A

large language model (LLM)

94
Q

____ are settings that define the model structure and learning algorithm and process.

A

Hyperparameters

95
Q

Hyperparameters are set before ____ begins.

A

training

96
Q

Hyperparameter ____ is about finding the best hyperparameter values to optimize the model performance. It improves model accuracy, reduces overfitting, and enhances generalization.

A

tuning

97
Q

Identify the hyperparameter based on the description below:

How large or small the steps are when updating the model’s weights during training.

a) learning rate
b) batch size
c) number of epochs

A

learning rate

98
Q

Identify the hyperparameter based on the description below:

Number of training examples used to update the model weights in one iteration.

a) learning rate
b) batch size
c) number of epochs

A

batch size

99
Q

Identify the hyperparameter based on the description below:

Refers to how many times the model will iterate over the entire training dataset.

a) learning rate
b) batch size
c) number of epochs

A

number of epochs
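
Batch size and number of epochs together determine how many times the weights get updated. A rough sketch, assuming one update per batch and a smaller final batch when the dataset does not divide evenly; the function name and the numbers are made up.

```python
import math

def count_weight_updates(n_examples, batch_size, epochs):
    """One weight update per batch, repeated for every epoch."""
    batches_per_epoch = math.ceil(n_examples / batch_size)
    return batches_per_epoch * epochs

print(count_weight_updates(1000, 32, 5))   # 160
print(count_weight_updates(1000, 500, 5))  # 10
```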

100
Q

A ____ learning rate (hyperparameter) can lead to faster convergence but risks overshooting the optimal solution.

A

high

101
Q

A ____ learning rate (hyperparameter) may result in more precise but slower convergence.

A

low
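
The effect of the learning rate can be seen on a toy problem. This sketch runs plain gradient descent on f(w) = w^2, whose minimum is at w = 0; the function, starting point and step counts are assumptions chosen only to make the behavior visible.

```python
def gradient_descent(learning_rate, steps=20, start=10.0):
    """Minimize f(w) = w**2 with plain gradient descent; the optimum is w = 0."""
    w = start
    for _ in range(steps):
        gradient = 2 * w              # derivative of w**2
        w -= learning_rate * gradient
    return w

print(abs(gradient_descent(0.01)))  # low rate: still far from 0 after 20 steps
print(abs(gradient_descent(0.1)))   # moderate rate: close to 0
print(abs(gradient_descent(1.1)))   # too high: overshoots and moves away from 0
```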

102
Q

____ batches (hyperparameter) can lead to more stable learning but require more time to compute.

A

Smaller

103
Q

____ batches (hyperparameter) are faster but may lead to less stable updates.

A

Larger

104
Q

Too ____ epochs (hyperparameters) can lead to underfitting.

A

few

105
Q

Too ____ epochs (hyperparameters) can lead to overfitting.

A

many

106
Q

____ is when the model gives good predictions for training data, but not for the new data.

A

Overfitting

107
Q

Identify the possible causes of overfitting:

a) training data size is too small and does not represent all possible input values
b) the model trains too long on a single same set of data
c) model complexity is high and learns from the “noise” within the training data
d) all of the above

A

all of the above

108
Q

What is the best way to prevent overfitting?

A

increase the training data size. Other methods include:
early stopping the training of the model; data augmentation (to increase diversity in the dataset); adjust hyperparameters (but you can’t “add” them)
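
Early stopping can be sketched as watching the validation loss and stopping once it has not improved for a few epochs. The `patience` parameter, the function name and the loss values below are assumptions for illustration.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch (1-based) with the best validation loss before stopping."""
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss stopped improving: likely overfitting
    return best_epoch

# Validation loss improves for three epochs, then starts rising.
print(early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.7, 0.8]))  # 3
```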

109
Q

For ____ problems (the solution can be computed), it is better to write computer code that is adapted to the problem instead of using ML.

A

deterministic

110
Q

Why should ML not be used to solve deterministic problems?

A

ML gives “approximate” answers, which are worse than the exact answers that a deterministic solution can compute.