AI / ML Information Fundamentals Flashcards

1
Q

What are the three main types of AI?

A

Machine learning
Deep learning
Generative AI

2
Q

What are multi-modal models?

A

Models that do not rely on a single type of input or output. They can work with several modalities, such as text, images, and audio.

3
Q

What does GPT mean?

A

Generative Pre-trained Transformer. It generates human-like text or computer code based on input prompts.

4
Q

What does BERT mean?

A

Bidirectional Encoder Representations from Transformers. Similar to GPT, but it reads the text in both directions, using the context to the left and to the right of each word.

5
Q

What does RNN mean?

A

Recurrent Neural Network. It is meant for sequential data such as time series or text. Useful in speech recognition.

6
Q

What does GAN mean?

A

Generative Adversarial Network. Used to generate synthetic data.

7
Q

What is labeled data?

A

Data that includes both input features and output labels. For example, where images of animals are also labeled with the name of the animal.

8
Q

What is unlabeled data?

A

Data where input features are present, but no output labels are included. For example, images of animals uploaded without the name of each animal as an output label. Used for unsupervised learning.

9
Q

What is structured data?

A

Data organized in a tabular format of rows and columns, such as tables in a Microsoft SQL Server database or Excel spreadsheets.

10
Q

What is unstructured data?

A

Data that does not follow a specific structure, such as articles, social media posts, and customer reviews.

11
Q

Is image data structured or unstructured data?

A

Unstructured.

12
Q

What is supervised learning?

A

A type of machine learning in which a model is trained on labeled data so that it can predict the output for new, unseen input data.

13
Q

Does supervised learning require labeled or unlabeled data?

A

Labeled

14
Q

What is regression?

A

Used to predict a numeric value based on input data. For example, the cost of a house based on its size, or the weight of a person based on their height.
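
As a minimal sketch of regression, the following fits a straight line with ordinary least squares and uses it to predict a numeric value for an unseen input. The house sizes and prices are made-up illustrative numbers, and `fit_line` is a hypothetical helper name, not part of any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: house size (input feature) and price (numeric label).
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(sizes, prices)
# Predict the price of an unseen house with size 90.
predicted_price = slope * 90 + intercept
```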

15
Q

What is classification?

A

It is used to predict the categorical label of input data.

16
Q

Does classification have to be binary?

A

No. It can also be multi-class, where each input gets exactly one of several classes (e.g., “Mammal” or “Bird”), or multi-label, where each input can get several labels at once (e.g., both “Action” and “Comedy” for a movie).

17
Q

What is a validation set?

A

A subset of your data, held out from training, that is used to tune hyperparameters and validate model performance during development.

18
Q

What is a test set?

A

A separate subset of the data, never used for training or tuning, that is used to evaluate the final model performance.
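
One way to picture the validation and test sets is as slices of a shuffled dataset. The sketch below is an illustration only: `split_dataset`, the 20% fractions, and the fixed seed are assumptions chosen for the example, not a prescribed recipe.

```python
import random


def split_dataset(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle rows and split them into train, validation, and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                 # final evaluation only
    val = shuffled[n_test:n_test + n_val]    # tuning during development
    train = shuffled[n_test + n_val:]        # used to fit the model
    return train, val, test


train, val, test = split_dataset(range(10))
```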

19
Q

What is feature engineering?

A

Transforming your raw data into features that are more useful to the model. For example, converting a birth date to an age. You can create new features from calculated values.
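
The birth date example can be sketched as a small derived-feature function. The record layout and the name `age_from_birth_date` are hypothetical, and the reference date is passed in explicitly so the result is repeatable.

```python
from datetime import date


def age_from_birth_date(birth_date, today):
    """Derive an age in whole years from a birth date."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


# Hypothetical raw record; "age" is the engineered feature added to it.
record = {"name": "example", "birth_date": date(1990, 6, 15)}
record["age"] = age_from_birth_date(record["birth_date"], today=date(2024, 6, 14))
```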

20
Q

What is unsupervised learning?

A

Discovery of inherent patterns, structures, or relationships within the input data.

21
Q

What is clustering?

A

Used to group similar data points together based on their features.
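
As an illustration of clustering, here is a very small k-means sketch for one-dimensional points. K-means is only one clustering algorithm, and the starting centroids, the sample points, and the name `kmeans_1d` are assumptions made for this example.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Very small k-means for 1-D points with given starting centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
```

No labels are supplied; the two groups emerge only from how close the points are to each other.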

22
Q

What is association rule learning?

A

An unsupervised technique, performed on unlabeled data, that finds relationships between items. For example, finding products that are frequently bought together.
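
A rough sketch of the "frequently bought together" idea is to count how often each pair of items shares a basket. This computes only pair support, the first step of association rule mining; the baskets and the name `pair_support` are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}


baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "eggs"],
    ["bread", "butter", "eggs"],
]
support = pair_support(baskets)
most_frequent_pair = max(support, key=support.get)
```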

23
Q

What is anomaly detection?

A

It is used, for example, to detect fraud by finding outliers in the data.
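
One simple way to find outliers, sketched here under the assumption that a z-score rule is good enough, is to flag values that lie far from the mean in units of standard deviation. The transaction amounts, the threshold of 2.0, and the name `find_outliers` are illustrative choices.

```python
from statistics import mean, pstdev


def find_outliers(values, threshold=2.0):
    """Return values whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts with one suspicious entry.
amounts = [20, 22, 19, 21, 20, 23, 18, 500]
outliers = find_outliers(amounts)
```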

24
Q

What is semi-supervised learning?

A

It uses both labeled and unlabeled data for training.

25
Q

What is self-supervised learning?

A

It’s when a model generates its own labels from the data, without humans labeling it first.

26
Q

What are pretext tasks in self-supervised learning?

A

Simple tasks that the model solves in order to learn patterns in the dataset, such as predicting a hidden word from the words around it.
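
A masked-word task is one common example of this idea. The sketch below shows how unlabeled text can be turned into (input, label) pairs with no human labeling; the `[MASK]` token and the name `masked_word_examples` are assumptions for illustration.

```python
def masked_word_examples(sentence, mask_token="[MASK]"):
    """Turn one unlabeled sentence into (input, label) pairs by hiding each word."""
    words = sentence.split()
    examples = []
    for i, word in enumerate(words):
        masked = words[:i] + [mask_token] + words[i + 1:]
        # The hidden word becomes the label the model must predict.
        examples.append((" ".join(masked), word))
    return examples


examples = masked_word_examples("the cat sat")
```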

27
Q

What is reinforcement learning?

A

A type of machine learning where an agent learns to make decisions by performing actions in an environment to maximize cumulative rewards.
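
The act, observe reward, learn loop can be sketched with a tiny agent that chooses among three actions. This is a simplified illustration, not a full reinforcement learning algorithm: the rewards are fixed numbers, there is no state, and the agent explores only because its initial estimates are deliberately optimistic (an assumption of this sketch).

```python
def run_agent(rewards, steps=20, initial_estimate=10.0):
    """Greedy agent with optimistic initial estimates on a fixed-reward environment."""
    # Assumption: initial_estimate is higher than any real reward, so every
    # action gets tried at least once before the agent settles.
    estimates = [initial_estimate] * len(rewards)
    counts = [0] * len(rewards)
    total_reward = 0.0
    for _ in range(steps):
        # The agent acts: pick the action it currently believes is best.
        action = max(range(len(rewards)), key=lambda a: estimates[a])
        reward = rewards[action]  # the environment responds with a reward
        total_reward += reward
        # The agent learns: move the estimate toward the observed reward.
        counts[action] += 1
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, total_reward


estimates, total_reward = run_agent(rewards=[1.0, 5.0, 2.0])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
```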

28
Q

What is an agent in reinforcement learning?

A

It is the learner or decision maker. For example, the car in AWS DeepRacer.

29
Q

What is RLHF?

A

Reinforcement Learning from Human Feedback. Human feedback is used as the reward signal so that the model learns to produce outputs people prefer.

30
Q

What is model overfitting?

A

When the model performs well on the training data, but badly on the evaluation data.

31
Q

What is underfitting?

A

When the model performs poorly on the training data. It could mean the model is too simple or has poor data features.

32
Q

If a model is underfitting, what can you do?

A

Perform feature engineering.

33
Q

What is bias?

A

The difference or error between the predicted and actual values.

34
Q

What is high bias?

A

When the model doesn’t closely match the training data. This means model underfitting.

35
Q

What can be done to reduce bias?

A

Increase the number of features or use a more complex model.

36
Q

What is model variance?

A

How much the model's performance changes if it is trained on a different dataset.

37
Q

What does high variance mean?

A

The model is overfitting and sensitive to changes in the training data.

38
Q

How do you reduce model variance?

A

Use feature selection to reduce the number of features, and split your data into training and test sets multiple times (cross-validation).

39
Q

What are confusion matrices?

A

A table used to evaluate a classification model by comparing predicted labels with actual labels. Its cells count the true positives, false positives, true negatives, and false negatives.
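
The four cells can be counted directly from actual and predicted labels. A minimal sketch for the binary case, assuming labels are encoded as 1 (positive) and 0 (negative); `confusion_counts` and the sample labels are illustrative.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, and FN for binary labels encoded as 1 and 0."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            counts["TP"] += 1  # predicted positive, actually positive
        elif p == 1 and a == 0:
            counts["FP"] += 1  # predicted positive, actually negative
        elif p == 0 and a == 0:
            counts["TN"] += 1  # predicted negative, actually negative
        else:
            counts["FN"] += 1  # predicted negative, actually positive
    return counts


actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
matrix = confusion_counts(actual, predicted)
```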

40
Q

What are Precision, Recall, F1, and Accuracy used for?

A

Metrics used to evaluate binary classification models.
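
All four metrics follow from the confusion-matrix counts. The sketch below uses the standard formulas; the counts are made-up numbers for an imbalanced dataset, chosen to show that accuracy can look high while recall is modest.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)  # of predicted positives, how many were right
    recall = tp / (tp + fn)     # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = (tp + tn) / (tp + fp + tn + fn)          # share of all correct
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts: 13 actual positives out of 100 examples.
metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
```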

41
Q

What is AUC-ROC used for?

A

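Comparing binary classification models to find the best one. It is the area under the ROC curve, which plots the true positive rate against the false positive rate across classification thresholds.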
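
AUC can also be read as the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one. The sketch below computes it that way by comparing every positive/negative pair; the labels and the two sets of model scores are invented for illustration.

```python
def auc_roc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


labels = [1, 1, 0, 1, 0, 0]
model_a_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
model_b_scores = [0.6, 0.4, 0.7, 0.5, 0.8, 0.2]

auc_a = auc_roc(labels, model_a_scores)
auc_b = auc_roc(labels, model_b_scores)
```

Here the model with the higher AUC separates the two classes better.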

42
Q

What is the mean absolute error?

A

The mean of the absolute differences between predicted and actual values.

43
Q

What is the mean absolute percentage error?

A

The mean of the absolute differences between predicted and actual values, expressed as a percentage of the actual values.
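
Both error metrics, mean absolute error (MAE) and mean absolute percentage error (MAPE), can be computed in a few lines. A minimal sketch with made-up values; the MAPE function assumes no actual value is zero.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average absolute error as a percentage of each actual value."""
    # Assumption: no actual value is zero, otherwise this divides by zero.
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


actual = [100, 200, 400]
predicted = [110, 180, 400]

mae = mean_absolute_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
```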

44
Q

What is inferencing?

A

When a model is making a prediction on new data.

45
Q

What is the downside of real-time inferencing?

A

Speed is prioritized over accuracy, because the response must come back immediately. For example, a chatbot.

46
Q

What is the benefit of batch processing in the context of inferencing?

A

Accuracy is valued above speed.

47
Q

What kind of language models can run on the edge?

A

A small language model (SLM)

48
Q

What is the learning rate hyperparameter?

A

How large or small the steps are when updating the model’s weights. Too high a rate can overshoot the optimal solution; too low a rate makes training take longer, though each update is more precise.
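
The effect of the step size can be seen on a toy problem. The sketch below runs gradient descent on f(w) = w², whose minimum is at w = 0; the function, the starting point, and the three learning rates are assumptions chosen only to make the behaviour visible.

```python
def gradient_descent(learning_rate, start=10.0, steps=20):
    """Minimize f(w) = w**2, whose gradient is 2 * w and whose minimum is at 0."""
    w = start
    for _ in range(steps):
        gradient = 2 * w
        w -= learning_rate * gradient  # step size is set by the learning rate
    return w


w_low = gradient_descent(0.001)  # tiny steps: barely moves in 20 updates
w_good = gradient_descent(0.1)   # moderate steps: gets close to the minimum
w_high = gradient_descent(1.1)   # steps too large: overshoots and moves away
```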

49
Q

What is the Batch Size hyperparameter?

A

The number of training examples used to update the model weights in one iteration. Smaller batches are more stable, but take more time. Larger batches are faster, but less stable.
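
To make "one iteration" concrete, this sketch slices a dataset into batches; each batch would drive one weight update. The ten-example dataset and the name `iterate_batches` are illustrative, and no actual training is performed.

```python
def iterate_batches(examples, batch_size):
    """Yield consecutive slices of the dataset, one per weight update."""
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


examples = list(range(10))

batches_of_4 = list(iterate_batches(examples, batch_size=4))
batches_of_1 = list(iterate_batches(examples, batch_size=1))

# Number of weight updates needed for one pass over the data.
updates_with_4 = len(batches_of_4)
updates_with_1 = len(batches_of_1)
```

A smaller batch size means more updates per pass over the data, which is one reason smaller batches take more time.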

50
Q

What is the Number of epochs hyperparameter?

A

The number of times the model will iterate over the entire training dataset. Too few epochs can cause underfitting; too many can cause overfitting.

51
Q

What can be done to prevent overfitting?

A

Increase the training data size, stop the training early (early stopping), or perform feature engineering.
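
Stopping the training early is often implemented by watching the validation loss and stopping once it has not improved for a few epochs. A minimal sketch, assuming a patience of 2 epochs and a made-up list of validation losses; `early_stopping_epoch` is a hypothetical helper.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch  # validation loss stopped improving
    return len(val_losses)  # never triggered: train for all epochs


# Hypothetical validation loss per epoch: improves, then starts rising.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61]
stop_epoch = early_stopping_epoch(val_losses)
```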
