Basics Flashcards

1
Q

Machine learning

A

This is often the foundation for an AI system, and is the way we “teach” a computer model to make predictions and draw conclusions from data.

2
Q

Computer vision

A

Capabilities within AI to interpret the world visually through cameras, video, and images.

3
Q

Natural language processing

A

Capabilities within AI for a computer to interpret written or spoken language, and respond in kind.

4
Q

Document intelligence

A

Capabilities within AI that deal with managing, processing, and using high volumes of data found in forms and documents.

5
Q

Knowledge mining

A

Capabilities within AI to extract information from large volumes of often unstructured data to create a searchable knowledge store.

6
Q

Generative AI

A

Capabilities within AI that create original content in a variety of formats including natural language, image, code, and more.

7
Q

Automated machine learning

A

This feature enables non-experts to quickly create an effective machine learning model from data.

8
Q

Azure Machine Learning designer

A

A graphical interface enabling no-code development of machine learning solutions.

9
Q

Semantic segmentation

A

Semantic segmentation is an advanced machine learning technique in which individual pixels in the image are classified according to the object to which they belong. For example, a traffic monitoring solution might overlay traffic images with “mask” layers to highlight different vehicles using specific colors.

10
Q

Data metric visualization

A

Analyze and optimize your experiments with visualization.

11
Q

Notebooks

A

Write and run your own code in managed Jupyter Notebook servers that are directly integrated in the studio.

12
Q

How is document intelligence used?

A

To read and process documents: filling out forms, checking forms, and finding the right documents in massive collections of scanned documents.

13
Q

Image analysis

A

You can create solutions that combine machine learning models with advanced image analysis techniques to extract information from images, including “tags” that could help catalog the image or even descriptive captions that summarize the scene shown in the image.

14
Q

How is knowledge mining used?

A

Creating indexes over large numbers of internal and external documents to improve search, and using NLP and tagging to make documents easier to find.

15
Q

Supervised machine learning

A

Supervised machine learning is a general term for machine learning algorithms in which the training data includes both feature values and known label values. Supervised machine learning is used to train models by determining a relationship between the features and labels in past observations, so that unknown labels can be predicted for features in future cases.

16
Q

Regression

A

Regression is a form of supervised machine learning in which the label predicted by the model is a numeric value. For example, the number of ice creams sold on a given day.

17
Q

Classification

A

Classification is a form of supervised machine learning in which the label represents a categorization, or class. There are two common classification scenarios.

18
Q

Binary classification

A

In binary classification, the label determines whether the observed item is (or isn’t) an instance of a specific class. Or put another way, binary classification models predict one of two mutually exclusive outcomes.

19
Q

Multiclass classification

A

Multiclass classification extends binary classification to predict a label that represents one of multiple possible classes.

20
Q

Unsupervised machine learning

A

Unsupervised machine learning involves training models using data that consists only of feature values without any known labels. Unsupervised machine learning algorithms determine relationships between the features of the observations in the training data.

21
Q

Clustering

A

The most common form of unsupervised machine learning is clustering. A clustering algorithm identifies similarities between observations based on their features, and groups them into discrete clusters.

22
Q

What are the 4 key elements in training a supervised model?

A

Split the data into a training subset and a held-back testing (validation) subset
Use an algorithm to fit the training data to a model
Use the held-back subset to test the model by predicting labels for its features
Compare the predicted labels with the known actual labels
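
The four steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not a recommended workflow: the data is hypothetical (labels follow y = 2x + 1 exactly so the result is easy to check), the 70/30 split is an assumption, and `fit_line` is a made-up helper that fits a straight line by least squares.

```python
import random

# Hypothetical observations: (feature x, label y) where y = 2x + 1.
data = [(x, 2 * x + 1) for x in range(20)]

# 1. Split the data into a training subset and a held-back validation subset.
random.seed(0)
shuffled = data[:]
random.shuffle(shuffled)
split = len(shuffled) * 7 // 10  # assumed 70/30 split
train, validation = shuffled[:split], shuffled[split:]


def fit_line(rows):
    """Least-squares fit of y = w*x + b (hypothetical helper)."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var = sum((x - mean_x) ** 2 for x, _ in rows)
    w = cov / var
    return w, mean_y - w * mean_x


# 2. Use an algorithm to fit the training data to a model.
w, b = fit_line(train)

# 3. Use the held-back subset to test the model by predicting labels.
predictions = [w * x + b for x, _ in validation]

# 4. Compare the predicted labels with the known actual labels.
mean_abs_error = sum(
    abs(p - y) for p, (_, y) in zip(predictions, validation)
) / len(validation)
print(round(w, 6), round(b, 6), round(mean_abs_error, 6))
```

Because the hypothetical labels are perfectly linear, the fitted line recovers w ≈ 2 and b ≈ 1 and the validation error is close to zero. Real data would leave a non-zero error to evaluate.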

23
Q

What is Mean Absolute Error (MAE)?

A

The variance in this example indicates by how many ice creams each prediction was wrong. It doesn’t matter if the prediction was over or under the actual value (so for example, -3 and +3 both indicate a variance of 3). This metric is known as the absolute error for each prediction, and can be summarized for the whole validation set as the mean absolute error (MAE).

In short, MAE is the average of the absolute errors between the predicted and actual values.
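
As a small worked example, MAE can be computed like this. The sales figures are hypothetical values chosen for illustration and do not come from the original course material.

```python
# Hypothetical validation data: actual vs. predicted ice cream sales.
actual = [10, 20, 30, 40, 50]
predicted = [10, 22, 27, 41, 46]

# Absolute error per prediction: the sign of the error is discarded.
absolute_errors = [abs(p - a) for p, a in zip(predicted, actual)]

# MAE is the mean of the absolute errors.
mae = sum(absolute_errors) / len(absolute_errors)

print(absolute_errors)  # [0, 2, 3, 1, 4]
print(mae)              # 2.0
```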

24
Q

Mean Squared Error (MSE)

A

The mean absolute error metric takes all discrepancies between predicted and actual labels into account equally. However, it may be more desirable to have a model that is consistently wrong by a small amount than one that makes fewer, but larger errors. One way to produce a metric that “amplifies” larger errors is to square the individual errors and calculate the mean of the squared values. This metric is known as the mean squared error (MSE).
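
To see the “amplifying” effect, the sketch below compares two hypothetical models with the same MAE. Model A is consistently wrong by a small amount, and model B is usually right but makes one large error. The error values are invented for illustration.

```python
# Hypothetical prediction errors (predicted minus actual) for two models.
errors_a = [2, -2, 2, -2]  # consistently off by a small amount
errors_b = [0, 0, 0, 8]    # mostly right, with one large miss


def mae(errors):
    return sum(abs(e) for e in errors) / len(errors)


def mse(errors):
    return sum(e ** 2 for e in errors) / len(errors)


print(mae(errors_a), mae(errors_b))  # 2.0 2.0
print(mse(errors_a), mse(errors_b))  # 4.0 16.0
```

Both models have the same MAE, but the MSE of model B is four times larger because squaring makes the single large error dominate.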

25
Q

Root Mean Squared Error (RMSE)

A

The mean squared error helps take the magnitude of errors into account, but because it squares the error values, the resulting metric no longer represents the quantity measured by the label. In other words, we can say that the MSE of our model is 6, but that doesn’t measure its accuracy in terms of the number of ice creams that were mispredicted; 6 is just a numeric score that indicates the level of error in the validation predictions. Taking the square root of the MSE gives the root mean squared error (RMSE), which expresses the error in the same units as the label. For an MSE of 6, the RMSE is about 2.45 ice creams.
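
A quick sketch of the calculation, using hypothetical sales figures chosen so that the MSE comes out at 6:

```python
import math

# Hypothetical validation data: actual vs. predicted ice cream sales.
actual = [10, 20, 30, 40, 50]
predicted = [10, 22, 27, 41, 46]

squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
mse = sum(squared_errors) / len(squared_errors)

# RMSE is the square root of the MSE, back in the label's units.
rmse = math.sqrt(mse)

print(mse)             # 6.0
print(round(rmse, 2))  # 2.45
```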

26
Q

Coefficient of determination (R²)

A

All of the metrics so far compare the discrepancy between the predicted and actual values in order to evaluate the model. However, in reality, there’s some natural random variance in the daily sales of ice cream that the model takes into account. In a linear regression model, the training algorithm fits a straight line that minimizes the mean variance between the function and the known label values. The coefficient of determination (more commonly referred to as R² or R-Squared) is a metric that measures the proportion of variance in the validation results that can be explained by the model, as opposed to some anomalous aspect of the validation data (for example, a day with a highly unusual number of ice cream sales because of a local festival).
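
One common way to calculate this metric is R² = 1 − (sum of squared errors ÷ sum of squared differences from the mean of the actual values). The sketch below applies that formula to hypothetical sales figures. A value close to 1 means the model explains most of the variance.

```python
# Hypothetical validation data: actual vs. predicted ice cream sales.
actual = [10, 20, 30, 40, 50]
predicted = [10, 22, 27, 41, 46]

mean_actual = sum(actual) / len(actual)

# Variance the model failed to explain (residual sum of squares).
ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))

# Total variance of the actual labels around their mean.
ss_total = sum((a - mean_actual) ** 2 for a in actual)

r2 = 1 - ss_residual / ss_total
print(round(r2, 2))  # 0.97
```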

27
Q

Confusion matrix

A

A matrix that compares predicted labels with actual labels by counting the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
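
A minimal sketch of how the four counts are tallied for a binary classifier. The label lists are hypothetical, with 1 meaning positive and 0 meaning negative.

```python
# Hypothetical actual and predicted labels (1 = positive, 0 = negative).
actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

pairs = list(zip(actual, predicted))
TP = sum(1 for a, p in pairs if a == 1 and p == 1)  # correctly predicted positive
TN = sum(1 for a, p in pairs if a == 0 and p == 0)  # correctly predicted negative
FP = sum(1 for a, p in pairs if a == 0 and p == 1)  # predicted positive, actually negative
FN = sum(1 for a, p in pairs if a == 1 and p == 0)  # predicted negative, actually positive

# Rows are actual labels, columns are predicted labels.
print("            pred 0  pred 1")
print(f"actual 0    {TN:6d}  {FP:6d}")
print(f"actual 1    {FN:6d}  {TP:6d}")
```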

28
Q

Accuracy

A

The simplest metric you can calculate from the confusion matrix is accuracy - the proportion of predictions that the model got right. Accuracy is calculated as:

(TN+TP) ÷ (TN+FN+FP+TP)
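
Applied to hypothetical confusion matrix counts (assumed for illustration), the formula works out as follows:

```python
# Hypothetical confusion matrix counts.
TP, TN, FP, FN = 3, 4, 2, 1

# Accuracy: correct predictions divided by all predictions.
accuracy = (TN + TP) / (TN + FN + FP + TP)
print(accuracy)  # 0.7
```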

29
Q

Recall

A

Recall is a metric that measures the proportion of positive cases that the model identified correctly. In other words, compared to the number of patients who have diabetes, how many did the model predict to have diabetes?

The formula for recall is:

TP ÷ (TP+FN)
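
As a worked example with hypothetical counts: if 4 patients actually have diabetes and the model finds 3 of them, recall is 0.75.

```python
# Hypothetical counts: 4 patients actually have diabetes.
TP = 3  # correctly predicted to have diabetes
FN = 1  # have diabetes, but the model missed them

recall = TP / (TP + FN)
print(recall)  # 0.75
```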

30
Q

Precision

A

Precision is a similar metric to recall, but measures the proportion of predicted positive cases where the true label is actually positive. In other words, what proportion of the patients predicted by the model to have diabetes actually have diabetes?

The formula for precision is:

TP ÷ (TP+FP)
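
As a worked example with hypothetical counts: if the model predicts diabetes for 5 patients and 3 of them actually have it, precision is 0.6.

```python
# Hypothetical counts: the model predicted diabetes for 5 patients.
TP = 3  # predicted positive and actually positive
FP = 2  # predicted positive but actually negative

precision = TP / (TP + FP)
print(precision)  # 0.6
```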

31
Q

F1-score

A

F1-score is an overall metric that combines recall and precision. The formula for F1-score is:

(2 × Precision × Recall) ÷ (Precision + Recall)
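
A short sketch of the calculation, starting from hypothetical confusion matrix counts:

```python
# Hypothetical confusion matrix counts.
TP, FP, FN = 3, 2, 1

precision = TP / (TP + FP)  # 0.6
recall = TP / (TP + FN)     # 0.75

# F1-score balances precision and recall in a single number.
f1 = (2 * precision * recall) / (precision + recall)
print(round(f1, 3))  # 0.667
```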

32
Q

Area Under the Curve (AUC)

A

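Another name for recall is the true positive rate (TPR), and there’s an equivalent metric called the false positive rate (FPR) that is calculated as FP÷(FP+TN). Plotting the TPR against the FPR for every possible classification threshold between 0 and 1 produces the receiver operating characteristic (ROC) curve, and the AUC is the area under that curve. An AUC of 1.0 indicates a perfect model, while 0.5 is what random guessing would score.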
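
The sketch below shows one simple way to compute the area: calculate the TPR and FPR at each threshold, then add up the area under the resulting points with the trapezoid rule. The labels and predicted probabilities are hypothetical, and `roc_points` and `area_under_curve` are illustrative helper names rather than functions from any library.

```python
# Hypothetical actual labels and predicted probabilities of the positive class.
actual = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]


def roc_points(actual, scores):
    """Return (FPR, TPR) points, one per threshold, starting from (0, 0)."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoid rule over consecutive (FPR, TPR) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


points = roc_points(actual, scores)
auc = area_under_curve(points)
print(points)  # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(auc)     # 0.75
```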

33
Q

Deep Learning

A

A form of machine learning that, in a limited fashion, mimics the way the human brain learns by using artificial neural networks. We train (or fit) a model so that each input has a weight that affects the prediction. A higher weight has more impact.

34
Q

What is the loss function?

A

The loss function determines the overall variance, or loss, between predicted and actual label values.

35
Q

What is a feature?

A

A feature is a model input: features are the fields in the data that drive an answer (a label).

36
Q

What is a label?

A

A label is the answer a model predicts: a yes/no, a value from a list, or a number. We look at features to predict a label.

37
Q

What is feature selection?

A

During preprocessing, you determine which features influence the prediction.

38
Q

What algorithm is used for clustering?

A

K-means clustering
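
A minimal one-dimensional sketch of the k-means idea: assign each point to its nearest centroid, move each centroid to the mean of its cluster, and repeat until nothing changes. The data points and starting centroids are hypothetical, and `k_means` is an illustrative helper. Real implementations usually choose starting centroids randomly and work on many features.

```python
def k_means(points, centroids, max_iterations=100):
    """Cluster 1-D points around the given starting centroids."""
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)

        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        new_centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable, so stop
        centroids = new_centroids
    return centroids, clusters


# Hypothetical observations with two obvious groups.
points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means(points, centroids=[1.0, 1.5])
print(centroids)  # [1.5, 10.0]
print(clusters)   # [[1.0, 1.5, 2.0], [9.0, 10.0, 11.0]]
```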

39
Q

What are the 4 typical data preparation steps for model training?

A

Select features
Remove data outliers
Impute missing values
Normalize numeric features
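
The last three steps can be sketched on a single numeric feature. Everything here is an assumption made for illustration: the values are hypothetical, outliers are defined by a fixed valid range, missing values are filled with the mean, and normalization uses min-max scaling to the range 0 to 1. Other rules are equally valid.

```python
# Hypothetical raw values for one numeric feature (None = missing).
raw = [2.0, None, 4.0, 6.0, 500.0]

# Remove outliers: keep values inside an assumed valid range.
LOW, HIGH = 0.0, 100.0
no_outliers = [v for v in raw if v is None or LOW <= v <= HIGH]

# Impute missing values with the mean of the known values.
known = [v for v in no_outliers if v is not None]
mean = sum(known) / len(known)
imputed = [mean if v is None else v for v in no_outliers]

# Normalize to the range 0..1 (min-max scaling).
lowest, highest = min(imputed), max(imputed)
normalized = [(v - lowest) / (highest - lowest) for v in imputed]

print(no_outliers)  # [2.0, None, 4.0, 6.0]
print(imputed)      # [2.0, 4.0, 4.0, 6.0]
print(normalized)   # [0.0, 0.5, 0.5, 1.0]
```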

40
Q

The responsible AI principle that describes raising awareness of the limitations of AI-based solutions is called:

A

Transparency

41
Q

What capabilities does Azure Cognitive Services support?

A

Azure Text Analytics supports chatbot integration, multilingual content, and confidence scoring. It recognizes about 120 languages. Document sizes must be under 5,120 characters.

42
Q

What is chitchat?

A

You can add personality to a chatbot by providing answers that use a specific conversational tone. You use the chitchat feature to add the answers to a chatbot knowledge base.

43
Q

Which is a multiclass classification algorithm?

A

Decision Forest

44
Q

Principal Component Analysis is:

A

A technique used to simplify complex data sets by finding the most important features or dimensions.

45
Q

The fraction of time when the model is correct is known as:

A

Accuracy

46
Q

Which of these confirms how often the model's positive predictions are correct:

A

Precision

47
Q

Which value identifies how many of the actual positive cases the model finds?

A

Recall