Common AI Workloads Flashcards

1
Q

What features and capabilities does Azure Machine Learning provide?

A

Automated Machine Learning

Azure Machine Learning designer

Data and compute management

Pipelines

2
Q

What is labelling?

A

The process of identifying raw data (images, text files, audio, etc.) and adding one or more meaningful and informative labels to provide context for machine learning.

3
Q

What is unsupervised learning?

A

Unsupervised learning is a subcategory of ML defined by its use of unlabelled datasets to train models that discover hidden patterns or data groupings without human intervention.

4
Q

What is supervised learning?

A

Supervised learning is a subcategory of ML defined by its use of labelled datasets to train models that classify data or predict outcomes accurately.

5
Q

What are two examples of supervised learning models?

A

Classification and Regression.

6
Q

What is an example of an unsupervised learning model?

A

Clustering.

7
Q

What is a dataset?

A

A collection of data.

8
Q

What’s the difference between unsupervised ML labelling and supervised ML labelling?

A

With supervised ML, labelling is a prerequisite for producing training data, and each piece of data will generally be labelled by a human.

With unsupervised ML, labels are produced by the computer and may not be human-readable.

9
Q

What is regression?

A

A form of machine learning that is used to predict a numeric label based on an item’s features.

10
Q

What is time series forecasting?

A

Regression with a time-series element that predicts numeric values at a future point in time.

11
Q

What is classification?

A

A form of machine learning that is used to predict which category, or class, an item belongs to.

12
Q

What is clustering?

A

A form of machine learning that is used to group similar items into clusters based on their features.

13
Q

What is a ground truth?

A

A properly labelled dataset used as the objective standard to train and assess a given model.

The accuracy of the trained model is dependent on the accuracy of the ground truth.

14
Q

What are the stages of the ML pipeline and what are they for?

A

Pre-processing - preparing data and engineering features before passing the data to an ML model for training or inference.

Training - fitting the model to the prepared data.

Serving - deploying the trained model to an endpoint so it can be used for inference.

Inference - invoking an ML model by sending a request and receiving a prediction in response.

Post-processing - translating the output of an ML model back into a human-readable format.
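
As a rough illustration of how these stages fit together, here is a minimal Python sketch. Everything in it is an assumption made for the example: the temperature readings, the function names, and the "model" (a single learned threshold) are hypothetical stand-ins, not part of any real ML service.

```python
def preprocess(raw):
    """Pre-processing: turn a raw reading such as '21.5C' into a numeric feature."""
    return float(raw.strip().rstrip("C"))

def train(features, labels):
    """Training: learn a threshold midway between the two class means."""
    cold = [f for f, y in zip(features, labels) if y == 0]
    hot = [f for f, y in zip(features, labels) if y == 1]
    return (sum(cold) / len(cold) + sum(hot) / len(hot)) / 2

def postprocess(prediction):
    """Post-processing: map the numeric prediction to a human-readable label."""
    return "hot" if prediction == 1 else "cold"

def serve(threshold):
    """Serving: expose the trained model behind a callable 'endpoint'."""
    def endpoint(raw):
        feature = preprocess(raw)              # same pre-processing as in training
        prediction = int(feature > threshold)  # inference
        return postprocess(prediction)
    return endpoint

# Hypothetical labelled training data: 0 = cold, 1 = hot.
raw_training = ["5C", "10C", "25C", "30C"]
labels = [0, 0, 1, 1]

threshold = train([preprocess(r) for r in raw_training], labels)
endpoint = serve(threshold)
result = endpoint(" 28C ")
```

Note that the endpoint applies the same pre-processing at inference time as was used for training.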

15
Q

What is data cleaning?

A

The process of correcting errors within a dataset.

16
Q

What is data reduction?

A

Reducing the volume of data, or applying dimensionality reduction to reduce the number of dimensions in the input vectors.

17
Q

What is feature engineering?

A

Transforming data into numerical values (vectors) to be ingested by an ML model.

18
Q

What is sampling?

A

Balancing a dataset to be uniform across labels by adding or removing records.
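
One way to picture the "removing records" half of this is random undersampling: shrink every label group to the size of the smallest one. The sketch below is only an illustration; the `undersample` helper and the spam/ham records are hypothetical.

```python
import random
from collections import Counter, defaultdict

def undersample(records, label_of, seed=0):
    """Randomly drop records so every label has as many records as the rarest label."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    by_label = defaultdict(list)
    for record in records:
        by_label[label_of(record)].append(record)
    target = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    return balanced

# Hypothetical imbalanced dataset: 2 spam messages, 6 ham messages.
records = [(f"msg{i}", "spam") for i in range(2)] + [(f"msg{i}", "ham") for i in range(2, 8)]
balanced = undersample(records, label_of=lambda r: r[1])
counts = Counter(label for _, label in balanced)
```

Oversampling works the other way around, adding (for example, duplicating) records of the rarer labels until the groups match.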

19
Q

How are features and labels used in ML?

A

ML uses features to predict labels.

20
Q

What is a training dataset?

A

A training dataset is used to train an ML model.

21
Q

What is a validation dataset?

A

A validation dataset is used to estimate the accuracy of an ML model.

22
Q

What is an algorithm in ML?

A

A procedure run on data to create an ML model.

23
Q

How do ML algorithms work?

A

By performing pattern recognition. They learn from data or are fit on a dataset.

24
Q

What are the 5 model evaluation metrics for classification?

A

Accuracy

Precision

Recall

F1 Score

AUC

25
Q

What does MAE measure?

A

The average absolute difference between predicted values and true values.

The lower this value is, the better the model is predicting.

26
Q

What does RMSE measure?

A

The square root of the mean squared difference between predicted values and true values.

When compared to the MAE, a larger difference indicates greater variance in the individual errors.
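
A small worked example may make the MAE/RMSE comparison concrete. The sketch below uses made-up numbers: both sets of predictions have the same MAE, but the set with one large error has a higher RMSE.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average of |prediction - truth| over all items."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_squared_error(y_true, y_pred):
    """Square root of the average squared error."""
    mse = sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

# Hypothetical true labels and two sets of predictions.
y_true = [10.0, 20.0, 30.0, 40.0]
even_errors = [12.0, 18.0, 32.0, 38.0]    # every prediction is off by 2
uneven_errors = [10.0, 20.0, 30.0, 48.0]  # one prediction is off by 8

mae_even = mean_absolute_error(y_true, even_errors)          # 2.0
rmse_even = root_mean_squared_error(y_true, even_errors)     # 2.0
mae_uneven = mean_absolute_error(y_true, uneven_errors)      # 2.0
rmse_uneven = root_mean_squared_error(y_true, uneven_errors) # 4.0
```

Because the errors are squared before averaging, RMSE penalises the single large miss more heavily than MAE does.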

27
Q

What unit are MAE and RMSE based on?

A

The same unit as the label.

28
Q

What is RSE?

A

Relative squared error. A relative metric based on the square of the differences between predicted values and true values.

29
Q

What is RAE?

A

Relative absolute error. A relative metric based on the absolute differences between predicted values and true values.

30
Q

What is the range of RSE and RAE?

A

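0 to 1 in typical use (a value can exceed 1 when a model predicts worse than simply guessing the mean label).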

The closer to 0 the metric is, the better the model is performing.

31
Q

What can you use RSE and RAE for and why?

A

As the metrics are relative, they can be used to compare models where the labels are in different units.
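
To see why the units cancel out, here is a minimal sketch. It assumes the common definitions in which each metric divides the model's total error by the total error of a baseline that always predicts the mean label; the data are made up.

```python
import math

def relative_absolute_error(y_true, y_pred):
    """Sum of absolute errors divided by the absolute errors of a predict-the-mean baseline."""
    mean = sum(y_true) / len(y_true)
    model = sum(abs(p - t) for t, p in zip(y_true, y_pred))
    baseline = sum(abs(t - mean) for t in y_true)
    return model / baseline

def relative_squared_error(y_true, y_pred):
    """Sum of squared errors divided by the squared errors of a predict-the-mean baseline."""
    mean = sum(y_true) / len(y_true)
    model = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    baseline = sum((t - mean) ** 2 for t in y_true)
    return model / baseline

# Hypothetical labels in metres, then the same data converted to centimetres.
true_m, pred_m = [1.0, 2.0, 3.0], [1.5, 2.0, 2.5]
true_cm = [v * 100 for v in true_m]
pred_cm = [v * 100 for v in pred_m]

rae_m = relative_absolute_error(true_m, pred_m)     # 0.5
rae_cm = relative_absolute_error(true_cm, pred_cm)  # 0.5
rse_m = relative_squared_error(true_m, pred_m)      # 0.25
rse_cm = relative_squared_error(true_cm, pred_cm)   # 0.25
```

Changing the unit scales the numerator and the denominator by the same factor, so the ratio is unchanged.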

32
Q

What does accuracy measure?

A

The ratio of correct predictions (true positives + true negatives) to the total number of predictions.

33
Q

What does precision measure?

A

The fraction of predicted positives that were actually positive.

True positives / (true positives + false positives)

34
Q

What does recall measure?

A

The fraction of actual positives that the model correctly identified.

True positives / (true positives + false negatives)

35
Q

What is F1 Score?

A

An overall metric that combines precision and recall as their harmonic mean.
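
The four count-based metrics can be worked through on a small made-up example. This is a sketch using the standard confusion-matrix definitions; the labels and predictions are hypothetical, with 1 as the positive class.

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives and false negatives."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

# Hypothetical data: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)  # 2, 1, 5, 2
accuracy = (tp + tn) / (tp + fp + tn + fn)          # 7 / 10
precision = tp / (tp + fp)                          # 2 / 3
recall = tp / (tp + fn)                             # 2 / 4
f1 = 2 * precision * recall / (precision + recall)  # 4 / 7
```

Here the model looks reasonable on accuracy but finds only half of the actual positives, which is why recall and F1 are lower.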

36
Q

What is AUC?

A

Area under curve. It is the metric that measures the area under the ROC curve.

It can be any value from 0 to 1.

The larger the AUC, the better the model is performing.
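
One way to compute AUC without drawing the ROC curve relies on an equivalent interpretation: AUC is the probability that a randomly chosen positive item receives a higher score than a randomly chosen negative item, with ties counted as half. The sketch below uses that pairwise approach on made-up scores; it is fine for small examples but slow on large datasets.

```python
def auc(y_true, scores):
    """Fraction of (positive, negative) pairs in which the positive is scored higher."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical labels and predicted scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

auc_value = auc(y_true, scores)                # 0.75
auc_perfect = auc(y_true, [0.1, 0.2, 0.8, 0.9])  # 1.0
```

A value of 0.5 corresponds to scores that rank positives no better than random guessing.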

37
Q

What is knowledge mining?

A

A discipline in AI that uses a combination of intelligent services to quickly learn from vast amounts of information.

38
Q

What would you use knowledge mining for?

A

Content research

Auditing, risk, and compliance management

Business process management

Customer support and feedback analysis

Digital asset management

Contract management

39
Q

What is feature selection?

A

The process of deciding which relevant original features to include and which irrelevant features to exclude for predictive modelling.

40
Q

What is the difference between feature selection and dimensionality reduction?

A

In feature selection, the original features don’t change. In dimensionality reduction, new features are created from the original features.
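
The contrast can be sketched in a few lines of Python. The data, the helper names, and the fixed weights are all hypothetical; in practice the combination would be learned by a technique such as PCA rather than written by hand.

```python
# Hypothetical records with three original features.
rows = [
    {"height_cm": 170.0, "weight_kg": 65.0, "shoe_size": 42.0},
    {"height_cm": 180.0, "weight_kg": 80.0, "shoe_size": 44.0},
]

def select_features(rows, keep):
    """Feature selection: keep a subset of the original features, values unchanged."""
    return [{name: row[name] for name in keep} for row in rows]

def combine_features(rows, weights, new_name):
    """Dimensionality reduction (hand-written stand-in): build one new feature
    as a weighted sum of several original features."""
    return [
        {new_name: sum(w * row[name] for name, w in weights.items())}
        for row in rows
    ]

selected = select_features(rows, ["height_cm", "weight_kg"])
reduced = combine_features(rows, {"height_cm": 0.5, "weight_kg": 0.5}, "body_size")
# selected[0] -> {"height_cm": 170.0, "weight_kg": 65.0}
# reduced[0]  -> {"body_size": 117.5}
```

The selected rows still contain recognisable original columns, while the reduced rows contain a new column that did not exist in the source data.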