Machine Learning Engineering Associate 2 Flashcards by Yitzchak Meirovich

Data Wrangler

Visual data preparation tool in Amazon SageMaker for exploring, transforming, and analyzing data

Glue

Fully managed extract, transform, and load (ETL) service

Glue DataBrew

Visual data preparation tool that makes it easy to clean and normalize data

Kinesis

Platform for streaming data on AWS

Lambda

Serverless compute service for running code without provisioning servers

SageMaker Ground Truth

Fully managed data labeling service for building accurate training datasets

Class imbalance

Situation where classes in a dataset are not represented equally
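One common response to class imbalance is to weight each class inversely to its frequency. The sketch below is illustrative only; the function name `balanced_class_weights` is hypothetical and the formula n / (k * count) is one widely used heuristic, not the only option.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a weight per class that is inversely proportional to its frequency.

    Uses the heuristic n_samples / (n_classes * class_count), so rare classes
    get larger weights and a perfectly balanced dataset gets weight 1.0 each.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Example: 90 negatives and 10 positives
weights = balanced_class_weights([0] * 90 + [1] * 10)
```

The minority class ends up with a much larger weight than the majority class, which a training loss can use to penalize minority-class errors more heavily.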

Server-side encryption

Data encryption performed by the storage service

Client-side encryption

Data encryption performed by the client before sending to storage

Data anonymization

Removing or encrypting personally identifiable information from datasets

Supervised learning

ML approach where the model is trained on labeled data

Unsupervised learning

ML approach where the model is trained on unlabeled data

Reinforcement learning

ML approach where an agent learns to make decisions by interacting with an environment

Feature importance

Measure of how much each feature contributes to the model’s predictions

SHAP values

SHapley Additive exPlanations: a game-theoretic approach to explaining machine learning model outputs

XGBoost

Gradient boosting algorithm known for speed and performance

Epoch

One complete pass through the entire training dataset

Early stopping

Technique to stop training when performance on a validation set stops improving
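A minimal sketch of the idea, assuming validation loss is recorded once per epoch. The function name `early_stopping_epoch` and the `patience` and `min_delta` parameters are hypothetical names chosen for illustration, though many frameworks expose similar settings.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs. If that never happens,
    the last epoch is returned.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


# Validation loss improves for three epochs, then stalls
stop_at = early_stopping_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.71], patience=2)
```

In practice the weights from the best epoch (epoch 3 in this example) are usually restored after stopping.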

Distributed training

Spreading the training process across multiple compute resources

Hyperparameter tuning

Process of finding the best combination of hyperparameters for a model

Transfer learning

Using knowledge gained from solving one problem to solve a related problem

Dropout

Technique where randomly selected neurons are ignored during training
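A rough sketch of "inverted" dropout on a plain list of activations, assuming a drop probability `p` below 1. The function name `dropout` and its parameters are hypothetical; deep learning frameworks provide their own layers for this.

```python
import random


def dropout(activations, p=0.5, training=True, rng=None):
    """Zero each activation with probability p during training.

    Surviving activations are scaled by 1 / (1 - p) so the expected value
    stays the same. At inference time the input is returned unchanged.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep_prob = 1.0 - p
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]


# Seeded generator so the example is repeatable
dropped = dropout([1.0, 2.0, 3.0, 4.0], p=0.5, rng=random.Random(0))
```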

Weight decay

Regularization technique that adds a penalty on the size of the weights (typically the squared L2 norm) to the loss function to reduce overfitting
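As an illustration, the penalty is often the squared L2 norm of the weights scaled by a coefficient. The sketch below assumes that form; the names `l2_penalized_loss`, `sgd_step_with_weight_decay`, and `weight_decay` are hypothetical.

```python
def l2_penalized_loss(base_loss, weights, weight_decay=0.01):
    """Add weight_decay * sum(w^2) to the unregularized loss."""
    penalty = weight_decay * sum(w * w for w in weights)
    return base_loss + penalty


def sgd_step_with_weight_decay(weights, grads, lr=0.1, weight_decay=0.01):
    """One gradient step on the penalized loss.

    The gradient of weight_decay * w^2 is 2 * weight_decay * w, so every
    weight is pulled toward zero in addition to following the data gradient.
    """
    return [w - lr * (g + 2.0 * weight_decay * w) for w, g in zip(weights, grads)]


loss = l2_penalized_loss(1.0, [1.0, 2.0], weight_decay=0.1)
```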

Random search

Randomly sampling hyperparameters from a defined search space
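A small sketch of the idea, assuming every hyperparameter is continuous and sampled uniformly from a (low, high) range. The function name `random_search`, the toy objective, and the parameter names are hypothetical.

```python
import random


def random_search(objective, space, n_trials=20, seed=0):
    """Sample n_trials points uniformly from `space` and keep the best one.

    `space` maps each hyperparameter name to a (low, high) range and
    `objective` returns a score where higher is better.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy objective that peaks at learning_rate = 0.3
best, score = random_search(
    lambda p: -(p["learning_rate"] - 0.3) ** 2,
    {"learning_rate": (0.0, 1.0)},
    n_trials=50,
)
```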

Bayesian optimization

Using a probabilistic model to guide the search for optimal hyperparameters

Confusion matrix

Table showing correct and incorrect predictions for each class
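For illustration, the table can be built by counting (actual, predicted) pairs. In this sketch rows are actual classes and columns are predicted classes, which is one common convention; the function name `confusion_matrix` and the sample labels are made up for the example.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Return a nested list where matrix[i][j] counts samples whose actual
    class is labels[i] and whose predicted class is labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for actual, predicted in zip(y_true, y_pred):
        matrix[index[actual]][index[predicted]] += 1
    return matrix


y_true = ["cat", "cat", "dog", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog", "cat"]
matrix = confusion_matrix(y_true, y_pred, labels=["cat", "dog"])
# matrix == [[1, 1], [1, 2]]
```

The diagonal holds the correct predictions and the off-diagonal cells hold the errors.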

F1 score

Harmonic mean of precision and recall
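A short worked sketch, computing the score from true positive, false positive, and false negative counts. The function name `f1_score` and the counts are chosen for illustration.

```python
def f1_score(tp, fp, fn):
    """Compute F1 as the harmonic mean of precision and recall.

    precision = tp / (tp + fp), recall = tp / (tp + fn).
    Returns 0.0 when there are no true positives.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# precision = 8/10 = 0.8, recall = 8/12 ≈ 0.667, F1 = 16/22 ≈ 0.727
score = f1_score(tp=8, fp=2, fn=4)
```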

ROC

Receiver operating characteristic curve: graph of true positive rate against false positive rate across all classification thresholds

AUC

Area under the ROC curve, a measure of the ability of a classifier to distinguish between classes
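One way to see what AUC measures: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it by comparing every positive/negative pair, assuming binary labels 0 and 1; the function name `auc` is hypothetical and this quadratic approach is only practical for small inputs.

```python
def auc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Three of the four positive/negative pairs are ranked correctly
value = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 1.0 means every positive outranks every negative, and 0.5 is what random scoring would give on average.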

Overfitting

Model performs well on training data but poorly on unseen data

Underfitting

Model performs poorly on both training and unseen data

Concept drift

Changes in the underlying relationships between input and output variables

Data drift

Changes in the statistical properties of the input data

A/B testing

Experiment where two variants of a model are compared to determine which performs better

CloudTrail

Service that records API calls and other account activity in AWS

Cost Explorer

Tool for visualizing, understanding, and managing AWS costs and usage over time

IAM roles

IAM identities with attached permission policies that trusted users, applications, or AWS services can assume to obtain temporary credentials

Security groups

Virtual firewalls for controlling inbound and outbound traffic to AWS resources

Network ACLs

Optional layer of security that acts as a firewall for controlling traffic in and out of subnets

Least privilege access

Principle of giving users the minimum levels of access necessary to complete their tasks

Data Transformation, Integrity and Feature Engineering (40 cards)