AWS SAGEMAKER Flashcards
what is aws sagemaker?
the one place for ML, a fully-managed service to build ML models with built-in algorithms
what is automatic model tuning (AMT)?
also called hyperparameter tuning, it’s a feature of aws sagemaker that automatically optimizes hyperparameters to improve performance in ML models
you define the objective metric to optimize, and AMT automatically chooses hyperparameter ranges, search strategy, maximum runtime of a tuning job, and early stop conditions
saves time and money
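a minimal sketch of the idea behind hyperparameter tuning, in plain python and NOT the sagemaker API: it assumes random search as the strategy (one of several possible strategies) and a cap on the number of jobs. the objective function, the parameter names and the ranges are all made up for illustration.

```python
import random

def toy_objective(learning_rate, max_depth):
    """Hypothetical validation score; highest near learning_rate=0.1, max_depth=6."""
    return 1.0 - abs(learning_rate - 0.1) - 0.01 * abs(max_depth - 6)

def random_search(ranges, objective, max_jobs, seed=0):
    """Try max_jobs random configurations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(max_jobs):
        # Sample one value per hyperparameter from its (low, high) range.
        params = {
            "learning_rate": rng.uniform(*ranges["learning_rate"]),
            "max_depth": rng.randint(*ranges["max_depth"]),
        }
        score = objective(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Illustrative ranges and budget (assumptions, not sagemaker defaults).
ranges = {"learning_rate": (0.01, 0.3), "max_depth": (2, 10)}
best_params, best_score = random_search(ranges, toy_objective, max_jobs=20)
print(best_params, best_score)
```

a managed tuning job does the same loop for you, but each "job" is a real training run, which is where the time and cost savings come from.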
inference is the process of making predictions using a deployed model, what is the model deployment process like in sagemaker?
deployment with one click, automatic scaling, no servers to manage
real-time inferences: one prediction at a time
asynchronous inferences: for large payload sizes up to 1GB, long processing times, near-real time latency requirements, requests and responses are stored in s3
batch inferences: predictions for an entire dataset (multiple predictions), requests and responses are stored in s3 - NOT for fine-tuning
serverless inferences: for workloads with idle periods between traffic spikes that can tolerate more latency
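a rough way to remember the four inference options above is as a decision function. this is only a study aid in plain python: the function name, the argument names and the order of the checks are assumptions, not an aws rule.

```python
def choose_inference_option(whole_dataset, large_payload_or_long_processing,
                            intermittent_traffic):
    """Map workload traits from the card above to an inference option."""
    if whole_dataset:
        # Predictions for an entire dataset at once.
        return "batch"
    if large_payload_or_long_processing:
        # Large payloads or long processing, near-real-time latency is fine.
        return "asynchronous"
    if intermittent_traffic:
        # Idle periods between traffic spikes, more latency is acceptable.
        return "serverless"
    # Default: one prediction at a time with low latency.
    return "real-time"

print(choose_inference_option(False, True, False))
```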
what is sagemaker studio?
IDE interface that allows E2E ML development from a unified interface
team collaboration
tune and debug ML models
deploy ML models
automated workflows
what is sagemaker clarify?
part of sagemaker studio
for FM evaluations
bias detection in datasets and models
explainability (why and how predictions are made)
explainability: a transparent and explainable ML model fosters TRUST and CONFIDENCE in predictions, and facilitates debugging and optimization
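to make "bias detection in datasets" concrete, here is a small sketch of one simple measure: the difference in the share of positive labels between two groups. it is plain python with made-up data and function names, and it is not clarify's own implementation.

```python
def positive_rate(labels):
    """Share of labels equal to 1."""
    return sum(labels) / len(labels)

def difference_in_proportions_of_labels(labels, groups, group_a, group_d):
    """Positive-label rate of group_a minus that of group_d."""
    labels_a = [y for y, g in zip(labels, groups) if g == group_a]
    labels_d = [y for y, g in zip(labels, groups) if g == group_d]
    return positive_rate(labels_a) - positive_rate(labels_d)

# Hypothetical dataset: 1 = approved, 0 = rejected, two groups "a" and "d".
labels = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "d", "d", "d", "d"]
dpl = difference_in_proportions_of_labels(labels, groups, "a", "d")
print(dpl)  # 0.75 - 0.25 = 0.5
```

a value near 0 suggests both groups receive positive labels at a similar rate; a value far from 0 is a signal to investigate the dataset before training.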
what is sagemaker data wrangler?
it’s a data quality tool
interface for data selection, cleansing, exploration, visualization and processing
prepares tabular and image data for ML
sql support
what is the sagemaker feature store?
a fully-managed repository for storing, sharing, and retrieving features used in ML models
can publish directly from sagemaker data wrangler into the feature store
features are discoverable within sagemaker studio
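the core idea of a feature store (write features once, look them up by an identifier later) can be sketched with a toy in-memory class. this is plain python and NOT the sagemaker feature store API: the class, its methods and the rule of keeping only the newest record per identifier are assumptions for illustration.

```python
class ToyFeatureStore:
    """Keeps the most recent feature record per record identifier."""

    def __init__(self):
        self._latest = {}  # record_id -> (event_time, features)

    def put_record(self, record_id, event_time, features):
        """Store features unless a newer record already exists for record_id."""
        current = self._latest.get(record_id)
        if current is None or event_time >= current[0]:
            self._latest[record_id] = (event_time, dict(features))

    def get_record(self, record_id):
        """Return a copy of the newest features for record_id, or None."""
        entry = self._latest.get(record_id)
        return None if entry is None else dict(entry[1])

store = ToyFeatureStore()
store.put_record("customer-1", event_time=1, features={"orders_30d": 2})
store.put_record("customer-1", event_time=2, features={"orders_30d": 3})
print(store.get_record("customer-1"))
```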
what is sagemaker groundtruth?
fully managed data labelling service for creating high-quality datasets for ML models
use workers from mechanical turk, your employees or third party vendors for human feedback
in sagemaker groundtruth plus you can also use this workforce for data labelling
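when several workers label the same item, their answers have to be combined into one label. a minimal sketch of that step, assuming a simple majority vote (plain python, made-up function name, not the groundtruth API or its actual consolidation logic):

```python
from collections import Counter

def consolidate_labels(annotations):
    """Return (majority_label, agreement) for one item's worker labels.

    On a tie, the label that appears first in the list wins.
    """
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(annotations)

# Three hypothetical workers labelled the same image.
label, agreement = consolidate_labels(["cat", "cat", "dog"])
print(label, agreement)
```

a low agreement score is a hint that the item is ambiguous and may need review by another worker.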
about governance in aws sagemaker
sagemaker model cards: gather essential model info in one place, describe how a model should be used in production
sagemaker model dashboard: centralized repository, information and insights for all models
sagemaker role manager: define roles for personas in the aws account
sagemaker model monitor: monitor the quality of your model in production, set up alerts for deviations in the model quality
sagemaker model registry: centralized repository that allows you to track, manage and version ML models
sagemaker pipelines: create a workflow that automates the process of building, training and deploying a model
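the model monitor card above ("set up alerts for deviations in the model quality") can be pictured as a comparison between a baseline and live results. a minimal sketch in plain python, where accuracy as the quality metric, the function names and the 0.05 threshold are all assumptions, not model monitor's API:

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def quality_alert(baseline_accuracy, predictions, labels, max_drop=0.05):
    """Return (alert, live_accuracy); alert is True if accuracy dropped too far."""
    live_accuracy = accuracy(predictions, labels)
    return (baseline_accuracy - live_accuracy) > max_drop, live_accuracy

# Hypothetical production sample: only 2 of 4 predictions are correct.
alert, live_accuracy = quality_alert(0.9, [1, 0, 1, 1], [1, 0, 0, 0])
print(alert, live_accuracy)
```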
what is sagemaker jumpstart?
an ML hub to find pre-trained FMs, computer vision models or NLP models
models can be fully customized for data and use-cases and are deployed directly on sagemaker
provides pre-trained, open-source and proprietary models
you can evaluate and compare models quickly
what is sagemaker canvas?
build ML models with no coding required with a visual interface
access ready-to-use models from bedrock or jumpstart
what is MLflow?
open-source tool which helps teams manage the entire ML lifecycle
tracking servers are used to track runs and experiments, launch on sagemaker with a few clicks
sagemaker summary
sagemaker allows you to build, train and deploy models in one place
sagemaker: E2E ML service
sagemaker automatic model tuning: tune hyperparameters
sagemaker deployment and inference: real-time, serverless, batch, asynchronous
sagemaker studio: unified interface for sagemaker
sagemaker data wrangler: explore and prepare datasets, create features
sagemaker feature store: store features and their metadata in a central place
sagemaker clarify: compare models, explain model outputs, detect bias
sagemaker groundtruth: RLHF, humans for model grading and data labelling
sagemaker model cards: ML model documentation
sagemaker model dashboard: view all your models in one place
sagemaker model monitor: monitoring and alerts for your model
sagemaker model registry: centralized repository to manage ML model versions
sagemaker pipelines: CI/CD for ML
sagemaker role manager: IAM
sagemaker jumpstart: ML model hub and pre-built ML solutions
sagemaker canvas: no code interface for sagemaker
MLflow on sagemaker: use MLflow tracking servers on aws