SageMaker Flashcards
What is automatic tuning in SageMaker?
You specify the objective metric and SageMaker automatically manages the tuning of hyperparameters, saving you time and money.
What is included in SageMaker automatic tuning ?
Hyperparameter ranges, search strategy, maximum runtime of a tuning job, and early stop condition.
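For comparison, here is a rough sketch of how the same settings look when written out by hand as a tuning job configuration. It is only a Python dictionary and nothing is sent to AWS. The field names follow the boto3 `create_hyper_parameter_tuning_job` request shape as best I recall it, so treat them as an assumption and check the API reference. The metric name, parameter names and numeric ranges are made up for illustration.

```python
# Illustrative configuration only; no AWS call is made here.
# Field names are assumed to match the boto3 request shape.
# Metric name, hyperparameter names and ranges are hypothetical.
tuning_job_config = {
    # Search strategy
    "Strategy": "Bayesian",
    # Objective metric the tuning job tries to optimise
    "HyperParameterTuningJobObjective": {
        "Type": "Maximize",
        "MetricName": "validation:auc",
    },
    # Budget: here expressed as job counts rather than wall-clock time
    "ResourceLimits": {
        "MaxNumberOfTrainingJobs": 20,
        "MaxParallelTrainingJobs": 2,
    },
    # Hyperparameter ranges (values are passed as strings)
    "ParameterRanges": {
        "ContinuousParameterRanges": [
            {"Name": "eta", "MinValue": "0.01", "MaxValue": "0.3"},
        ],
        "IntegerParameterRanges": [
            {"Name": "max_depth", "MinValue": "3", "MaxValue": "10"},
        ],
    },
    # Early stop condition
    "TrainingJobEarlyStoppingType": "Auto",
}
```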
What are the four deployment options in SageMaker?
Real-Time - Handles one prediction at a time; you configure CPU, GPU and autoscaling rules.
Serverless - For idle periods between traffic spikes, where more latency can be tolerated. RAM is configurable and autoscaling is out of the box.
Asynchronous - For large payloads (up to 1 GB), stored in S3, that need longer processing times.
Batch - High latency (minutes to hours), used for bulk processing of large data sets.
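One way to remember the four options is as a small decision rule. The sketch below is a study aid, not an AWS API. The function name, its parameters and the 4 MB "small payload" cutoff are all hypothetical and chosen only to make the rule concrete.

```python
# Hypothetical cutoff for a "small" request payload; not an official limit.
SMALL_PAYLOAD_MB = 4.0

def pick_deployment_option(bulk_dataset: bool, payload_mb: float, steady_traffic: bool) -> str:
    """Return one of the four deployment options using a simplified rule."""
    if bulk_dataset:
        # Whole data sets, minutes-to-hours latency is acceptable
        return "batch"
    if payload_mb > SMALL_PAYLOAD_MB:
        # Large payloads staged in S3, longer processing times
        return "asynchronous"
    if steady_traffic:
        # One prediction at a time on provisioned CPU/GPU with autoscaling rules
        return "real-time"
    # Spiky traffic with idle periods, extra latency is tolerable
    return "serverless"
```

For example, `pick_deployment_option(False, 500.0, True)` falls through to the asynchronous branch because the payload is large.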
What is SageMaker Canvas?
SageMaker Canvas is for developing ML models without the need for coding, using a visual interface. It gives access to ready-to-use models from Bedrock and JumpStart.
You can build your own custom model using AutoML powered by SageMaker Autopilot. You can also leverage Data Wrangler for data preparation.
What is SageMaker Clarify?
This is where we compare models using human evaluators (an AWS-managed workforce or your own) on metrics important to humans, such as humour or friendliness. You can use the built-in datasets or bring your own, and use the built-in metrics and algorithms. It is part of SageMaker Studio.
Model Explainability
This is a feature of Clarify where you can interrogate the model and find out why it came up with the responses that it did. It is the machine learning equivalent of a traditional programming debugger.
Detect Bias
This is the ability to detect and measure bias in your model or your datasets
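As an illustration of what "measuring bias in a dataset" can mean, the sketch below computes the difference in positive label proportions between two groups, which is one simple kind of pre-training bias measure. It is plain Python, not the Clarify API. The function name, the group labels and the data are made up.

```python
def difference_in_positive_proportions(labels, groups, reference_group):
    """Positive-label rate of the reference group minus that of everyone else.

    labels: iterable of 0/1 outcomes
    groups: iterable of group names, aligned with labels
    """
    ref_pos = ref_n = other_pos = other_n = 0
    for label, group in zip(labels, groups):
        if group == reference_group:
            ref_n += 1
            ref_pos += label
        else:
            other_n += 1
            other_pos += label
    return ref_pos / ref_n - other_pos / other_n

# Hypothetical data: group "a" is approved 3 times out of 4, group "b" once out of 4.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = difference_in_positive_proportions(labels, groups, "a")
```

A value near zero suggests both groups receive positive labels at similar rates; here the gap is 0.75 - 0.25 = 0.5.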
What are the data tools in SageMaker?
Data Wrangler
Prepare tabular and image data for machine learning
Data preparation, transformation and feature engineering
Single interface for data selection, cleansing, exploration, visualisation and processing
SQL support
Data Quality tool
ML Features
Once data has been evaluated and prepared, we need to select and create the features that will be used as inputs to ML models. It is important to have high-quality features across the datasets in your company. For example, after evaluation you may want to convert the birthdate field into an age in years, as it is more easily manipulated.
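The birthdate-to-age example can be sketched in a few lines of plain Python. This is only an illustration of the feature transformation, not Data Wrangler itself. The function name and the explicit `as_of` reference date are assumptions made so that the result is reproducible.

```python
from datetime import date

def age_in_years(birthdate: date, as_of: date) -> int:
    """Whole years between birthdate and the as_of date."""
    # Subtract one year if this year's birthday has not happened yet
    had_birthday = (as_of.month, as_of.day) >= (birthdate.month, birthdate.day)
    return as_of.year - birthdate.year - (0 if had_birthday else 1)

# The day before the 34th birthday the feature is still 33
age_before = age_in_years(date(1990, 6, 15), date(2024, 6, 14))
age_on_day = age_in_years(date(1990, 6, 15), date(2024, 6, 15))
```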
SageMaker Feature Store
This is where you can publish and re-use common features that you have created for your company. This means that the age feature from above can be re-used. These features can be published directly from Data Wrangler into the store.
What is SageMaker Ground Truth?
This is the UAT tool where we evaluate how good a model is with a wider audience of human reviewers: Amazon Mechanical Turk workers, your own employees or third-party vendors.
Can Ground Truth label your data?
Yes
Does Ground Truth use reinforcement learning from human feedback (RLHF)?
Yes
What is JumpStart?
The main aim is to get people up to speed and working with SageMaker. It is a hub where you can choose pre-trained models to be launched on SageMaker. There are more models here than are offered with Bedrock. Models can be fully customised for your data and use case.
You can also use pre-built ML solutions for demand forecasting, credit rate prediction, fraud detection and computer vision.
What are SageMaker model cards?
They document essential model information such as intended uses, risk ratings and training details.
What is the SageMaker model dashboard ?
A centralised repository with information and insights for all models, such as deployment tracking and a view of models that exceed your thresholds for bias. It reports the metrics from Model Monitor and can alert on deviations such as model drift.
What is the SageMaker role manager ?
Define roles for personas - such as Data Scientists, MLOps engineers etc for security
What is the Model Registry ?
This allows you to set an approval status for a model and automate model deployment. You can also catalog models, manage model versions and see associated metadata.
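The idea of gating deployment on approval status can be sketched as follows. This is plain Python over a made-up list of versions, not the Model Registry API. The status strings mirror the approval states I believe the registry uses, which is an assumption, and the function name is hypothetical.

```python
from typing import Optional

# Hypothetical registry contents for one model
model_versions = [
    {"version": 1, "status": "Approved"},
    {"version": 2, "status": "Rejected"},
    {"version": 3, "status": "PendingManualApproval"},
]

def latest_approved(versions: list) -> Optional[int]:
    """Return the highest version number marked Approved, or None if there is none."""
    approved = [v["version"] for v in versions if v["status"] == "Approved"]
    return max(approved) if approved else None

# Only version 1 is approved, so an automated deployment would pick it
version_to_deploy = latest_approved(model_versions)
```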
What are the seven step types of SageMaker Pipelines?
- Processing - for data processing
- Training - for training a model
- Tuning - Hyperparameter tuning
- AutoML - Automatically train a model
- Model - To create or register a SageMaker model
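- ClarifyCheck - Perform drift checks against baselines (data bias, model bias, model explainability)
- QualityCheck - Perform drift checks against baselines (data quality, model quality)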
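As a memory aid, the seven entries above can be written as an enumeration and strung together into an ordered pipeline. This is a plain-Python model, not the SageMaker Pipelines SDK. The class name, the step names in the example and the ordering are hypothetical.

```python
from enum import Enum

class StepType(Enum):
    PROCESSING = "Processing"
    TRAINING = "Training"
    TUNING = "Tuning"
    AUTOML = "AutoML"
    MODEL = "Model"
    CLARIFY_CHECK = "ClarifyCheck"
    QUALITY_CHECK = "QualityCheck"

# A made-up pipeline: each entry is (step name, step type), run in order
example_pipeline = [
    ("prepare-data", StepType.PROCESSING),
    ("check-data-quality", StepType.QUALITY_CHECK),
    ("train", StepType.TRAINING),
    ("check-bias", StepType.CLARIFY_CHECK),
    ("register", StepType.MODEL),
]

def describe(pipeline) -> str:
    """Render the pipeline as 'name (Type) -> name (Type) ...'."""
    return " -> ".join(f"{name} ({step.value})" for name, step in pipeline)

summary = describe(example_pipeline)
```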