AWS Certified Machine Learning - Specialty (MLS-C01) Flashcards
NOTE: You can use Amazon S3 while training ML models with SageMaker. S3 is integrated with SageMaker to store training data and training output.
Amazon FSx for Lustre
Use when the training data is already in S3 and you plan to run training jobs several times with different algorithms and parameters.
Speeds up training jobs by copying the S3 data into a high-performance file system and serving it to SageMaker at high speed, so repeated jobs avoid re-downloading the same objects.
NOTE: If the training data is already in Amazon EFS, AWS recommends using that as the training data source.
EFS lets you launch training jobs directly, with no data movement, resulting in faster training start times.
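EXAMPLE: A minimal sketch of how the three data sources above (S3, FSx for Lustre, EFS) could be expressed as one input channel in the InputDataConfig of a SageMaker CreateTrainingJob request. The helper name build_channel, the bucket path, and the file system ID are hypothetical placeholders, not values from these cards. File system sources also assume the training job runs in a VPC that can reach the file system.

```python
def build_channel(name, source_type, location, file_system_id=None):
    """Return one InputDataConfig channel dict for a CreateTrainingJob request.

    source_type is "S3", "FSxLustre", or "EFS".
    location is an S3 URI for S3, or a directory path for a file system.
    """
    if source_type == "S3":
        data_source = {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": location,
                "S3DataDistributionType": "FullyReplicated",
            }
        }
    elif source_type in ("FSxLustre", "EFS"):
        if not file_system_id:
            raise ValueError("file_system_id is required for file system sources")
        data_source = {
            "FileSystemDataSource": {
                "FileSystemId": file_system_id,   # placeholder ID in this sketch
                "FileSystemType": source_type,
                "FileSystemAccessMode": "ro",     # read-only is enough for training data
                "DirectoryPath": location,
            }
        }
    else:
        raise ValueError(f"unsupported source type: {source_type}")
    return {"ChannelName": name, "DataSource": data_source}


# Hypothetical values for illustration only.
s3_channel = build_channel("train", "S3", "s3://example-bucket/train/")
fsx_channel = build_channel("train", "FSxLustre", "/fsx/train", "fs-0123456789abcdef0")
efs_channel = build_channel("train", "EFS", "/train", "fs-0123456789abcdef0")
```

Only the DataSource block changes between the three options, which is why switching a repeated experiment from S3 to FSx for Lustre or EFS does not require changes to the rest of the job definition.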
Amazon Kinesis
Recommended for ingesting fast-moving, real-time data. Allows you to build custom streaming data applications for specialized needs.
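EXAMPLE: A minimal sketch of preparing one real-time event for a Kinesis data stream. The helper build_put_record_kwargs, the stream name, and the event fields are hypothetical. The resulting dict matches the StreamName, Data, and PartitionKey parameters of the Kinesis PutRecord API, and the actual send is left as a comment because it needs AWS credentials and an existing stream.

```python
import json


def build_put_record_kwargs(stream_name, event, partition_key_field):
    """Build keyword arguments for a Kinesis PutRecord call from one event dict."""
    return {
        "StreamName": stream_name,
        # Kinesis stores the payload as opaque bytes; JSON is one common choice.
        "Data": json.dumps(event).encode("utf-8"),
        # Records with the same partition key are routed to the same shard.
        "PartitionKey": str(event[partition_key_field]),
    }


# Hypothetical stream name and event for illustration only.
kwargs = build_put_record_kwargs(
    "example-clickstream",
    {"user_id": 42, "action": "view", "item": "sku-123"},
    "user_id",
)

# With credentials and an existing stream, this could be sent with:
#   import boto3
#   boto3.client("kinesis").put_record(**kwargs)
```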
Amazon Elastic MapReduce (EMR)
Provides a managed framework that can process massive quantities of data.