ML Fundamentals Flashcards

1
Q

Which AWS service allows people to store objects (files) in “buckets”
(directories)?

A

Amazon S3

2
Q

What is this full path called: <my_bucket>/my_folder1/another_folder/my_file.txt

A

S3 object key (the full path to the object)

3
Q
  • Pattern for speeding up range queries (ex: AWS Athena)
  • By Date: s3://bucket/my-dataset/year/month/day/hour/data_00.csv
  • By Product: s3://bucket/my-data-set/product-id/data_32.csv
A

Amazon S3 Data Partitioning
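
The date-based layout on the card above can be sketched as a small key builder. This is only an illustration: the helper name `partitioned_key` and the sample file name are assumptions, not part of any AWS API.

```python
from datetime import datetime

def partitioned_key(dataset: str, ts: datetime, filename: str) -> str:
    """Build a key laid out as dataset/year/month/day/hour/filename."""
    # strftime-style format codes keep month, day and hour zero-padded,
    # so keys sort in chronological order.
    return f"{dataset}/{ts:%Y/%m/%d/%H}/{filename}"

key = partitioned_key("my-dataset", datetime(2024, 3, 7, 14), "data_00.csv")
print(f"s3://bucket/{key}")
```

A query restricted to one day then only needs to read objects under that day's prefix instead of scanning the whole dataset.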

4
Q

Durability or availability:
* If you store 10,000,000 objects with Amazon S3, you can on average
expect to incur a loss of a single object once every 10,000 years
* Same for all storage classes

A

Durability
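
The "one object every 10,000 years" figure follows from the eleven-nines durability (99.999999999%). A worked check, assuming durability is read as a per-object, per-year survival probability:

```python
from fractions import Fraction

# Assumption: 99.999999999% durability means each object has a
# 1-in-10^11 chance of being lost in a given year.
ANNUAL_LOSS_RATE = Fraction(1, 10**11)
OBJECTS_STORED = 10_000_000

expected_losses_per_year = OBJECTS_STORED * ANNUAL_LOSS_RATE
years_per_lost_object = 1 / expected_losses_per_year

print(float(expected_losses_per_year))  # expected objects lost per year
print(years_per_lost_object)            # average years between single losses
```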

5
Q

Durability or availability:
* Measures how readily available a service is
* Varies depending on storage class

A

Availability

6
Q

What S3 storage class is the below:
* 99.99% Availability
* Used for frequently accessed data
* Low latency and high throughput
* Sustain 2 concurrent facility failures
* Use Cases: Big Data analytics, mobile & gaming applications,
content distribution…

A

S3 Standard – General Purpose

7
Q

What S3 Storage class:
* For data that is less frequently accessed, but requires rapid access
when needed
* Lower cost than S3 Standard
* 99.9% Availability
* Use cases: Disaster Recovery, backups

A
  • Amazon S3 Standard-Infrequent Access (S3 Standard-IA)
8
Q

What S3 Storage class:
* For data that is less frequently accessed, but requires rapid access
when needed
* Lower cost than S3 Standard
* High durability (99.999999999%) in a single AZ; data lost when AZ is destroyed
* 99.5% Availability
* Use Cases: Storing secondary backup copies of on-premises data, or data you
can recreate

A
  • Amazon S3 One Zone-Infrequent Access (S3 One Zone-IA)
9
Q

What S3 Storage class:
Small monthly monitoring and auto-tiering fee
* Moves objects automatically between Access Tiers based on usage
* There are no retrieval charges in S3 Intelligent-Tiering

A

S3 Intelligent-Tiering

10
Q

Name the S3 Intelligent-Tiering access tiers below:
* __________: default tier
* Infrequent Access tier (automatic): objects not accessed for 30 days
* ______: objects not accessed for 90 days
* _________: configurable from 90 days to 700+ days
* ________: configurable from 180 days to 700+ days

A

* Frequent Access tier (automatic): default tier
* Infrequent Access tier (automatic): objects not accessed for 30 days
* Archive Instant Access tier (automatic): objects not accessed for 90 days
* Archive Access tier (optional): configurable from 90 days to 700+ days
* Deep Archive Access tier (optional): configurable from 180 days to 700+ days

11
Q
  • Help you decide when to transition objects
    to the right storage class
  • Recommendations for Standard and
    Standard IA
  • Does NOT work for One-Zone IA or Glacier
  • Report is updated daily
  • 24 to 48 hours to start seeing data analysis
  • Good first step to put together Lifecycle
    Rules (or improve them)!
A

Amazon S3 Analytics

12
Q

bucket wide rules from the S3 console - allows cross account

A

S3 Bucket policies

13
Q

_____ is a managed alternative to Apache Kafka
* Great for application logs, metrics, IoT, clickstreams
* Great for “real-time” big data
* Great for streaming processing frameworks (Spark, NiFi, etc…)
* Data is automatically replicated synchronously to 3 AZ

A

Amazon Kinesis

14
Q

__________ low latency streaming ingest at scale

A

Kinesis Streams

15
Q

________ perform real-time analytics on streams using SQL

A

Kinesis Analytics

16
Q

_________ load streams into S3, Redshift, ElasticSearch & Splunk

A

Kinesis Firehose

17
Q

______ meant for streaming video in real-time

A

Kinesis Video Streams

18
Q

Kinesis Streams are divided into ordered ______

A

Shards

19
Q

What are the two capacity modes for Kinesis Data streams?

A

Provisioned and On-Demand modes

20
Q

What Kinesis data stream capacity mode is below:
* You choose the number of shards provisioned, scale manually or using API
* Each shard gets 1MB/s in (or 1000 records per second)
* Each shard gets 2MB/s out (classic or enhanced fan-out consumer)
* You pay per shard provisioned per hour

A

Provisioned
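
The per-shard limits on the card above translate into a simple sizing rule. A rough sketch, assuming a hypothetical helper named `shards_needed` and that the workload figures are sustained peak rates:

```python
import math

# Per-shard limits quoted on the card for provisioned mode.
MB_IN_PER_SHARD = 1.0
RECORDS_IN_PER_SHARD = 1000
MB_OUT_PER_SHARD = 2.0

def shards_needed(mb_in: float, records_in: float, mb_out: float) -> int:
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(mb_in / MB_IN_PER_SHARD),
        math.ceil(records_in / RECORDS_IN_PER_SHARD),
        math.ceil(mb_out / MB_OUT_PER_SHARD),
    )

print(shards_needed(mb_in=5, records_in=3000, mb_out=8))
print(shards_needed(mb_in=0.5, records_in=4500, mb_out=1))
```

Whichever limit is hit first drives the shard count, so a stream of many small records can need more shards than its byte volume alone suggests.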

21
Q

What Kinesis data stream capacity mode is below:
* No need to provision or manage the capacity
* Default capacity provisioned (4 MB/s in or 4000 records per second)
* Scales automatically based on observed throughput peak during the last 30
days
* Pay per stream per hour & data in/out per GB

A

On-demand mode

22
Q

What Kinesis service is this:
* Fully Managed Service, no administration
* Near Real Time (60 seconds latency minimum for non full batches)
* Data Ingestion into Redshift / Amazon S3 / ElasticSearch / Splunk
* Automatic scaling
* Supports many data formats
* Data Conversions from CSV / JSON to Parquet / ORC (only for S3)
* Data Transformation through AWS Lambda (ex: CSV => JSON)
* Supports compression when target is Amazon S3 (GZIP, ZIP, and
SNAPPY)

A

Kinesis Data Firehose

23
Q

What's the difference between Kinesis Data Streams and Kinesis Data Firehose?

A

Streams:
* Going to write custom code (producer / consumer)
* Real time (~200 ms latency for classic, ~70 ms latency for enhanced fan-out)
* Automatic scaling with On-demand Mode
* Data Storage for 1 to 365 days, replay capability, multi consumers

Firehose:
* Fully managed, send to S3, Splunk, Redshift, ElasticSearch
* Serverless data transformations with Lambda
* Near real time (lowest buffer time is 1 minute)
* Automated Scaling
* No data storage

24
Q

What Kinesis tool is this:

Use cases
* Streaming ETL: select columns, make simple transformations, on streaming
data
* Continuous metric generation: live leaderboard for a mobile game
* Responsive analytics: look for certain criteria and build alerting (filtering)
* Features
* Pay only for resources consumed (but it’s not cheap)
* Serverless; scales automatically
* Use IAM permissions to access streaming source and destination(s)
* SQL or Flink to write the computation
* Schema discovery
* Lambda can be used for pre-processing

A

Kinesis Data Analytics

25
Q

For Kinesis Analytics, you Pay only for ______ (but it’s not cheap)

A

resources consumed

26
Q

Is Amazon Kinesis serverless?

A

Yes

27
Q

What amazon data product has the below characteristics:

  • Producers:
  • security camera, body-worn camera,
    AWS DeepLens, smartphone
    camera, audio feeds, images,
    RADAR data, RTSP camera.
  • One producer per video stream
  • Video playback capability
  • Consumers
  • build your own (MXNet, Tensorflow)
  • AWS SageMaker
  • Amazon Rekognition Video
  • Keep data for 1 hour to 10 years
A

Kinesis Video Streams

28
Q

__________ create real-time machine learning
applications

A

Kinesis Data Streams

29
Q

_____ ingest massive data near-real time

A

Kinesis Data Firehose

30
Q

___________ real-time ETL / ML algorithms on
streams

A

Kinesis Data Analytics

31
Q

___________ real-time video stream to create ML
applications

A

Kinesis Video Streams

32
Q
  • Metadata repository for all
    your tables
  • Automated Schema
    Inference
  • Schemas are versioned
  • Integrates with Athena or
    Redshift Spectrum
    (schema & data discovery)
A

Glue data catalog

33
Q

____ go through your data to infer schemas and partitions
* Works with JSON, Parquet, CSV, relational store

A

Glue crawlers

34
Q

Transform data, Clean Data, Enrich Data (before doing analysis)
* Generate ETL code in Python or Scala, you can modify the code
* Can provide your own Spark or PySpark scripts
* Target can be S3, JDBC (RDS, Redshift), or in Glue Data Catalog
* Fully managed, cost effective, pay only for the resources consumed
* Jobs are run on a serverless Spark platform

A

Glue ETL

35
Q

What type of data store is this:

Data Warehousing, SQL
analytics (OLAP - Online
analytical processing)

A

Redshift

36
Q

What type of data store is this:

Relational Store, SQL (OLTP -
Online Transaction Processing)
* Must provision servers in
advance

A
  • RDS, Aurora
37
Q

What type of data store is this:

NoSQL data store, serverless,
provision read/write capacity
* Useful to store a machine
learning model served by your
application

A
  • DynamoDB
38
Q

What type of data store is this:

Object storage
* Serverless, infinite storage
* Integration with most AWS
Services

A

S3

39
Q

What type of data store is this:

  • Indexing of data
  • Search amongst data points
  • Clickstream Analytics
A

OpenSearch (previously
ElasticSearch)

40
Q

What type of data store is this:

  • Caching mechanism
  • Not really used for Machine
    Learning
A
  • ElastiCache
41
Q

Which AWS data service do the features below identify?

Destinations include S3, RDS,
DynamoDB, Redshift and EMR
* Manages task dependencies
* Retries and notifies on failures
* Data sources may be on-premises
* Highly available

A

AWS Data Pipeline

42
Q

What are the differences between AWS Data Pipeline and AWS Glue?

A

Glue:
* Glue ETL - Run Apache Spark code, Scala or Python based, focus on the
ETL
* Glue ETL - Do not worry about configuring or managing the resources
* Data Catalog to make the data available to Athena or Redshift Spectrum
* Data Pipeline:
* Orchestration service
* More control over the environment, compute resources that run code, & code
* Allows access to EC2 or EMR instances (creates resources in your own
account)

43
Q

What AWS data service is below:

  • Run batch jobs as Docker images
  • Dynamic provisioning of the instances (EC2 & Spot Instances)
  • Optimal quantity and type based on volume and requirements
  • No need to manage clusters, fully serverless
  • You just pay for the underlying EC2 instances
A

AWS Batch

44
Q

What is the difference between AWS Batch and Glue?

A
  • Glue:
  • Glue ETL - Run Apache Spark code, Scala or Python based, focus on
    the ETL
  • Glue ETL - Do not worry about configuring or managing the resources
  • Data Catalog to make the data available to Athena or Redshift
    Spectrum
  • Batch:
  • For any computing job regardless of the job (must provide Docker
    image)
  • Resources are created in your account, managed by Batch
  • For any non-ETL related work, Batch is probably better
45
Q

What AWS data service has the below features:

  • Quickly and securely migrate databases
    to AWS, resilient, self healing
  • The source database remains available
    during the migration
  • Supports:
  • Homogeneous migrations: ex Oracle to
    Oracle
  • Heterogeneous migrations: ex Microsoft SQL
    Server to Aurora
  • Continuous Data Replication using CDC
  • You must create an EC2 instance to
    perform the replication tasks
A

AWS Database Migration Service - DMS

46
Q

What is the difference between AWS DMS and Glue?

A

Glue:
* Glue ETL - Run Apache Spark code, Scala or Python based, focus on
the ETL
* Glue ETL - Do not worry about configuring or managing the resources
* Data Catalog to make the data available to Athena or Redshift
Spectrum
* AWS DMS:
* Continuous Data Replication
* No data transformation
* Once the data is in AWS, you can use Glue to transform it

47
Q

What AWS Data service has the below features:

For data migrations from on-premises to AWS storage services
* A DataSync Agent is deployed as a VM and connects to your
internal storage
* NFS, SMB, HDFS
* Encryption and data validation

A

AWS DataSync

48
Q
  • An Internet of Things (IOT) thing
  • Standard messaging protocol
  • Think of it as how lots of sensor
    data might get transferred to your
    machine learning model
  • The AWS IoT Device SDK can
    connect via ____
A

MQTT

49
Q

What are the three major types of data?

A
  • Numerical
  • Categorical
  • Ordinal
50
Q

______ Represents some sort of quantitative
measurement
* Heights of people, page load times, stock
prices, etc.

A

Numerical

51
Q

_______ is Integer based; often counts of some event.
* How many purchases did a customer make in a
year?
* How many times did I flip “heads”?

A

Discrete data

52
Q

__________
* Has an infinite number of possible values
* How much time did it take for a user to check
out?
* How much rain fell on a given day?

A
  • Continuous Data
53
Q

___________ is Qualitative data that has no
inherent mathematical meaning
* Gender, Yes/no (binary data),
Race, State of Residence, Product
Category, Political Party, etc.

A

Categorical data

54
Q

A mixture of numerical and
categorical
* Categorical data that has
mathematical meaning
* Example: movie ratings on a 1-5
scale.
* Ratings must be 1, 2, 3, 4, or 5
* But these values have mathematical
meaning; 1 means it’s a worse movie
than a 2.

A

Ordinal data

55
Q

What AWS service has the below characteristics:

  • Interactive query service for S3 (SQL)
  • No need to load data, it stays in S3
  • Presto under the hood
  • Serverless!
  • Supports many data formats
  • CSV (human readable)
  • JSON (human readable)
  • ORC (columnar, splittable)
  • Parquet (columnar, splittable)
  • Avro (splittable)
  • Unstructured, semi-structured, or structured
A

Amazon Athena

56
Q

Which AWS service fits the use cases below?

  • Ad-hoc queries of web logs
  • Querying staging data before
    loading to Redshift
  • Analyze CloudTrail / CloudFront /
    VPC / ELB etc logs in S3
  • Integration with Jupyter, Zeppelin,
    RStudio notebooks
  • Integration with QuickSight
  • Integration via ODBC / JDBC with
    other visualization tools
A

Amazon Athena

57
Q

What AWS service has the below cost model?

Pay-as-you-go
* $5 per TB scanned
* Successful or cancelled queries
count, failed queries do not.
* No charge for DDL
(CREATE/ALTER/DROP etc.)
* Save LOTS of money by using
columnar formats
* ORC, Parquet
* Save 30-90%, and get better
performance

A

Athena

58
Q

What AWS Service has the below characteristics:

  • Fast, easy, cloud-powered business
    analytics service
  • Allows all employees in an organization
    to:
  • Build visualizations
  • Perform ad-hoc analysis
  • Quickly get business insights from data
  • Anytime, on any device (browsers, mobile)
  • Serverless
A

QuickSight

59
Q

What is the in-memory engine used by QuickSight?

A

SPICE

60
Q

What QuickSight feature is below:

Machine learning-powered
* Answers business questions with Natural
Language Processing
* “What are the top-selling items in Florida?”
* Offered as an add-on for given regions
* Personal training on how to use it is
required
* Must set up topics associated with
datasets
* Datasets and their fields must be NLP-friendly
* How to handle dates must be defined

A

QuickSight Q

61
Q

What QuickSight feature is below:

Reports designed to
be printed
* May span many pages
* Can be based on
existing QuickSight
dashboards
* New in Nov 2022

A

Paginated Reports

62
Q

What AWS Service is this:

  • Managed Hadoop framework on EC2
    instances
  • Includes Spark, HBase, Presto, Flink,
    Hive & more
  • EMR Notebooks
  • Several integration points with AWS
A

Amazon EMR (Elastic Map Reduce)

63
Q

What is this called:

Applying your knowledge of the data – and the model you’re
using - to create better features to train your model with.
* Which features should I use?
* Do I need to transform these features in some way?
* How do I handle missing data?
* Should I create new features from the existing ones?

A

Feature engineering

64
Q

What is the Curse of Dimensionality?

A

Too many features can be a problem –
leads to sparse data
* Every feature is a new dimension
* Much of feature engineering is selecting
the features most relevant to the
problem at hand
* This often is where domain knowledge
comes into play

65
Q

What AI data cleansing concept is below:

Replace missing values with the mean value
from the rest of the column (columns, not rows!
A column represents a single feature; it only
makes sense to take the mean from other
samples of the same feature.)
* Fast & easy, won’t affect mean or sample size
of overall data set
* Median may be a better choice than mean
when outliers are present

A

Mean replacement
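
A minimal sketch of column-wise imputation as described on the card above, assuming missing values are represented as `None`. The helper name `impute` is illustrative; passing `median` as the strategy shows why it can be the safer choice when outliers are present.

```python
from statistics import mean, median

def impute(column, strategy=mean):
    """Replace None entries with the mean (or median) of the observed values."""
    observed = [v for v in column if v is not None]
    fill = strategy(observed)
    return [fill if v is None else v for v in column]

ages = [25, None, 35, 30, None]
print(impute(ages))

# With an outlier (1000), the median gives a more typical fill value.
print(impute([1, 2, None, 1000], strategy=median))
```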

66
Q

What are the cons of mean replacement?

A

Only works on column level, misses correlations
between features
* Can’t use on categorical features (imputing with
most frequent value can work in this case, though)
* Not very accurate

67
Q

What solution to missing data is this :

If not many rows contain missing data…
* …and dropping those rows doesn’t bias your
data…
* …and you don’t have a lot of time…
* …maybe it’s a reasonable thing to do.
* But, it’s never going to be the right
answer for the “best” approach.

A

Dropping data

68
Q

What are the three ways to solve missing data with machine learning techniques?

A

* KNN: Find K “nearest” (most similar) rows and average their values
* Assumes numerical data, not categorical
* There are ways to handle categorical data (Hamming distance), but
categorical data is probably better served by…
* Deep Learning
* Build a machine learning model to impute data for your machine learning
model!
* Works well for categorical data. Really well. But it’s complicated.
* Regression
* Find linear or non-linear relationships between the missing feature and other
features
* Most advanced technique: MICE (Multiple Imputation by Chained Equations)

69
Q

What kind of data is this:

Large discrepancy between
“positive” and “negative”
cases
* e.g., fraud detection. Fraud is
rare, and most rows will be not-fraud
* Don’t let the terminology
confuse you; “positive” doesn’t
mean “good”
* It means the thing you’re testing
for is what happened.
* If your machine learning model
is made to detect fraud, then
fraud is the positive case.
* Mainly a problem with neural
networks

A

unbalanced data

70
Q

To improve AI Data quality, what is the term below:

Artificially generate new samples of the minority class using
nearest neighbors
* Run K-nearest-neighbors of each sample of the minority class
* Create a new sample from the KNN result (mean of the neighbors)
* Both generates new samples and undersamples majority class
* Generally better than just oversampling

A

SMOTE (Synthetic Minority Over-sampling TEchnique)
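
A simplified sketch of the idea as the card words it: find the K nearest minority-class neighbors of a sample and create a new point from their mean. This is an illustration only; library implementations typically interpolate between the sample and a randomly chosen neighbor instead, and the helper names here are hypothetical.

```python
import math

def nearest_neighbors(point, others, k):
    """Return the k points in `others` closest to `point` (Euclidean distance)."""
    return sorted(others, key=lambda o: math.dist(point, o))[:k]

def synthetic_sample(point, minority, k=2):
    """Create a new minority sample as the mean of the point's k nearest neighbors."""
    others = [p for p in minority if p is not point]
    neighbors = nearest_neighbors(point, others, k)
    return tuple(sum(coords) / len(neighbors) for coords in zip(*neighbors))

minority = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (8.0, 8.0)]
print(synthetic_sample(minority[0], minority, k=2))
```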

71
Q

If you have too many false positives, one
way to fix that is to simply increase that
_________

A

threshold
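
A small illustration of the trade-off, using made-up scores and labels: raising the decision threshold removes false positives, but it can also cost true positives.

```python
def confusion_counts(scores, labels, threshold):
    """Count true positives and false positives at a given decision threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    return tp, fp

# Hypothetical model scores and ground-truth labels (1 = positive case).
scores = [0.95, 0.85, 0.70, 0.60, 0.55, 0.30]
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.5, 0.75):
    tp, fp = confusion_counts(scores, labels, threshold)
    print(f"threshold={threshold}: true positives={tp}, false positives={fp}")
```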

72
Q

_____ is simply the average of the squared
differences from the mean

A

Variance

73
Q

_____ is just the square root
of the variance.

A

Standard Deviation 𝜎
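
The two definitions above can be checked by hand on a small data set. This sketch computes the population variance (dividing by N); a sample variance would divide by N - 1 instead.

```python
import math

def variance(values):
    """Average of the squared differences from the mean (population variance)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def std_dev(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))

data = [1, 4, 5, 4, 8]
print(variance(data), std_dev(data))
```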

74
Q

Bucket observations together based
on ranges of values.
* Example: estimated ages of people
* Put all 20-somethings in one
classification, 30-somethings in another,
etc

A

Binning
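
The age example on the card can be sketched with integer division. The helper name `decade_bin` and the label format are illustrative choices.

```python
def decade_bin(age: int) -> str:
    """Map an age to its decade bucket, e.g. 27 -> '20-29'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"

ages = [23, 27, 31, 38, 45]
print([decade_bin(a) for a in ages])
```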

75
Q

Applying some function to a feature to make it
better suited for training

A

Transforming

76
Q

Transforming data into some new
representation required by the
model

A

encoding

77
Q

Some models prefer feature data to be
normally distributed around 0 (most
neural nets)
* Most models require feature data to at
least be scaled to comparable values
* Otherwise features with larger magnitudes
will have more weight than they should
* Example: modeling age and income as
features – incomes will be much higher
values than ages

A

Scaling/normalization
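
Two common ways to bring features such as age and income onto comparable scales, shown as a plain-Python sketch. Both helpers assume the column is not constant (otherwise the divisor is zero).

```python
from statistics import mean, pstdev

def standardize(values):
    """Center on 0 and scale to unit standard deviation (z-scores)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Rescale linearly so the smallest value maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

ages = [20, 40, 60]
incomes = [30_000, 60_000, 90_000]
print(min_max(ages), min_max(incomes))
print(standardize(ages), standardize(incomes))
```

After either transformation the two columns produce identical values here, so neither feature dominates simply because of its units.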

78
Q

Many algorithms benefit from
_____ their training data
* Otherwise they may learn from
residual signals in the training
data resulting from the order in
which they were collected

A

shuffling

79
Q

What is Ground Truth?

A
  • Ground Truth manages humans who
    will label your data for training
    purposes
  • Ground Truth creates its own model as images are labeled by
    people
  • As this model learns, only images the model isn’t sure about are
    sent to human labelers
80
Q

Turnkey solution
* “Our team of AWS Experts”
manages the workflow and team of labelers
* You fill out an intake form
* They contact you and discuss
pricing

A

Ground Truth Plus

81
Q
  • AWS service for image recognition
  • Automatically classify images
A

Rekognition

82
Q
  • AWS service for text analysis and topic modeling
  • Automatically classify text by topics, sentiment
A

Comprehend

83
Q
  • Important data for search – figures out what terms are most relevant for a document
A

TF-IDF
* Stands for Term Frequency and Inverse Document Frequency

84
Q
  • just measures how often a word occurs in a
    document
  • A word that occurs frequently is probably important to that document’s
    meaning
A

Term Frequency

85
Q

_____ is how often a word occurs in an entire
set of documents, e.g., all of Wikipedia or every web page
* This tells us about common words that just appear everywhere no
matter what the topic, like “a”, “the”, “and”, etc.

A

Document Frequency
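
Combining the term-frequency and document-frequency cards above gives a relevance score. The sketch below uses one common variant, TF multiplied by log(N / document frequency); other weightings exist, and the function names are illustrative.

```python
import math

def term_frequency(term, doc):
    """Share of the words in one document that are `term`."""
    return doc.count(term) / len(doc)

def inverse_document_frequency(term, docs):
    """log(N / number of documents containing `term`); 0.0 if it never appears."""
    containing = sum(1 for d in docs if term in d)
    return math.log(len(docs) / containing) if containing else 0.0

def tf_idf(term, doc, docs):
    return term_frequency(term, doc) * inverse_document_frequency(term, docs)

docs = [
    "the cat sat".split(),
    "the dog barked".split(),
    "the cat purred".split(),
]
print(tf_idf("the", docs[0], docs))  # word found in every document
print(tf_idf("sat", docs[0], docs))  # word found in only one document
```

Words that appear everywhere score zero, while words concentrated in one document score highest.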

86
Q

Can you explain bi-grams and tri-grams?

A

An extension of TF-IDF is to not only compute relevancy for
individual words (terms) but also for bi-grams or, more
generally, n-grams.
* “I love certification exams”
* Unigrams: “I”, “love”, “certification”, “exams”
* Bi-grams: “I love”, “love certification”, “certification exams”
* Tri-grams: “I love certification”, “love certification exams”
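
The example above can be reproduced with a sliding window over the words. A minimal sketch, assuming simple whitespace tokenization:

```python
def ngrams(text: str, n: int) -> list[str]:
    """Return every run of n consecutive words in the text."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

sentence = "I love certification exams"
for n in (1, 2, 3):
    print(n, ngrams(sentence, n))
```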

87
Q

What are the three types of neural networks?

A
  • Feedforward Neural Network
  • Convolutional Neural Networks
    (CNN)
  • Recurrent Neural Networks
    (RNNs)
88
Q

What kind of activation function is this:

It doesn’t really do
anything
* Can’t do meaningful backpropagation (its derivative is a constant)

A

Linear

89
Q

What kind of activation function is this:

  • It’s on or off
  • Can’t handle multiple
    classification – it’s binary
    after all
  • Vertical slopes don’t
    work well with calculus!
A

Binary step function

90
Q

What kind of activation function is this:

  • These can create complex mappings between inputs and
    outputs
  • Allow backpropagation (because they have a useful derivative)
  • Allow for multiple layers (linear functions degenerate to a single
    layer)
A

Non-linear activation functions

91
Q

What kind of activation function is this:

  • Nice & smooth
  • Scales everything from 0-1
    (Sigmoid / Logistic) or -1 to 1
    (tanh / hyperbolic tangent)
  • But: changes slowly for high
    or low values
  • The “Vanishing Gradient”
    problem
  • Computationally expensive
  • Tanh generally preferred over
    sigmoid
A

Sigmoid / Logistic / TanH

92
Q

What kind of activation function is this:

Now we’re talking
* Very popular choice
* Easy & fast to
compute
* But, when inputs are
zero or negative, we
have a linear function
and all of its
problems

A

Rectified Linear Unit (ReLU)

93
Q

What kind of activation function is this:

Solves “dying ReLU” by
introducing a negative
slope below 0 (usually not
as steep as this)

A

Leaky ReLU

94
Q

What kind of activation function is this:

  • ReLU, but the slope in the
    negative part is learned
    via backpropagation
  • Complicated and YMMV
A

Parametric ReLU (PReLU)

95
Q

What kind of activation function is this:

  • From Google, performs really well
  • But it’s from Google, not Amazon…
  • Mostly a benefit with very deep
    networks (40+ layers)
A

Swish

96
Q

What kind of activation function is this:

  • Outputs the max of the inputs
  • Technically ReLU is a special case
    of maxout
  • But doubles parameters that need to
    be trained, not often practical.
A

Maxout

97
Q
  • Used on the final output layer of a
    multi-class classification problem
  • Basically converts outputs to
    probabilities of each classification
  • Can’t produce more than one label
    for something (sigmoid can)
A

Softmax
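
Several of the activation functions from the preceding cards, written out as plain scalar functions for reference. The 0.01 slope for Leaky ReLU is an assumed, commonly used default, not a value from the cards.

```python
import math

def sigmoid(x):
    """Squashes any input into the range 0 to 1."""
    return 1 / (1 + math.exp(-x))

def relu(x):
    """Passes positive inputs through and outputs 0 otherwise."""
    return max(0.0, x)

def leaky_relu(x, slope=0.01):
    """Like ReLU, but keeps a small negative slope below 0."""
    return x if x > 0 else slope * x

def softmax(xs):
    """Convert raw outputs into probabilities that sum to 1."""
    m = max(xs)  # subtracting the max avoids overflow in exp
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0), math.tanh(0.0), relu(-3.0), leaky_relu(-2.0))
print(softmax([1.0, 2.0, 3.0]))
```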

98
Q

What are convolutional neural networks used for?

A

When you have data that doesn’t
neatly align into columns
* Images that you want to find features
within
* Machine translation
* Sentence classification
* Sentiment analysis
* They can find features that aren’t in a
specific spot
* Like a stop sign in a picture
* Or words within a sentence
* They are “feature-location invariant”

99
Q

_________

They can find features that aren’t in a
specific spot
* Like a stop sign in a picture
* Or words within a sentence

A

convolutional neural network

100
Q

True or false:

CNNs are very resource-intensive (CPU, GPU,
and RAM)

A

true

101
Q

What are recurrent neural networks used for?

A

Time-series data
* When you want to predict future behavior based
on past behavior
* Web logs, sensor logs, stock trades
* Where to drive your self-driving car based on
past trajectories
* Data that consists of sequences of arbitrary
length
* Machine translation
* Image captions
* Machine-generated music

102
Q

What neural network should you use:

  • Time-series data
  • When you want to predict future behavior based
    on past behavior
  • Web logs, sensor logs, stock trades
  • Where to drive your self-driving car based on past trajectories
A

recurrent neural network

103
Q

________ deep learning architectures
are what’s hot
* Adopts mechanism of “self-attention”
* Weighs significance of each part of the input data
* Processes sequential data (like words, like an RNN),
but processes entire input all at once.
* The attention mechanism provides context, so no
need to process one word at a time.
* BERT, RoBERTa, T5, GPT-2 etc., DistilBERT
* DistilBERT: uses knowledge distillation to reduce
model size by 40%

A

Transformer

104
Q

What is it called when the below things are used in AI?

  • NLP models (and others) are too big
    and complex to build from scratch
    and re-train every time
  • The latest may have hundreds of billions
    of parameters!
  • Model zoos such as Hugging Face
    offer pre-trained models to start from
  • Integrated with SageMaker via Hugging
    Face Deep Learning Containers
  • You can fine-tune these models for
    your own use cases
A

transfer learning

105
Q

Neural networks are trained
by ________ (or
similar means)

A

gradient descent

106
Q
  • Too high a learning rate
    means you might _________
A

overshoot
the optimal solution!

107
Q
  • Too small a learning rate will
    _____
A

take too long to find the
optimal solution
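
The effect of the learning rate can be seen on a toy problem. This sketch runs gradient descent on f(x) = x², whose minimum is at 0; the specific rates and step count are arbitrary illustrative choices.

```python
def gradient_descent(lr, steps=50, x=5.0):
    """Minimize f(x) = x**2 (gradient 2*x) starting from x."""
    for _ in range(steps):
        x -= lr * 2 * x
    return x

print(gradient_descent(lr=0.1))    # converges close to 0
print(gradient_descent(lr=1.1))    # overshoots further on every step and diverges
print(gradient_descent(lr=0.001))  # still far from 0 after 50 steps
```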

108
Q

Learning rate is an example
of a ___________

A

hyperparameter

109
Q

Smaller batch sizes can work their way out of _________

A

“local minima” more
easily

110
Q
  • Batch sizes that are too large can ________
A

end up getting stuck in the wrong solution

111
Q
  • Regularization techniques are
    intended to prevent ________.
A

overfitting

112
Q

true or false:

Overfitted models have learned patterns
in the training data that don’t generalize to
the real world

A

true

113
Q
  • Models that are good at making predictions on the data they were trained on, but not on new data they haven’t seen
    before
A

overfitting

114
Q

What is the vanishing gradient problem?

A

When the slope of the learning
curve approaches zero, things
can get stuck

115
Q

_ regularization: sum of the absolute values of the weights
* Performs feature selection – entire features go to 0
* Computationally inefficient
* Sparse output

A

L1 regularization

116
Q

__ regularization: sum of square of weights
* All features remain considered, just weighted
* Computationally efficient
* Dense output

A

L2 regularization
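
A short sketch of the two penalty terms, assuming a regularization strength `lam` (a hypothetical parameter name) that scales the penalty added to the loss.

```python
def l1_penalty(weights, lam=1.0):
    # L1: sum of the absolute values of the weights.
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam=1.0):
    # L2: sum of the squares of the weights.
    return lam * sum(w * w for w in weights)

weights = [0.5, -2.0, 0.0, 3.0]
l1 = l1_penalty(weights)  # 0.5 + 2.0 + 0.0 + 3.0
l2 = l2_penalty(weights)  # 0.25 + 4.0 + 0.0 + 9.0
```

L2 punishes large weights much harder than small ones, while L1 applies the same pressure to every nonzero weight, which is what drives some of them all the way to 0.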

117
Q

What matrix does the below show?

  • A test for a rare disease can be
    99.9% accurate by just guessing
    “no” all the time
  • We need to understand true
    positives and true negatives, as well
    as false positives and false
    negatives.
A

the confusion matrix

118
Q

____ = AKA Sensitivity, True Positive rate, Completeness
* Percent of positives rightly predicted
* Good choice of metric when you care a lot
about false negatives

A

recall

119
Q

What is the formula for recall?

A

𝑇𝑅𝑈𝐸 𝑃𝑂𝑆𝐼𝑇𝐼𝑉𝐸𝑆/
(𝑇𝑅𝑈𝐸 𝑃𝑂𝑆𝐼𝑇𝐼𝑉𝐸𝑆+𝐹𝐴𝐿𝑆𝐸 𝑁𝐸𝐺𝐴𝑇𝐼𝑉𝐸𝑆)

120
Q

____ = AKA Correct Positives
* Percent of relevant results
* Good choice of metric when you care a lot
about false positives
* e.g., medical screening, drug testing

A

precision

121
Q

What is the formula for precision?

A

𝑇𝑅𝑈𝐸 𝑃𝑂𝑆𝐼𝑇𝐼𝑉𝐸𝑆 /
(𝑇𝑅𝑈𝐸 𝑃𝑂𝑆𝐼𝑇𝐼𝑉𝐸𝑆+𝐹𝐴𝐿𝑆𝐸 𝑃𝑂𝑆𝐼𝑇𝐼𝑉𝐸𝑆)
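
A quick sketch of the recall and precision formulas, plus the rare-disease example of why accuracy alone misleads. The counts are made up for illustration, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    # True positives / (true positives + false negatives)
    return tp / (tp + fn) if (tp + fn) else 0.0

def precision(tp, fp):
    # True positives / (true positives + false positives)
    return tp / (tp + fp) if (tp + fp) else 0.0

# 1,000 people, 1 of them sick, and a "test" that always says "no":
always_no_accuracy = accuracy(tp=0, tn=999, fp=0, fn=1)  # looks excellent
always_no_recall = recall(tp=0, fn=1)                    # finds no sick patients
```

The always-"no" test scores 99.9% accuracy while its recall is 0, which is the case the confusion matrix is meant to expose.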

122
Q
  • Plot of true positive rate (recall) vs. false
    positive rate at various threshold settings.
  • Points above the diagonal represent good classification (better than random)
  • Ideal curve would just be a point in the upper-left corner
  • The more it’s “bent” toward the upper-left, the better
A

ROC Curve

  • Receiver Operating Characteristic Curve
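
A sketch of how one point on a ROC curve is computed for a given threshold. The scores and labels are made-up illustration data; sweeping the threshold from high to low traces the full curve.

```python
def roc_point(scores, labels, threshold):
    """Return (false positive rate, true positive rate) at one threshold.

    An example is predicted positive when its score is >= threshold.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    return fp / negatives, tp / positives

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
curve = [roc_point(scores, labels, t) for t in (1.0, 0.85, 0.5, 0.0)]
```

A threshold above every score gives the (0, 0) corner and a threshold of 0 gives (1, 1); the points in between show how far the curve bends toward the upper left.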
123
Q

Equal to probability that a classifier will rank
a randomly chosen positive instance higher
than a randomly chosen negative one
* ROC AUC of 0.5 is a useless classifier, 1.0
is perfect
* Commonly used metric for comparing
classifiers

A
  • Area Under the Curve (AUC)
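
The ranking interpretation can be checked directly: count the fraction of (positive, negative) pairs where the positive gets the higher score. This is a brute-force sketch on made-up data, with ties counted as half a win (a common convention, assumed here).

```python
def auc_by_ranking(scores, labels):
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # assumed tie-handling convention
    return wins / (len(positives) * len(negatives))

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
auc = auc_by_ranking(scores, labels)  # 8 of the 9 pairs are ranked correctly
```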
124
Q

Good = higher area under
curve
* Similar to ROC curve
* But better suited for information
retrieval problems
* ROC can result in very small
values if you are searching a
large number of documents for
a tiny number that are relevant

A
  • Precision / Recall curve
125
Q

__________ = Generate N new training sets by random sampling with replacement
* Each resampled model can be trained in parallel

A

bagging
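
A minimal sketch of the resampling step in bagging: each new training set is drawn from the original with replacement and has the same size. The function name and the fixed seed are illustrative assumptions.

```python
import random

def bootstrap_samples(data, n_sets, seed=0):
    """Generate n_sets training sets by sampling data with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in data] for _ in range(n_sets)]

original = [10, 20, 30, 40, 50]
resampled = bootstrap_samples(original, n_sets=3)
```

Because the sets are independent of one another, a model can be trained on each in parallel and the predictions combined afterwards.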

126
Q

_____ = * Observations are weighted
* Some will take part in new training sets more often
* Training is sequential; each classifier takes into account the
previous one’s success.

A

boosting

127
Q

What type of sagemaker built in algorithm is this:

Linear regression
* Fit a line to your training data
* Predictions based on that line
* Can handle both regression
(numeric) predictions and
classification predictions
* For classification, a linear threshold
function is used.
* Can do binary or multi-class

A

Linear learner

128
Q

For linear learner, it can handle both regression
(numeric) predictions and
_______ predictions

A

classification predictions

129
Q

Linear Learner: What training input does it expect?

A
  • RecordIO-wrapped protobuf
  • CSV
  • File or Pipe mode both supported
130
Q

Linear learner:

Preprocessing
* Training data must be ______(so all features
are weighted the same)
* Linear Learner can do this for you automatically

A

normalized

131
Q

What does sagemaker linear learner use in training?

A

Uses stochastic gradient descent

132
Q

What type of sagemaker built in algorithm is this:

Boosted group of decision trees
* New trees made to correct the errors of
previous trees
* Uses gradient descent to minimize loss as
new trees are added

A

XGBoost

133
Q

What type of training input does xgboost expect?

A

it takes CSV or libsvm input.

134
Q

With xgboost, Models are serialized/deserialized with ___

A

Pickle

135
Q

What type of sagemaker built in algorithm is this:

  • Input is a sequence of tokens,
    output is a sequence of tokens
  • Machine Translation
  • Text summarization
  • Speech to text
  • Implemented with RNNs and CNNs with attention
A

Seq2Seq

136
Q

What sagemaker built in algorithm maps to the below training inputs :

  • RecordIO-Protobuf
  • Tokens must be integers (this is unusual, since most algorithms want floating point
    data.)
  • Start with tokenized text files
  • Convert to protobuf using sample code
  • Packs into integer tensors with
    vocabulary files
  • A lot like the TF/IDF lab we did earlier.
  • Must provide training data, validation data, and vocabulary files.
A

Seq2Seq

137
Q

Seq2Seq can optimize on :

  • Accuracy
    -Vs. provided validation dataset
  • __ score
  • Compares against multiple reference translations
  • Perplexity
  • Cross-entropy
A

BLEU score

138
Q

Seq2Seq: Instance Types

Can only use ____ instance types
(P3 for example)
* Can only use a single machine for training
* But can use multiple GPUs on one machine

A

GPU instance types

139
Q

What sagemaker algorithm has the below characteristics?

  • Forecasting one-dimensional time series data
  • Uses RNNs
  • Allows you to train the same model over several related time series
  • Finds frequencies and seasonality
A

DeepAR

140
Q

What sagemaker algorithm has the below training input needs?

JSON lines format
* Gzip or Parquet
* Each record must contain:
* start: the starting time stamp
* target: the time series values
* Each record can contain:
* dynamic_feat: dynamic features (such as whether
a promotion was applied to a product in a time series of product purchases)
* cat: categorical features

A

DeepAR
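
A sketch of what one training record might look like as a JSON Lines row. The values are made up; the keys are written in lowercase as the DeepAR JSON format expects, and each `dynamic_feat` series is assumed to have the same length as `target`.

```python
import json

record = {
    "start": "2024-01-01 00:00:00",     # starting time stamp (required)
    "target": [12.0, 15.0, 9.0, 20.0],  # the time series values (required)
    "cat": [0],                         # optional categorical feature(s)
    "dynamic_feat": [[0, 1, 0, 1]],     # optional, e.g. "was a promotion applied?"
}

# JSON Lines: one JSON object per line of the file.
line = json.dumps(record)
```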

141
Q

For DeepAR, Always include entire _____ for
training, testing, and inference

A

time series

142
Q

For deepAR, start with ___, move up to __ if necessary.

A

CPU, GPU

143
Q

What sagemaker algorithm has the below characteristics:

  • Text classification
  • Predict labels for a sentence
  • Useful in web searches, information retrieval
  • Supervised
  • Word2vec
  • Creates a vector representation of words
  • Semantically similar words are represented by vectors close to each other
  • This is called a word embedding
  • It is useful for NLP, but is not an NLP algorithm
    in itself!
  • Used in machine translation, sentiment analysis
  • Remember it only works on individual words, not sentences or documents
A

BlazingText

144
Q

BlazingText: What training input does it expect?

A
  • For supervised mode (text classification):
  • One sentence per line
  • First “word” in the sentence is the string __label__ followed by the label
  • Also, “augmented manifest text format”
  • Word2vec just wants a text file with one training sentence per line.
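
A small sketch of building one supervised-mode training line: the `__label__` prefix and label come first, then the space-separated sentence. The helper name is hypothetical and the whitespace normalization is an assumption.

```python
def to_blazingtext_line(label, sentence):
    """Format one training example as '__label__<label> <tokens...>'."""
    tokens = sentence.split()  # assumption: whitespace tokenization
    return "__label__" + label + " " + " ".join(tokens)

line = to_blazingtext_line("positive", "what a   great movie")
```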
145
Q

What type of sagemaker algorithm is below:

  • It creates low-dimensional dense embeddings of high-dimensional objects
  • It is basically word2vec, generalized to handle things other than words.
  • Compute nearest neighbors of objects
  • Visualize clusters
  • Genre prediction
  • Recommendations (similar items or
    users)
A

Object2Vec

146
Q

What type of algorithm has the below training requirements:

  • Data must be tokenized into integers
  • Training data consists of pairs of tokens and/or sequences of tokens
  • Sentence – sentence
  • Labels-sequence (genre to description?)
  • Customer-customer
  • Product-product
  • User-item
A

Object2Vec

147
Q

For object2vec, you Process data into ___ and shuffle it

A

JSON Lines

148
Q

What are important hyperparameters for Object2Vec?

A
  • The usual deep learning ones…
  • Dropout, early stopping, epochs, learning rate, batch size, layers, activation function, optimizer, weight decay
  • enc1_network, enc2_network
  • Choose hcnn, bilstm, pooled_embedding
149
Q

What sagemaker algorithm is below:

  • Identify all objects in an image with
    bounding boxes
  • Detects and classifies objects with a
    single deep neural network
  • Classes are accompanied by
    confidence scores
  • Can train from scratch, or use pretrained models based on ImageNet
A

object detection

150
Q

What are the two variants of sagemaker object detection?

A

MXNet and Tensorflow
* Takes an image as input, outputs all instances of objects in the image with categories and
confidence scores
* MXNet
* Uses a CNN with the Single Shot multibox Detector (SSD) algorithm
* The base CNN can be VGG-16 or ResNet-50
* Transfer learning mode / incremental training
* Use a pre-trained model for the base network weights,
instead of random initial weights
* Uses flip, rescale, and jitter internally to avoid overfitting
* Tensorflow
* Uses ResNet, EfficientNet, MobileNet models from
the TensorFlow Model Garden

151
Q

What training input does object detection expect?

A
  • MXNet: RecordIO or image format (jpg or png)
  • With image format, supply a JSON file for annotation data for each image
152
Q

What’s the difference between object detection and image classification?

A

Object detection will show the specific point in the image where the object is. Image classification will classify the image and tell you what it is, not where it is

153
Q

Image Classification: What’s it for?

A
  • Assign one or more labels to an
    image
  • Doesn’t tell you where objects are, just what objects are in the image
154
Q

For image classification , there are Separate algorithms for ________ and _____

A

MXNet and Tensorflow

155
Q

Semantic Segmentation: What’s it for?

A
  • Pixel-level object classification
  • Different from image classification –
    that assigns labels to whole images
  • Different from object detection – that
    assigns labels to bounding boxes
  • Useful for self-driving vehicles,
    medical imaging diagnostics, robot sensing
156
Q
  • Useful for self-driving vehicles,
    medical imaging diagnostics, robot sensing
A

semantic segmentation

157
Q

Semantic Segmentation: What training input does it expect?

A
  • JPG Images and PNG annotations
  • For both training and validation
  • Label maps to describe annotations
  • Augmented manifest image format
    supported for Pipe mode.
  • JPG images accepted for inference
158
Q

What form of sagemaker algorithm tool has the below choices:
Choice of 3 algorithms:
* Fully-Convolutional Network (FCN)
* Pyramid Scene Parsing (PSP)
* DeepLabV3

A

semantic segmentation

159
Q

Random cut forest is used for ________

A

anomaly detection

160
Q

Neural Topic Model: What’s it for?

A
  • Organize documents into topics
  • Classify or summarize documents
    based on topics
  • It’s not just TF/IDF
  • “bike”, “car”, “train”, “mileage”, and
    “speed” might classify a document as
    “transportation” for example (although it
    wouldn’t know to call it that)
161
Q

What are the four data channels for neural topic model?

A
  • Four data channels
  • “train” is required
  • “validation”, “test”, and “auxiliary” optional
162
Q

Neural Topic Model: How is it used?

A
  • You define how many topics you want
  • These topics are a latent representation
    based on top ranking words
  • One of two topic modeling algorithms in
    SageMaker – you can try them both!
163
Q

Another topic modeling algorithm
* Not deep learning
* Unsupervised
* The topics themselves are unlabeled; they are just groupings of documents
with a shared subset of words
* Can be used for things other than words
* Cluster customers based on purchases
* Harmonic analysis in music

A
  • Latent Dirichlet Allocation (LDA)
164
Q

What sagemaker algorithm:

Unsupervised; generates however many topics you specify
* Optional test channel can be used for scoring results
* Per-word log likelihood
* Functionally similar to NTM, but CPU-based
* Therefore maybe cheaper / more efficient

A
  • Latent Dirichlet Allocation (LDA)
165
Q

Simple classification or regression algorithm
* Classification
* Find the K closest points to a sample point and return the most frequent label
* Regression
* Find the K closest points to a sample point and return the average value

A
  • K-Nearest-Neighbors - KNN
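
A brute-force sketch of both uses of KNN, using Euclidean distance and made-up data. This is the textbook idea only, not how SageMaker implements it; the function names are illustrative.

```python
import math
from collections import Counter

def k_nearest(points, query, k):
    """points is a list of (coordinates, value) pairs; return the k closest."""
    return sorted(points, key=lambda p: math.dist(p[0], query))[:k]

def knn_classify(points, query, k):
    # Classification: most frequent label among the k nearest neighbors.
    labels = [label for _, label in k_nearest(points, query, k)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(points, query, k):
    # Regression: average value of the k nearest neighbors.
    values = [value for _, value in k_nearest(points, query, k)]
    return sum(values) / len(values)

labeled = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((5, 5), "b"), ((5, 6), "b")]
numeric = [((0,), 1.0), ((1,), 2.0), ((2,), 3.0), ((10,), 50.0)]
predicted_label = knn_classify(labeled, (0.5, 0.5), k=3)
predicted_value = knn_regress(numeric, (1,), k=3)
```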
166
Q

for KNN: SageMaker includes a ___________ stage
* Avoid sparse data (“curse of dimensionality”)
* At cost of noise / accuracy
* “sign” or “fjlt” methods

A

dimensionality reduction

167
Q

These are important hyperparameters for what algorithm:

  • K!
  • sample_size
A

KNN

168
Q

What sagemaker algorithm:

  • Unsupervised clustering
  • Divide data into K groups, where members of a group are as similar as possible to each other
  • You define what “similar” means
  • Measured by Euclidean distance
  • Web-scale K-Means clustering
A

K Means

169
Q

These are important hyperparameters for what algorithm:

  • K!
  • Choosing K is tricky
  • Plot within-cluster sum of squares as function of K
  • Use “elbow method”
  • Basically optimize for tightness of clusters
  • mini_batch_size
  • extra_center_factor
  • init_method
A

K means
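
A sketch of the elbow method on made-up one-dimensional data with three obvious groups: run a basic K-means for several values of K and compare the within-cluster sum of squares (WCSS). The even-spread initialization is an assumption for reproducibility and is not SageMaker's `init_method`.

```python
def kmeans_1d(values, k, iterations=20):
    """Basic Lloyd's algorithm on 1-D data; returns (centers, clusters)."""
    data = sorted(values)
    # Assumption: start with centers spread evenly across the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

def wcss(centers, clusters):
    # Within-cluster sum of squares: tightness of the clusters.
    return sum((v - c) ** 2 for c, cluster in zip(centers, clusters) for v in cluster)

data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.2, 8.8]
wcss_by_k = {k: wcss(*kmeans_1d(data, k)) for k in range(1, 5)}
```

WCSS drops sharply up to K = 3 and only marginally after that; that bend is the "elbow" that suggests K = 3 for this data.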

170
Q

What is the below sagemaker algorithm:

  • Dimensionality reduction
  • Project higher-dimensional data (lots of features) into lower-dimensional (like a
    2D plot) while minimizing loss of information
  • The reduced dimensions are called components
  • First component has largest possible variability
  • Second component has the next largest…
  • Unsupervised
A
  • Principal Component Analysis
    PCA
171
Q

PCA: What training input does it expect?

A
  • recordIO-protobuf or CSV
  • File or Pipe on either
172
Q

What sagemaker algorithm:

  • Covariance matrix is created, then singular value decomposition (SVD)
    Two modes:
  • Regular
  • For sparse data and moderate number of observations and features
  • Randomized
  • For large number of observations and features
  • Uses approximation algorithm
A
  • Principal Component Analysis
    PCA
173
Q

What sagemaker algorithm:

Dealing with sparse data
* Click prediction
* Item recommendations
* Since an individual user doesn’t interact with most pages / products the data is sparse
* Supervised
* Classification or regression
* Limited to pair-wise interactions
* User -> item for example

A

factorization machines

174
Q

What sagemaker algorithm:

Finds factors we can use to predict a classification (click or not? Purchase or not?) or value (predicted rating?) given a
matrix representing some pair of things (users & items?)
* Usually used in the context of
recommender systems

A

factorization machines

175
Q

What sagemaker algorithm:
* Unsupervised learning of IP address usage patterns
* Identifies suspicious behavior from IP addresses
* Identify logins from anomalous IPs
* Identify accounts creating resources from anomalous IPs

A

IP Insights

176
Q

What sagemaker algorithm:

  • Uses a neural network to learn latent vector representations of entities and IP addresses.
  • Entities are hashed and embedded
  • Need sufficiently large hash size
  • Automatically generates negative samples during training by randomly pairing entities and IPs
A

IP Insights

177
Q

What sagemaker algorithm:

  • You have some sort of agent that “explores” some space
  • As it goes, it learns the value of different state changes in different conditions
  • Those values inform subsequent behavior of the agent
  • Examples: Pac-Man, Cat & Mouse game (game AI)
  • Supply chain management
  • HVAC systems
  • Industrial robotics
  • Dialog systems
  • Autonomous vehicles
  • Yields fast on-line performance once the space has been explored
A

reinforcement learning

178
Q

What sagemaker algorithm:

  • A specific implementation of reinforcement learning
  • You have:
  • A set of environmental states s
  • A set of possible actions in those states a
  • A value of each state/action Q
  • Start off with Q values of 0
  • Explore the space
  • As bad things happen after a given state/action, reduce its Q
  • As rewards happen after a given state/action, increase its Q
A

q learning
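
A sketch of a single Q-value update, using the standard tabular Q-learning rule. The learning rate `alpha` and discount factor `gamma` are not mentioned on the card and are assumptions here, as are the state and action names.

```python
def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Nudge Q(state, action) toward reward + discounted best next value.

    q is a dict keyed by (state, action); missing entries start at 0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

actions = ["left", "right"]
q = {}
q_update(q, "s0", "left", reward=-1.0, next_state="s1", actions=actions)   # bad outcome lowers Q
q_update(q, "s0", "right", reward=1.0, next_state="s1", actions=actions)   # reward raises Q
```

A later update for a state that leads into `s0` picks up the discounted value of the best action there, which is how rewards propagate backward through the explored space.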

179
Q

Reinforcement Learning in SageMaker
* Uses a deep learning framework with ____ and ________

A

Tensorflow and MXNet

180
Q

What is this called:

  • SageMaker spins up a “HyperParameter Tuning Job” that trains as many combinations as you’ll allow
  • Training instances are spun up as needed, potentially a lot of them
  • The set of hyperparameters producing the best results can then be deployed as a model
  • It learns as it goes, so it doesn’t have to try every possible
    combination
A

Automatic Model Tuning

181
Q
  • Visual IDE for machine learning!
A

SageMaker Studio

182
Q

Create and share
Jupyter notebooks with
SageMaker Studio
* Switch between
hardware configurations
(no infrastructure to
manage)

A

Sagemaker notebooks

183
Q
  • Organize, capture, compare, and search your ML jobs
A

Sagemaker experiments

184
Q
  • Saves internal model state at periodic intervals
  • Gradients / tensors over time as a model is trained
  • Define rules for detecting unwanted conditions while training
  • A debug job is run for each rule you configure
  • Logs & fires a CloudWatch event when the rule is hit
A

sagemaker debugger

185
Q
  • Automates:
  • Algorithm selection
  • Data preprocessing
  • Model tuning
  • All infrastructure
  • It does all the trial & error for you
  • More broadly this is called AutoML
A

Sagemaker autopilot

186
Q
  • Integrates with SageMaker Clarify
  • Transparency on how models
    arrive at predictions
  • Feature attribution
A

autopilot explainability

187
Q
  • Get alerts on quality
    deviations on your deployed
    models (via CloudWatch)
  • Visualize data drift
  • Example: loan model starts
    giving people more credit due
    to drifting or missing input
    features
  • Detect anomalies & outliers
  • Detect new features
  • No code needed
A

Sagemaker model monitor

188
Q
  • _________ detects potential bias
  • i.e., imbalances across different groups / ages / income brackets
  • With Model Monitor, you can monitor for bias and be alerted to new potential bias via CloudWatch
  • SageMaker Clarify also helps explain model behavior
  • Understand which features contribute the
    most to your predictions
A

SageMaker Clarify

189
Q
  • A “feature” is just a property used to train a machine learning model.
  • Like, you might predict someone’s political party based on “features” such as their address, income, age, etc.
  • Machine learning models require fast, secure access to feature data for training.
  • It’s also a challenge to keep it
    organized and share features
    across different models.
A

sagemaker feature store

190
Q
  • Creates & stores your ML workflow (MLOps)
  • Keep a running history of your models
  • Tracking for auditing and compliance
  • Automatically or manually-created tracking entities
  • Integrates with AWS Resource Access Manager for cross-account lineage
A

SageMaker ML Lineage Tracking

191
Q
  • Visual interface (in SageMaker
    Studio) to prepare data for machine learning
  • Import data
  • Visualize data
  • Transform data (300+
    transformations to choose from)
  • Or integrate your own custom transforms with pandas, PySpark, PySpark SQL
  • “Quick Model” to train your model with your data and measure its
    results
A

Sagemaker data wrangler

192
Q
  • No-code machine learning for
    business analysts
  • Upload csv data (csv only for now), select a column to predict, build it, and make predictions
  • Can also join datasets
  • Classification or regression
A

sagemaker canvas

193
Q

________
* For asynchronous or real-time inference endpoints
* Controls shifting traffic to new models
* “Blue/Green Deployments”
* All at once: shift everything, monitor, terminate
blue fleet
* Canary: shift a small portion of traffic and
monitor
* Linear: Shift traffic in linearly spaced steps
* Auto-rollbacks

A

Deployment Guardrails

194
Q

________
* Compare performance of shadow variant to production
* You monitor in SageMaker console and decide when to promote it

A

Shadow Tests

195
Q

One facet (demographic group) has fewer training values than another

A
  • Class Imbalance (CI)
196
Q
  • Imbalance of positive outcomes between facet values
A
  • Difference in Proportions of Labels (DPL)
197
Q
  • How much outcome distributions of facets diverge
A
  • Kullback-Leibler Divergence (KL), Jensen-Shannon
    Divergence(JS)
198
Q
  • P-norm difference between distributions of outcomes from facets
A
  • Lp-norm (LP)
199
Q
  • L1-norm difference between distributions of outcomes from facets
A
  • Total Variation Distance (TVD)
200
Q
  • Maximum divergence between outcomes in distributions from facets
A
  • Kolmogorov-Smirnov (KS)
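
A sketch of several of the pre-training bias metrics above, written from their usual definitions. The facet names (`a` and `d`), the sample counts and the outcome distributions are made-up assumptions; this is not the SageMaker Clarify implementation.

```python
import math

def class_imbalance(n_a, n_d):
    # CI: normalized difference in the number of samples per facet.
    return (n_a - n_d) / (n_a + n_d)

def dpl(q_a, q_d):
    # DPL: difference in the proportion of positive labels per facet.
    return q_a - q_d

def kl_divergence(p_a, p_d):
    # KL: how much facet a's outcome distribution diverges from facet d's.
    return sum(pa * math.log(pa / pd) for pa, pd in zip(p_a, p_d) if pa > 0)

def lp_norm(p_a, p_d, p=2):
    # Lp-norm of the difference between the two outcome distributions.
    return sum(abs(pa - pd) ** p for pa, pd in zip(p_a, p_d)) ** (1 / p)

def total_variation_distance(p_a, p_d):
    # TVD: half the L1-norm difference.
    return 0.5 * lp_norm(p_a, p_d, p=1)

def kolmogorov_smirnov(p_a, p_d):
    # KS: the largest single gap between the two distributions.
    return max(abs(pa - pd) for pa, pd in zip(p_a, p_d))

# Outcome distributions over (rejected, accepted) for each facet.
p_a, p_d = [0.3, 0.7], [0.6, 0.4]
```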
201
Q
  • Disparity of outcomes between facets as a whole, and by subgroups
A
  • Conditional Demographic Disparity (CDD)
202
Q
  • Integrated into AWS Deep Learning Containers
    (DLCs)
  • Can’t bring your own container
  • Compile & optimize training jobs on GPU instances
  • Can accelerate training up to 50%
  • Converts models into hardware-optimized instructions
  • Tested with Hugging Face transformers library, or
    bring your own model
A

SageMaker Training Compiler

203
Q

What AI Service:

  • Natural Language Processing and Text Analytics
  • Input social media, emails, web pages, documents, transcripts, medical records (Comprehend Medical)
  • Extract key phrases, entities, sentiment, language, syntax, topics, and document
    classifications
  • Events detection
  • PII Identification & Redaction
  • Targeted sentiment (for specific entities)
  • Can train on your own data
A

Amazon comprehend

204
Q

What AI Service:

  • Uses deep learning for translation
  • Supports custom terminology
  • In CSV or TMX format
  • Appropriate for proper names, brand names, etc.
A

Amazon Translate

205
Q

What AI service:

  • Speech to text
  • Input in FLAC, MP3, MP4, or WAV, in a specified language
  • Streaming audio supported (HTTP/2 or WebSocket)
  • French, English, Spanish only
  • Speaker Identification
  • Specify number of speakers
  • Channel Identification
  • i.e., two callers could be transcribed separately
  • Merging based on timing of “utterances”
  • Automatic Language Identification
  • You don’t have to specify a language; it can detect the dominant one spoken.
  • Custom Vocabularies
  • Vocabulary Lists (just a list of special words – names, acronyms)
  • Vocabulary Tables (can include “SoundsLike”, “IPA”, and “DisplayAs”)
A

Amazon Transcribe

206
Q

What AI Service:

  • Neural Text-To-Speech, many voices & languages
  • Lexicons
  • Customize pronunciation of specific words & phrases
  • Example: “World Wide Web Consortium” instead of
    “W3C”
  • SSML
  • Alternative to plain text
  • Speech Synthesis Markup Language
  • Gives control over emphasis, pronunciation, breathing, whispering, speech rate, pitch, pauses.
  • Speech Marks
  • Can encode when sentence / word starts and ends in
    the audio stream
  • Useful for lip-synching animation
A

Amazon Polly

207
Q

What AI Service:

  • Computer vision
  • Object and scene detection
  • Can use your own face collection
  • Image moderation
  • Facial analysis
  • Celebrity recognition
  • Face comparison
  • Text in image
  • Video analysis
  • Objects / people / celebrities marked on timeline
  • People Pathing
  • Image and video libraries
A

Rekognition

208
Q

What AI Service:

  • Fully-managed service to deliver highly accurate forecasts with ML
  • “AutoML” chooses best model for your time series data
  • ARIMA, DeepAR, ETS, NPTS, CNN-QR, Prophet
  • Works with any time series
  • Price, promotions, economic performance, etc.
  • Can combine with associated data to find relationships
  • Inventory planning, financial planning,
    resource planning
  • Based on “dataset groups,” “predictors,” and “forecasts.”
A

Amazon Forecast

209
Q

What AI Tool:

  • Billed as the inner workings of Alexa
  • Natural-language chatbot engine
  • A Bot is built around Intents
  • Utterances invoke intents (“I want to order a pizza”)
  • Lambda functions are invoked to fulfill the intent
  • Slots specify extra information needed by the intent
  • Pizza size, toppings, crust type, when to deliver, etc.
  • Can deploy to AWS Mobile SDK, Facebook Messenger, Slack, and Twilio
A

Amazon Lex

210
Q

What AI Service:

  • Fully-managed recommender engine
  • Same one Amazon uses
  • API access
  • Feed in data (purchases, ratings, impressions, cart adds, catalog, user demographics etc.) via S3 or API integration
  • You provide an explicit schema in Avro format
  • JavaScript or SDK
  • GetRecommendations
  • Recommended products, content, etc.
  • Similar items
  • GetPersonalizedRanking
  • Rank a list of items provided
  • Allows editorial control / curation
A

Amazon Personalize

211
Q

What AI Service:
* Equipment, metrics, vision
* Detects abnormalities from sensor data automatically to detect equipment issues
* Monitors metrics from S3, RDS, Redshift, 3rd party SaaS apps
* Vision uses computer vision to detect defects in silicon wafers, circuit boards, etc.

A

Amazon Lookout

212
Q

What AI Service:

  • End to end system for monitoring industrial equipment & predictive maintenance
A

Amazon Monitron

213
Q

What AI Service:

  • Computer Vision at the edge
  • Brings computer vision to your existing IP cameras
A

AWS Panorama

214
Q

What AI Tool:

  • Upload your own historical fraud data
  • Builds custom models from a template you choose
  • Exposes an API for your online
    application
A

Amazon Fraud Detector

215
Q

What AI Service:

  • Automated code reviews!
  • Finds lines of code that hurt
    performance
  • Resource leaks, race
    conditions
  • Fix security vulnerabilities
A

Amazon CodeGuru

216
Q

What AI Service:

  • For customer support call centers
  • Ingests audio data from recorded calls
  • Allows search on calls / chats
  • Sentiment analysis
  • Find “utterances” that correlate with successful calls
  • Categorize calls automatically
  • Measure talk speed and interruptions
  • Theme detection: discovers
    emerging issues
A

Contact Lens for Amazon Connect

217
Q

What AI Service:

  • Enterprise search with natural language
  • For example, “Where is the IT support desk?” “How do I connect to my VPN?”
  • Combines data from file systems, SharePoint, intranet, sharing services (JDBC, S3) into one searchable
    repository
  • ML-powered (of course) – uses thumbs up / down feedback
  • Relevance tuning – boost strength of document freshness, view counts, etc.
A

Amazon Kendra

218
Q

What AI Service:

  • Human review of ML predictions
  • Builds workflows for reviewing low-confidence predictions
  • Access the Mechanical Turk workforce or vendors
  • Integrated into Amazon Textract and Rekognition
  • Integrates with SageMaker
  • Very similar to Ground Truth
A

Amazon Augmented AI (A2I)

219
Q
  • All models in SageMaker are hosted in ________
A

Docker containers

220
Q
  • Docker containers are
    created from ______
A

images

221
Q
  • Images are built from a
    _______
A

Dockerfile

222
Q
  • Images are saved in a
    ________
A

repository

223
Q
  • Train once, run anywhere
  • Edge devices
  • ARM, Intel, Nvidia processors
  • Embedded in whatever – your car?
  • Optimizes code for specific
    devices
  • Tensorflow, MXNet, PyTorch,
    ONNX, XGBoost, DarkNet, Keras
  • Consists of a compiler and a
    runtime
A

Sagemaker Neo

224
Q
A