Important stuff: AWS Certification Flashcards

1
Q

SageMaker

A

Data usually comes from S3, but SageMaker can also ingest from Athena, EMR, Redshift, and Amazon Keyspaces.

SageMaker cannot be deployed on an EMR cluster.

2
Q

Kinesis Firehose data conversion capabilities

A

Kinesis Firehose can convert JSON data to Parquet or ORC format on the fly before delivering it to S3.

3
Q

Vanishing Gradient

A

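A “vanishing gradient” results from multiplying together many small derivatives of the sigmoid activation function across multiple layers during backpropagation. The sigmoid derivative never exceeds 0.25, so the product shrinks quickly as depth grows. ReLU has a derivative of 1 for positive inputs, so it largely avoids this problem.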
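To make the shrinking product concrete, here is a minimal sketch that multiplies one activation derivative per layer. It deliberately ignores the weight terms of a real backpropagation pass, and the helper name `chained_gradient` is made up for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    # Derivative of the sigmoid; its maximum is 0.25, reached at x = 0.
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x: float) -> float:
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise.
    return 1.0 if x > 0 else 0.0


def chained_gradient(grad_fn, pre_activation: float, layers: int) -> float:
    """Multiply one activation derivative per layer (weights are ignored)."""
    product = 1.0
    for _ in range(layers):
        product *= grad_fn(pre_activation)
    return product


if __name__ == "__main__":
    for layers in (1, 5, 10, 20):
        print(
            layers,
            chained_gradient(sigmoid_grad, 0.0, layers),
            chained_gradient(relu_grad, 1.0, layers),
        )
```

Even in the best case for sigmoid (input 0), ten layers leave less than one millionth of the gradient, while the ReLU product stays at 1 for positive inputs.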

4
Q

QuickSight

A

QuickSight can read data directly from S3.

5
Q

Kinesis Data Firehose

A
  • Fully Managed Service, no administration
  • Near Real Time (minimum latency of 60 seconds for non-full batches)
  • Data Ingestion into Redshift / Amazon S3 / Elasticsearch / Splunk, i.e. it loads data into these services
  • Automatic scaling
  • Supports many data formats
  • Data Conversion from JSON to Parquet / ORC (only for S3); CSV input must first be transformed to JSON
  • Data Transformation through AWS Lambda (e.g. CSV => JSON)
  • Supports compression when target is Amazon S3 (GZIP, ZIP, and SNAPPY)
  • Pay for the amount of data going through Firehose
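The Lambda transformation bullet (CSV => JSON) can be sketched as below. The handler follows the Firehose transformation contract as I understand it: each incoming record carries a `recordId` and base64-encoded `data`, and each returned record carries the same `recordId`, a `result`, and re-encoded `data`. The column names in `COLUMNS` are hypothetical, and every value is left as a string.

```python
import base64
import csv
import io
import json

# Hypothetical column names for the incoming CSV rows (an assumption).
COLUMNS = ["ticker", "price", "volume"]


def lambda_handler(event, context):
    """Turn each one-line CSV record from Firehose into a JSON record."""
    output = []
    for record in event["records"]:
        try:
            raw = base64.b64decode(record["data"]).decode("utf-8")
            row = next(csv.reader(io.StringIO(raw)))
            if len(row) != len(COLUMNS):
                raise ValueError("unexpected column count")
            # Newline-delimited JSON keeps records separable at the destination.
            payload = json.dumps(dict(zip(COLUMNS, row))) + "\n"
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(payload.encode("utf-8")).decode("utf-8"),
            })
        except (StopIteration, ValueError, csv.Error):
            # Return the record unchanged and flag it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Once the records are JSON, the built-in conversion to Parquet or ORC can take over for S3 destinations.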
6
Q

Converting to LibSVM

A

Neither Glue ETL nor Kinesis Analytics can convert to LibSVM format, and scikit-learn is not a distributed solution.
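For reference, a LibSVM line is a label followed by sparse `index:value` pairs with 1-based indices, and zero-valued features are omitted. The sketch below only illustrates the format on a single machine; it is not a distributed conversion, and `to_libsvm_line` is a hypothetical helper name.

```python
def to_libsvm_line(label, features):
    """Format one example as '<label> <index>:<value> ...' (1-based, sparse)."""
    parts = [f"{label:g}"]
    for index, value in enumerate(features, start=1):
        if value != 0:  # zero-valued features are left out of the line
            # ':g' trims trailing zeros but keeps only 6 significant digits.
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


if __name__ == "__main__":
    print(to_libsvm_line(1, [0.5, 0.0, 3.0]))
```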

7
Q

Binning

A

Binning is the process of converting numeric data into categorical data by grouping values into ranges (bins).
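A minimal sketch of binning, assuming made-up age boundaries and labels (`AGE_EDGES` and `AGE_LABELS` are illustrative, not from any standard):

```python
from bisect import bisect_right

# Hypothetical bin boundaries and labels, chosen only for illustration.
AGE_EDGES = [18, 35, 65]
AGE_LABELS = ["child", "young_adult", "adult", "senior"]


def bin_value(value, edges=AGE_EDGES, labels=AGE_LABELS):
    """Map a number to the label of the bin it falls into.

    A value equal to an edge goes into the bin that starts at that edge.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return labels[bisect_right(edges, value)]


if __name__ == "__main__":
    for age in (7, 18, 40, 65):
        print(age, bin_value(age))
```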

8
Q

Kinesis Producer Library

A

The KPL is an easy-to-use, highly configurable library that helps you write to a Kinesis data stream. It acts as an intermediary between your producer application code and the Kinesis Data Streams API actions.

The KPL can help build high-performance producers.

9
Q

Lambda Functions

A

Not meant to handle heavy ETL jobs: execution time is capped at 15 minutes, so Lambda suits small, short-lived transformations.

10
Q

Spark

A

Not meant for OLTP. Spark is used for distributed batch processing and analytics, and Spark Streaming can transform data as it comes in.
