Past paper 3 Flashcards

1
Q

What is Apache Spark?

A

distributed processing framework and programming model that helps you do machine learning, stream processing, or graph analytics using Amazon EMR clusters.

2
Q

What is Amazon EMR?

A

a web service that makes it easy for you to process and analyze vast amounts of data using applications in the Hadoop ecosystem

3
Q

What could you use to transform data into RecordIO-Protobuf format?

A

Apache Spark

4
Q

What is AWS Glue?

A

serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development

5
Q

Can AWS Glue transform data into RecordIO-Protobuf format?

A

No, it cannot

6
Q

What is AWS Step Functions?

A

a low-code visual workflow service used to orchestrate AWS services, automate business processes and build serverless applications

7
Q

What is Lambda not suited for?

A

Long-running processes such as transforming large datasets

8
Q

What is Kinesis Firehose used for?

A

capture, transform, and load streaming data into Amazon repositories

9
Q

Which Amazon repositories can Kinesis Firehose load streaming data into?

A

Amazon S3, Amazon Redshift, Amazon Elasticsearch Service and Splunk

10
Q

What type of processing should Kinesis Firehose not be used for?

A

Batch processing

11
Q

What does a VPC endpoint allow connections between?

A

a virtual private cloud (VPC) and supported services, such as SageMaker, without requiring that you use an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.

12
Q

What is an interface endpoint?

A

an elastic network interface with a private IP address from the IP address range of your subnet.

13
Q

What traffic does an interface endpoint serve?

A

traffic destined for a service that is owned by AWS or owned by an AWS customer or partner.

14
Q

What is a gateway endpoint used for?

A

traffic destined for either Amazon S3 or DynamoDB

15
Q

What is the AWS Key Management Service used for?

A

to create and control the cryptographic keys used to protect your data

16
Q

What is a permission boundary?

A

an advanced feature for using a managed policy to set the maximum permissions that an identity-based policy can grant an IAM entity
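A minimal sketch of how a boundary caps permissions, assuming a deliberately simplified model in which each policy is just a set of allowed action names. Real IAM evaluation also accounts for explicit denies, resources, and conditions; the function name and the example actions are hypothetical.

```python
def effective_permissions(identity_policy, permissions_boundary):
    """Return the actions allowed by BOTH the identity-based policy and the boundary."""
    return set(identity_policy) & set(permissions_boundary)


# Hypothetical example policies, reduced to sets of action names.
identity_policy = {"s3:GetObject", "s3:PutObject", "iam:CreateUser"}
permissions_boundary = {"s3:GetObject", "s3:PutObject", "sagemaker:InvokeEndpoint"}

allowed = effective_permissions(identity_policy, permissions_boundary)
print(sorted(allowed))
```

In this simplified model, `iam:CreateUser` is dropped because the boundary does not include it, and `sagemaker:InvokeEndpoint` is not granted because the boundary only sets a ceiling; it does not grant anything by itself.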

17
Q

What would you use to encrypt data at rest?

A

AWS Key Management Service to manage encryption keys

18
Q

How would you limit the permissions of the root user?

A

AWS Organizations service control policy (SCP)

19
Q

How would you connect to the SageMaker API or to the SageMaker Runtime?

A

through an interface endpoint in your VPC.

20
Q

Define residual

A

the error between the predicted value and the observed actual value.
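A small illustration, assuming the common sign convention residual = observed - predicted (some texts flip the sign). The function name and the sample numbers are made up for the example.

```python
def residuals(observed, predicted):
    """Residual for each data point: observed value minus predicted value."""
    return [y - y_hat for y, y_hat in zip(observed, predicted)]


observed = [3.0, 5.0, 7.5]
predicted = [2.5, 5.0, 8.0]

res = residuals(observed, predicted)
print(res)  # [0.5, 0.0, -0.5]
```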

21
Q

When is a linear model not suitable for a problem? (residuals)

A

When the variance is not constant. (Residuals do not form a zero-centred bell curve.)
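One rough way to see non-constant variance is to compare how widely the residuals spread at low versus high fitted values. The sketch below is a crude heuristic, not a formal statistical test; the function name and the sample data are hypothetical.

```python
from statistics import pvariance


def spread_ratio(fitted, residuals):
    """Variance of residuals in the upper half of fitted values divided by
    the variance in the lower half. A ratio far from 1 hints at non-constant
    variance. Assumes the lower half does not have zero variance."""
    # Order the residuals by their fitted value, then split in two halves.
    ordered = [r for _, r in sorted(zip(fitted, residuals))]
    half = len(ordered) // 2
    low, high = ordered[:half], ordered[half:]
    return pvariance(high) / pvariance(low)


fitted = [1, 2, 3, 4, 5, 6, 7, 8]
fan_shaped = [0.1, -0.1, 0.2, -0.2, 1.0, -1.0, 2.0, -2.0]   # spread grows with fitted value
even = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]          # spread stays the same

print(spread_ratio(fitted, fan_shaped))
print(spread_ratio(fitted, even))
```

The fan-shaped residuals give a ratio far above 1, while the evenly spread residuals give a ratio of about 1.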

22
Q

You are training a neural network and notice it is scoring highly on the training data but not on the test data. What do you do?

A
  • Use dropout
  • Use early stopping while training
  • Add parameter regularization
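Of the options above, early stopping is the easiest to sketch without a deep learning framework. The example below assumes a list of per-epoch validation losses is already available and uses a hypothetical `patience` parameter: training stops once the validation loss has failed to improve for that many consecutive epochs.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch at which training would stop."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # New best validation loss: reset the counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # Never triggered: training ran through every epoch.
    return len(val_losses) - 1


# Validation loss improves until epoch 2, then starts rising (overfitting).
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]
stop = early_stopping_epoch(val_losses, patience=2)
print(stop)  # 4
```

In practice the weights saved at the best epoch (epoch 2 here) would usually be restored after stopping.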
23
Q

What is regularization?

A

a set of techniques that lower the complexity of a neural network model during training and thus help prevent overfitting

24
Q

What does L1 (lasso) regularization do?

A

the weights of unimportant features are driven to exactly zero
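One way to see why L1 produces exact zeros is the soft-thresholding step used by many L1 solvers: every weight is pulled toward zero by a fixed amount, and any weight smaller than that amount is set to zero. This is a sketch of that single step, not a full lasso fit; the weights and the penalty strength `lam` are made-up values.

```python
def soft_threshold(w, lam):
    """Shrink w toward zero by lam; weights with |w| <= lam become exactly 0."""
    if abs(w) <= lam:
        return 0.0
    return w - lam if w > 0 else w + lam


weights = [2.0, -0.3, 0.05, -1.5]
shrunk = [soft_threshold(w, lam=0.5) for w in weights]
print(shrunk)  # [1.5, 0.0, 0.0, -1.0]
```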

25
Q

What does L2 (ridge) regularization do?

A

the weights of unimportant features are forced near zero (but not exactly zero)
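A simplified illustration of why L2 shrinks weights without zeroing them. It assumes a one-dimensional toy problem, minimizing (w - w0)^2 / 2 + lam * w^2 / 2, whose solution is w0 / (1 + lam): each weight is scaled down proportionally, so a nonzero weight stays nonzero. The weights and `lam` are made-up values.

```python
def ridge_shrink(w0, lam):
    """Minimizer of (w - w0)**2 / 2 + lam * w**2 / 2, i.e. w0 scaled by 1 / (1 + lam)."""
    return w0 / (1.0 + lam)


weights = [2.0, -0.3, 0.05, -1.5]
shrunk = [ridge_shrink(w, lam=0.5) for w in weights]
print(shrunk)
```

Every value in `shrunk` is smaller in magnitude than the original weight, but none is exactly zero.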
