Domain 5 Flashcards

Question 1

Q

Does an IAM user have any perms by default?

Question 2

Q

CloudTrail captures all API calls from SageMaker except

Answer

A

Invoking endpoints

Question 3

Q

What levels can you block public access for S3?

Answer

A

Bucket and/or account level (enabled at the account level blocks existing and new)

Question 4

Q

This simplifies role creation for AI/ML activities. When you create a role by using SageMaker Role Manager, you choose a persona that will have the appropriate activities for that persona preselected. You can customize which activities are enabled if you like. In this way, it will create the permissions policy for the role for you. You also have the option to add additional IAM policies.

Answer

A

Amazon SageMaker Role Manager

Question 5

Q

SageMaker Role Manager can create these three personas

Answer

A

Data Scientist - or someone who needs to use SageMaker to perform general machine learning development and experimentation
MLOps - for someone who is managing models, pipelines, experiments, and endpoints, but doesn’t need to access the data in Amazon S3
SageMaker Compute - for creating a role that SageMaker compute resources can use to perform tasks such as training and inference

Question 6

Q

t of f Amazon S3, Amazon DynamoDB, and Amazon SageMaker, will encrypt your data by default without your having to enable it

Question 7

Q

t or f all requests to Amazon S3 and SageMaker through the APIs and console are made over a secure encrypted connection.

Question 8

Q

SageMaker distributed training, uses multiple nodes in a cluster. By default, inter-node traffic is not encrypted, but an option exists to enable it. Although this encryption might be required for very sensitive data, enabling inter-node encryption can increase training times for some algorithms, particularly deep learning ones.

Question 9

Q

torf In general, PII should not be removed from training data at the point of ingestion and transformation

Answer

A

false. In general, PII should be removed from training data at the point of ingestion and transformation

Question 10

Q

You should do this regarding VPCs for SageMaker Studio Notebooks

Answer

A

The best practice recommendation is to create a VPC in your account and specify your VPC when launching SageMaker Studio and notebooks. This will create an elastic network interface in your VPC and attach it to the notebook instance. By using your own VPC, you can control which traffic can access the internet by configuring security groups, network access lists, and network firewalls

Question 11

Q

How do you prevent SageMaker from giving your notebook instances internet access?

Answer

A

You can prevent SageMaker from giving your notebook instances internet access by specifying VPC only for the network access type. SageMaker Studio normally reaches required services like Amazon S3, Amazon CloudWatch, the SageMaker runtime, and the SageMaker API by using the public network. But when you use VPC only mode, the public endpoints for these services are no longer reachable. To keep all network traffic going over only a private network, you can use VPC interface endpoints.

Question 12

Q

Describe training data vulnerabilities

Answer

A

If a malicious actor gains access to the training data, they can introduce data that will change the model’s predictions.

Question 13

Q

Describe input vulnerabilities

Answer

A

An attacker can slightly manipulate input data in a way that will cause the model to misclassify it. For example, a company uses a face recognition model to recognize employees. An attacker can make subtle but carefully designed modifications to their image to cause the model to recognize them as someone else.

Question 14

Q

Describe output vulnerabilities

Answer

A

A sophisticated attacker can cause a model’s output to infer the training data

Question 15

Q

What is model inversion?

Answer

A

The attacker keeps feeding data into the model and studying the outputs. For example, the facial recognition model is trained on employee images, and its output is the employee’s name and the confidence score. The attacker can repeatedly feed the model facial images, making changes until the output is an employee’s name and a high-confidence score. The hacker then has a good image of an employee that they can use to pretend to be the employee

Question 16

Q

with enough input and output pairs, an attacker can

Answer

A

create a new model that works in reverse. That is they can train a new model on the original model’s outputs, and use it to infer the training input data. Similarly, the hacker can reverse engineer the model and make their own model that is very similar to original model.

Question 17

Q

prompt injection.

Answer

A

In this kind of attack, an attacker gives malicious instructions to the model in the prompt with the goal of influencing its output. For example, the attacker can prompt the LLM to ignore or alter its prompt template, which would permit the attacker to gain sensitive information

Question 18

Q

you can teach an LLM to detect prompt injection by using key attack patterns in training and

Answer

A

and return the response prompt attack detected.

Question 19

Q

To help a model avoid being tricked, you should

Answer

A

train models with adversarial input. Also, you can train your models frequently on new data so that any damage from corrupted training data will be undone.

Question 20

Q

routinely scan and monitor your data for

Answer

A

quality and detect anomalies before using it for training.

Question 21

Q

Amazon SageMaker Model Monitor monitors the quality of Amazon SageMaker machine learning models in production.

Answer

A

After deploying a model into your production environment, use Amazon SageMaker Model Monitor to continuously monitor the quality of your models in real time. You can use Amazon SageMaker Model Monitor to set up an automated alert system for deviations in model quality, such as data drift or anomalies

Question 22

Q

Amazon SageMaker Model Monitor also can be used for

Answer

A

monitoring data quality.SageMaker Model compares the data and model with baselines. It generates statistics and metrics that are visible on SageMaker Studio and also sent to Amazon CloudWatch

Question 23

Q

Datasets should be stored in Amazon S3 and

Answer

A

partitioned with prefixes to uniquely identify the training datase

Question 24

Q

SageMaker automatically uniquely identifies each training job and

Answer

A

stores other metadata such as hyperparameters and the unique identifiers for the container dataset and model output.

Question 25

Q

Model versions can be stored in a model catalog by using .

Answer

A

using SageMaker Model Registry.

Question 26

Q

Using SageMaker Model Registry, you can catalog models in model groups that contain different versions of a model. Each model package in a model group corresponds to a trained model. You can associate and view model metadata, including training metrics. Models can be deployed directly from the model registry. Model Registry also lets you maintain the status of a model such as spending, approved, or rejected.

Question 27

Q

Amazon SageMaker Model Cards

Answer

A

document, retrieve, and share essential model information from conception to deployment. Risk managers, data scientists, and ML engineers can use model cards to create an immutable record of intended model uses, risk ratings, training details, and evaluation results. Model cards can be exported to PDF and shared with relevant stakeholders.

Question 28

Q

Amazon SageMaker ML Lineage Tracking

Answer

A

automatically creates a graphical representation of all the elements of your end-to-end machine learning workflow. You can use this representation to establish model governance, reproduce your workflow, and maintain a record of your work history.

Question 29

Q

You can run queries against the lineage data to discover

Answer

A

relationships between the entities. For example, query the lineage to retrieve all the models that use a particular dataset or retrieve datasets that use a container image artifact.

Question 30

Q

SageMaker Feature Store

Answer

A

Serves as a single source of truth to store, retrieve, remove, track, share, discover and control access to features

Process raw data into features buy using a processing worklfow

View the lineage of a feature group

Run queries to retrieve the state of each feature at a point in time

Question 31

Q

Amazon SageMaker Model Dashboard is

Answer

A

a centralized portal accessible from the SageMaker console where you can view, search, and explore all models in your account.

Question 32

Q

Model Dashboard Capabilities

Answer

A

You can visualize workflow lineage and track your endpoint performance. You can track which models are deployed for inference and whether they’re used in batch transform jobs, or hosted on endpoints. If you set up monitors using Model Monitor, you can also track the performance of your models as they make real-time predictions on live data. You can use the dashboard to find models that violate thresholds that you set for data quality, model quality, bias, and explainability.

Question 33

Q

service organization controls

Answer

A

SOC report,

Question 34

Q

If a company wants to achieve SOC 2 for their customers, they use the AWS SOC 2 report as a starting point. Then

Answer

A

they have an auditor verify that they have configured the security controls they are responsible for correctly.

Question 35

Q

AWS Customer compliance center

Answer

A

Read customer compliance stories about how companies in regulated industries have solved various compliance, governance, and audit challenges. You can access compliance whitepapers and documentation on various topics. These topics include AWS answers to key compliance questions, an overview of AWS risk and compliance, and an auditing security checklist. Additionally, the customer compliance center includes an auditor learning path.

Question 36

Q

Name 3 emerging AI compliance standards

Answer

A

ISO 42001 and ISO 23894
EU Artificial Intelligence Act (Categories: unacceptable risk, high risk, unregulated)
NIST AI Risk Management Framework (govern, map, measure, and manage)

Question 37

Q

AI Risk management

Answer

A

Multiply the probability of occurrence by severity of consequences

Question 38

Q

The Algorithmic Accountability Act

Answer

A

has been introduced several times for consideration by the United States Congress. If enacted, it will require companies to assess the impacts of the AI systems that they use and sell. It will create new transparency about when and how such systems are used and empower consumers to make informed choices when interacting with AI systems. Its goal is to try to protect Americans from AI systems that can lead to unfair and unexplained results.

Question 39

Q

Amazon SageMaker Clarify can help you understand how different variables…

Answer

A

influence a model’s behavior and monitor for bias and feature attribution drift

Question 40

Q

What tool can you use that continually audits AWS usage for compliance, has a Generative AI framework, and allows you to collect evidence and add to an audit report?

Answer

A

AWS Audit Manager

Question 41

Q

True or false: Guardrails for Amazon Bedrock only let you use predefined topics

Answer

A

False. You can describe your own topic, and also have a response for when the prompt is blocked and one for when the response is blocked.

Question 42

Q

THis service provides a detailed inventory of current AWS resources, hirstory, and can remediate via Systems manager when a change is detected.

Answer

A

AWS config

Question 43

Q

What is an AWS Config conformance pack?

Answer

A

Package together config rules and remediation actions

Question 44

Q

Two useful conformance packs

Answer

A

Opeartional best practices for AI and ML
Security best practices for Amazon SageMaker

Question 45

Q

Whereas AWS Config monitors configurations at the resource level, Amazon Inspector

Answer

A

works at the application level. It checks applications and containers for security vulnerabilities and deviations from security best practices, such as open access to EC2 instances and installations of vulnerable software versions.

Question 46

Q

Trusted Advisor continuously evaluates your AWS environment by using best practice checks across the categories of cost optimization, performance, resilience, security, operational excellence, and service limits. It recommends actions to remediate any deviations from best practices.

Question 47

Q

Three major parts of data governance

Answer

A

Curation
Discovery and understanding
Protection

Question 48

Q

What are three important roles in data governance strategy

Answer

A

Data steward
Data Owner
IT Roles

Question 49

Q

This tool helps with data governance by doing data profiling and tracks data lineage

Answer

A

AWS Glue DataBrew for data governance

Question 50

Q

Stores medata for datasets including location and schema and data types, configurable manually or via crawler jobs

Answer

A

AWS Glue Data Catalog

Question 51

Q

AWS Glue Data Quality

Answer

A

Recommends quality rules
Detects anomalies

Question 52

Q

You can use this feature of S3 to ensure data retention and compliance, expire data, and transition to less costly storage

Answer

A

Lifecycle policies

Question 53

Q

What is AWS Lake Formation

Answer

A

you can manage fine-grained access control for a data lake, built in Amazon S3, and catalog using AWS Glue Data Catalog, Lake Formation permissions are enforced using granular controls at the column, row, and cell levels across AWS analytics and machine learning services. Lake Formation helps you break down data in silos and combine different types of structured and unstructured data into a centralized repository. You can identify existing data stores in Amazon S3 or relational and NoSQL databases and move the data into your data lake. Then, you catalog the data and set up the permissions. A user might try to access the data by using an integrated analytical engine like Athena, AWS Glue, Amazon EMR, or Amazon Redshift. In that case, the AWS Glue Data Catalog checks the permissions with Lake Formation before granting access.

Question 54

Q

Generative AI Security Scoping Matrix shows increased levels of scope, depending on the way AI is consumed or implemented. Scopes 1 and 2

Answer

A

carry the least responsibility because you are consuming a third-party consumer or enterprise application

Question 55

Q

Generative AI Security Scoping Matrix With scopes 3, 4, and 5, you are building your own AI solution. Your data can be used in the training, fine-tuning, or output of the model. You are responsible for classifying the data and model for risk, implementing threat modeling, limiting access, implementing security controls, and assuring the model endpoint’s resilience.

Question 56

Q

Inherent risk

Answer

A

represents the amount of risk the AI system exhibits in absence of mitigations or controls.

Question 57

Q

Residual risk

Answer

A

captures the remaining risks after factoring in mitigation strategies.

Question 58

Q

To estimate the risk of an event, you can use a

Answer

A

likelihood scale in combination with a severity scale to measure the probability of occurrence as well as the degree of consequences. A helpful starting point when developing these scales might be the NIST RMF

Question 59

Q

How to assess risk?

Answer

A

describing the AI use case that needs to be assessed and identify all relevant stakeholders

identify potentially harmful events associated with the use case

sing the final assessment summary, organizations will have to define what risk levels are acceptable for their AI systems as well as consider relevant regulations and policies