GenAI Flashcards
What instances for computer vision?
G5
high GPU performance from NVIDIA A10G Tensor Core GPUs, low latency, high bandwidth
can process lots of real-time imaging data
computer vision
what is it and what AWS services are relevant
allows computers to interpret and analyze visual data
captures images, algorithms process them, makes decisions like identifying objects, classifying images, tracking movements
Amazon Rekognition: For image and video analysis.
SageMaker: To train custom vision models.
EC2 G5 Instances: Ideal for GPU-intensive computer vision tasks.
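A hedged sketch of the Rekognition side: DetectLabels returns a dict with a `Labels` list of `{Name, Confidence}` entries (that response shape is real; the helper function and sample data below are illustrative, not part of boto3):

```python
# Sketch: filtering labels from an Amazon Rekognition DetectLabels-style
# response. The response shape mirrors the real API; the helper function
# and sample data are illustrative only.

def high_confidence_labels(response, min_confidence=90.0):
    """Return label names whose confidence meets the threshold."""
    return [
        label["Name"]
        for label in response.get("Labels", [])
        if label["Confidence"] >= min_confidence
    ]

# Sample response shaped like rekognition.detect_labels(...) output
sample_response = {
    "Labels": [
        {"Name": "Car", "Confidence": 98.1},
        {"Name": "Vehicle", "Confidence": 98.1},
        {"Name": "Bicycle", "Confidence": 55.3},
    ]
}

print(high_confidence_labels(sample_response))  # ['Car', 'Vehicle']
```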
difference between traditional genAI and agentic AI
genAI responds to inputs to generate content: reactive, takes no actions without direct user interaction, no memory or goal setting
agentic AI - extends GenAI by combining content generation with autonomy, can initiate and complete tasks, make decisions, interact with systems without human input
agentic AI can automate end to end processes
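The reactive-vs-agentic split can be sketched in plain Python (toy example, no AWS services; all function names are made up): a GenAI call is one request/response, while an agent loops with a goal and memory until the task is done:

```python
# Toy contrast between reactive GenAI and agentic AI (illustrative only).

def reactive_genai(prompt):
    # Reactive: one input -> one output, no memory, no goal.
    return f"generated content for: {prompt}"

def agentic_ai(goal, tools, max_steps=5):
    # Agentic: keeps state, picks actions, works toward a goal autonomously.
    memory = []
    for _ in range(max_steps):
        action = tools["plan"](goal, memory)   # decide the next step
        result = tools["act"](action)          # execute it
        memory.append((action, result))
        if tools["done"](goal, memory):        # check the goal
            break
    return memory

tools = {
    "plan": lambda goal, mem: f"step {len(mem) + 1} toward {goal}",
    "act":  lambda action: f"did {action}",
    "done": lambda goal, mem: len(mem) >= 3,   # stop after 3 steps
}

history = agentic_ai("classify new images", tools)
print(len(history))  # 3
```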
task & workflow orchestration
relevant AWS services
task orchestration – manages how tasks are triggered, executed, connected
workflow orchestration – organizes + manages the sequence of tasks like a roadmap
services:
-Step Functions: orchestrates workflows by connecting tasks like a flowchart
-AWS Lambda: executes individual tasks
-AWS EventBridge: detects events and routes them to the right service/task based on rules
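A Step Functions workflow is defined in Amazon States Language (JSON). The sketch below builds a minimal two-task definition as a Python dict; the state names and Lambda ARNs are made-up placeholders:

```python
import json

# Minimal Amazon States Language (ASL) definition for a Step Functions
# workflow: two Lambda tasks run in sequence. State names and function
# ARNs are hypothetical placeholders.
definition = {
    "Comment": "Process an uploaded document, then notify",
    "StartAt": "ProcessDocument",
    "States": {
        "ProcessDocument": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:process-doc",
            "Next": "Notify",
        },
        "Notify": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:notify",
            "End": True,
        },
    },
}

# json.dumps(definition) is the string you would register with
# Step Functions (e.g. via create_state_machine).
print(list(definition["States"]))  # ['ProcessDocument', 'Notify']
```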
EventBridge vs. Lambda
eventbridge detects events then routes them to a task/service, Lambda takes these events and executes the task
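The division of labor shows up in a Lambda handler: EventBridge delivers an event envelope (`source`, `detail-type`, `detail` match EventBridge's documented event shape), and Lambda runs the task. The bucket/key payload below is hypothetical:

```python
# Sketch of a Lambda function invoked by an EventBridge rule.
# The envelope fields (source, detail-type, detail) follow EventBridge's
# event structure; the bucket/key payload is made up for illustration.

def lambda_handler(event, context):
    # EventBridge routed the event here; Lambda does the actual work.
    detail = event.get("detail", {})
    bucket = detail.get("bucket")
    key = detail.get("key")
    # ... real code would fetch s3://bucket/key and process it ...
    return {"status": "processed", "object": f"s3://{bucket}/{key}"}

sample_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {"bucket": "my-images", "key": "cat.png"},
}

print(lambda_handler(sample_event, None)["status"])  # processed
```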
example of AWS service stack for a computer vision agent
API Gateway - doc upload
EventBridge - detects the upload event and routes it
Lambda - runs the processing task
Rekognition - analyzes the image
instance recommendation for complex physics simulations
what types of applications?
p5 instances for power, speed, scalability
NVIDIA H100 GPUs in P5 instances are optimized for speed + precision
applications: robotics, AVs, fluid dynamics (think air or liquid analysis in automotive, healthcare, aerospace)
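As a toy illustration of the fluid-dynamics style workloads above (plain Python, no GPU; real P5 jobs run this kind of stencil math over billions of cells): one explicit time step of 1D heat diffusion:

```python
# Toy 1D heat-diffusion step (explicit finite differences) - the kind of
# math that, scaled up massively, drives demand for P5/H100 instances.

def diffuse_step(u, alpha=0.1):
    """One explicit time step; endpoints held fixed (boundary conditions)."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new

temps = [0.0, 0.0, 100.0, 0.0, 0.0]   # a hot spot in the middle
temps = diffuse_step(temps)
print(temps)  # heat spreads from the hot cell to its neighbors
```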
instances for training vs inference workloads
training: P5 (think power, precision, physics) on NVIDIA H100
inference: G5 or Inferentia-powered Inf1 instances (think G = guess = infer, G-rated = more mellow) // moderate power
distilling and pruning
both make models smaller/more compact with minimal quality loss, so they need less compute: distillation trains a small "student" model to mimic a large "teacher"; pruning removes low-importance weights/connections
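Magnitude pruning can be sketched in plain Python (toy example; real frameworks such as `torch.nn.utils.prune` apply the same idea to whole tensors): zero out the smallest-magnitude fraction of weights so the model is sparser and cheaper to run:

```python
# Toy magnitude pruning: zero out the smallest |weight| values.
# Real implementations prune whole tensors; distillation instead trains
# a smaller "student" model to mimic a larger "teacher".

def prune_weights(weights, fraction=0.5):
    """Zero out the given fraction of weights with smallest magnitude."""
    k = int(len(weights) * fraction)
    # indices of the k smallest-magnitude weights
    smallest = set(sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k])
    return [0.0 if i in smallest else w for i, w in enumerate(weights)]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.002]
print(prune_weights(w, fraction=0.5))  # [0.9, 0.0, 0.4, 0.0, -0.7, 0.0]
```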
common libraries for GenAI
PyTorch and TensorFlow - two of the most popular open-source machine learning (ML) frameworks. They give developers the libraries and building blocks to create, train, and deploy ML models
PyTorch/TensorFlow are the foundational engines/base layers that run the math and train the AI
Hugging Face/NVIDIA NeMo are higher-level libraries that use PyTorch/TensorFlow for the underlying math, then handle the other hard parts of AI (pretrained models, tokenizers, training recipes)
Elastic Fabric Adapter
increases speed (lower latency, higher throughput) for AI/ML/HPC applications by enabling high speed data transfers + fast communication across EC2 instances (allows them to talk to each other)
used for distributed model training (deep learning models), distributed ML frameworks (TensorFlow, PyTorch)
main aws services for GenAI
EC2 Instances with GPUs: For GenAI workloads, P3 or P4 EC2 instances with NVIDIA GPUs are commonly used for training deep learning models. Inf1 instances (based on AWS Inferentia chips) are designed for high-performance inference workloads.
Amazon SageMaker: A fully managed service for building, training, and deploying machine learning models, including GenAI models. SageMaker offers tools like SageMaker Studio, SageMaker Training, and SageMaker Pipelines for end-to-end ML workflows.
AWS Lambda: For serverless, event-driven inference of GenAI models in real time, especially for lightweight model deployment.
Amazon S3: Used for storing training data, model weights, and large datasets used in GenAI.
Amazon EFS or FSx: For distributed storage when handling large-scale datasets and models.
AWS Deep Learning AMIs: Pre-configured AMIs with popular deep learning frameworks like TensorFlow, PyTorch, and MXNet, making it easy to get started with GenAI models.
HPC
use of powerful computing systems to solve complex computational problems like scientific/climate/financial modeling
why GenAI startups would use image repository
container images (stored in a repository like Amazon ECR) let developers package their AI models, dependencies, and environments into containers, making it easy to deploy them consistently across different environments (development, testing, production)
when GenAI would use spot instance
non-critical deep learning workloads that are not time sensitive like batch training or data preprocessing
largest savings (up to 90% vs. on-demand)
when GenAI would use reserved instances
if company will have consistent, predictable workloads (like regular inference or long term production deployments), like app with AI models being offered via API 24/7
1- or 3-year commitment, up to ~75% savings
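The savings figures above are simple arithmetic (the $3.00/hour on-demand price is a made-up example; the discount rates come from the cards: up to ~90% for Spot, up to ~75% for Reserved):

```python
# Toy cost comparison using the discount rates from the cards above.
# The $3.00/hour on-demand price is hypothetical, not a real quote.

on_demand = 3.00                   # $/hour, hypothetical
spot = on_demand * (1 - 0.90)      # up to ~90% off
reserved = on_demand * (1 - 0.75)  # up to ~75% off

hours = 24 * 30                    # one month of 24/7 inference
print(f"on-demand: ${on_demand * hours:.2f}")  # $2160.00
print(f"reserved:  ${reserved * hours:.2f}")   # $540.00
print(f"spot:      ${spot * hours:.2f}")       # $216.00 (but interruptible)
```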
Latest AWS ML/AI Instances
Amazon EC2 P5, P5e and P5en instances, G5, P4, Trn1/Trn2 (Trainium), Inf1/Inf2 (Inferentia)
elastic fabric adapter
high performance networking layer for high speed data transfers between memory of EC2 instances (bypasses CPUs)
increases speed (lower latency, higher throughput) for AI/ML/HPC applications
used for (summary):
- parallel processing, machine learning, deep learning
used for (detailed)
- distributed model training (deep learning models, LLMs)
- running ML inference at scale by synchronizing model inference servers across instances
- distributed ML frameworks (tensorflow, pytorch, hugging face)
- parallel AI / multi-node workflows
Main EC2 Instances for machine learning
- P5 instances (a lot of customers use them to train): Intel Sapphire Rapids CPUs and NVIDIA H100 or H200 Tensor Core GPUs. For deep learning and EFA (Elastic Fabric Adapter) applications
- G5 instances: graphics intensive (i.e. computer vision) and ML inference. NVIDIA A10G tensor core GPUs
- Trn1 (AWS Trainium, for training) and Inf2 (AWS Inferentia2, for inference)
EC2 capacity blocks for ML and which are supported?
reserve GPU instances for a future date to run your machine learning (ML) workloads – think pre-order / upfront payment to secure the capacity
Capacity Blocks supports Amazon EC2 P5en, P5e, P5, and P4d, Trn2 and Trn1 instances
*think Revolve pre-order - guarantee order at certain date