GenAI Flashcards

1
Q

which EC2 instances for computer vision?

A

G5

high GPU power from NVIDIA A10G Tensor Core GPUs, low latency, high bandwidth

can process lots of real-time imaging data

1
Q

computer vision

what is it and what AWS services are relevant

A

allows computers to interpret and analyze visual data

captures images, algorithms process them, makes decisions like identifying objects, classifying images, tracking movements

Amazon Rekognition: For image and video analysis.

SageMaker: To train custom vision models.

EC2 G5 Instances: Ideal for GPU-intensive computer vision tasks.
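A minimal sketch of what the Rekognition piece might look like in code: building the parameters for the `detect_labels` image-analysis call. The bucket and file names are made up for illustration.

```python
# Hedged sketch: assemble a request for Amazon Rekognition's detect_labels API.
# Bucket/object names here are hypothetical.

def build_detect_labels_request(bucket: str, key: str,
                                max_labels: int = 10,
                                min_confidence: float = 80.0) -> dict:
    """Parameters for rekognition.detect_labels(): analyze an image in S3."""
    return {
        "Image": {"S3Object": {"Bucket": bucket, "Name": key}},
        "MaxLabels": max_labels,
        "MinConfidence": min_confidence,
    }

# With AWS credentials configured, the actual call would be:
#   import boto3
#   client = boto3.client("rekognition")
#   response = client.detect_labels(**build_detect_labels_request("my-bucket", "frame.jpg"))

params = build_detect_labels_request("my-bucket", "frame.jpg")
```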

2
Q

difference between traditional genAI and agentic AI

A

genAI responds to inputs to provide content generation – reactive but no actions without direct user interaction, no memory or goal setting

agentic AI - extends GenAI by combining content generation with autonomy, can initiate and complete tasks, make decisions, interact with systems without human input

agentic AI can automate end to end processes

3
Q

task & workflow orchestration

relevant AWS services

A

task orchestration – manages how tasks are triggered, executed, connected

workflow orchestration – organizes + manages the sequence of tasks like a roadmap

services:
- Step Functions: orchestrates workflows by connecting tasks in a flowchart
- AWS Lambda: executes individual tasks
- Amazon EventBridge: detects events, decides which service/task is best for the job, and routes it there
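The "flowchart" idea is literal: Step Functions workflows are defined in Amazon States Language (JSON). A hedged sketch of a two-step state machine chaining Lambda tasks; the task names and function ARNs are placeholders.

```python
# Hedged sketch: a Step Functions state machine (Amazon States Language)
# that chains two Lambda tasks. ARNs and state names are illustrative.

state_machine = {
    "Comment": "Flowchart: extract data, then classify the result",
    "StartAt": "ExtractData",
    "States": {
        "ExtractData": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:extract-data",
            "Next": "ClassifyResult",   # connects this task to the next one
        },
        "ClassifyResult": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:classify-result",
            "End": True,                # terminal state of the workflow
        },
    },
}

# With credentials, one would register it roughly like:
#   import boto3, json
#   sfn = boto3.client("stepfunctions")
#   sfn.create_state_machine(name="demo", definition=json.dumps(state_machine),
#                            roleArn="arn:aws:iam::123456789012:role/StepFunctionsRole")
```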

4
Q

EventBridge vs. Lambda

A

EventBridge detects events and routes them to a task/service; Lambda takes those events and executes the task
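The split can be sketched as: an EventBridge rule is an event *pattern* that matches and routes, while the Lambda target is a *handler* that does the work. Bucket and key names below are illustrative.

```python
# Hedged sketch of the division of labor. Names are made up for illustration.

# EventBridge side: a rule's event pattern, matching S3 object-created events
# for one bucket. EventBridge evaluates this and routes matching events.
event_pattern = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
    "detail": {"bucket": {"name": ["uploads-bucket"]}},
}

# Lambda side: the target that receives the routed event and executes the task.
def handler(event: dict, context=None) -> dict:
    key = event["detail"]["object"]["key"]
    # ... real processing of the uploaded object would go here ...
    return {"status": "processed", "key": key}

sample_event = {"detail": {"object": {"key": "report.pdf"}}}
result = handler(sample_event)
```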

5
Q

example of AWS service stack for a computer vision agent

A

API Gateway - doc upload
EventBridge - detects the upload event, routes it
Lambda - runs the processing task
Rekognition (or a SageMaker model on G5) - analyzes the image
S3 - stores inputs + results

6
Q

instance recommendation for complex physics simulations

what types of applications?

A

P5 instances for power, speed, scalability

H100 GPUs in P5 instances explicitly optimized for speed + precision

applications: robotics, AVs, fluid dynamics (think air or liquid analysis in automotive, healthcare, aerospace)

7
Q

instances for training vs inference workloads

A

training: P5 (think power, precision, physics) on NVIDIA H100

inference: G5 or Inferentia-powered Inf1 instances (think G = guess = infer, G-rated = more mellow) // moderate power

8
Q

distilling and pruning

A

enable you to make models smaller/more compact without sacrificing quality (this means it can use less compute resources)
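The pruning half of this card can be sketched in plain Python, no ML framework needed: zero out the smallest-magnitude weights so the model stores and computes less. (Distillation is different: a small "student" model is trained to mimic a large "teacher".) The function and numbers below are illustrative.

```python
# Hedged sketch of magnitude pruning on a flat list of weights.

def magnitude_prune(weights: list[float], sparsity: float) -> list[float]:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Threshold = magnitude of the n_prune-th smallest weight.
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    # Zeroed weights need no storage in sparse formats and skip multiplications.
    return [0.0 if abs(w) <= threshold else w for w in weights]

pruned = magnitude_prune([0.9, -0.05, 0.4, 0.01, -0.7, 0.02], sparsity=0.5)
# The three smallest-magnitude weights (0.01, 0.02, -0.05) become 0.0
```

Real frameworks (e.g. PyTorch's pruning utilities) apply the same idea per layer, often followed by fine-tuning to recover any lost accuracy.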

9
Q

common libraries for GenAI

A

PyTorch and TensorFlow - two of the most popular open-source machine learning (ML) frameworks. These tools provide developers with the libraries, tools, and building blocks to create, train, and deploy machine learning models

PyTorch/TensorFlow are the foundational engines/base layers that run the math and train the AI

Hugging Face/NeMo are the apps that use PyTorch/TensorFlow to do the calculations, then handle the other hard parts of AI (pretrained models, tokenizers, training recipes)

10
Q

Elastic Fabric Adapter

A

increases speed (lower latency, higher throughput) for AI/ML/HPC applications by enabling high speed data transfers + fast communication across EC2 instances (allows them to talk to each other)

used for distributed model training (deep learning models), distributed ML frameworks (TensorFlow, PyTorch)

11
Q

main aws services for GenAI

A

EC2 Instances with GPUs: For GenAI workloads, P3 or P4 EC2 instances with NVIDIA GPUs are commonly used for training deep learning models. Inf1 instances (based on AWS Inferentia chips) are designed for high-performance inference workloads.

Amazon SageMaker: A fully managed service for building, training, and deploying machine learning models, including GenAI models. SageMaker offers tools like SageMaker Studio, SageMaker Training, and SageMaker Pipelines for end-to-end ML workflows.

AWS Lambda: For serverless, event-driven inference of GenAI models in real time, especially for lightweight model deployment.

Amazon S3: Used for storing training data, model weights, and large datasets used in GenAI.

Amazon EFS or FSx: For distributed storage when handling large-scale datasets and models.

AWS Deep Learning AMIs: Pre-configured AMIs with popular deep learning frameworks like TensorFlow, PyTorch, and MXNet, making it easy to get started with GenAI models.
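A hedged sketch tying several of the services above together: the request parameters for SageMaker's `create_training_job` (S3 for data and artifacts, a GPU instance for training). All names, ARNs, and the container image URI are placeholders.

```python
# Hedged sketch: create_training_job parameters. Everything identifiable
# (job name, ARNs, bucket paths, image URI) is hypothetical.

training_job = {
    "TrainingJobName": "genai-demo-job",
    "AlgorithmSpecification": {
        # Training container image (e.g. a Deep Learning Container)
        "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/train:latest",
        "TrainingInputMode": "File",
    },
    "RoleArn": "arn:aws:iam::123456789012:role/SageMakerRole",
    "InputDataConfig": [{
        "ChannelName": "train",
        "DataSource": {"S3DataSource": {
            "S3DataType": "S3Prefix",
            "S3Uri": "s3://my-bucket/training-data/",   # training data lives in S3
        }},
    }],
    "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/model-artifacts/"},
    "ResourceConfig": {
        "InstanceType": "ml.p4d.24xlarge",  # GPU instance for deep learning training
        "InstanceCount": 1,
        "VolumeSizeInGB": 100,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}

# With credentials: boto3.client("sagemaker").create_training_job(**training_job)
```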

12
Q

HPC

A

use of powerful computing systems to solve complex computational problems like scientific/climate/financial modeling

13
Q

why GenAI startups would use image repository

A

images allow developers to package their AI models, dependencies, and environments into containers, making it easy to deploy them consistently across different environments (development, testing, production).

14
Q

when GenAI would use spot instance

A

non-critical deep learning workloads that are not time sensitive like batch training or data preprocessing

most significant savings (up to 90% vs. On-Demand)
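A hedged sketch of what requesting a Spot-priced GPU instance looks like: the `run_instances` parameters with `InstanceMarketOptions` set to spot. The AMI ID is a placeholder.

```python
# Hedged sketch: EC2 run_instances parameters for an interruptible Spot
# GPU instance, suited to batch training. AMI ID is hypothetical.

spot_request = {
    "ImageId": "ami-0123456789abcdef0",  # placeholder deep learning AMI
    "InstanceType": "g5.xlarge",
    "MinCount": 1,
    "MaxCount": 1,
    "InstanceMarketOptions": {
        "MarketType": "spot",
        "SpotOptions": {
            "SpotInstanceType": "one-time",
            # The workload must tolerate interruption (hence: non-time-sensitive
            # batch training or preprocessing, with checkpointing).
            "InstanceInterruptionBehavior": "terminate",
        },
    },
}

# With credentials: boto3.client("ec2").run_instances(**spot_request)
```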

15
Q

when GenAI would use reserved instances

A

if company will have consistent, predictable workloads (like regular inference or long term production deployments), like app with AI models being offered via API 24/7

1 or 3 year commitment, up to ~75% savings

16
Q

Latest AWS ML/AI Instances

A

Amazon EC2 P5, P5e, and P5en instances; G5; P4; Trn1/Trn2; Inf1/Inf2

17
Q

elastic fabric adapter

A

high performance networking layer for high speed data transfers between memory of EC2 instances (bypasses CPUs)

increases speed (lower latency, higher throughput) for AI/ML/HPC applications

used for (summary):
- parallel processing, machine learning, deep learning

used for (detailed)
- distributed model training (deep learning models, LLMs)
- running ML inference at scale by synchronizing model inference servers across instance
- distributed ML frameworks (tensorflow, pytorch, hugging face)
- parallel AI / multi-node workflows

18
Q

Main EC2 Instances for machine learning

A
  • P5 instances (lots of customers use to train): Intel Sapphire Rapids CPUs and NVIDIA H100 or H200 Tensor Core GPUs. For deep learning and EFA (Elastic Fabric Adapter) applications
  • G5 instances: graphics-intensive workloads (e.g. computer vision) and ML inference. NVIDIA A10G Tensor Core GPUs
  • Trn1 and Inf2
19
Q

EC2 capacity blocks for ML and which are supported?

A

reserve GPU instances for a future date to run your machine learning (ML) workloads – think pre-order / upfront payment to secure the capacity

Capacity Blocks supports Amazon EC2 P5en, P5e, P5, P4d, Trn2, and Trn1 instances

*think Revolve pre-order - guarantee order at certain date
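A hedged sketch of the "pre-order" flow with boto3: first query available Capacity Block offerings, then purchase one. The instance count, duration, and size below are illustrative.

```python
# Hedged sketch: browsing EC2 Capacity Block offerings (the pre-order step).
# Counts, duration, and instance size are illustrative.

offering_query = {
    "InstanceType": "p5.48xlarge",   # Capacity Blocks cover P- and Trn-family instances
    "InstanceCount": 4,              # how many instances to reserve together
    "CapacityDurationHours": 24,     # how long the reserved block should last
}

# With credentials, the flow would be roughly:
#   ec2 = boto3.client("ec2")
#   offerings = ec2.describe_capacity_block_offerings(**offering_query)
#   ec2.purchase_capacity_block(
#       CapacityBlockOfferingId=offerings["CapacityBlockOfferings"][0]["CapacityBlockOfferingId"],
#       InstancePlatform="Linux/UNIX",
#   )
```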