Section 3: Intro to AWS and Cloud Computing Flashcards

1
Q

What are the 6 advantages of Cloud Computing?

A
  1. Trade CAPEX for OPEX
  2. Massive Economies of Scale
  3. Stop Guessing Capacity
  4. Increase Speed and Agility
  5. Stop spending money running and maintaining DC
  6. Go Global in minutes.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Types of Cloud Computing

A

IaaS
SaaS
PaaS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

You Manage in IaaS

A

Applications
Data
Runtime
Middleware
OS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

You Manage by in PaaS

A

Applications
Data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

You manage in SaaS

A

Nothing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Pricing Models

A

Compute - Pay for compute time.

Storage - Pay for data stored in the cloud

Data transfer OUT of the cloud

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

What is Gen AI?

A

Creating NEW data based on prompts.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Name of ChatGPT Foundation Model?

A

GPT-4o

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

LLM

A

Large Language Model

Designed to create coherent, human-like, text.

ChatGPT is example.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

What is needed to use an LLM?

A

Prompt

“what is AWS”

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

AWS’s Gen AI Tool

A

Bedrock

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Amazon Titan

A

High Performing FM from AWS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

What is Fine-Tuning?

A

When you adapt a copy of a foundation model with your own data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Where do you add data for fine tuning?

A

S3 Bucket

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Instruction based fine tuning uses what?

A

labeled examples that are prompt-response pairs.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Single Turn Messaging

A

Part of instruction based fine tuning, to determine how a chatbot should reply.

  1. System
  2. Messages
  3. role
  4. Content
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Multi-Turn messaing

A

Chatbot conversation. How to handle them.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Transfer Learning

A

The broader concept of re-using a pre-trained model to adapt it to a new related task.

Widely used for image classification

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Use Cases for Fine Tuning

A
  1. A chatbot
  2. Training using up to date information
  3. training with exclusive data
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

ROUGE

A

Recall-Oriented Understudy for Gisting Evaluation

Evaluating automatic summarization and machine translation systems.

ROUGE-N = Measure the number of matching n-grams between reference and generated text.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

ROUGE-N

A

Measure the number of matching n-grams between reference and generated text

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

ROUGE-L

A

Longest common subsequence between a reference and generated text.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

BLEU

A

Bilingual Evaluation Understudy

-Evaluate translation text. Considers precision and brevity.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

BERTScore

A

Semantic similarity between generated texts

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Perplexity

A

How well the model predicts the next token (lower is better).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

ARPU

A

Average Revenue per User

Business Metric to evaluate a model

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

RAG

A

Retrieval-Augmented Generation

Allows a FM to reference a data source outside of its training data.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

RAG Vector Databases

A
  1. Amazon OpenSearch Service - search and analytics database.
  2. Amazon DocumentDB [MongoDB compatibility] - NoSQL database.
  3. Aurora - ralational DB, proprietary on AWS
  4. Amazon RDS for PostgreSQl - relational DB, open source
  5. Amazon Neptune - graph database
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

RAG Data Sources

A
  1. S3
  2. Confluence
  3. SharePoint
  4. Salesforce
  5. Web pages
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

tokenization

A

converting raw text into a sequence of tokens

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

types of tokenization

A

Word-based - text is split into individual words

Subword - some words can be split too (un-help-ful)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

Context Window

A

the number of tokens an LLM can consider when generating text.

The larger the context window, the more information.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

What is the first factor to consider when looking at a model?

A

Context Window

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

Embeddings

A

Create vectors out of text, images, or audio.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

Vector

A

Array of numerical values. So each word as some/many numerical values.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

What can really power search applications?

A

Embedding models

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

Guardrails

A

Control the interaction between users and FM in Bedrock.

Filter out harmful and undesirable content.

Can remove PII, enhance privacy

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

Agents

A

manages and carry out various multi-step tasks related to infrastructure provisioning, application deployment, and operational activities.

Think like a chatbot agent

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

Model Invocation Logging?

How?

A

Sending logs of all invocations to Amazon CloudWatch and S3

  • AWS Cloudwatch
40
Q

Bedrock Studio

A

Gives access to Amazon Bedrock to your team so they can easily create AI-powered applications.

41
Q

Watermark Detection

A

Check if an image was generated by Amazon Titan Generator

42
Q

Bedrock Pricing Models

A
  1. On-Demand, Text and Embedding are per token, image is per generated image.
  2. Batch - multiple predictions at a time, discounts up to 50%.
  3. Provisioned - purchase model unites for a certain time.
43
Q

Model Improvement Cost order

A

$ - Prompt Engineering
$$ - Retrieval Augmented Generation (RAG)
$$$ - Instruction-based Fine-tuning
$$$$ - Domain Adaption fine tuning

44
Q

What type of Gen AI can recognize and interpret various forms of input data, such as text, images, and audio?

A

Multimodal model

45
Q

Which AWS service can help store embeddings within vector databases?

A

Amazon OpenSearch Serverless

46
Q

Prompt Engineering

A

Developing, designing, and optimizing prompts to enhance the output of FMs for your needs.

47
Q

Improved Prompting Consists of?

A
  1. Instructions - a task for the model to do.
  2. Context - external information to guide the model
  3. Input Data - the input for which you want a response.
  4. Output Indicator - the output type or format.
48
Q

Negative Prompting

A

A technique where you explicitly instruct the model on what NOT to include or do in its response.

49
Q

Prompt Performance - System Prompts

A

How the model should behave and reply.

50
Q

Prompt Performance - Temperature

A

Value: 0-1

Creativity of the model’s output.

Low Value - more conservative

High Value - more diverse, less predictable, less coherent.

51
Q

Prompt Performance - Top P

A

Value 0-1

Low P - Consider the 25% most likely words, more coherent.

High P - Consider a broad range of possible words.

52
Q

Prompt Performance - Top K

A

Limits the number of probable words.

Low K - more coherent, less probable words.

High K - more probable words, more diverse

53
Q

Prompt Performance - Length

A

Maximum Length of the answer

54
Q

Prompt Performance - Stop Sequence

A

Tokens that signal the model to stop generating output.

55
Q

Prompt Latency

A

How fast the model responds.

Impacted by model size, model type, number of token in input, number of tokens in output.

Not impacted by Top P, Top K, Temperature!!

56
Q

Zero Shot Prompting

A

Present a task to the model without providing examples or explicit training for that specific task.

57
Q

Few Shots Prompting

A

Provide examples of a task to the model to guides its output.

58
Q

Chain of Thought Prompting

A

Divide the task into a sequence of reasoning steps, leading to more structure and coherence.

Think ‘Step by Step”

59
Q

How to simplify and standardize the process of generating prompts?

A

Prompt Templates

60
Q

AWS’s Solution for a fully managed Gen AI based on your company’s knowledge and data?

A

Amazon Q Business

61
Q

What is Amazon Q Built on?

Which FM?

A

Built on Amazon Bedrock

Can’t choose the FM, it consists of a few.

62
Q

What benefit is there by having Amazon Q + IAM Identity Center?

A

Users receive responses generated only from the documents they have access to.

63
Q

Amazon Q Business - Admin Controls

A

Controls and customize responses to your organizational needs.

Admin Controls = Gaurdrails

64
Q

Q Apps

A

Part of Amazon Q Business

Create Gen AI powered apps without coding by using natural language.

65
Q

Functions of Amazon Q Developer

A
  1. Answer questions about the AWS documentation and AWS Service selection.
  2. Answer questions about resources in your AWS account.
  3. Suggest CLI to run to make changes to your account.
  4. Helps you do bill analysis, resolve errors, troubleshooting
  5. AI Code companion
66
Q

QuickSight

A

Used to visualize your data and create dashboards about them.

67
Q

Amazon Q for EC2

A

EC2 - instances are virtual servers.

Amazon Q for EC2 - provides guidance and suggestions for EC2 instance types that are best suited to your new workload.

68
Q

Amazon Q for Glue

A

Glue - is an ETL (Extract Transform and Load) service used to move data across places.

69
Q

PartyRock

A

GenAI app-building playground (powered by Bedrock)

70
Q

What is AWS Q Developer?

A

An AI Coding assistant

71
Q

AI Components

A
  1. Data Layer - where you collect vast amount of data.
  2. ML Framework & Algorithm Layer
  3. Model Layer - implement a model and train it.
72
Q

What is ML?

A

Machine Learning

Type of AI for building methods that allow machines to learn.

Data is what is leveraged.

Great for making predictions.

73
Q

What is Deep learning?

A

Subset of Machine Learning.

Uses neurons and synapses like our brain, to train models.

Process more complex patterns in the data than traditional ML.

74
Q

Why DEEP learning?

A

Deep because there’s more than one layer of learning.

75
Q

Computer Vision

A

Part of Deep Learning

Image classification, object detection, and image segmentation.

76
Q

NLP

A

Natural Language Processing

Part of Deep Learning

test classification, sentiment analysis, machine translation, language generation.

77
Q

Transformer Model

A

Able to process a sentence as a whole instead of word by word.

78
Q

Diffusion Model

A

Adding or subtracting noise from an image.

79
Q

Multi-Modal Models

A

Multiple types of inputs, and can create multiple types of outputs.

80
Q

GPT

A

GENERATIVE PRE-TRAINED TRANSFORMER

Generate human text or computer code based on input prompts.

81
Q

BERT

A

BIDIRECTIONAL ENCODER REPRESENTATIONS FROM TRANSFORMERS

Similar intent to GPT, but reads the text in two directions.

82
Q

RNN

A

RECURRENT NEURAL NETWORK

Meant for sequential data such as time-series or text, useful in speech recognition, time-series prediction.

83
Q

ResNet

A

RESIDUAL NETWORK

Deep Convolutional Neural Network (CDN) used for image recognition tasks, objects detection, facial recognition.

84
Q

SVM

A

SUPPORT VECTOR MACHINE

ML algorithm for classification and regression.

85
Q

WaveNet

A

model to generate raw audio waveform, used in speech synthesis.

86
Q

GAN

A

GENERATIVE ADVERSARIAL NETWORK

Models used to gnerate synthetic data such as images, videos, or sounds that resemble the training data.

Helpful for data augmentation.

87
Q

XGBoost

A

EXTREME GRADIENT BOOSTING

An implementation of gradient boosting.

88
Q

Labeled vs Unlabeled Data

A

Labeled - includes both input features and corresponding output labels.

Unlabeled - Data that includes only input features without any output labels.

89
Q

Structured Data vs Unstructured

A

Structured - Put into rows and columns (like excel)

Unstrcuted - no rhyme or reason.

90
Q

Tabular Data

A

Data that is arranged in a table with rows. Structued Data.

91
Q

Time Series Data

A

Structured Data

Data points collected or recorded at successive points in time.

92
Q

Articles, Customer Reviews, and Social Media posts are what kind of data?

A

Unstructured Data

93
Q

Supervised Learning - Regrssion

A

Used to predict a numeric value based on input data.

Output variable is CONTINUOUS

94
Q

Supervised Learning - Classification

A

Used to predict the categorical label of input data.

Output variable is DISCRETE

95
Q

Validation Set

A

Used to tune model parameters and validate performance.

96
Q

Feature Engineering

A

Process of using domain knowledge to select and transform raw data into meaningful features.