1 - Fundamentals of AI/ML Flashcards

1
Q

____ data is a dataset where each instance or example is accompanied by a label or target variable that represents the desired output or classification.

A

Labeled

2
Q

____ data is a dataset where the instances or examples do not have any associated labels or target variables. The data consists only of input features, without any corresponding output or classification.

A

Unlabeled
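
The difference between labeled and unlabeled data in cards 1 and 2 can be sketched with a tiny example. The feature values and label names below are made up for illustration and are not part of the deck.

```python
# Hypothetical toy data: each instance is a list of input features.
unlabeled = [
    [5.1, 3.5],
    [6.7, 3.0],
]

# Labeled data pairs each instance with a target label (the desired output).
labeled = [
    ([5.1, 3.5], "setosa"),
    ([6.7, 3.0], "virginica"),
]

# Dropping the labels from a labeled dataset leaves only the input features.
features_only = [features for features, _label in labeled]
print(features_only == unlabeled)
```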

3
Q

____ data refers to data that is organized and formatted in a predefined manner, typically in the form of tables or databases with rows and columns. This type of data is suitable for traditional machine learning algorithms that require well-defined features and labels.

A

Structured

4
Q

What type of structured data includes data stored in spreadsheets, databases, or CSV files, with rows representing instances and columns representing features or attributes?

A

Tabular data

5
Q

What type of structured data consists of sequences of values measured at successive points in time, such as stock prices, sensor readings, or weather data?

A

Time-series data

6
Q

____ data is data that lacks a predefined structure or format, such as text, images, audio, and video. This type of data requires more advanced machine learning techniques to extract meaningful patterns and insights.

A

Unstructured

7
Q

What type of unstructured data includes documents, articles, social media posts, and other textual data?

A

text data

8
Q

What type of unstructured data includes digital images, photographs, and video frames?

A

image data

9
Q

The ML learning process is traditionally divided into what three broad categories?

A

supervised learning, unsupervised learning, and reinforcement learning

10
Q

In ____ learning, the algorithms are trained on labeled data. The goal is to learn a mapping function that can predict the output for new, unseen input data.

A

supervised
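
A minimal sketch of the idea in this card, assuming a 1-nearest-neighbor classifier as the learning algorithm (the deck does not name one). The training examples and labels are hypothetical.

```python
import math

# Hypothetical labeled training data: (height_cm, weight_kg) -> label.
training_data = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
]

def predict(features):
    """Return the label of the closest training example (1-nearest neighbor)."""
    nearest = min(training_data, key=lambda pair: math.dist(pair[0], features))
    return nearest[1]

# Predict labels for new inputs that were not in the training data.
print(predict((152.0, 52.0)))
print(predict((182.0, 88.0)))
```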

11
Q

____ learning refers to algorithms that learn from unlabeled data. The goal is to discover inherent patterns, structures, or relationships within the input data.

A

Unsupervised
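
As one possible illustration, the sketch below groups unlabeled numbers with a simple one-dimensional k-means loop. K-means is an assumed example of an unsupervised algorithm, and the values and starting centroids are made up.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Group 1-D values around centroids; no labels are used at any point."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # Assign each value to its nearest centroid.
            idx = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[idx].append(v)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids, clusters = kmeans_1d(values, centroids=[0.0, 5.0])
print(clusters)
```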

12
Q

In ____ learning, the machine is given only a performance score as guidance. Feedback is provided in the form of rewards or penalties for its actions, and the machine learns from this feedback to improve its decision-making over time.

A

reinforcement
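
The reward-feedback loop in this card can be sketched as follows. The environment, the action names, and the fixed exploration schedule are all assumptions chosen to keep the example deterministic; practical agents usually explore at random.

```python
# Hypothetical environment: each action always returns the same reward.
REWARDS = {"left": 0.0, "stay": 0.2, "right": 1.0}
ACTIONS = list(REWARDS)

def train(steps=30, explore_every=5, learning_rate=0.5):
    """Learn a value estimate per action from reward feedback alone."""
    value = {action: 0.0 for action in ACTIONS}
    explore_count = 0
    for step in range(steps):
        if step % explore_every == 0:
            # Explore: try the next action in turn, regardless of its estimate.
            action = ACTIONS[explore_count % len(ACTIONS)]
            explore_count += 1
        else:
            # Exploit: pick the action with the highest estimate so far.
            action = max(ACTIONS, key=value.get)
        reward = REWARDS[action]
        # Nudge the estimate toward the reward that was just observed.
        value[action] += learning_rate * (reward - value[action])
    return value

value = train()
best_action = max(value, key=value.get)
print(best_action)
```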

13
Q

After the model has been trained, it is time to begin the process of using the information that a model has learned to make predictions or decisions. This is called ____.

A

inferencing

14
Q

What are the two main types of inferencing in machine learning?

A

batch / real-time

15
Q

____ inferencing is when the computer takes a large amount of data, such as images or text, and analyzes it all at once to provide a set of results.

A

Batch

16
Q

Which type of inferencing in machine learning is often used for tasks like data analysis, where the speed of the decision-making process is not as crucial as the accuracy of the results?

A

Batch

17
Q

____ inferencing is when the computer has to make decisions quickly, in response to new information as it comes in.

A

Real-time

18
Q

Which type of inferencing in machine learning is important for applications where immediate decision-making is critical, such as chatbots or self-driving cars? The computer has to process the incoming data and make a decision almost instantaneously, without taking the time to analyze a large dataset.

A

Real-time
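
The contrast between batch and real-time inferencing in cards 14 to 18 might look like this in code. The `model` function is a hypothetical stand-in for a trained model, not a real API.

```python
def model(text):
    """Hypothetical stand-in for a trained model that routes support messages."""
    return "billing" if "refund" in text.lower() else "general"

def batch_inference(records):
    """Score a whole stored dataset in one pass and return all results together."""
    return [model(record) for record in records]

def real_time_inference(stream):
    """Yield one prediction per incoming item as soon as it arrives."""
    for item in stream:
        yield model(item)

stored = ["Where is my refund?", "Hello there", "Refund please"]
batch_results = batch_inference(stored)

# A real-time caller gets an answer for each message without waiting for the rest.
first_live = next(real_time_inference(iter(["I need a refund now"])))
print(batch_results, first_live)
```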

19
Q

At the core of deep learning are neural networks, which have lots of tiny units called ____ that are connected together.

A

nodes

20
Q

The nodes in neural networks are organized into a ____ layer, one or more ____ layers, and an ____ layer.

A

input / hidden / output
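
A rough sketch of data flowing through an input layer, one hidden layer, and an output layer. The weights, biases, and the sigmoid activation are assumptions for illustration; a real network learns its weights during training.

```python
import math

def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """Each node sums its weighted inputs, adds a bias, and applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

# Hypothetical fixed weights: 2 input values -> 2 hidden nodes -> 1 output node.
hidden_weights = [[0.5, -0.2], [0.3, 0.8]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

inputs = [1.0, 2.0]                                      # input layer
hidden = layer(inputs, hidden_weights, hidden_biases)    # hidden layer
output = layer(hidden, output_weights, output_biases)    # output layer
print(output)
```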

21
Q

When we show a neural network many examples, like data about customers who bought certain products or used certain services, it figures out how to ____ by adjusting the connections between its nodes.

A

identify patterns

22
Q

True/False: When a neural network learns to recognize patterns from examples, it can then look at data for completely new customers that it has never seen before and still make predictions about what they might buy or how they might behave.

A

true

23
Q

____ is a field of artificial intelligence that makes it possible for computers to interpret and understand digital images and videos.

A

Computer Vision

24
Q

____ is a branch of artificial intelligence that deals with the interaction between computers and human languages.

A

Natural language processing (NLP)

25
Q

Generative AI is powered by models that are pretrained on internet-scale data, and these models are called ____ models.

A

foundation

26
Q

With ____ models, instead of gathering labeled data for each model and training multiple models as in traditional ML, you can adapt a single model to perform multiple tasks.

A

foundation

27
Q

____ models perform tasks including text generation, text summarization, information extraction, image generation, chatbot interactions, and question answering.

A

Foundation

28
Q

The foundation model ____ is a comprehensive process that involves several stages, each playing a crucial role in developing and deploying effective and reliable foundation models.

A

lifecycle

29
Q

True/False: Foundation models require no training.

A

False: FMs require training on massive datasets from diverse sources.

30
Q

____ data can be used at scale for pre-training because it is much easier to obtain compared to ____ data.

A

Unlabeled / labeled

31
Q

____ data includes raw data, such as images, text files, or videos, with no meaningful informative labels to provide context.

A

Unlabeled

32
Q

Although traditional ML models rely on supervised, unsupervised, or reinforcement learning patterns, ____ are typically pre-trained through self-supervised learning.

A

foundation models

33
Q

____ learning makes use of the structure within the data to autogenerate labels.

A

Self-supervised

34
Q

During the initial ____ stage, the FM’s algorithm can learn the meaning, context, and relationship of the words in the datasets. For example, the model might learn whether "drink" is the noun (a beverage) or the verb (swallowing a liquid).

A

pre-training

35
Q

After the initial pre-training, the foundation model can be further pre-trained on additional data. This is known as ____ pre-training.

A

continuous

36
Q

Pre-trained language models can be ____ through techniques like prompt engineering, retrieval-augmented generation (RAG), and fine-tuning on task-specific data.

A

optimized

37
Q

Whether or not you fine-tune a model or use a pre-trained model off the shelf, the next logical step is to ____ the model.

A

evaluate

38
Q

When the foundation model meets the desired performance criteria, it can be ____ in the target production environment.

A

deployed

39
Q

After ____, the model’s performance is continuously monitored, and feedback is collected from users, domain experts, or other stakeholders. This feedback, along with model monitoring data, is used to identify areas for improvement, detect potential biases or drift, and inform future iterations of the model. The feedback loop permits continuous enhancement of the foundation model through fine-tuning, continuous pre-training, or re-training, as needed.

A

deployment

40
Q

____ are powerful models that can understand and generate human-like text. They are trained on vast amounts of text data from the internet, books, and other sources, and learn patterns and relationships between words and phrases.

A

large language models

41
Q

____ are the basic units of text that the model processes. They can be words, phrases, or individual characters like a period. They also provide standardization of input data, which makes it easier for the model to process.

A

Tokens

42
Q

____ are numerical representations of tokens, where each token is assigned a vector (a list of numbers) that captures its meaning and relationships with other tokens.

A

Embeddings
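
Cards 41 and 42 can be illustrated with a toy tokenizer and a hand-written embedding table. Both are assumptions for illustration: production tokenizers typically split text into subword pieces, and real embeddings are learned vectors with hundreds or thousands of dimensions.

```python
import math

def tokenize(text):
    """Toy tokenizer: lowercase the text and split the period into its own token."""
    return text.lower().replace(".", " .").split()

# Hypothetical 3-dimensional embeddings, written by hand for this example.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.0],
    "car": [0.0, 0.1, 0.9],
}

def cosine_similarity(a, b):
    """Closer to 1.0 means the two vectors point in a more similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

tokens = tokenize("The cat sat.")
print(tokens)

# Related tokens sit closer together in the embedding space than unrelated ones.
cat_dog = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["dog"])
cat_car = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["car"])
print(cat_dog > cat_car)
```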

43
Q

____ use tokens, embeddings, and vectors to understand and generate text.

A

large language models

44
Q

____ is a deep learning architecture that starts with pure noise or random data. Models of this type gradually add more and more meaningful information to this noise until they end up with a clear and coherent output, like an image or a piece of text.

A

Diffusion

45
Q

____ models learn through a two-step process of forward diffusion and reverse diffusion.

A

Diffusion

46
Q

Using ____ diffusion, the system gradually introduces a small amount of noise to an input image until only the noise is left over.

A

forward

47
Q

In ____ diffusion, the noisy image is gradually denoised until a new image is generated.

A

reverse
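
A minimal sketch of the forward half of the process from card 46, assuming a constant noise rate `beta` and Gaussian noise (neither is specified in the deck). Reverse diffusion needs a trained denoising network, so it is not shown.

```python
import math
import random

def forward_diffusion(pixels, steps=50, beta=0.1, seed=0):
    """Repeatedly mix a small amount of Gaussian noise into the pixel values."""
    rng = random.Random(seed)
    signal_scale = 1.0  # how much of the original image is still present
    for _ in range(steps):
        pixels = [
            math.sqrt(1 - beta) * p + math.sqrt(beta) * rng.gauss(0, 1)
            for p in pixels
        ]
        signal_scale *= math.sqrt(1 - beta)
    return pixels, signal_scale

image = [1.0, 0.5, 0.0, 0.5]  # hypothetical 4-pixel "image"
noisy, signal_scale = forward_diffusion(image)

# After enough steps, almost none of the original signal remains.
print(signal_scale)
```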

48
Q

Instead of just relying on a single type of input or output, like text or images, ____ models can process and generate multiple modes of data simultaneously.

A

multimodal

49
Q

____ models can be used for automating video captioning, creating graphics from text instructions, answering questions more intelligently by combining text and visual info, and even translating content while keeping relevant visuals.

A

Multimodal

50
Q

____ are a type of generative model that involves two neural networks competing against each other in a zero-sum game framework. The two networks are generator and discriminator.

A

Generative adversarial networks (GANs)

51
Q

A ____ neural network generates new synthetic data (for example, images, text, or audio) by taking random noise as input and transforming it into data that resembles the training data distribution.

A

generator

52
Q

A ____ neural network takes real data from the training set and synthetic data generated by the generator as input. Its goal is to distinguish between the real and generated data.

A

discriminator

53
Q

During GAN training, the ____ tries to generate data that can fool the ____ into thinking it’s real, while the discriminator tries to correctly classify the real and generated data. This adversarial process continues until the generator produces data that is indistinguishable from the real data.

A

generator / discriminator

54
Q

____ are a type of generative model that combines ideas from autoencoders (a type of neural network) and variational inference (a technique from Bayesian statistics). The model consists of encoders and decoders.

A

Variational autoencoders (VAEs)

55
Q

The ____ neural network takes the input data (for example, an image) and maps it to a lower-dimensional latent space, which captures the essential features of the data.

A

encoder

56
Q

The ____ neural network takes the latent representation from the encoder and generates a reconstruction of the original input data.

A

decoder

57
Q

A key part of the foundation model lifecycle is the optimization phase. An FM can be further optimized in several different ways. These techniques vary in complexity and cost, with the fastest and lowest cost option being ____.

A

prompt engineering

58
Q

Prompts act as ____ for foundation models.

A

instructions

59
Q

A prompt’s ____ depends on the task that you are giving to a model.

A

form

60
Q

____ focuses on developing, designing, and optimizing prompts to enhance the output of FMs for your needs.

A

Prompt engineering

61
Q

True/False: Prompt engineering examples contain some or all of the following elements: instructions, context, input data, output indicator

A

True
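
The four prompt elements named in this card could be assembled as follows. The helper name `build_prompt` and the example wording are hypothetical.

```python
def build_prompt(instructions, context, input_data, output_indicator):
    """Join whichever prompt elements are provided, skipping empty ones."""
    parts = [instructions, context, input_data, output_indicator]
    return "\n\n".join(part for part in parts if part)

prompt = build_prompt(
    instructions="Classify the sentiment of the review.",
    context="Reviews come from an online electronics store.",
    input_data="Review: The headphones stopped working after a week.",
    output_indicator="Sentiment (positive, negative, or neutral):",
)
print(prompt)
```

Because a prompt may contain only some of the elements, the helper drops any that are left empty.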

62
Q

____ is a supervised learning process that involves taking a pre-trained model and training it further on specific, smaller datasets. Training on these narrower datasets modifies the weights of the model to better align with the task.

A

Fine-tuning

63
Q

What are two ways to fine-tune a model?

A

instruction / reinforcement learning from human feedback (RLHF)

64
Q

____ fine-tuning uses examples of how the model should respond to a specific instruction. Prompt tuning is a type of instruction fine-tuning.

A

Instruction

65
Q

____ provides human feedback data, resulting in a model that is better aligned with human preferences.

A

reinforcement learning from human feedback (RLHF)

66
Q

If you are working on a task that requires industry knowledge, you can take a pre-trained model and ____ the model with industry data.

A

fine-tune

67
Q

____ is a technique that supplies domain-relevant data as context to produce responses based on that data.

A

Retrieval-augmented generation (RAG)

68
Q

Rather than having to fine-tune a foundation model with a small set of labeled examples, ____ retrieves a small set of relevant documents and uses that to provide context to answer the user prompt.

A

Retrieval-augmented generation (RAG)
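
The retrieve-then-prompt flow in cards 67 and 68 can be sketched as below. Word overlap is used as a stand-in for the retrieval step; real systems commonly rank documents by embedding similarity instead. The documents and function names are hypothetical, and the final call to a foundation model is omitted.

```python
import re

# Hypothetical knowledge base of domain documents.
DOCUMENTS = [
    "Returns are accepted within 30 days of purchase.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is available by chat on weekdays.",
]

def words(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = words(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & words(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_rag_prompt(question, documents):
    """Place the retrieved text in the prompt as context for the model."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"

question = "How long does shipping take?"
rag_prompt = build_rag_prompt(question, DOCUMENTS)
print(rag_prompt)
```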

69
Q

With ____, you can build, train, and deploy ML models for any use case with fully managed infrastructure, tools, and workflows.

A

SageMaker

70
Q

List 3 AWS ‘Text and documents’ AI/ML services

A

Comprehend, Translate, Textract

71
Q

List the chatbot services offered by AWS AI/ML services

A

Lex

72
Q

List 2 AWS speech services offered by AWS AI/ML services

A

Polly, Transcribe

73
Q

List the vision service offered by AWS AI/ML services

A

Rekognition

74
Q

List the search service offered by AWS AI/ML services

A

Kendra

75
Q

List the recommendations service offered by AWS AI/ML services

A

Personalize

76
Q

Amazon ____ uses ML and natural language processing (NLP) to help you uncover the insights and relationships in your unstructured data.

A

Comprehend

77
Q

Which Amazon AI/ML service performs the following functions:
- Identifies the language of the text
- Extracts key phrases, places, people, brands, or events
- Understands how positive or negative the text is
- Analyzes text using tokenization and parts of speech
- And automatically organizes a collection of text files by topic

A

Comprehend

78
Q

Amazon ____ is a neural machine translation service that delivers fast, high-quality, and affordable language translation.

A

Translate

79
Q

____ is a form of language translation automation that uses deep learning models to deliver more accurate and more natural-sounding translation than traditional statistical and rule-based translation algorithms.

A

Neural machine translation

80
Q

With Amazon ____, you can localize content such as websites and applications for your diverse users, translate large volumes of text for analysis, and efficiently implement cross-lingual communication between users.

A

Translate

81
Q

Amazon ____ is a service that automatically extracts text and data from scanned documents. This service goes beyond optical character recognition (OCR) to also identify the contents of fields in forms and information stored in tables.

A

Textract

82
Q

Amazon ____ is a fully managed AI service to design, build, test, and deploy conversational interfaces into any application using voice and text.

A

Lex

83
Q

Amazon ____ provides the advanced deep learning functionalities of automatic speech recognition (ASR) for converting speech to text, and natural language understanding (NLU) to recognize the intent of the text.

A

Lex

84
Q

Amazon ____ permits you to build applications with highly engaging user experiences and lifelike conversational interactions, and create new categories of products.

A

Lex

85
Q

With Amazon ____, the same deep learning technologies that power Amazon Alexa are now available to any developer. You can efficiently build sophisticated, natural-language conversational bots and voice-enabled interactive voice response (IVR) systems.

A

Lex

86
Q

Amazon ____ is a service that turns text into lifelike speech. This service lets you create applications that talk, so you can build entirely new categories of speech-enabled products.

A

Polly

87
Q

Amazon ____ is an AI service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice. It includes a wide selection of lifelike voices spread across dozens of languages, so you can select the ideal voice and build speech-enabled applications that work in many different countries.

A

Polly

88
Q

Amazon ____ is an automatic speech recognition (ASR) service for automatically converting speech to text. The service can transcribe audio files stored in common formats, like WAV and MP3, with time stamps for every word so that you can quickly locate the audio in the original source by searching for the text.

A

Transcribe

89
Q

Amazon ____ can be used for a variety of business applications, including the following:
- Transcription of voice-based customer service calls
- Generation of subtitles on audio and video content
- Conducting (text-based) content analysis on audio and video content

A

Transcribe

90
Q

Amazon ____ facilitates adding image and video analysis to your applications. It uses proven, highly scalable, deep learning technology that requires no ML expertise to use.

A

Rekognition

91
Q

With Amazon ____, you can identify objects, people, text, scenes, and activities in images and videos, and even detect inappropriate content.

A

Rekognition

92
Q

Amazon ____ provides highly accurate facial analysis and facial search capabilities. You can use it to detect, analyze, and compare faces for a wide variety of user verification, people counting, and public safety use cases.

A

Rekognition

93
Q

Amazon ____ is an intelligent search service powered by ML.

A

Kendra

94
Q

Amazon ____ is an ML service that developers can use to create individualized recommendations for customers who use their applications.

A

Personalize

95
Q

With Amazon ____, you provide an activity stream from your application (page views, signups, purchases, and so forth). You also provide an inventory of the items that you want to recommend, such as articles, products, videos, or music.

A

Personalize

96
Q

Amazon ____ processes and examines the data, identifies what is meaningful, selects the right algorithms, and trains and optimizes a personalization model that is customized for your data.

A

Personalize

97
Q

AWS ____ is a 1/18th scale race car that gives you an interesting and fun way to get started with reinforcement learning (RL).

A

DeepRacer

98
Q

____ is an advanced ML technique that takes a very different approach to training models than other ML methods. Its superpower is that it learns very complex behaviors without requiring any labeled training data, and it can make short-term decisions while optimizing for a longer-term goal.

A

Reinforcement learning

99
Q

The ____ service layer in the AWS AI/ML stack consists of SageMaker JumpStart, Bedrock, Q, and Q Developer

A

Generative AI

100
Q

____ helps you quickly get started with ML by providing a set of solutions for the most common use cases, which can be readily deployed.

A

SageMaker JumpStart

101
Q

____ solutions are fully customizable and showcase the use of AWS CloudFormation templates and reference architectures so that you can accelerate your ML journey.

A

SageMaker JumpStart

102
Q

____ supports one-click deployment and fine-tuning of more than 150 popular open-source models such as natural language processing, object detection, and image classification models.

A

SageMaker JumpStart

103
Q

____ is a fully managed service that makes foundation models (FMs) from Amazon and leading AI startups available through an API.

A

Amazon Bedrock

104
Q

With the Amazon ____ serverless experience, you can quickly get started, experiment with FMs, privately customize them with your own data, and seamlessly integrate and deploy FMs into your AWS applications.

A

Bedrock

105
Q

Amazon ____ can help you get fast, relevant answers to pressing questions, solve problems, generate content, and take actions using the data and expertise found in your company’s information repositories, code, and enterprise systems.

A

Q

106
Q

When you chat with Amazon ____, it provides immediate, relevant information and advice to help streamline tasks, speed decision-making, and help spark creativity and innovation.

A

Q

107
Q

Designed to improve developer productivity, Amazon ____ provides ML–powered code recommendations to accelerate development of C#, Java, JavaScript, Python, and TypeScript applications.

A

Q Developer

108
Q

The Amazon ____ service integrates with multiple integrated development environments (IDEs) and helps developers write code faster by generating entire functions and logical blocks of code—often consisting of more than 10–15 lines of code.

A

Q Developer

109
Q

AWS generative AI services are designed to be highly responsive and available. However, higher levels of responsiveness and availability often come at an increased cost. For example, services with lower latency and higher availability (for example, multi-Region deployment) will typically have higher pricing compared to alternatives with lower performance and availability guarantees.

A
110
Q

To ensure redundancy and high availability, AWS generative AI services can be deployed across multiple Availability Zones or even across multiple AWS Regions. This redundancy comes with an additional cost, because resources have to be provisioned and data replicated across multiple locations.

A
111
Q

AWS offers different compute options (for example, CPU, GPU, and custom hardware accelerators) for generative AI services. Higher-performance options, such as GPU instances, generally come at a higher cost but can provide significant performance improvements for certain workloads.

A
112
Q

Many AWS generative AI services, such as Amazon Q Developer and Amazon Bedrock, use a token-based pricing model. This means that you pay for the number of tokens (a unit of text or code) generated or processed by the service. The more tokens you generate or process, the higher the cost.

A
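
The token-based pricing described in this card can be sketched as a simple calculation. The per-1,000-token prices below are hypothetical placeholders, not actual AWS rates.

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_1k, output_price_per_1k):
    """Estimate a charge from token counts and per-1,000-token prices."""
    return (
        (input_tokens / 1000) * input_price_per_1k
        + (output_tokens / 1000) * output_price_per_1k
    )

# Hypothetical prices, chosen only to show that cost grows with token count.
small_job = estimate_cost(2000, 500, input_price_per_1k=0.001, output_price_per_1k=0.002)
large_job = estimate_cost(20000, 5000, input_price_per_1k=0.001, output_price_per_1k=0.002)
print(small_job, large_job)
```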
113
Q

Some AWS generative AI services, like Amazon Polly and Amazon Transcribe, let you provision a specific amount of throughput (for example, audio or text processing capacity) in advance. Higher provisioned throughput levels typically come at a higher cost but can ensure predictable performance for time-sensitive workloads.

A
114
Q

AWS provides pre-trained models for various generative AI tasks, but you can also bring your own custom models or fine-tune existing models. Training and deploying custom models can incur additional costs, depending on the complexity of the model, the training data, and the compute resources required.

A