CAIC 9.5 Flashcards

Question 1

Q

What are common marketing tactics employed by retailers?

Answer

A

Direct marketing emails, digital advertisements, incentives, discounts

These tactics are often based on customer demographics.

Question 2

Q

What is the goal of using ML models in marketing campaigns?

Answer

A

To optimize effectiveness, target the right customers, achieve high conversion rates, minimize costs

This involves analyzing customer data and demographics.

Question 3

Q

What is unsupervised clustering in customer segmentation?

Answer

A

A method to group customers based on data, such as basic demographics

This helps create unique marketing campaigns for each segment.

Question 4

Q

What do highly personalized marketing campaigns utilize?

Answer

A

Accurate individual profiles using behavior data, historical transaction data, social media data

This leads to higher conversion rates.

Question 5

Q

What is contextual advertising?

Answer

A

A targeted marketing technique that displays ads relevant to web page content

Example: Cooking product ads on cooking recipe websites.

Question 6

Q

How does generative AI enhance targeted marketing?

Answer

A

By creating dynamically personalized content, such as customized images and text

This is tailored to individual customer preferences and interests.

Question 7

Q

What is sentiment analysis?

Answer

A

A text classification problem that determines if sentiment is positive, negative, or neutral

It uses labeled text data, such as product reviews.

Question 8

Q

What techniques do retailers use to assess brand perception?

Answer

A

Soliciting feedback, monitoring social media channels

This helps retailers understand customer emotions and sentiments.

Question 9

Q

What is the main purpose of inventory planning and demand forecasting?

Answer

A

To manage inventory costs while maximizing revenue and avoiding out-of-stock situations

Traditional methods have limitations in accuracy.

Question 10

Q

Which techniques do retailers use for demand forecasting?

Answer

A

Statistical techniques, ML techniques such as regression analysis and deep learning

These approaches aim to produce accurate demand forecasts.

Question 11

Q

What are the three main stages of the autonomous driving system architecture?

Answer

A

Perception and localization, decision and planning, control

Each stage plays a crucial role in the functioning of autonomous vehicles.

Question 12

Q

What is the role of the perception stage in autonomous driving?

Answer

A

To gather information about surroundings and determine the vehicle’s position

It uses sensors like RADAR, LIDAR, and cameras.

Question 13

Q

What is the function of the decision and planning stage in autonomous vehicles?

Answer

A

Controls motion and behavior based on data from the perception stage

It analyzes data to determine the optimal path for the vehicle.

Question 14

Q

How do AI and ML enhance the control module in autonomous vehicles?

Answer

A

By translating decisions into physical actions and optimizing the vehicle’s performance

This includes adaptive control systems and reinforcement learning.

Question 15

Q

What is the purpose of Advanced Driver Assistance Systems (ADAS)?

Answer

A

To enhance driving experience and safety by detecting hazards and issuing warnings

Examples include lane departure warnings and automatic emergency braking.

Question 16

Q

What is the main role of ML solutions architects?

Answer

A

To understand common ML algorithms and design technology infrastructure for deployment

This knowledge helps in selecting suitable data science solutions.

Question 17

Q

What is an objective function in ML algorithms?

Answer

A

A metric that the algorithm seeks to minimize or maximize, such as the disparity between projected and actual sales

It guides the optimization process.

Question 18

Q

What is the purpose of gradient descent in ML?

Answer

A

To optimize model parameters by calculating the rate of error change

This iterative approach helps reduce errors in predictions.

Question 19

Q

What is the primary purpose of gradient descent?

Answer

A

To optimize neural networks and various ML algorithms

Question 20

Q

What does gradient descent calculate to update model parameters?

Answer

A

The rate of error change (gradient) with respect to each model parameter, such as the weight attached to each input variable

Question 21

Q

What is the role of the learning rate in gradient descent?

Answer

A

Controls the magnitude of parameter updates at each iteration

Question 22

Q

List the key steps involved in the gradient descent optimization process.

Answer

A

Initialize the value of W randomly
Calculate the error (loss) using the assigned value of W
Compute the gradient of the error (loss) with respect to W
Update the value of W to reduce the error
Repeat until the gradient is zero or close enough to zero (convergence)
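
The steps above can be sketched for a one-parameter model y = w * x with a mean squared error loss. This is a minimal illustration, not a reference implementation: the function name, the fixed starting value (used here instead of a random one so the result is reproducible), and the stopping tolerance are all assumptions made for the example.

```python
def gradient_descent(xs, ys, learning_rate=0.01, tolerance=1e-8,
                     max_iters=10_000, w=0.0):
    """Fit y = w * x by minimizing the mean squared error."""
    n = len(xs)
    for _ in range(max_iters):
        # Gradient of mean((w*x - y)^2) with respect to w.
        grad = 2.0 * sum((w * x - y) * x for x, y in zip(xs, ys)) / n
        # Stop once the gradient is close enough to zero.
        if abs(grad) < tolerance:
            break
        # The learning rate scales the size of each update.
        w -= learning_rate * grad
    return w


if __name__ == "__main__":
    print(gradient_descent([1, 2, 3, 4], [2, 4, 6, 8]))
```

With data generated by y = 2x, the returned weight should settle very close to 2.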

Question 23

Q

What is the normal equation in relation to machine learning?

Answer

A

A one-step analytical solution for calculating the coefficients of linear regression models
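
As a hedged illustration, the normal equation w = (X^T X)^-1 X^T y can be written out by hand for the simplest case of one feature plus an intercept, where X^T X is a 2x2 matrix. The function name is hypothetical and the sketch assumes the x values are not all identical.

```python
def normal_equation_1d(xs, ys):
    """Closed-form least squares fit of y = intercept + slope * x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy].
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("x values must not all be identical")
    # Apply the inverse of the 2x2 matrix to X^T y.
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope


if __name__ == "__main__":
    print(normal_equation_1d([0, 1, 2, 3], [1, 3, 5, 7]))
```

Unlike gradient descent, no iteration or learning rate is involved; the coefficients come out in a single calculation.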

Question 24

Q

What are some factors to consider when selecting an ML algorithm?

Answer

A

Problem type
Dataset size
Number and nature of features
Computational requirements
Interpretability of results
Assumptions about data distribution

Question 25

Q

What is classification in machine learning?

Answer

A

A task that assigns categories or classes to data points

Question 26

Q

What is regression in machine learning?

Answer

A

A technique used to predict continuous numeric values

Question 27

Q

What is linear regression used for?

Answer

A

To predict a scalar output based on a linear function of input variables

Question 28

Q

What does logistic regression estimate?

Answer

A

The probability of an event occurring

Question 29

Q

What is the primary output of logistic regression?

Answer

A

A probability score between 0 and 1
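
A minimal sketch of how that score is typically produced: a linear combination of the features is passed through the sigmoid function, which maps any real number into the range 0 to 1. The function names, weights, and bias here are illustrative assumptions, not values from a trained model.

```python
import math


def sigmoid(z):
    """Map a real number to the open interval (0, 1)."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, features):
    """Probability of the positive class for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


if __name__ == "__main__":
    print(predict_proba([0.8, -0.4], 0.1, [2.0, 1.0]))
```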

Question 30

Q

What are the advantages of logistic regression?

Answer

A

Fast training speed
Interpretability

Question 31

Q

What is the main advantage of decision trees over linear models?

Answer

A

Ability to capture non-linear relationships and interactions between features

Question 32

Q

What algorithms are used for splitting data in decision trees?

Answer

A

Gini impurity index
Information gain
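
Both criteria can be computed directly from the class labels in a node. The sketch below uses the standard textbook definitions (Gini index as 1 minus the sum of squared class proportions, information gain as the drop in entropy after a split); the function names are hypothetical.

```python
import math
from collections import Counter


def gini_impurity(labels):
    """1 - sum(p_i^2); 0 means the node holds a single class."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the weighted entropy of the children."""
    n = len(parent)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - children


if __name__ == "__main__":
    node = ["a", "a", "b", "b"]
    print(gini_impurity(node), information_gain(node, ["a", "a"], ["b", "b"]))
```

A split that separates the classes perfectly gives the largest possible information gain for that node.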

Question 33

Q

What is a limitation of decision trees?

Answer

A

Prone to overfitting, especially with noisy data

Question 34

Q

What is the primary benefit of random forests?

Answer

A

Improved accuracy by combining predictions from multiple trees

Question 35

Q

How do random forests reduce overfitting?

Answer

A

By introducing randomness and using diverse subsets of features

Question 36

Q

What distinguishes gradient boosting from random forests?

Answer

A

Gradient boosting sequentially aggregates results while random forests use parallel independent learners

Question 37

Q

What is a key advantage of gradient boosting?

Answer

A

Ability to handle imbalanced datasets effectively

Question 38

Q

What is a disadvantage of gradient boosting?

Answer

A

Lacks parallelization capabilities, making it slower in training

Question 39

Q

What is gradient boosting?

Answer

A

A machine learning algorithm that builds its trees sequentially and, when properly tuned, can outperform many other algorithms.

Question 40

Q

What is a key advantage of gradient boosting?

Answer

A

It supports custom loss functions, providing flexibility in modeling real-world applications.

Question 41

Q

What is one limitation of gradient boosting?

Answer

A

It lacks parallelization capabilities, making it slower in training compared to parallelizable algorithms.

Question 42

Q

How does gradient boosting handle noisy data?

Answer

A

It is sensitive to noisy data, including outliers, which can lead to overfitting and reduced generalization performance.

Question 43

Q

What is XGBoost?

Answer

A

A widely used implementation of gradient boosting that enables training a single tree across multiple cores and CPUs.

Question 44

Q

What are some improvements XGBoost offers over traditional gradient boosting?

Answer

A

Faster training times and powerful regularization techniques to mitigate overfitting.

Question 45

Q

What is K-NN?

Answer

A

A versatile algorithm used for both classification and regression tasks based on the proximity of data points.

Question 46

Q

What distance metric is commonly used in K-NN?

Answer

A

Euclidean distance.

Question 47

Q

What is the majority voting process in K-NN classification?

Answer

A

The most frequent class among the K nearest neighbors is assigned to the new data point.
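
A small sketch of majority voting with Euclidean distance, assuming training data is given as (point, label) pairs. The function name and the sample data are made up for illustration.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Sort the training points by Euclidean distance to the query.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
             ((5, 5), "b"), ((6, 5), "b")]
    print(knn_classify(train, (0.5, 0.5)))
```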

Question 48

Q

What is one advantage of K-NN?

Answer

A

Its simplicity: there is no explicit training phase and little hyperparameter tuning beyond choosing K.

Question 49

Q

What is a significant drawback of K-NN?

Answer

A

It is not suitable for high-dimensional datasets, because distances between points become less meaningful as the number of dimensions grows.

Question 50

Q

What is an artificial neuron?

Answer

A

A computational unit that processes inputs and produces an output, similar to a biological neuron.

Question 51

Q

What does the activation function in an artificial neuron do?

Answer

A

Modifies the output of the linear function, capturing non-linear relationships.
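
As a rough sketch, a single neuron computes a weighted sum of its inputs plus a bias and then applies the activation function. ReLU is used here only as one common choice; the function names and numbers are illustrative assumptions.

```python
def relu(z):
    """Rectified linear unit: negative values become 0."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation=relu):
    """Weighted sum of inputs plus bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


if __name__ == "__main__":
    print(neuron([1.0, 2.0], [1.0, 1.0], 0.0))
```

Without the activation step the neuron would stay a purely linear function of its inputs.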

Question 52

Q

What is a Multi-Layer Perceptron (MLP)?

Answer

A

An artificial neural network with multiple layers of interconnected neurons.

Question 53

Q

What is backpropagation?

Answer

A

The process of adjusting the weights of neurons based on the total error propagated back through the network.

Question 54

Q

What is the purpose of an MLP in machine learning?

Answer

A

To perform classification and regression tasks by capturing intricate nonlinear patterns.

Question 55

Q

What is clustering in data mining?

Answer

A

A method of grouping items together based on shared attributes.

Question 56

Q

What is the K-means clustering algorithm used for?

Answer

A

Grouping similar data points into clusters based on proximity to centroids.
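
A compact sketch of the usual K-means loop: assign each point to its nearest centroid, then move each centroid to the mean of its points, and repeat until nothing changes. Taking the first k points as the starting centroids is a simplification chosen for reproducibility; real implementations usually pick starting points randomly. The function name is hypothetical.

```python
import math


def kmeans(points, k, iterations=100):
    """Return (centroids, clusters) for a list of equal-length tuples."""
    centroids = [tuple(p) for p in points[:k]]  # simplified initialization
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


if __name__ == "__main__":
    data = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
    print(kmeans(data, 2)[0])
```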

Question 57

Q

What is a time series?

Answer

A

A sequence of data points recorded at successive time intervals.

Question 58

Q

What are the key characteristics of time series data?

Answer

A

Trend
Seasonality
Stationarity

Question 59

Q

What does the ARIMA model stand for?

Answer

A

Auto-Regressive Integrated Moving Average.

Question 60

Q

What is the autoregressive component of ARIMA?

Answer

A

The value of a variable in a given period is influenced by its own previous values.
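
To make the idea concrete, here is a sketch of the simplest autoregressive case, AR(1) without an intercept, where each value is modeled as a coefficient times the previous value. It covers only the autoregressive part, not the differencing or moving-average parts of ARIMA, and the function names are assumptions for the example.

```python
def fit_ar1(series):
    """Least squares estimate of phi in y_t = phi * y_(t-1)."""
    previous, current = series[:-1], series[1:]
    denom = sum(p * p for p in previous)
    if denom == 0:
        raise ValueError("series needs at least one non-zero lagged value")
    return sum(c * p for c, p in zip(current, previous)) / denom


def forecast_ar1(series, steps=1):
    """Roll the fitted AR(1) relation forward by the given number of steps."""
    phi = fit_ar1(series)
    last, forecasts = series[-1], []
    for _ in range(steps):
        last = phi * last
        forecasts.append(last)
    return forecasts


if __name__ == "__main__":
    print(forecast_ar1([1, 2, 4, 8, 16], steps=2))
```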

Question 61

Q

What does DeepAR utilize for forecasting?

Answer

A

A recurrent neural network (RNN) to capture patterns in target time series.

Question 62

Q

What is a major disadvantage of DeepAR?

Answer

A

The black-box nature of deep learning models makes forecasts difficult to explain.

Question 63

Q

What is a recommender system?

Answer

A

A machine learning system designed to suggest items to users based on various data inputs.

Question 64

Q

What is a significant drawback of DeepAR?

Answer

A

The black-box nature of the deep learning model, which lacks interpretability and transparency.

Answer 65

A

Predicting a user’s preference for items based on user or item attribute similarities or user-item interactions.

Answer 66

A

Retail
Media and entertainment
Finance
Healthcare

Answer 67

A

A recommendation algorithm that predicts user preferences by analyzing the collective experiences and behaviors of different users.

Answer 68

A

It provides highly personalized recommendations matched to each user’s unique interests.

Answer 69

A

Collaborative models struggle when new users or items with no ratings are introduced.

Answer 70

A

It learns vector representations for both users and items to predict missing entries in the user-item interaction matrix.

Answer 71

A

Processing and analyzing image data.

Answer 72

A

It reduces the dimensionality of the extracted features.

Answer 73

A

Max pooling
Average pooling
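
Both pooling types can be sketched on a small 2D grid, assuming non-overlapping square windows (stride equal to the window size). The function name and the sample grid are illustrative.

```python
def pool2d(grid, size=2, mode="max"):
    """Downsample a 2D list with non-overlapping size x size windows."""
    if mode == "max":
        reducer = max
    elif mode == "average":
        reducer = lambda window: sum(window) / len(window)
    else:
        raise ValueError("mode must be 'max' or 'average'")
    pooled = []
    for r in range(0, len(grid) - size + 1, size):
        row = []
        for c in range(0, len(grid[0]) - size + 1, size):
            window = [grid[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(reducer(window))
        pooled.append(row)
    return pooled


if __name__ == "__main__":
    grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    print(pool2d(grid, mode="max"))
    print(pool2d(grid, mode="average"))
```

A 4x4 input becomes a 2x2 output in both cases; only the summary value kept for each window differs.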

Answer 74

A

Signals from initial inputs diminish as they traverse through multiple layers.

Answer 75

A

By implementing a layer-skipping technique with skip connections.

Answer 76

A

The relationship between computers and human language.

Answer 77

A

To generate low-dimensional representations for words or sentences that capture semantic meaning.

Answer 78

A

TF (Term Frequency)
IDF (Inverse Document Frequency)
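
The two components multiply into a single TF-IDF weight. Several variants of the formula exist; the sketch below assumes one common form (term count divided by document length, times the natural log of corpus size over document frequency), and the function name is hypothetical.

```python
import math


def tf_idf(term, document, corpus):
    """TF-IDF weight of a term in one tokenized document."""
    if not document:
        return 0.0
    # TF: how often the term appears in this document.
    tf = document.count(term) / len(document)
    # IDF: how rare the term is across the corpus.
    df = sum(1 for doc in corpus if term in doc)
    if df == 0:
        return 0.0
    idf = math.log(len(corpus) / df)
    return tf * idf


if __name__ == "__main__":
    corpus = [["a", "b"], ["a", "c"]]
    print(tf_idf("a", corpus[0], corpus), tf_idf("b", corpus[0], corpus))
```

A term that appears in every document gets a weight of 0 under this variant, which is why very common words contribute little.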

Answer 79

A

They lack the ability to capture the semantic meaning of words and often result in large and sparse input vectors.

Answer 80

A

Creating numerical representations for entities that capture their semantic similarity.

Answer 81

A

Efficient training due to high degrees of parallelism.

Answer 82

A

Striking the right balance between exploration and exploitation.

Answer 83

A

Object identification
Image classification
Face recognition
Activity detection

Answer 84

A

A technique used to generate low-dimensional representations (mathematical vectors) for words or sentences that capture the semantic meaning of the text.

Answer 85

A

Words or sentences with similar semantic meanings tend to occur in similar contexts.

Answer 86

A

They are closer to each other than those with different meanings.

Answer 87

A

A metric that measures how similar two vectors are by calculating the cosine of the angle between them.
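
In code, this is the dot product of the two vectors divided by the product of their lengths. A minimal sketch, with a hypothetical function name:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity([1, 2], [2, 4]))
```

Vectors pointing the same way score close to 1, perpendicular vectors score 0, and opposite vectors score close to -1.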

Answer 88

A

CBOW (Continuous Bag of Words)
Continuous skip-gram

Answer 89

A

It tries to predict a word for a given window of surrounding words.

Answer 90

A

It tries to predict surrounding words for a given word.

Answer 91

A

To slide across running text, choosing one of the words as the target and the rest as inputs.
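
A sketch of how such a window could generate training pairs in the CBOW style, where the surrounding words are the inputs and the center word is the target. The function name and the window size are assumptions for illustration.

```python
def cbow_pairs(tokens, window=2):
    """Return (context_words, target_word) pairs from a token list."""
    pairs = []
    for i, target in enumerate(tokens):
        # Up to `window` words on each side of the target.
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((context, target))
    return pairs


if __name__ == "__main__":
    for context, target in cbow_pairs(["the", "cat", "sat"], window=1):
        print(context, target)
```

For skip-gram the same pairs would be used the other way round: the center word is the input and each context word is a prediction target.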

Answer 92

A

A straightforward one-hidden-layer MLP network.

Answer 93

A

They serve as the actual embeddings for the words after training.

Answer 94

A

Transfer learning.

Answer 95

A

It produces a fixed embedding representation for each word, disregarding contextual variations.

Answer 96

A

Bidirectional Encoder Representations from Transformers.

Answer 97

A

They consider the surrounding words or overall context, allowing for more nuanced representations.

Answer 98

A

Predicting randomly masked words in sentences
Predicting the next sentence from a given sentence

Answer 99

A

They allow BERT to handle out-of-vocabulary (OOV) words more effectively.

Answer 100

A

The encoder part.

Answer 101

A

Question answering
Named entity extraction
Text summarization

Answer 102

A

Adding an additional output layer to the BERT network for a specific task and updating the pre-trained model weights.

Answer 103

A

A type of generative model designed to generate realistic data instances, such as images.

Answer 104

A

Generator
Discriminator

Answer 105

A

To generate instances of data.

Answer 106

A

Learns to distinguish between real and fake instances generated by the Generator.

Answer 107

A

A process where a model learns how to perform a task with just a few examples.

Answer 108

A

GPT uses the Transformer decoder block while BERT uses the Transformer encoder block.

Answer 109

A

Next word prediction.

Answer 110

A

It has 175 billion parameters after training.

Answer 111

A

Models that are pre-trained on massive datasets and can handle multiple tasks.

Answer 112

A

Pathways.

Answer 113

A

An LLM available in multiple sizes from 7 billion to 65 billion parameters.

Answer 114

A

It requires fewer computational resources.

Answer 115

A

From 7 billion parameters to 65 billion parameters

Answer 116

A

Requires fewer computational resources

Answer 117

A

Generating creative text
Answering questions
Solving mathematical problems

Answer 118

A

A noncommercial license emphasizing usage in research contexts

Answer 119

A

It performs extremely well with additional training data

Answer 120

A

176 billion parameters

Answer 121

A

46 different languages and 13 programming languages

Answer 122

A

More than 1,000 researchers

Answer 123

A

Individuals and institutions can use and build upon the model under agreed terms

Answer 124

A

Within the Hugging Face ecosystem

Answer 125

A

Models specifically trained for industries to solve tough domain-focused problems

Answer 126

A

A domain-focused LLM specifically trained for the finance industry

Answer 127

A

Sentiment analysis
Named entity recognition
News classification
Question answering

Answer 128

A

Over 700 billion tokens

Answer 129

A

Generating misinformation (hallucinations)
Toxic content
Potential bias
High resource consumption

Answer 130

A

Named entity extraction
Document classification
Sentiment analysis

Answer 131

A

High-resolution, photorealistic images and generative art

Answer 132

A

Diffusion model

Answer 133

A

By adding noise to input data until unrecognizable, then reversing the process
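
The noising half of that process can be sketched in a few lines. This is only the forward direction, under an assumed fixed noise level per step; the reverse (denoising) direction needs a trained model and is not shown. The function name, the `beta` parameter, and the step count are illustrative assumptions.

```python
import math
import random


def forward_diffusion(values, steps=10, beta=0.1, seed=0):
    """Return the list of progressively noisier versions of `values`."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    current = list(values)
    trajectory = [list(current)]
    for _ in range(steps):
        # Shrink the signal slightly and mix in Gaussian noise.
        current = [
            math.sqrt(1.0 - beta) * v + math.sqrt(beta) * rng.gauss(0.0, 1.0)
            for v in current
        ]
        trajectory.append(current)
    return trajectory


if __name__ == "__main__":
    for step in forward_diffusion([1.0, 2.0], steps=3):
        print(step)
```

Each entry in the returned list corresponds to one diffusion step; with enough steps the original values are no longer recognizable.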

Answer 134

A

Diffusion steps

Answer 135

A

Optimizing a set of learnable parameters through backpropagation

Answer 136

A

Capturing complex dependencies, intricate patterns, and structures