CAIC 9.5 Flashcards

1
Q

What are common marketing tactics employed by retailers?

A

Direct marketing emails, digital advertisements, incentives, discounts

These tactics are often based on customer demographics.

2
Q

What is the goal of using ML models in marketing campaigns?

A

To optimize effectiveness, target the right customers, achieve high conversion rates, minimize costs

This involves analyzing customer data and demographics.

3
Q

What is unsupervised clustering in customer segmentation?

A

A method to group customers based on data, such as basic demographics

This helps create unique marketing campaigns for each segment.

4
Q

What do highly personalized marketing campaigns utilize?

A

Accurate individual profiles using behavior data, historical transaction data, social media data

This leads to higher conversion rates.

5
Q

What is contextual advertising?

A

A targeted marketing technique that displays ads relevant to web page content

Example: Cooking product ads on cooking recipe websites.

6
Q

How does generative AI enhance targeted marketing?

A

By creating dynamically personalized content, such as customized images and text

This is tailored to individual customer preferences and interests.

7
Q

What is sentiment analysis?

A

A text classification problem that determines if sentiment is positive, negative, or neutral

It uses labeled text data, such as product reviews.

8
Q

What techniques do retailers use to assess brand perception?

A

Soliciting feedback, monitoring social media channels

This helps retailers understand customer emotions and sentiments.

9
Q

What is the main purpose of inventory planning and demand forecasting?

A

To manage inventory costs while maximizing revenue and avoiding out-of-stock situations

Traditional methods have limitations in accuracy.

10
Q

Which techniques do retailers use for demand forecasting?

A

Statistical techniques, ML techniques such as regression analysis and deep learning

These approaches create accurate demand forecasts.

11
Q

What are the three main stages of the autonomous driving system architecture?

A

Perception and localization, decision and planning, control

Each stage plays a crucial role in the functioning of autonomous vehicles.

12
Q

What is the role of the perception stage in autonomous driving?

A

To gather information about surroundings and determine the vehicle’s position

It uses sensors like RADAR, LIDAR, and cameras.

13
Q

What is the function of the decision and planning stage in autonomous vehicles?

A

Controls motion and behavior based on data from the perception stage

It analyzes data to determine the optimal path for the vehicle.

14
Q

How do AI and ML enhance the control module in autonomous vehicles?

A

By translating decisions into physical actions and optimizing the vehicle’s performance

This includes adaptive control systems and reinforcement learning.

15
Q

What is the purpose of Advanced Driver Assistance Systems (ADAS)?

A

To enhance driving experience and safety by detecting hazards and issuing warnings

Examples include lane departure warnings and automatic emergency braking.

16
Q

What is the main role of ML solutions architects?

A

To understand common ML algorithms and design technology infrastructure for deployment

This knowledge helps in selecting suitable data science solutions.

17
Q

What is an objective function in ML algorithms?

A

A function that the algorithm minimizes or maximizes during training, such as the disparity between projected and actual sales

It guides the optimization process.

18
Q

What is the purpose of gradient descent in ML?

A

To optimize model parameters by calculating the rate of error change (the gradient) and adjusting the parameters in the direction that reduces the error

This iterative approach helps reduce errors in predictions.

19
Q

What is the primary purpose of gradient descent?

A

To optimize neural networks and various ML algorithms

20
Q

What does gradient descent calculate to update model parameters?

A

The rate of error change (gradient) with respect to each model parameter, that is, the weight attached to each input variable

21
Q

What is the role of the learning rate in gradient descent?

A

Controls the magnitude of parameter updates at each iteration

22
Q

List the key steps involved in the gradient descent optimization process.

A
  • Initialize the value of W randomly
  • Calculate the error (loss) using the assigned value of W
  • Compute the gradient of the error (loss) with respect to W
  • Update the value of W to reduce the error
  • Repeat until the gradient is close to zero (the error stops decreasing)
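
The steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming a one-feature linear model y ≈ W·x with mean squared error as the loss; the toy data, learning rate, and stopping tolerance are assumptions chosen for the example and do not come from the cards.

```python
import random

def gradient_descent(xs, ys, learning_rate=0.05, tolerance=1e-8, max_iters=10_000):
    """Fit y ~ W * x by minimizing mean squared error (illustrative only)."""
    w = random.uniform(-1.0, 1.0)  # initialize W randomly
    n = len(xs)
    loss = float("inf")
    for _ in range(max_iters):
        # error (loss) for the current value of W
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n
        # gradient of the loss with respect to W
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # stop once the gradient is close to zero
        if abs(grad) < tolerance:
            break
        # move W against the gradient; the learning rate scales the step
        w -= learning_rate * grad
    return w, loss

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # toy data generated by y = 2x
w, loss = gradient_descent(xs, ys)
```

In practice the loop rarely reaches a gradient of exactly zero, which is why the sketch stops at a small tolerance or after a maximum number of iterations.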
23
Q

What is the normal equation in relation to machine learning?

A

A one-step analytical solution for calculating the coefficients of linear regression models
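
For reference, the normal equation is usually written as θ = (XᵀX)⁻¹Xᵀy. The sketch below works it out by hand for the simplest case, one feature plus an intercept, so that XᵀX is a 2×2 matrix. The function name and toy data are hypothetical and only serve the illustration.

```python
def normal_equation_1d(xs, ys):
    """Solve theta = (X^T X)^-1 X^T y where each row of X is [1, x]."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # X^T X = [[n, sx], [sx, sxx]] and X^T y = [sy, sxy]
    det = n * sxx - sx * sx
    if det == 0:
        raise ValueError("X^T X is singular (all x values are identical)")
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope

# toy data generated by y = 2x + 1
intercept, slope = normal_equation_1d([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Unlike gradient descent, no iteration or learning rate is involved, but the matrix inversion becomes expensive as the number of features grows.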

24
Q

What are some factors to consider when selecting an ML algorithm?

A
  • Problem type
  • Dataset size
  • Number and nature of features
  • Computational requirements
  • Interpretability of results
  • Assumptions about data distribution
25
Q

What is classification in machine learning?

A

A task that assigns categories or classes to data points

26
Q

What is regression in machine learning?

A

A technique used to predict continuous numeric values

27
Q

What is linear regression used for?

A

To predict a scalar output based on a linear function of input variables

28
Q

What does logistic regression estimate?

A

The probability of an event occurring

29
Q

What is the primary output of logistic regression?

A

A probability score between 0 and 1
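
The score stays between 0 and 1 because logistic regression passes a linear combination of the inputs through the sigmoid function. A minimal sketch, assuming already-trained weights (the values used below are made up for illustration):

```python
import math

def predict_probability(features, weights, bias):
    """Return sigmoid(bias + w . x), a value between 0 and 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    # two algebraically equivalent forms, chosen to avoid overflow in exp()
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# hypothetical weights: a score of z = 0 maps to a probability of 0.5
p = predict_probability([0.0, 0.0], [1.0, 1.0], 0.0)
```

A threshold (commonly 0.5) is then applied to turn the probability into a class label.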

30
Q

What are the advantages of logistic regression?

A
  • Fast training speed
  • Interpretability
31
Q

What is the main advantage of decision trees over linear models?

A

Ability to capture non-linear relationships and interactions between features

32
Q

What criteria are used for splitting data in decision trees?

A
  • Gini impurity index
  • Information gain
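
As a rough illustration of the first criterion, the Gini impurity of a node is 1 minus the sum of squared class proportions, and a candidate split is scored by the size-weighted impurity of its two children. The helper names below are hypothetical.

```python
from collections import Counter

def gini_impurity(labels):
    """1 - sum(p_k^2) over the class proportions p_k; 0 means a pure node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_impurity(left_labels, right_labels):
    """Weighted average impurity of the two child nodes of a split."""
    n = len(left_labels) + len(right_labels)
    return (len(left_labels) / n) * gini_impurity(left_labels) \
        + (len(right_labels) / n) * gini_impurity(right_labels)

mixed = gini_impurity(["a", "a", "b", "b"])             # evenly mixed node
perfect = split_impurity(["a", "a"], ["b", "b"])        # split that separates the classes
```

The tree picks the split with the lowest weighted impurity, which is the same as the largest impurity reduction.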
33
Q

What is a limitation of decision trees?

A

Prone to overfitting, especially with noisy data

34
Q

What is the primary benefit of random forests?

A

Improved accuracy by combining predictions from multiple trees

35
Q

How do random forests reduce overfitting?

A

By introducing randomness and using diverse subsets of features

36
Q

What distinguishes gradient boosting from random forests?

A

Gradient boosting sequentially aggregates results while random forests use parallel independent learners

37
Q

What is a key advantage of gradient boosting?

A

Ability to handle imbalanced datasets effectively

38
Q

What is a disadvantage of gradient boosting?

A

Trees are built sequentially, which limits parallelization and makes training slower

39
Q

What is gradient boosting?

A

An ensemble algorithm that builds trees sequentially, each one correcting the errors of the previous ones; when properly tuned, it can achieve higher performance than many other algorithms.

40
Q

What is a key advantage of gradient boosting?

A

It supports custom loss functions, providing flexibility in modeling real-world applications.

41
Q

What is one limitation of gradient boosting?

A

Its trees are built sequentially, which limits parallelization and makes training slower than with parallelizable algorithms.

42
Q

How does gradient boosting handle noisy data?

A

It is sensitive to noisy data, including outliers, which can lead to overfitting and reduced generalization performance.

43
Q

What is XGBoost?

A

A widely-used implementation of gradient boosting that enables training a single tree across multiple cores and CPUs.

44
Q

What are some improvements XGBoost offers over traditional gradient boosting?

A

Faster training times and powerful regularization techniques to mitigate overfitting.

45
Q

What is K-NN?

A

A versatile algorithm used for both classification and regression tasks based on the proximity of data points.

46
Q

What distance metric is commonly used in K-NN?

A

Euclidean distance.

47
Q

What is the majority voting process in K-NN classification?

A

The most frequent class among the K nearest neighbors is assigned to the new data point.
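
A minimal sketch of majority voting, assuming Euclidean distance and a small hand-made dataset (the points, labels, and function name are illustrative only):

```python
import math
from collections import Counter

def knn_classify(train_points, train_labels, query, k=3):
    """Assign the most frequent label among the k nearest training points."""
    # pair each training point's Euclidean distance to the query with its label
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # majority vote among the k nearest neighbors
    return Counter(nearest_labels).most_common(1)[0][0]

points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5)]
labels = ["red", "red", "red", "blue", "blue"]
prediction = knn_classify(points, labels, (0.5, 0.5), k=3)
```

An odd K is a common choice for two-class problems because it avoids tied votes.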

48
Q

What is one advantage of K-NN?

A

Its simplicity: there is no explicit training phase, and K (together with the distance metric) is essentially the only hyperparameter to choose.

49
Q

What is a significant drawback of K-NN?

A

It is not suitable for high-dimensional datasets due to the diminished meaning of proximity.

50
Q

What is an artificial neuron?

A

A computational unit that processes inputs and produces an output, similar to a biological neuron.

51
Q

What does the activation function in an artificial neuron do?

A

Modifies the output of the linear function, capturing non-linear relationships.

52
Q

What is a Multi-Layer Perceptron (MLP)?

A

An artificial neural network with multiple layers of interconnected neurons.

53
Q

What is backpropagation?

A

The process of adjusting the weights of neurons based on the total error propagated back through the network.

54
Q

What is the purpose of an MLP in machine learning?

A

To perform classification and regression tasks by capturing intricate nonlinear patterns.

55
Q

What is clustering in data mining?

A

A method of grouping items together based on shared attributes.

56
Q

What is the K-means clustering algorithm used for?

A

Grouping similar data points into clusters based on proximity to centroids.
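
The idea can be sketched as a loop that alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its points. The sketch assumes the starting centroids are supplied by the caller and runs a fixed number of iterations; real implementations usually pick initial centroids more carefully and stop on convergence.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain K-means: assign to nearest centroid, then recompute the means."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # index of the centroid closest to p (Euclidean distance)
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # new centroid = coordinate-wise mean; keep the old one if a cluster is empty
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
```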

57
Q

What is a time series?

A

A sequence of data points recorded at successive time intervals.

58
Q

What are the key characteristics of time series data?

A
  • Trend
  • Seasonality
  • Stationarity
59
Q

What does the ARIMA model stand for?

A

Auto-Regressive Integrated Moving Average.

60
Q

What is the autoregressive component of ARIMA?

A

The value of a variable in a given period is influenced by its own previous values.
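
To make the autoregressive idea concrete, the sketch below forecasts each new value as a constant plus a weighted sum of the most recent values. It covers only the AR part (no differencing or moving-average terms), and the coefficients are assumed to be given rather than estimated from data.

```python
def ar_forecast(history, coefficients, constant=0.0, steps=3):
    """Forecast with x_t = constant + sum(phi_i * x_{t-i}) for i = 1..p.

    Assumes len(history) >= len(coefficients); coefficients[0] weights the
    most recent value.
    """
    values = list(history)
    p = len(coefficients)
    for _ in range(steps):
        recent = values[-p:][::-1]  # most recent value first
        values.append(constant + sum(phi * x for phi, x in zip(coefficients, recent)))
    return values[len(history):]

# AR(1) with phi = 0.5: each forecast is half of the previous value
forecast = ar_forecast([8.0], [0.5], steps=3)
```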

61
Q

What does DeepAR utilize for forecasting?

A

A recurrent neural network (RNN) to capture patterns in target time series.

62
Q

What is a major disadvantage of DeepAR?

A

The black-box nature of deep learning models makes forecasts difficult to explain.

63
Q

What is a recommender system?

A

A machine learning system designed to suggest items to users based on various data inputs.

64
Q

What is a significant drawback of DeepAR?

A

The black-box nature of the deep learning model, which lacks interpretability and transparency.

65
Q

What is the primary function of a recommender system?

A

Predicting a user’s preference for items based on user or item attribute similarities or user-item interactions.

66
Q

In which industries have recommender systems gained widespread adoption?

A
  • Retail
  • Media and entertainment
  • Finance
  • Healthcare
67
Q

What is collaborative filtering?

A

A recommendation algorithm that predicts user preferences by analyzing the collective experiences and behaviors of different users.

68
Q

What is a major benefit of collaborative filtering?

A

It provides highly personalized recommendations matched to each user’s unique interests.

69
Q

What is the cold-start problem in collaborative filtering?

A

Collaborative models struggle when new users or items with no ratings are introduced.

70
Q

What does matrix factorization do in collaborative filtering?

A

It learns vector representations for both users and items to predict missing entries in the user-item interaction matrix.

71
Q

What is the primary function of a convolutional neural network (CNN)?

A

Processing and analyzing image data.

72
Q

What role does the pooling layer play in a CNN?

A

It reduces the dimensionality of the extracted features.

73
Q

What are two commonly used pooling techniques?

A
  • Max pooling
  • Average pooling
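
Both techniques slide a small window over the feature map and keep one number per window. A minimal sketch on a plain nested list, assuming non-overlapping 2×2 windows (stride equal to the window size):

```python
def pool2d(feature_map, size=2, mode="max"):
    """Downsample a 2D feature map with non-overlapping size x size windows."""
    reducer = max if mode == "max" else (lambda window: sum(window) / len(window))
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            # collect the values covered by the current window
            window = [feature_map[r + i][c + j] for i in range(size) for j in range(size)]
            pooled_row.append(reducer(window))
        pooled.append(pooled_row)
    return pooled

feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
max_pooled = pool2d(feature_map, mode="max")
avg_pooled = pool2d(feature_map, mode="average")
```

Either way the 4×4 map shrinks to 2×2, which is the dimensionality reduction the pooling layer provides.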
74
Q

What is the vanishing gradient problem in CNNs?

A

Gradient signals shrink as they are propagated back through many layers, so the earliest layers learn very slowly.

75
Q

How does ResNet address the vanishing gradient problem?

A

By implementing a layer-skipping technique with skip connections.

76
Q

What does natural language processing (NLP) focus on?

A

The relationship between computers and human language.

77
Q

What is the purpose of embedding in NLP?

A

To generate low-dimensional representations for words or sentences that capture semantic meaning.

78
Q

What are the two components of TF-IDF?

A
  • TF (Term Frequency)
  • IDF (Inverse Document Frequency)
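
The two components multiply together: TF rewards terms that are frequent in a document, and IDF discounts terms that appear in many documents. The sketch below uses one common unsmoothed variant, tf = count / document length and idf = log(N / documents containing the term); libraries often apply smoothing, so exact values differ.

```python
import math

def tf_idf(term, document, corpus):
    """TF-IDF of a term for one tokenized document within a tokenized corpus."""
    tf = document.count(term) / len(document)
    docs_with_term = sum(1 for doc in corpus if term in doc)
    # unsmoothed IDF; defined as 0 here when no document contains the term
    idf = math.log(len(corpus) / docs_with_term) if docs_with_term else 0.0
    return tf * idf

corpus = [["the", "cat", "sat"], ["the", "dog", "ran"]]
common_score = tf_idf("the", corpus[0], corpus)  # appears in every document
rare_score = tf_idf("cat", corpus[0], corpus)    # appears in only one document
```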
79
Q

What is a limitation of Bag of Words (BOW) and TF-IDF?

A

They lack the ability to capture the semantic meaning of words and often result in large and sparse input vectors.

80
Q

What does the term ‘embedding’ refer to in machine learning?

A

Creating numerical representations for entities that capture their semantic similarity.

81
Q

What is the primary advantage of using CNNs for image data?

A

Efficient training due to high degrees of parallelism.

82
Q

What is a key challenge faced by MAB algorithms?

A

Striking the right balance between exploration and exploitation.
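
One simple way to illustrate this trade-off is the epsilon-greedy strategy, named here as an example rather than as the method the cards have in mind: with probability epsilon the algorithm explores a random arm, otherwise it exploits the arm with the best estimated reward. The reward estimates below are made-up values.

```python
import random

def epsilon_greedy(estimated_rewards, epsilon=0.1, rng=random):
    """Pick an arm index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # exploration: try any arm uniformly at random
        return rng.randrange(len(estimated_rewards))
    # exploitation: choose the arm with the highest estimated reward
    return max(range(len(estimated_rewards)), key=estimated_rewards.__getitem__)

estimates = [0.1, 0.9, 0.3]  # hypothetical average rewards per arm
greedy_choice = epsilon_greedy(estimates, epsilon=0.0)
```

A larger epsilon gathers more information about uncertain arms at the cost of short-term reward.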

83
Q

What are some practical applications of computer vision technology?

A
  • Object identification
  • Image classification
  • Face recognition
  • Activity detection
84
Q

What is embedding?

A

A technique used to generate low-dimensional representations (mathematical vectors) for words or sentences that capture the semantic meaning of the text.

85
Q

What does the underlying idea of embedding suggest?

A

Words or sentences with similar semantic meanings tend to occur in similar contexts.

86
Q

How are semantically similar entities represented in embedding space?

A

They are closer to each other than those with different meanings.

87
Q

What is cosine similarity?

A

A metric that measures how similar two vectors are by calculating the cosine of the angle between them.
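
In formula form, the cosine of the angle is the dot product divided by the product of the vector lengths. A minimal sketch with plain Python lists:

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|); ranges from -1 to 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

same_direction = cosine_similarity([1, 2], [2, 4])
orthogonal = cosine_similarity([1, 0], [0, 1])
opposite = cosine_similarity([1, 0], [-1, 0])
```

Because only the angle matters, two embeddings that point the same way score close to 1 even if their magnitudes differ.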

88
Q

What are the two techniques for learning embedding in Word2Vec?

A
  • CBOW (Continuous Bag of Words)
  • Continuous-skip-gram
89
Q

How does CBOW work?

A

It tries to predict a word for a given window of surrounding words.

90
Q

How does continuous-skip-gram work?

A

It tries to predict surrounding words for a given word.

91
Q

What is the purpose of a sliding window in CBOW?

A

To slide across running text, choosing one word in each window as the target and the remaining words as inputs.
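
The sketch below shows how such a window could produce training pairs. It assumes the target is the center word and the context is up to `window` words on each side; the function name is hypothetical.

```python
def cbow_pairs(tokens, window=2):
    """Return (context_words, target_word) pairs for CBOW-style training."""
    pairs = []
    for i, target in enumerate(tokens):
        # up to `window` words before and after the target form the inputs
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        pairs.append((context, target))
    return pairs

tokens = ["the", "quick", "brown", "fox"]
pairs = cbow_pairs(tokens, window=1)
```

Swapping the roles, so that the center word is the input and each context word a target, gives the pairs used by skip-gram.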

92
Q

What type of network is used to train Word2Vec embeddings?

A

A straightforward one-hidden-layer MLP network.

93
Q

What is the purpose of the hidden layer’s weights in Word2Vec?

A

They serve as the actual embeddings for the words after training.

94
Q

What is the term for using embeddings as features for downstream tasks?

A

Transfer learning.

95
Q

What limitation does Word2Vec have regarding word meanings?

A

It produces a fixed embedding representation for each word, disregarding contextual variations.

96
Q

What does BERT stand for?

A

Bidirectional Encoder Representations from Transformers.

97
Q

What is the main advantage of contextualized word embeddings?

A

They consider the surrounding words or overall context, allowing for more nuanced representations.

98
Q

What are the two main pre-training tasks BERT performs?

A
  • Predicting randomly masked words in sentences
  • Predicting whether one sentence follows another (next sentence prediction)
99
Q

What is the significance of subword level embeddings in BERT?

A

They allow BERT to handle out-of-vocabulary (OOV) words more effectively.

100
Q

What component of the transformer architecture does BERT primarily use?

A

The encoder part.

101
Q

What are some NLP tasks that BERT can be used for?

A
  • Question answering
  • Named entity extraction
  • Text summarization
102
Q

What is fine-tuning in the context of BERT?

A

Adding an additional output layer to the BERT network for a specific task and updating the pre-trained model weights.

103
Q

What is a GAN?

A

A type of generative model designed to generate realistic data instances, such as images.

104
Q

What are the two networks in a GAN?

A
  • Generator
  • Discriminator
105
Q

What is the role of the Generator in a GAN?

A

To generate instances of data.

106
Q

What does the Discriminator in a GAN do?

A

Learns to distinguish real data instances from fake instances produced by the Generator.

107
Q

What is few-shot learning?

A

A process where a model learns how to perform a task with just a few examples.

108
Q

What is the main difference between GPT and BERT?

A

GPT uses the Transformer decoder block while BERT uses the Transformer encoder block.

109
Q

What is the primary training approach used by GPT?

A

Next word prediction.

110
Q

What does GPT-3 exemplify in terms of model parameters?

A

The scale of large language models: it has 175 billion parameters.

111
Q

What are foundation models?

A

Models that are pre-trained on massive datasets and can handle multiple tasks.

112
Q

What is the architecture called that PaLM uses?

113
Q

What is LLaMA?

A

An LLM available in multiple sizes from 7 billion to 65 billion parameters.

114
Q

What is a key advantage of LLaMA compared to larger models?

A

It requires fewer computational resources.

115
Q

What is the parameter range of LLaMA?

A

From 7 billion parameters to 65 billion parameters

116
Q

What advantages does LLaMA offer compared to larger models?

A

Requires fewer computational resources

117
Q

What capabilities does LLaMA provide?

A
  • Generating creative text
  • Answering questions
  • Solving mathematical problems
118
Q

What type of license has Meta issued for LLaMA?

A

A noncommercial license emphasizing usage in research contexts

119
Q

How does LLaMA perform when fine-tuned?

A

It performs extremely well with additional training data

120
Q

What is the parameter count of BLOOM?

A

176 billion parameters

121
Q

In how many languages can BLOOM generate text?

A

46 natural languages and 13 programming languages

122
Q

How many researchers contributed to the development of BLOOM?

A

More than 1,000 researchers

123
Q

What is the Responsible AI License associated with BLOOM?

A

Individuals and institutions can use and build upon the model under agreed terms

124
Q

Where is BLOOM easily accessible?

A

Within the Hugging Face ecosystem

125
Q

What are domain-specific LLMs?

A

Models specifically trained for industries to solve tough domain-focused problems

126
Q

What is BloombergGPT?

A

A domain-focused LLM specifically trained for the finance industry

127
Q

What financial NLP tasks does BloombergGPT enhance?

A
  • Sentiment analysis
  • Named entity recognition
  • News classification
  • Question answering
128
Q

How many tokens comprise the comprehensive dataset used for BloombergGPT?

A

Over 700 billion tokens

129
Q

What are some significant limitations of LLMs?

A
  • Generating misinformation (hallucinations)
  • Toxic content
  • Potential bias
  • High resource consumption
130
Q

What common problems have existing NLP techniques solved?

A
  • Named entity extraction
  • Document classification
  • Sentiment analysis
131
Q

What recent advancements have AI made in image generation?

A

High-resolution, photorealistic images and generative art

132
Q

What is the new type of deep learning model used for image generation?

A

Diffusion model

133
Q

How does a diffusion model generate realistic data?

A

By adding noise to input data until unrecognizable, then reversing the process
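
Only the noise-adding half can be shown without a trained network. The sketch below assumes a DDPM-style update, x_t = sqrt(1 - beta)·x_{t-1} + sqrt(beta)·noise, with a hand-picked noise schedule; both the update rule and the schedule are assumptions for illustration. The reverse process needs a learned model and is not shown.

```python
import math
import random

def forward_diffusion(x, betas, rng):
    """Apply one noising step per beta and return every intermediate state."""
    trajectory = [list(x)]
    for beta in betas:
        # shrink the signal slightly and mix in Gaussian noise
        x = [math.sqrt(1.0 - beta) * v + math.sqrt(beta) * rng.gauss(0.0, 1.0) for v in x]
        trajectory.append(x)
    return trajectory

rng = random.Random(0)            # fixed seed so the run is repeatable
betas = [0.1] * 5                 # hypothetical constant noise schedule
trajectory = forward_diffusion([1.0, -1.0], betas, rng)
```

With enough steps the original values are drowned out and the data looks like pure noise, which is the starting point for generation.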

134
Q

What is the process of adding noise to data in a diffusion model called?

A

The forward diffusion process, carried out as a series of diffusion steps

135
Q

What method does a diffusion model use to learn data generation?

A

Optimizing a set of learnable parameters through backpropagation

136
Q

What does the iterative process of a diffusion model allow for?

A

Capturing complex dependencies, intricate patterns, and structures