How Google Does Machine Learning Flashcards

Question 1

Q

Which of the following are best practices for data quality management?

Preventing duplicates
All options are correct.
Automating data entry
Resolving missing values

Answer

A

All options are correct.
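
Two of the practices above, preventing duplicates and resolving missing values, can be sketched in a few lines. This is only an illustration: the records, the field names, and the choice of mean imputation are assumptions made for the example, not something the course prescribes.

```python
# Hypothetical customer records; None marks a missing value.
records = [
    {"id": 1, "email": "ana@example.com", "age": 34},
    {"id": 2, "email": "bo@example.com", "age": None},
    {"id": 1, "email": "ana@example.com", "age": 34},  # exact duplicate
    {"id": 3, "email": "cy@example.com", "age": 41},
]


def drop_duplicates(rows):
    """Keep the first occurrence of each identical record, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the record
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing(rows, field):
    """Replace missing values in `field` with the mean of the known values
    (mean imputation is an assumption; other strategies are equally valid)."""
    known = [r[field] for r in rows if r[field] is not None]
    mean = sum(known) / len(known)
    return [{**r, field: mean if r[field] is None else r[field]} for r in rows]


clean = fill_missing(drop_duplicates(records), "age")
print(clean)
```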

Question 2

Q

Which of the following are categories of data quality tools?

Monitoring tools
Neither option is correct
Both Cleaning tools and Monitoring tools
Cleaning tools

Answer

A

Both Cleaning tools and Monitoring tools

Question 3

Q

Which of the following is not a Data Quality attribute?

Accuracy
Redundancy
Auditability
Consistency

Answer

A

Redundancy

Question 4

Q

What are the features of low data quality?

Incomplete data
All options are correct
Duplicated data
Unreliable info

Answer

A

All options are correct

Question 5

Q

Which of the following refers to the Orderliness of data?

The data entered has the required format and structure
None of the options are correct.
The data represents reality within a reasonable period
The data record with specific details appears only once in the database

Answer

A

The data entered has the required format and structure

Question 6

Q

Which of the following machine learning models have labels, or in other words, the correct answers to whatever it is that we want to learn to predict?

Unsupervised Model
Supervised Model
None of the options are correct
Reinforcement Model

Answer

A

Supervised Model

Question 7

Q

Which statement is true?

The problem you are trying to solve, the data you have, explainability, etc. will not determine which machine learning methods you use to find a solution.
The problem you are trying to solve, the data you have, explainability, etc. will determine which machine learning methods you use to find a solution.
None of the options are correct.
Determining which machine learning methods you use to find a solution depends only on the problem or hypothesis.

Answer

A

The problem you are trying to solve, the data you have, explainability, etc. will determine which machine learning methods you use to find a solution.

Question 8

Q

What is a type of Supervised machine learning model?

Regression model.
Classification model.
None of the options are correct.
Regression models & Classification models

Answer

A

Regression models & Classification models

Question 9

Q

Which model would you use if your problem required a discrete number of values or classes?

Regression Model
Unsupervised Model
Classification Model
Supervised Model

Answer

A

Classification Model

Question 10

Q

When the data isn’t labelled, what is an alternative way of predicting the output?

Clustering Algorithms
Linear regression
None of the options are correct.
Logistic regression

Answer

A

Clustering Algorithms
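
As a rough illustration of clustering on unlabelled data, here is a tiny one-dimensional k-means sketch. The data points, the starting centers, and the helper name `kmeans_1d` are all assumptions made for the example.

```python
def kmeans_1d(points, centers, iterations=10):
    """Tiny k-means for 1-D data: assign each point to its nearest center,
    then move each center to the mean of the points assigned to it."""
    centers = list(centers)
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# No labels are supplied; the groups emerge from the values alone.
points = [1.0, 1.2, 0.8, 8.0, 8.4, 7.6]
centers, clusters = kmeans_1d(points, centers=[0.0, 10.0])
print(centers, clusters)
```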

Question 11

Q

Which of the following is not true about Exploratory Data Analysis?

Does not provide insight into the data.
Deals with unknowns.
Discovers new knowledge.
Generates a posteriori hypothesis.

Answer

A

Does not provide insight into the data.

Question 12

Q

What are the objectives of exploratory data analysis?

Uncover a parsimonious model, one which explains the data with a minimum number of predictor variables.
All options are correct.
Gain maximum insight into the data set and its underlying structure.
Check for missing data and other mistakes

Answer

A

All options are correct.

Question 13

Q

Exploratory Data Analysis is mainly performed using the following methods:

Both Univariate and Bivariate
None of the options are correct.
Bivariate
Univariate

Answer

A

Both Univariate and Bivariate
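
A small sketch of the difference: univariate analysis summarizes one variable at a time, while bivariate analysis looks at the relationship between two. The `hours` and `scores` values are made up for illustration, and Pearson correlation is just one possible bivariate measure.

```python
import statistics

# Hypothetical data: hours studied and the matching exam scores.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]

# Univariate: describe a single variable on its own.
summary = {"mean": statistics.mean(scores), "stdev": statistics.stdev(scores)}


def pearson(xs, ys):
    """Bivariate: Pearson correlation between two variables."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


r = pearson(hours, scores)
print(summary, r)
```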

Question 14

Q

Which of the following is not a component of Exploratory Data Analysis?

Statistical Analysis and Clustering
Hyperparameter tuning
Anomaly Detection
Accounting and Summarizing

Answer

A

Hyperparameter tuning

Question 15

Q

Which is the correct sequence of steps in data analysis and data visualisation of Exploratory Data Analysis?

Data Exploration -> Model Building -> Present Results -> Data Cleaning
Data Exploration -> Model Building -> Data Cleaning -> Present Results
Data Exploration -> Data Cleaning -> Model Building -> Present Results
Data Exploration -> Data Cleaning -> Present Results -> Model Building

Answer

A

Data Exploration -> Data Cleaning -> Model Building -> Present Results

Question 16

Q

To predict the continuous value of our label, which of the following algorithms is used?

Classification
Unsupervised
None of the options are correct.
Regression

Answer

A

Regression

Question 17

Q

We can minimize the error between our predicted continuous value and the label’s continuous value using which model?

Regression
Both regression and classification
None of the options are correct.
Classification

Answer

A

Regression

Question 18

Q

What is the most essential metric a regression model uses?

Mean squared error as its loss function
Both mean squared error as its loss function and cross entropy
None of the options are correct.
Cross entropy

Answer

A

Mean squared error as its loss function
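
For reference, mean squared error is the average of the squared differences between labels and predictions. A minimal sketch, with made-up label and prediction values:

```python
def mean_squared_error(labels, predictions):
    """Average of the squared differences between labels and predictions."""
    errors = [(y - y_hat) ** 2 for y, y_hat in zip(labels, predictions)]
    return sum(errors) / len(errors)


# Hypothetical continuous labels and model predictions.
labels = [3.0, 5.0, 7.0]
predictions = [2.0, 5.0, 9.0]
mse = mean_squared_error(labels, predictions)  # (1 + 0 + 4) / 3
print(mse)
```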

Question 19

Q

If we want to minimize the error or misclassification between our predicted class and the label’s class, which of the following models can be used?

Regression
Classification
None of the options are correct.
Categorical

Answer

A

Classification

Question 20

Q

Suppose we want to predict the gestation weeks of a baby. What kind of machine learning model can be used?

Categorical
Classification
None of the options are correct.
Regression

Answer

A

Regression

Question 21

Q

Fill in the blank. In the video, we presented a linear equation. This hypothesis equation is applied to every _________ of our dataset, where the weight values are fixed and the feature values come from each associated column in our machine learning dataset.

None of the options are correct.
Row and Column
Row
Column

Question 22

Q

True or False: Classification is the problem of predicting a discrete class label output for an example, while regression is the problem of predicting a continuous quantity output for an example.

False.
True.

Question 23

Q

Which of the following statements is true?

None of the options are correct.
Typically, for linear regression problems, the loss function is Mean Squared Error and typically, for classification problems, the loss function is Mean Squared Error.
Typically, for linear regression problems, the loss function is Mean Squared Error.
Typically, for classification problems, the loss function is Mean Squared Error.

Answer

A

Typically, for linear regression problems, the loss function is Mean Squared Error.
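
For contrast, classification models are commonly trained with cross entropy (log loss) rather than mean squared error. The sketch below computes binary cross entropy by hand; the labels, the probabilities, and the clipping constant `eps` are assumptions for the example.

```python
import math


def binary_cross_entropy(labels, probabilities, eps=1e-12):
    """Average log loss for binary labels (0 or 1) and predicted
    probabilities of class 1. Probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


labels = [1, 0, 1]
# Confident and correct predictions give a small loss ...
confident_right = binary_cross_entropy(labels, [0.9, 0.1, 0.9])
# ... while confident and wrong predictions are penalized heavily.
confident_wrong = binary_cross_entropy(labels, [0.1, 0.9, 0.1])
print(confident_right, confident_wrong)
```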

Question 24

Q

Fill in the blanks. Fundamentally, classification is about predicting a _______ and regression is about predicting a __________.

Label, Quantity
Log Loss, Label
Quantity, Label
RMSE, Label

Answer

A

Label, Quantity

Question 25

Q

What are the steps involved in the Perceptron Learning Process?

All options are correct.
Takes the inputs, multiplies them by their weights, and computes their sum.
Adds a bias factor, the number 1 multiplied by a weight.
Feeds the sum through the activation function.

Answer

A

All options are correct.
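
The three steps listed above can be sketched as a single forward pass. The weights, the bias weight, and the choice of a step activation are assumptions made for illustration.

```python
def step(z):
    """Step activation: fire (1) when the sum is non-negative, else 0."""
    return 1 if z >= 0 else 0


def perceptron_output(inputs, weights, bias_weight):
    # 1. Multiply the inputs by their weights and sum the results.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # 2. Add the bias factor: the number 1 multiplied by its own weight.
    weighted_sum += 1 * bias_weight
    # 3. Feed the sum through the activation function.
    return step(weighted_sum)


# Hypothetical weights chosen so the unit fires only when both inputs are on.
out_on = perceptron_output([1, 1], weights=[0.5, 0.5], bias_weight=-0.7)
out_off = perceptron_output([1, 0], weights=[0.5, 0.5], bias_weight=-0.7)
print(out_on, out_off)
```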

Question 26

Q

What are the elements of a perceptron?

All options are correct.
Input function x
Bias b
Activation function

Answer

A

All options are correct.

Question 27

Q

Which of the following statements is correct?

A perceptron is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
A perceptron is a type of sequential classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
A perceptron is a type of modular classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
A perceptron is a type of monitoring classifier.

Answer

A

A perceptron is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.

Question 28

Q

Which model is a linear classifier used in supervised learning?

All options are correct.
Neuron
Dendrites
Perceptron

Answer

A

Perceptron

Question 29

Q

Which of the following is an algorithm for supervised learning of binary classifiers, given that a binary classifier is a function that decides whether or not an input, represented by a vector of numbers, belongs to some specific class?

None of the options are correct.
Binary classifier
Perceptron
Linear regression

Answer

A

Perceptron
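
A minimal sketch of the classic perceptron learning rule, trained on the AND gate as a toy binary classification task. The zero initialization, the learning rate of 1, and the number of epochs are assumptions chosen to keep the arithmetic exact; they are not taken from the course.

```python
def predict(weights, bias, x):
    """Binary decision: 1 if the weighted sum plus bias is non-negative."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if z >= 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1):
    """Perceptron rule: after each example, nudge the weights by
    learning_rate * (target - prediction) * input."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi
                       for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Labelled examples: input vector and the class it belongs to (AND gate).
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
predictions = [predict(weights, bias, x) for x, _ in and_gate]
print(weights, bias, predictions)
```

The AND gate is linearly separable, which is why a single perceptron can learn it; a task such as XOR would not converge with this rule.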

Question 30

Q

Which activation functions are needed to get the complex chain functions that allow neural networks to learn data distributions?

Linear activation functions
All options are correct
Nonlinear activation functions
None of the options are correct

Answer

A

Nonlinear activation functions
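
One way to see why nonlinearity matters: two stacked linear layers collapse into a single linear layer, while inserting a nonlinear function between them does not. The weights below are arbitrary values chosen for the illustration, and ReLU stands in for any nonlinear activation.

```python
def linear(x, w, b):
    return w * x + b


def relu(z):
    return max(0.0, z)


def two_linear_layers(x):
    # Layer 2 applied to layer 1, with no activation in between.
    return linear(linear(x, 2.0, 1.0), -3.0, 0.5)


def single_equivalent_layer(x):
    # -3 * (2x + 1) + 0.5 simplifies to -6x - 2.5: still one linear function.
    return linear(x, -6.0, -2.5)


def with_relu(x):
    # The same two layers with a nonlinear activation between them.
    return linear(relu(linear(x, 2.0, 1.0)), -3.0, 0.5)


xs = [-2.0, -1.0, 0.0, 1.0]
collapsed = all(two_linear_layers(x) == single_equivalent_layer(x) for x in xs)
nonlinear_outputs = [with_relu(x) for x in xs]
print(collapsed, nonlinear_outputs)
```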

Question 31

Q

Which of the following activation functions are used for nonlinearity?

Sigmoid
Tanh
Hyperbolic tangent
All options are correct.

Answer

A

All options are correct.
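
Note that tanh and hyperbolic tangent are two names for the same function. The sketch below evaluates a few common nonlinear activations on sample inputs so their output ranges can be compared; the input values are arbitrary.

```python
import math


def sigmoid(z):
    """Squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Hyperbolic tangent: squashes any real input into (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Rectified linear unit: 0 for negative inputs, identity otherwise."""
    return max(0.0, z)


inputs = [-5.0, 0.0, 5.0]
sigmoid_out = [sigmoid(z) for z in inputs]
tanh_out = [tanh(z) for z in inputs]
relu_out = [relu(z) for z in inputs]
print(sigmoid_out, tanh_out, relu_out)
```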

Question 32

Q

If we want our outputs to be in the form of probabilities, which activation function should we choose in the final layer?

Sigmoid
ReLU
Tanh
ELU

Question 33

Q

Which activation function has a range between zero and Infinity?

Sigmoid
ReLU
Tanh
ELU

Question 34

Q

A single unit for a non-input neuron has a/an ____________________

Weighted Sum
Output of the activation function
Activation function
All options are correct.

Answer

A

All options are correct.

Question 35

Q

In a classification decision tree, what does each decision or node consist of?

Linear classifier of one feature
Euclidean distance minimizer
Linear classifier of all features
Mean squared error minimizer

Answer

A

Linear classifier of one feature
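
In practice, "a linear classifier of one feature" means each node compares a single feature against a threshold. A minimal sketch of one such node; the feature meanings and the threshold value are hypothetical.

```python
def make_node(feature_index, threshold):
    """Build a tree node that tests exactly one feature against a threshold."""
    def decide(example):
        return "left" if example[feature_index] <= threshold else "right"
    return decide


# Hypothetical features: (petal_length_cm, petal_width_cm).
root = make_node(feature_index=0, threshold=2.5)
branches = [root(x) for x in [(1.4, 0.2), (4.7, 1.4)]]
print(branches)
```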

Question 36

Q

True or False: A random forest is usually more complex than an individual decision tree, which makes it harder to interpret visually.

True
False

Question 37

Q

Which of the following statements is true?

Mean squared error minimizer and euclidean distance minimizer are not used in regression and classification.
Mean squared error minimizer and euclidean distance minimizer are used in classification, not regression.
Mean squared error minimizer and euclidean distance minimizer are used in regression and classification.
Mean squared error minimizer and euclidean distance minimizer are used in regression, not classification.

Answer

A

Mean squared error minimizer and euclidean distance minimizer are used in regression, not classification.

Question 38

Q

Decision trees are one of the most intuitive machine learning algorithms. They can be used for which of the following?

Both classification and regression
None of the options are correct
Classification
Regression

Answer

A

Both classification and regression

Question 39

Q

Which of the following is the distance between two separate vectors?

None of the options are correct
Margin
New Line
Space

Question 40

Q

What is the significance of kernel transformation?

None of the options are correct.
It maps the data from our input vector space to a vector space that has features that can be linearly separated.
It maps the data from our input vector space to a vector space that has features that can be linearly separated and it transforms the data from our input vector space to a vector space.
It transforms the data from our input vector space to a vector space.

Answer

A

It maps the data from our input vector space to a vector space that has features that can be linearly separated.
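
To make the idea concrete, the sketch below uses an explicit feature map rather than a true kernel function (a kernel computes the same effect implicitly). The one-dimensional data, the map x -> (x, x^2), and the threshold are all assumptions made for the example.

```python
# Class 1 sits on both sides of class 0, so no single cut on x separates them.
points = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [1, 1, 0, 0, 0, 1, 1]


def feature_map(x):
    """Map the 1-D input into a 2-D space where the classes become separable."""
    return (x, x * x)


def classify(mapped, threshold=2.5):
    """A straight line (x^2 = threshold) separates the classes after mapping."""
    return 1 if mapped[1] > threshold else 0


predicted = [classify(feature_map(x)) for x in points]
print(predicted == labels)
```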

Question 41

Q

Which of the following statements is true about Support Vector Machines (SVM)?

Support Vector Machines (SVMs) are a particularly powerful and flexible class of supervised algorithms for both classification and regression. SVM are used for text classification tasks such as category assignment, detecting spam, and sentiment analysis.
Both options are correct
SVMs are based on the idea of finding a hyperplane that best divides a dataset into two classes. Support vectors are the data points nearest to the hyperplane, the points of a data set that, if removed, would alter the position of the dividing hyperplane. As a simple example, for a classification task with only two features, you can think of a hyperplane as a line that linearly separates and classifies a set of data.

Answer

A

Both options are correct

Question 42

Q

Which statement is true regarding kernel methods?

In machine learning, kernel methods are a class of algorithms for pattern analysis, whose best known member is the support vector machine (SVM).
In machine learning, kernel methods are a class of algorithms for network infrastructure analysis, whose best known member is the support vector machine (SVM).
In machine learning, kernel methods are a class of algorithms for protocol analysis, whose best known member is the support vector machine (SVM).
In machine learning, kernel methods are a class of algorithms for cloud protocol analysis, whose best known member is the support vector machine (SVM).

Answer

A

In machine learning, kernel methods are a class of algorithms for pattern analysis, whose best known member is the support vector machine (SVM).

Question 43

Q

Which of the following statements is true about a decision boundary?

None of the options are correct.
The more generalizable the decision boundary, the wider the margin.
The more generalizable the decision boundary, the less the margin.
The less generalizable the decision boundary, the wider the margin.

Answer

A

The more generalizable the decision boundary, the wider the margin.
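
For a linear SVM, the margin width can be computed from the weight vector. The sketch assumes the usual canonical scaling, where the support vectors satisfy w.x + b = +1 or -1, which gives a margin of 2 / ||w||. The weight vectors are made-up values.

```python
import math


def margin_width(w):
    """Margin of a linear SVM under canonical scaling: 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# A larger weight norm means a narrower margin, and vice versa.
narrow = margin_width([3.0, 4.0])  # ||w|| = 5
wide = margin_width([1.0, 0.0])    # ||w|| = 1
print(narrow, wide)
```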

Question 44

Q

Which of the following statements is true?

Dropout can help a model generalize by randomly setting the output for a given neuron to 0. In setting the output to 0, the cost function becomes more sensitive to neighbouring neurons changing the way the weights will be updated during the process of backpropagation.
Dropout can help a model generalize by randomly setting the output for a given neuron to 1. In setting the output to 1, the cost function becomes more sensitive to neighbouring neurons changing the way the weights will be updated during the process of backpropagation.
Both options are correct.

Answer

A

Dropout can help a model generalize by randomly setting the output for a given neuron to 0. In setting the output to 0, the cost function becomes more sensitive to neighbouring neurons changing the way the weights will be updated during the process of backpropagation.

Question 45

Q

Which of the following is not a type of modern neural network?

Convolutional Neural Network
Modular Neural Network
Recurrent Neural Network
Sine Neural Network

Answer

A

Sine Neural Network

Question 46

Q

Which of the following are ways to improve generalization?

Adding dropout layers. Dropout is a technique used to prevent a model from overfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 0 at each update of the training phase.
Performing data augmentation, which is a technique to artificially create new training data from existing training data. This is done by applying domain-specific techniques to examples from the training data that create new and different training examples.
Adding noise - for example, adding Gaussian noise to input variables. Gaussian noise, or white noise, has a mean of zero and a standard deviation of one and can be generated as needed using a pseudorandom number generator.
All options are correct.

Answer

A

All options are correct.
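
The noise-based option can be sketched with the standard library's pseudorandom number generator. The feature values, the default standard deviation of 0.1, and the fixed seed are assumptions for the example; in practice the noise scale would be tuned to the data.

```python
import random


def add_gaussian_noise(features, stddev=0.1, seed=0):
    """Return a noisy copy of the features using zero-mean Gaussian noise.
    The seed is fixed only to make the example repeatable."""
    rng = random.Random(seed)
    return [x + rng.gauss(0.0, stddev) for x in features]


original = [1.0, 2.0, 3.0]
augmented = add_gaussian_noise(original)
print(augmented)
```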

Question 47

Q

Which statement is true regarding the “dropout technique” used in neural networks?

Dropout is a technique used to prevent a model from overfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 0 at each update of the training phase.
Dropout is a technique used to prevent a model from underfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 0 at each update of the training phase.
Dropout is a technique used to prevent a model from overfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 1 at each update of the training phase.
None of the options are correct.

Answer

A

Dropout is a technique used to prevent a model from overfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 0 at each update of the training phase.
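
A minimal sketch of dropout applied to one layer's outputs during training. Rescaling the surviving values by 1 / keep_prob ("inverted dropout") is an assumption added here so that the expected activation stays the same; the card itself only describes the zeroing step.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; scale the survivors
    by 1 / keep_prob (inverted dropout, an assumption of this sketch)."""
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(42)  # fixed seed so the example is repeatable
hidden = [1.0] * 1000    # hypothetical hidden-layer outputs
dropped = dropout(hidden, rate=0.5, rng=rng)
zero_fraction = dropped.count(0.0) / len(dropped)
print(zero_fraction)
```

Dropout is applied only during training; at inference time all units are kept.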

Question 48

Q

Which statement is true regarding neural networks?

Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns.
Neural networks interpret sensory data through a kind of machine perception, labeling or clustering raw input.
The patterns neural networks recognize are numerical, contained in vectors, into which all real-world data, be it images, sound, text or time series, must be translated.
All options are correct.

Answer

A

All options are correct.