How Google Does Machine Learning Flashcards

1
Q

Which of the following are best practices for data quality management?

  1. Preventing duplicates
  2. All options are correct.
  3. Automating data entry
  4. Resolving missing values
A
  1. All options are correct.
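As a rough illustration of two of these practices (preventing duplicates and resolving missing values), here is a minimal sketch. The function name `clean_records` and the choice to fill missing values with a fixed placeholder are assumptions made for this example, not something the course prescribes.

```python
def clean_records(records, fill_value=0.0):
    """Fill missing (None) values, then drop exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in records:
        # Resolve missing values with a placeholder (hypothetical policy).
        filled = tuple(fill_value if value is None else value for value in row)
        # Prevent duplicates: keep only the first copy of each row.
        if filled in seen:
            continue
        seen.add(filled)
        cleaned.append(filled)
    return cleaned


rows = [(1, 2.0), (1, 2.0), (2, None)]
cleaned_rows = clean_records(rows)
```

In practice the fill strategy (mean, median, dropping the row) depends on the data set; a constant is used here only to keep the sketch short.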
2
Q

Which of the following are categories of data quality tools?

  1. Monitoring tools
  2. Neither option is correct
  3. Both Cleaning tools and Monitoring tools
  4. Cleaning tools
A
  1. Both Cleaning tools and Monitoring tools
3
Q

Which of the following is not a Data Quality attribute?

  1. Accuracy
  2. Redundancy
  3. Auditability
  4. Consistency
A
  1. Redundancy
4
Q

What are the features of low data quality?

  1. Incomplete data
  2. All options are correct
  3. Duplicated data
  4. Unreliable info
A
  1. All options are correct
5
Q

Which of the following refers to the Orderliness of data?

  1. The data entered has the required format and structure
  2. None of the options are correct.
  3. The data represents reality within a reasonable period
  4. The data record with specific details appears only once in the database
A
  1. The data entered has the required format and structure
6
Q

Which of the following machine learning models have labels, or in other words, the correct answers to whatever it is that we want to learn to predict?

  1. Unsupervised Model
  2. Supervised Model
  3. None of the options are correct
  4. Reinforcement Model
A
  1. Supervised Model
7
Q

Which statement is true?

  1. The problem you are trying to solve, the data you have, explainability, etc. will not determine which machine learning methods you use to find a solution.
  2. The problem you are trying to solve, the data you have, explainability, etc. will determine which machine learning methods you use to find a solution.
  3. None of the options are correct.
  4. Determining which machine learning methods you use to find a solution depends only on the problem or hypothesis.
A
  1. The problem you are trying to solve, the data you have, explainability, etc. will determine which machine learning methods you use to find a solution.
8
Q

What is a type of Supervised machine learning model?

  1. Regression model.
  2. Classification model.
  3. None of the options are correct.
  4. Regression models & Classification models
A
  1. Regression models & Classification models
9
Q

Which model would you use if your problem required a discrete number of values or classes?

  1. Regression Model
  2. Unsupervised Model
  3. Classification Model
  4. Supervised Model
A
  1. Classification Model
10
Q

When the data isn’t labelled, what is an alternative way of predicting the output?

  1. Clustering Algorithms
  2. Linear regression
  3. None of the options are correct.
  4. Logistic regression
A
  1. Clustering Algorithms
11
Q

Which of the following is not true about Exploratory Data Analysis?

  1. Does not provide insight into the data.
  2. Deals with unknowns.
  3. Discovers new knowledge.
  4. Generates a posteriori hypothesis.
A
  1. Does not provide insight into the data.
12
Q

What are the objectives of exploratory data analysis?

  1. Uncover a parsimonious model, one which explains the data with a minimum number of predictor variables.
  2. All options are correct.
  3. Gain maximum insight into the data set and its underlying structure.
  4. Check for missing data and other mistakes
A
  1. All options are correct.
13
Q

Exploratory Data Analysis is mainly performed using the following methods:

  1. Both Univariate and Bivariate
  2. None of the options are correct.
  3. Bivariate
  4. Univariate
A
  1. Both Univariate and Bivariate
14
Q

Which of the following is not a component of Exploratory Data Analysis?

  1. Statistical Analysis and Clustering
  2. Hyperparameter tuning
  3. Anomaly Detection
  4. Accounting and Summarizing
A
  1. Hyperparameter tuning
15
Q

Which is the correct sequence of steps in data analysis and data visualisation of Exploratory Data Analysis?

  1. Data Exploration -> Model Building -> Present Results -> Data Cleaning
  2. Data Exploration -> Model Building -> Data Cleaning -> Present Results
  3. Data Exploration -> Data Cleaning -> Model Building -> Present Results
  4. Data Exploration -> Data Cleaning -> Present Results -> Model Building
A
  1. Data Exploration -> Data Cleaning -> Model Building -> Present Results
16
Q

To predict the continuous value of our label, which of the following algorithms is used?

  1. Classification
  2. Unsupervised
  3. None of the options are correct.
  4. Regression
A
  1. Regression
17
Q

We can minimize the error between our predicted continuous value and the label’s continuous value using which model?

  1. Regression
  2. Both regression and classification
  3. None of the options are correct.
  4. Classification
A
  1. Regression
18
Q

What is the most essential metric a regression model uses?

  1. Mean squared error as their loss function
  2. Both Mean squared error as their loss function & cross entropy
  3. None of the options are correct.
  4. Cross entropy
A
  1. Mean squared error as their loss function
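For reference, mean squared error averages the squared differences between labels and predictions: MSE = (1/n) * sum((y_i - p_i)^2). A minimal sketch follows; the function name and the sample values are made up for illustration.

```python
def mean_squared_error(labels, predictions):
    """Average of squared differences between labels and predictions."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    squared_errors = [(y - p) ** 2 for y, p in zip(labels, predictions)]
    return sum(squared_errors) / len(squared_errors)


# Errors are 1.0 and -2.0, so the squared errors are 1.0 and 4.0.
mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])
```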
19
Q

If we want to minimize the error or misclassification between our predicted class and the label's class, which of the following models can be used?

  1. Regression
  2. Classification
  3. None of the options are correct.
  4. Categorical
A
  1. Classification
20
Q

Suppose we want to predict the gestation weeks of a baby. What kind of machine learning model can be used?

  1. Categorical
  2. Classification
  3. None of the options are correct.
  4. Regression
A
  1. Regression
21
Q

Fill in the blank. In the video, we presented a linear equation. This hypothesis equation is applied to every _________ of our dataset, where the weight values are fixed, and the feature values are from each associated column in our machine learning data set.

  1. None of the options are correct.
  2. Row and Column
  3. Row
  4. Column
A
  1. Row
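To make the row-wise idea concrete, the sketch below applies one fixed set of weights to every row of a small data set. The exact form of the equation from the video is not reproduced in this deck, so the form y = b + w1*x1 + w2*x2 and all numbers here are assumptions for illustration.

```python
def predict(rows, weights, bias):
    """Apply the same linear equation to every row of the data set."""
    predictions = []
    for row in rows:
        # Weights are fixed; feature values come from this row's columns.
        weighted_sum = sum(w * x for w, x in zip(weights, row))
        predictions.append(bias + weighted_sum)
    return predictions


dataset = [[1.0, 2.0], [3.0, 4.0]]  # two rows, two feature columns
row_predictions = predict(dataset, weights=[0.5, -1.0], bias=1.0)
```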
22
Q

True or False: Classification is the problem of predicting a discrete class label output for an example, while regression is the problem of predicting a continuous quantity output for an example.

False.
True.

A

True

23
Q

Which of the following statements is true?

  1. None of the options are correct.
  2. Typically, for linear regression problems, the loss function is Mean Squared Error, and typically, for classification problems, the loss function is Mean Squared Error.
  3. Typically, for linear regression problems, the loss function is Mean Squared Error.
  4. Typically, for classification problems, the loss function is Mean Squared Error.
A
  1. Typically, for linear regression problems, the loss function is Mean Squared Error.
24
Q

Fill in the blanks. Fundamentally, classification is about predicting a _______ and regression is about predicting a __________.

  1. Label, Quantity
  2. Log Loss, Label
  3. Quantity, Label
  4. RMSE, Label
A
  1. Label, Quantity
25
Q

What are the steps involved in the Perceptron Learning Process?

  1. All options are correct.
  2. Takes the inputs, multiplies them by their weights, and computes their sum.
  3. Adds a bias factor, the number 1 multiplied by a weight.
  4. Feeds the sum through the activation function.
A
  1. All options are correct.
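The three steps listed in this card can be sketched as a single forward pass. The step activation and the hand-picked weights (which make the unit behave like a logical AND) are assumptions for this example; the card does not say which activation function is used.

```python
def perceptron_output(inputs, weights, bias_weight):
    """Forward pass of a single perceptron with a step activation."""
    # Step 1: multiply the inputs by their weights and sum the results.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # Step 2: add the bias factor, the number 1 multiplied by a weight.
    weighted_sum += 1 * bias_weight
    # Step 3: feed the sum through the activation function (a step here).
    return 1 if weighted_sum > 0 else 0


# Hypothetical weights chosen by hand so the unit computes logical AND.
and_outputs = [
    perceptron_output(pair, weights=[1.0, 1.0], bias_weight=-1.5)
    for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]
]
```

This shows only the prediction step; learning the weights from data is a separate update procedure not covered here.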
26
Q

What are the elements of a perceptron?

  1. All options are correct.
  2. Input function x
  3. Bias b
  4. Activation function
A
  1. All options are correct.
27
Q

Which of the following statements is correct?

  1. A perceptron is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
  2. A perceptron is a type of sequential classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
  3. A perceptron is a type of modular classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
  4. A perceptron is a type of monitoring classifier.
A
  1. A perceptron is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
28
Q

Which model is the linear classifier, also used in Supervised learning?

  1. All options are correct.
  2. Neuron
  3. Dendrites
  4. Perceptron
A
  1. Perceptron
29
Q

Which of the following is an algorithm for supervised learning of binary classifiers, given that a binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class?

  1. None of the options are correct.
  2. Binary classifier
  3. Perceptron
  4. Linear regression
A
  1. Perceptron
30
Q

Which activation functions are needed to get the complex chain functions that allow neural networks to learn data distributions?

  1. Linear activation functions
  2. All options are correct
  3. Nonlinear activation functions
  4. None of the options are correct
A
  1. Nonlinear activation functions
31
Q

Which of the following activation functions are used for nonlinearity?

  1. Sigmoid
  2. Tanh
  3. Hyperbolic tangent
  4. All options are correct.
A
  1. All options are correct.
32
Q

If we want our outputs to be in the form of probabilities, which activation function should we choose in the final layer?

  1. Sigmoid
  2. ReLU
  3. Tanh
  4. ELU
A
  1. Sigmoid
33
Q

Which activation function has a range between zero and Infinity?

  1. Sigmoid
  2. ReLU
  3. Tanh
  4. ELU
A
  1. ReLU
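The output ranges behind the last two cards can be checked with a small sketch: sigmoid stays strictly between 0 and 1 (which is why it suits probabilities), tanh stays between -1 and 1, and ReLU ranges from zero upward with no upper bound. The function names below are illustrative, not taken from a specific library.

```python
import math


def sigmoid(z):
    """Squashes any moderate real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Returns 0 for negative inputs and the input itself otherwise."""
    return max(0.0, z)


sample_inputs = [-4.0, -1.0, 0.0, 1.0, 4.0]
sigmoid_values = [sigmoid(z) for z in sample_inputs]
tanh_values = [math.tanh(z) for z in sample_inputs]
relu_values = [relu(z) for z in sample_inputs]
```

Note that this simple `sigmoid` can overflow for very large negative inputs; numerically careful versions handle that case separately.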
34
Q

A single unit for a non-input neuron has a/an ____________________.

  1. Weighted Sum
  2. Output of the activation function
  3. Activation function
  4. All options are correct.
A
  1. All options are correct.
35
Q

In a decision classification tree, what does each decision or node consist of?

  1. Linear classifier of one feature
  2. Euclidean distance minimizer
  3. Linear classifier of all features
  4. Mean squared error minimizer
A
  1. Linear classifier of one feature
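A decision node of this kind can be sketched as a threshold test on a single feature, with a tree being a nesting of such tests. The feature indices, thresholds, and class names below are hypothetical values chosen for illustration.

```python
def decision_node(row, feature_index, threshold):
    """Route a row left (0) or right (1) by thresholding ONE feature."""
    return 1 if row[feature_index] > threshold else 0


def tiny_tree(row):
    """A two-level tree built from single-feature decision nodes."""
    # Root node splits on feature 0.
    if decision_node(row, 0, 5.0) == 0:
        return "class_a"
    # The right branch splits again, this time on feature 1.
    if decision_node(row, 1, 2.0) == 0:
        return "class_b"
    return "class_c"
```

For example, `tiny_tree([6.0, 1.0])` passes the root test on feature 0 and then fails the test on feature 1, so it lands in `class_b`.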
36
Q

True or False: A random forest is usually more complex than an individual decision tree; this makes it harder to interpret visually.

True
False

A

True

37
Q

Which of the following statements is true?

  1. Mean squared error minimizer and euclidean distance minimizer are not used in regression and classification.
  2. Mean squared error minimizer and euclidean distance minimizer are used in classification, not regression.
  3. Mean squared error minimizer and euclidean distance minimizer are used in regression and classification.
  4. Mean squared error minimizer and euclidean distance minimizer are used in regression, not classification.
A
  1. Mean squared error minimizer and euclidean distance minimizer are used in regression, not classification.
38
Q

Decision trees are one of the most intuitive machine learning algorithms. They can be used for which of the following?

  1. Both classification and regression
  2. None of the options are correct
  3. Classification
  4. Regression
A
  1. Both classification and regression
39
Q

Which of the following is the distance between two separate vectors?

  1. None of the options are correct
  2. Margin
  3. New Line
  4. Space
A
  1. Margin
40
Q

What is the significance of kernel transformation?

  1. None of the options are correct.
  2. It maps the data from our input vector space to a vector space that has features that can be linearly separated.
  3. It maps the data from our input vector space to a vector space that has features that can be linearly separated and it transforms the data from our input vector space to a vector space.
  4. It transforms the data from our input vector space to a vector space.
A
  1. It maps the data from our input vector space to a vector space that has features that can be linearly separated.
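The idea of mapping data into a space where it becomes linearly separable can be illustrated with a toy example. Note the assumptions: this sketch uses an explicit feature map x -> (x, x^2) rather than a true kernel function, and the data points and the threshold of 2.5 are chosen by hand.

```python
def feature_map(x):
    """Map a 1-D input to 2-D so the classes become linearly separable."""
    return (x, x * x)


def linear_classifier(point):
    """A linear boundary in the mapped space: the line x_squared = 2.5."""
    _, x_squared = point
    return 1 if x_squared > 2.5 else 0


# In 1-D, no single threshold separates the outer points (label 1)
# from the inner points (label 0).
xs = [-2.0, -1.0, 1.0, 2.0]
labels = [1, 0, 0, 1]
predictions = [linear_classifier(feature_map(x)) for x in xs]
```

Kernel methods typically avoid computing the mapped features explicitly; the explicit map is used here only because it is easier to follow.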
41
Q

Which of the following statements is true about Support Vector Machines (SVM)?

  1. Support Vector Machines (SVMs) are a particularly powerful and flexible class of supervised algorithms for both classification and regression. SVM are used for text classification tasks such as category assignment, detecting spam, and sentiment analysis.
  2. Both options are correct
  3. SVMs are based on the idea of finding a hyperplane that best divides a dataset into two classes. Support vectors are the data points nearest to the hyperplane, the points of a data set that, if removed, would alter the position of the dividing hyperplane. As a simple example, for a classification task with only two features, you can think of a hyperplane as a line that linearly separates and classifies a set of data.
A
  1. Both options are correct
42
Q

Which statement is true regarding kernel methods?

  1. In machine learning, kernel methods are a class of algorithms for pattern analysis, whose best known member is the support vector machine (SVM).
  2. In machine learning, kernel methods are a class of algorithms for network infrastructure analysis, whose best known member is the support vector machine (SVM).
  3. In machine learning, kernel methods are a class of algorithms for protocol analysis, whose best known member is the support vector machine (SVM).
  4. In machine learning, kernel methods are a class of algorithms for cloud protocol analysis, whose best known member is the support vector machine (SVM).
A
  1. In machine learning, kernel methods are a class of algorithms for pattern analysis, whose best known member is the support vector machine (SVM).
43
Q

Which of the following statements is true about a decision boundary?

  1. None of the options are correct.
  2. The more generalizable the decision boundary, the wider the margin.
  3. The more generalizable the decision boundary, the less the margin.
  4. The less generalizable the decision boundary, the wider the margin.
A
  1. The more generalizable the decision boundary, the wider the margin.
44
Q

Which of the following statements is true?

  1. Dropout can help a model generalize by randomly setting the output for a given neuron to 0. In setting the output to 0, the cost function becomes more sensitive to neighbouring neurons changing the way the weights will be updated during the process of backpropagation.
  2. Dropout can help a model generalize by randomly setting the output for a given neuron to 1. In setting the output to 1, the cost function becomes more sensitive to neighbouring neurons changing the way the weights will be updated during the process of backpropagation.
  3. Both options are correct.
A
  1. Dropout can help a model generalize by randomly setting the output for a given neuron to 0. In setting the output to 0, the cost function becomes more sensitive to neighbouring neurons changing the way the weights will be updated during the process of backpropagation.
45
Q

Which of the following is not a type of modern neural network?

  1. Convolutional Neural Network
  2. Modular Neural Network
  3. Recurrent Neural Network
  4. Sine Neural Network
A
  1. Sine Neural Network
46
Q

Which of the following are ways to improve generalization?

  1. Adding dropout layers. Dropout is a technique used to prevent a model from overfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 0 at each update of the training phase.
  2. Performing data augmentation, which is a technique to artificially create new training data from existing training data. This is done by applying domain-specific techniques to examples from the training data that create new and different training examples.
  3. Adding noise - for example, adding Gaussian noise to input variables. Gaussian noise, or white noise, has a mean of zero and a standard deviation of one and can be generated as needed using a pseudorandom number generator.
  4. All options are correct.
A
  1. All options are correct.
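As a small sketch of the noise option, the snippet below adds zero-mean Gaussian noise to a list of input features using Python's pseudorandom number generator. The function name, the seed, and the standard deviation of 0.1 are assumptions for this example; the card itself describes noise with a standard deviation of one.

```python
import random


def add_gaussian_noise(features, std_dev, rng):
    """Return a copy of the features with zero-mean Gaussian noise added."""
    return [x + rng.gauss(0.0, std_dev) for x in features]


rng = random.Random(0)  # fixed seed so the run is repeatable
noisy_features = add_gaussian_noise([1.0, 2.0, 3.0], std_dev=0.1, rng=rng)
```

The noise scale is usually tuned relative to the scale of the input variables.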
47
Q

Which statement is true regarding the “dropout technique” used in neural networks?

  1. Dropout is a technique used to prevent a model from overfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 0 at each update of the training phase.
  2. Dropout is a technique used to prevent a model from underfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 0 at each update of the training phase.
  3. Dropout is a technique used to prevent a model from overfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 1 at each update of the training phase.
  4. None of the options are correct.
A
  1. Dropout is a technique used to prevent a model from overfitting. Dropout works by randomly setting the outgoing edges of hidden units (neurons that make up hidden layers) to 0 at each update of the training phase.
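The mechanism described here can be sketched as a random mask applied during training. This is a simplified illustration: the function name and drop rate are assumptions, and the rescaling of kept units that practical implementations commonly apply is deliberately left out.

```python
import random


def dropout(activations, rate, rng):
    """Set each activation to 0 with probability `rate` (training only)."""
    return [0.0 if rng.random() < rate else a for a in activations]


rng = random.Random(42)  # fixed seed so the run is repeatable
hidden_outputs = [0.3, 1.2, 0.7, 0.9]
dropped = dropout(hidden_outputs, rate=0.5, rng=rng)
```

A fresh mask is drawn at each training update, and dropout is switched off at prediction time.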
48
Q

Which statement is true regarding neural networks?

  1. Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns.
  2. Neural networks interpret sensory data through a kind of machine perception, labeling or clustering raw input.
  3. The patterns neural networks recognize are numerical, contained in vectors, into which all real-world data, be it images, sound, text or time series, must be translated.
  4. All options are correct.
A
  1. All options are correct.