AWS practice test questions Flashcards
An analytics company wants to use a fully managed service that automatically scales to handle the transfer of its Apache web logs, syslogs, text, and videos from its web server to Amazon S3 with minimal transformation.
What service can be used for this process?
- Kinesis Data Streams
- Kinesis Firehose
- Kinesis Data Analytics
- Amazon Kinesis Video Streams
- Kinesis Firehose*
- Kinesis Data Streams (Kinesis Data Streams requires you to manage shard capacity and build custom consumers to deliver the data to Amazon S3, so it does not meet the fully managed requirement of this question.)
- Kinesis Firehose*
- Kinesis Data Analytics (Kinesis Data Analytics is not an appropriate service here, because it helps with streaming analytics.)
- Amazon Kinesis Video Streams (Kinesis Video Streams is not an appropriate service here, because it securely streams videos from devices.)
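For context, a minimal boto3 sketch of pushing a log record into a Kinesis Data Firehose delivery stream that is already configured with Amazon S3 as its destination; the stream name is a placeholder. Firehose handles buffering, scaling, and delivery, so no consumer code is needed.

```python
import boto3

# Hypothetical delivery stream name; a real stream must already exist and be
# configured with Amazon S3 as its destination.
STREAM_NAME = "weblogs-to-s3"

firehose = boto3.client("firehose")

def ship_log_line(line: str) -> None:
    """Send one Apache/syslog line to Kinesis Data Firehose for delivery to S3."""
    firehose.put_record(
        DeliveryStreamName=STREAM_NAME,
        Record={"Data": (line.rstrip("\n") + "\n").encode("utf-8")},
    )

ship_log_line('127.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326')
```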
Q4. A video streaming company wants to analyze its VPC flow logs to build a real-time anomaly detection pipeline. The pipeline must be minimally managed and enable the business to build a near real-time dashboard.
What combination of AWS service and algorithm can the company use for this pipeline?
- Amazon SageMaker with RandomCutForest
- Kinesis Data Analytics with RandomCutForest
- Amazon QuickSight with ML Insights
- Apache Spark on Amazon EMR with MLLib
- Kinesis Data Analytics with RandomCutForest*
- Amazon SageMaker with RandomCutForest (Amazon SageMaker does not natively consume streaming data, so building a near real-time pipeline with it would require extra infrastructure and management.)
- Kinesis Data Analytics with RandomCutForest*
- Amazon QuickSight with ML Insights (Amazon QuickSight can only be used with structured datasets that need to be stored in Amazon S3 or a database.)
- Apache Spark on Amazon EMR with MLLib (Amazon EMR can be used for this use case, but it doesn’t satisfy the minimally managed requirement.)
A data and analytics company is expanding its platform on AWS. The company wants to build a serverless product that preprocesses large structured data, while minimizing the cost for data storage and compute. The company also wants to integrate the new product with an existing ML product that uses Amazon EMR with Spark.
What solution should the company use to build this new product?
- Use AWS Lambda for data preprocessing. Save the data in Amazon S3 in CSV format.
- Use AWS Glue for data preprocessing. Save the data in Amazon S3 in CSV format.
- Use AWS Glue for data preprocessing. Save the data in Amazon S3 in Parquet format.
- Use AWS Lambda for data preprocessing. Save the data in Amazon S3 in Parquet format.
- Use AWS Glue for data preprocessing. Save the data in Amazon S3 in Parquet format.*
- Use AWS Lambda for data preprocessing. Save the data in Amazon S3 in CSV format. (AWS Lambda has a maximum runtime of 15 minutes, making it less than ideal for this particular situation. Additionally, saving the data in CSV format will not meet the question’s cost requirements.)
- Use AWS Glue for data preprocessing. Save the data in Amazon S3 in CSV format. (AWS Glue will work as a solution for data preprocessing, but saving the data in CSV format does not fulfill the company’s cost requirements.)
- Use AWS Glue for data preprocessing. Save the data in Amazon S3 in Parquet format.*
- Use AWS Lambda for data preprocessing. Save the data in Amazon S3 in Parquet format. (Lambda has a maximum runtime of 15 minutes, making it less than ideal for this particular situation.)
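A rough sketch of what the Glue side of this could look like, assuming an AWS Glue job where the standard awsglue libraries are available; the database, table, and bucket names are placeholders. It reads cataloged data and writes Parquet, which is columnar, compressed, and splittable, so it is cheaper to store and faster for Spark on Amazon EMR to read than CSV.

```python
# Sketch of a Glue ETL script (runs inside an AWS Glue job).
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the raw structured data that was cataloged by a Glue crawler
# (placeholder database and table names).
raw = glue_context.create_dynamic_frame.from_catalog(
    database="analytics_db", table_name="raw_events"
)

# Write the preprocessed data back out as Parquet for the existing EMR/Spark product.
glue_context.write_dynamic_frame.from_options(
    frame=raw,
    connection_type="s3",
    connection_options={"path": "s3://example-bucket/preprocessed/"},
    format="parquet",
)

job.commit()
```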
A financial organization uses multiple ML models to detect irregular patterns in its data to combat fraudulent activity such as money laundering. They use a TensorFlow-based Docker container on GPU-enabled Amazon EC2 instances to concurrently train the multiple models for this workload.
However, they want to automate the batch data preprocessing and ML training aspects of this pipeline, scheduling them to take place automatically every 24 hours.
What AWS service can they use to do this?
- AWS Glue
- AWS Batch
- Amazon EMR
- Kinesis Data Analytics
- AWS Batch*
- AWS Glue (AWS Glue cannot import a Docker container with TensorFlow to be used in the pipeline.)
- AWS Batch*
- Amazon EMR (Amazon EMR can be used for this use case, but the pipeline will not be automatically scheduled.)
- Kinesis Data Analytics (Kinesis Data Analytics can be used only on streaming data.)
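A minimal boto3 sketch of submitting the containerized training through AWS Batch; the job queue and job definition names are placeholders, and the 24-hour schedule would come from an EventBridge rule targeting that same queue and definition.

```python
import boto3

batch = boto3.client("batch")

# Placeholder names; the job definition would point at the TensorFlow Docker
# image and request GPU resources on the compute environment.
response = batch.submit_job(
    jobName="nightly-fraud-model-training",
    jobQueue="gpu-training-queue",
    jobDefinition="tensorflow-gpu-training:1",
)
print("Submitted Batch job:", response["jobId"])

# To run this every 24 hours, an EventBridge (CloudWatch Events) rule with
# ScheduleExpression "rate(24 hours)" can target the Batch job queue and
# job definition directly, so no cron server is needed.
```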
A real estate startup wants to use ML to predict the value of homes in various cities. To do so, the startup’s data science team is joining real estate price data with other variables such as weather, demographic, and standard of living data.
However, the team is having problems with slow model convergence. Additionally, the model includes large weights for some features, which is causing degradation in model performance.
What kind of data preprocessing technique should the team use to more effectively prepare this data?
- Standard scaler
- Normalizer
- Max absolute scaler
- One hot encoder
- Standard scaler* (Standard scaler is the best option, because it performs scaling and shifting/centering.)
- Normalizer (This would perform row normalization. This situation requires column normalization.)
- Max absolute scaler (This would scale each column by its max value, but would not shift/center the data.)
- One hot encoder (There is no symbolic/string data mentioned here that would call for one-hot encoding.)
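A small scikit-learn sketch contrasting the three scalers on toy data; StandardScaler both centers and scales each column, which is what addresses the large weights and slow convergence.

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler, Normalizer, StandardScaler

# Toy feature matrix: two columns on very different scales (e.g., price and a weather index).
X = np.array([[250_000.0, 0.2],
              [480_000.0, 0.9],
              [1_200_000.0, 0.5]])

# StandardScaler centers each column to mean 0 and scales it to unit variance,
# so no single feature dominates the learned weights.
X_std = StandardScaler().fit_transform(X)

# For comparison: MaxAbsScaler only divides each column by its max absolute value
# (no centering), and Normalizer rescales each ROW to unit norm.
X_maxabs = MaxAbsScaler().fit_transform(X)
X_rownorm = Normalizer().fit_transform(X)

print(X_std.mean(axis=0), X_std.std(axis=0))  # approximately [0, 0] and [1, 1]
```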
A Data Scientist at a retail company is using Amazon SageMaker to classify social media posts that mention the company into one of two categories: Posts that require a response from the company, and posts that do not. The Data Scientist is using a training dataset of 10,000 posts, which contains the timestamp, author, and full text of each post.
However, the Data Scientist is missing the target labels that are required for training.
Which approach can the Data Scientist take to create valid target label data? (Select TWO.)
- Ask the social media handling team to review each post using Amazon SageMaker Ground Truth and provide the label
- Use the sentiment analysis natural language processing library to determine whether a post requires a response
- Use Amazon Mechanical Turk to publish Human Intelligence Tasks that ask Turk workers to label the posts
- Use the a priori probability distribution of the two classes. Then, use Monte-Carlo simulation to generate the labels
- Use K-Means to cluster posts into various groups, and pick the most frequent word in each group as its label
- Ask the social media handling team to review each post using Amazon SageMaker Ground Truth and provide the label*
- Use Amazon Mechanical Turk to publish Human Intelligence Tasks that ask Turk workers to label the posts*
- Use the sentiment analysis natural language processing library to determine whether a post requires a response (Sentiment analysis would not directly create a binary label.)
- Use the a priori probability distribution of the two classes. Then, use Monte-Carlo simulation to generate the labels (It’s not clear how this approach would assign the binary classification label that is required by this question.)
- Use K-Means to cluster posts into various groups, and pick the most frequent word in each group as its label (This creates labels, but those labels will not align to the required two categories mentioned in this question.)
A Data Scientist wants to include “month” as a categorical column in a training dataset for an ML model that is being built. However, the ML algorithm gives an error when the column is added to the training data.
What should the Data Scientist do to add this column?
- Convert the “month” column to 12 different columns, one for each month, by using one-hot encoding.
- Map the “month” column data to the numbers 1 to 12 and use this new numerical mapped column.
- Scale the months using StandardScaler.
- Use pandas fillna() to convert the column to numerical data.
- Convert the “month” column to 12 different columns, one for each month, by using one-hot encoding.*
- Map the “month” column data to the numbers 1 to 12 and use this new numerical mapped column. (This is not a good option, because the numerical mapping of the months would imply magnitudes of difference between the different months (for instance, April could be interpreted as twice February).)
- Scale the months using StandardScaler. (StandardScaler is used to scale numerical data. It will not work with categorical data, which is what this question is about.)
- Use pandas fillna() to convert the column to numerical data. (This approach, which deals with missing data, is not relevant to this question.)
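A minimal pandas sketch of the one-hot approach, using illustrative column names.

```python
import pandas as pd

# Toy dataset with a categorical "month" column (column names are illustrative).
df = pd.DataFrame({
    "month": ["Jan", "Apr", "Jul", "Apr"],
    "sqft": [900, 1200, 1500, 1100],
})

# One-hot encode "month" into separate 0/1 indicator columns, so the model does
# not interpret months as ordered magnitudes (e.g., April is not "twice" February).
encoded = pd.get_dummies(df, columns=["month"], prefix="month")
print(encoded.head())
```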
A Data Scientist for a credit card company is creating a solution to predict credit card fraud at the time of transaction. To that end, the Data Scientist is looking to create an ML model to predict fraud and will do so by training that model on an existing dataset of credit card transactions. That dataset contains 1,000 examples of transactions in total, only 50 of which are labeled as fraud.
How should the Data Scientist deal with this class imbalance?
- Use the Synthetic Minority Oversampling Technique (SMOTE) to oversample the fraud records
- Undersample the non-fraudulent records to improve the class imbalance
- Use K-fold cross validation when training the model
- Drop all the fraud examples, and use a One-Class SVM to classify
- Use the Synthetic Minority Oversampling Technique (SMOTE) to oversample the fraud records* (Instead of undersampling the majority class, SMOTE synthetically oversamples the minority class, which makes it the best solution for this situation.)
- Undersample the non-fraudulent records to improve the class imbalance (This approach essentially requires throwing away data, which is definitely not a good solution given the small dataset in this question.)
- Use K-fold cross validation when training the model (This is a good evaluation technique, but will not improve the model’s capability to differentiate between the classes.)
- Drop all the fraud examples, and use a One-Class SVM to classify (This artificially throws away real data, and one class methods are useful for anomaly detection, not for binary classification as is the case here)
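A short sketch of SMOTE with the imbalanced-learn package, using synthetic data that mimics the 5% fraud rate in the question.

```python
# Requires the imbalanced-learn package (pip install imbalanced-learn).
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Synthetic stand-in for the transactions: roughly 5% of examples in the fraud class.
X, y = make_classification(
    n_samples=1_000, n_features=10, weights=[0.95, 0.05], random_state=42
)
print("before:", Counter(y))   # e.g., {0: ~950, 1: ~50}

# SMOTE synthesizes new minority-class examples by interpolating between
# existing fraud records and their nearest minority-class neighbors.
X_resampled, y_resampled = SMOTE(random_state=42).fit_resample(X, y)
print("after:", Counter(y_resampled))   # classes are now roughly balanced
```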
An ML Engineer at a real estate startup wants to use a new quantitative feature for an existing ML model that predicts housing prices. Before adding the feature to the cleaned dataset, the Engineer wants to visualize the feature in order to check for outliers and overall distribution and skewness of the feature.
What visualization technique should the ML Engineer use? (Select TWO.)
- Box Plot
- Histogram
- Scatterplot
- Heatmap
- T-SNE
- Box Plot*
- Histogram*
- Scatterplot (Scatterplot can help check for outliers, but it won’t show the skewness of the data.)
- Heatmap (Heatmaps show relationships between two variables, but are not enough to check for overall distribution or skewness in the data.)
- T-SNE (T-SNE is used to reduce the dimensionality of the data. It is not used to visualize outliers.)
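A quick pandas/matplotlib sketch of both recommended visualizations on a synthetic skewed feature with a couple of injected outliers.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic right-skewed feature with a few outliers, as a stand-in for the new column.
rng = np.random.default_rng(0)
feature = pd.Series(np.concatenate([rng.lognormal(3, 0.5, 1_000), [400, 550]]))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
feature.plot(kind="box", ax=ax1, title="Box plot: outliers")          # points beyond the whiskers
feature.plot(kind="hist", bins=50, ax=ax2, title="Histogram: distribution / skew")
print("skewness:", feature.skew())   # > 0 indicates right skew
plt.tight_layout()
plt.show()
```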
A company is using its genomic data to classify how different human DNA affects cell growth, so that they can predict a person’s chances of getting cancer. Before creating and preparing the training and validation datasets for the model, the company wants to reduce the high dimensionality of the data.
What technique should the company use to achieve this goal? (Select TWO.)
- Use seaborn distribution plot (distplot) to visualize the correlated data. Remove the unrelated features.
- Use T-SNE to reduce the dimensionality of the data. Visualize the data using matplotlib.
- Use Principal Component Analysis (PCA) to reduce the dimensionality of the data. Visualize the data using matplotlib.
- Calculate the eigenvectors. Use a scatter matrix to choose the best features.
- Use L2 regularization to reduce the features used in the data. Visualize the data using matplotlib.
- Use T-SNE to reduce the dimensionality of the data. Visualize the data using matplotlib.*
- Use Principal Component Analysis (PCA) to reduce the dimensionality of the data. Visualize the data using matplotlib.*
- Use seaborn distribution plot (distplot) to visualize the correlated data. Remove the unrelated features. (A distribution plot does not show correlation between features and does not perform dimensionality reduction.)
- Calculate the eigenvectors. Use a scatter matrix to choose the best features. (Eigenvectors alone do not reduce the dimensionality of the data; you still need a technique such as PCA to project the data onto them.)
- Use L2 regularization to reduce the features used in the data. Visualize the data using matplotlib. (L2 regularization is not a feature reduction technique—although for linear models, L1 regularization can act like a feature reduction technique.)
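A small scikit-learn sketch of both techniques on a stand-in high-dimensional dataset (64-dimensional digit images), visualized with matplotlib.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Stand-in for high-dimensional features: 64-dimensional digit images.
X, y = load_digits(return_X_y=True)

# PCA: linear projection onto the directions of maximum variance.
X_pca = PCA(n_components=2).fit_transform(X)

# t-SNE: nonlinear embedding that preserves local neighborhood structure.
X_tsne = TSNE(n_components=2, random_state=0).fit_transform(X)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.scatter(X_pca[:, 0], X_pca[:, 1], c=y, s=5)
ax1.set_title("PCA (2 components)")
ax2.scatter(X_tsne[:, 0], X_tsne[:, 1], c=y, s=5)
ax2.set_title("t-SNE (2 components)")
plt.show()
```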
A Data Scientist wants to create a linear regression model to train on a housing dataset to predict home prices. As part of that process, the Data Scientist created a correlation matrix between the dataset’s features and the target variable. The correlations between the target and two of the features, feature 3 and feature 7, are 0.64 and -0.85, respectively.
Which feature has a stronger correlation with the target variable?
- Feature 3
- Feature 7
- There is not sufficient data to determine which variable has a stronger correlation to the target
- Feature 7 and feature 3 both have weak correlations to the target
- Feature 7* (The strength of a correlation is given by its magnitude (absolute value), not its sign. |-0.85| = 0.85 is greater than 0.64, so feature 7 has the stronger, inverse, correlation with the target.)
- Feature 3 (Feature 3 has a positive correlation of 0.64, but its magnitude is smaller than the 0.85 magnitude of feature 7’s correlation, so it is the weaker of the two.)
- There is not sufficient data to determine which variable has a stronger correlation to the target (The correlation coefficients alone are sufficient to compare strength.)
- Feature 7 and feature 3 both have weak correlations to the target (This is not true; magnitudes of 0.64 and 0.85 are considered moderate to strong.)
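A short pandas sketch showing how comparing correlations by magnitude works, with illustrative column names and synthetic housing data.

```python
import numpy as np
import pandas as pd

# Toy housing data; "price" is the target (column names are illustrative).
rng = np.random.default_rng(1)
sqft = rng.normal(1500, 300, 200)
age = rng.normal(30, 10, 200)
price = 200 * sqft - 3000 * age + rng.normal(0, 20_000, 200)
df = pd.DataFrame({"sqft": sqft, "age": age, "price": price})

# Pearson correlation of every feature with the target.
corr_with_target = df.corr()["price"].drop("price")
print(corr_with_target)

# Rank features by the MAGNITUDE of the correlation: a -0.85 beats a +0.64.
print(corr_with_target.abs().sort_values(ascending=False))
```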
A video streaming company is looking to create a personalized experience for its customers on its platform. The company wants to provide recommended videos to stream based on what other similar users watched previously. To this end, it is collecting its platform’s clickstream data using an ETL pipeline and storing the logs and syslogs in Amazon S3.
What kind of algorithm should the company use to create the simplest solution in this situation?
- Regression
- Classification
- Recommender system
- Reinforcement learning
- Recommender system* (A recommender system is a subclass of information filtering system that seeks to predict the "rating" or "preference" a user would give to an item. It is ideal for the situation in this question.)
- Regression (Regression will not provide personalized recommendations to customers, because it predicts a numerical value or a class from historical data rather than ranking items for each user.)
- Classification (Classification cannot deliver a personalized recommendation for every user.)
- Reinforcement learning (Reinforcement learning is a relatively new field, and while there are solutions that use it, it is not the simplest solution, since historical data is already available.)
A security and networking company wants to use ML to flag certain IP addresses that have been known to send spam and phishing information. The company wants to build an ML model based on previous user feedback indicating whether specific IP addresses have been connected to a website designed for spam and phishing.
What is the simplest solution that the company can implement?
- Regression
- Classification
- Natural language processing (NLP)
- A rule-based solution should be used instead of ML
- A rule-based solution should be used instead of ML*
- Regression (Regression needs a historical dataset with a numerical output, which does not match the use case in this question.)
- Classification (Classification can work in this situation, but it’s not the simplest solution.)
- Natural language processing (NLP) (ML for natural language processing and text analytics is used to understand the meaning of text documents. It can be one part of the solution in this context, but it is not the simplest solution.)
What factors lead to the wide adoption of neural networks in the last decade? (Select THREE.)
- Efficient algorithms
- Cheaper GPUs
- An orders of magnitude increase in data collected
- Cheaper CPUs
- Wide adoption of cloud-based services
- Efficient algorithms*
- Cheaper GPUs*
- An orders of magnitude increase in data collected* (Over the last two decades, the amount of available data of all sorts and the power of data processing hardware, especially GPUs, have increased exponentially. The massive increase in available training data, combined with better compute, has allowed the creation of larger, deeper neural networks, which simply perform better than smaller ones.)
- Cheaper CPUs (GPUs, not CPUs, are what made training neural networks efficient, so cheaper CPUs had little to do with the wide adoption of neural networks in the last decade.)
- Wide adoption of cloud-based services (While cloud-based services made it easy for everyone to do machine learning, they build on the actual enabling factors: efficient algorithms, cheaper GPUs, and more data.)
An online news organization wants to expand its reach globally by translating some of its most commonly read articles into different languages using ML. The organization’s data science team is gathering all the news articles that they have published in both English and at least one other language. They want to use this data to create one machine learning model for each non-English language that the organization is targeting. The models should only require minimum management.
What approach should the team use to building these models?
- Use Amazon SageMaker Object2Vec to create a vector. Use the SockEye model in Amazon SageMaker using Building Your Own Containers (BYOC)
- Use Amazon SageMaker Object2Vec to create a vector. Use the Amazon SageMaker built-in Sequence to Sequence model (Seq2Seq)
- Use Amazon SageMaker Object2Vec to create a vector. Use Amazon EC2 instances with the Deep Learning Amazon Machine Image (AMI) to create a language encoder-decoder model
- Use Amazon SageMaker Object2Vec to create a vector. Then use a Long Short-term Memory (LSTM) model using Building Your Own Containers (BYOC)
- Use Amazon SageMaker Object2Vec to create a vector. Use the Amazon SageMaker built-in Sequence to Sequence model (Seq2Seq)* (This is the best answer, because Amazon SageMaker takes care of the management and heavy lifting of the model training and deployment.)
- Use Amazon SageMaker Object2Vec to create a vector. Use the SockEye model in Amazon SageMaker using Building Your Own Containers (BYOC) (BYOC requires more management of the model training process, because you have to maintain the containers and code.)
- Use Amazon SageMaker Object2Vec to create a vector. Use Amazon EC2 instances with the Deep Learning Amazon Machine Image (AMI) to create a language encoder-decoder model (This solution is not ideal, given the situation outlined in the question, because it requires you to manage the model training and deployment yourself.)
- Use Amazon SageMaker Object2Vec to create a vector. Then use a Long Short-term Memory (LSTM) model using Building Your Own Containers (BYOC) (BYOC requires more management of the model training process, because you have to maintain containers and code.)
An ad tech company is using an XGBoost model to classify its clickstream data. The company’s Data Scientist is asked to explain how the model works to a group of non-technical colleagues.
What is a simple explanation the Data Scientist can provide?
- XGBoost is an Extreme Gradient Boosting algorithm that is optimized for boosted decision trees
- XGBoost is a state-of-the-art algorithm that uses logistic regression to split each feature of the data based on certain conditions
- XGBoost is a robust, flexible, scalable algorithm that uses logistic regression to classify data into buckets
- XGBoost is an efficient and scalable neural network architecture.
- XGBoost is an Extreme Gradient Boosting algorithm that is optimized for boosted decision trees*
- XGBoost is a state-of-the-art algorithm that uses logistic regression to split each feature of the data based on certain conditions (XGBoost is an implementation of gradient boosted decision trees designed for speed and performance.)
- XGBoost is a robust, flexible, scalable algorithm that uses logistic regression to classify data into buckets (XGBoost uses decision trees to perform both regression and classification.)
- XGBoost is an efficient and scalable neural network architecture. (XGBoost is not a neural network but a tree boosting algorithm.)
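A tiny runnable sketch of gradient boosted trees with the open-source xgboost package on synthetic clickstream-like data, as a concrete companion to the explanation above.

```python
# Requires the xgboost package (pip install xgboost).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic stand-in for clickstream features and a binary click/no-click label.
X, y = make_classification(n_samples=2_000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gradient-boosted decision trees: each new tree corrects the errors of the
# ensemble built so far; no neural network or logistic regression is involved.
model = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))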
An ML scientist has built a tree ensemble model (1,000 decision trees) using scikit-learn. The training accuracy for the model was 99.2% and the test accuracy was 70.3%.
Should the Scientist use this model in production?
- Yes, because it is generalizing well on the training set
- No, because it is generalizing well on the training set
- No, because it is not generalizing well on the test set
- Yes, because it is not generalizing well on the test set
- No, because it is not generalizing well on the test set* (This is correct, because the model is not generalizing well, as illustrated by the difference in accuracy scores between training and testing. Therefore, the model should not be used in production.)
- Yes, because it is generalizing well on the training set (This is incorrect, because the model is not generalizing well, as illustrated by the difference in accuracy scores between training and testing.)
- No, because it is generalizing well on the training set (This is incorrect, because the model is not generalizing well, as illustrated by the difference in accuracy scores between training and testing.)
- Yes, because it is not generalizing well on the test set (This is correct in that the model is not generalizing well, but as a result, the scientist shouldn’t use the model in production.)
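A small scikit-learn sketch of the generalization check, using a 1,000-tree random forest on noisy synthetic data as a stand-in for the scientist's model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Noisy synthetic data so the ensemble can memorize the training set but not the test set.
X, y = make_classification(n_samples=2_000, n_features=20, n_informative=5,
                           flip_y=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=1_000, random_state=0)
model.fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
# A large gap (e.g., 0.99 vs 0.70) signals overfitting: do not ship to production.
```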
A Machine Learning Engineer wants to use Amazon SageMaker and the built-in XGBoost algorithm for model training. The training data is currently stored in CSV format, with the first 10 columns representing features and the 11th column representing the target label.
What should the ML Engineer do to prepare the data for use in an Amazon SageMaker training job?
- The target label should be changed to the first column. The data should be split into training, validation, and test sets. Finally, the datasets should be uploaded to Amazon S3.
- The dataset should be uploaded directly to Amazon S3. Amazon SageMaker can then be used to split the data into training, validation, and test sets.
- The data should be split into training, validation, and test sets. The datasets should then be uploaded to Amazon S3.
- The target label should be changed to the first column. The dataset should then be uploaded to Amazon S3. Finally, Amazon SageMaker can be used to split the data into training, validation, and test sets.
- The target label should be changed to the first column. The data should be split into training, validation, and test sets. Finally, the datasets should be uploaded to Amazon S3.* (For training data in CSV format, the XGBoost algorithm assumes that the target variable is in the first column and that the file has no header record. Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html#InputOutput-XGBoost for more information.)
- The dataset should be uploaded directly to Amazon S3. Amazon SageMaker can then be used to split the data into training, validation, and test sets. (You should split the data before you upload the train, test and validation datasets to Amazon S3. Amazon S3 cannot split the data automatically.)
- The data should be split into training, validation, and test sets. The datasets should then be uploaded to Amazon S3. (Splitting the data before uploading to Amazon S3 is right, but the built-in XGBoost algorithm expects the target variable to be in the first column, which must be arranged before upload.)
- The target label should be changed to the first column. The dataset should then be uploaded to Amazon S3. Finally, Amazon SageMaker can be used to split the data into training, validation, and test sets. (Amazon SageMaker cannot split the data automatically.)
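A rough sketch of the data preparation, assuming a local CSV with a "label" column and a placeholder S3 bucket name.

```python
import boto3
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical local file: 10 feature columns plus "label" as the 11th column.
df = pd.read_csv("data.csv")
df = df[["label"] + [c for c in df.columns if c != "label"]]  # move target to column 0

train, rest = train_test_split(df, test_size=0.3, random_state=0)
val, test = train_test_split(rest, test_size=0.5, random_state=0)

# Built-in XGBoost expects CSV with no header and the target in the first column.
train.to_csv("train.csv", header=False, index=False)
val.to_csv("validation.csv", header=False, index=False)

s3 = boto3.client("s3")
s3.upload_file("train.csv", "example-bucket", "xgboost/train/train.csv")
s3.upload_file("validation.csv", "example-bucket", "xgboost/validation/validation.csv")
```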
Q21. A navigation and transportation company is using satellite images to model weather around the world in order to create optimal routes for its ships and planes. The company is using Amazon SageMaker training jobs to build and train its models.
However, during training, it takes too long to download the company’s 100 GB data from Amazon S3 to the training instance before the training starts.
What should the company do to speed up its training jobs while keeping the costs low?
- Increase the instance size for training
- Increase the batch size in the model
- Change the input mode to Pipe
- Create an Amazon EBS volume with the data on it and attach it to the training job
- Change the input mode to Pipe* (With Pipe input mode, your dataset is streamed directly to your training instances instead of being downloaded first. This means that your training jobs start sooner, finish quicker, and need less disk space.)
- Increase the instance size for training (Increasing the instance size may increase network throughput a little, but it won’t speed up the training job, because the job still has to wait for the whole dataset to download to the instance.)
- Increase the batch size in the model (Increasing the batch size doesn’t address the data download bottleneck before training starts.)
- Create an Amazon EBS volume with the data on it and attach it to the training job (An Amazon EBS volume would certainly help speed things up, but you cannot attach an existing EBS volume to a training job.)
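A minimal sketch of enabling Pipe mode with the SageMaker Python SDK (v2 parameter names); the image URI, role ARN, and S3 path are placeholders.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
estimator = Estimator(
    image_uri="<training-image-uri>",      # placeholder
    role="<execution-role-arn>",           # placeholder
    instance_count=1,
    instance_type="ml.p3.2xlarge",
    input_mode="Pipe",                     # stream data from S3 instead of downloading it first
    sagemaker_session=session,
)

# The 100 GB dataset is streamed into the container as the job runs,
# so training starts without waiting for a full download.
estimator.fit({"train": TrainingInput("s3://example-bucket/satellite/train/")})
```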
A Data Scientist wants to tune the hyperparameters of a machine learning model to improve the model’s F1 score.
What technique can be used to achieve this desired outcome on Amazon SageMaker? (Select TWO)
- Grid Search
- Random Search
- Breadth First Search
- Bayesian optimization
- Depth first search
- Random Search* (Random Search replaces the exhaustive enumeration of all combinations by selecting them randomly. It can outperform Grid search, especially when only a small number of hyperparameters affects the final performance of the machine learning algorithm.)
- Bayesian optimization* (Bayesian optimization builds a probabilistic model of the function mapping from hyperparameter values to the objective evaluated on a validation set. In practice, Bayesian optimization has been shown to obtain better results in fewer evaluations than grid search and random search, because it can reason about the quality of experiments before they are run. Amazon SageMaker supports Bayesian hyperparameter optimization.)
- Grid Search (The traditional way of performing hyperparameter optimization has been grid search, or a parameter sweep: an exhaustive search through a manually specified subset of the hyperparameter space, guided by a performance metric that is typically measured by cross-validation on the training set or evaluation on a held-out validation set.)
- Breadth First Search (Breadth-first search is not an algorithm for hyperparameter optimization. Rather, it is a graph algorithm for traversing or searching tree or graph data structures.)
- Depth first search (Depth-first search is a graph traversal algorithm; it is not used for hyperparameter optimization.)
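As a local illustration of random versus grid search while optimizing F1 (scikit-learn rather than SageMaker); on SageMaker the equivalent capability is automatic model tuning with the Random or Bayesian strategy, sketched under the HyperparameterTuner() question further below.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)
model = RandomForestClassifier(random_state=0)

# Grid search: exhaustively tries every combination (3 x 3 = 9 candidates).
grid = GridSearchCV(model,
                    {"n_estimators": [100, 200, 400], "max_depth": [4, 8, None]},
                    scoring="f1", cv=3)
grid.fit(X, y)

# Random search: samples a fixed budget of candidates from distributions.
rand = RandomizedSearchCV(model,
                          {"n_estimators": randint(50, 500), "max_depth": randint(2, 16)},
                          n_iter=9, scoring="f1", cv=3, random_state=0)
rand.fit(X, y)

print("grid best F1:", grid.best_score_, grid.best_params_)
print("random best F1:", rand.best_score_, rand.best_params_)
```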
A Data Scientist is using stochastic gradient descent (SGD) as the gradient optimizer to train a machine learning model. However, the model training error is taking longer to converge to the optimal solution than desired.
What optimizer can the Data Scientist use to improve training performance? (Select THREE)
- Adam
- Adagrad
- Gradient Descent
- RMSProp
- Mini-batch gradient descent
- Xavier
- Adam* (Adam, short for Adaptive Moment Estimation, adapts per-parameter learning rates using estimates of the first and second moments of the gradients, which can help the model converge faster and get out of being stuck in local minima.)
- Adagrad* (Adagrad is an algorithm for gradient-based optimization that adapts the learning rate to the parameters by performing smaller updates and, in turn, helps with convergence.)
- RMSProp* (RMSProp uses a moving average of squared gradients to normalize the gradient itself, which helps with faster convergence.)
- Gradient Descent (Gradient descent will take longer to converge than SGD does since it needs the whole dataset for every step calculation.)
- Mini-batch gradient descent (Mini batch gradient descent will suffer from some of the same problems as SGD.)
- Xavier (Xavier is an initialization technique and not an optimization technique.)
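A small TensorFlow/Keras sketch of swapping plain SGD for an adaptive optimizer; the network layers are arbitrary placeholders.

```python
import tensorflow as tf

# Placeholder architecture; the point is the optimizer choice, not the layers.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Adaptive optimizers adjust per-parameter learning rates for faster convergence.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
# Alternatives with similar adaptive behavior:
# optimizer = tf.keras.optimizers.Adagrad(learning_rate=1e-2)
# optimizer = tf.keras.optimizers.RMSprop(learning_rate=1e-3)

model.compile(optimizer=optimizer, loss="binary_crossentropy", metrics=["accuracy"])
```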
A Data Scientist wants to use the Amazon SageMaker hyperparameter tuning job to automatically tune a
random forest model.
What API does the Amazon SageMaker SDK use to create and interact with the Amazon SageMaker hyperparameter tuning jobs?
- HyperparameterTunerJob()
- HyperparameterTuner()
- HyperparameterTuningJobs()
- Hyperparameter()
- HyperparameterTuner()* (This is the correct class for creating and interacting with Amazon SageMaker hyperparameter tuning jobs, as well as deploying the resulting model(s). It takes an estimator to obtain configuration information for training jobs that are created as the result of a hyperparameter tuning job. Refer to the following for more information: https://sagemaker.readthedocs.io/en/stable/tuner.html)
- HyperparameterTunerJob()
- HyperparameterTuningJobs()
- Hyperparameter()
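A minimal sketch of using HyperparameterTuner() with the SageMaker Python SDK; the estimator image, role, metric name/regex, and S3 paths are placeholders that depend on the training container.

```python
from sagemaker.estimator import Estimator
from sagemaker.tuner import ContinuousParameter, HyperparameterTuner, IntegerParameter

estimator = Estimator(
    image_uri="<training-image-uri>",      # placeholder
    role="<execution-role-arn>",           # placeholder
    instance_count=1,
    instance_type="ml.m5.xlarge",
)

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name="validation:f1",
    metric_definitions=[{"Name": "validation:f1", "Regex": "f1: ([0-9\\.]+)"}],
    hyperparameter_ranges={
        "max_depth": IntegerParameter(3, 12),
        "learning_rate": ContinuousParameter(0.01, 0.3),
    },
    objective_type="Maximize",
    max_jobs=20,
    max_parallel_jobs=2,
)

# Launches the tuning job; each child training job reads from these channels.
tuner.fit({"train": "s3://example-bucket/train/",
           "validation": "s3://example-bucket/validation/"})
```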
A Machine Learning Engineer is creating a regression model for forecasting company revenue based on an internal dataset made up of past sales and other related data.
What metric should the Engineer use to evaluate the ML model?
- Cross-entropy log loss
- Sigmoid
- Root Mean squared error (RMSE)
- Precision
- Root Mean squared error (RMSE)* (Residuals are a measure of how far data points are from the regression line; RMSE is a measure of how spread out these residuals are. RMSE is the square root of the mean of the squared residuals. It indicates the absolute fit of the model to the data, or, put another way, how close the observed data points are to the model’s predicted values.)
- Cross-entropy log loss (Cross-entropy log loss is generally used for classification.)
- Sigmoid (Sigmoid maps an input value to an output between 0 and 1. It is an activation function, not an evaluation metric for this use case.)
- Precision (Precision is the percentage of predicted positives that are actually relevant; it is a classification metric and is not used for regression problems.)
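A tiny sketch of computing RMSE with scikit-learn on illustrative revenue numbers.

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Illustrative actual vs. predicted monthly revenue figures.
y_true = np.array([120_000, 135_000, 150_000, 160_000])
y_pred = np.array([118_000, 140_000, 147_000, 171_000])

# RMSE: square root of the mean of the squared residuals, in the same units as revenue.
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
print(f"RMSE: {rmse:,.0f}")
```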