Introduction to Machine Learning Flashcards
Why is machine learning becoming mainstream and essential for IT professionals?
Organizations collect large amounts of data and rely on machine learning to gain insight from it and predict trends.
What is a dedicated module for machine learning within Apache Spark?
Spark ML
What does Spark ML do?
Spark ML integrates machine learning algorithms with Big Data processing in Apache Spark, so models can be trained quickly using distributed computation.
What is the process for Machine Learning?
Data collection, feature engineering, algorithm selection, model training/evaluation, and live analysis.
What are the types of algorithms?
Supervised and Unsupervised.
What are supervised algorithms?
Regression and classification.
What are unsupervised algorithms?
Clustering.
What is Feature Engineering?
Feature engineering prepares the inputs (features/independent variables) that a supervised learning model uses to predict the outputs (labels/dependent variables).
What are the steps in Feature Engineering?
Data cleaning, feature analysis, feature preparation, and feature scaling.
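Example (a minimal sketch, not from the original cards): the feature scaling step can be illustrated with two common techniques, min-max scaling and standardization. The cards do not say which technique is meant, so both are shown as assumptions. The function names and the sample `ages` values are hypothetical.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift and rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no spread to normalize.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column, for illustration only.
ages = [20, 30, 40, 50, 60]
scaled = min_max_scale(ages)
zscores = standardize(ages)
```

Scaling matters when features have very different ranges, since otherwise the feature with the largest numbers can dominate distance-based or gradient-based algorithms.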
What is Linear Regression used for?
Linear regression is used to predict a continuous value.
What are some of the methods in Linear Regression and what do they do?
Ordinary Least Squares Method: Minimizes the sum of squared errors to find the best-fit line.
Multivariate regression: Predicts a label using multiple features.
Accuracy measurement: The coefficient of determination (R squared) measures how well the model fits the data; values closer to 1 indicate a better fit.
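Example (a minimal sketch, not from the original cards): the methods above can be tried on a single feature using the closed-form ordinary least squares solution, followed by the coefficient of determination. The function names and the sample data are hypothetical, and the single-feature case is an assumption made to keep the example short. Multivariate regression applies the same idea with more than one feature.

```python
def fit_ols(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared errors."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data that lies close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_ols(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} r2={r2:.4f}")
```

An R squared value near 1 means the fitted line explains most of the variation in the label; a value near 0 means it explains little more than simply predicting the mean.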