CAP Predictive Analytics Flashcards
Predictive analytics
Statistical techniques used to make predictions about future or otherwise unknown events
Data science
Field that applies statistics, data analysis, machine learning, and data mining to understand and analyze phenomena
Big Data
Data sets so large or complex that traditional data processing applications are inadequate
Machine learning
Study of algorithms and statistical models that learn to perform a specific task from data, relying on patterns rather than explicit instructions
Data mining
The process of discovering patterns in large data sets
Deep learning
A family of machine learning methods based on learning data representations, typically with multi-layer neural networks, as opposed to task-specific algorithms
Linear regression
Approach to modeling the relationship between a dependent variable and one or more independent variables in a linear fashion
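A minimal sketch of the idea for a single independent variable, using ordinary least squares. The function name `fit_line` and the sample data are illustrative assumptions, not part of the flashcard.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (one variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lies exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```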
Support Vector Machine
A classifier that attempts to find the maximum-margin separating hyperplane between samples in two classes
K-nearest neighbor (K-NN)
A classifier that assigns each new sample the majority class among its k nearest samples in the training set
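The voting rule can be sketched in a few lines, assuming Euclidean distance and a training set stored as (point, label) pairs. The name `knn_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Sort training samples by Euclidean distance to the query and keep k.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up training data: two well-separated groups.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]
prediction = knn_predict(train, (1.1, 1.0), k=3)
print(prediction)
```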
Logistic regression
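Approach to modeling the relationship between one or more independent variables (continuous or categorical) and a binary dependent variable; applies a sigmoid (logistic) function to a linear combination of the inputs to produce a probability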
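A small sketch of the prediction step only, assuming the weights and bias have already been learned. The names `sigmoid` and `predict_proba` and the example numbers are illustrative.

```python
import math


def sigmoid(z):
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability of the positive class for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Made-up coefficients: the linear score here is 2*1 - 1*2 + 0 = 0.
p = predict_proba([2.0, -1.0], 0.0, [1.0, 2.0])
print(p)
```

A probability at or above a chosen threshold (0.5 is a common default) is usually read as the positive class.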
Decision tree
A classifier that uses a root node, internal decision nodes, branches, and leaf nodes to reach a classification decision on unknown samples; learning comes from maximizing information gain or another metric at each split
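Information gain can be sketched as the drop in Shannon entropy from a parent node to its children, weighted by child size. The function names and labels below are assumptions for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["yes", "yes", "no", "no"]
# A split that separates the classes perfectly versus one that does not help.
perfect = information_gain(parent, [["yes", "yes"], ["no", "no"]])
useless = information_gain(parent, [["yes", "no"], ["yes", "no"]])
print(perfect, useless)
```

A tree learner of this kind would pick, at each node, the candidate split with the highest gain.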
k-means clustering
Unsupervised learning method that assigns each element to one of k clusters based on which cluster's centroid (mean) it is closest to
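One common way to compute the clusters is to alternate between assigning points to the nearest centroid and recomputing each centroid as the mean of its points. The sketch below assumes Euclidean distance and caller-supplied starting centroids; `kmeans` and the data are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Alternate assignment and centroid-update steps a fixed number of times."""
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up points forming two obvious groups, with k = 2.
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centroids, clusters = kmeans(points, [(0, 0), (10, 10)])
print(centroids)
```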
Random forest
Ensemble method that trains a set of decision trees, each on a random (bootstrap) sample of the data using randomly selected subsets of features, and then uses majority voting to classify samples
Naive Bayes Classifier
Probabilistic classifier that applies Bayes’ theorem with strong independence assumptions about features
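Under the independence assumption, the score for each class is its prior multiplied by the per-feature likelihoods, then normalized. A minimal sketch, where the function name and every probability are made-up illustrative values:

```python
def naive_bayes_posterior(priors, likelihoods, observed):
    """Posterior P(class | observed features), treating features as independent.

    priors: {cls: P(cls)}
    likelihoods: {cls: {feature: P(feature | cls)}}
    observed: list of features present in the sample
    """
    scores = {}
    for cls, prior in priors.items():
        score = prior
        for feature in observed:
            score *= likelihoods[cls][feature]
        scores[cls] = score
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}


# Hypothetical numbers for a two-class text example.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"free": 0.5, "meeting": 0.1},
    "ham": {"free": 0.1, "meeting": 0.4},
}
posterior = naive_bayes_posterior(priors, likelihoods, ["free"])
print(posterior)
```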
Neural network
A model built from interconnected artificial neurons (perceptrons), typically arranged in layers, that learns patterns in data; commonly used as a classifier