Chapter 5 Flashcards
AdaBoost
A boosting ensemble algorithm (Adaptive Boosting) that builds a sequence of weak learners, increasing the weights of the training cases misclassified by earlier learners so that later learners concentrate on them
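Example: a minimal usage sketch, assuming scikit-learn is available (the toy data and the 50-round setting are invented for illustration):

    from sklearn.ensemble import AdaBoostClassifier

    # Toy data: two numeric features, binary class label
    X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
    y = [0, 0, 0, 1, 1, 1]

    # AdaBoost fits 50 weak learners (shallow decision trees by default),
    # reweighting the misclassified training cases after each round
    model = AdaBoostClassifier(n_estimators=50)
    model.fit(X, y)
    print(model.predict([[2, 2], [5, 5]]))   # typically [0 1] for this well-separated toy data
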
artificial neural network (ANN)
Computer technology that attempts to build computers that operate like a human brain; such machines store and process information simultaneously and can work with ambiguous information.
attrition
The loss of members of a group over time, e.g., students, customers, or staff
axon
An outgoing connection (i.e., terminal) from a biological neuron
backpropagation
The best-known learning algorithm in neural computing; learning is done by comparing computed outputs with the desired outputs of training cases and propagating the resulting error backward through the network to adjust the connection weights
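Example: a from-scratch NumPy sketch of the idea for a tiny one-hidden-layer network trained on the XOR problem (the network size, learning rate, and iteration count are arbitrary choices for illustration):

    import numpy as np

    # Toy training cases: XOR inputs and the desired outputs
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))   # input -> hidden
    W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))   # hidden -> output

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 1.0
    for _ in range(10000):
        # Forward pass: compute the network's outputs
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)

        # Backward pass: compare computed vs. desired outputs and
        # propagate the error back through the layers to adjust the weights
        err_out = (out - y) * out * (1 - out)
        err_hid = (err_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ err_out
        b2 -= lr * err_out.sum(axis=0, keepdims=True)
        W1 -= lr * X.T @ err_hid
        b1 -= lr * err_hid.sum(axis=0, keepdims=True)

    print(np.round(out, 2))   # should move toward [[0], [1], [1], [0]]
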
bagging
The simplest and most common type of ensemble method; it builds multiple prediction models (e.g., decision trees) from bootstrapped/resampled data and combines the predicted values through averaging or voting
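Example: a minimal sketch of the idea using NumPy and scikit-learn decision trees (the data, the 25-tree ensemble size, and the query point are made up for illustration):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    # Toy data: two numeric features, binary class label
    X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5], [7, 8], [8, 7]])
    y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

    rng = np.random.default_rng(0)
    trees = []
    for _ in range(25):
        # Draw a bootstrap sample (sampling with replacement) and fit one tree on it
        idx = rng.integers(0, len(X), size=len(X))
        trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

    # Combine the individual predictions by majority voting
    votes = np.array([t.predict([[4, 4]])[0] for t in trees])
    print(np.bincount(votes).argmax())   # class predicted by the majority of the trees
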
Bayesian network (BN)
A powerful tool for representing dependency structure among variables in a graphical, explicit, and intuitive way
Bayes theorem
A mathematical formula for determining conditional probabilities; it expresses P(A|B) in terms of P(B|A), P(A), and P(B)
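Example: a worked numeric illustration in Python; the scenario and all probabilities below are hypothetical:

    # Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
    # Hypothetical example: A = "has condition", B = "test is positive"
    p_A = 0.01            # prior probability of the condition
    p_B_given_A = 0.95    # test sensitivity
    p_B_given_notA = 0.05 # false-positive rate

    # Total probability of a positive test
    p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

    # Posterior probability of the condition given a positive test
    p_A_given_B = p_B_given_A * p_A / p_B
    print(round(p_A_given_B, 3))   # about 0.161
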
boosting
An ensemble method in which a series of prediction models is built sequentially, each new model concentrating on improving the prediction of the cases/samples that the previous models predicted incorrectly
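Example: a minimal sketch of one common flavor of the idea (gradient-boosting style, where each new model is fit to the residual errors left by the models built so far); it assumes scikit-learn, and the data, the 20-round loop, and the 0.5 step size are arbitrary illustration choices:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    # Toy regression data
    X = np.linspace(0, 10, 50).reshape(-1, 1)
    y = np.sin(X).ravel()

    prediction = np.zeros_like(y)
    models = []
    for _ in range(20):
        residual = y - prediction                       # what the ensemble still gets wrong
        stump = DecisionTreeRegressor(max_depth=1).fit(X, residual)
        models.append(stump)
        prediction += 0.5 * stump.predict(X)            # small step toward correcting the errors

    print(np.round(np.mean((y - prediction) ** 2), 4))  # training error shrinks as models are added
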
conditional probability
The probability of event A given that event B is known to have occurred.
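Example: a small Python illustration that estimates a conditional probability from counts; the records are invented:

    # Estimating P(A | B) from observed counts: P(A|B) = count(A and B) / count(B)
    # Hypothetical records: (clicked_ad, made_purchase)
    records = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 1), (0, 0), (1, 0)]

    b_count = sum(1 for clicked, _ in records if clicked == 1)
    ab_count = sum(1 for clicked, bought in records if clicked == 1 and bought == 1)

    # Probability of a purchase given that the ad was clicked
    print(ab_count / b_count)   # 3 / 5 = 0.6
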
cross-validation
An accuracy-assessment approach that randomly splits the data into multiple mutually exclusive groups (folds), uses some of the groups for training and the remaining group(s) for testing/validation, and rotates these roles so that every case is used for both training and testing
dendrites
The part of a biological neuron that provides inputs to the cell
distance metric
A method used to calculate the closeness between pairs of items in most cluster analysis methods
Euclidean distance
The straight-line (shortest-path) distance between two points, computed as the square root of the sum of the squared differences of the corresponding coordinates
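Example, with NumPy (the two points are chosen arbitrarily):

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, 6.0, 3.0])

    # Euclidean distance: square root of the sum of squared coordinate differences
    print(np.sqrt(np.sum((a - b) ** 2)))   # 5.0
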
heterogeneous ensemble
These combine the outcomes of two or more different types of models such as decision trees, artificial neural networks, logistic regression, support vector machines, and others
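Example: a minimal sketch, assuming scikit-learn's VotingClassifier and two illustrative model types (the toy data are made up):

    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier

    # Toy data: two numeric features, binary class label
    X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
    y = [0, 0, 0, 1, 1, 1]

    # Combine two different model types and let them vote on the final class
    ensemble = VotingClassifier(estimators=[
        ("logreg", LogisticRegression()),
        ("tree", DecisionTreeClassifier(max_depth=2)),
    ])
    ensemble.fit(X, y)
    print(ensemble.predict([[2, 2], [5, 5]]))   # class votes from the two models combined
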
hidden layer
The middle layer of an artificial neural network that has three or more layers
Hopfield network
A recurrent neural network architecture, proposed by John Hopfield, in which the neurons are fully interconnected with symmetric weights; it can serve as an associative (content-addressable) memory
hyperplane
A geometric concept commonly used to describe the separation surface between different classes of things within a multidimensional space.
information fusion
(or simply, fusion) A type of heterogeneous model ensemble that combines different types of prediction models using a weighted average, where the weights are determined from the individual models’ predictive accuracies
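Example: a minimal NumPy sketch of accuracy-weighted fusion; all prediction values and accuracies below are hypothetical:

    import numpy as np

    # Hypothetical class-1 probabilities from three different models for four cases
    preds = np.array([[0.9, 0.2, 0.6, 0.4],    # model A
                      [0.8, 0.3, 0.7, 0.5],    # model B
                      [0.7, 0.1, 0.4, 0.6]])   # model C

    # Hypothetical validation accuracies of the three models
    acc = np.array([0.85, 0.80, 0.70])

    # Fuse the predictions with a weighted average, weights proportional to accuracy
    weights = acc / acc.sum()
    fused = weights @ preds
    print(np.round(fused, 3))
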
k-fold cross-validation
A popular accuracy assessment technique for prediction models where the complete data set is randomly split into k mutually exclusive subsets of approximately equal size. The classification model is trained and tested k times. Each time it is trained on all but one fold and then tested on the remaining single fold. The cross-validation estimate of the overall accuracy of a model is calculated by simply averaging the k individual accuracy measures
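Example: a from-scratch sketch of the procedure with NumPy and a scikit-learn decision tree (k = 5 and the synthetic data are illustration choices):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    # Toy data: 20 cases, two features, binary class
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)

    k = 5
    indices = rng.permutation(len(X))      # random split into k mutually exclusive folds
    folds = np.array_split(indices, k)

    accuracies = []
    for i in range(k):
        test_idx = folds[i]                # one fold held out for testing
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = DecisionTreeClassifier().fit(X[train_idx], y[train_idx])
        accuracies.append(np.mean(model.predict(X[test_idx]) == y[test_idx]))

    print(np.mean(accuracies))             # cross-validation estimate of overall accuracy
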
k-nearest neighbor (kNN)
A prediction method for classification as well as regression-type prediction problems where the prediction is made based on the similarity of the new case to its k nearest neighbors
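Example: a from-scratch NumPy sketch for classification (the data, k = 3, and the query points are invented):

    import numpy as np

    # Toy labeled data: two features, binary class
    X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5], [6.0, 6.0], [6.5, 7.0], [7.0, 6.5]])
    y = np.array([0, 0, 0, 1, 1, 1])

    def knn_predict(query, k=3):
        # Distances from the query point to every stored case
        dist = np.sqrt(((X - query) ** 2).sum(axis=1))
        # The classes of the k nearest neighbors decide the prediction by majority vote
        nearest = y[np.argsort(dist)[:k]]
        return np.bincount(nearest).argmax()

    print(knn_predict(np.array([2.0, 2.0])))   # 0
    print(knn_predict(np.array([6.0, 7.0])))   # 1
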
kernel trick
In machine learning, a method for using a linear classifier algorithm to solve a nonlinear problem by mapping the original nonlinear observations onto a higher-dimensional space, where the linear classifier is subsequently used; this makes a linear classification in the new space equivalent to a nonlinear classification in the original space
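Example: a small illustration assuming scikit-learn's SVC. The classes below are separated by a circle, which no straight line can do in the original two-dimensional space, but a nonlinear (RBF) kernel handles it by implicitly working in a higher-dimensional space:

    import numpy as np
    from sklearn.svm import SVC

    # Toy data: points inside a circle are class 0, outside are class 1
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(200, 2))
    y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 0.5).astype(int)

    linear = SVC(kernel="linear").fit(X, y)
    rbf = SVC(kernel="rbf").fit(X, y)
    print(linear.score(X, y), rbf.score(X, y))   # the RBF kernel scores much higher
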
Kohonen’s self-organizing feature map
A type of unsupervised neural network (a self-organizing map, or SOM) that maps high-dimensional input data onto a low-dimensional grid while preserving the topological relationships among the data; commonly used for clustering and visualization
Manhattan distance
The rectilinear (city-block) distance between two points, computed as the sum of the absolute differences of their coordinates (i.e., the lengths of the two legs of a right triangle rather than its hypotenuse)
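Example, with NumPy (same two points as the Euclidean example above, giving 7.0 rather than 5.0):

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, 6.0, 3.0])

    # Manhattan (city-block) distance: sum of absolute coordinate differences
    print(np.sum(np.abs(a - b)))   # 7.0
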