Chapter 5 Flashcards
AdaBoost
A boosting ensemble algorithm (Adaptive Boosting) that builds a sequence of weak learners, increasing the weights of the training cases misclassified by earlier learners so that later learners concentrate on them
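Example: a minimal usage sketch, assuming scikit-learn is available (the toy data and the 50-round setting are invented for illustration):

    from sklearn.ensemble import AdaBoostClassifier

    # Toy data: two numeric features, binary class label
    X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
    y = [0, 0, 0, 1, 1, 1]

    # AdaBoost fits 50 weak learners (shallow decision trees by default),
    # reweighting the misclassified training cases after each round
    model = AdaBoostClassifier(n_estimators=50)
    model.fit(X, y)
    print(model.predict([[2, 2], [5, 5]]))   # typically [0 1] for this well-separated toy data
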
artificial neural network (ANN)
Computer technology that attempts to build computers that operate like a human brain; such machines store and process information simultaneously and can work with ambiguous information.
attrition
The loss of members of a group over time, e.g., students, customers, or staff
axon
An outgoing connection (i.e., terminal) from a biological neuron
backpropagation
The best-known learning algorithm in neural computing; learning is done by comparing computed outputs with the desired outputs of training cases and propagating the resulting error backward through the network to adjust the connection weights
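Example: a from-scratch NumPy sketch of the idea for a tiny one-hidden-layer network trained on the XOR problem (the network size, learning rate, and iteration count are arbitrary choices for illustration):

    import numpy as np

    # Toy training cases: XOR inputs and the desired outputs
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(2, 4)), np.zeros((1, 4))   # input -> hidden
    W2, b2 = rng.normal(size=(4, 1)), np.zeros((1, 1))   # hidden -> output

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    lr = 1.0
    for _ in range(10000):
        # Forward pass: compute the network's outputs
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)

        # Backward pass: compare computed vs. desired outputs and
        # propagate the error back through the layers to adjust the weights
        err_out = (out - y) * out * (1 - out)
        err_hid = (err_out @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ err_out
        b2 -= lr * err_out.sum(axis=0, keepdims=True)
        W1 -= lr * X.T @ err_hid
        b1 -= lr * err_hid.sum(axis=0, keepdims=True)

    print(np.round(out, 2))   # should move toward [[0], [1], [1], [0]]
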
bagging
The simplest and most common type of ensemble method; it builds multiple prediction models (e.g., decision trees) from bootstrapped/resampled data and combines the predicted values through averaging or voting
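Example: a minimal sketch of the idea using NumPy and scikit-learn decision trees (the data, the 25-tree ensemble size, and the query point are made up for illustration):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    # Toy data: two numeric features, binary class label
    X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5], [7, 8], [8, 7]])
    y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

    rng = np.random.default_rng(0)
    trees = []
    for _ in range(25):
        # Draw a bootstrap sample (sampling with replacement) and fit one tree on it
        idx = rng.integers(0, len(X), size=len(X))
        trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

    # Combine the individual predictions by majority voting
    votes = np.array([t.predict([[4, 4]])[0] for t in trees])
    print(np.bincount(votes).argmax())   # class predicted by the majority of the trees
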
Bayesian network (BN)
A powerful tool for representing dependency structure among variables in a graphical, explicit, and intuitive way
Bayes theorem
A mathematical formula for determining conditional probabilities; it expresses P(A|B) in terms of P(B|A), P(A), and P(B)
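Example: a worked numeric illustration in Python; the scenario and all probabilities below are hypothetical:

    # Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
    # Hypothetical example: A = "has condition", B = "test is positive"
    p_A = 0.01            # prior probability of the condition
    p_B_given_A = 0.95    # test sensitivity
    p_B_given_notA = 0.05 # false-positive rate

    # Total probability of a positive test
    p_B = p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

    # Posterior probability of the condition given a positive test
    p_A_given_B = p_B_given_A * p_A / p_B
    print(round(p_A_given_B, 3))   # about 0.161
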
boosting
An ensemble method in which a series of prediction models is built sequentially, each new model concentrating on improving the prediction of the cases/samples that the previous models predicted incorrectly
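Example: a minimal sketch of one common flavor of the idea (gradient-boosting style, where each new model is fit to the residual errors left by the models built so far); it assumes scikit-learn, and the data, the 20-round loop, and the 0.5 step size are arbitrary illustration choices:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    # Toy regression data
    X = np.linspace(0, 10, 50).reshape(-1, 1)
    y = np.sin(X).ravel()

    prediction = np.zeros_like(y)
    models = []
    for _ in range(20):
        residual = y - prediction                       # what the ensemble still gets wrong
        stump = DecisionTreeRegressor(max_depth=1).fit(X, residual)
        models.append(stump)
        prediction += 0.5 * stump.predict(X)            # small step toward correcting the errors

    print(np.round(np.mean((y - prediction) ** 2), 4))  # training error shrinks as models are added
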
conditional probability
The probability of event A given that event B is known to have occurred.
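Example: a small Python illustration that estimates a conditional probability from counts; the records are invented:

    # Estimating P(A | B) from observed counts: P(A|B) = count(A and B) / count(B)
    # Hypothetical records: (clicked_ad, made_purchase)
    records = [(1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (1, 1), (0, 0), (1, 0)]

    b_count = sum(1 for clicked, _ in records if clicked == 1)
    ab_count = sum(1 for clicked, bought in records if clicked == 1 and bought == 1)

    # Probability of a purchase given that the ad was clicked
    print(ab_count / b_count)   # 3 / 5 = 0.6
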
cross-validation
An accuracy-assessment approach that randomly splits the data into multiple mutually exclusive groups (folds), uses some of the groups for training and the remaining group(s) for testing/validation, and rotates these roles so that every case is used for both training and testing
dendrites
The part of a biological neuron that provides inputs to the cell
distance metric
A method used to calculate the closeness between pairs of items in most cluster analysis methods
Euclidean distance
The straight-line (shortest-path) distance between two points, computed as the square root of the sum of the squared differences of the corresponding coordinates
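Example, with NumPy (the two points are chosen arbitrarily):

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, 6.0, 3.0])

    # Euclidean distance: square root of the sum of squared coordinate differences
    print(np.sqrt(np.sum((a - b) ** 2)))   # 5.0
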
heterogeneous ensemble
These combine the outcomes of two or more different types of models such as decision trees, artificial neural networks, logistic regression, support vector machines, and others
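Example: a minimal sketch, assuming scikit-learn's VotingClassifier and two illustrative model types (the toy data are made up):

    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier

    # Toy data: two numeric features, binary class label
    X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
    y = [0, 0, 0, 1, 1, 1]

    # Combine two different model types and let them vote on the final class
    ensemble = VotingClassifier(estimators=[
        ("logreg", LogisticRegression()),
        ("tree", DecisionTreeClassifier(max_depth=2)),
    ])
    ensemble.fit(X, y)
    print(ensemble.predict([[2, 2], [5, 5]]))   # class votes from the two models combined
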
hidden layer
The middle layer of an artificial neural network that has three or more layers
Hopfield network
A recurrent neural network architecture, proposed by John Hopfield, in which the neurons are fully interconnected with symmetric weights; it can serve as an associative (content-addressable) memory
hyperplane
A geometric concept commonly used to describe the separation surface between different classes of things within a multidimensional space.
information fusion
(or simply, fusion) A type of heterogeneous model ensemble that combines different types of prediction models using a weighted average, where the weights are determined from the individual models’ predictive accuracies
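Example: a minimal NumPy sketch of accuracy-weighted fusion; all prediction values and accuracies below are hypothetical:

    import numpy as np

    # Hypothetical class-1 probabilities from three different models for four cases
    preds = np.array([[0.9, 0.2, 0.6, 0.4],    # model A
                      [0.8, 0.3, 0.7, 0.5],    # model B
                      [0.7, 0.1, 0.4, 0.6]])   # model C

    # Hypothetical validation accuracies of the three models
    acc = np.array([0.85, 0.80, 0.70])

    # Fuse the predictions with a weighted average, weights proportional to accuracy
    weights = acc / acc.sum()
    fused = weights @ preds
    print(np.round(fused, 3))
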
k-fold cross-validation
A popular accuracy assessment technique for prediction models where the complete data set is randomly split into k mutually exclusive subsets of approximately equal size. The classification model is trained and tested k times. Each time it is trained on all but one fold and then tested on the remaining single fold. The cross-validation estimate of the overall accuracy of a model is calculated by simply averaging the k individual accuracy measures
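Example: a from-scratch sketch of the procedure with NumPy and a scikit-learn decision tree (k = 5 and the synthetic data are illustration choices):

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    # Toy data: 20 cases, two features, binary class
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)

    k = 5
    indices = rng.permutation(len(X))      # random split into k mutually exclusive folds
    folds = np.array_split(indices, k)

    accuracies = []
    for i in range(k):
        test_idx = folds[i]                # one fold held out for testing
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = DecisionTreeClassifier().fit(X[train_idx], y[train_idx])
        accuracies.append(np.mean(model.predict(X[test_idx]) == y[test_idx]))

    print(np.mean(accuracies))             # cross-validation estimate of overall accuracy
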
k-nearest neighbor (kNN)
A prediction method for classification as well as regression-type prediction problems where the prediction is made based on the similarity of the new case to its k nearest neighbors
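Example: a from-scratch NumPy sketch for classification (the data, k = 3, and the query points are invented):

    import numpy as np

    # Toy labeled data: two features, binary class
    X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5], [6.0, 6.0], [6.5, 7.0], [7.0, 6.5]])
    y = np.array([0, 0, 0, 1, 1, 1])

    def knn_predict(query, k=3):
        # Distances from the query point to every stored case
        dist = np.sqrt(((X - query) ** 2).sum(axis=1))
        # The classes of the k nearest neighbors decide the prediction by majority vote
        nearest = y[np.argsort(dist)[:k]]
        return np.bincount(nearest).argmax()

    print(knn_predict(np.array([2.0, 2.0])))   # 0
    print(knn_predict(np.array([6.0, 7.0])))   # 1
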
kernel trick
In machine learning, a method for using a linear classifier algorithm to solve a nonlinear problem by mapping the original nonlinear observations onto a higher-dimensional space, where the linear classifier is subsequently used; this makes a linear classification in the new space equivalent to a nonlinear classification in the original space
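Example: a small illustration assuming scikit-learn's SVC. The classes below are separated by a circle, which no straight line can do in the original two-dimensional space, but a nonlinear (RBF) kernel handles it by implicitly working in a higher-dimensional space:

    import numpy as np
    from sklearn.svm import SVC

    # Toy data: points inside a circle are class 0, outside are class 1
    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, size=(200, 2))
    y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 0.5).astype(int)

    linear = SVC(kernel="linear").fit(X, y)
    rbf = SVC(kernel="rbf").fit(X, y)
    print(linear.score(X, y), rbf.score(X, y))   # the RBF kernel scores much higher
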
Kohonen’s self-organizing feature map
A type of unsupervised neural network (a self-organizing map, or SOM) that maps high-dimensional input data onto a low-dimensional grid while preserving the topological relationships among the data; commonly used for clustering and visualization
Manhattan distance
The rectilinear (city-block) distance between two points, computed as the sum of the absolute differences of their coordinates (i.e., the lengths of the two legs of a right triangle rather than its hypotenuse)
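Example, with NumPy (same two points as the Euclidean example above, giving 7.0 rather than 5.0):

    import numpy as np

    a = np.array([1.0, 2.0, 3.0])
    b = np.array([4.0, 6.0, 3.0])

    # Manhattan (city-block) distance: sum of absolute coordinate differences
    print(np.sum(np.abs(a - b)))   # 7.0
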