Booz Terms Flashcards

1
Q

Active Learning

A

Intelligent sample selection to improve the performance of a model. Samples are chosen for labeling so that each one provides the greatest information to the learning model.

2
Q

Agent Based Simulation

A

Simulates the actions and interactions of autonomous agents.

3
Q

ANOVA

A

Hypothesis test for differences in means among more than two groups.
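
As a rough illustration of the card above, the sketch below computes the F statistic of a one-way ANOVA in plain Python. The function name `one_way_anova_f` and the sample groups are made up for this example; turning F into a p-value would need an F distribution, which is not shown.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # illustrative data only
f_stat = one_way_anova_f(groups)
```

A large F suggests that at least one group mean differs from the others.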

4
Q

Association Rule Mining (Apriori)

A

Data mining technique to identify common co-occurrences of items.
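
A minimal sketch of the co-occurrence idea, assuming shopping-basket style data: it counts item pairs and keeps those whose support meets a threshold. This is only the pair-counting step, not the full Apriori algorithm with candidate pruning; `frequent_pairs` and the baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support):
    """Return {pair: support} for item pairs appearing in enough transactions."""
    counts = Counter()
    for basket in transactions:
        # Sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}

transactions = [  # illustrative data only
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]
pairs = frequent_pairs(transactions, min_support=0.5)
```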

5
Q

Bayesian Network

A

Models conditional probabilities amongst elements, visualized as a Directed Acyclic Graph.

6
Q

Collaborative Filtering

A

Also known as ‘Recommendation.’ Suggests or eliminates items from a set by comparing users’ histories of actions on items. Finds similar items based on who used them, or similar users based on the items they use.
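
One way to picture "similar items based on who used them" is cosine similarity between item rating vectors. The sketch below assumes a tiny hand-made ratings table; the user names, items, and helper functions are all hypothetical.

```python
import math

ratings = {  # user -> {item: rating}, illustrative data only
    "ann": {"A": 5, "B": 4},
    "bob": {"A": 4, "B": 5, "C": 1},
    "cat": {"C": 5},
}

def item_vector(item):
    """Ratings for one item, keyed by the users who rated it."""
    return {user: r[item] for user, r in ratings.items() if item in r}

def cosine(v, w):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(v) & set(w)
    dot = sum(v[u] * w[u] for u in shared)
    norm = math.sqrt(sum(x * x for x in v.values())) * math.sqrt(sum(x * x for x in w.values()))
    return dot / norm if norm else 0.0

sim_ab = cosine(item_vector("A"), item_vector("B"))
sim_ac = cosine(item_vector("A"), item_vector("C"))
```

Here items A and B come out far more similar than A and C, because the same users rated A and B highly.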

7
Q

Coordinate Transformation

A

Re-expresses data in a different coordinate system (for example, Cartesian to polar) to provide a different perspective on it.

8
Q

Deep Learning

A

Method that learns features that lead to higher-level concept learning. Usually uses very deep neural network architectures.

9
Q

Design of Experiments

A

Applies controlled experiments to quantify effects on system output caused by changes to inputs.

10
Q

Differential Equations

A

Used to express relationships between functions and their derivatives, for example, change over time.
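
As a small worked example, the sketch below integrates the decay equation dy/dt = -0.5·y with the forward Euler method and compares it to the exact solution. The equation, step count, and the `euler` helper are chosen purely for illustration.

```python
import math

def euler(f, y0, t0, t1, steps):
    """Integrate dy/dt = f(t, y) from t0 to t1 with fixed-step forward Euler."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)  # follow the slope for one small step
        t += h
    return y

approx = euler(lambda t, y: -0.5 * y, y0=1.0, t0=0.0, t1=2.0, steps=10000)
exact = math.exp(-1.0)  # analytic solution y(t) = exp(-0.5 * t) at t = 2
```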

11
Q

Discrete Event Simulation

A

Simulates a discrete sequence of events where each event occurs at a particular instant in time. The model updates its state only at points in time when events occur.
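
The "state updates only when events occur" idea can be sketched with a priority queue of timestamped events. The toy rule below (every arrival schedules a departure 2.0 time units later) and the `run` function are assumptions made for this example.

```python
import heapq

def run(events):
    """Process (time, name) events in time order and return the event log."""
    queue = list(events)
    heapq.heapify(queue)
    clock = 0.0
    log = []
    while queue:
        time, name = heapq.heappop(queue)
        clock = time  # the clock jumps straight to the next event
        log.append((clock, name))
        if name == "arrive":
            # Toy rule: each arrival schedules a departure 2.0 units later.
            heapq.heappush(queue, (clock + 2.0, "depart"))
    return log

log = run([(1.0, "arrive"), (4.0, "arrive")])
```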

12
Q

Discrete Wavelet Transform

A

Transforms time series data into the frequency domain while preserving locality (time) information.

13
Q

Ensemble Learning

A

Learning multiple models and combining output to achieve better performance.
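
Majority voting is one simple way to combine model outputs. In the sketch below the "models" are three hypothetical threshold rules standing in for trained classifiers.

```python
from collections import Counter

def majority_vote(models, x):
    """Return the prediction that most models agree on for input x."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]

# Three weak, hand-written classifiers (illustrative only).
models = [lambda x: x > 2, lambda x: x > 4, lambda x: x > 6]

vote_high = majority_vote(models, 5)  # two of three say True
vote_low = majority_vote(models, 3)   # two of three say False
```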

14
Q

Expert Systems

A

Systems that use symbolic logic to reason about facts, emulating human reasoning.

15
Q

Exponential Smoothing

A

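Smooths a series with a weighted average in which older observations receive exponentially smaller weights. Used to remove artifacts expected from collection error or outliers.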
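
A minimal sketch of simple exponential smoothing, assuming the common recurrence s_t = alpha·x_t + (1 - alpha)·s_(t-1) seeded with the first observation. The series and the choice alpha = 0.5 are illustrative.

```python
def exponential_smoothing(series, alpha):
    """Return the exponentially smoothed series for 0 < alpha <= 1."""
    smoothed = [series[0]]  # seed with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

# The spike of 50 is damped and then decays over the following points.
smoothed = exponential_smoothing([10, 10, 50, 10, 10], alpha=0.5)
```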

16
Q

Factor Analysis

A

Describes variability among observed, correlated variables in terms of a lower number of unobserved variables, namely, the factors.

17
Q

Fast Fourier Transform

A

Transforms time series from time to frequency domain efficiently. Can also be used for image improvement by spatial transforms.
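
For intuition, here is a textbook recursive radix-2 FFT in plain Python, assuming the input length is a power of two. Production code would normally call a library routine; the test signal (one cosine cycle over 8 samples) is made up for this example.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

signal = [math.cos(2 * math.pi * k / 8) for k in range(8)]
spectrum = fft(signal)
magnitudes = [abs(c) for c in spectrum]
```

The energy lands in bins 1 and 7, which correspond to the single frequency present in the signal.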

18
Q

Format Conversion

A

Creates a standard representation of data regardless of source format. For example, extracting raw UTF-8 encoded text from binary file formats such as Microsoft Word or PDFs.

19
Q

Fuzzy Logic

A

Logical reasoning that allows for degrees of truth for a statement.

20
Q

Gaussian Filtering

A

Acts to remove noise or blur data.
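
A one-dimensional sketch of Gaussian filtering: each output value is a weighted average of its neighbours, with weights taken from a normalized Gaussian kernel. The kernel radius, sigma, and edge handling (clamping to the nearest valid sample) are assumptions for this example.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_filter(data, sigma=1.0, radius=2):
    """Blur a 1-D sequence with a Gaussian kernel, clamping at the edges."""
    kernel = gaussian_kernel(sigma, radius)
    n = len(data)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i + j - radius, 0), n - 1)  # clamp to valid range
            acc += w * data[idx]
        out.append(acc)
    return out

noisy = [0, 0, 0, 10, 0, 0, 0]  # a single spike, illustrative only
blurred = gaussian_filter(noisy)
```

The spike is spread over its neighbours: the peak is lower, but the total is preserved.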

21
Q

Generalized Linear Models

A

Extends ordinary linear regression to allow for error distributions that are not normal.

22
Q

Genetic Algorithms

A

Evolves candidate models over generations using evolution-inspired operators, such as mutation and crossover of parameters.

23
Q

Grid Search

A

Systematic search across discrete parameter values for parameter exploration problems.
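
A minimal sketch of a grid search, assuming the goal is to minimize an objective over every combination of listed parameter values. The `grid_search` helper, the objective, and the grid are hypothetical.

```python
from itertools import product

def grid_search(objective, grid):
    """Evaluate objective on every parameter combination; return the best."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

def objective(a, b):
    # Toy objective with its minimum at a=2, b=-1.
    return (a - 2) ** 2 + (b + 1) ** 2

best_params, best_score = grid_search(objective, {"a": [0, 1, 2, 3], "b": [-2, -1, 0]})
```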

24
Q

Hidden Markov Models

A

Models sequential data by inferring discrete latent (hidden) states; the observables may be continuous or discrete.

25
Q

Hierarchical Clustering

A

Connectivity based clustering approach that sequentially builds bigger (agglomerative) or smaller (divisive) clusters in the data.

26
Q

K-means and X-means Clustering

A

Centroid-based clustering algorithms. With K-means the number of clusters is set in advance; with X-means the number of clusters is unknown and is estimated by the algorithm.
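
The K-means half of this card can be sketched in one dimension: assign each point to its nearest centroid, then move each centroid to the mean of its points, and repeat. The data, the starting centroids, and the fixed iteration count are assumptions; X-means is not shown.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Run a fixed number of K-means (Lloyd) iterations on 1-D data."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]  # illustrative data only
centroids = kmeans_1d(points, centroids=[0.0, 5.0])
```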

27
Q

Linear, Non-linear, and Integer Programming

A

Set of techniques for minimizing or maximizing a function over a constrained set of input parameters.

28
Q

Markov Chain Monte Carlo (MCMC)

A

A method of sampling typically used in Bayesian models to estimate the joint distribution of parameters given the data.

29
Q

Monte Carlo Methods

A

Set of computational techniques that use repeated random sampling to estimate numerical results.
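
A classic small Monte Carlo example is estimating π by drawing random points in the unit square and counting how many fall inside the quarter circle. The sample size and the fixed seed below are arbitrary choices for reproducibility.

```python
import random

def estimate_pi(samples, seed=0):
    """Estimate pi from the fraction of random points inside a quarter circle."""
    rng = random.Random(seed)
    inside = sum(
        1 for _ in range(samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4 * inside / samples

pi_estimate = estimate_pi(100_000)
```

The estimate gets closer to π as the number of samples grows, with error shrinking roughly as one over the square root of the sample count.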

30
Q

Naive Bayes

A

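Predicts classes using Bayes’ theorem, which states that the probability of an outcome given a set of features is based on the probability of the features given the outcome. The "naive" part is the assumption that features are independent of one another given the outcome.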

31
Q

Neural Networks

A

Learns salient features in data by adjusting weights between nodes through a learning rule.

32
Q

Outlier Removal

A

Method for identifying and removing noise or artifacts from data.

33
Q

Principal Components Analysis

A

Enables dimensionality reduction by identifying highly correlated dimensions.

34
Q

Random Search

A

Randomly adjusts parameters to find a better solution than the best one found so far.

35
Q

Regression with Shrinkage (Lasso)

A

A method of variable selection and prediction combined into a possibly biased linear model.

36
Q

Sensitivity Analysis

A

Involves testing individual parameters in an analytic or model and observing the magnitude of the effect.

37
Q

Simulated Annealing

A

Optimization method named after a controlled cooling process in metallurgy. By analogy, it uses a changing temperature, or annealing schedule, to vary algorithmic convergence.
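
A minimal sketch of the idea, assuming a geometric cooling schedule and a one-dimensional objective: worse moves are accepted with a probability that shrinks as the temperature falls. All parameter values and the objective are illustrative.

```python
import math
import random

def simulated_annealing(f, x0, temp=10.0, cooling=0.99, steps=5000, seed=0):
    """Minimize f starting from x0; return the best (x, f(x)) seen."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for _ in range(steps):
        candidate = x + rng.uniform(-1.0, 1.0)
        fc = f(candidate)
        # Always accept improvements; accept worse moves with a
        # probability that drops as the temperature cools.
        if fc < fx or rng.random() < math.exp((fx - fc) / temp):
            x, fx = candidate, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
        temp *= cooling  # annealing schedule
    return best_x, best_fx

best_x, best_fx = simulated_annealing(lambda x: (x - 3) ** 2, x0=-10.0)
```

Early on, the high temperature lets the search wander; later it behaves almost like a greedy local search.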

38
Q

Stepwise Regression

A

A method of variable selection and prediction. Akaike’s information criterion (AIC) is used as the metric for selection. The resulting predictive model is based upon ordinary least squares, or a general linear model with parameter estimation via maximum likelihood.

39
Q

Stochastic Gradient Descent

A

General-purpose optimization for learning of neural networks, support vector machines, and logistic regression models.
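
To make the idea concrete, the sketch below fits a line y = w·x + b by stochastic gradient descent on squared error, updating the parameters after every single sample. The learning rate, epoch count, and noise-free data are assumptions chosen so the example converges cleanly.

```python
import random

def sgd_linear(data, lr=0.01, epochs=200, seed=0):
    """Fit y = w*x + b with per-sample gradient updates on squared error."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit samples in a new random order each epoch
        for x, y in data:
            error = (w * x + b) - y
            w -= lr * error * x  # gradient of 0.5 * error**2 w.r.t. w
            b -= lr * error      # gradient of 0.5 * error**2 w.r.t. b
    return w, b

data = [(x, 2.0 * x + 1.0) for x in range(-5, 6)]  # points on y = 2x + 1
w, b = sgd_linear(data)
```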

40
Q

Support Vector Machines

A

Projection of feature vectors using a kernel function into a space where classes are more separable.

41
Q

Term Frequency / Inverse Document Frequency

A

A statistic that measures the relative importance of a term to a document within a corpus.
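
A small sketch of one common TF-IDF variant: term frequency within the document multiplied by the log of the inverse document frequency. Other weighting and smoothing schemes exist; the corpus and the `tf_idf` helper are made up for this example.

```python
import math

def tf_idf(term, doc, corpus):
    """TF-IDF of term in doc, where doc and corpus entries are token lists."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

corpus = [  # illustrative data only
    "the cat sat".split(),
    "the dog sat".split(),
    "the cat ran".split(),
]
score_the = tf_idf("the", corpus[0], corpus)  # appears in every document
score_dog = tf_idf("dog", corpus[1], corpus)  # appears in one document
```

A word found in every document scores zero, while a word unique to one document scores highest.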

42
Q

Topic Modeling (Latent Dirichlet Allocation)

A

Identifies latent topics in text by examining word co-occurrence.

43
Q

Tree Based Methods

A

Models structured as graph trees where branches indicate decisions.

44
Q

T-Test

A

Hypothesis test for a difference in means between two groups.
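
The card does not say which t-test variant is meant; as an assumption, the sketch below computes Welch's t statistic for two independent samples, which does not require equal variances. The data are illustrative, and converting t to a p-value is not shown.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

group_a = [5.1, 4.9, 5.0, 5.2, 4.8]
group_b = [5.6, 5.4, 5.5, 5.7, 5.3]
t_stat = welch_t(group_a, group_b)
```

A t statistic far from zero indicates that the difference in means is large relative to the sampling noise.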

45
Q

Wrapper Methods

A

Feature set reduction method that uses the performance of a model trained on a set of features as a measure of that feature set’s quality. Can help identify combinations of features that achieve high model performance.