Exam 1 Flashcards
Data science
An area of investigation that includes AI and its components, as well as statistical analysis and data analysis
Machine learning
Element of AI that allows a computer to learn from data
Artificial intelligence
Computer implementation of human intelligence
Artificial neural network
Computer analogue of a biological neural network
Supervised Learning
Type of ML that finds a model based on a dataset where the values (targets) are known, in order to predict those values
Unsupervised learning
Type of ML to find a model based on a dataset to determine natural classifications without guidance from known classifications
Classifier
Model that assigns observations to known classes
Clustering
Find natural classifications for a dataset without guidance
Regression
Predict a value based on a fit to trends in the dataset
Association
Identify patterns of association between variables or items
A learning rule in machine learning is
What ML algorithms use to learn
A decision boundary is
A point, line, plane, or hyperplane separating different classes
Gradient descent
Updates the solution by stepping in the direction of the negative gradient
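The update rule above can be sketched in a few lines. This is a minimal illustration, not a course implementation; the test function f(x) = (x - 3)^2 and its gradient 2(x - 3) are made up for the example.

```python
# Minimal sketch of gradient descent on f(x) = (x - 3)^2,
# whose gradient is f'(x) = 2 * (x - 3).

def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step in the direction of the negative gradient."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

minimum = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
# converges toward the true minimum at x = 3
```

Note the fixed learning rate: steepest descent (next card) instead picks the step size at each iteration.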
Steepest descent
Chooses the best learning rate in each step to minimize the number of iterations
Choice of optimization
Trade-off between the number of iterations and the speed of each iteration, seeking a reliable result in the shortest time with the smallest resources (memory)
The gradient gives
A good direction but not a good distance to find the minimum
The problem of diminishing gradients can be handled by
Normalizing the gradient by dividing by its L2 norm
L1 Norm of a vector
Sum or average of the absolute values of the vector elements
L2 Norm of a vector
Square root of the sum or average of the squared vector elements
L-infinity Norm of a vector
The maximum absolute value among the vector elements
L0 Norm of a vector
Number of non-zero elements in the vector
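The four norm definitions above can be checked with a short sketch; the example vector is made up for illustration.

```python
# Compute the L1, L2, L-infinity, and L0 norms of an example vector.
v = [3.0, -4.0, 0.0]

l1 = sum(abs(x) for x in v)         # sum of absolute values -> 7.0
l2 = sum(x * x for x in v) ** 0.5   # sqrt of sum of squares -> 5.0
linf = max(abs(x) for x in v)       # largest absolute value -> 4.0
l0 = sum(1 for x in v if x != 0)    # count of non-zero elements -> 2
```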
Feature
Defining characteristics of a given dataset that allow for optimal learning
Observation
A sample of the system that may contain several measurements
Reason to do feature scaling
Tends to make the search for the minimum more direct
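One common scaling scheme is standardization (z-score scaling); a minimal sketch on made-up values:

```python
# Sketch of standardization: subtract the mean and divide by the
# standard deviation so each feature has comparable range, which tends
# to make gradient-based searches for the minimum more direct.

def standardize(values):
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    return [(x - mean) / var ** 0.5 for x in values]

scaled = standardize([10.0, 20.0, 30.0])
# result is centered at 0 with unit standard deviation
```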
K fold cross validation
Randomly divide the data into k subsets (folds) and train k times, each time holding out a different fold as test data
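A minimal sketch of the splitting step (the helper name and fold layout are illustrative assumptions, not a prescribed implementation):

```python
# Shuffle indices once, cut them into k folds, and yield
# (train, test) index lists so each fold serves as test data once.
import random

def k_fold_indices(n, k, seed=0):
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

splits = list(k_fold_indices(10, 5))  # 5 splits; every index tested once
```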
Overfitting
Fitting noise during training, which overestimates how the model will perform on test data
Underfitting
Training does not capture the desired prediction
Accuracy
Number of correct predictions as a proportion of total observations
AUC
Area under the curve of true positive rate vs. false positive rate (the ROC curve)
True positive rate
TP/(TP + FN)
False positive rate
FP/(FP + TN)
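The accuracy, TPR, and FPR formulas above, computed on hypothetical confusion-matrix counts (the numbers are made up):

```python
# TP/FP/TN/FN are hypothetical confusion-matrix counts.
TP, FP, TN, FN = 40, 10, 45, 5

accuracy = (TP + TN) / (TP + FP + TN + FN)  # correct / total
tpr = TP / (TP + FN)                        # true positive rate (recall)
fpr = FP / (FP + TN)                        # false positive rate
```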
Linear regression
Predicts value from data trends
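For one predictor, the least-squares fit has a closed form; a minimal sketch on made-up points that lie exactly on y = 2x + 1:

```python
# Simple least-squares linear regression y = slope * x + intercept.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

a, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # data drawn from y = 2x + 1
```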
Logistic regression
Discriminative classifier that models class probability with the logistic (sigmoid) function applied to a linear combination of predictors
SVM
Margin perceptron with regularizer including linear or nonlinear transform
Decision tree
Based on splitting observations on feature value thresholds, a weak learner
Random forest
Ensemble learning method combining decision trees trained on randomly chosen subsets of the data
K nearest neighbors
Classifies an observation by the majority class among its k nearest training points; a lazy learner
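A minimal one-dimensional sketch of the majority-vote rule (training points and labels are made up):

```python
# k-nearest-neighbors: predict the majority class of the k closest
# training points. A "lazy learner": no model is fit in advance.
from collections import Counter

def knn_predict(train, query, k=3):
    """train is a list of (value, label) pairs."""
    nearest = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [(1.0, "a"), (1.1, "a"), (1.2, "a"), (8.0, "b"), (8.2, "b")]
label = knn_predict(train, 1.05)  # nearest three are all "a"
```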
K means
Unsupervised learning dividing observations into k clusters around centroids
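A minimal one-dimensional sketch of the two alternating steps (assign points to the nearest centroid, then move each centroid to the mean of its points); the data and initial centroids are made up:

```python
# k-means: alternate assignment and centroid-update steps.

def k_means_1d(points, centroids, iterations=10):
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

centers = k_means_1d([1.0, 1.2, 0.8, 9.0, 9.5, 8.5], centroids=[0.0, 5.0])
# centroids settle near the two natural groups, around 1 and 9
```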
DBSCAN
Clustering based on density of points in a classification
Two types of error in ML are
Bias and variance
An example of bias would be
An assumption about filtering or collection of data
An example of variance would be
Fitting nuances in the data that may be noise
A way to reduce variance would be
Filter or smooth the predictors
A way to reduce bias would be
Remove filters and use raw data
We can generally reduce overfitting by
Increasing the number of independent observations