Exam 1 Flashcards

1
Q

Data science

A

An area of investigation that includes AI and its components, as well as statistical analysis and data analysis

2
Q

Machine learning

A

The component of AI that allows a computer to learn from data

3
Q

Artificial intelligence

A

Computer implementation of human intelligence

4
Q

Artificial neural network

A

Computer model inspired by biological neural networks

5
Q

Supervised Learning

A

Type of ML that fits a model to a dataset whose target values are known, in order to predict those values for new data

6
Q

Unsupervised learning

A

Type of ML that fits a model to a dataset to discover natural groupings, without guidance from known classifications

7
Q

Classifier

A

A model that assigns an observation to one of a set of known classes

8
Q

Clustering

A

Find natural classifications for a dataset without guidance

9
Q

Regression

A

Predict a value based on a fit to trends in the dataset

10
Q

Association

A

Identify patterns of association between variables or items

11
Q

A learning rule in machine learning is

A

The rule an ML algorithm uses to update its model during training

12
Q

A decision boundary is

A

A point, line, plane, or hyperplane separating different classes

13
Q

Gradient descent

A

Updates the solution in the direction of the negative gradient

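As a concrete illustration (a minimal sketch, not from the cards), gradient descent on f(x) = x² with a hypothetical learning rate of 0.1:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step in the direction of the negative gradient."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # move against the gradient
    return x

# Minimize f(x) = x^2 (gradient 2x); the minimum is at x = 0.
x_min = gradient_descent(lambda x: 2 * x, x0=5.0)  # approaches 0
```
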
14
Q

Steepest descent

A

Chooses the best learning rate in each step to minimize the number of iterations

15
Q

Choice of optimization

A

A trade-off between the number of iterations and the cost of each iteration, aiming for a reliable result in the shortest time with the smallest resources (e.g., memory)

16
Q

The gradient gives

A

A good direction but not a good distance to find the minimum

17
Q

The problem of diminishing gradients can be handled by

A

Normalizing the gradient by dividing by its L2 norm

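A sketch of this normalization trick: dividing the gradient by its L2 norm makes every step have length `lr`, even when the gradient itself is tiny (the function names here are illustrative):

```python
import math

def normalized_step(x, grad, lr=0.1):
    """Take a gradient step of fixed length lr by normalizing the gradient."""
    g = grad(x)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:  # at a stationary point; nothing to do
        return x
    return [xi - lr * gi / norm for xi, gi in zip(x, g)]

# Even a vanishingly small gradient still produces a step of length 0.1.
x_new = normalized_step([3.0, 4.0], lambda x: [1e-8, 0.0])
```
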
18
Q

L1 Norm of a vector

A

Sum or average of the absolute values of the vector elements

19
Q

L2 Norm of a vector

A

Square root of the sum (or average) of the squares of the vector elements

20
Q

L-infinity Norm of a vector

A

The maximum absolute value among the vector elements

21
Q

L0 Norm of a vector

A

Number of nonzero elements in the vector
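The four norms in cards 18-21 can be computed directly; a minimal sketch:

```python
def l1(v):   return sum(abs(x) for x in v)        # sum of absolute values
def l2(v):   return sum(x * x for x in v) ** 0.5  # sqrt of sum of squares
def linf(v): return max(abs(x) for x in v)        # largest absolute value
def l0(v):   return sum(1 for x in v if x != 0)   # count of nonzero elements

v = [3.0, -4.0, 0.0]
# l1(v) = 7.0, l2(v) = 5.0, linf(v) = 4.0, l0(v) = 2
```
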

22
Q

Feature

A

Defining characteristics of a given dataset that allow for optimal learning

23
Q

Observation

A

A sample of the system that may contain several measurements

24
Q

Reason to do feature scaling

A

Tends to make the search for the minimum more direct

25
Q

K fold cross validation

A

Randomly divide the data into K subsets (folds); train on K-1 folds and validate on the remaining fold, repeating so every fold is used for validation once
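A minimal sketch of generating K-fold train/validation index splits (pure Python, illustrative only):

```python
import random

def k_fold_splits(n, k, seed=0):
    """Shuffle indices 0..n-1 and yield (train, validation) lists, one per fold."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal folds
    for i in range(k):
        train = [j for m in range(k) if m != i for j in folds[m]]
        yield train, folds[i]              # each fold validates exactly once
```
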

26
Q

Overfitting

A

Fitting noise during training, which overestimates how the model will perform on test data

27
Q

Underfitting

A

The trained model is too simple to capture the trend needed for the desired prediction

28
Q

Accuracy

A

Number of correct predictions as a proportion of total observations

29
Q

AUC

A

Area under the ROC curve (true positive rate vs. false positive rate)

30
Q

True positive rate

A

TP/(TP + FN)

31
Q

False positive rate

A

FP/(FP + TN)
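These two rates (and accuracy, card 28) follow directly from the confusion-matrix counts; a sketch with made-up labels:

```python
def rates(y_true, y_pred):
    """True and false positive rates from binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), fp / (fp + tn)

# 3 actual positives (2 detected), 2 actual negatives (1 false alarm)
tpr, fpr = rates([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])  # (2/3, 1/2)
```
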

32
Q

Linear regression

A

Predicts a continuous value by fitting a linear trend to the data
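A minimal closed-form sketch of simple linear regression (ordinary least squares for y = a·x + b; the data points are made up):

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    b = my - a * mx
    return a, b

# Points lying exactly on the line y = 2x + 1
a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])  # (2.0, 1.0)
```
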

33
Q

Logistic regression

A

Discriminative classifier that models class probabilities with the logistic (sigmoid) function; assumes independent predictors

34
Q

SVM

A

Maximum-margin classifier with a regularizer; can use a linear or nonlinear (kernel) transform

35
Q

Decision tree

A

Splits observations at feature-value thresholds; a weak learner

36
Q

Random forest

A

Ensemble learning method combining decision trees trained on randomly sampled subsets of the data

37
Q

K nearest neighbors

A

Classifies an observation by majority vote of the K nearest training observations; a lazy learner

38
Q

K means

A

Unsupervised learning dividing observations into clusters
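A minimal 1-D k-means sketch, alternating assignment and mean-update steps (the data and initial centers are made up):

```python
def k_means_1d(points, centers, iters=20):
    """Alternate: assign each point to its nearest center, then recompute means."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # New center = mean of its cluster (keep old center if cluster is empty)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Two obvious clusters near 1 and 10
centers = k_means_1d([0.9, 1.0, 1.1, 9.8, 10.0, 10.2], [0.0, 5.0])
```
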

39
Q

DBSCAN

A

Clustering based on density of points in a classification

40
Q

Two types of error in ML are

A

Bias and variance

41
Q

An example of bias would be

A

An assumption about filtering or collection of data

42
Q

An example of variance would be

A

Fitting nuances in the data that may be noise

43
Q

A way to reduce variance would be

A

Filter or smooth the predictors

44
Q

A way to reduce bias would be

A

Remove filters and use raw data

45
Q

We can generally reduce overfitting by

A

Increasing the number of independent observations