L3 Flashcards
Machine Learning
Branch of AI and CS that focuses on the use of data and algorithms to imitate the way humans learn, gradually improving accuracy
Supervised ML
- use of labelled datasets to train algorithms which classify data or predict outcomes
- classification or regression
Unsupervised ML
- uses unlabeled data, so there is no supervision through a labelled training set
- the model finds hidden patterns and insights by itself
- clustering or association (rules)
Reinforcement ML
- simulates an agent that perceives and interprets its environment, takes actions and learns through trial and error
- aims to maximise the cumulative reward in an environment where each action yields a reward or penalty
ML Workflow (7 steps)
- gather data
- prepare data
- split into train, validation and test sets (split sketch below)
- train model
- test and validate model
- deploy model
- iteration
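A minimal sketch of the splitting step, assuming scikit-learn's train_test_split and a made-up 70/15/15 ratio (neither is specified in the card):

```python
# Split a dataset into train / validation / test sets (hypothetical 70/15/15 ratio).
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(100, 4)              # toy feature matrix
y = np.random.randint(0, 2, size=100)   # toy binary labels

# First carve off the test set, then split the rest into train and validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.15, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.15 / 0.85, random_state=0)

print(len(X_train), len(X_val), len(X_test))  # roughly 70 / 15 / 15
```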
KNN
what is it
pro and con
practical things
- classifies an object based on the closest training examples in the feature space -> nearest neighbours
- k is the number of training examples closest to the query
- distances between the query point and all other points are computed, the k nearest points are selected, and the most frequent label is voted on (classification) or the values are averaged (regression) (sketch below)
Pro: simple and usable for both regression and classification, achieves high accuracy in a wide range of prediction problems
Con: becomes slow as the size of the data grows and needs high computing power - can be improved with preprocessing (e.g. decision trees for feature selection, PCA for dimensionality reduction)
- being non-parametric and instance-based, it is most useful when little is known about the data distribution (it still needs labeled training examples)
eg) handwriting detection, image/video recognition, stock prediction
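A from-scratch sketch of the KNN procedure described above (compute distances, take the k nearest, vote); the toy data and k=3 are assumptions:

```python
# K-nearest-neighbours classification: distances, k nearest, majority vote.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    # Euclidean distance from the query point to every training point
    dists = np.linalg.norm(X_train - query, axis=1)
    # Indices of the k closest training examples
    nearest = np.argsort(dists)[:k]
    # Most frequent label among the neighbours (for regression: average instead)
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy example (hypothetical data)
X_train = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]])
y_train = np.array(["a", "a", "a", "b", "b", "b"])
print(knn_predict(X_train, y_train, np.array([4.5, 5.0])))  # -> "b"
```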
Decision trees
what is it
how does it work
pros and cons
- supervised learning, can be used for classification and regression but is usually used for binary classification problems
- tree-structured classifier: internal nodes are features of the dataset, branches are decision rules and leaf nodes are the outcomes
- the algorithm starts from the root node, compares feature values against the split rules and jumps to the next node until a leaf is reached (sketch below)
- pro: simple to understand, useful for decision-related problems, helps think through all possible outcomes of a problem, requires less data cleaning, works well as a preprocessing method
- con: many layers make it complex, computational complexity increases with the number of layers, prone to overfitting (mitigated with Random Forests)
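A short sketch of fitting a decision tree, assuming scikit-learn's DecisionTreeClassifier and the iris dataset purely for illustration:

```python
# Fit and inspect a small decision tree classifier.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# max_depth limits the number of layers, which also limits overfitting
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Each internal node tests a feature, each branch is a decision rule, each leaf an outcome
print(export_text(tree, feature_names=load_iris().feature_names))
print(tree.predict([[5.1, 3.5, 1.4, 0.2]]))  # predict the class of one flower
```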
Random Forest
what is it
differences from decision trees
- an ensemble that averages several decision trees
- each tree is trained on a random sample of the data
- takes the majority vote (classification) or the average (regression) of the outcomes of the individual trees (sketch below)
- less overfitting
- slower due to more computation
- doesn't rely on one set of formulas but on the average of many trees
- much more successful if the individual trees are diverse
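A minimal sketch of a random forest as an ensemble of trees fitted on bootstrap samples with a majority vote; the dataset and hyperparameters are assumptions:

```python
# A random forest is an ensemble of decision trees, each fit on a bootstrap sample.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_estimators trees, each trained on a random sample of the data (bootstrap=True);
# the forest takes a majority vote over the trees' predictions.
forest = RandomForestClassifier(n_estimators=100, bootstrap=True, random_state=0)
forest.fit(X_train, y_train)
print(forest.score(X_test, y_test))  # accuracy of the majority-vote predictions
```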
Bootstrap Sampling
drawing samples from the data with replacement to estimate a population parameter (sketch below)
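A tiny sketch of bootstrap sampling with NumPy, resampling with replacement to estimate the mean of a made-up sample:

```python
# Bootstrap: resample the data with replacement and recompute the statistic each time.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=10.0, scale=2.0, size=50)  # hypothetical observed sample

boot_means = [rng.choice(data, size=len(data), replace=True).mean() for _ in range(1000)]

print(np.mean(boot_means))                     # bootstrap estimate of the population mean
print(np.percentile(boot_means, [2.5, 97.5]))  # rough 95% confidence interval
```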
Naive Bayes
- uses conditional probability (Bayes' theorem) to calculate the likelihood of a point belonging to a certain class
- naively assumes that the predictors are independent of each other
- used for binary or multiclass classif problems
posterior = (prior x likelihood) / evidence, i.e. P(class | data) = P(class) x P(data | class) / P(data) - be able to explain in detail (worked example below)
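A worked example of posterior = (prior x likelihood) / evidence with hypothetical prior and likelihood values, only to show the arithmetic:

```python
# Bayes' theorem: P(class | x) = P(class) * P(x | class) / P(x)
# Hypothetical two-class problem with made-up prior and likelihood values.
prior = {"spam": 0.3, "ham": 0.7}
likelihood = {"spam": 0.8, "ham": 0.1}   # P(word "offer" appears | class)

# Evidence P(x) = sum over classes of prior * likelihood
evidence = sum(prior[c] * likelihood[c] for c in prior)

posterior = {c: prior[c] * likelihood[c] / evidence for c in prior}
print(posterior)  # {'spam': ~0.774, 'ham': ~0.226}
```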
Linear Regression
- model that describes relationship between predictors and outcomes
- simplest linear model
- a key algorithm, commonly used for statistical analysis (fit sketch below)
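A minimal least-squares sketch with NumPy; the toy data and the use of np.polyfit are assumptions, not from the card:

```python
# Ordinary least-squares fit of y = a*x + b on toy data with hypothetical noise.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=x.size)  # true slope 2, intercept 1

# Fit a degree-1 polynomial (a line) to the predictor/outcome pairs
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)            # should be close to 2 and 1
print(slope * 4.0 + intercept)     # prediction for x = 4
```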
Logistic Regression
- adapts linear regression to classification
- models the probability of an event by taking the logistic function of a linear combination of one or more independent variables
- basically puts the linear combination into a function that is bounded between 0 and 1 (sketch below)
- binary, multinomial and ordinal logistic regression
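A minimal sketch of the binary case: the logistic (sigmoid) function maps a linear combination of inputs to a probability between 0 and 1; the weights here are hypothetical, not fitted:

```python
# Logistic regression = sigmoid of a linear combination of the inputs.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))  # bounded between 0 and 1

# Hypothetical fitted weights and bias for two features
w = np.array([1.5, -2.0])
b = 0.3

x = np.array([0.8, 0.4])          # one observation
p = sigmoid(np.dot(w, x) + b)     # probability of the positive class
print(p, "-> class", int(p >= 0.5))
```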
K-Means Clustering
what it is
steps
elbow approach
- groups items into clusters without predefined classes
- each observation belongs to the cluster with the nearest mean
- tries to keep clusters as small as possible
process
- pick centroids
- each data point joins the cluster of the nearest centroid
- find new centroids of the cluster
- iterate until convergence
Elbow approach: how to choose the best value for K - the sum of squared distances within clusters drops quickly as K grows until its reduction becomes slow -> that elbow is the ideal K with the least variation for the fewest clusters (sketch below)
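A from-scratch sketch of the listed steps (pick centroids, assign points to the nearest centroid, recompute centroids, iterate); k = 2 and the toy blobs are assumptions:

```python
# K-means: assign each point to the nearest centroid, then move centroids to cluster means.
import numpy as np

def kmeans(X, k=2, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # step 1: pick centroids
    for _ in range(n_iter):
        # step 2: each observation joins the cluster with the nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # step 3: recompute the centroid of each cluster
        new_centroids = np.array([X[labels == i].mean(axis=0) for i in range(k)])
        if np.allclose(new_centroids, centroids):               # step 4: stop at convergence
            break
        centroids = new_centroids
    return labels, centroids

X = np.vstack([np.random.randn(20, 2), np.random.randn(20, 2) + 5])  # two toy blobs
labels, centroids = kmeans(X, k=2)
print(centroids)
```

For the elbow approach one would run this for several values of k and plot the sum of squared distances of the points to their assigned centroids against k.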
PCA
- dimensionality reduction, removes data that is not useful
- takes the attributes and directions with the most variance/relevance and maps all data onto fewer dimensions
- projection-based method that projects onto set of orthogonal axes
- useful for exploratory analysis
- eigenvalues can be used to determine the number of principal components to keep (sketch below)
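A minimal PCA sketch via the eigen-decomposition of the covariance matrix; the correlated toy data and the choice of 2 components are assumptions:

```python
# PCA: project centred data onto the orthogonal directions of largest variance.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) @ np.array([[2.0, 0.3, 0.1],
                                          [0.3, 1.0, 0.2],
                                          [0.1, 0.2, 0.2]])  # correlated toy data

X_centred = X - X.mean(axis=0)
cov = np.cov(X_centred, rowvar=False)

# Eigenvalues give the variance along each principal component
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]                # sort components by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print(eigvals / eigvals.sum())      # explained variance ratio, used to choose the nr of PCs
X_2d = X_centred @ eigvecs[:, :2]   # map the data onto the 2 top components
print(X_2d.shape)
```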