lv. 3 - Copy of CS35 Flashcards
Series of tasks, activities, or operations to achieve a goal or an outcome
Process
Combination of hardware and software to facilitate or automate processes
Technology
Discrete measurement, fact, or observation representing a real-world process
Data
the mathematical discipline that studies the methods of collecting, analyzing, and interpreting data.
Statistics
specific collection of items of interest
Population
subset or subcollection of the population
Sample
two scopes of data
Sample & Population
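To make the population/sample distinction concrete, here is a minimal sketch. The customer ages are made-up values and the sample size of 50 is an arbitrary choice for illustration.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 customers (made-up values).
random.seed(0)
population = [random.randint(18, 70) for _ in range(1000)]

# A sample is a subset of the population: here, 50 items drawn without replacement.
sample = random.sample(population, 50)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population size={len(population)}, mean={population_mean:.2f}")
print(f"sample size={len(sample)}, mean={sample_mean:.2f}")
```

The sample mean is only an estimate of the population mean, so the two printed values will usually be close but not identical.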
Logic is built based on business rules
Traditional Rule-Based AI
Logic is built by modelling and training data
Machine Learning
Input data, and sometimes output data, are provided to a machine, which builds its own logic from the data using mathematical rules
Machine Learning
Machine learning algorithms in which the training data includes both input and output
Supervised Machine Learning
Inputs are called
feature values
outputs are called
label values
the label predicted by the model is a numeric value
Regression
the model predicts whether or not a record is an instance of one specific class or category (two possible outcomes)
Binary Classification
the model predicts which one of multiple classes or categories a record belongs to
Multiclass Classification
Training data consists only of input without any known output
Unsupervised Machine Learning
the model identifies similarities between observations based on their features and groups them into discrete clusters
Clustering
A model that groups existing customers into clusters based on age, location, gender, social media usage, and purchasing behavior.
Clustering
A model that classifies whether a social media post is positive, negative, or neutral.
Multiclass Classification
A model that predicts whether a customer will cancel their subscription.
Binary Classification
A model that predicts the price of an apartment based on its size, number of rooms, barangay, and construction date.
Regression
Used to train the model; the data from which the algorithm learns patterns
Training Data
Used to evaluate the model on data it did not see during training
Test Data
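One common way to obtain training and test data is to shuffle the records and hold out a fraction for testing. The sketch below assumes a hypothetical helper named `split_train_test` and an 80/20 split; neither the name nor the ratio comes from these cards.

```python
import random

def split_train_test(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and return (train, test).

    Hypothetical helper for illustration; test_fraction and seed are assumptions.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for repeatable splits
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

records = list(range(10))
train, test = split_train_test(records)
print(len(train), len(test))  # 8 2
```

The model is fit only on `train`; `test` is kept aside so the evaluation reflects records the model has not seen.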
Proportion of predictions that the model got right
Accuracy
Proportion of predicted positive cases where the true label is actually positive
Precision
Proportion of actual positive cases that the model correctly identified as positive
Recall
Overall metric combining Precision and Recall (their harmonic mean)
F1 Score
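The four metrics above can be computed from counts of true/false positives and negatives. This is a minimal sketch for the binary case; the function name `classification_metrics`, the sample labels, and the choice to return 0.0 when a denominator is zero are assumptions for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Assumption: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example there are 3 true positives, 2 false positives, 1 false negative, and 4 true negatives, so precision (3/5) and recall (3/4) differ and F1 falls between them.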
a lazy learning algorithm that predicts the class of a data point based on the majority class of its k nearest neighbors
k-NN classifier
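A minimal sketch of the majority-vote idea, assuming Euclidean distance and a hypothetical function name `knn_predict`; the points and labels are made up. "Lazy" shows up here as the absence of a training step: all the work happens when a prediction is requested.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Distance from the query to every stored training point (Euclidean, an assumption).
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    # Majority vote; on a tie, the label encountered first among the nearest wins.
    return Counter(nearest_labels).most_common(1)[0][0]

train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
                (6.0, 6.0), (7.0, 7.5), (6.5, 6.0)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (1.2, 1.4)))  # A
print(knn_predict(train_points, train_labels, (6.4, 6.8)))  # B
```

An odd k is a common choice for two-class problems because it avoids tied votes.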
predicts the probability that a given data point belongs to a particular class, using the logistic function
Logistic Regression
an S-shaped curve used in logistic regression to map any real number to a value between 0 and 1
logistic function
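A minimal sketch of the logistic function and how logistic regression can use it to turn a weighted sum of features into a probability. The weights, bias, and feature values are made up, and `predict_probability` is a hypothetical helper name.

```python
import math

def logistic(z):
    """Logistic (sigmoid) function: 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)  # safe: z is negative, so exp(z) is at most 1
    return exp_z / (1.0 + exp_z)

def predict_probability(features, weights, bias):
    """Probability of the positive class for one record (hypothetical helper)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(z)

# The S shape: values near 0 for very negative z, 0.5 at z = 0, near 1 for large z.
for z in (-6, -2, 0, 2, 6):
    print(f"z={z:>2}: {logistic(z):.3f}")

# Made-up weights and bias, for illustration only.
probability = predict_probability([2.0, 1.0], weights=[0.8, -0.5], bias=-0.3)
print(f"predicted probability: {probability:.3f}")
```

A class label is usually obtained by comparing the probability to a threshold; 0.5 is a common default rather than a fixed rule.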
occurs when one class is significantly more frequent than the other
Class Imbalance
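Class imbalance matters because accuracy alone can look good on a model that never finds the rare class. A minimal sketch with made-up labels (5 positives out of 100 records) and a trivial "model" that always predicts the majority class:

```python
# Made-up, heavily imbalanced labels: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95

# A trivial baseline that always predicts the majority class (0).
y_pred = [0] * len(y_true)

correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
true_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
actual_positives = sum(1 for t in y_true if t == 1)

accuracy = correct / len(y_true)
recall = true_positives / actual_positives

print(f"accuracy: {accuracy:.2f}")  # 0.95
print(f"recall:   {recall:.2f}")    # 0.00
```

The baseline is right 95% of the time yet identifies none of the positive cases, which is why recall, precision, and F1 are usually checked alongside accuracy on imbalanced data.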