Session 1 & 2: Intro Flashcards
Algorithm
set of procedures that produces a model when trained. E.g., linear regression
Model
an algorithm after it has been fitted (trained). E.g. a linear regression model that has been trained to predict prices of ‘X’.
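The algorithm/model distinction can be sketched in a few lines (a minimal pure-Python sketch; the function names and toy data are illustrative, not from the cards):

```python
# The "algorithm" is the fitting procedure; the "model" is its fitted
# result, i.e. the learned slope a and intercept b.

def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b (the algorithm)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b  # the trained model's parameters

# Training produces the model (here the data follow y = 2x + 1 exactly):
a, b = fit_linear([1, 2, 3, 4], [3, 5, 7, 9])

def predict(x):          # the model in use
    return a * x + b
```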
Parameters
internal variables of the model that are adjusted automatically during training
Hyperparameters
variables set by the user to control the algorithm; they define how it learns from the data
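The contrast can be shown with k-nearest-neighbours (a toy pure-Python sketch; the data and the choice k=3 are illustrative): k is a hyperparameter the user sets before training, whereas a linear model's slope and intercept are parameters learned automatically.

```python
# k is a hyperparameter (set by the user); a k-NN classifier learns no
# internal parameters, it just stores the training data.

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote of its k nearest 1-D neighbours."""
    neighbours = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    labels = [label for _, label in neighbours]
    return max(set(labels), key=labels.count)

train = [(1.0, "A"), (1.2, "A"), (3.0, "B"), (3.1, "B"), (2.9, "B")]
knn_predict(train, 1.1, k=3)   # majority of the three nearest points is "A"
```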
Data Science
= Collection of statistical & ML models that
- supports info extraction from data
- offers insights, causality, predictions
ML
- science of programming computers so they learn from data without being explicitly programmed
- component of data science
- AI technique for sophisticated cognitive tasks
ML Functions
- Descriptive: uses data to explain what happened
- Predictive: uses data to predict what will happen
- Prescriptive: uses data to suggest actions
ML Subfields
- Natural language processing: machines learn to understand natural language as spoken and written by humans
- Neural networks: modeled on the human brain
- Deep learning networks: neural networks with many layers
When does ML work well?
- (Large) data is available
- Problem is dynamic or fluctuating
- Problem requires predictions or discovering patterns
ML Flow
Preparation
- Identify Question & Task
- Data Collection
- Data preprocessing
- EDA
Model Development
- Feature & Model Selection
- Splitting Data
- Training with Train Set, validation
- Evaluate with test set (repeat model development until results are satisfying)
Communicate results & Deploying Model
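The splitting step above can be sketched in pure Python (a minimal sketch; the test ratio, seed, and toy data are illustrative assumptions):

```python
import random

# Hold out part of the data for evaluation; train only on the rest.
def train_test_split(data, test_ratio=0.25, seed=0):
    rng = random.Random(seed)          # fixed seed -> reproducible split
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

data = list(range(20))
train, test = train_test_split(data)   # 15 training items, 5 test items
```

Shuffling before cutting matters: if the data are ordered (e.g. by date), a plain slice would give train and test sets with different distributions.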
Challenges
- Explainability: What are ML models doing? How are decisions made?
- Bias & unintended outcomes
Challenge: Bias & unintended outcomes
- Insufficient data (not enough training data, non-representative data, irrelevant features)
- Overfitting & underfitting
Bias-Variance Tradeoff
- goal: low bias & low variance
- Bias = error introduced by approximating a real-world phenomenon with a simplified model -> underfitting
- Variance = how much the model's test error changes with variation in the training data; prediction error on data not previously seen by the model -> overfitting
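The two extremes can be made concrete (a pure-Python sketch on toy data that is roughly y = x with noise; both "models" are deliberately naive illustrations):

```python
# A constant model underfits (high bias); a model that memorizes the
# training points overfits (high variance).

train = [(0, 0.0), (1, 1.3), (2, 1.8), (3, 3.2)]
test  = [(0.5, 0.5), (1.5, 1.5), (2.5, 2.5)]

mean_y = sum(y for _, y in train) / len(train)

def constant(x):      # too simple: same answer everywhere (underfit)
    return mean_y

def memoriser(x):     # too flexible: recalls nearest training point (overfit)
    return min(train, key=lambda p: abs(p[0] - x))[1]

def mse(model, data):
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

mse(memoriser, train)   # 0.0: perfect on training data
mse(memoriser, test)    # > 0: does not generalize to unseen data
```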
Subcategories of ML Models
- Based on training:
  - supervised
  - unsupervised
  - reinforcement
  - semi-supervised
- Based on working:
  - instance based
  - model based
Subcategories of ML Models - based on training: supervised & unsupervised
- Supervised = trained with labeled data -> know answers we want
- Unsupervised = looks for patterns in unlabeled data -> goal: find unknown structures / trends
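Unsupervised pattern-finding can be sketched with a tiny 1-D 2-means clustering step (a minimal illustration; the data and starting centroids are invented for the example):

```python
# No labels are given; the algorithm discovers the two groups itself.
data = [1.0, 1.2, 0.8, 8.0, 8.2, 7.9]

def two_means(data, c0, c1, steps=10):
    """Alternate assigning points to the nearest centroid and re-averaging."""
    for _ in range(steps):
        g0 = [x for x in data if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in data if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return c0, c1

c0, c1 = two_means(data, 0.0, 10.0)   # centroids settle near the two groups
```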
Subcategories of ML Models - based on working
- Instance based = memorizes the training data, compares new data to it & generalizes based on similarity
- Model based = learns a model from the training data & predicts labels according to it
Supervised ML Models
- Regression: x -> continuous y
- Classification: x -> discrete (binary) y
Supervised ML Models - Regression
- Linear Regression
- Neural Networks
Unsupervised ML Models - Categories
- Clustering: x -> discrete y
- Dimensionality reduction: x -> continuous y
- Feature selection: select relevant variables
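A crude stand-in for selecting relevant variables is dropping near-constant features (a pure-Python sketch; the data and the variance threshold are invented, and real pipelines would use PCA or proper feature-selection methods instead):

```python
# Columns whose values barely vary carry little information; drop them.
rows = [[1.0, 5.0, 0.1],
        [2.0, 5.0, 0.1],
        [3.0, 5.1, 0.1]]

def variance(col):
    m = sum(col) / len(col)
    return sum((v - m) ** 2 for v in col) / len(col)

cols = list(zip(*rows))                                  # column view
keep = [i for i, c in enumerate(cols) if variance(c) > 0.01]
reduced = [[row[i] for i in keep] for row in rows]       # fewer features
```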
ML versus traditional AI techniques
- trad. AI: static, rule based, no generalization
- ML: dynamic, data driven, generalization
e.g. Chess
- Symbolic AI: sit down with best chess player & put knowledge in PC
- Statistical AI: Simulate all possible moves & outcomes & take most likely to win
- ML: Show millions of examples & let program learn
Where ML > other AI techniques
- tasks programmers can't describe explicitly (handwriting recognition, cognitive reasoning)
- complex multidimensional problems that can't be solved by numerical reasoning (weather forecasting, health care outcomes)
3 C’s of ML
- Collaborative filtering: technique for recommendations, same algorithm for different objects, e.g. Amazon & Netflix
- Classification
- Clustering
Over and underfitting
- Overfitting: learning a function that perfectly explains the training data the model learned from, but doesn’t generalize well - high variance
- Underfitting: model is not complex enough to capture the underlying trend (can happen with strongly correlated features) - high bias
Reinforcement learning
trains through trial & error with a reward system, e.g. a robot learns to walk (drawbacks: tasks are difficult to define, learning can be dangerous, e.g. with cars)
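Trial-and-error with rewards can be sketched with an epsilon-greedy two-armed bandit (a minimal sketch; the payout probabilities, epsilon, and step count are all illustrative assumptions):

```python
import random

def run_bandit(probs, steps=2000, eps=0.1, seed=42):
    """Learn each arm's value purely from trial & error with rewards."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    values = [0.0] * len(probs)         # running average reward per arm
    for _ in range(steps):
        if rng.random() < eps:          # explore: try a random arm
            arm = rng.randrange(len(probs))
        else:                           # exploit: pick the best arm so far
            arm = values.index(max(values))
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return values

values = run_bandit([0.2, 0.8])   # arm 1's estimated value ends higher
```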
semi-supervised learning
learns from few labels: clusters the data & propagates each label to all instances of its cluster
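The cluster-then-propagate idea can be sketched as follows (a toy pure-Python sketch; the points, labels, and given centroids are invented for illustration):

```python
# Only two points are labelled; every point adopts the known label
# of the cluster it falls into.
points = [1.0, 1.1, 0.9, 9.0, 9.2, 8.8]
labels = {0: "cat", 3: "dog"}           # index -> known label

def propagate(points, labels, centroids):
    assignment = [min(range(len(centroids)),
                      key=lambda i: abs(x - centroids[i]))
                  for x in points]
    # each cluster adopts the label of its labelled member
    cluster_label = {assignment[i]: lab for i, lab in labels.items()}
    return [cluster_label[c] for c in assignment]

propagate(points, labels, centroids=[1.0, 9.0])
# ["cat", "cat", "cat", "dog", "dog", "dog"]
```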
Parametric
Any algorithm that learns using a pre-defined mapping function, e.g. linear regression
Non parametric
Any algorithm that makes no assumptions about the form of the mapping function, e.g. KNN & SVM
Pro and con parametric
- pro: simple, fast, less data
- con: constrained to parameters / assumptions, limited complexity, poor fit
Pro and con non parametric
- pro: flexibility (large number of features), high power, good performance
- con: more data, slower, overfitting
Convex
A set where, given any 2 points in it, it contains the whole line segment that joins them