ML Flashcards

1
Q

Artificial Intelligence

A

Program that can sense, reason, act & adapt

2
Q

Machine Learning

A

Algorithms whose performance improves as they are exposed to more data

3
Q

Deep Learning

A

Multilayered neural networks that learn from large amounts of data

4
Q

Types of analytics in order of complexity

A

Descriptive
Predictive
Prescriptive

5
Q

Ability to process large data because of ________. Using:
1.
2.
3.

A

Infrastructure

  1. Cloud services
  2. GPU: Handles graphics rendering tasks
  3. TPU: Gives high performance & power efficiency when running TensorFlow
6
Q

Types of ML (3 types)

A

Supervised - labelled training data
Unsupervised - no labelled training data
Reinforcement - interaction, positive/negative feedback

7
Q

Summary of what to use:
Continuous = …
Discrete = …

A

…Accuracy
…Confusion Matrix
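
For the discrete case, both tools are available in scikit-learn's metrics module. A minimal sketch with made-up labels:

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical labels, just to illustrate the two metrics
y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]

print(accuracy_score(y_true, y_pred))    # fraction of correct predictions
print(confusion_matrix(y_true, y_pred))  # rows = true class, columns = predicted class
```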

8
Q

When should you use Support Vector Machines

A

Discriminative classifier
Clear margin of separation between categories
Used on small, clean data sets
Don't suffer from overfitting as much as other methods
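
A minimal sketch, assuming scikit-learn and toy data (the RBF kernel is the library default, not from the card):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Small, clean synthetic dataset - the setting where SVMs shine
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC()  # discriminative classifier that maximises the margin
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))  # mean accuracy on held-out data
```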

9
Q

When should you not use Support Vector Machines

A

Large data sets, because the required training time is higher; also noisy data sets

10
Q

Decision trees

A

Used for regression tasks - continuous variable decision trees
Used for classification - categorical variable decision trees
Why use them?
→ involve stratifying or segmenting the predictor space into a number of regions
→ good for non-linear data
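
A sketch of both flavours in scikit-learn (toy datasets, depth capped for readability):

```python
from sklearn.datasets import load_diabetes, load_iris
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Categorical target -> classification tree
X_c, y_c = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=3).fit(X_c, y_c)

# Continuous target -> regression tree
X_r, y_r = load_diabetes(return_X_y=True)
reg = DecisionTreeRegressor(max_depth=3).fit(X_r, y_r)

print(clf.score(X_c, y_c))  # accuracy
print(reg.score(X_r, y_r))  # R^2
```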

11
Q

Random Forests

A

Belong to the general category of ensemble methods
Increase predictive accuracy, but sometimes at the expense of explainability
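
A minimal sketch with scikit-learn (toy data; note there is no simple way to read the 100 combined trees, which is the explainability cost):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.score(X, y))  # the trees vote; accurate but hard to explain
```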

12
Q

Ways to increase accuracy

A

Bagging:
Trains each model independently, in parallel, & combines them by averaging their predictions

Boosting:
Trains models sequentially & adaptively, each correcting the errors of the previous, to improve the predictions of the learning algorithm
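
Both ideas have off-the-shelf scikit-learn versions; a sketch on toy data:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=300, random_state=0)

# Bagging: independent trees in parallel, predictions averaged
bag = BaggingClassifier(n_estimators=50, random_state=0).fit(X, y)
# Boosting: trees built sequentially, each correcting the last
boost = GradientBoostingClassifier(n_estimators=50, random_state=0).fit(X, y)

print(bag.score(X, y), boost.score(X, y))
```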

13
Q

Pruning

A

Decreases the size of decision trees by removing sections of the tree that are non-critical & redundant for classifying instances.

Decreases the complexity of the final classifier, therefore increasing accuracy by decreasing overfitting
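
One concrete form is scikit-learn's cost-complexity pruning; the ccp_alpha value below is an arbitrary illustration:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X_train, y_train)

# The pruned tree is smaller and usually generalises better
print(full.tree_.node_count, full.score(X_test, y_test))
print(pruned.tree_.node_count, pruned.score(X_test, y_test))
```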

14
Q

Why use Unsupervised Learning

A

Easier to obtain unlabelled data
Takes place in real time
Lower complexity in comparison to supervised learning
Finds all kinds of unknown patterns in data

15
Q

K-means cluster analysis

A

Helps with data-driven insights
Deep domain knowledge is required
No right or wrong answer; it's a matter of interpretation

16
Q

Isolation Forest Algorithm

A

Basic idea is to overfit decision tree models

  • Grows random decision tree models until each instance is in its own leaf
  • Model is forced to keep splitting
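
A minimal sketch with scikit-learn, using toy data with a few planted outliers:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)),   # inliers
               rng.uniform(-6, 6, (5, 2))])  # planted outliers

iso = IsolationForest(random_state=0).fit(X)
print(iso.predict(X))  # 1 = inlier, -1 = anomaly (isolated after few random splits)
```
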
17
Q

Dimensionality reduction

A

Decreases the number of features to easily understandable dimensions

Helps project data into dimensions the brain can understand
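
PCA is the standard example; a sketch projecting 4 features down to 2 with scikit-learn:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)  # 4 features per sample
X_2d = PCA(n_components=2).fit_transform(X)
print(X.shape, "->", X_2d.shape)   # (150, 4) -> (150, 2), easy to plot
```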

18
Q

Deep Learning

A

Can be supervised or unsupervised
Uses multilayered neural networks, called deep neural networks (DNNs), to simulate the complex decision-making power of the human brain

19
Q

Convolutional neural nets

A
20
Q

What’s the difference between loss function and optimisation?

A

Loss function: Measures how far a model’s predictions are from the actual values. Lower loss indicates a better fit.

Optimization: The process of adjusting a model to minimize the loss function and improve its performance.

Analogy: Imagine training a dog to fetch. The loss function is like the distance between the ball and where the dog drops it. Optimization is the process of training the dog to minimize that distance (bring the ball back closer each time).
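
A hand-rolled sketch in NumPy: mean squared error as the loss, gradient descent as the optimisation (data and learning rate are made up):

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * X + 1.0  # the 'actual values' for this toy example

w, b, lr = 0.0, 0.0, 0.01
for _ in range(5000):
    pred = w * X + b
    grad_w = np.mean(2 * (pred - y) * X)  # gradient of the MSE loss
    grad_b = np.mean(2 * (pred - y))
    w -= lr * grad_w                      # optimisation step: reduce the loss
    b -= lr * grad_b

print(round(w, 2), round(b, 2))  # approaches the true 2.0 and 1.0
```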

21
Q

Backpropagation

A

A method in deep learning to adjust the internal parameters of a neural network. It calculates the error between predictions and actual values, then propagates it backwards through the network to update weights and biases, making the model learn from its mistakes.

Analogy: Imagine a team game where players need to improve their coordination. Backpropagation is like a coach reviewing plays, identifying where things went wrong (high error), and guiding each player (weights and biases) on how to adjust for the next round.
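
A from-scratch sketch in NumPy (toy XOR data, sigmoid activations; exact outputs depend on the random initialisation):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)
sigmoid = lambda z: 1 / (1 + np.exp(-z))

for _ in range(10_000):
    # Forward pass: compute predictions
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the error back to every weight and bias
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= h.T @ d_out; b2 -= d_out.sum(axis=0)
    W1 -= X.T @ d_h;   b1 -= d_h.sum(axis=0)

print(out.round(2).ravel())  # should approach [0, 1, 1, 0]
```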

22
Q

Transfer Learning

A

Reusing a pre-trained neural network model on a new task. The pre-trained model (usually trained on a massive dataset) acts as a starting point, with its weights fine-tuned for the specific needs of the new task. This saves time and data compared to training a new model from scratch.

Analogy: Learning to ride a bike. Once you know the basics (balance, pedaling), you can transfer that knowledge to ride a motorcycle (similar concept, new controls). Transfer learning leverages pre-existing knowledge (pre-trained model) for a new task (riding a motorcycle).
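
A common concrete version, sketched with PyTorch/torchvision (assumes a recent torchvision; the 10-class head is a made-up target task):

```python
import torch.nn as nn
from torchvision import models

# Start from a network pre-trained on ImageNet
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the learned feature extractor
for p in model.parameters():
    p.requires_grad = False

# Replace the final layer with a fresh head for the new task (10 classes here)
model.fc = nn.Linear(model.fc.in_features, 10)
# Training now only updates model.fc, which needs far less data and time
```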

23
Q

Pros and Cons of Deep Learning

A

Pros:

High Accuracy: Can achieve state-of-the-art performance on complex tasks like image recognition, natural language processing, and speech recognition.
Feature Learning: Discovers patterns and features from raw data automatically, eliminating feature engineering.
Scalability: Handles large and complex datasets effectively.

Cons:

High Computational Cost: Requires powerful GPUs and large amounts of data for training, making it resource-intensive.
Black Box Problem: Deep models can be complex and difficult to interpret, making it challenging to understand their decision-making process.
Overfitting: Prone to overfitting if not trained properly, leading to poor performance on unseen data.

24
Q

Stemming

A

Popular way to decrease vocabulary size in natural language tasks by conflating words with related meanings

Aims to convert words with the same stem/root to a single word type
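
A sketch with NLTK's PorterStemmer (assumes nltk is installed):

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
words = ["connect", "connected", "connection", "connecting"]
print([stemmer.stem(w) for w in words])  # all four conflate to 'connect'
```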

25
Q

Tokenisation

A

Splitting texts into individual words or sequences of words [N-grams]
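
A plain-Python sketch of word tokens and bigrams:

```python
text = "the quick brown fox"
tokens = text.split()                    # word-level tokens
bigrams = list(zip(tokens, tokens[1:]))  # 2-grams (pairs of adjacent words)
print(tokens)   # ['the', 'quick', 'brown', 'fox']
print(bigrams)  # [('the', 'quick'), ('quick', 'brown'), ('brown', 'fox')]
```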

26
Q

K-means clustering

A

An unsupervised learning algorithm that groups similar data points together into clusters. It iteratively assigns data points to the closest cluster center (mean) and recalculates the center until convergence (minimal changes).

Analogy: Sorting socks by color. K-means separates data points (socks) into piles (clusters) based on their features (color) by minimizing the distance between each sock and the average color of its pile.
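
A minimal sketch with scikit-learn on toy 2-D blobs:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.labels_[:10])      # cluster assignment per point
print(km.cluster_centers_)  # the three learned 'means'
```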

27
Q

Supervised vs Unsupervised Learning

A

Supervised learning: Learns from labeled data (data with known outcomes) to make predictions on unseen data. It’s like learning with a teacher who provides examples and corrects mistakes.

Unsupervised learning: Discovers patterns and structures in unlabeled data (data without predefined categories). Imagine exploring a new city with no map - you learn by looking for landmarks and patterns.

Key Difference: Labeled data. Supervised learning uses labeled data for training, while unsupervised learning doesn’t.