CHAP 5 : ML Intro and KNN Flashcards
What is the difference between traditional programming and machine learning?
In traditional programming, the computer merely follows the instructions in the program. In machine learning, the computer learns from experience (data), much as a human does, and uses what it has learned to make decisions.
Traditional programming:
- Input : Data, rules (logic)
- Output : Answers
Machine learning:
- Input : Data, answers
- Output : Logic
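A minimal sketch of the contrast above, using a Celsius-to-Fahrenheit conversion as a made-up example (the function names and data are illustrative assumptions, not from the chapter). The first function has its rule written by hand; the second recovers the rule from data and answers with a least-squares line fit.

```python
# Traditional programming: data + hand-written rule -> answers.
def celsius_to_fahrenheit(celsius):
    return celsius * 9 / 5 + 32


# Machine learning: data + answers -> rule (here, a fitted straight line).
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


data = [0.0, 10.0, 20.0, 30.0, 40.0]                 # inputs
answers = [celsius_to_fahrenheit(c) for c in data]   # known answers
slope, intercept = fit_line(data, answers)           # learned "logic"
```

The fitted slope and intercept should come out close to 1.8 and 32, i.e. the rule the program was never told explicitly.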
What are the 3 categories that machine learning can be broadly classified into?
- Supervised learning
- Unsupervised learning
- Reinforcement learning
What kinds of tasks are solved in supervised learning? Give examples.
- Classification – object classification in surveillance videos / photos, etc.
- Regression – Prediction of house prices
What is the goal of supervised learning and what kind of data is passed into supervised learning?
The goal of a supervised learning algorithm is to learn patterns in the data and build a general set of rules that maps each input to its class / event.
Human-labelled data is passed as input.
What are the 3 stages of a supervised learning model?
- Training
- Testing / Validation
- Classification / prediction
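The three stages can be sketched with a deliberately tiny classifier. Everything here is an illustrative assumption: the "small"/"large" labels, the one-feature data, and the threshold model are hypothetical and only meant to show where each stage sits.

```python
def train(samples):
    """Training: learn a threshold halfway between the two class means."""
    small = [x for x, label in samples if label == "small"]
    large = [x for x, label in samples if label == "large"]
    return (sum(small) / len(small) + sum(large) / len(large)) / 2


def predict(threshold, x):
    """Prediction: apply the learned rule to one input."""
    return "large" if x > threshold else "small"


def accuracy(threshold, samples):
    """Testing: fraction of held-out labelled samples predicted correctly."""
    correct = sum(predict(threshold, x) == label for x, label in samples)
    return correct / len(samples)


# Human-labelled data, split into a training set and a test set.
train_set = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]
test_set = [(1.5, "small"), (7.5, "large"), (3.0, "small")]

threshold = train(train_set)                    # stage 1: training
test_accuracy = accuracy(threshold, test_set)   # stage 2: testing / validation
new_label = predict(threshold, 6.0)             # stage 3: prediction on unseen input
```

The test set is kept separate from the training set so that the accuracy reflects how the rule generalises rather than how well it memorised.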
What kinds of tasks are solved in unsupervised learning? Give examples.
- Clustering – grouping of similar customer profiles (aka customer segmentation for marketing purposes)
- Dimensionality reduction – finding key features in data
- Anomaly detection – detecting fraudulent credit card transactions
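As a rough illustration of clustering, the sketch below groups hypothetical customer spending figures with a one-dimensional k-means loop. k-means is just one possible clustering algorithm and is not named in these cards; the data, starting centroids, and function name are assumptions. Note that no labels are supplied, which is what makes the task unsupervised.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Assign each value to its nearest centroid, then move each centroid
    to the mean of its assigned values; repeat a fixed number of times."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        centroids = [
            sum(c) / len(c) if c else centroids[i]   # keep an empty cluster's centroid
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


monthly_spend = [12.0, 15.0, 14.0, 95.0, 102.0, 98.0]   # hypothetical, unlabelled
centroids, clusters = kmeans_1d(monthly_spend, centroids=[0.0, 50.0])
```

With this data the loop settles into one low-spend group and one high-spend group, which is the kind of split customer segmentation looks for.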
What is dimensionality reduction? [not so impt]
Transformation of data from a high-dimensional space into a low-dimensional space such that the low-dimensional representation retains some meaningful properties of the original data.
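A small sketch of the idea, assuming principal component analysis (PCA) as the technique; the cards do not name a method, so this is one possible choice. It reduces 2-D points to 1-D by projecting them onto the direction of greatest variance, using the closed-form angle for a 2x2 covariance matrix. The function name and sample points are hypothetical.

```python
import math


def project_to_1d(points):
    """Project 2-D points onto their first principal direction."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 covariance matrix.
    var_x = sum(dx * dx for dx, _ in centred) / n
    var_y = sum(dy * dy for _, dy in centred) / n
    cov_xy = sum(dx * dy for dx, dy in centred) / n

    # Angle of the eigenvector with the largest eigenvalue.
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    ux, uy = math.cos(angle), math.sin(angle)

    # One coordinate per point instead of two.
    return [dx * ux + dy * uy for dx, dy in centred]


points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]   # all lie on y = 2x
projected = project_to_1d(points)
```

Because these points lie exactly on a line, the single projected coordinate keeps all of the spacing between them, so nothing meaningful is lost by dropping a dimension.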
What is the idea behind reinforcement learning?
- These algorithms map situations to actions that yield the maximum reward
- For every action, there is a reward defined by the user
- The algorithm learns to find the action that maximises the reward
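The reward-maximising loop can be sketched in its simplest form: a single situation with three actions, each with a fixed user-defined reward. The action names, reward values, and the epsilon-greedy strategy are assumptions chosen for illustration; real reinforcement learning problems also involve many situations (states) and delayed rewards.

```python
import random

# Hypothetical user-defined reward for each action.
rewards = {"left": 1.0, "stay": 0.0, "right": 5.0}


def learn_best_action(rewards, steps=100, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    actions = list(rewards)
    estimates = {a: 0.0 for a in actions}   # learned value of each action
    counts = {a: 0 for a in actions}
    for step in range(steps):
        if step < len(actions):
            action = actions[step]                       # try every action once
        elif rng.random() < epsilon:
            action = rng.choice(actions)                 # explore occasionally
        else:
            action = max(estimates, key=estimates.get)   # exploit best estimate
        reward = rewards[action]
        counts[action] += 1
        # Running average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return max(estimates, key=estimates.get), estimates


best_action, estimates = learn_best_action(rewards)
```

The agent is never told which action is best; it only observes rewards and ends up preferring the action whose estimated reward is highest.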
Give examples of reinforcement learning.
- AI for games (e.g. Atari game AIs / game bots)
- Self-driving cars
What are the 3 major tasks in ML?
- Classification
- Regression
- Clustering
What kinds of data are used in ML algorithms? [2]
- Numerical
- Categorical
What are the 2 kinds of numerical data? Give examples.
- Discrete data – counts that take only integer values and cannot be subdivided into parts (e.g. number of children)
- Continuous data – can be meaningfully divided into finer levels; measured on a scale / continuum and can take almost any numeric value (e.g. weight and height)
What are the 2 kinds of categorical data? Give examples.
- Nominal data – used for labelling variables, has no order (sometimes referred to as “labels”) – e.g. Gender
- Ordinal data – values of ordinal data have some natural ordering – e.g. rating product on a scale of 1-5, clothes sizing (small, medium, large)
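One practical consequence of the nominal / ordinal distinction is how the values are usually turned into numbers. The sketch below shows one common convention, offered as an assumption and not something the chapter prescribes: ordinal values become integer ranks that preserve their order, while nominal values become one-hot vectors so that no false order is implied. The colour data is a hypothetical nominal example.

```python
# Ordinal: the order matters, so map each value to its rank.
sizes = ["small", "large", "medium"]
size_order = {"small": 0, "medium": 1, "large": 2}
ordinal_encoded = [size_order[s] for s in sizes]

# Nominal: no order, so give each category its own 0/1 column.
colours = ["red", "green", "red"]
categories = sorted(set(colours))   # column order is arbitrary; sorted for repeatability
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colours]
```

Encoding a nominal variable as 0, 1, 2 would suggest that one category is "greater" than another, which is why the one-hot form is often preferred for it.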
Higher dimensional data provides more details but incurs more computation in Machine Learning. True or False?
True
What are the characteristics of the kNN algorithm?
kNN is an algorithm that is:
- Supervised – needs labelled training data
- Non-parametric – makes no assumption about the data distribution
- Lazy learning – has no explicit training phase and generates no model; the training data is simply stored and used at prediction time
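A minimal kNN classifier, sketched under a few assumptions: Euclidean distance, majority vote, k = 3, and made-up two-feature data with labels "A" and "B". The function name is hypothetical. All three characteristics are visible: it needs labelled data, it assumes nothing about how the data is distributed, and there is no training step, only a lookup over the stored samples at prediction time.

```python
import math
from collections import Counter


def knn_predict(training_data, query, k=3):
    """Classify `query` by majority vote among its k nearest labelled samples.

    training_data: list of (features, label) pairs, features as tuples.
    """
    by_distance = sorted(training_data, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Labelled samples are stored as-is; nothing is fitted in advance.
training_data = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 7.5), "B"), ((6.5, 8.0), "B"),
]

prediction_near_a = knn_predict(training_data, (2.0, 2.0))
prediction_near_b = knn_predict(training_data, (6.0, 7.0))
```

Since every prediction scans all stored samples, the cost grows with both the number of samples and the number of features, which ties back to the card on higher-dimensional data incurring more computation.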