Artificial Intelligence Flashcards
Humans vs computers
- Humans can solve problems quickly, but understanding a complex problem domain takes us a long time: we get there by studying and building on an existing foundation of understanding.
- Computers lack this foundation, but they can quickly learn a narrow problem domain and then solve very specific problems much faster than a human can.
What defines AI?
- A computer-generated solution may be considered intelligent if it solves a complex problem that, if a human were to carry it out, would be considered to require intelligence.
- Intelligence exhibited by machines: a device uses information about its surroundings or a problem to maximise its chance of succeeding at a specific task.
Binary classification
Predict one of two classes (each can be expressed as 1 or 0):
• Yes/No
• Buy/Sell
• Healthy/Unhealthy
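As a minimal sketch (the labels, feature, and threshold are all hypothetical), a binary classification problem encodes the two classes as 1/0 and predicts with some decision rule:

```python
# Minimal sketch: encode a binary Healthy/Unhealthy label as 1/0 and
# predict it with a single (hypothetical) threshold on resting heart rate.
def encode_label(label):
    return 1 if label == "Unhealthy" else 0

def predict(resting_heart_rate, threshold=90):
    # Predict class 1 (Unhealthy) when the feature exceeds the threshold.
    return 1 if resting_heart_rate > threshold else 0

print([encode_label(l) for l in ["Healthy", "Unhealthy", "Healthy"]])  # [0, 1, 0]
print(predict(72), predict(110))  # 0 1
```

A real model learns the decision rule from data rather than using a hand-picked threshold.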
Multi-class classification
- Predicting the presence of more than one class in a dataset.
- The example below shows MRI training data (left) and brain tumour targets (right).
- The system would learn to predict anywhere from 0 to 3 classes per scan, depending on the presence of a tumour and the sub-tissues it contains.
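The 0-to-3-classes idea can be sketched as independent per-class presence scores, each thresholded separately (the class names and scores here are hypothetical, not from the MRI example):

```python
# Minimal sketch: multi-class presence prediction.
# Each class gets an independent score in [0, 1]; a class is predicted
# present when its score exceeds a threshold.
def present_classes(scores, threshold=0.5):
    return [name for name, s in scores.items() if s > threshold]

scores = {"oedema": 0.81, "enhancing_tumour": 0.12, "necrotic_core": 0.66}
print(present_classes(scores))  # ['oedema', 'necrotic_core']
```

With this rule a sample can be assigned zero, one, or several classes, matching the "0 to 3 classes" behaviour described above.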
Types of AI algorithm
SUPERVISED
• The two previous examples of tabular data and brain tumour data with expert masks would be considered supervised datasets.
• This is because we have both the input data and the target data. i.e. we are helping the AI algorithm by providing the intuition to map directly between the input and the target.
• Supervised AI datasets take a long time to prepare and often require multiple experts to rate each sample.
• This expense typically pays off in higher accuracy.
Unsupervised AI
- Unsupervised AI is the idea that we have some input dataset but do not provide a target.
- The intention here is to allow the AI to create its own separation and learn to classify between the classes it defines.
- These are often much more sophisticated algorithms, but they are more prone to training failures and classification errors.
- Unsupervised AI can be very useful if we have a huge amount of information but aren’t sure what the exact answer is that we are looking for.
What is machine learning?
- Machine learning is a branch of artificial intelligence which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy (IBM).
- Enabling a computer to learn from data and improve upon itself without being explicitly told how to do so (Arthur Samuel).
- Where artificial intelligence is the field, machine learning is a smaller area within artificial intelligence which focuses on applying specific algorithms that allow the computer to learn information without being told how to do so.
How do we train a machine learning algorithm
Decision process = given some input data and target data, the algorithm looks at the information and identifies some pattern within the input data, from which it makes a target data prediction.
Error calculation = the machine learning algorithm's prediction is compared against the target and we calculate some error.
Optimisation = the error value is used to tweak the settings (parameters) of the algorithm.
How do we train a machine learning algorithm?
• We repeat these steps (decision process, error calculation, optimisation) until the machine learning algorithm is making decisions that result in a low error.
• This low error then results in a reduction in the amount that the optimiser updates the machine learning algorithm.
• The reduction in changes to the machine learning algorithm then results in a steady state of decision making. CALLED CONVERGENCE
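The three steps above can be sketched as a tiny training loop. The data, the one-parameter model y = w·x, and the learning rate are all hypothetical, chosen only to keep the sketch self-contained:

```python
# Minimal sketch of the training loop: decision process -> error
# calculation -> optimisation, repeated until the error (and hence the
# size of the updates) becomes small, i.e. the model converges.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, target) pairs; true w is 2

w = 0.0    # the algorithm's single tunable "setting"
lr = 0.05  # learning rate: how much the optimiser tweaks w each step
for epoch in range(200):
    for x, target in data:
        pred = w * x           # decision process
        error = pred - target  # error calculation
        w -= lr * error * x    # optimisation (gradient step)

print(round(w, 3))  # converges towards 2.0
```

As the error shrinks, `lr * error * x` shrinks with it, so the updates to `w` die away and the model settles into a steady state: convergence.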
CONVERGENCE
• The reduction in changes to the machine learning algorithm then results in a steady state of decision making.
K-means
machine learning model
- Uses training data to work out how the data can be clustered, based on feature similarity rather than a target class (it is an unsupervised algorithm).
- K = the number of clusters that we want to generate, not necessarily the number of classes.
- This technique is useful after performing data visualisation techniques (plotting, PCA) to see how the information clusters.
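A minimal k-means sketch, in pure Python on hypothetical 1-D data with K = 2, showing the two alternating steps (assign points to the nearest centroid, then move each centroid to its cluster mean):

```python
# Minimal k-means sketch (1-D data, naive "first k points" initialisation).
def kmeans_1d(data, k=2, iters=10):
    centroids = data[:k]
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

data = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
print(sorted(kmeans_1d(data)))  # approximately [1.0, 9.0]
```

Note that no target labels appear anywhere: the two groups emerge purely from the structure of the data.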
Support vector machine
machine learning model
- Linear separation is where you draw a line (more generally, a hyper-plane) between the two groups.
- Learns to fit a separating hyper-plane between data features, where the predicted class depends on which side of the plane the data falls.
- A support vector machine can operate in dimensions we can't visualise.
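The "which side of the plane" rule can be sketched as follows. The weights and bias here are hypothetical hand-picked values standing in for what a trained SVM would learn:

```python
# Minimal sketch of a separating hyper-plane decision rule.
# A trained SVM provides weights w and bias b defining the plane
# w.x + b = 0; the predicted class is the side of the plane x falls on.
def classify(x, w=(1.0, -1.0), b=0.0):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else 0

print(classify((3.0, 1.0)))  # 1: one side of the plane
print(classify((1.0, 4.0)))  # 0: the other side
```

Because the dot product works for any number of features, the same rule applies unchanged in dimensions we cannot visualise.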
Decision tree
- A tree structure in which each internal node makes a yes/no decision about a feature of the input.
- Prediction is made when a leaf node is reached.
- Multiple decision trees can be created and their predictions averaged in a popular machine learning algorithm called Random Forest.
- The idea here is that combining votes from multiple trees is better than a single opinion.
- The basic idea of a decision tree is that we make yes/no decisions about each feature until we reach a target prediction.
- In the example to the right, the questions in orange relate to features of our input data. During the training stage, the question asked at each decision node is tuned to an optimal threshold that gives us a good result.
- The intuition behind this can be compared to human decision making. For example, if we are determining whether to go for a walk we might ask ourselves questions on the weather, whether it’s cold, whether we’re tired, if we need to prioritise something over walking…
- Decision trees are a really good starting point in machine learning as we can visualize their structures and understand how they have solved a problem, as opposed to being a black box like a lot of artificial intelligence algorithms.
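The "go for a walk" example can be written out as a hand-built tree of yes/no decisions. The questions and thresholds are hypothetical; a real decision tree would learn them during training:

```python
# Minimal hand-built decision tree for the "go for a walk?" example.
def go_for_a_walk(raining, temperature_c, tired):
    if raining:               # decision node: weather
        return False          # leaf node: prediction made here
    if temperature_c < 5:     # decision node: is it cold?
        return False          # leaf node
    if tired:                 # decision node: energy level
        return False          # leaf node
    return True               # leaf node

print(go_for_a_walk(raining=False, temperature_c=15, tired=False))  # True
print(go_for_a_walk(raining=True, temperature_c=15, tired=False))   # False
```

This also illustrates why trees are interpretable: the whole decision path can be read off directly, unlike a black-box model.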
leaf node
- Prediction is made when a leaf node is reached
Random Forest
- Multiple decision trees can be created and their predictions averaged in a popular machine learning algorithm called Random Forest.
PRE-PROCESSING – how to treat a dataset before the machine learning model sees it – 5 steps
- Remove redundant information (e.g. height and weight when we also have BMI)
- Remove correlated information (e.g. if we keep height in metres we could delete height in feet, as it is the same information on a different scale)
- Handle missing data (e.g. sample 4 is missing feature 9, resting_heart_rate. The sample could be removed entirely if the missing variable is critical, or we could substitute another value, e.g. the average of this column across all samples)
- Handle categorical data (we often want numerical information for our machine learning models, so we convert binary categoricals, e.g. smoker, to 0 and 1)
- Standardisation (machine learning models like data to be normalised so that no one feature dominates the error calculation. Below we have implemented z-score normalisation, but we could also scale to [-1, 1], [0, 1], or any other range)
Pre-processing is not limited to the steps above, nor do we have to implement any of them if we have reason to believe they will harm the performance of our algorithm.
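Three of the steps above can be sketched on a tiny hypothetical dataset (the feature names and values are invented for illustration):

```python
# Minimal sketch: missing-value handling, categorical encoding, and
# z-score standardisation on a tiny hypothetical dataset.
rows = [
    {"resting_heart_rate": 60.0, "smoker": "no"},
    {"resting_heart_rate": None, "smoker": "yes"},  # missing value
    {"resting_heart_rate": 80.0, "smoker": "no"},
]

# 1. Handle missing data: replace None with the column average.
known = [r["resting_heart_rate"] for r in rows if r["resting_heart_rate"] is not None]
fill = sum(known) / len(known)
for r in rows:
    if r["resting_heart_rate"] is None:
        r["resting_heart_rate"] = fill

# 2. Handle categorical data: encode the binary categorical as 0/1.
for r in rows:
    r["smoker"] = 1 if r["smoker"] == "yes" else 0

# 3. Standardisation: z-score the numeric feature, (x - mean) / std.
vals = [r["resting_heart_rate"] for r in rows]
mu = sum(vals) / len(vals)
std = (sum((v - mu) ** 2 for v in vals) / len(vals)) ** 0.5
for r in rows:
    r["resting_heart_rate"] = (r["resting_heart_rate"] - mu) / std

print(rows)
```

After these steps every feature is numeric and on a comparable scale, so no single feature dominates the error calculation.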
Validation – checking whether or not our machine learning model is performing to a standard at which we can use it in the real world
• If we have a total of 1000 samples available, we don’t want to use all of these to train the model.
• Instead, we should hold some of this data out of training so that we can evaluate how our model is performing throughout training.
• To do this we would first train the model on the 750 training samples, calculate the error, and update the algorithm parameters.
• We would then take the 250 validation samples, apply these to the model, and calculate the error.
§ This validation error represents how our algorithm performs on samples whose targets it hasn’t been trained on.
§ This better represents how our trained algorithm might work on unseen data from different sources.
• We then unlock the model, re-apply the training data, and continue repeating these steps until we finish training.
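The 750/250 split described above can be sketched as follows. The "model" here is deliberately trivial (a single constant fitted by gradient descent) and the data are hypothetical, just to show the alternation between a training step that updates the model and a validation step that only measures error:

```python
# Minimal sketch of the train/validation loop: 1000 hypothetical
# samples, 750 used for training and 250 held out for validation.
samples = [float(i % 10) for i in range(1000)]  # hypothetical targets
train, val = samples[:750], samples[750:]

c = 0.0   # the model: predict a single constant
lr = 0.1  # learning rate
for _ in range(100):
    # Training step: compute error on the training set, update the model.
    grad = sum(2 * (c - t) for t in train) / len(train)
    c -= lr * grad
    # Validation step: the model is "locked"; we only measure the error.
    val_err = sum((c - t) ** 2 for t in val) / len(val)

print(round(c, 2))  # 4.5, the training-set mean
```

Because the validation samples never influence the update to `c`, the validation error is an honest estimate of performance on data the model was not trained on.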