AI Revision Flashcards
What is combinatorial explosion?
Where the number of possible combinations or permutations increases rapidly with the size of the input or problem.
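Illustration (a minimal Python sketch, not from the original notes): counting the orderings of n distinct items, which is n!, shows how quickly the number of possibilities grows. The function name count_orderings is made up for this example.

```python
from math import factorial

def count_orderings(n: int) -> int:
    """Number of ways to order n distinct items (n factorial)."""
    return factorial(n)

# Each small increase in n multiplies the number of orderings.
for n in (5, 10, 15, 20):
    print(n, count_orderings(n))
```

Going from 5 items to 10 items takes the count from 120 to 3,628,800.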
What is the Turing test? What is its objective? Provide one real-life example of the test.
The Turing test was designed by Alan Turing. An interrogator is connected to a person and a machine via a terminal and cannot see either of them. The objective is to work out which candidate is the machine and which is the human only by asking them questions. If the machine can fool the interrogator 30% of the time, it is considered intelligent.
Real-life examples -> CAPTCHA | Siri
What is machine learning?
The ability of a computer to learn without being explicitly programmed. Seen as a part of AI, it builds a model based on sample data.
What is supervised learning vs unsupervised learning?
Supervised learning gives the model labelled data, meaning it relies on labelled input and output training data, whereas unsupervised learning processes unlabelled or raw data.
SIMPLE TERMS -> SUPERVISED -> Train a model on labelled data, goal is to predict the label of new instances. UNSUPERVISED -> Analysis of unlabelled data, goal is to uncover hidden patterns
What are a few examples of supervised and unsupervised learning? (2 each)
Supervised Learning ->
1. Handwriting Recognition
2. Traffic prediction
Unsupervised learning ->
1. Ad targeting (group similar customers together and send adverts specific to those clusters)
2. Crime hotspot detection (identify areas with high crime rates)
What is clustering?
A technique used in unsupervised learning. It is given an unlabelled dataset and a similarity metric, and the goal is to find a ‘natural’ partitioning, i.e. groups of similar data points.
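Illustration (a minimal sketch, not from the original notes): k-means is one common clustering method, chosen here only as an example. The similarity metric is assumed to be squared Euclidean distance, and seeding the centroids with the first k points is a simplification made for this sketch.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points (the similarity metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=10):
    """Tiny k-means: returns (centroids, assignments)."""
    centroids = [points[i] for i in range(k)]  # assumption: seed with first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignments[i] = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignments

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)
```

No labels are supplied; the two groups emerge only from how close the points are to each other.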
What are association rules?
A technique used in unsupervised learning. It is given a set of transaction records containing items, and its goal is to produce dependency rules that predict the occurrence of one variable based on the occurrences of other variable(s).
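Illustration (a minimal sketch, not from the original notes): support and confidence are the usual measures of how strong a rule such as {bread} -> {milk} is. The shopping-basket data and the helper names are invented for this example.

```python
# Hypothetical transaction records, each a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def count(itemset, transactions):
    """Number of transactions containing every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)

def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    return count(antecedent | consequent, transactions) / count(antecedent, transactions)

print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
```

Here 3 of the 4 baskets containing bread also contain milk, so the rule {bread} -> {milk} has confidence 0.75.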
What is data mining?
Analysis of large quantities of data to discover patterns that are valid, non-obvious (new to the system), actionable (it should be possible to act on them) and interpretable (humans should ultimately be able to interpret the pattern/model in the data).
How is data used for training in supervised learning?
Data is split into 2 sets: training data, which is used to learn the parameters of the model, and test data, which is used to test how well the model works.
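Illustration (a minimal sketch, not from the original notes): shuffling the data and holding back a fraction for testing. The function name, the 25% test fraction and the fixed seed are assumptions made for this example.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows, then hold back a fraction of them as test data."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # fixed seed so the split is repeatable
    n_test = int(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]  # (training data, test data)

train, test = train_test_split(range(20))
print(len(train), len(test))
```

The model only ever learns from train; test stays unseen until it is time to check how well the model works.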
What are the pros and cons of decision trees?
Pros -> Easy to interpret and explain decisions, reasonable training time.
Cons -> Small variations in data can lead to very different trees, only simple decision boundaries, over-complex trees can overfit.
How do you construct a decision tree?
When constructing a decision tree there are a few things to consider:
1. The root node: the feature with the largest information gain.
2. The internal nodes: decision rules on features (true / false tests).
3. The leaf nodes: the predicted class label.
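Illustration (a minimal sketch, not from the original notes): choosing the root node by computing the information gain of each feature with entropy. The tiny weather dataset and the function names are invented for this example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Entropy before the split minus the weighted entropy after it."""
    total = len(labels)
    remainder = 0.0
    for value in {row[feature] for row in rows}:
        subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# Hypothetical training data.
rows = [
    {"outlook": "sunny", "windy": False},
    {"outlook": "sunny", "windy": True},
    {"outlook": "rainy", "windy": False},
    {"outlook": "rainy", "windy": True},
]
labels = ["play", "play", "stay", "stay"]

# The feature with the largest information gain becomes the root node.
root = max(["outlook", "windy"], key=lambda f: information_gain(rows, labels, f))
print(root)
```

In this data "outlook" separates the labels perfectly while "windy" tells us nothing, so "outlook" is picked as the root.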
What is linear regression?
Supervised learning technique to estimate a relationship between an output variable Y (label) and input feature(s) X | GOAL - Find the line of best fit.
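Illustration (a minimal sketch, not from the original notes): fitting a line of best fit y = slope * x + intercept to one input feature using ordinary least squares. The function name and the sample data are invented for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that lies exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

Once fitted, a new label is predicted as slope * x + intercept.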
What are the pros and cons of linear regression?
Pros -> Easy to interpret, fast to train, can work with limited training data.
Cons -> Only applicable if the relationship is linear, relies on the assumption that none of the features are highly correlated.
What is model evaluation?
Measuring how well the model performs on the validation set.
What is model tuning?
Adjusting the model's hyperparameters, e.g. -> the maximum depth of a decision tree
What is overfitting?
Where the model performs well on the training data but poorly on unseen or test data, because it has become too specialized to the training data and is not able to generalize well to new data.
What is the concept of search space?
Set of all possible solutions to a problem
What are the components of a search tree?
Root node, child/parent nodes, depth of a node, branches, leaves
States - nodes (possible states of the problem e.g. initial)
Search Space - all nodes in the tree (the set of all states reachable from the initial state)
Operators - Branches (a set of actions that move one state to another.)
Neighbourhood - All possible states reachable from a given state.
Goal Test - Test if the search reached a state that solves the problem
Path Cost - how much it costs to take a particular path
What are the issues with search trees?
Combinatorial Explosion, Search efficiency (computationally expensive).
What is the BFS (breadth-first search) method?
Expand root node first
Expand all nodes at level 1 before expanding level 2
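Illustration (a minimal sketch, not from the original notes): breadth-first expansion using a first-in, first-out queue. The small tree and the function name are invented for this example.

```python
from collections import deque

def bfs_order(tree, root):
    """Return the order in which BFS expands the nodes of a tree."""
    order = []
    queue = deque([root])  # FIFO queue: oldest node is expanded first
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))  # children wait behind the current level
    return order

# Hypothetical tree: A is the root, B and C are level 1, D, E and F are level 2.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
print(bfs_order(tree, "A"))
```

The result is A, B, C, D, E, F: every level 1 node is expanded before any level 2 node.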
What is the DFS (depth-first search) method?
Expand root node first
Explore one branch before exploring another branch
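Illustration (a minimal sketch, not from the original notes): depth-first expansion using a last-in, first-out stack. The small tree and the function name are invented for this example.

```python
def dfs_order(tree, root):
    """Return the order in which DFS expands the nodes of a tree."""
    order = []
    stack = [root]  # LIFO stack: newest node is expanded first
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children reversed so the leftmost child is expanded next.
        stack.extend(reversed(tree.get(node, [])))
    return order

# Hypothetical tree: A is the root, B and C are its children.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
print(dfs_order(tree, "A"))
```

The result is A, B, D, E, C, F: the whole branch under B is explored before the branch under C.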
What is the UCS (uniform-cost search) method?
Expand root node first
Always remove the node with the smallest path cost from the queue first
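Illustration (a minimal sketch, not from the original notes): uniform-cost search using a priority queue ordered by path cost. The weighted graph and the function name are invented for this example.

```python
import heapq

def ucs(graph, start, goal):
    """Return (path cost, path) for the cheapest path, or None if there is none."""
    frontier = [(0, start, [start])]  # priority queue of (path cost, node, path)
    expanded = {}                     # cheapest cost at which each node was expanded
    while frontier:
        cost, node, path = heapq.heappop(frontier)  # smallest path cost first
        if node == goal:
            return cost, path
        if node in expanded and expanded[node] <= cost:
            continue  # already expanded this node via a path at least as cheap
        expanded[node] = cost
        for child, step_cost in graph.get(node, []):
            heapq.heappush(frontier, (cost + step_cost, child, path + [child]))
    return None

# Hypothetical graph: each entry lists (neighbour, step cost) pairs.
graph = {
    "S": [("A", 1), ("B", 5)],
    "A": [("B", 2), ("G", 9)],
    "B": [("G", 3)],
}
print(ucs(graph, "S", "G"))
```

The direct-looking route S, B, G costs 8, but UCS finds S, A, B, G with a path cost of 6 because it always expands the cheapest node in the queue.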
What is learning rate?
The learning rate dictates how quickly the network converges by scaling the size of each correction made to the weights.
What is a linearly separable function?
A function whose classes can be separated by a single straight line (or, with more inputs, a hyperplane) | Only linearly separable functions can be represented by a single-layer NN, e.g. a perceptron
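Illustration (a minimal sketch, not from the original notes): a perceptron trained with the standard perceptron learning rule. AND is linearly separable so the perceptron learns it, while XOR is not, so at least one input is always misclassified. The function names, the learning rate of 1.0 and the 20 epochs are assumptions made for this example.

```python
def predict(w, b, x1, x2):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0

def train_perceptron(samples, learning_rate=1.0, epochs=20):
    """Perceptron learning rule on two inputs; returns (weights, bias)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - predict(w, b, x1, x2)
            # Nudge the weights in the direction that reduces the error.
            w[0] += learning_rate * error * x1
            w[1] += learning_rate * error * x2
            b += learning_rate * error
    return w, b

def errors(samples):
    """Train on the samples and count how many are still misclassified."""
    w, b = train_perceptron(samples)
    return sum(predict(w, b, x1, x2) != target for (x1, x2), target in samples)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(errors(AND), errors(XOR))
```

AND ends with zero errors; XOR never does, however long the single-layer perceptron is trained.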
What happens if the learning rate is too low or too high?
Too low: Very small changes to weights → training will take a very long time, however it is guaranteed to converge to correct weights if they exist
Too high: Drastic adjustment to weights each time → may jump over a good combination of weights
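Illustration (a minimal sketch, not from the original notes): plain gradient descent on the toy function f(w) = w^2, whose best weight is w = 0. This stands in for real network training purely to show the effect of the step size. The function name and the three learning rates are assumptions made for this example.

```python
def descend(learning_rate, steps=20, start=1.0):
    """Run gradient descent on f(w) = w**2 and return the final weight."""
    w = start
    for _ in range(steps):
        gradient = 2 * w               # derivative of w**2
        w -= learning_rate * gradient  # step size is scaled by the learning rate
    return w

for rate in (0.01, 0.4, 1.1):
    print(rate, descend(rate))
```

With a rate of 0.01 the weight is still far from 0 after 20 steps (too slow), with 0.4 it is practically 0, and with 1.1 each step jumps over the minimum and lands further away than before.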
Neural network pros and cons
Pros - 1. Can learn more complicated decision boundaries. 2. Can handle a large number of features.
Cons - 1. Difficult to interpret. 2. Difficult to design neural network architectures that work well for a problem (lots of hyperparameters to tune). 3. Slow training time.