III: Intelligent techniques Flashcards
What area does Bayes' rule relate to?
Conditional Probability
What is Bayes' rule?
A formula for P(H|E), the probability that hypothesis H is true given evidence E:
P(H|E) = P(E|H)P(H) / P(E)
What is P(H|E)?
Probability of H being True given evidence E
What is P(E|H)?
The probability we will observe E given H is true
What is P(H)?
The a priori probability that hypothesis H is true
What is P(E)?
The probability of observing E
What does Bayes' rule say we ‘should do’?
We should update our knowledge with new information
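As a rough illustration of updating a belief with new evidence, the sketch below applies Bayes' rule twice in Python. The function name `bayes_update` and the numbers (a 1% prior, P(E|H) = 0.9, P(E|not H) = 0.05) are assumptions made up for the example; P(E) is computed with the law of total probability.

```python
def bayes_update(prior, likelihood, likelihood_if_false):
    """Return P(H|E) given P(H), P(E|H) and P(E|not H)."""
    # P(E) = P(E|H)P(H) + P(E|not H)P(not H)
    evidence = likelihood * prior + likelihood_if_false * (1 - prior)
    # Bayes' rule: P(H|E) = P(E|H)P(H) / P(E)
    return likelihood * prior / evidence

# First piece of evidence: belief in H rises from 0.01 to about 0.154
posterior = bayes_update(prior=0.01, likelihood=0.9, likelihood_if_false=0.05)
print(round(posterior, 3))  # 0.154

# Second, independent piece of evidence: yesterday's posterior is today's prior
second = bayes_update(prior=posterior, likelihood=0.9, likelihood_if_false=0.05)
print(round(second, 3))  # 0.766
```

Each call treats the previous posterior as the new prior, which is the "update our knowledge" step in practice.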
What are three approaches to NLP?
- Symbolic (rely on rules and logic to represent and process language)
- Statistical
- ANN based (artificial neural networks)
What does NLP stand for?
Natural Language Processing
What are some types of symbolic NLP?
- Rule based (instructions)
- Grammar based
- Lexical databases (store word definitions and their contextual relationships)
What are two types of Statistical approaches to NLP?
- Probabilistic models: analysing language patterns
- N-grams: analyse sequences of n consecutive words to predict the next
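The n-gram idea can be sketched for n = 2 (bigrams): count which word follows each word in some training text, then predict the most frequent follower. The function names `train_bigrams` and `predict_next` and the toy sentence are assumptions for illustration only.

```python
from collections import Counter, defaultdict

def train_bigrams(tokens):
    """Count, for every word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

tokens = "the cat sat on the mat and the cat slept".split()
model = train_bigrams(tokens)
print(predict_next(model, "the"))  # cat ("the cat" occurs twice, "the mat" once)
```

A real model would use far more text and usually some smoothing for unseen word pairs.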
What ‘networks’ does an ANN based NLP approach use?
RNN, recurrent neural networks
What is the NLP pipeline/process?
A series of steps that transform raw text into a format suitable for further analysis
What are 4 types of text pre-processing?
- normalisation
- tokenisation
- stopword removal
- stemming
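The four pre-processing steps listed above can be sketched in plain Python. This is a minimal illustration, not a production pipeline: the stopword list is a tiny hand-picked set and the `stem` function is a crude suffix stripper standing in for a real stemmer; all names here are assumptions.

```python
import string

# Tiny illustrative stopword list (real lists are much longer)
STOPWORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in"}

def normalise(text):
    """Lowercase the text and strip punctuation."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))

def tokenise(text):
    """Split text into word tokens on whitespace."""
    return text.split()

def remove_stopwords(tokens):
    """Drop very common words that carry little meaning."""
    return [t for t in tokens if t not in STOPWORDS]

def stem(token):
    """Crude stemming: strip a common suffix if enough of the word remains."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    tokens = remove_stopwords(tokenise(normalise(text)))
    return [stem(t) for t in tokens]

print(preprocess("The cats are chasing the mice, and jumped!"))
# ['cat', 'chas', 'mice', 'jump']
```

Note that "chasing" becomes "chas": stems do not have to be real words, only consistent.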
What is the first step of the NLP pipeline?
Text pre-processing
What is step 2 in the NLP pipeline?
Feature engineering
What is the 3rd step in the NLP pipeline?
Advanced processing
What is normalisation in the context of text processing?
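Converting text to a standard form: removing punctuation and capital letters (which can remove useful information) and rewriting variants, e.g. ‘10k’ to ‘10000’ or ‘wanna’ to ‘want to’
What is tokenisation?
Splitting text into basic units (tokens), such as words or sub-words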
What is POS tagging?
Part-of-speech tagging: labelling each word with its grammatical role, e.g. noun, verb, adjective
What is NER?
Named entity recognition: identifying named entities such as people, numbers, organisations, locations or objects
What do statistical NLP models do with the list of words?
Calculate probability of tags or the weights of features
What is the feature engineering BoW technique?
Bag of words: represent text as a numerical matrix of word counts (one row per document, one column per vocabulary word), ignoring word order
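A minimal bag-of-words sketch in Python, assuming whitespace tokenisation and lowercasing: each row of the matrix is a document and each column is a word from the shared vocabulary. The function name `bag_of_words` and the two sample documents are made up for illustration.

```python
from collections import Counter

def bag_of_words(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts word j in document i."""
    tokenised = [doc.lower().split() for doc in documents]
    # Shared vocabulary across all documents, sorted for a stable column order
    vocabulary = sorted({word for doc in tokenised for word in doc})
    matrix = []
    for doc in tokenised:
        counts = Counter(doc)
        matrix.append([counts[word] for word in vocabulary])
    return vocabulary, matrix

docs = ["the cat sat", "the dog sat on the mat"]
vocab, matrix = bag_of_words(docs)
print(vocab)      # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(matrix[0])  # [1, 0, 0, 0, 1, 1]
print(matrix[1])  # [0, 1, 1, 1, 1, 2]
```

Word order is lost: "the cat sat" and "sat the cat" produce the same row.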
What does LLM stand for?
Large Language Model
What are the two main types of neural networks?
RNN - recurrent
CNN - convolutional
Which type of Neural Network is good for text? And which for images?
text - RNN recurrent
images - CNN convolutional
What are three types of LLM training?
- Unsupervised pre-training
- Supervised fine-tuning
- Reinforcement learning
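What does temperature mean in reference to an LLM?
Controls the randomness of the output: the final-layer scores (logits) are divided by the temperature before being turned into probabilities, so a low temperature makes output more predictable and a high temperature makes it more random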
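One common way temperature is applied is to divide the final-layer scores (logits) by the temperature before a softmax. The sketch below assumes that formulation; the function name `softmax_with_temperature` and the logit values are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities, scaled by a temperature > 0."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability
    highest = max(scaled)
    exps = [math.exp(x - highest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution so less likely tokens are sampled more often.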