Class 2: Introduction to artificial intelligence and its relationship to cognition Flashcards
Wolfram, S. L. (2023, February 14). What is ChatGPT doing ... and why does it work? Stephen Wolfram Writings. https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work
The article discusses how ChatGPT’s primary goal is to produce a “reasonable” continuation of the text it has received. Explain what is meant by “reasonable.”
“What one might expect to write after seeing what people have written on billions of webpages, etc.”
True or False: Large language models produce a ranked list of words that might follow in a sentence, along with their probabilities. The AI will always pick the word with the highest probability.
FALSE
True or False: The human brain has about 100 billion neurons, each capable of producing an electrical pulse up to perhaps a thousand times a second. Neurons are connected with each other in a complicated net, with each neuron having tree-like branches allowing it to pass electrical signals to perhaps thousands of other neurons. The production of an electrical pulse in a given neuron at a given moment is independent of what pulses it has received from other neurons.
False: In a rough approximation, whether any given neuron produces an electrical pulse at a given moment depends on what pulses it’s received from other neurons, with different connections contributing with different “weights.”
What are the potential risks associated with ChatGPT use, according to this article?
The article highlights some of the potential risks that come with using ChatGPT. One of the main risks is that if the model is not trained or monitored correctly, it may generate misleading or harmful information. Additionally, the widespread use of language models like ChatGPT could lead to a loss of privacy since these models require a large amount of data to function effectively. Lastly, the article acknowledges the ethical concerns surrounding the use of language models, including the risk of bias, and emphasizes the importance of ensuring that these models are used for the greater good of society.
What does the ‘temperature’ parameter mean, and what value of ‘temperature’ works best?
It determines how often lower-ranked words will be used. A ‘temperature’ of 0.8 seems best for essay generation.
GPT-2 used sets of 12 attention blocks and attention heads in order to manage its decision-making process. How many does the improved GPT-3 use?
GPT-3 uses a collection of 96 attention blocks and attention heads.
When selecting the next word in a sequence, why doesn’t ChatGPT always just pick the word with the highest probability?
A lot of repetition would emerge and it would create an uninteresting essay.
Explain what softmax is and how it works.
Softmax converts a set of numbers output by the model’s neural network into a probability distribution over the possible next words or tokens. Larger numbers map to larger probabilities, and the probabilities sum to 1.
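A minimal Python sketch of softmax, assuming the common exponentiate-and-normalise definition. The function name and the example scores are illustrative and are not taken from the article:

```python
import math

def softmax(scores):
    """Turn a list of real-valued scores into probabilities that sum to 1."""
    # Subtracting the maximum keeps math.exp from overflowing; the shift
    # cancels out in the ratio, so the result is unchanged.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for three candidate next words.
probs = softmax([2.0, 1.0, 0.1])
print(probs)  # larger scores receive larger probabilities
```

Equal scores give equal probabilities, so `softmax([1.0, 1.0])` splits the probability evenly between the two candidates.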
What are transformers and how do they help ChatGPT generate responses?
Transformers are a type of neural network architecture that allows ChatGPT to process sequences of input, such as sentences or paragraphs.
According to Stephen Wolfram’s article, what is ChatGPT trying to do?
Fundamentally, ChatGPT is attempting to produce a ‘reasonable continuation’ of the text it has been given.
How are neural networks used to generate human-like responses to text prompts in the ChatGPT language model?
ChatGPT uses a neural network called a Transformer, which is trained on a large text dataset and learns to predict the probability of a given word or phrase appearing in a particular context.
How does ChatGPT generate “reasonable continuations” of text, and why is it sometimes necessary for the model to randomly select lower-ranked words to produce more interesting and creative output?
ChatGPT predicts the likelihood of a given word or sequence of words occurring in a sentence and generates text by repeatedly choosing a next word from the ranked candidates. Always taking the highest-probability word tends to give flat, repetitive text, so the model sometimes randomly selects lower-ranked words, governed by a “temperature” parameter.
What technology does ChatGPT use to generate replies?
One technique ChatGPT uses to generate responses is called “generative pre-training”.
The __________ architecture is a type of neural network designed for processing sequential data, such as text, and uses self-attention mechanisms to enable the model to attend to different parts of the input sequence.
Transformer
What are the potential applications of ChatGPT?
Chatbots and virtual assistants are potential applications.
Define the term “Neural Networks”.
Neural Networks — a type of machine learning algorithm inspired by the structure and function of the human brain. They are composed of interconnected nodes or “neurons” that can learn and adapt to patterns in data to make predictions or classifications.
The whole process of training a neural net can be characterised by seeing how the loss (error) progressively decreases. And what one typically sees is that the loss ______ for a while, but eventually flattens out at some constant value. If that value is sufficiently ______, then the training can be considered successful - otherwise it’s probably a sign one should try changing the network architecture. A. increases, large B. increases, small C. decreases, large D. decreases, small
D. decreases, small
Finish the sentence: If you’re trying to get a neural net to learn a function (e.g. to replicate a graph with a boxy line), you first have to choose/figure out the weights. This is done by...
Supplying lots of “input → output” examples to “learn from”, and then trying to find weights that will reproduce these examples.
True or False: The number of possible 20-word sequences is larger than the number of particles in the universe.
True.
What bias is at work in this situation? Lucy heard that Mike injured David, so she strongly dislikes Mike. After this, no matter how Mike and other friends try to persuade her and explain why David was hurt, she only believes what she first heard.
Confirmation Bias.
Explain the function of a perceptron.
Perceptrons help classify the data that is input to the neural network. A perceptron sorts its inputs into two classes, so it is known as a linear binary classifier. It functions as the simplest unit of a neural layer.
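A minimal sketch of a perceptron as a linear binary classifier, using the classic perceptron learning rule. The function names, the learning rate, and the toy AND data are assumptions chosen for illustration:

```python
def predict(weights, bias, inputs):
    """Return 1 if the weighted sum plus bias is positive, otherwise 0."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0

def train(examples, epochs=20, learning_rate=1.0):
    """Perceptron rule: nudge the weights whenever a prediction is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in examples:
            error = label - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# Toy data (an assumption for illustration): the logical AND of two inputs.
and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train(and_examples)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_examples]
print(predictions)
```

AND is linearly separable, which is why a single perceptron can learn it. A problem such as XOR would need more than one layer.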
True or False: The ‘temperature’ parameter in ChatGPT that determines how often lower-ranked words are used in essay generation is best at 0.9.
FALSE
Fill in the blanks: ChatGPT is based on a ______ network. It is essentially trying to produce a “r_________ c___________”
Neural, reasonable continuation.
About how many neurons are in the human brain?
100 billion
What are the future development directions of ChatGPT?
Future development directions for ChatGPT include better dialogue quality and efficiency, better emotion recognition, and multilingual support.
Fill the missing word. In neural net training, the numerical values assigned to the connections between neurons in a neural network are known as (missing word).
weights
True or False: ChatGPT produces a ranked list of probabilities for what the next word is most likely to be and always uses the highest-ranking word.
False. There is randomness in essay generation which makes it more “creative”.
True or False: ChatGPT utilises whole words to compute.
False (generally). It uses “tokens”, which are linguistic units that could be whole words or segments like “pre”, “ing”, or “ized”.
True or False: Neural nets are only relevant to human brains.
False
How does ChatGPT take user feedback into account?
When users rate ChatGPT’s output, a new neural network model is built to predict those ratings. The new model is then run like a loss function on the original network, continually adjusting the network toward user preferences.
What is the tradeoff between capability and trainability?
The more capable a system is, the less trainable it becomes. Conversely, the more trainable a system is, the less capable it becomes.
What is a neural net?
Neural nets are a type of machine learning algorithm that are simple idealizations of how human brains seem to work. Like a human brain, neural nets learn more through practice and repetition. The nodes and neurons that make up these neural nets perform mathematical operations on the input data to produce an output.
What is a temperature parameter for ChatGPT responses?
A temperature parameter determines how often a lower-ranked word (relative to the highest-ranked word calculated to be a probable match for responses) will be selected. For example, when ChatGPT responds to essay prompts, a ‘temperature’ of 0.8 yields the best results in essay generation.
Is the following statement true or false: it’s not clear whether there are ways to “summarize what it’s doing”.
TRUE
Explain the concept of temperature as it pertains to language generation in ChatGPT, including the ideal temperature that ChatGPT uses to predict words.
It is a parameter that is used to adjust the randomness and creative output of ChatGPT. The ideal temperature is approximately 0.8, otherwise the text becomes repetitive. A temperature of 0.8 allows for a degree of randomness while still ensuring the output is relevant and coherent.
How does ChatGPT work so that it can keep talking to us continuously?
It operates in three basic stages. First, it takes the sequence of tokens that corresponds to the text so far, and finds an embedding (i.e. an array of numbers) that represents these. Then it operates on this embedding in a “standard neural net way”, with values “rippling through” successive layers in a network, to produce a new embedding (i.e. a new array of numbers). It then takes the last part of this array and generates from it an array of about 50,000 values that turn into probabilities for different possible next tokens. (And, yes, it so happens that there are about the same number of tokens used as there are common words in English, though only about 3000 of the tokens are whole words, and the rest are fragments.) According to ChatGPT’s own answer, however, it is an AI language model created by OpenAI. It communicates with us through natural language processing (NLP) technology. It analyzes the text we enter and uses a combination of algorithms, statistical models, and machine-learning techniques to understand the meaning of our input and generate a response that best answers our question or fulfills our request.
TRUE/FALSE: Showing a neural net repetitive examples when training it is always redundant because it is in the same state for each training round.
FALSE. Generally, neural nets need to see a lot of examples, and at least for some tasks the examples can be incredibly repetitive. It is a standard strategy to show a neural net all the examples one has, over and over again. In each of these “training rounds” the neural net will be in at least a slightly different state, and somehow “reminding it” of a particular example is useful in getting it to “remember that example.” However, it is normally also necessary to show the neural net variations of one example.
True or false: The architecture behind ChatGPT allows it to constantly learn from previous interactions in order to tailor its responses to individual users.
TRUE
True or False: ChatGPT is expected to produce a reasonable continuation of existing texts on a related topic.
TRUE
True or False: ChatGPT will always pick the highest-ranked word when deciding what word to pick next.
False. Always picking the highest-ranked word can make a piece of text seem flat, so lower-ranked words are often used to make the text more interesting.
True or False: It is possible to completely eliminate cognitive biases.
False. Although there are steps and ways to reduce dependency on cognitive biases, it is not possible to completely eliminate them.
Inside ChatGPT is a giant _____ consisting of 175 billion weights.
Neural net
What does ‘temperature’ mean in regards to ChatGPT?
A parameter that determines how often lower-ranked words will be used.
True or False: Often just showing a neural net the same example over and over again isn’t enough. It’s also necessary to show the neural net variations of the example. These variations may only need to be slight modifications.
TRUE
If you give ChatGPT the same prompt several times, would you get the same answer each time or different answers?
Different answers
What does ChatGPT stand for?
Chat Generative Pre-Trained Transformer
True/False: In a neural net, bigger networks generally do better at approximating the function we are aiming for.
True
What is the overall goal of ChatGPT?
To continue text in a “reasonable” way, based on what it’s seen from the training it’s had (which consists of looking at billions of pages of text from the web, etc.).
What is the source of the training data for ChatGPT? a) A large collection of images b) Randomized phrases generated by a computer c) Human-written text from books, the web, and other sources d) Sounds and speech recorded from people
C: Human-written text from books, the web, and other sources
How do we make a neural net do a recognition task?
Take an input corresponding to a position (x, y) and recognise it as whichever of the three given points it is closest to.
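The recognition task in this card can be written down directly as a target function. The sketch below is that target function only, not a neural net: a net would be trained on examples to approximate it. The three point coordinates and labels are hypothetical:

```python
import math

# Three hypothetical target points; labels and coordinates are illustrative.
POINTS = {0: (0.0, 1.0), 1: (-1.0, -1.0), 2: (1.0, -1.0)}

def nearest_point(x, y):
    """Return the label of the target point closest to (x, y)."""
    return min(POINTS, key=lambda label: math.dist((x, y), POINTS[label]))

print(nearest_point(0.1, 0.9))    # close to the point labelled 0
print(nearest_point(-0.8, -1.2))  # close to the point labelled 1
```

Sampling many (x, y) positions and labelling them with this function is one way to produce the “input → output” examples a net could learn from.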
What is “X”: To train ChatGPT, neural net training is seeking “X” to reproduce the given examples?
“X” = Weights, the neural network relies on the weights to interpolate (generalise) between the given examples
What are some limitations of ChatGPT?
Because GPT is not perfect, it can generate biased or inappropriate responses, and it does not have a full understanding of language. It is also limited by the resources it has been given.
What is the fundamental goal of ChatGPT?
The fundamental goal of ChatGPT is to produce a “reasonable continuation” of a given text, based on what one might expect someone to write after seeing what people have written on billions of web pages, etc.
Define the term ‘loss function’ in the training process of neural nets.
The discrepancy between the values the network currently produces and the values of the desired function. This value is calculated in order to adjust the weights so that the network reproduces the function that we want.
What is the purpose of a temperature parameter?
ChatGPT incorporates a temperature parameter to determine the frequency of utilising ‘low-ranked words’.
True or False: ChatGPT has been trained on vast amounts of text data to improve its accuracy and ability to understand the context.
TRUE
What is so special about machine learning through neural nets?
Their ability to learn to do things
The basic operation of the (___) is also very simple, consisting essentially of passing input derived from the text it’s generated so far “once through its elements” (without any loops, etc.) for every new word (or part of a word) that it generates
The basic operation of the (neural net) is also very simple, consisting essentially of passing input derived from the text it’s generated so far “once through its elements” (without any loops, etc.) for every new word (or part of a word) that it generates.
True or False: During training, ChatGPT will progressively adjust the weights in the network in an attempt to accurately reproduce the desired function.
True. The training method uses a loss function (how far the network’s current output is from the desired end goal). This loss will decrease progressively until the network reproduces the desired function (within an approximation margin).
The neurons are connected in a complicated net, with each neuron having _________ branches allowing it to pass electrical signals to perhaps thousands of other neurons
tree-like
A model commonly used to train Artificial Intelligence which reflects the processes of the human brain is called a what?
Neural net.
What is ChatGPT fundamentally trying to do?
ChatGPT is trying to produce a continuation of the text it has gotten so far with the available input it has access to.
What is the question that ChatGPT constantly asks itself as it writes?
Given the text so far, what should the next word be?
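A toy illustration of the “what should the next word be?” loop. This is a simple bigram counter, not the neural net ChatGPT uses, and it always takes the most frequent follower (no randomness). The corpus and function names are made up for the example:

```python
from collections import Counter, defaultdict

def build_bigram_counts(text):
    """Count, for each word, which words follow it and how often."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def continue_text(counts, start, length):
    """Repeatedly append the most frequent follower of the last word."""
    words = [start]
    for _ in range(length):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word was never followed by anything
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

corpus = "the cat sat on the mat and the cat slept"
counts = build_bigram_counts(corpus)
print(continue_text(counts, "the", 1))
```

In this corpus “the” is followed by “cat” twice and “mat” once, so the continuation of “the” is “cat”.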
What is the difference between syntactic and semantic grammar, and what is necessary for ChatGPT to handle the latter?
Syntactic grammar refers to the structure of language, whereas semantic grammar refers to the meaningfulness of language. For ChatGPT to properly grasp semantic grammar, it would need a “model of the world” to refer to, which could be achieved through coding.
What are the potential ethical issues of ChatGPT mentioned in the text? (At least two)
Bias; discrimination; misinformation; manipulation; privacy; security.
What does the optimal temperature for essay generation refer to?
The optimal temperature for essay generation refers to the temperature setting used in language models like GPT-3 to control the degree of randomness and creativity in the generated text. The temperature determines the degree to which the model is willing to take risks and produce unexpected outputs, versus sticking to more predictable and safe choices.
Define what “reasonable continuation” means in the statement that what ChatGPT is always fundamentally trying to do is produce a “reasonable continuation” of whatever text it’s got so far.
Reasonable continuation in this context alludes to what might be written after reviewing billions of readings. ChatGPT produces a ranked list of words that might follow the text so far, together with their “probabilities”, based on those readings. A “temperature” parameter then determines how often lower-ranked words will be used.
Describe the term loss function.
The loss function calculates the sum of the squared differences between a machine learning model’s anticipated output and the actual output (the goal) for a given input. More importantly, it’s an essential concept in machine learning because it’s used in ChatGPT to direct training and gauge how well the model fits the training data.
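A minimal sketch of the sum-of-squared-differences loss described in this card. The function name and the example numbers are illustrative:

```python
def sum_squared_error(predicted, target):
    """Sum of squared differences between predicted and target values."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target))

target = [1.0, 2.0, 3.0]
good_fit = [1.0, 2.0, 3.5]   # only the last value is off, by 0.5
poor_fit = [3.0, 0.0, 5.0]   # every value is off by 2

print(sum_squared_error(good_fit, target))  # small loss
print(sum_squared_error(poor_fit, target))  # much larger loss
```

A lower value means the predictions are closer to the targets, which is what training tries to achieve by adjusting the weights.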
What are neural networks and LLMs (like ChatGPT) simple idealisations of?
The human brain
Define neural nets.
Simple idealisations of how the brain works. Specifically discussed is the process of how humans form a thought upon recognising something.
Fill in the blank. ______ _______ is a machine learning algorithm that mimics the idealised function of the human brain and was used to train ChatGPT.
Neural Network
Define computational irreducibility
Computational irreducibility is a concept that refers to the idea that some computational problems cannot be simplified or reduced in a meaningful way. It says that some problems have no quick or predictable route to a solution, which places a limit on computational capabilities (such as those of neural networks). Rather, these problems must be studied through human intuition, experimentation, and observation.
What makes neural nets so useful?
Not only can they in principle do all sorts of tasks, but they can also be incrementally trained from examples to do those tasks.
Why does ChatGPT use ‘tokens’ instead of words?
The use of tokens instead of words makes it easier for ChatGPT to deal with rare, compound, or non-English words. Tokens can be words, and can also be parts of words, such as “ing”, “pre”, “anti” etc.
Define embedding.
Embedding involves assigning numbers to text and words to represent their meaning, and grouping similar meanings to nearby numbers.
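A toy sketch of the “similar meanings get nearby numbers” idea. The three-dimensional vectors below are invented for illustration (real embeddings are learned and have hundreds or thousands of dimensions), and cosine similarity is one common way to measure closeness:

```python
import math

# Hypothetical, hand-made embeddings; real ones are learned from data.
EMBEDDINGS = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

cat_dog = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["dog"])
cat_car = cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["car"])
print(cat_dog > cat_car)  # "cat" sits closer to "dog" than to "car"
```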
How many connections/weights are there in ChatGPT’s neural net?
Approximately 175 billion
What does ChatGPT stand for? Give a sentence on what it does
ChatGPT stands for Chat Generative Pre-Trained Transformer. It is a language model that draws on the large dataset of text it was trained on to generate responses to the prompts and questions an individual may ask.
The concept of ‘embeddings’ refers to the way we try to represent the essence of something by an array of numbers (true/false).
True!
How does ChatGPT generate responses for you?
After analysing the prompt that you give it, it uses the statistical probabilities it has learned to generate a response that is likely to be relevant and informative.
What is ChatGPT?
ChatGPT is a large language model (LLM) that is trained on large amounts of human-created text. It then utilises this information to estimate probabilities and generate meaningful text when given a prompt.
What are some potential limitations of ChatGPT?
bias from the datasets used for training, risk of generating inappropriate or offensive responses, lack of understanding of social norms or cultural context, relies on large amounts of data (not suitable for situations with little data or privacy restrictions)
Define “Embedding”.
The assignment of numbers to a type of stimulus (in ChatGPT’s case, common English words) that helps to group stimuli with a similar “essence” together.
What is a temperature parameter?
Temperature is a parameter used to control the level of randomness/unpredictability/creativity in the generated text. Higher temperatures result in more diverse and unpredictable output, and lower temperatures result in more conservative and predictable output.
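A minimal sketch of temperature sampling. It assumes the common formulation in which raw scores are divided by the temperature before being turned into probabilities; the article does not spell out a formula, and the words, scores, and function name are made up:

```python
import math
import random

def sample_next_word(scores, temperature, rng):
    """Pick a word from a dict mapping word -> raw score."""
    words = list(scores)
    if temperature == 0:
        # Zero temperature: always take the highest-ranked word.
        return max(words, key=lambda w: scores[w])
    # Dividing by the temperature flattens (T > 1) or sharpens (T < 1)
    # the distribution before exponentiating.
    scaled = [scores[w] / temperature for w in words]
    shift = max(scaled)
    weights = [math.exp(s - shift) for s in scaled]
    return rng.choices(words, weights=weights, k=1)[0]

scores = {"learn": 2.0, "predict": 1.5, "dance": 0.2}  # hypothetical scores
rng = random.Random(0)  # seeded so the run is repeatable
greedy = [sample_next_word(scores, 0, rng) for _ in range(5)]
varied = [sample_next_word(scores, 0.8, rng) for _ in range(200)]
print(set(greedy), set(varied))
```

At temperature 0 every pick is the top-ranked word, while at 0.8 lower-ranked words appear some of the time, with the top-ranked word still the most frequent.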
True or False “Machine Learning codes the defining characteristics of an object and uses these said defining characteristics to determine what the object is”
False. Machine learning uses a large number of examples to determine whether a new object fits the pattern of the prior examples.
Why are loss functions important for large language models?
Loss functions are important for large language models like GPT as they provide a measure of how well the model is able to predict the next word or sequence of words in a given text, allowing the model to generate more accurate and contextually appropriate text by minimizing the loss function during training.
What is a neural network?
A neural network is a type of machine learning algorithm that is modelled on the human brain. It is made of layers of interconnected nodes, or neurons, which perform mathematical computations on input data and pass the results to the next layer until the final output is generated. Neural networks can model a wide variety of functions with high execution and training performance.
TRUE or FALSE: Artificial neurons in ChatGPT take numerical inputs, multiply them by some weights and feed them forwards to end up with the next ‘token’ in a sequence.
TRUE
How many percent is AI’s ability to learn?
4.5
What is a neural net?
Neural networks are simplified models inspired by the workings of the human brain. Our brains are complex networks of nerve cells that are connected to assist in processing information. When we look at an image, photoreceptor cells at the back of our eyes convert the image into electrical signals that travel through layers of neurons to help us recognize the image. Neural networks use mathematical functions to simulate this process.
What is backpropagation?
Backpropagation adjusts the weights and biases of the artificial neural network (ANN), correcting its initially random guesses to make them less wrong. An ANN learns by making these adaptive changes, so the probability of making the right calculation improves with each round of backpropagation. It is one of the most frequently used learning rules in applications of artificial neural networks.
What kinds of prompts tend to make ChatGPT “wander off” in non-human-like ways?
Prompts that require longer texts, such as essays and stories.
How does “temperature” help ChatGPT generate text that sounds like a human wrote it?
When ChatGPT generates text, it selects its next word from a ranked list of words based on their probability of being next. It asks repeatedly what the next word should be and adds it. The temperature parameter (with a value of 0.8 considered optimal) determines how often lower-ranked words will be used. The randomness in ChatGPT’s selection of lower-ranked words makes for more interesting writing.
What is a large language model (LLM) and how does it estimate the probabilities with which sequences of words should occur in natural language text?
A large language model (LLM) is a neural network trained on vast amounts of text to predict the likelihood of a given word or sequence of words occurring in a sentence. The LLM uses this training to generate more accurate and coherent language output, and complete tasks such as language translation, summarization, and question answering.
Describe a neural network.
A neural net is a computational model inspired by the structure and function of the human brain. Neural nets consist of interconnected nodes or “neurons” that are arranged into a layered structure. A net can then be trained to learn patterns and relationships in data using large data sets.
True or False: When discussing how ChatGPT is trained to retrieve correct/true answers, the “mountain lake” metaphor describes how training neural nets to predict true values is limited to its weight landscape. In neural network training, a loss function is created to measure the difference between predicted and true values. The goal of training stages is then to minimise the loss function. However, the mountain lake metaphor illustrates that this optimisation process is not guaranteed to find the global minimum of the function, but only a local minimum - similar to how water flowing down a mountain will eventually end up in a lake at the bottom. There may be other lower points in the loss function (an ultimate global minimum) that are not reached in training stages.
TRUE
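The “mountain lake” idea can be seen on a one-dimensional toy loss with two valleys. The loss function below is invented for illustration (a real network has a loss over billions of weights), and plain gradient descent is used as the downhill rule:

```python
def loss(x):
    """A made-up loss with a shallow valley near x = 1.13 and a deeper one near x = -1.30."""
    return x**4 - 3 * x**2 + x

def gradient(x):
    """Derivative of the loss above."""
    return 4 * x**3 - 6 * x + 1

def descend(start, learning_rate=0.01, steps=1000):
    """Follow the slope downhill from a starting point."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

right_lake = descend(2.0)    # settles in the shallower, local minimum
left_lake = descend(-2.0)    # settles in the deeper, global minimum
print(right_lake, loss(right_lake))
print(left_lake, loss(left_lake))
```

Starting on the right-hand slope, the descent stops in the shallower valley even though a lower point exists on the other side, which is the situation the card describes.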
What are the thumbs up and thumbs down buttons on the right for?
They provide feedback for ChatGPT’s responses and help inform its future responses
Why does ChatGPT take a while to generate a long piece of text?
It is because when ChatGPT generates a new token, it has to do a calculation involving every single one of the 175 billion weights.
According to the Wolfram (2023) article, what is one of the main challenges associated with developing natural language processing systems like ChatGPT?
Training the machine learning models used by these systems on diverse and representative datasets in order to generate high-quality, human-like responses.
How does one train a neural network from text? a) By presenting a batch of examples and then adjusting the weights in the network to minimize the error b) By presenting the network with a list of common words and their definitions c) By inputting large amounts of data and allowing the network to learn on its own d) By manually adjusting the weights in the network to produce the desired output
A: By presenting a batch of examples and then adjusting the weights in the network to minimize the error
Briefly describe the process of neural net training
A large number of examples of input and output are to be given to the system. The system will then ‘learn’ from these examples. What we can do is find the weight that is suitable for the system to be able to reproduce these examples, by relying on its ability to generalise between the examples reasonably.
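A minimal sketch of this training process for the simplest possible “network”: a single weight in the model output = weight × input. The examples, learning rate, and function names are assumptions for illustration, and plain gradient descent on a squared-error loss stands in for the full training procedure:

```python
# Hypothetical input -> output examples, generated by the rule y = 3 * x.
examples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]

def loss(weight):
    """Sum of squared differences between predictions and targets."""
    return sum((weight * x - y) ** 2 for x, y in examples)

def train(weight=0.0, learning_rate=0.01, steps=100):
    """Repeatedly nudge the weight in the direction that lowers the loss."""
    history = [loss(weight)]
    for _ in range(steps):
        gradient = sum(2 * (weight * x - y) * x for x, y in examples)
        weight -= learning_rate * gradient
        history.append(loss(weight))
    return weight, history

learned_weight, history = train()
print(learned_weight)            # close to 3.0
print(history[0], history[-1])   # the loss falls as training proceeds
```

Once trained, the weight also gives sensible answers for inputs that were not among the examples, which is the generalisation the card refers to.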
What is the key reason for the neural net in ChatGPT to be so useful?
That it somehow captures a “human-like” way of doing things/thinking.
How many weights does ChatGPT have?
175 billion
True or False: Neural nets can be trained to do different tasks effectively.
TRUE
What is computational irreducibility?
Computational irreducibility is a phenomenon in which there are some computations that can’t be reduced to simpler steps, and must be computed step-by-step in order to determine the outcome.
In essence, what is a neural network trying to minimize?
Error
Finish the sentence: a Neural Net is a simple
Idealisation of how brains seem to work.
What is ChatGPT?
ChatGPT is a natural language processing software that uses machine learning to generate responses to text inputs that mimic human-like communication.
How does ChatGPT ensure that its essays aren’t boring?
It uses a temperature of 0.8, which means that lower-ranked words are sometimes chosen at random instead of the highest-ranked word.
When ChatGPT is writing an essay, what is the purpose of having a “Randomness” function (with a temperature of 0.8) rather than letting it always choose the most probable word that would follow?
To create an essay that sounds more creative and less flat than it otherwise would.
What is ChatGPT’s basic task?
Continue a piece of text that has been given
True or False: From the resulting ranked list of words, the highest-ranked word will always be picked to add to the essay (or whatever) that it’s writing.
False.
ChatGPT when determining its probabilities and consequently which words it picks utilises words of ____ probability to achieve writing of ____ style.
1. Moderate 2. Creative and more interesting
The best way, in simple terms, to describe ChatGPT is that it produces a reasonable continuation of the text it has been given so far. Having scanned the vast body of literature on the internet, it bases its answers on probability and matches in meaning to create readable text that makes sense to humans. And this is only the beginning of what it can do. True or False?
TRUE
Fill in the blank: The ______ ___ passes input from the text generated so far “once through its elements” (without loops) for each new word (or part of a word) that it generates.
Neural net
What are the three important aspects of working with Neural Network?
1. The architecture of a neural network needs to be considered for a particular task. 2. It is critical to obtain the necessary data to train the neural network. 3. It is important to incorporate existing, trained neural networks or use them to generate training examples for a new neural network.
ChatGPT’s model produces a “reasonable continuation” of text based on what? A) The model uses random words to continue the text B) The model uses a predefined set of words to continue the text C) The model uses a probability-based ranking system to determine the most likely word to follow the given text based on billions of webpages and digitized books D) The model uses a word association tool to predict the next word in the text
C) The model uses a probability-based ranking system to determine the most likely word to follow the given text based on billions of webpages and digitized books.
What is unsupervised learning in neural net training?
It’s where the neural net must find patterns in the data on its own.
What does GPT stand for?
the GPT in ChatGPT stands for Generative Pre-trained Transformer, which is a type of language model based on deep learning
How is ChatGPT different from other language models?
Its ability to generate longer and more complex responses, its use of contextual information to generate more relevant responses, and its ability to generate responses that are more diverse and creative.