FINAL Flashcards

1
Q

2 higher order theories of consciousness (HOT)

A
  • higher-order perception
  • higher-order thought
2
Q

higher-order perception

A

idea that consciousness arises when you perceive yourself perceiving other things

3
Q

higher-order thought

A

idea that humans are aware of thought processes, and this brings about consciousness

4
Q

3 criticisms for higher order theories

A
  • we are aware of stimuli, not thoughts about stimuli (meta-thought is not necessary for consciousness)
  • doesn’t take conscious action into account
  • no neuro basis
5
Q

global workspace theory (GWT)

A
  • input processors compete for attention (similarly to pandemonium -> loudest will enter consciousness/workspace)
  • (conscious) global workspace broadcasts input to other brain areas for voluntary action
6
Q

what 3 things does GWT account for

A
  • intentional action
  • information retention
  • planning and creative thought
7
Q

global neural workspace theory (GNW)

A

used fMRI to map brain areas with different functions; these areas propagate signals across to motor systems to organize voluntary action

8
Q

what structure is thought to account for consciousness and why according to GNW

A

pyramidal neurons because of widespread structure that can connect many areas including PFC and temporal lobes for voluntary action

9
Q

monism

A

body and mind made of one substance

10
Q

dualism

A

body and mind made of two distinct substances (possibly connected through the pineal gland)

11
Q

functionalism

A

mental states are constituted only by their functional roles (i.e. accounts for multiple realizability); brain = hardware, mind = software

12
Q

reductionism

A

break everything down into parts, no longer discuss the whole

13
Q

advantage of reductionism

A

easy to test each individual part

14
Q

disadvantage of reductionism

A

some things can’t be broken down

15
Q

emergence

A

whole > sum of its parts, e.g. janitor’s dream

16
Q

rat memory experiment setup

A

play a tone, then shock the rat. if they remember, they should be afraid every time they hear the tone.

17
Q

how to prevent a rat from learning new things

A

administer drug while / near the time of the initial shock

18
Q

how to prevent a rat from remembering things

A

administer drug during recall, it will alter the original memory

19
Q

can the rat memory experiment be done in humans and is it effective

A

yes, drug was used with PTSD patients and was somewhat effective, it makes the painful memory less painful

20
Q

best way to preserve a memory in its original form

A

don’t recall it

21
Q

why can we never be 100% certain about material reality (& order of processing pipeline?)

A

real world -> senses -> processing -> awareness; reality is always filtered through senses and processing therefore never 100% objective

22
Q

cartesian theatre

A

example of higher-order perception (perceive yourself perceive something)

23
Q

problem with cartesian theatre

A

if homunculus is another person inside your head, what about their consciousness?

24
Q

Dennett multiple drafts model

A

different senses have different processing streams where they process things intermittently, and objects can reach ‘fame’

25
Q

‘fame’ (multiple drafts model) & what is it related to

A

related to GWT; something that reaches awareness when not being processed, and is broadcast to other brain modules

26
Q

‘draft’ (multiple drafts model)

A

there can be multiple versions of one stimulus in one stream due to lots of processing

27
Q

can stimuli continue to be processed after they reach fame

A

yes

28
Q

David Eagleman task summary

A
  • monitor motor cortex
  • high-precision watch
  • any time you feel urge to press button, note the time
  • up to 2 seconds before reporting feeling urge, brain has already made the decision for you (without your awareness)

therefore do we have free will?

29
Q

eagleman democracy of mind

A

rivalling streams competing for attention (similar to pandemonium)

30
Q

3 features of eagleman democracy of mind

A

redundancy, competition, emotional + rational processing

31
Q

redundancy of democracy of mind

A

each stream is processed by multiple competing systems

32
Q

competition of democracy of mind

A

e.g. pandemonium, which stream is loudest

33
Q

emotional + rational processing of democracy of mind

A

both kneejerk processing and more rational

34
Q

what does IIT slicing allow for

A

seeing which systems are affected lets you determine which systems are dependent on each other vs independent

35
Q

what is phi value when slicing

A

number of dependent subsystems

36
Q

what does a high phi value mean (theoretically)

A

more consciousness

37
Q

negatives of IIT (5)

A
  • panpsychism
  • hard to calculate
  • possible to ‘hack’
  • just an opinion
  • no definition for consciousness but now we have a number?
38
Q

structural brain imaging (2)
plus 2 examples

A
  • shows anatomy
  • used for tumors, strokes, lesions. etc
  • CT and MRI
39
Q

functional brain imaging (2) plus 3 examples

A
  • shows blood flow / electricity
  • used during experiments and diagnosis
  • fMRI, PET, EEG
40
Q

Computed Axial Tomography (CAT / CT) scan

A

slice by slice through brain (white = bone)

41
Q

MRI

A

machine shoots radio waves at tissue; only H (hydrogen) atoms respond and become excited; the stored energy is then measured as the H atoms relax (white = bone)

42
Q

CT advantages over MRI (3)

A
  • better spatial res
  • cheaper
  • faster
43
Q

MRI advantage over CT

A
  • better contrast
44
Q

fMRI

A

(multiple scans)
oxygenated blood -> processing -> deoxygenated blood
areas where blood is being deoxygenated light up (blood oxygen level-dependent (BOLD) signal)

45
Q

Positron emission tomography (PET) scan

A

(rainbow scans) inject radioactive glucose; brain activity uses the sugar -> active areas glow more

46
Q

fMRI advantages over PET (3)

A

better spatial and temporal res, no radioactivity

47
Q

PET advantages over fMRI (3)

A
  • faster
  • quieter
  • cheaper
48
Q

EEG advantages (5)

A
  • fast
  • cheap
  • safe
  • direct relation to brain activity
  • good temporal res
49
Q

what type of scan do we not have yet

A

full CNS scan

50
Q

how do diff types of memory differ (3)

A

capacity, duration, content

51
Q

2 types of sensory memory

A

iconic (visual = 200-250ms) and echoic (auditory = several secs)

52
Q

why no olfactory, gustatory and tactile sensory memory

A

difficult experimental protocol

53
Q

normal capacity of sensory memory

A

4-5 items e.g. letters

54
Q

what did Sperling find sensory memory capacity could be extended to with training

A

9-12 items e.g. letters

55
Q

what happens to sensory memory with a 1s distractor (masking)

A

removes majority of it (back down to like 3 items)

56
Q

what happens to sensory memory with a cue to indicate direction (too short for conscious awareness)

A

restores performance

57
Q

how do blinking vs blank screen affect sensory memory

A

blinking = disrupts performance
blank screen does not

58
Q

working memory duration without rehearsing

A

18s

59
Q

working memory capacity (& how to improve it)

A

7 ± 2
chunking

60
Q

3 types of coding for working memory

A
  • acoustic (e.g. get confused because stimuli sound the same)
  • semantic (e.g. get confused because categories have similar meanings)
  • visual (e.g. rotation tasks are processed degree by degree)
61
Q

is working memory scanning done in serial or parallel, and how do we know (in lab) (2)
& caveat to these results

A
  • RT linearly correlated to set size (7+-2)
  • we don’t terminate as soon as we find the number = exhaustive (not self-terminating)
  • perhaps only parallel processing because we want to do well in lab
62
Q

2 types of LTM

A

explicit/declarative and implicit

63
Q

2 types of explicit/declarative memory

A

semantic and episodic

64
Q

implicit memory =

A

procedural

65
Q

why is assessing LTM capacity hard

A

must max it out, but how?
- also hard because of memory reorganization

66
Q

LTM duration (3 phases)

A
  1. rapid decline over first 3y without reinforcement
  2. stable at 75% for ~25-30 years
  3. another decline, possibly due to general cognitive decline
67
Q

how does better learning/memorization affect LTM duration

A

higher starting point, but curve stays the same

68
Q

coding explicit/declarative memory

A

various locations across cortex (distributed representation)

69
Q

coding implicit memory

A

productions (e.g. if-then rules) in the cerebellum

70
Q

what did Lashley find about LTM location (engram)

A

takes longer to retrain a rat depending on the size of the brain chunk that was removed, but rat does not forget

71
Q

equipotentiality (Lashley)

A

brain areas can take over for each other after damage

72
Q

Hebb rule

A

neurons that fire together wire together

73
Q

long-term potentiation (LTP)

A

high-frequency firing = more receptors develop on the receiving (postsynaptic) neuron

74
Q

HM (procedure and result)

A

removal of hippocampus = anterograde amnesia (no new memories), but all memories up to that point are intact

75
Q

hippocampus function

A

memory consolidation

76
Q

why is cognitive science reverse-engineering

A

we are trying to figure out how an already existing thing works

77
Q

machine definition

A

any cause-effect system

78
Q

4 features of computation (and what is it ultimately)

A
  • rule-based
  • shape-based
  • implementation-independent (i.e. multiply realizable)
  • semantically interpretable

aka symbol manipulation

79
Q

weak Church/Turing hypothesis

A

turing machine can compute anything that can be computed by a general-purpose digital computer

80
Q

strong Church/Turing hypothesis

A

turing machine can compute anything that can be computed

81
Q

what is mathematics fundamentally

A

syntax/semantics (manipulating shapes based on rules)

82
Q

computationalism (strong AI)

A

‘cognition is computation’ (not really true)

83
Q

what premise must be adopted because of computationalism’s definition

A

that computation is implementation-independent

84
Q

Searle’s Chinese Room Argument (cognition is not computation)

A

even if you memorize and execute the whole rule book, you do not understand chinese

85
Q

Searle’s periscope (the implementation-independence of computation)

A

no implementation of the rule book leads to understanding mandarin

86
Q

turing test hierarchy

A

t1 = toy (regurgitating patterns)
t2 = verbally indistinguishable e.g. chatgpt
t3 = verbal + robotic
t4 = verbal + robotic + neuro

87
Q

which level of turing hierarchy is disputed by Searle’s Chinese room argument

A

t2

88
Q

what level does Harnad think is correct

A

t3

89
Q

symbol grounding problem

A

you can look up mandarin symbols indefinitely, but if the symbols never get a referential meaning, the lookup cycle never ends in understanding

90
Q

minimal grounding set

A

the 1000 words that can be used to define every other word in the dictionary

91
Q

how does minimal grounding set get its meanings (2)

A
  • direct sensorimotor grounding (DSG) (e.g. trial and error in real world)
  • indirect verbal grounding (IVG) (e.g. describing the sensorimotor stuff)
92
Q

what is a powerful way of grounding new words, but what is the issue with it

A

language; only works if you understand the words being used

93
Q

why is computationalism the cogsci dominant theory

A

because it allows for everything to be expressed as an algorithm

94
Q

how are real neurons similar to fake neurons

A

lots of inputs converge on a neuron/’black box’ to give 1 output

95
Q

what is first step after input in artificial neuron

A

weigh each input

96
Q

what comes after weighing each input in artificial neuron

A

sum of all inputs x weights

97
Q

what comes after the sum of inputs x weights in artificial neuron

A

bias = add 1 number to every value to simulate base excitation level

98
Q

what comes after bias in artificial neuron

A

activation function e.g. threshold

99
Q

what is the implementation of an artificial neuron

A

dot product aka matrix multiplication (multiple inputs, outputs, weights, and biases)
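
The pipeline from the last few cards (inputs -> weights -> weighted sum -> bias -> activation) can be sketched in a few lines of Python; all numbers below are made-up illustrative values, not from any real network:

```python
def step(x, threshold=0.0):
    """Threshold activation: fire (1) if input exceeds threshold, else 0."""
    return 1 if x > threshold else 0

def neuron(inputs, weights, bias):
    """Dot product of inputs and weights, plus bias, through the activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return step(total)

# Hypothetical example: 3 inputs, 3 weights, a small base-excitation bias.
out = neuron(inputs=[1.0, 0.5, -1.0], weights=[0.8, 0.2, 0.4], bias=0.1)
```

Stacking many such neurons and treating the weights as a matrix turns the loop above into one matrix multiplication, which is why GPUs handle it well.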

100
Q

what is input x weight in neuron implementation called

A

parameter

101
Q

how many parameters does chatgpt have

A

1.76 trillion

102
Q

what is a computer (CPU) core like intel good for

A

good for complex math (smart but few)

103
Q

what is each graphics (GPU) core good for

A

very basic math (dumb but many)

104
Q

what is sigmoid function an analogy for in artificial neurons

A

neuron threshold + maxing out

105
Q

ReLU (rectified linear unit)

A

output increases 1:1 with input above 0 on the x-axis; output is 0 below 0 on the x-axis
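
Both activation functions can be written directly from their descriptions; this is a minimal sketch, not any particular library’s implementation:

```python
import math

def sigmoid(x):
    # Smooth threshold: close to 0 for very negative x, close to 1 for very
    # positive x (analogy: neuron threshold + firing rate maxing out).
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    # Identity (1:1 slope) above 0, flat 0 below 0.
    return max(0.0, x)
```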

106
Q

what do OR, AND, NOT functions have in common

A

each component just looks at its inputs, then transmits a 0 or 1 signal
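
As a sketch of this idea, each gate below is a single threshold unit that looks only at its inputs and emits 0 or 1; the weights and biases are hand-picked illustrative values:

```python
def unit(inputs, weights, bias):
    # A threshold unit: fires 1 if the weighted sum plus bias is positive.
    return 1 if sum(i * w for i, w in zip(inputs, weights)) + bias > 0 else 0

def OR(a, b):  return unit([a, b], [1, 1], -0.5)   # fires if at least one input is 1
def AND(a, b): return unit([a, b], [1, 1], -1.5)   # fires only if both inputs are 1
def NOT(a):    return unit([a], [-1], 0.5)         # inverts its single input
```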

107
Q

perceptron model

A

same as neuronal model (inputs, weights -> output)

108
Q

perceptron update rule (2)

A

small delta = desired output - actual output

big delta = small random value * small delta * input
* big delta must be calculated separately for each weight in the perceptron model
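
A minimal sketch of the update rule in Python; `epsilon` stands in for the small random value, and all the example numbers are hypothetical:

```python
import random

def perceptron_update(weights, inputs, desired, actual, epsilon=None):
    """One application of the update rule:
    small delta = desired - actual
    big delta (one per weight) = epsilon * small_delta * input
    """
    if epsilon is None:
        epsilon = random.uniform(0.01, 0.1)  # the small random value
    small_delta = desired - actual
    big_deltas = [epsilon * small_delta * x for x in inputs]
    return [w + d for w, d in zip(weights, big_deltas)]

# Hypothetical example: output should have been 1 but was 0.
new_w = perceptron_update([0.5, -0.3], inputs=[1.0, 2.0],
                          desired=1, actual=0, epsilon=0.1)
```

Because epsilon is small, one update nudges the weights toward the desired output rather than jumping to it exactly.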

109
Q

what does big delta mean

A

how much you should change each weight in the perceptron model to achieve desired output

110
Q

why does applying big delta not always lead to exact desired output

A

because of the small random value (ε)

111
Q

what type of learning is applying big delta

A

supervised learning

112
Q

multi-layer perceptron (MLP) / dense neural network

A

every element in each layer is connected to every element in the next layer

113
Q

‘deep’ neural network’

A

contains hidden layers that are not directly trained

114
Q

NN training (general) (5)

A
  • randomly initialize all weights
  • put input through model (feed-forward), receive predicted output
  • calculate loss (desired output - predicted output)
  • backpropagate loss through network
  • update all weights and biases
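
The five steps above can be sketched with the simplest possible "network": a single linear neuron, where the hand-derived gradient of the loss stands in for backpropagation. The toy task (learn y = 2x) and the learning rate are invented for illustration:

```python
import random

random.seed(0)

# 1. randomly initialize all weights (here: one weight and one bias)
w = random.uniform(-1, 1)
b = random.uniform(-1, 1)
data = [(x, 2 * x) for x in [0.0, 1.0, 2.0, 3.0]]  # toy task: y = 2x
lr = 0.05

for epoch in range(200):
    for x, desired in data:
        predicted = w * x + b              # 2. feed-forward: predicted output
        loss_grad = predicted - desired    # 3. gradient of loss = 0.5*(desired-predicted)^2
        w -= lr * loss_grad * x            # 4.+5. propagate the error back and
        b -= lr * loss_grad                #       update weight and bias
```

After training, w is close to 2 and b close to 0; in a real deep network the same loop runs over matrices, with backpropagation computing the gradients layer by layer.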
115
Q

convolutional kernel

A

the pattern it looks for; allows for feature search over a whole image instead of at one fixed location

116
Q

how does convolution work

A

multiple input pixels (e.g. a 3x3 patch) feed 1 output value = 1 convolved feature; repeat across the whole image

117
Q

AlexNet concept

A

same image can go through multiple convolutional kernels at the same time to look for more than 1 feature

118
Q

physical symbol system

A

a set of entities (symbols) that are physical patterns that can occur as components of another type of entity called an expression/symbol structure

119
Q

what are all rules in the brain theoretically saved as

A

expressions

120
Q

how are concepts stored in the brain

A

combinations of many features

121
Q

simon’s symbolic model of emotion

A

(in higher level system) start -> physical symbol system (expression) -> done

122
Q

what does simon’s symbolic model of emotion not account for (4)

A
  • classification of emotions
  • which emotions are more important
  • physiological markers
  • neuronal basis of emotion
123
Q

how can simon’s symbolic model of emotion reprioritize based on emotions / urgency

A

CNS can cause interruption

124
Q

how did Ekman find basic emotions

A

tested the Fore people of Papua New Guinea on emotion identification

125
Q

ekman’s 6 basic emotions

A

fear
anger
surprise
sadness
happiness
disgust

126
Q

2 claims about ekman’s emotions

A
  • distinct emotions with distinct physiological features
  • evolutionary functions (hardwired)
127
Q

2 criticisms on ekman’s basic emotions

A
  • link between emotion and physiological response?
  • sociological impact necessary for proper development?
128
Q

how do emotions differ (2)

A

duration and role

129
Q

axes of russel’s circumplex model of emotions

A

valence (goodness) and arousal (engaging) (2D scale, arranged in a circle)

130
Q

Adolph & Anderson modern alternative to emotions

A

7 dimensions to create interspecies framework of emotions

131
Q

7 dimensions of Adolph & Anderson emotion framework

A
  • scalability: varying intensity
  • valence: pleasantness
  • persistence: outlast stimulus
  • generalization: specificity to stimulus
  • global coordination: engage whole organism
  • automaticity: how challenging to control
  • social coordination: social functions
132
Q

appraisal theory of emotions

A

emotions lead to change in the perception of the environment (how emotions relate to cognition)

133
Q

how an emotional episode is created (3)

A
  • triggering stimuli and context interact to form perception
  • perception creates somatic and neural responses as well as cognitive evaluation/appraisal and emotional feelings
  • leads to behavioural / verbal responses
134
Q

5 tools for studying emotions

A
  • neural responses
  • somatic responses
  • affective responses
  • genetic tools
  • lesion studies (temp and permanent)
135
Q

3 examples of genetic tools to study emotions

A

knockout experiments, optogenetics, pharmacogenetics

136
Q

can we use language without communicating

A

not really

137
Q

Paul Watzlawick’s Axioms of communication (5)

A
  • one cannot not communicate
  • communication is diff between diff people
  • communication is punctuated
  • communication involves digital + analogic modalities (verbal + non verbal)
  • communication can be symmetrical or complementary
138
Q

Clark’s language characteristics (5)

A
  • communication
  • arbitrary (e.g. why is ‘truck’ a truck)
  • structured (syntax rules)
  • generative (infinite)
  • dynamic (constantly adding new words)
139
Q

3 types of linguistic representation

A

auditory, visual, haptic (braille)

140
Q

language processing pipeline (4)

A

phonemes, morphemes, syntax and semantics, pragmatics

141
Q

phonology (3)

A

sounds of letters, IPA, spectrogram

142
Q

coarticulation

A

phonemes modify each other

144
Q

why is absence of freq on spectrogram not good for identifying word boundaries

A

silent gaps occur within words too

145
Q

temporal induction

A

strong top-down influence on phoneme perception

146
Q

2 types of morphemes

A

stem and bound morphemes

147
Q

how does auditory word recognition occur

A

all possible words get activated until stimulus narrows down options

148
Q

how does written word recognition occur

A

we don’t read letter by letter, we fixate and process the rest in our periphery

149
Q

what causes increased word processing time (2)

A
  • diff phonemes e.g. pint vs mint
  • double meanings
150
Q

garden-path sentence

A

immediate meaning sounds wrong, must reparse and get to different outcome (include syntactic ambiguity)

151
Q

Chomsky vs Lakoff

A

chomsky = syntax
lakoff = semantics

152
Q

how are all sentences represented according to lakoff

A

pictograms

153
Q

5 pragmatic features

A

-assertives
-directives
-commissives (commit to later action)
-expressives (about mental state)
-declaratives

154
Q

what do decision trees involve

A

lots of arbitrary biases

155
Q

2 features of expert systems

A

explainable, can be hand-crafted

156
Q

2 disadvantages of expert systems

A

difficult to handcraft, falls apart with large data/edge cases/nonlinear correlations

157
Q

what is visual hierarchical processing similar to

A

CNN processing

158
Q

3 CNN features

A
  • sparse connectivity (every input does not connect to every output)
  • shared weights
  • invariance under translation (look for 1 feature across whole image)
159
Q

autoencoders

A

encodes its own input to learn a compressed representation; the densely encoded data is reconstructed, with the network trained through backpropagation

160
Q

what do smaller bottlenecks lead to in CNN autoencoders

A

worse reconstruction

161
Q

latent space interpolation

A

reconstructing interpolated vectors allows you to see what comes between the 2 start and end vectors

162
Q

latent space arithmetic

A

allows you to generate data that was not part of data set
e.g. smiling woman - neutral woman + neutral man = smiling man
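
The arithmetic itself is plain vector addition and subtraction; the 3-number "latent vectors" below are invented for illustration (real latent vectors come from a trained encoder):

```python
# Hypothetical latent vectors; pretend the dimensions encode
# (smiling, femininity, age). These values are made up.
smiling_woman = [0.9, 0.8, 0.4]
neutral_woman = [0.1, 0.8, 0.4]
neutral_man   = [0.1, 0.1, 0.4]

def add(u, v):      return [a + b for a, b in zip(u, v)]
def subtract(u, v): return [a - b for a, b in zip(u, v)]

# smiling woman - neutral woman + neutral man ~= smiling man:
# subtracting isolates the 'smiling' direction, adding applies it to the man.
smiling_man = add(subtract(smiling_woman, neutral_woman), neutral_man)
```

Decoding `smiling_man` back through the trained decoder would then generate an image that was never in the data set.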

163
Q

why must we do arithmetic on latent space rather than image itself

A

image - image could lead to 2 noses, all background space, etc.

164
Q

why are some word embeddings e.g. dictionary position of a word not useful

A

because not meaningful

165
Q

why is latent space arithmetic with words sometimes problematic

A

can lead to biases

166
Q

recurring neural nets (RNN)

A

loop model into itself

167
Q

what does RNN involve

A

sliding window (3 words as input -> following word as output)
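
The sliding-window setup can be sketched as a plain data-preparation step (the sentence is an arbitrary example):

```python
def sliding_windows(words, window=3):
    """Turn a word sequence into (3-word input, next-word target) pairs."""
    return [(words[i:i + window], words[i + window])
            for i in range(len(words) - window)]

pairs = sliding_windows("the sky is blue today".split())
```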

168
Q

what do RNNs allow for (in theory)

A

learn truth value of certain statements

169
Q

what makes RNNs better

A

storage

170
Q

why does storage help RNN

A

because otherwise separator characters have no data to draw a conclusion from

171
Q

neural machine translation with encoder and decoder RNN

A

encode until ‘end token’, then decode and output

172
Q

2 RNN types

A

GRU cells and LSTM

173
Q

GRU cells

A

gated recurrent unit, only short term memory

174
Q

LSTM

A

long short term memory; built in mechanism so you cannot delete stuff from beginning

175
Q

problem with predictive text

A

often causes cyclic phrases if you only pick the #1 option

176
Q

how to solve predictive text problem

A

add randomness, i.e. pick randomly among the top 10 options

177
Q

does looking at character frequency in isolation lead to language models

A

no (e.g. sd n oeiam etc.)

178
Q

bigrams

A

co-occurrences of letters/words (e.g. on inguman ise forenoft etc - not real words but resemble real words more)

179
Q

how to get good predictive text

A

use word bigrams/trigrams/n-grams to form word sequences
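
A minimal bigram sketch of this idea; the tiny corpus is invented, and picking randomly among the followers (rather than always the single most frequent one) is the randomness fix from the previous cards:

```python
import random
from collections import defaultdict

def build_bigrams(words):
    """Map each word to the list of words that follow it in the corpus."""
    table = defaultdict(list)
    for a, b in zip(words, words[1:]):
        table[a].append(b)
    return table

def generate(table, start, length, rng):
    """Follow bigram links; random choice among followers avoids cycles."""
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:
            break
        out.append(rng.choice(followers))
    return out

corpus = "the cat sat on the mat and the cat ran".split()
table = build_bigrams(corpus)
text = generate(table, "the", 5, random.Random(0))
```

Trigrams or n-grams work the same way with longer keys, trading data requirements for more coherent sequences.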

180
Q

cosine similarity

A

cosine of angle between 2 vectors = quantifiable similarity between 2 things (dot product between 2 vectors)
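
A direct sketch of the formula: dot product of the two vectors divided by the product of their lengths. The 5-number vectors below are hypothetical big 5-style scores:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: 1 = same direction, 0 = orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical big 5 score vectors for two people.
alice = [4.0, 3.0, 5.0, 2.0, 4.0]
bob   = [4.0, 3.5, 4.5, 2.0, 4.0]
sim = cosine_similarity(alice, bob)  # close to 1: vectors point the same way
```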

181
Q

person embedding

A

turn big 5 scores into vectors, find cosine similarity; if the vectors point in roughly the same direction, the people should get along better

182
Q

what can be done with vector values (similar to latent space arithmetic)

A

make a slider out of each value in a vector, see what meaning each specific number corresponds to e.g. square 4 = personhood

183
Q

proxy tasks

A

know how many of each object there is, ask something like “what is there the highest quantity of”
- teaches system to count and rank things implicitly

184
Q

CBOW (continuous bag of words)

A

(proxy task to learn word embeddings)
- ask system to fill in blank (implicitly teaches system similarities between concepts e.g. king and queen can both sit on throne)

185
Q

what type of learning are proxy tasks

A

supervised

186
Q

GPT context window

A

number of words a transformer looks at before giving output; the more words you look at, the more contextual questions you can answer

187
Q

what does chatgpt use instead of word embeddings

A

tokens (allows for certain characters to be grouped together therefore system can run better)

188
Q

what must transformers learn

A

correspondence

189
Q

key-value storage

A

key = concept
value = explanation
query = question about a concept (answer should be value)

190
Q

how to execute key-value queries (6)

A
  1. calculate word embedding (e.g. make a vector for word ‘is’)
  2. turn vector into 3 vectors: query, key, value
  3. make key and value vectors for ‘sky’ and ‘the’
  4. run query of ‘is’ against its own key
  5. also run the query of ‘is’ against keys of ‘the’ and ‘sky’ (cosine similarity test)
  6. highest cosine similarity will be the value (in this case, probably sky; value of sky=blue therefore output blue)
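
The six steps can be sketched with toy 2-D vectors. Everything here (the vectors, the word roles) is invented to show the mechanics; in a real transformer the query/key/value vectors are produced by learned matrices, and the scores are softmax-normalized before mixing the values:

```python
import math

def softmax(xs):
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Score the query against every key (dot product), normalize the
    scores, and return the score-weighted mix of the value vectors."""
    scores = softmax([sum(q * k for q, k in zip(query, key)) for key in keys])
    dim = len(values[0])
    return [sum(s * v[d] for s, v in zip(scores, values)) for d in range(dim)]

# Hypothetical 2-D key/value vectors for the words 'the', 'sky', 'is'.
keys   = [[0.1, 0.0], [0.0, 2.0], [0.5, 0.1]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query_is = [0.0, 3.0]  # the query of 'is' matches the key of 'sky' best
mixed = attention(query_is, keys, values)  # dominated by the value of 'sky'
```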
191
Q

narrow/weak AI

A

good at one task (possibly better than humans)

192
Q

artificial general intelligence

A

good at all tasks, can learn new tasks, can learn anything a human can and do it better

193
Q

why do datasets matter for transformers

A

can learn different things from different datasets

194
Q

problems w datasets

A

biases

195
Q

reinforcement learning from human feedback

A

add training data into the model (feedback on how good response was)

196
Q

problem w reinforcement learning from human feedback

A

sparse reward