LECTURE 7 Flashcards
Artificial neural networks
• Computational structure formed of individual units
• Individual unit = “neuron”
• Inspired by brain structure
• Model for neuroscience and cognitive science
Neurons in the brain
• Synapses from other nerve cells release transmitters into the dendrites (input)
• When the electric potential in the soma exceeds a threshold, the neuron fires (trigger), and
• an action potential is sent down the axon to downstream neurons (output)
Important properties of the brain
• Massive parallelism
− 1 kHz versus several GHz clock speed
− but 10^11 neurons
• Graceful degradation
• Plasticity
− both strength of connections and structure of network
Perceptron
• McCulloch-Pitts neuron with parameters w_0, …, w_p learned from data
• Can be used for classification
• Seen before: “logistic regression” with step function as activation function!
• Inductive bias: what functions can it represent?
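To make the computation concrete, here is a minimal sketch (in Python; not from the slides) of the forward pass: a weighted sum plus bias, then a step activation. The AND weights are an illustrative choice, not the lecture's.

```python
import numpy as np

def perceptron_output(w, x):
    """Perceptron forward pass: weighted sum with bias w[0],
    followed by a step activation function."""
    a = w[0] + w[1:] @ x       # activation: w_0 + w_1*x_1 + ... + w_p*x_p
    return 1 if a >= 0 else 0  # step function; logistic regression uses a sigmoid here

# Illustrative weights implementing logical AND (hypothetical):
w_and = np.array([-1.5, 1.0, 1.0])
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", perceptron_output(w_and, np.array(x)))
```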
Decision boundaries
• A classifier divides input space into regions
− points in a region have the same class
− decision boundary: border between regions
• What does the perceptron’s decision boundary look like?
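For a perceptron with two inputs, setting the activation to zero gives the boundary: w_0 + w_1 x_1 + w_2 x_2 = 0, a straight line. A small sketch (reusing the hypothetical AND weights from above) that solves for x_2 and plots the line:

```python
import numpy as np
import matplotlib.pyplot as plt

w0, w1, w2 = -1.5, 1.0, 1.0   # hypothetical AND weights from the earlier sketch
x1 = np.linspace(-0.5, 1.5, 100)
x2 = -(w0 + w1 * x1) / w2     # solve w0 + w1*x1 + w2*x2 = 0 for x2

plt.plot(x1, x2, label="decision boundary")
plt.scatter([0, 0, 1], [0, 1, 0], marker="o", label="class 0")
plt.scatter([1], [1], marker="x", label="class 1")
plt.legend()
plt.show()
```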
Linear separability
• A hypothesis is linearly separable if its decision boundary is linear
• Perceptrons can only represent linearly separable hypotheses
• This is a strong inductive bias, as it restricts the types of concepts that can be represented by perceptrons
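XOR is the standard counterexample: the constraints step(w_0) = 0, step(w_0 + w_1) = 1, step(w_0 + w_2) = 1, step(w_0 + w_1 + w_2) = 0 are contradictory, since adding the two middle ones gives w_0 + w_1 + w_2 ≥ -w_0 > 0. A brute-force illustration (a sketch, not a proof) that no weights on a coarse grid reproduce XOR:

```python
import itertools
import numpy as np

xor = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
step = lambda a: 1 if a >= 0 else 0

# Try every weight triple on a coarse grid; none reproduces XOR.
grid = np.arange(-2, 2.25, 0.25)
found = any(
    all(step(w0 + w1 * x1 + w2 * x2) == y for (x1, x2), y in xor.items())
    for w0, w1, w2 in itertools.product(grid, repeat=3)
)
print("separating weights found:", found)  # -> False
```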
Linear models vs decision trees
• Linear models such as perceptrons take all attributes into account
− those attributes only interact in simple ways, i.e. addition and subtraction
− strong inductive bias: as the number of inputs p increases, the fraction of (Boolean) functions that can be represented decreases exponentially in p
• Decision trees are good at representing functions where you only need to look at a few attributes to make a decision
− attributes can interact in interesting (i.e. non-linear) ways
− bad at “simple” functions involving many attributes, such as majority vote
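Majority vote illustrates the contrast: it is trivial for a linear model (equal weights, threshold at half the number of inputs) but needs a very large decision tree. A minimal sketch of the linear construction (illustrative, not from the slides):

```python
import numpy as np

def majority_perceptron(x):
    """Majority vote as a single perceptron: every weight is 1
    and the bias is -p/2, so it fires iff more than half the inputs are 1."""
    x = np.asarray(x)
    return 1 if x.sum() - len(x) / 2 > 0 else 0

print(majority_perceptron([1, 1, 0, 1, 0]))  # 3 of 5 inputs set -> 1
print(majority_perceptron([1, 0, 0, 1, 0]))  # 2 of 5 inputs set -> 0
```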
Argumentation
• Intelligence requires non-linearly separable hypotheses
• We need multi-layered networks to represent these
• What representations/features in the hidden layers?
− local features (e.g., top left corner, left-hand side) not good enough, e.g., cannot represent a simple hypothesis such as connectedness
− far too many possibilities for global features; should be able to learn those, but don’t know how to (in 1969)
Representation of Boolean function
• Boolean function: from binary inputs to binary output
• Any Boolean function can be represented by a network with a single complete hidden layer
• Possible construction: a specialized hidden unit for each possible input example; output is the OR of the hidden units (see the sketch below)
• Number of hidden units required grows exponentially in the number of inputs (worst case)
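The construction can be written out directly. A sketch (assuming the step-activation units from earlier; the helper names are mine, not the lecture's) that builds one detector unit per positive example and ORs them:

```python
import itertools
import numpy as np

def build_network(truth_table, p):
    """One hidden unit per positive example: the unit fires iff the input
    matches that example exactly; an output unit then ORs the hidden units."""
    positives = [x for x in itertools.product([0, 1], repeat=p)
                 if truth_table[x] == 1]
    # Detector for example e: weight +1 where e_i = 1, -1 where e_i = 0;
    # bias 0.5 - sum(e) makes the activation 0.5 on an exact match, < 0 otherwise.
    W = np.array([[1.0 if e_i else -1.0 for e_i in e] for e in positives])
    b = np.array([0.5 - sum(e) for e in positives])
    return W, b

def predict(W, b, x):
    h = (W @ np.asarray(x) + b >= 0).astype(int)  # hidden detector layer
    return 1 if h.sum() - 0.5 >= 0 else 0         # output unit: OR of the detectors

xor = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
W, b = build_network(xor, p=2)
for x in xor:
    print(x, "->", predict(W, b, x))
```

For XOR this yields two hidden units; for a function like parity on p inputs there are 2^(p-1) positive examples, which matches the exponential worst case on the card.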
Representation versus learning
• So, neural networks can represent any function: no inductive bias
• But then high risk of overfitting!
− Various techniques to prevent this
− In practice well-suited for smooth, somewhat nonlinear functions
• Learning is even harder than representation
• Learning rule is called “backpropagation”:
− Gradient descent on the usual error functions
− Repeated application of the chain rule (see below)
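The card's trailing colon suggests the chain-rule formula followed on the slide; a standard reconstruction (notation mine) for a weight w_{ij} from unit i into unit j, with activation a_j = Σ_i w_{ij} z_i and output z_j = g(a_j):

```latex
\frac{\partial E}{\partial w_{ij}}
  = \frac{\partial E}{\partial a_j}\,\frac{\partial a_j}{\partial w_{ij}}
  = \delta_j z_i,
\qquad
\delta_j = g'(a_j) \sum_k w_{jk}\,\delta_k
```

where k ranges over the units downstream of j; the deltas are computed from the output layer back toward the inputs, hence “back” propagation.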
Summary neural networks
• Inspired by actual neurons: “fire” when input exceeds a threshold
• Learning the network means fitting the weights
• Perceptron: single neuron with a single layer of weights; very similar to logistic regression
• Linear models have linear decision boundaries and can only solve linearly separable problems
• Multi-layered networks generalize linear and logistic regression and can represent any nonlinear function
• Hidden units become clever feature extractors
Learning goals: neural networks
− Explain how a McCulloch-Pitts neuron computes its output and how it relates to actual computation in the brain
− Compute and visualize (in two dimensions) the decision boundary corresponding to the weights of a simple perceptron
− Explain why a simple perceptron can only solve linearly separable problems
− Show how to learn a (simple) perceptron from a data set (see the sketch after this list)
− Explain the difference between a perceptron and a logistic regression model
− Explain how the addition of hidden nodes allows a network to represent non-linear decision boundaries
− Compute the output of a neural network when given its weights and its inputs
− Find a neural network for a simple classification problem (e.g. the XOR)
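The cards state the learning goal but not the update rule itself; a minimal sketch of the classic perceptron learning rule (assuming the step-activation perceptron from earlier; train_perceptron is my name, not the lecture's):

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=0.1):
    """Classic perceptron learning rule: for each misclassified example,
    nudge the weights toward (or away from) that example."""
    w = np.zeros(X.shape[1] + 1)            # w[0] is the bias
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if w[0] + w[1:] @ xi >= 0 else 0
            w[0]  += lr * (yi - pred)       # bias update
            w[1:] += lr * (yi - pred) * xi  # weight update
    return w

# Linearly separable example: logical OR
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])
w = train_perceptron(X, y)
print(w, [1 if w[0] + w[1:] @ xi >= 0 else 0 for xi in X])
```

On linearly separable data this rule is guaranteed to converge to zero errors; on XOR it would cycle forever, which is exactly why the hidden-layer construction above is needed.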