Lecture 20: Information, prediction, people (Flashcards)
What do we want when programming a network?
To find the weights that produce the lowest error rate
How do we find the lowest error rate?
Gradient descent: choose a random point on the error curve and move downhill from there until no further improvement is possible
A bigger slope = ?
A bigger step down (the step taken is proportional to the slope)
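The idea above can be sketched in a few lines. This is an illustrative one-dimensional example (the function names and numbers are my own, not from the lecture):

```python
# A minimal 1-D gradient descent sketch.
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Start at some point and repeatedly step downhill.
    The step size is proportional to the slope: a bigger
    slope means a bigger step down."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * grad(w)  # step opposite the gradient
    return w

# Example: error(w) = (w - 3)**2, so grad(w) = 2*(w - 3); minimum at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
```

Starting from w = 0, the weight converges to the error minimum at w = 3.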
When does learning via gradient descent work? When does it not?
- Works when the network is simple, with 2 layers (input and output)
- Not good for 3 or more layers, as we only have error data at the output layer
What is the term for passing back error data across multiple levels?
Backpropagation
What are the two steps for finding the lowest error rate in a more complex network?
- Assign blame for an error (which node contributed to it?)
- Backpropagation: pass the error signal backwards through the layers
The error landscape is…
multidimensional
What is a problem with gradient descent?
- Could find a local minimum instead of the true (global) minimum
What is the solution to the local minimum problem?
- Retrain the network multiple times from different random starting points
- Bounce around a lot (add randomness to the steps)
- Simulated annealing: start with big, noisy steps and gradually settle down
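Simulated annealing can be sketched as follows. The error function below is my own illustrative example: it has a shallow local minimum near w = -1.4 and a deeper global minimum near w = +1.5, and the search starts in the wrong basin:

```python
import math
import random

# An illustrative error landscape with a local and a global minimum.
def error(w):
    return 0.5 * w**4 - 2 * w**2 - 0.5 * w

def anneal(start, temp=5.0, cooling=0.99, steps=2000, seed=0):
    """Accept uphill moves with a probability that shrinks as the
    'temperature' cools, so the search can escape local minima."""
    rng = random.Random(seed)
    w, e = start, error(start)
    best_w, best_e = w, e
    for _ in range(steps):
        candidate = w + rng.gauss(0, 0.5)   # bounce around a lot
        e_new = error(candidate)
        # Always accept improvements; sometimes accept worse moves.
        if e_new < e or rng.random() < math.exp((e - e_new) / temp):
            w, e = candidate, e_new
            if e < best_e:
                best_w, best_e = w, e
        temp *= cooling                      # gradually cool down
    return best_w

best = anneal(start=-1.4)  # start in the basin of the local minimum
```

Early on, the high temperature lets the search jump out of the shallow basin; as it cools, the search settles into the deeper one, which plain gradient descent from the same start could not reach.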
How does this concept work in a fitness landscape?
- Change genes instead of weights (natural selection)
- Maximize fitness instead of minimizing error
- But natural selection alone is not enough, as it can only move up the slope (so it can get stuck on a local peak)
What are solutions to the natural selection problem?
- Genetic drift
- Sex
- Genetic teleportation
- Genetic recombination
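The ideas above can be sketched as a toy genetic algorithm. Everything concrete here (bit-string genes, the target genotype, population size) is my own illustrative choice: mutation plays the role of random drift, and crossover models recombination:

```python
import random

rng = random.Random(0)
TARGET = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]   # hypothetical optimum genotype

def fitness(genes):
    # Fitness = how many genes match the optimum.
    return sum(g == t for g, t in zip(genes, TARGET))

def mutate(genes, rate=0.05):
    # Random drift: occasionally flip a gene at random.
    return [1 - g if rng.random() < rate else g for g in genes]

def crossover(a, b):
    # Recombination: splice the genes of two parents at a random cut point.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def evolve(pop_size=30, generations=60):
    pop = [[rng.randint(0, 1) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]       # selection: the fitter half survives
        offspring = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                     for _ in range(pop_size - len(parents))]
        pop = parents + offspring
    return max(pop, key=fitness)

best = evolve()
```

Selection alone only climbs the nearest peak; mutation and recombination inject the randomness that lets the population reach higher fitness than pure hill climbing would.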
How do we notice when we make errors?
We have a predictive model of what is meant to happen, and we monitor ourselves against it
How do we incorporate our social nature into this?
- Source 1 of predictions is the other person’s actions
- Source 2 is my mind’s prediction of source 1
How do we reduce uncertainty in a social capacity in terms of predictions?
- Limit choices by nudging other person to go in a certain direction
- Do this by sending information, but doing as little as possible, to limit confusion
- Inference
Our learning error is based on…
predicting both our own and others' creativity