Lecture 11: Probabilistic models of language processing and acquisition Flashcards
What does B. F. Skinner propose regarding language acquisition? Which linguistic theory does his proposal fall into?
He proposed that we acquire language through a combination of imitation, reward, and practice: we learn the right reaction (a word) to a situation (an object or concept).
This falls into the field of behaviorism.
What does Ferdinand de Saussure propose regarding language acquisition? Which linguistic theory does his proposal fall into?
Words are signs that link a form to what it represents. We connect the representation/object to a string of characters through an arbitrary, symbolic, conventional connection.
This falls into the field of structuralism.
What does Chomsky propose regarding language acquisition? Which linguistic theory does his proposal fall into?
Chomsky proposes that language cannot be acquired solely through exposure; there has to be some innate factor, called universal grammar, that gives us the capability of learning a language. According to Chomsky, we start out with a set of switch panels that are all in a neutral position. Exposure to language turns each of these panels either on or off.
This falls into the field of generativism, which states that all languages have some universal principles in common (example: a subject needs a verb).
WATCH OUT: These universal principles can have different parameters.
Example: subjects can either be stated explicitly (I am here) or be included in the verb (Estoy aquí); both options meet the universal principle.
What is the difference between the generativist approach and the probabilistic one? Explain through an example.
According to generativism, a native speaker would recognize that ‘she singed the song’ is incorrect due to innate linguistic capabilities. Here, we classify sentences into “possible” and “impossible”.
From the probabilistic approach, a native speaker would recognize it’s incorrect because it’s statistically unlikely. This approach suggests classifying sentences into “more likely” and “less likely”, rather than making it an “either/or”.
What is the Bayesian approach in language processing?
You start with an initial, possibly arbitrary, belief about what your output should be. Then, you update that belief based on new data (initial belief + new data = updated belief). The more data you collect, the more accurate your understanding of the output becomes.
[See lecture 0:41:00 for “balls on the table” example.]
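The update rule "initial belief + new data = updated belief" can be sketched in a few lines of Python. This is only an illustrative sketch, not the lecture's example: the three hypotheses (how often a sentence has an explicit subject), the uniform starting belief, and the observations are all made up for illustration.

```python
def bayes_update(prior, likelihoods):
    """Return the updated belief after one observation (Bayes' rule)."""
    unnormalized = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical hypotheses: the probability that a sentence has an explicit subject.
hypotheses = [0.1, 0.5, 0.9]
# Initial belief: no hypothesis is preferred over the others.
belief = [1 / 3, 1 / 3, 1 / 3]

# Toy data (made up): 1 = explicit subject heard, 0 = subject dropped.
observations = [1, 1, 0, 1, 1, 1]

for obs in observations:
    # How likely is this observation under each hypothesis?
    likelihoods = [h if obs == 1 else 1 - h for h in hypotheses]
    belief = bayes_update(belief, likelihoods)

# The hypothesis with the highest updated belief.
best = hypotheses[belief.index(max(belief))]
print(best, belief)
```

Each new observation shifts the belief a little further towards the hypotheses that predicted it well, which is the sense in which more data gives a more accurate understanding.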
What is the “minimal attachment” concept in serial language processing?
This concept states that the fewer nodes a syntactic tree has, the better (lower “costs”).
[Example: “The girl saw the boy with the telescope.” See lecture slides.]
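The node-counting idea can be sketched as follows. The two tree shapes below are an assumed textbook-style analysis of the example sentence, not necessarily the trees shown on the lecture slides, and `count_nodes` is a hypothetical helper written for this sketch.

```python
def count_nodes(tree):
    """Count the labelled syntactic nodes in a tree written as (label, child, ...)."""
    if isinstance(tree, str):  # a bare word is not a syntactic node
        return 0
    _label, *children = tree
    return 1 + sum(count_nodes(child) for child in children)

pp = ("PP", ("P", "with"), ("NP", ("Det", "the"), ("N", "telescope")))

# Reading 1: the PP attaches to the verb phrase (seeing by means of the telescope).
vp_attachment = (
    "S",
    ("NP", ("Det", "The"), ("N", "girl")),
    ("VP", ("V", "saw"), ("NP", ("Det", "the"), ("N", "boy")), pp),
)

# Reading 2: the PP attaches inside the object NP (the boy who has the telescope).
np_attachment = (
    "S",
    ("NP", ("Det", "The"), ("N", "girl")),
    ("VP", ("V", "saw"), ("NP", ("NP", ("Det", "the"), ("N", "boy")), pp)),
)

costs = {
    "VP attachment": count_nodes(vp_attachment),
    "NP attachment": count_nodes(np_attachment),
}
# Minimal attachment prefers the reading with the fewest nodes.
preferred = min(costs, key=costs.get)
print(costs, preferred)
```

Under these assumed trees, the second reading needs one extra NP node, so minimal attachment would favour the first reading.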
What is the probabilistic difference between the two sentences “The girl saw the boy with the telescope” and “The girl saw the boy with the book”?
In the first sentence, it could be the girl who saw the boy through a telescope, or she could have seen the boy carrying a telescope. Both readings are more or less equally plausible.
In the second sentence, because books and seeing are not related, it becomes more likely that the book is connected to the boy, and not to the girl's act of seeing.
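A minimal sketch of this contrast, assuming we could assign each attachment a plausibility score. The numbers below are invented purely for illustration; they do not come from a corpus or from the lecture.

```python
# Made-up plausibility scores for attaching "with the <noun>" to the verb
# ("saw") or to the noun ("boy"). These values are illustrative assumptions.
plausibility = {
    "telescope": {"verb": 0.5, "noun": 0.5},
    "book": {"verb": 0.05, "noun": 0.95},
}

def attachment_probabilities(noun):
    """Normalize the scores for one noun so that they sum to 1."""
    scores = plausibility[noun]
    total = sum(scores.values())
    return {site: score / total for site, score in scores.items()}

def preferred_attachment(noun):
    """Return the attachment site with the highest probability."""
    probs = attachment_probabilities(noun)
    return max(probs, key=probs.get)

for noun in plausibility:
    print(noun, attachment_probabilities(noun))
```

With "telescope" the two readings stay balanced, while with "book" almost all of the probability goes to the reading in which the boy has the book.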
What are the requirements of language comprehension?
- Model of general knowledge (Understanding of how the world around us works. Example: ‘the lion ate the antelope’ vs ‘the antelope ate the lion’)
- Theory of mind
- Principles of pragmatics (Understanding how different actions/words can have different meanings, depending on the context)
What is the Poverty of the Stimulus? What does the existence of this concept imply? How can we use this concept in the probabilistic approach?
Children are not exposed to enough linguistic data to learn a language from exposure alone. This suggests that we don’t have to be exposed to every single possible output: we have the innate ability to recognize right and wrong.
Computational models that use probabilistic approaches to infer the right words work better on context-free grammars than on natural languages.