Week 11 - corpus-based CDA Flashcards
what is introspection and what are the pros and cons
Introspection- biased, one person sat in an armchair
Thinking about what you know about language
pros:
-Gives competence data
cons:
-Our cognitive biases can result in inaccuracies
what is corpus analysis and what are the pros and cons
Computer-aided analysis of large data sets
pros:
-Based in reality
-Findings can be generalised
-Deep(er) analysis
cons:
- Not always easy
- Potential for poor analysis
what is a corpus
- A large body of text
- Representative of language (or a genre of language)
- In machine-readable form (e.g. text files on a computer)
- Acts as a standard reference about what’s typical in language
- Often annotated with additional linguistic information –e.g. grammatical codes
why use a corpus
• Allows us to test theories of language and culture
• Large amounts of data tell us about tendencies and what’s normal or typical in real-life language use
– Corpora also reveal very rare/exceptional cases that we wouldn’t get from single texts or introspection
• Helps to remove bias
– We are biased towards the noteworthy
– We are biased towards things that are easy to think of, but may not be normally used
• Human researchers make mistakes and are slow
– Computers are much quicker and more accurate
similarities and differences between critical discourse analysis and corpus linguistics
• Similarities between CDA and CL
–Empirical: both begin with hypotheses that can be tested through observation and experiments
• Though of course two researchers can still disagree on results
–Based on the study and analysis of actual texts
• Differences between CDA and CL
–CDA is (traditionally) focused on in-depth, qualitative analysis
–CL is (traditionally) focused on large-scale quantitative analysis
how can the mode of analysis employed in CDA be criticised
- Close analysis of individual texts: cherrypicked?
- Does the analysis account for everything?
- One solution: use corpus methods
- Looking at lots of data with a partially-quantitative analysis helps you be more objective and less prone to bias than otherwise
key methods in corpus linguistics
frequency
concordance
collocation
keyness
CL frequency
how frequent is a certain feature (x)? (how many times do we mention ….)
- Get the computer to count how often each word occurs
- Does this change over time? Across samples?
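A minimal sketch of getting the computer to count words, in Python. The toy corpus, the simple regex tokeniser and the helper names (`tokenize`, `frequency_list`, `per_thousand`) are illustrative assumptions, not part of the course materials. The normalised figure is what you would compare across samples of different sizes.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (deliberately simple)."""
    return re.findall(r"[a-z']+", text.lower())

def frequency_list(text):
    """Return (word, count) pairs, most frequent first."""
    return Counter(tokenize(text)).most_common()

def per_thousand(word, text):
    """Normalised frequency: occurrences of `word` per 1,000 tokens."""
    tokens = tokenize(text)
    return 1000 * tokens.count(word) / len(tokens)

# Hypothetical toy corpus, for illustration only
sample = "The cat sat on the mat. The dog sat on the cat."
freqs = frequency_list(sample)
print(freqs[:3])
print(round(per_thousand("the", sample), 1))
```

Raw counts answer "how frequent is x?"; the per-thousand figure makes two samples (or two time periods) comparable even when they differ in length.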
CL concordance
all the examples of a word or phrase from a corpus, plus some of the surrounding context
- Some researchers use a corpus as a database of examples, or concordance lines
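A rough sketch of how concordance lines (key word in context) could be produced. The sample sentences are made up for illustration, and the `concordance` function and its `width` parameter are hypothetical names, not a real tool's API.

```python
import re

def concordance(text, node, width=3):
    """Return one line per hit of `node`, with `width` tokens of context either side."""
    tokens = re.findall(r"[a-z']+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines

# Hypothetical sample text, for illustration only
sample = "The elderly man was frail. She is elderly but fit. An elderly but feisty neighbour."
lines = concordance(sample, "elderly")
for line in lines:
    print(line)
```

Reading down the bracketed column shows every use of the word alongside its immediate context, which is what makes concordance lines usable as a database of examples.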
CL collocation
the measure of the relationship between words that co-occur together in texts (words attract or repel each other)
–and thereby derive their meanings (Sinclair 1991, 2004)
–Can show a word’s (socialised) ‘meaning’
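A minimal sketch of measuring collocation, assuming a fixed window around the node word and using mutual information (MI) as the association score. MI is only one of several measures in use and is an assumption here, not something the notes specify; the toy text and function names are also illustrative. A positive score suggests the words attract each other; a negative one suggests they repel.

```python
import math
import re
from collections import Counter

def collocates(tokens, node, window=3):
    """Count the words found within `window` tokens either side of each hit of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - window):i])
            counts.update(tokens[i + 1:i + 1 + window])
    return counts

def mutual_information(tokens, node, collocate, window=3):
    """MI = log2(observed / expected) co-occurrences within the window."""
    freq = Counter(tokens)
    observed = collocates(tokens, node, window)[collocate]
    if observed == 0:
        return float("-inf")  # never co-occur in this sample
    # Expected co-occurrences if the words were scattered independently
    expected = freq[node] * freq[collocate] * 2 * window / len(tokens)
    return math.log2(observed / expected)

# Hypothetical toy text, for illustration only
text = "tell me a story. tell us a story. she read a book. tell him a story"
tokens = re.findall(r"[a-z']+", text.lower())
print(collocates(tokens, "story").most_common(3))
print(mutual_information(tokens, "story", "tell"))
```

On a sample this small the scores mean very little; the point is only the shape of the calculation, which compares how often two words actually co-occur with how often chance alone would predict.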
CL keyness
a keyword occurs more frequently in a text than you would expect by chance alone (compared to a benchmark)
–Can indicate what’s interesting/unique about the text
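A sketch of one way to score keyness: the log-likelihood statistic, which compares a word's frequency in the text with its frequency in a benchmark (reference) corpus. Log-likelihood is a commonly used choice but is an assumption here, since the notes do not name a statistic, and the counts below are invented for illustration.

```python
import math

def log_likelihood(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood keyness score for one word.

    freq_* are the word's counts; size_* are the total token counts
    of the target text and the reference corpus.
    """
    total = size_target + size_ref
    combined = freq_target + freq_ref
    # Counts expected if the word were equally frequent in both
    expected_target = size_target * combined / total
    expected_ref = size_ref * combined / total
    score = 0.0
    if freq_target > 0:
        score += freq_target * math.log(freq_target / expected_target)
    if freq_ref > 0:
        score += freq_ref * math.log(freq_ref / expected_ref)
    return 2 * score

# Invented counts: 50 hits in a 10,000-word text vs 100 hits in a 1,000,000-word benchmark
key_score = log_likelihood(50, 10_000, 100, 1_000_000)
# Same relative frequency in both, so nothing is "key"
flat_score = log_likelihood(10, 1_000, 100, 10_000)
print(key_score, flat_score)
```

The higher the score, the less likely the difference is down to chance. The score alone does not say whether the word is over-used or under-used in the text, so the relative frequencies still need to be compared.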
what is a frequency list
Simply a list of words and their frequencies in a corpus
Do men and women live in different cultures?
• Lakoff, Spender, Tannen etc. have all argued that men and women use language differently.
–e.g. Lakoff’s (1975) theory of “women’s language”
- Empty adjectives
- Hedges
- Precise colour terms
Comparing sociolinguistic variation –men vs. women
Schmid, 2003
- looked at gendered language in the British National Corpus.
- women use more:
- Empty adjectives
- Hedges
- Precise colour terms
discourses surrounding “the elderly”
• Mautner (2007) looks at discourses surrounding “(the) elderly” in contemporary English
–… using corpus methods: particularly collocation, concordance analysis
- Looked at ‘elderly but’ to see which words collocate with elderly
e.g. elderly but… boyish, feisty, fit, loved….
- While elderly may have been a euphemism for old once, there are definitely no grounds now for describing it as “a polite way of saying old” as per its dictionary definition.
- Important point for our purposes: what Mautner found is a “hidden meaning”, a “hegemonic discourse” about a (relatively) powerless group in society
what is collocation
• Collocation: the systematic co-occurrence of words in use
• Two words that co-occur are collocates of one another
• We can also say that one word collocates with another
• The word we want to examine for collocates is known as the node word (Stubbs 2001)
Some examples;
• telephone –operator
• back –front (e.g. back to front, front and back)
• tell –story (e.g. tell me a story)
Why are we interested in collocates?
• Collocation is meaning:
–Firth famously said, “You shall know a word by the company it keeps”
–There are aspects of the words bachelor and spinster that we can’t know unless we look at their collocates (sexist ideology, bachelor= positive, spinster= old)
Marked in society
–All words seem to have collocates
• They can expose patterns ‘not visible to the naked eye’
• They can help us to establish ‘usual’ usage, so we have a reason to talk about unusual usage of words
how may Collocations and language learning affect L2 learners
Pawley and Syder (1983) –theory that L1 speakers have memorised thousands of rare collocational idioms, many of which are unknown to L2 speakers.
- We learn which words attract and repel each other; L2 learners don’t always get this
e.g. while 100% of L1 speakers put ‘nooks and crannies’ together, only 19% of L2 speakers did.
what did Stubbs say about collocation?
‘…if collocations and fixed phrases are repeatedly used as unanalysed units in media discussion and elsewhere, then it is very plausible that people will come to think about things in such terms.’ (Stubbs 1996)
what is semantic preference?
• A common semantic field around a word – i.e. if several words, not collocates themselves, together form a semantic category which does “collocate”