LUDE midterm 1 Flashcards

1
Q

what is phonology?

A

how sounds are organized in languages

2
Q

what is morphology?

A

how words and word forms are built

3
Q

what is syntax?

A

how words are combined to build sentences

4
Q

what is semantics?

A

meaning of words and sentences

5
Q

what is pragmatics?

A

how meaning works in context

6
Q

what are the 2 subfields of phonetics?

A

the study of the sounds the human vocal tract can produce // the study of the gestures that sign languages use

7
Q

what is NLP?

A

Natural language processing (NLP) is a subfield of computer science and artificial intelligence (AI) that helps computers understand and communicate in human language.

8
Q

what are the goals of NLP?

A

NLP allows computers and digital devices to recognize, understand and generate text and speech.

9
Q

what are the three types of writing systems?

A

Alphabetic systems
Syllabic systems
Logographic systems

10
Q

what language is an example of the alphabetic system?

A

English and Korean

11
Q

what language is an example of the syllabic system?

A

Japanese

12
Q

what language is an example of the logographic system?

A

Chinese

13
Q

how are the 3 types of writing systems differentiated?

A

the content represented by the symbols/characters in the written language

14
Q

how is the alphabetic system split up?

A

phonemic, abjads and phonetic

15
Q

What is the phonemic alphabet?

A

Sets of letters arranged in a specific way, each letter represents a phoneme

16
Q

What is an abjad?

A

Also known as consonant alphabets. They have independent letters for consonants and may indicate vowels using some of the consonant letters and/or with diacritics.
e.g., Arabic

17
Q

what is the phonetic alphabet?

A

symbols that each stand for one speech sound, independent of any language's spelling, e.g., the IPA

18
Q

What is the Syllabic system?

A

a writing system whose symbols stand for syllables, the building blocks of speech, usually with a structure of CVC

19
Q

What is the abugida system?

A

the main element is the syllable: each basic symbol is a consonant with an inherent vowel

20
Q

What is an example language in the abugida system?

A

Hindi, Cree, Dene

21
Q

what is the importance of diacritics in the abugida system?

A

they change or mute the inherent vowel

22
Q

What is the syllabary system?

A

A syllabary has a different glyph for each syllable.

23
Q

what is transliteration?

A

a conversion of the characters in one writing system to another system

24
Q

why is IPA important & why is it helpful?

A

IPA accurately describes pronunciation. It eliminates the ambiguities of spelling by assigning a unique symbol to each distinct sound.

25
Q

What is the logographic system?

A

each symbol represents a unit of meaning (a word or morpheme), e.g., Chinese

26
Q

What is the pictograph system?

A

pictures of the items to which they refer, e.g., traffic symbol systems

27
Q

what is a bit?

A

binary digit

28
Q

how many bits are there in a byte?

A

1 byte = 8 bits

29
Q

can you explain a byte?

A

A group of eight 0s and 1s is a byte.
If we have 8 slots and each of them can be 1 or 0, it means we have 2^8 (= 256) unique combinations.
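
The count can be checked with a short Python sketch that enumerates every way to fill eight slots with a 0 or a 1. The variable name `patterns` is illustrative and not from the course material.

```python
from itertools import product

# Every way to fill 8 slots, each holding a "0" or a "1".
patterns = ["".join(bits) for bits in product("01", repeat=8)]

print(len(patterns))   # how many distinct bytes exist
print(patterns[0])     # the smallest pattern (all zeros)
print(patterns[-1])    # the largest pattern (all ones)
```

The list has the same length as 2 ** 8, which is why one byte can hold the values 0 through 255.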

30
Q

what is ascii?

A

The American Standard Code for Information Interchange (ASCII), a common character encoding format for text data in computers and on the internet.

31
Q

how many symbols can ASCII encode?

A

128 symbols, 33 of which are non-printable
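
As a rough check in Python, the built-in `ord` and `chr` convert between a character and its code, and the non-printable codes can be counted directly. The names `control_codes` and `printable_codes` are made up for this sketch; it assumes the usual split where codes 0-31 and 127 are the non-printable ones.

```python
# ord() gives the numeric code of a character, chr() goes the other way.
code_of_A = ord("A")
char_of_97 = chr(97)

# ASCII covers codes 0..127. Codes 0-31 and 127 are control (non-printable) codes.
control_codes = [i for i in range(128) if i < 32 or i == 127]
printable_codes = [i for i in range(128) if i not in control_codes]

print(code_of_A, char_of_97)
print(len(control_codes), len(printable_codes))
```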

32
Q

what is unicode?

A

a standard that aims to represent the characters in ALL writing systems

33
Q

how many bytes does UTF-8 use per character?

A

1-4

34
Q

Each sequence of bytes begins with a…

A

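0 when the character fits in a single byte (the ASCII range); the first byte of a multi-byte sequence begins with 110, 1110, or 11110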

35
Q

In a multi-byte UTF-8 character, the number of 1s before the first 0 in the leading byte tells the computer…

A

how many bytes are in one symbol.
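
A small Python sketch can show this on real characters. The helper `leading_ones` is a hypothetical name for this example. Note that a one-byte (ASCII) character has zero leading 1s, because its byte starts with 0.

```python
def leading_ones(byte):
    """Count the 1 bits that come before the first 0 in an 8-bit value."""
    bits = format(byte, "08b")
    return len(bits) - len(bits.lstrip("1"))

# "A", e-acute, the euro sign, and an emoji (written as escapes for safety).
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    encoded = ch.encode("utf-8")
    as_bits = [format(b, "08b") for b in encoded]
    print(as_bits, "bytes:", len(encoded), "leading 1s:", leading_ones(encoded[0]))
```

For the multi-byte characters, the number of leading 1s in the first byte equals the length of the encoded sequence.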

36
Q

Binary (Base-2) system is represented by

A

only 0s and 1s

37
Q

Decimal (Base-10) system is represented by

A

decimal uses 0-9

38
Q

Hexadecimal (Base-16) is represented by both…

A

the digits 0-9 and the letters A-F
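
To compare the three systems side by side, here is a minimal Python sketch using the built-ins `bin`, `hex`, and `int`. The variable names are illustrative only.

```python
n = 255  # a decimal (base-10) number

as_binary = bin(n)   # base-2 text, prefixed with "0b"
as_hex = hex(n)      # base-16 text, prefixed with "0x"

# int() with a base argument converts the text back to a number.
from_binary = int("11111111", 2)
from_hex = int("ff", 16)

print(as_binary, as_hex, from_binary, from_hex)
```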

39
Q

what is the main difference between UTF-8 and UTF-32?

A

UTF-8 uses a variable number of bytes (1-4) per character, while UTF-32 always uses 4 bytes
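
The byte counts of the two encodings can be compared with a short Python sketch. The function name `byte_lengths` is made up for this example, and the `utf-32-be` codec is used so that no byte-order mark is added to the count.

```python
def byte_lengths(ch):
    """Return (UTF-8 length, UTF-32 length) in bytes for one character."""
    utf8_len = len(ch.encode("utf-8"))
    utf32_len = len(ch.encode("utf-32-be"))  # "-be" variant adds no byte-order mark
    return utf8_len, utf32_len

# "A", the euro sign, and an emoji (written as escapes for safety).
for ch in ["A", "\u20ac", "\U0001F600"]:
    print(repr(ch), byte_lengths(ch))
```

The first number changes from character to character, while the second stays the same.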

40
Q

what's the difference between vowels and consonants?

A

vowels require the vocal tract to be open, while consonants have the vocal tract closed or partially closed

Consonants have low amplitude while vowels have high amplitude

41
Q

what's the difference between voiced and voiceless consonants?

A

whether or not the vocal cords vibrate

42
Q

what is acoustic phonetics?

A

the study of the physical properties of speech sounds, such as the amplitude of waveforms and the frequencies in the spectrum

43
Q

what is a sample rate?

A

the number of discrete points (samples) recorded per second
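
As an illustration, the Python sketch below samples a pure tone at a chosen rate. The function `sample_sine` and the values 440 Hz and 16000 samples per second are assumptions picked for the example, not values from the course.

```python
import math

def sample_sine(freq_hz, sample_rate, duration_s):
    """Record discrete points of a sine wave, sample_rate points per second."""
    n_samples = round(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]

samples = sample_sine(freq_hz=440, sample_rate=16000, duration_s=0.01)
print(len(samples))  # number of points recorded in 0.01 seconds
```

Halving the sample rate halves the number of points recorded for the same stretch of sound.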

44
Q

what are the key concepts of acoustic phonetics?

A

Frequency, Amplitude, Formant

45
Q

what is frequency?

A

cycles per second (Hz); from an auditory perspective it is heard as pitch (high and low notes)

46
Q

What is Amplitude?

A

the size of the wave's pressure variation, heard as loudness

47
Q

what is a formant?

A

a concentration of acoustic energy around a particular frequency in the speech wave

48
Q

how can f1 identify a vowel?

A

F1 corresponds to the height of the vowel, openness of the mouth

49
Q

how can f2 identify a vowel?

A

F2 corresponds to the frontness or backness of the vowel, position of the tongue

50
Q

why is spoken language harder for a computer to ‘adapt’ to than written language?

A

Different vocal tracts
Dental alignment and oral anatomy
Different pronunciations
Dialects, variations
Speech sound disorders

51
Q

what is ASR?

A

automatic speech recognition: the processing of human speech into a written format

52
Q

What is used to train a machine learning-based ASR system (what it learns from)?

A

We give it audio input; the computer looks at the spectrogram (frequencies in Hz and formants) and learns from it

53
Q

how did speech recognition work before machine learning?

A

Matching spectrogram data against templates.
Speaker-dependent machines

54
Q

why are ASR technologies important for documenting endangered languages?

A

there's a lack of textual data, so ASR is used to turn speech data into text

55
Q

what is parametric speech synthesis?

A

speech is generated from parameters such as pitch, duration and formants

56
Q

what is neural speech synthesis?

A

speech is generated as raw audio waveforms directly from text

57
Q

what are the four approaches to computational linguistics?

A

Rule-based approach
Statistical approach
Machine learning approach
Hybrid approach

58
Q

what are three reasons why consistent spelling is important?

A

Faster reading;
Efficient communication;
Easy access to information;

59
Q

what are the 3 types of spelling error?

A

typos, nonword errors, & real word errors

60
Q

what's a typographical error?

A

we pressed the wrong key

61
Q

what's a nonword error?

A

misspelled words, unrecognized names, insertion, deletion, phonetic spelling

62
Q

what is a morpheme?

A

The smallest meaningful unit

63
Q

what's a free morpheme?

A

They can stand alone as independent words and don’t need to be attached to other morphemes, e.g., cat

64
Q

what's a bound morpheme?

A

They cannot stand alone as independent words. They must be attached to a free morpheme (a base or root word), e.g., un- in unhappy

65
Q

what's an inflectional affix?

A

a segment that attaches to a word without changing its word type, e.g., like -> likes is still a verb

66
Q

What is a derivational affix?

A

a segment that attaches to a word and DOES change its word type, e.g., happy (adjective) -> happiness (noun)

67
Q

What is the correct order of the spell-checker workflow?

A
  1. text processing
  2. non word error detection
  3. generation of candidates
  4. suggestions
  5. user decision or auto correct
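
The steps above can be sketched in Python as a toy example. This is only an illustration under loud assumptions: `DICTIONARY` is a made-up five-word wordlist, `check` is a hypothetical function name, and candidates are ranked with the standard library's `difflib.get_close_matches` rather than any method named in the course.

```python
import difflib
import re

# Hypothetical mini-dictionary; a real spell checker would load a full wordlist.
DICTIONARY = ["the", "cat", "sat", "on", "mat"]

def check(text):
    # 1. Text processing: lowercase the input and split it into word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # 2. Nonword error detection: any token missing from the dictionary.
    nonwords = [t for t in tokens if t not in DICTIONARY]
    # 3-4. Candidate generation and suggestions, ranked by string similarity.
    return {w: difflib.get_close_matches(w, DICTIONARY, n=3, cutoff=0.6)
            for w in nonwords}

suggestions = check("Teh cat sat on the mat")
print(suggestions)
# 5. The user (or an auto-correct policy) would now pick one suggestion.
```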
68
Q

what is tokenization?

A

splitting a text into words;

69
Q

what is stemming?

A

removing inflectional suffixes

70
Q

what are the 2 Possible Causes of Spelling Errors?

A

Language-specific issue, & Technology-related factors

71
Q

what is POS tagging?

A

labeling each word with its part of speech (its word type), e.g., noun or verb

72
Q

what's an example of user input?

A

the full sentence that you type in
e.g., this cat is bigger than mine

73
Q

what's an example of tokenization?

A

splitting the full sentence into individual words: this / cat / is / bigger / than / mine

74
Q

what's an example of stemming?

A

removing inflectional suffixes: this cat be big than i

75
Q

what are two reasons why dictionary methods of spell-checking are not always the most effective?

A

Long wordlist: new words keep being added
Unit of entry: inflected forms need their own entries, e.g., cat -> cats

76
Q

what's an n-gram?

A

N-grams are sequences of “n” items from a given text or speech. These items can be words, syllables, letters, or phonemes.

77
Q

How do you count the number of word/character n-grams?

A

Identify N: Decide on the value of “n” (e.g., 2 for bigrams, 3 for trigrams).
Split the text: Break the sentence or paragraph into individual words.
Form the N-grams: Group the words in sequences of “n”.
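
The three steps above can be sketched in Python. The function name `ngrams` is illustrative. A sequence of k items yields k - n + 1 n-grams, and the same function works for characters if the word is split into letters first.

```python
def ngrams(items, n):
    """Group a list of items into every run of n neighbouring items."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]

# Word n-grams: split the text into words, then group them.
words = "this cat is bigger than mine".split()
word_bigrams = ngrams(words, 2)
word_trigrams = ngrams(words, 3)

# Character n-grams: split a word into letters, then group them.
char_bigrams = ngrams(list("cat"), 2)

print(len(word_bigrams), len(word_trigrams), char_bigrams)
```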

78
Q

what does the Soundex system do?

A

words that sound alike get the same key and are stored in the same bin; a misspelt word is converted to its key, and similar words are pulled from the matching bin

79
Q

how do you convert a word to Soundex?

A

use a Soundex calculator or ask ChatGPT
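
For reference, here is a minimal Python sketch of the commonly described American Soundex rules: keep the first letter, turn the following consonants into digits, skip vowels, collapse neighbouring letters with the same digit, and pad with zeros to one letter plus three digits. The names `CODES` and `soundex` are made up for this example, and the course calculator may differ in edge cases.

```python
# Letter groups that share a Soundex digit.
GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
CODES = {ch: digit for digit, letters in GROUPS.items() for ch in letters}

def soundex(word):
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0].upper()              # the first letter is kept as-is
    prev = CODES.get(letters[0], "")      # digit of the previous coded letter
    for ch in letters[1:]:
        if ch in "hw":
            continue                      # h and w do not separate equal digits
        code = CODES.get(ch, "")          # vowels and y have no digit
        if code and code != prev:
            key += code
        prev = code                       # a vowel resets prev to ""
    return (key + "000")[:4]              # pad or cut to letter + 3 digits

print(soundex("Robert"), soundex("Rupert"), soundex("Ashcraft"))
```

Robert and Rupert end up with the same key, which is how similar-sounding words land in the same bin.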

80
Q

how does the confusion matrix work?

A

A confusion matrix is a visualization of how well a classification model is performing. It shows the actual vs. predicted results for your model, helping you see where it’s making correct predictions and where it’s getting things wrong.
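
A minimal Python sketch of the counting behind a confusion matrix follows. The labels "error" and "ok" and the function name `confusion_matrix` are made up for illustration; each cell is keyed by an (actual, predicted) pair.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) pair of labels occurs."""
    return Counter(zip(actual, predicted))

# Hypothetical results from a spelling-error detector on five words.
actual    = ["error", "error", "ok", "ok", "ok"]
predicted = ["error", "ok",    "ok", "ok", "error"]

counts = confusion_matrix(actual, predicted)
for (a, p), n in sorted(counts.items()):
    print(f"actual={a:5} predicted={p:5} count={n}")
```

Cells where the two labels agree are correct predictions; the others show which kind of mistake the model makes.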

81
Q

what are the rules for edit distance?

A

substitution = 1, deletion = 1, insertion = 1, transposition = 2

82
Q

3 possible operations in dynamic programming are….

A

delete, insert, substitute

83
Q

what is the goal of the dynamic programming method?

A

A technical solution for finding the most efficient route (the cheapest sequence of edits from one word to another)
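
A minimal Python sketch of edit distance by dynamic programming, using only delete, insert and substitute at cost 1 each. The function name `edit_distance` is illustrative. Under these costs, swapping two neighbouring letters comes out as 2, because it takes two substitutions.

```python
def edit_distance(source, target):
    """Cheapest number of delete/insert/substitute steps from source to target."""
    rows, cols = len(source) + 1, len(target) + 1
    # dist[i][j] = cost of turning the first i letters of source
    # into the first j letters of target.
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i                      # delete every letter
    for j in range(cols):
        dist[0][j] = j                      # insert every letter
    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + 1,                      # delete
                dist[i][j - 1] + 1,                      # insert
                dist[i - 1][j - 1] + (0 if same else 1)  # substitute (or keep)
            )
    return dist[-1][-1]

print(edit_distance("teh", "the"), edit_distance("cat", "cats"))
```

Each cell reuses the three neighbouring cells already filled in, so the cheapest route is found without trying every possible sequence of edits.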

84
Q

what is a real word error?

A

a real word error is a word that's spelt correctly but is the wrong word for the context, e.g., "their is 4 swans"

85
Q

why are real word mistakes more difficult for computers to fix than non-word mistakes?

A

because real word errors are spelt correctly, so a dictionary lookup finds nothing wrong; only the context shows that the intended meaning is wrong

86
Q

what's a syntactic tree?

A

a syntactic tree is a way of organizing a sentence into phrasal categories

87
Q

what are the 2 techniques that grammar checkers use?

A

relaxation-based techniques and mal-rules

88
Q

what is a relaxation-based technique for grammar checking?

A

the grammar's constraints are relaxed so it can be forgiving of mistakes, typically improper use of verbs/nouns

89
Q

what is the mal-rule technique for grammar checking?

A

a person enters rules describing known error patterns into the computer, and the computer flags text that matches those rules

90
Q

what is the downside of mal-rules?

A

because you have to enter all the rules by hand

91
Q

how do you calculate probability?

A

look at slide 19 on 6.2

92
Q

what is wordnet?

A

WordNet is a lexical database that records the SEMANTIC relationships between words

93
Q

what is a learner’s language corpus?

A

collection of written or spoken texts produced by language learners used to study their language patterns, errors, and development.

94
Q

what are 2 reasons why large language models (LLM) are better in real word mistake detection?

A

they have a better understanding of context, can catch agreement mistakes between clauses and can adapt to writing preferences

95
Q

what is CALL?

A

Computer-Assisted Language Learning

96
Q

what is ICALL?

A

ICALL (intelligent CALL) uses linguistic properties to make CALL better

97
Q

what is a frame-based CALL system?

A

anything multiple choice or fill in the blank

98
Q

what is a positive transfer model?

A

the language being learned is syntactically similar to the known language, so the known language's rules carry over correctly

99
Q

what is a negative transfer model?

A

the language being learned is syntactically NOT similar to the known language, so learners make errors when they try to apply the known language's rules to it