Linguistic and Representational Concepts Flashcards
Difference between a root and a lemma
In natural language processing, the root of a word is its base form, with inflections and affixes stripped off. For example, the root of the word “running” is “run”, and the root of the word “cats” is “cat”. In NLP the root is usually called the stem, and it is produced by a mechanical, rule-based process (stemming) whose output need not be a real word: a Porter stemmer reduces “studies” to “studi”.
The lemma of a word, on the other hand, is the canonical dictionary form used for lookup and inflectional analysis. Because lemmatization relies on a vocabulary and morphological analysis rather than simple affix stripping, the lemma can differ from the stem, especially for irregular forms: the lemma of “studies” is “study”, and the lemma of “better” (as an adjective) is “good”, even though its stem is just “better”.
Overall, the key difference between a root (stem) and a lemma is that the stem is whatever remains after affix stripping and may not be a word at all, while the lemma is always a valid dictionary headword.
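To make the contrast concrete, here is a minimal sketch using NLTK’s Porter stemmer and WordNet lemmatizer. It assumes nltk is installed and the WordNet data has been downloaded (e.g. nltk.download("wordnet")); exact outputs may vary slightly by NLTK version.

```python
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

# Stemming strips affixes mechanically and may return a non-word;
# lemmatization returns a dictionary headword.
print(stemmer.stem("studies"), lemmatizer.lemmatize("studies"))           # studi study
print(stemmer.stem("running"), lemmatizer.lemmatize("running", pos="v"))  # run run
print(stemmer.stem("better"),  lemmatizer.lemmatize("better", pos="a"))   # better good
```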
Difference between inflectional and derivational morphology
Inflectional morphology and derivational morphology are two types of word formation processes in natural language. Inflectional morphology involves the addition of inflections or affixes to a word to indicate grammatical features such as tense, person, gender, and number. Derivational morphology, on the other hand, involves the creation of new words by adding affixes or combining words in different ways.
Inflectional morphology is typically used to mark grammatical features required by the syntax of the sentence, such as tense or person. For example, the suffix -s is added to the base form of a verb to indicate the third-person singular present tense, as in the word “runs”. Inflectional morphology is largely regular and predictable, it never changes the word’s part of speech, and it does not change the basic meaning of the word.
Derivational morphology, on the other hand, is used to create new words with new meanings. For example, the prefix un- can be added to a word to indicate the opposite meaning, as in the word “unhappy”, and the suffix -ness turns the adjective “happy” into the noun “happiness”. Derivational morphology is less regular and predictable than inflectional morphology, and it can change both the basic meaning and the part of speech of the word.
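One practical consequence shows up in lemmatization: inflected forms collapse to the same lexeme, while derived forms are lexemes of their own. A small sketch, again assuming NLTK and its WordNet data:

```python
from nltk.stem import WordNetLemmatizer

lem = WordNetLemmatizer()
# Inflection: "runs" is just a form of the lexeme "run".
print(lem.lemmatize("runs", pos="v"))     # run
# Derivation: "unhappy" is its own dictionary entry, not a form of "happy".
print(lem.lemmatize("unhappy", pos="a"))  # unhappy
```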
What is a lexical compound? In what context would it come up?
A lexical compound is a type of word that is formed by combining two or more words or word parts. Lexical compounds are common in many languages, and they can be found in a variety of contexts.
There are several types of lexical compounds, depending on the structure and meaning of the words that are combined. Some common types of lexical compounds include the following:
Compound nouns: These are nouns built from two words, typically noun + noun, such as “bookcase” or “toothbrush”.
Compound verbs: These are verbs built by combining another word with a verb, such as “to babysit” (noun + verb) or “to proofread”.
Compound adjectives: These are adjectives built from two words, such as “blue-green” (adjective + adjective) or “well-intentioned” (adverb + participle).
Compound adverbs: These are adverbs built from two words, such as “sometimes” or “everywhere”.
Lexical compounds can be found in many different contexts, including everyday language, technical language, and literature. They are often used to create new words that are more precise or expressive than the individual words that make up the compound.
Overall, lexical compounds are an important and highly productive part of the vocabulary of many languages. In NLP they come up whenever a system must decide whether a compound behaves as a single unit or as separate words, for instance in tokenization and parsing.
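As one illustration of where this surfaces in NLP, spaCy’s dependency parser labels the modifying noun of a noun-noun compound with the compound relation. A hedged sketch, assuming the en_core_web_sm model is installed:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("The coffee shop owner bought a new bookcase.")
for token in doc:
    if token.dep_ == "compound":
        # e.g. "coffee" and "shop" flagged as compound modifiers
        print(token.text, "is a compound modifier of", token.head.text)
```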
What is Part-of-Speech and where is it used?
Part-of-speech is a grammatical category that is assigned to each word in a sentence, based on its syntactic function. In natural language processing, part-of-speech tagging is the process of automatically assigning a part-of-speech to each word in a sentence, using a set of grammar rules or a probabilistic model.
There are several common parts of speech, including nouns, verbs, adjectives, adverbs, and pronouns. Each part of speech has a specific role in the sentence, and the correct assignment of part-of-speech to each word is important for understanding the meaning of the sentence.
Part-of-speech tagging is used in many natural language processing tasks, such as parsing and sentiment analysis. In these tasks, the part-of-speech tags provide important information about the syntactic structure of the sentence and the meaning of the words, which can be used to make predictions or generate output.
Overall, part-of-speech tagging is an important step in natural language processing, as it allows us to analyze the syntactic structure of a sentence and to extract the meaning of the words in it.
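A minimal tagging sketch with NLTK (this assumes the punkt and averaged_perceptron_tagger data packages have been downloaded):

```python
import nltk

tokens = nltk.word_tokenize("The cat sat on the mat")
print(nltk.pos_tag(tokens))
# Typically: [('The', 'DT'), ('cat', 'NN'), ('sat', 'VBD'),
#             ('on', 'IN'), ('the', 'DT'), ('mat', 'NN')]
```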
Difference between Open-class Words and Closed-class Words
Open-class words and closed-class words are two categories of words in a language. Open-class words belong to categories that readily accept new members, such as nouns, verbs, adjectives, and adverbs; speakers coin new ones all the time. Closed-class words, on the other hand, belong to categories whose membership is essentially fixed, such as prepositions, conjunctions, and articles; new ones enter the language only rarely.
Open-class words are typically the most common and productive words in a language, and they are used to convey the meaning of a sentence. They are often the focus of language learning and language teaching, as they are the words that are most frequently used and that have the most variation in meaning.
Closed-class words, on the other hand, are typically less common and less productive than open-class words. They are used to indicate the syntactic structure of a sentence, rather than its meaning. They are often learned implicitly, as they are not the focus of language learning and teaching.
Overall, the key difference between open-class words and closed-class words is that open classes readily admit new words into the vocabulary of a language, while closed classes are essentially fixed.
When is a grammar context-free?
In formal language theory, a context-free grammar is a type of formal grammar that consists of a set of productions, or rules, for generating strings of symbols. A grammar is context-free when the left-hand side of every production is a single non-terminal symbol, while the right-hand side can be any sequence of terminal and non-terminal symbols.
Such a grammar is called context-free because its productions apply without regard to context: a non-terminal can be rewritten by its rules no matter which symbols appear before or after it. This contrasts with context-sensitive grammars, where a rule may apply only when the non-terminal occurs in a particular surrounding context.
Context-free grammars are used in natural language processing to describe the syntax of a language, and they are often used to generate or parse sentences in that language. They are also used in computer science and other fields to model the structure of systems and processes.
Overall, a grammar is context-free if every production rewrites a single non-terminal symbol, independently of the context in which that symbol appears.
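A small toy context-free grammar written with NLTK; note that every left-hand side is a single non-terminal:

```python
import nltk

grammar = nltk.CFG.fromstring("""
    S   -> NP VP
    NP  -> Det N
    VP  -> V PP
    PP  -> P NP
    Det -> 'the'
    N   -> 'cat' | 'mat'
    V   -> 'sat'
    P   -> 'on'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("the cat sat on the mat".split()):
    print(tree)  # (S (NP (Det the) (N cat)) (VP (V sat) (PP (P on) (NP (Det the) (N mat)))))
```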
What are terminal and non-terminal (phrasal) categories
In formal language theory, terminal and non-terminal symbols are two types of symbols that are used in a grammar to generate strings. Terminal symbols are the basic building blocks of a language: the words or tokens that actually appear in a sentence. Non-terminal symbols, on the other hand, represent phrases or larger structural units of a sentence.
Terminal symbols are the smallest units in the grammar, and they cannot be broken down into smaller ones. The categories that rewrite directly to words, such as noun, verb, adjective, and adverb, are known as lexical (or preterminal) categories.
Non-terminal symbols, by contrast, are known as phrasal categories when they represent phrases, such as noun phrases, verb phrases, and prepositional phrases. A non-terminal expands into a sequence of terminals and/or other non-terminals, so it can be broken down into smaller units of structure.
Overall, terminal and non-terminal symbols are two types of symbols that are used in grammar to generate strings of symbols. Terminal symbols represent the individual words or tokens in a sentence, while non-terminal symbols represent phrases or larger units of meaning in a sentence.
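The distinction is easy to see in a concrete parse tree, here built with nltk.Tree: the leaves are the terminals, and every internal node carries a non-terminal label.

```python
from nltk import Tree

t = Tree.fromstring("(S (NP (Det the) (N cat)) (VP (V sat)))")
print(t.leaves())                           # terminals: ['the', 'cat', 'sat']
print([st.label() for st in t.subtrees()])  # non-terminals: ['S', 'NP', 'Det', 'N', 'VP', 'V']
```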
What are bounded and unbounded dependencies?
In linguistics and natural language processing, bounded and unbounded dependencies are two types of dependencies between elements of a sentence. Bounded dependencies hold within a fixed, local structural domain, typically inside a single clause. Unbounded dependencies, also called long-distance dependencies, can relate elements that are separated by arbitrarily much material, including any number of clause boundaries.
Bounded dependencies are relatively easy to model and analyze, because the related words stay within a predictable local domain. For example, in the sentence “The cat sat on the mat”, the subject “cat” and the verb “sat” are in a bounded dependency: subject-verb relations are established within the clause.
Unbounded dependencies, on the other hand, are more challenging to model and analyze, because there is no fixed bound on how far apart the related elements can be. The classic cases are filler-gap constructions such as wh-questions and relative clauses. In “Which mat did you say the cat sat on?”, the fronted phrase “which mat” must be interpreted as the object of “on”, and further clauses (“did you say that Mary claimed that ...”) can be stacked between the filler and the gap without limit.
Overall, bounded dependencies stay within a local domain and are relatively easy to model, while unbounded dependencies can cross arbitrarily many clause boundaries and are more challenging.
What is dependency syntax?
Dependency syntax is a type of syntactic analysis that focuses on the dependencies between words in a sentence, rather than on the hierarchical structure of the sentence. In dependency syntax, each word in the sentence is treated as a node, and the dependencies between the words are represented as directed edges between the nodes.
In dependency syntax, the head of a phrase is the word that the other words in the phrase depend on. For example, in the noun phrase “the cat”, the noun “cat” is the head of the phrase, as it determines the meaning of the phrase and the other words in the phrase (i.e. the article “the”) depend on it.
Dependency syntax is used in natural language processing to model the relationships between words in a sentence, and to extract information about the meaning and structure of the sentence. It is often used alongside constituency (phrase-structure) analysis to provide a more complete picture of the sentence.
Overall, dependency syntax is a type of syntactic analysis that focuses on the dependencies between words in a sentence, and it is used to model the relationships between words and to extract information about the meaning and structure of the sentence.
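A minimal dependency-parse sketch with spaCy, assuming the en_core_web_sm model is installed; each word points to its head along a labeled, directed edge:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
for token in nlp("The cat sat on the mat"):
    print(f"{token.text} --{token.dep_}--> {token.head.text}")
# e.g. "The --det--> cat", "cat --nsubj--> sat", with "sat" as the root
```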
What are head words in Syntax?
In syntax, a head word is the central word in a phrase that determines the grammatical properties of the phrase. For example, in the noun phrase “the big red ball,” “ball” is the head word because it determines that the phrase is a noun phrase; the other words all modify it: “the” is a determiner, and “big” and “red” are adjectives. In a verb phrase, the head word is typically the main verb, and in an adjective phrase, the head word is typically the adjective. The concept of a head word is important in syntactic analysis because it allows us to understand the structure and function of phrases in a sentence.
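The same idea falls straight out of a dependency parse: the determiner and the adjectives in the example all attach to the head noun. A small sketch, again assuming spaCy’s en_core_web_sm model:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
for token in nlp("the big red ball"):
    print(token.text, "->", token.head.text)
# Typically: the -> ball, big -> ball, red -> ball, ball -> ball (the root)
```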
What is a synonym?
A synonym is a word or phrase that has the same or nearly the same meaning as another word or phrase. For example, “big” and “large” are synonyms.
What is a hypernym?
A hypernym is the opposite of a hyponym: a more general term that covers a set of more specific terms. For example, “mammal” is a hypernym of the more specific term “dog.”
What is a hyponym?
A hyponym is a word or phrase that is more specific than a more general term. For example, “dog” is a hyponym of the more general term “mammal.”
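Both relations can be looked up in WordNet through NLTK (assuming the wordnet data package has been downloaded); the comments show what current WordNet versions typically return:

```python
from nltk.corpus import wordnet as wn

dog = wn.synset("dog.n.01")
print(dog.hypernyms())     # more general, e.g. [Synset('canine.n.02'), Synset('domestic_animal.n.01')]
print(dog.hyponyms()[:3])  # more specific, e.g. basenji, corgi, dalmatian
```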
What is the Distributional Hypothesis?
The Distributional Hypothesis is a linguistic principle that states that words that are used in similar contexts tend to have similar meanings. This principle is based on the idea that the meaning of a word can be inferred from the words that surround it and the contexts in which it is used. For example, if we see the word “big” often used in sentences with the word “large,” we can infer that the two words have similar meanings.
The Distributional Hypothesis has been a central principle in the field of natural language processing (NLP), where it is used to develop algorithms for tasks such as word sense disambiguation and machine translation. It is also used in the development of word embedding models, which are used to represent words in a continuous vector space in a way that captures the semantic relationships between words.
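A toy illustration of the hypothesis in pure Python: build co-occurrence vectors over a tiny, made-up corpus and compare them with cosine similarity. Words used in the same contexts (“big”/“large”) end up with identical vectors, while unrelated words do not:

```python
from collections import Counter
import math

corpus = [
    "the big dog barked", "the large dog barked",
    "the big cat slept", "the large cat slept",
    "the red ball bounced",
]

# Count each word's neighbors within a +/-1 word window.
vectors = {}
for sent in corpus:
    words = sent.split()
    for i, w in enumerate(words):
        ctx = vectors.setdefault(w, Counter())
        for j in (i - 1, i + 1):
            if 0 <= j < len(words):
                ctx[words[j]] += 1

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    norm = lambda c: math.sqrt(sum(n * n for n in c.values()))
    return dot / (norm(u) * norm(v))

print(cosine(vectors["big"], vectors["large"]))  # 1.0: identical contexts
print(cosine(vectors["big"], vectors["red"]))    # lower: different contexts
```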
Examples of Open-class and Closed-class words
Here are some examples of open-class and closed-class words in English:
Open-class words: book, run, happy, quickly
Closed-class words: of, and, the, with