Topic 10: Text Analysis as part of TTS system Flashcards

1
Q

Text to speech synthesis block diagram

A

text -> text analysis -> phonetic analysis -> prosodic analysis -> speech synthesis
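
The block diagram can be read as a chain of functions, each feeding the next. The sketch below is only an illustration of that data flow: every function body is a toy placeholder (assumed, not from the course material), and all names are hypothetical.

```python
from typing import List, Tuple

def text_analysis(text: str) -> List[str]:
    # Placeholder: lowercase and split on whitespace.
    return text.lower().split()

def phonetic_analysis(words: List[str]) -> List[List[str]]:
    # Placeholder for grapheme-to-phoneme conversion: one unit per letter.
    return [list(word) for word in words]

def prosodic_analysis(phones: List[List[str]]) -> List[List[Tuple[str, float, int]]]:
    # Placeholder: attach a constant pitch (Hz) and duration (ms) to each unit.
    return [[(p, 120.0, 80) for p in word] for word in phones]

def speech_synthesis(units: List[List[Tuple[str, float, int]]]) -> str:
    # Placeholder: describe the output instead of rendering audio.
    count = sum(len(word) for word in units)
    return f"<waveform for {count} units>"

def tts(text: str) -> str:
    # text -> text analysis -> phonetic analysis -> prosodic analysis -> speech synthesis
    return speech_synthesis(prosodic_analysis(phonetic_analysis(text_analysis(text))))

result = tts("Hi there")
```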

2
Q

text analysis

A

Preprocessing and conversion of the input text, including:

document structure detection
text normalization
linguistic analysis

3
Q

phonetic analysis

A

grapheme-to-phoneme conversion

4
Q

prosodic analysis

A

pitch and duration attachment

5
Q

speech synthesis

A

voice rendering

6
Q

Document structure detection detail

A

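Why is it needed? Input text comes in various formats.
A TTS system does not pay attention to structure by itself; its bottom line is to synthesize speech.

The more tagging is available, the better the speech expression.

SSML (Speech Synthesis Markup Language)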
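
As an illustration of how detected structure can be passed on as tags, the sketch below wraps paragraphs and sentences in the SSML `<speak>`, `<p>` and `<s>` elements using Python's standard `xml.etree.ElementTree`. The function name `to_ssml` and the input layout (a list of paragraphs, each a list of sentence strings) are assumptions made for this example; a complete SSML document would also carry namespace and language attributes.

```python
import xml.etree.ElementTree as ET

def to_ssml(paragraphs):
    # paragraphs: list of paragraphs, each a list of sentence strings (assumed layout).
    speak = ET.Element("speak", {"version": "1.1"})
    for sentences in paragraphs:
        p = ET.SubElement(speak, "p")      # paragraph boundary
        for sentence in sentences:
            s = ET.SubElement(p, "s")      # sentence boundary
            s.text = sentence
    return ET.tostring(speak, encoding="unicode")

ssml = to_ssml([["Hello.", "How are you?"]])
```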

7
Q

Text normalization detail

A

Different NLP tasks have different normalization purposes.

For TTS, normalization must continue until the text is converted to a readable form.

It is used to overcome ambiguity to a certain extent.

Why normalize?

  • symbols
  • number format
  • combination of both
  • abbreviation and acronym
  • emoji
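
A minimal sketch of normalizing symbols, numbers, and combinations of both into readable words. The symbol readings, the restriction to numbers of up to three digits, and the function names are all assumptions for this example; emoji and larger numbers are not handled.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def number_to_words(n: int) -> str:
    # Handles 0..999 only (assumed limit for this sketch).
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words if rest == 0 else words + " " + number_to_words(rest)

def normalize(text: str) -> str:
    # Symbol + number combinations first, so the symbol is read after the amount.
    text = re.sub(r"\$(\d+)", r"\1 dollars", text)
    text = re.sub(r"(\d+)%", r"\1 percent", text)
    # Stand-alone symbol.
    text = text.replace("&", "and")
    # Numbers of one to three digits; longer ones are left untouched.
    return re.sub(r"\b\d{1,3}\b", lambda m: number_to_words(int(m.group())), text)

normalized = normalize("Tom & Ann paid $25 for 3 cups, 50% off")
```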
8
Q

Normalisation: Abbreviation and Acronym Expansion

A

example steps
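
One possible expansion procedure, sketched under stated assumptions: tokens are split on whitespace, abbreviations are looked up in a small hand-made table, and acronyms from a hand-made list are spelled out letter by letter. Both tables and the function name are hypothetical.

```python
# Hypothetical lookup tables for illustration only.
ABBREVIATIONS = {"dr.": "doctor", "etc.": "et cetera", "approx.": "approximately"}
SPELLED_ACRONYMS = {"TTS", "NLP"}

def expand(text: str) -> str:
    out = []
    for token in text.split():
        if token.lower() in ABBREVIATIONS:
            # Step 1: replace a known abbreviation with its full form.
            out.append(ABBREVIATIONS[token.lower()])
        elif token in SPELLED_ACRONYMS:
            # Step 2: spell a known acronym letter by letter.
            out.append(" ".join(token))
        else:
            # Step 3: leave every other token unchanged.
            out.append(token)
    return " ".join(out)

expanded = expand("Dr. Lee studies TTS")
```

A table lookup like this cannot resolve ambiguous abbreviations whose reading depends on context; those need the linguistic analysis stage.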

9
Q
Normalisation: Pattern Matching with Regular Expressions (RE)
When should RE be used?
Where can it be applied?
What is an RE?
What is a string?
A

• When to use RE: to search and modify text
• Where it can be used: string, pattern, and corpus matching
• A regular expression, often called a pattern, is an expression used to specify a set of strings required for a particular purpose.
• String: for text-based search, a string is any sequence of alphanumeric characters
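
To make "a pattern specifies a set of strings" concrete, the short example below uses Python's standard `re` module. The pattern `colou?r` and the candidate list are chosen purely for illustration.

```python
import re

# The pattern describes the set {"color", "colour"}: the "u" is optional.
pattern = re.compile(r"colou?r")

candidates = ["color", "colour", "colr", "colouur"]

# fullmatch succeeds only if the whole string belongs to the set.
matches = [s for s in candidates if pattern.fullmatch(s)]
```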

10
Q

More example applications of RE

A
  • test for pattern
  • replace text
  • extract substring
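
The three applications above map onto three functions in Python's standard `re` module. The sample sentence and the phone-number pattern are assumptions made for this example.

```python
import re

text = "Call 555-1234 or 555-9876 before 5 pm."
phone = r"\d{3}-\d{4}"

# Test for a pattern: does the text contain anything phone-like?
has_phone = re.search(phone, text) is not None

# Replace text: mask every match.
masked = re.sub(phone, "<phone>", text)

# Extract substrings: collect every match.
numbers = re.findall(phone, text)
```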
11
Q

Linguistic Analysis (LA) detail

A

Also known as syntactic and semantic parsing in NLP

Information desired for TTS from parsing analysis:
o Word part-of-speech (POS) or word type
o Word sense
o Phrasal cohesion of words: idiom, syntactic phrases, clauses, sentences
o Modification relations among words
o Anaphora (co-reference) and synonymy
o Syntactic type identification: questions, quotes, etc.
o Semantic focus identification (emphasis)
o Semantic type and speech act identification: requesting, informing, narrating, etc.
o Genre and style analysis

In principle, we do not need all of these for TTS, only those that provide TTS-specific functionality.

LA supports the phonetic analysis and prosodic generation phases.
What is needed for TTS:
- sentence breaking/tokenization
- POS tagging
- homograph disambiguation
- noun phrase and clause detection
- sentence type identification
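
Two of the items above, sentence breaking and sentence type identification, can be roughly sketched with punctuation rules. This is a naive illustration with hypothetical function names: it assumes sentences end in `.`, `!` or `?` followed by whitespace, so abbreviations such as "Dr." would be split incorrectly.

```python
import re

def split_sentences(text: str):
    # Break after ., ! or ? when followed by whitespace (naive rule).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def sentence_type(sentence: str) -> str:
    # Classify by final punctuation only; this label could later guide intonation.
    if sentence.endswith("?"):
        return "question"
    if sentence.endswith("!"):
        return "exclamation"
    return "statement"

sentences = split_sentences("Is it raining? Take an umbrella. Hurry!")
types = [sentence_type(s) for s in sentences]
```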