Trade-Offs Flashcards
What is shallow vs. deep NLP?
- Deep: parse every sentence fully, including its semantics. Takes longer to run, and sometimes it is not needed to deliver actionable insight. Builds an extensive representation of grammar and meaning. Not the same thing as neural nets or deep learning (three or more hidden layers). Breaks every word down into its part of speech; the goal is to understand everything about every sentence.
- Shallow: "chunking" at the phrase level. Scrape the surface of documents and keep a partial representation of each one. Shallow parsing breaks text into chunks instead of analyzing every single word. Sometimes you just need the noun phrases, and a deep parse is overkill. Tagging, topic segmentation, and sentiment analysis are usually shallow. (See the chunking sketch below.)
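To make the shallow side concrete, here is a minimal noun-phrase chunking sketch using NLTK (assuming nltk and its tokenizer/tagger data are installed); the sentence and chunk grammar are made up for illustration.

```python
# A minimal sketch of shallow parsing (noun-phrase chunking) with NLTK.
# Assumes nltk is installed and the 'punkt' and POS tagger resources have
# been downloaded; a deep parser would instead build a full parse tree
# for every sentence.
import nltk

sentence = "The new nurse practitioner reviewed the patient chart."
tokens = nltk.word_tokenize(sentence)          # shallow step 1: tokenize
tagged = nltk.pos_tag(tokens)                  # shallow step 2: POS tags

# Chunk grammar: an optional determiner, any adjectives, then nouns.
grammar = "NP: {<DT>?<JJ>*<NN.*>+}"
chunker = nltk.RegexpParser(grammar)
tree = chunker.parse(tagged)

# Pull out just the noun phrases -- often all you need for tagging,
# topic segmentation, or sentiment work.
for subtree in tree.subtrees(filter=lambda t: t.label() == "NP"):
    print(" ".join(word for word, tag in subtree.leaves()))
```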
What is statistical vs. symbolic?
Statistical: uses statistical methods; it dominates right now. Represents documents as vectors to determine which ones are similar to each other. Easily scalable, no rules to write by hand, and you can get it up and running fast.
Symbolic: fixed and rule-based (if this, then that); discrete tests are performed for each rule using Boolean logic and numerical thresholds. You could build a whole system with symbolic rules, but they might have to be constructed manually (the nurse practitioner example). Harder to maintain, but more controllable and explainable. (See the sketch below contrasting the two.)
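Below is a rough sketch contrasting the two approaches on a toy triage task. The rule and the example texts are invented for illustration; the statistical side assumes scikit-learn is installed.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "patient reports chest pain and shortness of breath",
    "follow-up visit for seasonal allergies",
    "severe chest pain radiating to left arm",
]

# Statistical: vectorize the documents and measure which are similar.
vectors = TfidfVectorizer().fit_transform(docs)
print(cosine_similarity(vectors[0], vectors[2]))  # docs 0 and 2 score high

# Symbolic: a hand-written rule with Boolean logic and a numerical threshold.
def urgent(text: str, min_symptoms: int = 2) -> bool:
    symptoms = ("chest pain", "shortness of breath", "radiating")
    return sum(s in text for s in symptoms) >= min_symptoms

print([urgent(d) for d in docs])  # explainable, but the rule is hand-maintained
```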
What is feature engineering vs. feature learning?
Engineering: a human defines the features to use.
Learning: take the human out and let the machine determine the features from training data; this dominates right now. For example, look at all the reviews written over the last few years: ML will learn the features if you have the training data. You don't have to work with SMEs, but they also may not feel involved. Learned features may only be useful for the specific dataset and may not generalize well.
The decision is based on who is going to use the project and what they need. You can also do semi-automated feature engineering: use statistical NLP to bootstrap a list of suggested features and get the process started before presenting them to the SMEs (see the sketch below).
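One way to bootstrap suggested features before an SME review is to score each term against the labels and surface the top candidates. The tiny labeled reviews below are invented; this assumes scikit-learn is installed.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

reviews = [
    "battery died after two days",
    "battery life is amazing",
    "screen cracked on arrival",
    "beautiful screen and fast shipping",
]
labels = [0, 1, 0, 1]  # 0 = negative, 1 = positive

vec = CountVectorizer()
X = vec.fit_transform(reviews)
scores, _ = chi2(X, labels)

# Rank terms by chi-squared score and hand the top ones to the SMEs
# as candidate features, rather than starting from a blank page.
terms = vec.get_feature_names_out()
for term, score in sorted(zip(terms, scores), key=lambda t: -t[1])[:5]:
    print(f"{term}: {score:.2f}")
```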
What is top-down vs. bottom-up?
Top-down: start with high-level classifications and then break down into more detail; favors high-level concepts. Non-technical people love this. Publishers use this approach to automatically classify documents into a hierarchy. The pro is a neat category tree; the con is that the taxonomy is probably pre-defined, so documents can end up misclassified or the taxonomy never gets updated at all.
Bottom-up: look at individual words, disambiguate them, and summarize trends later. Google is an example: it indexes every word of every web page on the internet, recording where each word occurs and how many times, and everything rolls up from there (see the inverted-index sketch after this list).
Humans gravitate to the middle: not too general and not too specific, a moderate amount of both. Our brains prefer this; aim for the categories that 90% of people will want to use 90% of the time.
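Here is a bottom-up sketch: build a tiny inverted index that records, for every word, which documents it appears in and how often. The documents are made up; real search engines roll their summaries up from structures like this.

```python
from collections import defaultdict

docs = {
    "page1": "cell phones and smart phones",
    "page2": "prison cell conditions",
    "page3": "cell phone battery life",
}

# For each word: which pages it occurs on and how many times.
index: dict[str, dict[str, int]] = defaultdict(lambda: defaultdict(int))
for doc_id, text in docs.items():
    for word in text.lower().split():
        index[word][doc_id] += 1

print(dict(index["cell"]))    # {'page1': 1, 'page2': 1, 'page3': 1}
print(dict(index["phones"]))  # {'page1': 2}
```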
What is transparent vs. opaque (AI vs. XAI)?
Transparent: explainable; we can see what it's doing. XAI stands for explainable AI. (See the coefficient-inspection sketch below.)
Opaque: You don’t know what’s under the hood.
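As a quick look at the transparent end of the spectrum, a linear model's weights can be read directly, which is the kind of visibility XAI aims for. The toy sentences and labels are invented; this assumes scikit-learn is installed.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["great product", "terrible product", "great value", "terrible waste"]
labels = [1, 0, 1, 0]

vec = CountVectorizer()
X = vec.fit_transform(texts)
model = LogisticRegression().fit(X, labels)

# Each coefficient tells you how strongly a word pushes the prediction,
# so individual decisions can be explained. A deep network's internal
# weights offer no comparable word-level reading.
for word, weight in zip(vec.get_feature_names_out(), model.coef_[0]):
    print(f"{word}: {weight:+.2f}")
```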
Latent Semantic Analysis (LSA) and Latent Semantic Indexing (LSI)
Statistically, "nurse" and "doctor" ended up closer together than "physician" and "doctor", even though "nurse" and "doctor" are not synonyms.
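A minimal LSA sketch: factor a term-document matrix with truncated SVD so that words used in similar contexts land near each other, which is how "nurse" and "doctor" can come out statistically close without being synonyms. The tiny corpus is invented; this assumes scikit-learn is installed.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "the nurse checked the patient chart",
    "the doctor checked the patient chart",
    "the doctor prescribed medication",
    "the nurse administered medication",
]

tfidf = TfidfVectorizer()
X = tfidf.fit_transform(docs)

# Reduce to a small number of latent "topics"; term vectors live in
# the SVD components.
svd = TruncatedSVD(n_components=2, random_state=0)
svd.fit(X)
terms = list(tfidf.get_feature_names_out())
term_vectors = svd.components_.T  # one row per term

i, j = terms.index("nurse"), terms.index("doctor")
print(cosine_similarity(term_vectors[i : i + 1], term_vectors[j : j + 1]))
```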
Precision and Recall
Precision: out of all the times we said something was X, how many times was it actually X?
Recall: out of all the posts that actually were X, how many did we catch and label as X? (Worked example below.)
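A worked example of the two ratios, using invented labels. It assumes scikit-learn for the convenience functions; the arithmetic is shown alongside so the definitions are explicit.

```python
from sklearn.metrics import precision_score, recall_score

actual    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 4 posts really are X
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]  # we flagged 5 posts as X

# Precision: of the 5 posts we said were X, 3 really were -> 3/5 = 0.6
# Recall:    of the 4 posts that really were X, we caught 3 -> 3/4 = 0.75
print(precision_score(actual, predicted))  # 0.6
print(recall_score(actual, predicted))     # 0.75
```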
Context
The words that appear before and after a target word.
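A small helper showing "context" as the words before and after a target; the window size of 2 is just an illustrative choice.

```python
def context_window(tokens: list[str], i: int, size: int = 2) -> list[str]:
    """Return the words within `size` positions before and after index i."""
    return tokens[max(0, i - size) : i] + tokens[i + 1 : i + 1 + size]

tokens = "the prisoner returned to his cell after dinner".split()
print(context_window(tokens, tokens.index("cell")))  # ['to', 'his', 'after', 'dinner']
```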
Context-free embeddings
Word2Vec, GloVe, etc. Combine all senses of a word into a single vector, which is a problem when the word has multiple meanings (prison cell vs. cell phone).
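A minimal context-free embedding sketch with gensim's Word2Vec (assuming gensim is installed; the toy sentences are invented). Note that "cell" gets exactly one vector even though the sentences use it in two different senses.

```python
from gensim.models import Word2Vec

sentences = [
    "the prisoner sat in his cell".split(),
    "she charged her cell phone overnight".split(),
    "the guard locked the cell door".split(),
]

model = Word2Vec(sentences, vector_size=50, min_count=1, window=3, seed=1)
print(model.wv["cell"].shape)  # (50,) -- one vector shared by every sense of "cell"
```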
Contextual embeddings
ELMo, BERT, etc. Generate a different vector for the same word depending on the context it appears in, so different senses observed in the corpus get different vectors. Use these when your words have multiple senses.
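A sketch of contextual embeddings with Hugging Face Transformers (assuming transformers and torch are installed and the model can be downloaded). The same surface word "cell" gets a different vector in each sentence.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def vector_for(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual vector for `word` within `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    idx = inputs.input_ids[0].tolist().index(tokenizer.convert_tokens_to_ids(word))
    return hidden[idx]

v1 = vector_for("the prisoner sat in his cell", "cell")
v2 = vector_for("she answered her cell phone", "cell")
print(torch.cosine_similarity(v1, v2, dim=0))  # similar but not identical vectors
```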
Pre-trained vectors
Vectors already trained on some large text corpus; the choice of corpus determines what the vectors capture.
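One common way to reuse pre-trained vectors instead of training your own is gensim's downloader, which fetches vectors trained on a public corpus (assuming gensim is installed and a network connection is available).

```python
import gensim.downloader as api

# GloVe vectors trained on Wikipedia + Gigaword, downloaded once and cached.
vectors = api.load("glove-wiki-gigaword-100")
print(vectors.most_similar("doctor", topn=3))
```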
Entity vs. attributes
The entity is the main item; attributes provide additional filtering on top of it. Can we make the distinction? Attributes don't have to go down to a leaf-level node. A leaf-level node is a semantic tag for an entity that uniquely identifies what it is. Ask: what is the smallest number of words needed to describe the product (fewer than five)? (A rough data sketch follows.)
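A rough sketch of the entity/attribute split for a product record. The field names and values are invented; the leaf-level tag uniquely says what the thing is, while attributes only filter it further.

```python
from dataclasses import dataclass, field

@dataclass
class Product:
    entity_tag: str                                  # leaf-level semantic tag: what it is
    attributes: dict = field(default_factory=dict)   # extra filters, not identity

phone = Product(
    entity_tag="smartphone",  # fewer than five words to say what it is
    attributes={"color": "black", "storage_gb": 128, "carrier": "unlocked"},
)
print(phone.entity_tag, phone.attributes)
```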