NER Flashcards
What does NER stand for?
Named Entity Recognition
What is a Named Entity?
It is anything with a proper name
Examples: People, Locations, Organisations, Events. In practice the term is also extended to Dates, Times and Money
What is Named Entity Recognition?
It is the task of finding the spans of text that make up named entities and labelling each span with its entity type
What are some popular NER tags?
People (PER)
Organisations (ORG)
Locations (LOC)
Geo-Political Entity (GPE)
What are NER approaches based on?
They are based on tagsets
What is a popular NER tagset? How many types does it define?
The Automatic Content Extraction (ACE) tagset is a very popular one. It defines 7 entity types.
What are some challenges of NER?
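Span segmentation: NER works with spans, so we have to work out where each NE begins and ends (how many words the label should cover)
Type ambiguity: the same NE can have different types depending on the context (e.g. "Washington" can be a person, a location or an organisation)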
What is the key method used for NER?
We treat it as a sequence labelling problem so we use BIO tagging
What is BIO tagging?
It is a common approach for sequence labelling requiring span-recognition
What does BIO stand for?
Begin, Inside, Outside
What is the idea of BIO tagging?
We assign a tag to each word in our sequence. Each tag marks the word as the beginning of an entity span (B), inside a span (I), or outside any span (O)
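The idea above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `spans_to_bio`, the example sentence and the span format (token-index start, exclusive end, label) are all assumptions made for this card.

```python
def spans_to_bio(tokens, spans):
    """Convert entity spans into one BIO tag per token.

    spans: list of (start, end, label) in token indices, end exclusive.
    (Hypothetical span format chosen for this sketch.)
    """
    tags = ["O"] * len(tokens)          # every token starts outside any span
    for start, end, label in spans:
        tags[start] = f"B-{label}"      # first token of the span
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"      # remaining tokens of the span
    return tags


tokens = ["Jane", "Villanueva", "of", "United", "Airlines", "Holding",
          "visited", "Chicago"]
spans = [(0, 2, "PER"), (3, 6, "ORG"), (7, 8, "LOC")]
tags = spans_to_bio(tokens, spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Each token gets exactly one tag, which is what lets a span-recognition task be handled as word-by-word sequence labelling.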
Explain what the image shows
It shows different tagging schemes. With IO tagging it is hard to tell where one NE ends and the next begins, for example when two NEs of the same type are adjacent. BIO tagging fixes this by adding begin labels, which mark where each NE starts. BIOES goes further by adding an end label (E) and a label for single-token NEs (S).
What model is commonly used to learn BIO tags from text in order to identify NEs?
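A small sketch of how the three schemes relate, assuming the input is already a valid BIO sequence. The helper names `bio_to_io` and `bio_to_bioes` and the example tags are hypothetical, chosen only for illustration.

```python
def bio_to_io(tags):
    """Drop the begin marker: every entity token becomes I-<label>."""
    return ["O" if t == "O" else "I-" + t[2:] for t in tags]


def bio_to_bioes(tags):
    """Add E (end of span) and S (single-token span) to a valid BIO sequence."""
    out = []
    for i, tag in enumerate(tags):
        if tag == "O":
            out.append(tag)
            continue
        prefix, label = tag[0], tag[2:]
        nxt = tags[i + 1] if i + 1 < len(tags) else "O"
        continues = nxt == f"I-{label}"   # does the same span carry on?
        if prefix == "B":
            out.append(f"B-{label}" if continues else f"S-{label}")
        else:
            out.append(f"I-{label}" if continues else f"E-{label}")
    return out


# Two adjacent PER entities (2 tokens, then 1 token), an O token, then a LOC.
bio = ["B-PER", "I-PER", "B-PER", "O", "B-LOC"]
io = bio_to_io(bio)
bioes = bio_to_bioes(bio)

print(io)     # the two people collapse into one run of I-PER
print(bioes)  # span boundaries are marked at both ends
```

In the IO output the two adjacent people are indistinguishable from one three-token person, which is the ambiguity that the B label removes.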
Conditional Random Fields (CRFs)
What type of features are useful for NER?
Non-word features such as capitalisation.
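As an illustration, here is one way such features could be computed per token before being passed to a sequence model. The feature names, the `word_shape` mapping (uppercase to X, lowercase to x, digit to d) and the `<START>`/`<END>` padding values are assumptions for this sketch, not a fixed standard.

```python
import re


def word_shape(word):
    """Map a word to an abstract shape, e.g. 'London' -> 'Xxxxxx'."""
    shape = re.sub(r"[A-Z]", "X", word)
    shape = re.sub(r"[a-z]", "x", shape)
    shape = re.sub(r"[0-9]", "d", shape)   # digits last, so 'd' is not re-mapped
    return shape


def token_features(tokens, i):
    """Build a feature dict for the token at position i (hypothetical feature set)."""
    word = tokens[i]
    return {
        "lower": word.lower(),
        "is_capitalised": word[:1].isupper(),
        "is_all_caps": word.isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "shape": word_shape(word),
        "suffix3": word[-3:],
        # neighbouring words give context for type ambiguity
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<START>",
        "next_lower": tokens[i + 1].lower() if i + 1 < len(tokens) else "<END>",
    }


feats = token_features(["Visit", "London", "today"], 1)
print(feats)
```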
Why is the Hidden Markov Model (HMM) not a good model for NER?
As they are generative, it is hard to add feature patterns
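The contrast can be made concrete by comparing the two standard model forms. The notation is an assumption for this card: x is the word sequence, y the tag sequence, f_k are feature functions with weights w_k, and Z(x) is the normalising constant.

```latex
% HMM (generative): models the joint probability, so every observation
% must be explained by an emission distribution P(x_i | y_i).
P(x, y) = \prod_{i=1}^{n} P(y_i \mid y_{i-1}) \, P(x_i \mid y_i)

% Linear-chain CRF (discriminative): models the tags given the words,
% so any feature of the whole input x can be added as another f_k.
P(y \mid x) = \frac{1}{Z(x)} \exp\!\left( \sum_{i=1}^{n} \sum_{k} w_k \, f_k(y_{i-1}, y_i, x, i) \right)
```

In the HMM, a feature such as capitalisation would have to be built into the emission distribution, whereas in the CRF it is simply one more feature function.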