B01 Unstructured Data Analytics Flashcards
Define Structured Data
Data that is organized and stored in a pre-defined format. Examples include relational databases and tabular files (CSV or XLS).
Define Semi-Structured Data
Data that has a self-describing structure (usually with tags) but does not follow a rigid, pre-defined format. Examples include HTML, XML, and JSON.
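A minimal sketch of what "self-describing" means in practice, using Python's standard json module. The record and its field names are invented for illustration.

```python
import json

# Hypothetical record, invented for illustration only.
raw = '{"name": "Ada", "tags": ["admin", "dev"], "address": {"city": "London"}}'

record = json.loads(raw)

# The keys travel with the data, so fields can be discovered at run time.
fields = sorted(record.keys())

# Nesting is allowed, unlike a flat table row.
city = record["address"]["city"]

# Any given record may lack a field; .get returns None instead of raising KeyError.
phone = record.get("phone")
```

No fixed column list is declared anywhere, which is why the code has to allow for absent fields such as "phone".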
Define Unstructured Data
Data that does not have a pre-defined data model and/or format. Examples include images, text, audio, video, sensor data, etc.
What is Unstructured Data Analytics?
“Unstructured data analytics are a set of techniques focused on the extraction of useful insights from data with unpredictable or inconsistent form.”
Define the Unstructured Analytics approaches we will cover
- Text Analytics
Text Analytics is the process of extracting high-quality insights from textual data. It is sometimes referred to as text mining.
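One very simple text analytics step is counting the most frequent meaningful words. The sketch below assumes a crude regex tokenizer and a hand-picked stopword list; the sample sentence and the helper name top_terms are made up for illustration.

```python
import re
from collections import Counter

def top_terms(text, stopwords, n=3):
    """Return the n most frequent non-stopword tokens as (term, count) pairs."""
    # Lowercase, then keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in stopwords)
    return counts.most_common(n)

# Made-up sample text and stopword list.
sample = "The service was slow. The food was great, and the staff was great too."
stop = {"the", "was", "and", "too"}

terms = top_terms(sample, stop, n=1)
```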
Define the Unstructured Analytics approaches we will cover
- Sentiment Analysis
Extracting an author’s emotional intent from text.
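A minimal lexicon-based sketch of the idea: count positive and negative words and compare. The word lists are tiny and hand-picked for illustration, and this is only one simple approach; it ignores negation, sarcasm, and context.

```python
import re

# Tiny hand-picked lexicons, assumed for illustration only.
POSITIVE = {"great", "good", "love", "excellent"}
NEGATIVE = {"slow", "bad", "hate", "terrible"}

def sentiment_score(text):
    """Positive word count minus negative word count."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)

def label(text):
    """Map the score to a coarse sentiment label."""
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, label("I love this, it is great") returns "positive", while a sentence with no lexicon words is labelled "neutral".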
Define the Unstructured Analytics approaches we will cover
- Topic Modelling
Discovering the abstract “topics” that occur in a collection of documents.
Define the Unstructured Analytics approaches we will cover
- Naïve Bayes
Quantifying the probability of events and how those probabilities should be revised in the light of additional information.
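The revision step described here is Bayes' rule. The sketch below applies it once with made-up numbers (a spam example chosen for illustration); a Naïve Bayes classifier repeats this kind of update across many features, treating them as independent given the class.

```python
from fractions import Fraction

def posterior(prior, likelihood, likelihood_if_not):
    """P(H | E) from P(H), P(E | H) and P(E | not H), via Bayes' rule."""
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Made-up numbers: 20% of messages are spam; the word "free" appears
# in 60% of spam messages and in 5% of non-spam messages.
p_spam_given_free = posterior(Fraction(1, 5), Fraction(3, 5), Fraction(1, 20))
```

Seeing the word "free" revises the probability of spam from the 1/5 prior to 3/4.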
Define the Unstructured Analytics approaches we will cover
- Support Vector Machines
Separating categories of data by representing the data as points in multi-dimensional space.
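A rough sketch of a linear support vector machine, trained by sub-gradient descent on the hinge loss. The toy 2-D points, the learning rate, and the regularization strength are all assumptions chosen for illustration; real work would normally use a library implementation.

```python
def train_linear_svm(points, labels, lr=0.01, reg=0.01, epochs=200):
    """Fit weights w and bias b so that sign(w.x + b) separates the two classes."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Point is inside the margin: step toward classifying it, with shrinkage.
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Point is safely classified: apply only the regularization shrinkage.
                w = [wi - lr * reg * wi for wi in w]
    return w, b

def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1

# Toy data, invented for illustration: two clusters on either side of the origin.
points = [(-2, -1), (-1, -2), (-2, -2), (2, 1), (1, 2), (2, 2)]
labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(points, labels)
```

Each data item is a point in space, and the learned w and b define the line (in higher dimensions, the hyperplane) that separates the two categories.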
Define the Unstructured Analytics approaches we will cover
- Neural Networks
Recognizing patterns in data by using a method that loosely models the neurons in a biological brain.
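A minimal sketch of the "artificial neuron" idea: each neuron takes a weighted sum of its inputs and fires if the sum crosses a threshold. The weights below are set by hand for illustration so that three neurons together compute XOR; a real network would learn its weights from data.

```python
def step(z):
    """Fire (1) if the weighted input is positive, otherwise stay silent (0)."""
    return 1 if z > 0 else 0

def neuron(inputs, weights, bias):
    """A single artificial neuron: weighted sum of inputs, then a threshold."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_network(a, b):
    """Two hidden neurons feeding one output neuron; weights are hand-set."""
    h_or = neuron((a, b), (1, 1), -0.5)    # fires if at least one input is 1
    h_and = neuron((a, b), (1, 1), -1.5)   # fires only if both inputs are 1
    # Output fires when OR is on and AND is off.
    return neuron((h_or, h_and), (1, -1), -0.5)

outputs = [xor_network(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

A single neuron of this kind cannot compute XOR on its own; layering neurons is what lets the network capture the pattern.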