Introduction Data types and Learning Types Flashcards
Structured data
Numerical data (age, time, temperature)
Categorical data (gender, color, country, class)
Unstructured data
- Text
- Audio
- Image
- Signal
- Video
Structured data (tree)
Tabular data
Columns are features and rows are instances
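A minimal sketch of this layout, assuming a toy table stored as a list of dictionaries. The values, the helper names `column` and `is_numerical`, and the rule "numbers are numerical, everything else is categorical" are illustrative assumptions, not part of the original material:

```python
# Hypothetical toy table: each dict is one instance (row),
# each key is one feature (column).
rows = [
    {"age": 34, "temperature": 36.8, "country": "NL", "color": "red"},
    {"age": 51, "temperature": 38.1, "country": "DE", "color": "blue"},
    {"age": 27, "temperature": 37.0, "country": "NL", "color": "red"},
]


def column(table, feature):
    """Return all values of one feature, i.e. one column of the table."""
    return [row[feature] for row in table]


def is_numerical(values):
    """Assumed rule: a feature is numerical if every value is an int or float."""
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


# Label every column as numerical or categorical.
feature_kinds = {
    name: "numerical" if is_numerical(column(rows, name)) else "categorical"
    for name in rows[0]
}
print(feature_kinds)
```

In practice the type of a value is not always enough: a numeric code such as a postal code is usually better treated as categorical.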
Features
• Features are either raw or derived from raw values,
e.g., max, min, average, rank, bin, etc.
• Time plays a special role: time cannot decrease and
often we want to predict the future based on the
past.
• In the case of labeled data, there are descriptive
features and a target feature.
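The derived features listed above can be sketched as follows. The readings, the instance names, and the 38.0 threshold used for binning are hypothetical values chosen only for illustration:

```python
from statistics import mean

# Hypothetical raw feature: a few temperature readings per instance.
readings = {
    "patient_a": [36.5, 37.2, 38.4],
    "patient_b": [36.9, 36.8, 37.0],
    "patient_c": [39.1, 38.7, 38.9],
}


def bin_temperature(value, threshold=38.0):
    """Assumed binning rule: map a numerical value to one of two categories."""
    return "fever" if value >= threshold else "normal"


# Derive max, min, and average from the raw readings.
derived = {}
for name, values in readings.items():
    derived[name] = {
        "max": max(values),
        "min": min(values),
        "average": round(mean(values), 2),
    }

# Derive a rank (1 = highest average) and a bin based on the maximum.
ordered = sorted(derived, key=lambda n: derived[n]["average"], reverse=True)
for rank, name in enumerate(ordered, start=1):
    derived[name]["rank"] = rank
    derived[name]["bin"] = bin_temperature(derived[name]["max"])

print(derived)
```

Binning turns a numerical feature into a categorical one, which shows that the same raw data can yield features of both kinds.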
Labeled tabular data
Descriptive features
Target feature
• Alternative names for descriptive features
- predictor variables
- independent variables
• Alternative names for target feature
- response variable
- dependent variable
• Alternative names for instances
- individuals, entities, cases, objects, or records
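As a sketch of how labeled tabular data is typically split, the snippet below separates the descriptive features from the target feature. The table, the choice of `class` as the target, and the helper name `split_features` are assumptions made for illustration:

```python
# Hypothetical labeled table; "class" is assumed to be the target feature.
labeled_rows = [
    {"age": 34, "country": "NL", "class": "yes"},
    {"age": 51, "country": "DE", "class": "no"},
    {"age": 27, "country": "NL", "class": "yes"},
]
TARGET = "class"


def split_features(table, target):
    """Split each instance into its descriptive features and its target value."""
    descriptive = [
        {name: value for name, value in row.items() if name != target}
        for row in table
    ]
    targets = [row[target] for row in table]
    return descriptive, targets


X, y = split_features(labeled_rows, TARGET)
print(X)  # descriptive features (predictor / independent variables)
print(y)  # target feature (response / dependent variable)
```

The names `X` and `y` follow a common convention for predictors and response; the original material does not prescribe them.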
Supervised learning using labeled data (goal)
The goal is to find a “rule”
in terms of the descriptive
features that explains the
target feature as well as
possible.
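One very simple form such a rule could take is a single threshold on one descriptive feature. The sketch below is only an illustration of the idea, not a specific algorithm from the original material; the data and the function names `accuracy` and `fit_threshold` are hypothetical:

```python
# Hypothetical labeled data: (age, target) pairs.
data = [(22, "no"), (25, "no"), (31, "no"), (38, "yes"), (45, "yes"), (52, "yes")]


def accuracy(threshold, examples):
    """Fraction of examples the rule 'age >= threshold -> yes' gets right."""
    correct = sum(
        1
        for age, label in examples
        if ("yes" if age >= threshold else "no") == label
    )
    return correct / len(examples)


def fit_threshold(examples):
    """Try every observed age as a threshold and keep the most accurate one."""
    candidates = sorted({age for age, _ in examples})
    return max(candidates, key=lambda t: accuracy(t, examples))


best = fit_threshold(data)
print(f"rule: age >= {best} -> yes, accuracy = {accuracy(best, data):.2f}")
```

Here "as well as possible" is measured by accuracy on the labeled instances; other quality measures are possible, and a rule that fits the known data perfectly may still perform worse on new instances.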
Unsupervised learning (goal)
The goal is to find clusters or patterns.
• Clusters are homogeneous sets of instances.
• Patterns reveal hidden structures in the data (unknown unknowns)
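To illustrate what finding homogeneous sets of instances can look like, here is a minimal one-dimensional k-means sketch. k-means is used only as one example of a clustering technique; the data, the starting centers, and the function name `kmeans_1d` are assumptions:

```python
def kmeans_1d(values, centers, iterations=10):
    """Tiny 1-D k-means: assign each value to its nearest center, then update centers."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled data: ages without any target feature.
ages = [21, 23, 25, 61, 64, 67]
centers, clusters = kmeans_1d(ages, centers=[20, 70])
print(centers, clusters)
```

No target feature is involved: the grouping comes entirely from how similar the instances are to each other.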