Quiz #1 Flashcards
What is another name for the rows of a dataset?
Instances
What are synonyms for a column in a dataset?
Features, attributes
What is another name for the target?
Label
What process is important to do before running a classifier or ML algorithm? What other processes does it involve?
Preprocessing, which includes feature engineering and feature transformation.
Data preprocessing is a form of what three processes?
Summarization, Kernelization, and Representation Learning
What is the most common way of representing data to a model? Which two kinds of data require modifications to it?
Instances and features. Time series and networks.
How is data structured in a time series?
Each column is a time slice and each instance (row) is a measured property. For example, one row holds magnetometer readings and another row holds pressure meter readings.
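The layout described in the card above can be sketched as follows. This is only an illustration: the readings, the time labels, and the helper `slice_at` are made up, not taken from the course material.

```python
# Hypothetical readings: each row is one measured property (an instance),
# each column is one time slice. All numbers are invented for illustration.
time_slices = ["t0", "t1", "t2", "t3"]
series = {
    "magnetometer": [0.12, 0.15, 0.11, 0.18],
    "pressure": [101.3, 101.1, 100.9, 101.0],
}

# Every row needs exactly one value per time slice.
for name, readings in series.items():
    assert len(readings) == len(time_slices), name


def slice_at(index):
    """Return every measured property at a single time slice (one column)."""
    return {name: readings[index] for name, readings in series.items()}


print(slice_at(2))
```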
What is an example of numeric type data? How can it be represented to a classifier?
[0.1, 3, 6, 3.4, -34]. It can be given to a classifier as-is or normalized first.
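One common way to normalize is min-max scaling, which maps the smallest value to 0 and the largest to 1. The card does not say which normalization is meant, so treat this as one possible sketch; the function name `min_max_normalize` is an assumption.

```python
values = [0.1, 3, 6, 3.4, -34]


def min_max_normalize(xs):
    """Rescale xs so the minimum becomes 0.0 and the maximum becomes 1.0."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in xs]
    return [(x - lo) / (hi - lo) for x in xs]


normalized = min_max_normalize(values)
print(normalized)
```

Standardizing to zero mean and unit variance is another frequent choice; which one fits better depends on the classifier.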
What is an example of binary type data? How can it be represented to a classifier?
[yes,no,no,yes] and it can be represented to a classifier as 0s and 1s.
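A minimal sketch of that mapping, assuming "yes" is coded as 1 and "no" as 0 (the card does not say which value gets which code):

```python
answers = ["yes", "no", "no", "yes"]

# Assumed coding: yes -> 1, no -> 0.
binary_code = {"yes": 1, "no": 0}

binary_encoded = [binary_code[a] for a in answers]
print(binary_encoded)  # [1, 0, 0, 1]
```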
What is an example of ordinal type data? How can it be represented to a classifier?
[advanced, proficient, beginner, advanced]. Notice that the values have an order (levels), so they can be represented as ordered numbers: [3, 2, 1, 3]
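The ordinal mapping from the card can be sketched with an explicit rank table, which keeps the order of the levels visible:

```python
levels = ["advanced", "proficient", "beginner", "advanced"]

# Rank table matching the card: higher number means higher level.
rank = {"beginner": 1, "proficient": 2, "advanced": 3}

ordinal_encoded = [rank[level] for level in levels]
print(ordinal_encoded)  # [3, 2, 1, 3]
```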
What is an example of nominal type data? How can it be represented to a classifier?
[josh, pedro, andrea, austin]. The values have no inherent order, so they can be represented to the model with one-hot encoding.
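A minimal one-hot sketch in plain Python. Sorting the categories alphabetically to fix the column order is an assumption made here for reproducibility; any fixed order works.

```python
names = ["josh", "pedro", "andrea", "austin"]

# One column per distinct category, in alphabetical order:
# ['andrea', 'austin', 'josh', 'pedro']
categories = sorted(set(names))


def one_hot(value):
    """Return a vector with a single 1 in the column for value."""
    return [1 if value == category else 0 for category in categories]


one_hot_rows = [one_hot(name) for name in names]
for name, row in zip(names, one_hot_rows):
    print(name, row)
```

Unlike the ordinal case, no category ends up numerically larger than another, so the model cannot read a false ranking into the names.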
Where does the target come from in a dataset?
Usually from one of the features, except in unsupervised learning, where there is no target.
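As a rough illustration of taking the target from a feature, the sketch below pulls one column out of a small table and keeps the rest as inputs. The column names and rows are hypothetical.

```python
# Hypothetical dataset: each dict is one instance, each key is one feature.
rows = [
    {"hours_studied": 2, "level": "beginner", "passed": "no"},
    {"hours_studied": 9, "level": "advanced", "passed": "yes"},
    {"hours_studied": 5, "level": "proficient", "passed": "yes"},
]

# Assumed choice: the "passed" column is the one we want to predict.
target_name = "passed"

# Inputs keep every column except the target; labels are the target column.
X = [{k: v for k, v in row.items() if k != target_name} for row in rows]
y = [row[target_name] for row in rows]

print(X)
print(y)
```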
What do you need to do before optimizing a model?
Understand the problem you are trying to solve, and think about whether the features represent that problem and make sense for it.