AI Flashcards
What are the 2 types of data ?
Numerical Data and Categorical Data.
What kind of value does Numerical or Continuous data accept ?
Can accept any value within a finite or infinite interval (e.g., height, weight, temperature, blood glucose, …).
What are the 2 types of Numerical or Continuous data ?
Interval and ratio.
Describe data on an interval scale.
Can be added and subtracted but cannot be meaningfully multiplied or divided because there is no true zero. For example, with temperature in degrees Celsius we cannot say that one day is twice as hot as another day.
Describe data on a ratio scale.
Has a true zero and can be added, subtracted, multiplied or divided (e.g., weight).
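A small worked example of the interval vs. ratio distinction. The two temperatures are made-up values, and the Kelvin conversion is used only to show what happens once the scale has a true zero.

```python
# Illustrative values only: two daily temperatures in degrees Celsius.
day_a_c = 10.0
day_b_c = 20.0

# Differences are meaningful on an interval scale.
diff_c = day_b_c - day_a_c

# The Celsius ratio suggests "twice as hot", but it depends on an arbitrary zero.
ratio_c = day_b_c / day_a_c

# Kelvin has a true zero, so the ratio of the same two days is meaningful.
day_a_k = day_a_c + 273.15
day_b_k = day_b_c + 273.15
ratio_k = day_b_k / day_a_k

print(diff_c, ratio_c, round(ratio_k, 3))
```

The Celsius ratio is 2, while the Kelvin ratio is only about 1.035, which is why ratios on an interval scale should not be interpreted.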
A Categorical or Discrete variable is one that has ….. .
two or more categories (values).
What are the 2 types of categorical variables ?
Nominal and ordinal.
Describe Nominal variables.
Has no intrinsic ordering to its categories. For example, gender is a categorical variable with two categories (male and female) that cannot be ranked.
Describe Ordinal variables.
Has a clear ordering. For example, temperature as a variable with three ordered categories (low, medium and high).
What is a frequency table ?
Is a way of counting how often each category of the variable in question occurs. It may be enhanced by adding the percentage of observations that fall into each category.
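A minimal sketch of a frequency table with percentages, using only the Python standard library. The helper name frequency_table and the sample values are assumptions made for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: (count, percent)} for a list of categorical values."""
    counts = Counter(values)
    total = len(values)
    return {cat: (n, 100.0 * n / total) for cat, n in counts.items()}

# Hypothetical sample of an ordinal variable.
temperature = ["low", "high", "medium", "low", "low", "medium", "high", "low"]
table = frequency_table(temperature)

for category, (count, percent) in table.items():
    print(f"{category:<8}{count:>3}{percent:>8.1f}%")
```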
What is Encoding or continuization ?
Is the transformation of categorical variables to binary or numerical counterparts. An example is to treat male or female for gender as 1 or 0. Categorical variables must be encoded in many modeling methods (e.g., linear regression, SVM, neural networks).
What are the 2 types of encoding ?
Binary and Target-based.
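The two encoding types might be sketched as follows. The binary case follows the gender example above. For the target-based case the cards give no definition, so this sketch assumes one common form: each category is replaced by the mean of the target variable observed for that category. The function names and the color/bought data are hypothetical.

```python
from collections import defaultdict

def binary_encode(values, positive):
    """Map one chosen category to 1 and every other value to 0."""
    return [1 if v == positive else 0 for v in values]

def target_mean_encode(values, targets):
    """Replace each category with the mean target value seen for it (assumed form)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, t in zip(values, targets):
        sums[v] += t
        counts[v] += 1
    means = {v: sums[v] / counts[v] for v in sums}
    return [means[v] for v in values]

# Binary encoding: male -> 1, female -> 0.
gender = ["male", "female", "female", "male"]
gender_encoded = binary_encode(gender, positive="male")

# Target-based encoding: hypothetical categorical predictor and 0/1 target.
color = ["red", "blue", "red", "blue"]
bought = [1, 0, 0, 0]
color_encoded = target_mean_encode(color, bought)

print(gender_encoded, color_encoded)
```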
What is Binning or discretization ?
Is the process of transforming numerical variables into categorical counterparts.
An example is to bin values for Age into categories such as 20-39, 40-59, and 60-79.
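The Age example can be sketched in a few lines. The function name bin_age is hypothetical, and the sketch assumes whole-number ages, so a value outside every bin (or between two bins) returns None.

```python
def bin_age(age):
    """Return the label of the bin containing age, or None if no bin matches."""
    bins = [(20, 39, "20-39"), (40, 59, "40-59"), (60, 79, "60-79")]
    for low, high, label in bins:
        if low <= age <= high:
            return label
    return None

# Hypothetical ages; 85 falls outside all three bins.
ages = [23, 41, 59, 60, 85]
binned = [bin_age(a) for a in ages]
print(binned)
```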
Numerical variables are usually discretized in modeling methods based on ….. .
frequency tables (e.g., decision trees).
Binning may improve the accuracy of predictive models by ….. or ….. .
reducing noise, reducing non-linearity.
What is a Dataset ?
Is a collection of data, usually presented in a tabular form. Each column represents a particular variable, and each row corresponds to a given member of the data.
Alternatives for columns: ….., ….., ….. .
Fields, Attributes, Variables.
Alternatives for rows: ….., ….., ….., ….., ….., ….. .
Records, Objects, Cases, Instances, Examples, Vectors.
Alternatives for values: ….. .
Data.
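One way to picture the dataset vocabulary is as a list of dictionaries. The column names and values here are invented for illustration: each dictionary is a row (record, case, instance), each key is a column (field, attribute, variable), and each stored item is a value.

```python
# Hypothetical dataset: three rows, three columns.
dataset = [
    {"age": 34, "weight_kg": 70.5, "gender": "female"},
    {"age": 51, "weight_kg": 82.0, "gender": "male"},
    {"age": 27, "weight_kg": 64.2, "gender": "female"},
]

def column(rows, name):
    """Collect the values of one variable across all rows."""
    return [row[name] for row in rows]

first_record = dataset[0]              # one row
ages = column(dataset, "age")          # one column
single_value = dataset[1]["weight_kg"] # one value

print(first_record, ages, single_value)
```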
In predictive modeling, ….. or ….. are the input variables .
predictors, attributes.
In predictive modeling, ….. or ….. is the output variable .
target, class attribute.
In predictive modeling, the output variable value is determined by ….. and ….. .
the values of the predictors, the function of the predictive model.
Pattern recognition predicts the future by ….. .
means of modeling.
What is Predictive modeling ?
Is the process by which a model is created to predict an outcome.
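A minimal sketch of predictive modeling with a single predictor. The cards name no particular model, so a least-squares straight line is assumed here purely for illustration, and the hours/score data are made up. The fitted function plus the predictor value determine the predicted target.

```python
def fit_line(xs, ys):
    """Create the model: least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the model's function to a predictor value to get the target."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical training data: predictor (hours) and target (score).
hours = [1.0, 2.0, 3.0, 4.0]
score = [3.0, 5.0, 7.0, 9.0]

model = fit_line(hours, score)
predicted = predict(model, 5.0)
print(model, predicted)
```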