Chapter 1 Flashcards
Artificial Intelligence
Artificial intelligence (AI) is a branch of computer science that is designed to mimic human-like intelligence for certain tasks, such as discovering patterns in data, recognizing objects from an image, understanding the meaning of text, and processing voice commands.
Binary
Binary categorical data can have only two values—for example, yes or no. This can be represented in different ways such as 1 or 0 or “True” and “False.” Binary data is commonly used for classification in predictive modeling.
Categorical Data
Categorical data exists when values represent a group of categories. Categorical variables can be one of three types: binary, nominal, or ordinal.
Continuous Data
Continuous Data
“Continuous data includes values with decimals: 1, 1.4, 3.75, . . .”
Dependent (Target) Variable
The variable being predicted is referred to as the dependent (target) variable (Y).
Descriptive Analytics
Descriptive analytics are a set of techniques used to explain or quantify the past.
Discrete Data
Discrete data is measured in whole numbers (integers): 1, 2, 3, . . .
Independent (Predictor) Variable
The variables used to make the prediction are called independent variables (X) (also referred to as predictors or features).
Integer
An integer is a whole number.
Interval
Interval data has an equal distance between data points and does not include an absolute zero.
Machine Learning
Machine learning is a statistical method of learning that can be trained without human intervention to understand and identify relationships between previously established variables.
Marketing Analytics
Marketing analytics uses data, statistics, mathematics, and technology to solve marketing business problems.
Nominal
Nominal categorical data consist of characteristics that have no meaningful order.
Ordinal
Ordinal categorical data represent meaningful values with a natural order but the intervals between scale points may be uneven.
Predictive Analytics
Predictive analytics is used to build models based on the past to explain the future.
Prescriptive Analytics
Prescriptive analytics identifies the best optimal course of action or decision.
Primary Data
Primary data is collected for a specific purpose. For example, companies conduct primary research with surveys, focus groups, interviews, observations, and experiments to address problems or answer distinct questions.
Ratio
Ratio values can have an absolute zero point and can be discussed in terms of multiples when comparing one point to another.
Secondary Data
Secondary data relies on existing data that has been collected for another purpose.
SMART Principles
SMART principles are used as a goal-setting technique. The acronym stands for specific, measurable, attainable, relevant, and timely.
Structured Data
Structured data is made up of records that are organized in rows and columns. This type of data can be stored in a database or spreadsheet format.
Supervised Learning
In supervised learning, the target variable of interest is known and is available in a historical dataset.
Testing Dataset
A testing dataset is used to evaluate the final selection algorithm on a dataset unique from the training and validation datasets.
Training Dataset
The training dataset is the data used to build the algorithm and “learn” the relationship between the predictors and the target variable.
Unstructured Data
Unstructured data does not have a predefined structure and does not fit well into a table format (within rows and columns).
Unsupervised Learning
Unsupervised learning has no previously defined target variable. The goal of unsupervised learning is to model the underlying structure and distribution in the data to discover and confirm patterns in the data.
Validation Dataset
The validation dataset is used to assess how well the algorithm estimates the target variable, and helps select the model that most accurately predicts the target value of interest.
Variables
Variables are characteristics or features that pertain to a person, place, or object. Marketing analysts explore relationships between variables to improve decision making.