Exam 1 Flashcards
Variables that take categories as their values, such as “yes” or “no”, or blue, brown, green
Categorical (qualitative)
Variables that have values that represent a counted or measured quantity
Numerical (quantitative)
Variables that arise from a counting process
Discrete / numerical (quantitative)
Variables that arise from a measuring process
Continuous / numerical (quantitative)
Facts and figures collected, analyzed, and summarized for presentation and interpretation
Data
All the data collected in a particular study are referred to as the __________ for the study
Data set
The entities on which data are collected
Elements
A characteristic of interest for the elements
Variable
The set of measurements obtained for a particular element is called _____
An observation
A data set with n ________ contains n __________.
Elements, observations
How do you calculate the total number of data values?
The total number of data values in a complete data set is the number of elements multiplied by the number of variables
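Worked example for the card above: a minimal Python sketch, assuming a hypothetical toy data set (the student names, variables, and values are made up for illustration and are not from the course material). It counts elements and variables, multiplies them, and checks the product against a direct count of the stored values.

```python
# Hypothetical toy data set: each key is an element (a student),
# each inner dict is that element's observation (one value per variable).
data_set = {
    "Ana": {"major": "Finance", "credits": 15},
    "Ben": {"major": "Marketing", "credits": 12},
    "Cy": {"major": "Finance", "credits": 18},
}

n_elements = len(data_set)                        # number of elements
n_variables = len(next(iter(data_set.values())))  # variables per observation

# Total data values in a complete data set = elements x variables.
total_values = n_elements * n_variables

# Cross-check by counting every stored value directly.
counted = sum(len(observation) for observation in data_set.values())

print(n_elements, n_variables, total_values)  # 3 2 6
```

The rule only holds for a complete data set; if any observation were missing a value, the direct count would come out smaller than the product.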
Nominal data
Defined categories such as eye color, political party, marital status
Ordinal data
Categorical - Ordered categories such as good, better, best or low, medium, high
Data classified into distinct categories in which no ranking is implied
A nominal scale. EX: Do you have a Facebook profile? (Y or N); cellular provider? (Verizon, AT&T, etc.)
Classifies data into distinct categories in which ranking is implied
Ordinal data - EX: grades, ratings, product satisfaction
Data that has the properties of ordinal data and the ___ between observations is expressed in terms of a fixed unit of measure. It is always ___. The scale ____ contain a ____ value that indicates that nothing exists for the variable at the _____ point.
Interval, numeric, does not, zero (x2)
Data that has all the properties of interval data and the ___ of two values is meaningful. The scale ____ contain a zero value that indicates that nothing exists for the variable at the zero point.
Ratio, must
Data that is collected at the same or approximately the same point in time.
Cross-sectional data. EX: data detailing the number of building permits issued in Nov. 2019 in each of the counties of Ohio.
Data that is collected over several time periods
Time series data. EX: data detailing the number of building permits issued in Lucas County, Ohio in each of the last 36 months. Graphs of time series help analysts understand what happened in the past, identify any trends over time, and project future levels for the time series.
The set of all elements of interest in a particular study
Population
A subset of the population
Sample
The process of using data obtained from a sample to make estimates and test hypotheses about the characteristics of a population
Statistical inference
Collecting data for the entire population
Census
Collecting data for a sample
Sample survey
Collecting data via sampling is used when doing so is:
Less time-consuming than selecting every item in the population, less costly than selecting every item in the population, and less cumbersome and more practical than analyzing the entire population.
Summarizes the value of a specific variable for a population.
Population parameter
Summarizes the value of a specific variable for sample data
Sample statistic
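Worked example for the parameter and statistic cards above: a minimal Python sketch, assuming a hypothetical population of 100 numeric values and a sample size of 10 (both chosen only for illustration). The same summary, the mean, is a population parameter when computed from a census and a sample statistic when computed from a sample.

```python
import random
from statistics import mean

# Hypothetical population: one numeric variable recorded for all 100 elements.
population = list(range(1, 101))

# Census: every element is used, so this mean is a population parameter.
population_mean = mean(population)

# Sample survey: a subset of the population (seeded so the run is repeatable).
rng = random.Random(0)
sample = rng.sample(population, 10)

# The same summary computed on the sample is a sample statistic.
sample_mean = mean(sample)

print("population parameter (mean):", population_mean)
print("sample statistic (mean):", sample_mean)
```

The sample mean will usually differ somewhat from the population mean; using the statistic to estimate the parameter is what statistical inference refers to.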
Tallies the frequencies or percentages of items in a set of categories so that you can see differences between categories
A summary table
A summary of data showing the number (frequency) of observations in each of several non-overlapping categories or classes.
Frequency distribution
The ____ ______ of a class is the fraction or proportion of the total number of data items belonging to the class. What is the equation?
Relative frequency. Equation: relative frequency of a class = (frequency of the class) / n, where n is the total number of data items
How do you calculate percent frequency of a class?
The relative frequency multiplied by 100
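Worked example for the frequency cards above: a minimal Python sketch, assuming a hypothetical list of ten responses for one categorical variable (the soft drink names are made up for illustration). It builds the frequency distribution, then the relative and percent frequency of each class.

```python
from collections import Counter

# Hypothetical responses for one categorical variable.
responses = [
    "Coke", "Pepsi", "Coke", "Sprite", "Coke",
    "Pepsi", "Coke", "Sprite", "Pepsi", "Coke",
]

n = len(responses)  # total number of data items

# Frequency distribution: count of observations in each non-overlapping class.
frequency = Counter(responses)

# Relative frequency of a class = frequency of the class / n.
relative_frequency = {cls: count / n for cls, count in frequency.items()}

# Percent frequency of a class = relative frequency x 100.
percent_frequency = {cls: 100 * count / n for cls, count in frequency.items()}

for cls in frequency:
    print(cls, frequency[cls], relative_frequency[cls], percent_frequency[cls])
```

As a quick check, the relative frequencies should sum to 1 and the percent frequencies to 100.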
Used to study patterns that may exist between the responses of two or more categorical variables.
A contingency table
It cross tabulates or tallies jointly the responses of the categorical variables
A contingency table.
A tabular summary of data for two variables.
A crosstabulation
Contingency table - For two variables, the tallies for one variable are located in the ____ and the tallies for the second variable are located in the ________.
Rows, columns
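Worked example for the contingency table cards above: a minimal Python sketch, assuming six hypothetical survey responses for two categorical variables (Facebook profile and cellular provider; the responses themselves are invented). It tallies the responses jointly, with one variable in the rows and the other in the columns.

```python
from collections import Counter

# Hypothetical paired responses: (has Facebook profile, cellular provider).
responses = [
    ("Y", "Verizon"), ("Y", "AT&T"), ("N", "Verizon"),
    ("Y", "Verizon"), ("N", "AT&T"), ("Y", "AT&T"),
]

# Joint tally of each (row category, column category) pair.
joint = Counter(responses)

row_labels = sorted({profile for profile, _ in responses})
col_labels = sorted({provider for _, provider in responses})

# Contingency table: first variable in the rows, second in the columns.
table = {
    row: {col: joint[(row, col)] for col in col_labels}
    for row in row_labels
}

# Print the table with a header row of column categories.
print("profile".ljust(9) + "".join(col.ljust(9) for col in col_labels))
for row in row_labels:
    print(row.ljust(9) + "".join(str(table[row][col]).ljust(9) for col in col_labels))
```

Every response lands in exactly one cell, so the cell counts add up to the number of responses.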
A sequence of data, in rank order, from the smallest value to the largest value.
An ordered array.
It shows range (minimum value to maximum value)
Ordered array.
May help identify outliers (unusual observations)
An ordered array
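Worked example for the ordered array cards above: a minimal Python sketch, assuming seven hypothetical numeric values (invented for illustration). Sorting puts the minimum and maximum at the two ends, so the range can be read off directly, and an unusually large value stands out at the end of the array.

```python
# Hypothetical raw data values, in the order they were collected.
values = [42, 17, 95, 23, 19, 21, 18]

# Ordered array: the data in rank order, smallest value to largest value.
ordered_array = sorted(values)

# The ends of the ordered array give the minimum and maximum values.
minimum = ordered_array[0]
maximum = ordered_array[-1]
data_range = maximum - minimum

print(ordered_array)  # [17, 18, 19, 21, 23, 42, 95]
print(minimum, maximum, data_range)  # 17 95 78
```

In this made-up data, 95 sits far from the other values, which is the kind of possible outlier an ordered array makes easy to spot.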