Key words Flashcards
Nominal Scale
Labels or names used to identify an attribute of the element
Ex: M/F, Married/Single, Eye Color: Blue/Brown/Grey/Etc.
Ordinal Scale
Data exhibiting the properties of nominal data, where the order or rank of the data is also meaningful
Ex: Old age, Middle age, Young adulthood; High/Low
Interval Scale
Data having all the properties of ordinal data, where the interval between values is expressed in terms of a fixed unit of measure. Interval data are always numerical and can be ranked.
Ex: SAT scores (620, 550, 470), Credit history scores
Ratio Scale
Data exhibiting all the properties of interval data, where the ratio of two values is also meaningful, i.e., there is a meaningful zero value.
Ex: Distance, Height, Price (Sofa from West Elm costing $1600 is twice as expensive as sofa from IKEA costing $800.)
Data Measurement Levels
Ratio/Interval Data: Highest Level (Complete Analysis)
Ordinal Data: Higher Level (Mid-level Analysis)
Nominal Data: Lowest Level (Basic Analysis)
Data Types: Categorical and Numerical Data
Qualitative (Categorical) data. Examples: Marital Status, Political Party, Eye Color (defined categories). Quantitative (Numerical) data are either discrete or continuous.
Discrete
Examples: Number of Children, Defects per hour (counted items)
Continuous
Examples: Weight, Voltage (measured characteristics)
Cross Sectional Data
Data values observed at a single point in time.
Time Series Data
Data collected over several time periods.
Tabular Summary
Frequency, Percent Frequency, etc.
Graphical Summary
Bar Charts, Histograms, etc.
Numerical Summary
Mean, Median, Standard Deviation, etc.
Relative frequency
of a class equals the fraction or proportion of items belonging to a class.
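A minimal sketch of a frequency table, showing how frequency, relative frequency, and percent frequency relate. The eye-color sample and the helper name `frequency_table` are hypothetical, chosen only for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Map each class to (frequency, relative frequency, percent frequency)."""
    counts = Counter(values)
    n = len(values)
    return {label: (c, c / n, 100 * c / n) for label, c in counts.items()}

# Hypothetical categorical sample (not from the flashcards)
eye_colors = ["Blue", "Brown", "Brown", "Grey", "Brown", "Blue", "Brown", "Brown"]
table = frequency_table(eye_colors)

for label, (freq, rel, pct) in table.items():
    print(f"{label:<6} freq={freq} rel={rel:.3f} pct={pct:.1f}%")
```

The relative frequencies of all classes sum to 1, and the percent frequencies sum to 100.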
A bar chart can be used to
summarize categorical data
A useful feature of bar charts is that they can display
multiple issues
A pie chart is
another graphical device for depicting relative frequency, or percent frequency for categorical data.
Line Charts are
effective tools to represent data that are measured over time (e.g., monthly, quarterly, annually)
A Scatter diagram is
a graphical representation of the relationship between two quantitative variables.
A histogram is constructed
by placing the bins on the horizontal axis and the frequency, relative frequency, or percent frequency on the vertical axis.
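The binning step behind a histogram can be sketched as follows. The scores, the bin edges, and the function name `histogram_counts` are assumptions for illustration; the convention that each bin includes its lower edge (and the last bin also includes its upper edge) is one common choice, not the only one.

```python
def histogram_counts(data, bin_edges):
    """Count values per bin. Each bin is [lo, hi); the last bin also includes hi."""
    counts = [0] * (len(bin_edges) - 1)
    for x in data:
        for i in range(len(counts)):
            lo, hi = bin_edges[i], bin_edges[i + 1]
            is_last = i == len(counts) - 1
            if lo <= x < hi or (is_last and x == hi):
                counts[i] += 1
                break  # values outside all bins are simply not counted
    return counts

# Hypothetical test scores and bin edges
scores = [470, 550, 620, 510, 580, 690, 700, 455]
edges = [400, 500, 600, 700]

counts = histogram_counts(scores, edges)
relative = [c / len(scores) for c in counts]

# Text version of the chart: bins on one axis, frequency on the other
for i, c in enumerate(counts):
    print(f"{edges[i]}-{edges[i + 1]}: {'#' * c} ({relative[i]:.3f})")
```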
Mean =
The most common measure of central tendency
Mean = sum of values divided by the number of values
Affected by extreme values (outliers)
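A short sketch of how one extreme value pulls the mean. The prices are hypothetical, and the median is shown only for contrast.

```python
from statistics import mean, median

# Hypothetical sofa prices in dollars
prices = [800, 900, 1000, 1100, 1600]
with_outlier = prices + [20000]  # one extreme value added

mean_before = mean(prices)         # sum of values / number of values
mean_after = mean(with_outlier)
median_before = median(prices)
median_after = median(with_outlier)

print(f"mean:   {mean_before} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
```

Here the single outlier moves the mean by thousands of dollars, while the median shifts only slightly.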
Weighted Mean
The mean of data values that have been weighted according to their relative importance. It is useful in computing the expected value of a random variable.
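A minimal sketch of a weighted mean, assuming the usual formula (sum of value times weight, divided by the sum of the weights). The grade and die examples and the name `weighted_mean` are hypothetical. When the weights are probabilities, the same calculation gives an expected value.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    values, weights = list(values), list(weights)
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical course grades weighted by credit hours
grades = [4.0, 3.0, 2.0]
credits = [3, 4, 1]
gpa = weighted_mean(grades, credits)

# Expected value of a fair six-sided die: weights are the probabilities
die_expected = weighted_mean(range(1, 7), [1 / 6] * 6)

print(gpa, round(die_expected, 6))
```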
A percentile provides information about
how the data are spread over the interval from the smallest value to the largest value.
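One way to compute a percentile is the nearest-rank method, sketched below. This is an assumption made for illustration; textbooks and software often use interpolation-based definitions that can give slightly different values. The scores and the function name are hypothetical.

```python
import math

def percentile_nearest_rank(data, p):
    """Smallest value with at least p percent of the data at or below it."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(data)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based position in sorted data
    return ordered[rank - 1]

# Hypothetical test scores
scores = [470, 550, 620, 510, 580, 690, 700, 455]

p25 = percentile_nearest_rank(scores, 25)
p50 = percentile_nearest_rank(scores, 50)
p90 = percentile_nearest_rank(scores, 90)
print(p25, p50, p90)
```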
Basic Elements of Probability Theory
Experiment
Elementary outcome
Sample Space (S)
Mutually Exclusive =
no overlap between events
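A small sketch tying these terms together, assuming the experiment is one roll of a fair die with equally likely outcomes. The event names and helper functions are hypothetical. For mutually exclusive events, the probability of the union equals the sum of the individual probabilities.

```python
from fractions import Fraction

# Experiment: roll one fair die. Sample space S: all elementary outcomes.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

def mutually_exclusive(a, b):
    """True when the two events share no outcomes."""
    return a.isdisjoint(b)

even, odd, high = {2, 4, 6}, {1, 3, 5}, {5, 6}

print(mutually_exclusive(even, odd), prob(even | odd), prob(even) + prob(odd))
print(mutually_exclusive(even, high), prob(even | high), prob(even) + prob(high))
```

The events `even` and `high` overlap at 6, so adding their probabilities double-counts that outcome.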
Cumulative distribution function (CDF):
The probability a random variable X takes on a value less than or equal to x.
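A minimal sketch of a CDF for a discrete random variable, assuming X is the result of one fair die roll. The names `pmf` and `cdf` are chosen for illustration. The CDF adds up the probabilities of every value less than or equal to x.

```python
from fractions import Fraction

# Probability of each value of X for a fair six-sided die (assumed example)
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def cdf(x):
    """P(X <= x): sum the probabilities of all values not exceeding x."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

for x in (0, 2.5, 3, 6):
    print(f"P(X <= {x}) = {cdf(x)}")
```

The CDF never decreases as x grows; it starts at 0 below the smallest value and reaches 1 at the largest.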