CHAPTER 8-STATISTICS Flashcards
The practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample
Statistics
Values that the variables can assume
Data
A characteristic that is observable or measurable in every unit of the universe
Variable
The set of all possible values of a variable
Population
A subgroup of a population
Sample
Words or codes that represent a class or category
Expressed as a categorical attribute (for example, gender, religion, marital status, highest educational attainment)
Qualitative Variables
Numbers that represent an amount or a count.
Numerical data whose sizes are meaningful and that answer questions such as “how many” or “how much”
Examples are height, weight, household size, number of registered cars
Quantitative Variables
What are the classifications of quantitative variables?
Discrete and continuous
Data that can be counted (number of days, number of siblings, usual number of text messages sent in a day)
Discrete Variables
Data that can be measured and can assume any value between two specific values, such as 0.5 or 1.2 (weight, height, body temperature)
Continuous Variables
Data created by assigning observations to independent categories and then counting the frequency of occurrence within each category.
Nominal
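As a small illustration of the nominal card above, the sketch below assigns observations to categories and counts how often each category occurs. The `marital_status` list is made-up example data, not taken from the chapter.

```python
from collections import Counter

# Hypothetical observations of a nominal variable (marital status)
marital_status = ["single", "married", "single", "widowed", "married", "single"]

# Count how many observations fall into each category
frequencies = Counter(marital_status)

# Print the categories from most to least frequent
for category, count in frequencies.most_common():
    print(f"{category}: {count}")
```

The categories have no inherent order; only the counts can be compared.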
A scale in which scores indicate only relative amounts or rank order
Ordinal
A scale in which equal differences in scores represent equal differences in amount of the property measured, but with an arbitrary zero point.
Interval
All the properties of an interval scale, with the additional property that zero indicates a total absence of the property being measured.
Ratio Scale
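One way to see the difference between an interval and a ratio scale is temperature: Celsius has an arbitrary zero (interval), while Kelvin has a true zero (ratio). The sketch below uses two made-up readings; the helper name `celsius_to_kelvin` is introduced here only for illustration.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin, which has a true zero point."""
    return celsius + 273.15

# Hypothetical readings on the Celsius (interval) scale
morning_c = 10.0
noon_c = 20.0

# Differences are meaningful on an interval scale
difference = noon_c - morning_c

# The ratio of Celsius values is 2.0, but noon is not "twice as hot"
celsius_ratio = noon_c / morning_c

# On the Kelvin (ratio) scale the ratio is meaningful, and far smaller
kelvin_ratio = celsius_to_kelvin(noon_c) / celsius_to_kelvin(morning_c)

print(difference, celsius_ratio, round(kelvin_ratio, 3))
```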
Also called spread or dispersion; it refers to how spread out a set of data is.
Variability
The difference between the highest and lowest value in a set.
Range
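A minimal sketch of the range as defined above, using a made-up sample of sibling counts. The function name `value_range` is an assumption chosen for this example.

```python
def value_range(values):
    """Return the difference between the highest and lowest value."""
    return max(values) - min(values)

# Hypothetical sample: number of siblings reported by seven students
siblings = [2, 0, 3, 1, 5, 2, 4]

print(value_range(siblings))  # highest is 5, lowest is 0, so the range is 5
```

Because it depends only on the two extreme values, the range is sensitive to outliers.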
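Quartiles segment any distribution that’s ordered from low to high into four equal parts. This measure is the difference between the third and first quartiles (Q3 − Q1), that is, the spread of the middle half of the data.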
Interquartile Range
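The interquartile range is commonly computed as Q3 − Q1. The sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8 or later) on made-up scores. Several conventions exist for locating quartiles; the `"inclusive"` method used here is only one of them, and other methods can give slightly different values.

```python
import statistics

# Hypothetical scores; quantiles() sorts the data internally
scores = [7, 1, 9, 3, 5, 2, 8, 4, 6]

# n=4 splits the ordered data into four parts and returns the three cut points
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

# Interquartile range: spread of the middle half of the data
iqr = q3 - q1

print(q1, q2, q3, iqr)
```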
A measure of dispersion, meaning it is a measure of how far a set of numbers is spread out from their average value.
Variance
A measure of the amount of variation or dispersion of a set of values.
It is the square root of the variance
Standard Deviation
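The variance and standard deviation cards above can be worked through by hand. The sketch below uses a made-up data set and computes the population variance as the mean of the squared deviations from the mean, then takes its square root. It also shows the sample versions from the standard library, which divide by n − 1 instead of n; the cards do not say which version the chapter intends.

```python
import math
import statistics

# Hypothetical data set
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: average squared distance from the mean
mean = sum(data) / len(data)
squared_deviations = [(x - mean) ** 2 for x in data]
population_variance = sum(squared_deviations) / len(data)

# Standard deviation: square root of the variance
population_sd = math.sqrt(population_variance)

# Sample versions divide by n - 1 rather than n
sample_variance = statistics.variance(data)
sample_sd = statistics.stdev(data)

print(population_variance, population_sd)
print(sample_variance, sample_sd)
```

The standard deviation is in the same units as the data, which usually makes it easier to interpret than the variance.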
A measure of the symmetry of a distribution. The highest point of a distribution is its mode, which marks the value on the x-axis that occurs with the highest probability. A distribution is skewed if the tail on one side of the mode is fatter or longer than the tail on the other side; that is, it is asymmetrical.
Skewness
Statistical measure that defines how heavily the tails of a distribution differ from the tails of a normal distribution. In other words, it identifies whether the tails of a given distribution contain extreme values.
Kurtosis
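Skewness and kurtosis can both be sketched with central moments. The code below uses the moment-based population formulas (third moment divided by the variance to the power 1.5, and fourth moment divided by the squared variance, minus 3 so that a normal distribution scores 0). This is one common convention, not necessarily the one used in the chapter, and the function names and data are made up for illustration.

```python
def central_moment(values, k):
    """Return the k-th central moment: the mean of (x - mean) ** k."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** k for x in values) / len(values)

def skewness(values):
    """Moment-based skewness: 0 for symmetric data, positive for a longer right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution gives 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

# Hypothetical data sets
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]  # one large value stretches the right tail

print(skewness(symmetric), skewness(right_skewed))
print(excess_kurtosis(symmetric), excess_kurtosis(right_skewed))
```

A negative excess kurtosis indicates lighter tails than a normal distribution; a positive value indicates heavier tails with more extreme values.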