Statistics Flashcards
kinds of variables
- categorical (qualitative)
- numerical (quantitative) => discrete (counting), continuous (measuring)
population def and kinds
= set from which the data is collected
finite (everything in real life), infinite (mathematical concepts)
sampling
= selecting the group (sample) from which data is collected
sampling methods
- statistical (selection based on chance)
- non-statistical (selection based on convenience)
statistical sampling methods
- simple random sampling => every possible sample of a specific size has an equal chance of being chosen
- stratified => the population is divided into strata based on the values of interest, then random sampling is done within each stratum
- systematic => first element is selected randomly and then each nth element (sampling interval) is selected
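The three statistical sampling methods above can be sketched in Python. This is a minimal illustration, not a reference implementation: the population (the integers 1 to 100), the even/odd strata, the sample sizes, and all function names are assumptions made up for the example.

```python
import random


def simple_random_sample(population, size, rng):
    # rng.sample draws without replacement; every subset of the
    # given size is equally likely to be chosen.
    return rng.sample(population, size)


def stratified_sample(population, key, per_stratum, rng):
    # Split the population into strata using key(item), then take a
    # simple random sample inside each stratum.
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    return {
        label: rng.sample(members, min(per_stratum, len(members)))
        for label, members in strata.items()
    }


def systematic_sample(population, interval, rng):
    # Pick a random starting position within the first interval,
    # then take every interval-th element after it.
    start = rng.randrange(interval)
    return population[start::interval]


rng = random.Random(0)  # fixed seed so the run is repeatable
population = list(range(1, 101))

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(
    population, lambda x: "even" if x % 2 == 0 else "odd", 5, rng
)
syst = systematic_sample(population, 10, rng)

print(srs)
print(strat)
print(syst)
```

In the systematic sample, consecutive elements always differ by the sampling interval (10 here); only the starting element depends on chance.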
non-statistical sampling methods
- quota sampling => non-random selection of a predetermined number of units
- availability sampling => based on convenience
statistical inference
= using characteristics of a sample to draw conclusions about the population
descriptive statistics
displaying and summarizing data
inferential statistics
choosing a representative sample, drawing conclusions from sample to population, predicting, …
absolute vs relative frequency
absolute = number of times a value (or a value in a class) occurs in the data
relative = absolute frequency / total number of data
cumulative absolute vs relative frequency
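cum. abs. => sum of the absolute frequencies of all values less than or equal to the given value (final = total number of data)
cum. relative => sum of the relative frequencies of all values less than or equal to the given value (final = 100%)
obtained by adding consecutive frequencies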
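A minimal Python sketch of the four kinds of frequency, assuming a small made-up data set (the numbers are for illustration only):

```python
from collections import Counter
from itertools import accumulate

data = [2, 3, 3, 1, 2, 3, 4, 2, 3, 1]  # hypothetical observations
n = len(data)

counts = Counter(data)
values = sorted(counts)                      # distinct values in ascending order

absolute = [counts[v] for v in values]       # how often each value occurs
relative = [f / n for f in absolute]         # share of the total
cum_absolute = list(accumulate(absolute))    # running sum of absolute frequencies
cum_relative = [c / n for c in cum_absolute] # running share of the total

for row in zip(values, absolute, relative, cum_absolute, cum_relative):
    print(row)
```

The last cumulative absolute frequency equals the total number of data (10 here), and the last cumulative relative frequency equals 1.0, i.e. 100%.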
width, class width
width => difference between subsequent grades
class width = beginning of next class - beginning of said class
class boundaries
- upper boundary of previous class = lower boundary of current class
- class width is evident
histogram
- = graphical representation of a frequency distribution table
- used for numerical values
- no spaces between columns
- variable axis has direction
bar graph
represents qualitative data
variable axis has no direction
distribution curve
or ogive
= smoothed histogram, indicating the general shape of the distribution
distribution in a distribution curve
- symmetrical
- skewed to the left (longer tail on the left)
- skewed to the right (longer tail on the right)
modality of a distribution curve
- unimodal => one peak
- bimodal => two peaks
- trimodal => three peaks
height of peaks doesn’t matter
uniform distribution
- tops of the histogram columns form a roughly horizontal straight line
- each class has roughly the same frequency
bell-shaped distribution
unimodal, symmetric distribution
Gaussian (normal) curve
approximation of central tendencies on a distribution curve
- mode => x-coordinate of the highest peak
- median => x-coordinate of the vertical line that halves the area of the distribution curve
- mean => symmetric distribution: coincides with the median and mode at the center of the distribution; skewed distribution: lies on the tail side of the median
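A small Python check of how a tail shifts the mean relative to the median. The data set is an assumption made up for illustration: mostly small values with one large value forming a tail on the right.

```python
import statistics

# Hypothetical right-skewed data: one large value stretches the right tail.
skewed = [1, 2, 2, 3, 3, 3, 4, 20]

skew_mode = statistics.mode(skewed)      # most frequent value
skew_median = statistics.median(skewed)  # middle of the sorted data
skew_mean = statistics.mean(skewed)      # pulled toward the tail

print(skew_mode, skew_median, skew_mean)
```

Here the mean (4.75) lies to the right of the median (3.0), on the same side as the tail.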
measures of central tendency
- mode = value with the highest frequency
- median = value in the middle position of the data in ascending order (mean of the two middle values when the number of data is even)
- arithmetic mean (average) = sum of all data/number of data
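The three measures can be computed with Python's standard `statistics` module. A minimal sketch, assuming a made-up data set with an even number of values:

```python
import statistics

data = [4, 1, 3, 7, 5, 2, 3, 6]  # hypothetical observations

mode = statistics.mode(data)      # value with the highest frequency
median = statistics.median(data)  # sorted: 1 2 3 3 4 5 6 7, middle pair is 3 and 4
mean = statistics.mean(data)      # sum of all data / number of data = 31 / 8

print(mode, median, mean)
```

Because the data set has an even number of values, the median (3.5) is the mean of the two middle values and is not itself one of the data.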
modal class
class with the most elements
average of data in k classes with mid-interval values m1, m2, …, mk
(m1f1 + m2f2 + … + mkfk)/n
where fi = frequency of class i and n = f1 + f2 + … + fk
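A minimal Python sketch of the grouped-data mean and the modal class. The four classes, their mid-interval values, and their frequencies are assumptions made up for illustration.

```python
# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with their frequencies.
midpoints = [5, 15, 25, 35]   # mid-interval values m1..mk
frequencies = [2, 5, 8, 5]    # class frequencies f1..fk

n = sum(frequencies)          # total number of data

# Weighted sum of mid-interval values divided by the total number of data.
grouped_mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n

# Modal class: the class with the most elements.
modal_index = frequencies.index(max(frequencies))
modal_midpoint = midpoints[modal_index]

print(n, grouped_mean, modal_midpoint)
```

The result (23.0) is only an estimate of the true mean, since every value in a class is treated as if it were equal to the mid-interval value.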