intro to stats Flashcards
what is statistics?
The science of collecting, organising, presenting, analysing and interpreting data
data
Data refers to a set of values for a variable
variable
A variable is any measured characteristic that differs for different subjects
types of variables
categorical and numerical/quantitative
Levels of Measurement (NOIR)
categorical: Nominal Data/ Ordinal Data
numerical: interval data/ ratio data
Nominal Data (categorical)
Names or labels for categories, without any quantitative value, e.g. gender
Ordinal Data (categorical)
Ranked data; typically measures non-numeric concepts, e.g. satisfaction ratings
Interval Data (numerical)
We know the order and also the size of the differences between the values, so values can be added or subtracted, but there is no absolute zero, e.g. temperature in °C
Ratio Data (numerical)
Tells us the order and the size of the difference between values, and has an absolute zero, e.g. weight, income
population
All items or individuals about which you want to reach conclusions
sample
A portion of the population selected for analysis
Why use a sample?
less costly/ less time consuming/ more practical
two types of statistics
descriptive/ inferential
Descriptive Statistics
Used to summarise the main aspects of your dataset using tables, graphs and simple formulae
Inferential Statistics
Methods that use data collected from a sample (small group) to reach conclusions or make predictions or inferences about the population (larger group)
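A minimal Python sketch of the sample-to-population idea. The population here is simulated (10,000 made-up heights), and the seed, sizes and variable names are illustrative assumptions, not part of the cards.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: 10,000 simulated heights in cm
population = [random.gauss(170, 10) for _ in range(10_000)]

# A sample is a portion of the population selected for analysis
sample = random.sample(population, 50)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # estimate made from the sample

print(f"population mean: {population_mean:.1f}")
print(f"sample mean:     {sample_mean:.1f}")
```

The sample mean will rarely match the population mean exactly; inferential methods are about judging how far off it is likely to be.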
two general classes of descriptive statistics
Measures of central tendency/ Measures of variation/dispersion
Central Tendency
Central tendency refers to the tendency of data to cluster around a central point, i.e. the average
The three main measures of central tendency
mean/ median/ mode
The Fulcrum Conceptualization
The mean can be thought of as the balancing point (fulcrum) of a lever: the total distance of the scores below the mean equals the total distance of the scores above it
i.e. the mean is a good central point
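The balancing-point idea can be checked with a few numbers. This is a small sketch with made-up scores: the distances of scores below the mean add up to the same total as the distances of scores above it.

```python
scores = [2, 4, 6, 8, 10]  # made-up example scores

mean = sum(scores) / len(scores)

# Signed distance of each score from the mean
deviations = [x - mean for x in scores]

# Total distance on each side of the fulcrum
total_below = sum(-d for d in deviations if d < 0)
total_above = sum(d for d in deviations if d > 0)

print(mean)         # 6.0
print(total_below)  # 6.0
print(total_above)  # 6.0
```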
mean and central tendency
The mean is a poor measure of central tendency when the dataset contains outliers, because extreme scores pull the mean towards them
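A short illustration with made-up incomes (in thousands), where one extreme value drags the mean far from where most scores sit while the median barely notices it.

```python
import statistics

# Made-up incomes in thousands; 500 is the outlier
incomes = [20, 22, 25, 27, 30, 500]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(mean_income)    # 104
print(median_income)  # 26.0
```

Five of the six scores are 30 or below, so 104 describes nobody in the dataset well.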
median
The median is the middle score in a dataset once the scores are ranked in order
Calculating the median with an even number of scores
Find the average of the two middle ranked values
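The odd and even cases can be sketched as one small function. The function name and the example scores are illustrative; Python's built-in `statistics.median` does the same job.

```python
def median(scores):
    """Return the middle score of a dataset (average of the two middle scores if even)."""
    if not scores:
        raise ValueError("median requires at least one score")
    ranked = sorted(scores)  # rank the scores from lowest to highest
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of scores: take the single middle value
        return ranked[middle]
    # Even number of scores: average the two middle ranked values
    return (ranked[middle - 1] + ranked[middle]) / 2

odd_result = median([7, 3, 9, 1, 5])       # ranked: 1 3 5 7 9
even_result = median([7, 3, 9, 1, 5, 11])  # ranked: 1 3 5 7 9 11

print(odd_result)   # 5
print(even_result)  # 6.0
```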
the mode
The value that appears most frequently in the dataset
nominal data and measures of central tendency
Nominal data – you can only use the mode
ordinal data and measures of central tendency
Ordinal data – you can use the mode or the median
interval and ratio data and measures of central tendency
Interval data and Ratio data – you can use mean, median or mode
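A sketch of which measure fits which level of measurement, using Python's `statistics` module. All datasets and the low/medium/high ranking are made-up assumptions for illustration.

```python
import statistics

# Nominal: labels only, so only the mode makes sense
eye_colour = ["brown", "blue", "brown", "green", "brown"]
nominal_mode = statistics.mode(eye_colour)

# Ordinal: ranked labels, so the median (or the mode) can be used
levels = ["low", "medium", "high"]  # assumed ranking, lowest to highest
ratings = ["low", "high", "medium", "medium", "high"]
ranks = [levels.index(r) for r in ratings]             # convert labels to rank positions
ordinal_median = levels[statistics.median_low(ranks)]  # middle rank, back to a label

# Interval/ratio: numbers with meaningful differences, so mean, median or mode
weights_kg = [60, 65, 65, 70, 90]
ratio_mean = statistics.mean(weights_kg)
ratio_median = statistics.median(weights_kg)
ratio_mode = statistics.mode(weights_kg)

print(nominal_mode)    # brown
print(ordinal_median)  # medium
print(ratio_mean, ratio_median, ratio_mode)  # 70 65 65
```

`median_low` is used for the ordinal case so the result is always one of the actual categories rather than an average of two ranks.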
measures of variation
The Range/ The Interquartile Range/ The Standard Deviation
the range
Range = Max Value – Min Value
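A small sketch of the range, with the interquartile range alongside for comparison. The scores are made up, and quartile conventions differ between textbooks and software; this uses the default method of Python's `statistics.quantiles`.

```python
import statistics

scores = [3, 5, 7, 8, 9, 11, 15]  # made-up example scores

# Range = Max Value - Min Value
score_range = max(scores) - min(scores)

# Quartiles split the ranked scores into four equal parts
q1, q2, q3 = statistics.quantiles(scores, n=4)

# Interquartile range = spread of the middle 50% of scores
iqr = q3 - q1

print(score_range)  # 12
print(q1, q3)       # 5.0 11.0
print(iqr)          # 6.0
```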
what do measures of variation tell us?
Measures of variation tell us how scores are dispersed (distributed) around the central/average score
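the variance
The average squared deviation from the mean: the sum of squares divided by the number of scores (or by n - 1 for a sample)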
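A worked sketch of how the sum of squares turns into the variance, using made-up scores. Dividing by n gives the population variance; dividing by n - 1 gives the sample variance.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example scores
n = len(scores)
mean = sum(scores) / n  # 5.0

# Sum of squares: squared distance of every score from the mean, added up
sum_of_squares = sum((x - mean) ** 2 for x in scores)

population_variance = sum_of_squares / n    # divide by n
sample_variance = sum_of_squares / (n - 1)  # divide by n - 1

print(sum_of_squares)       # 32.0
print(population_variance)  # 4.0

# The standard library agrees with the hand calculation
print(statistics.pvariance(scores) == population_variance)  # True
```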
standard deviation
The standard deviation tells us how much, on average, scores in the dataset deviate from the mean
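A minimal sketch of the calculation with made-up scores: the standard deviation is the square root of the variance, which puts the spread back into the same units as the scores.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example scores
mean = statistics.mean(scores)     # 5

# Squared distance of each score from the mean
squared_deviations = [(x - mean) ** 2 for x in scores]

# Population SD: square root of the average squared deviation
population_sd = math.sqrt(sum(squared_deviations) / len(scores))

# Sample SD (divides by n - 1), as computed by the standard library
sample_sd = statistics.stdev(scores)

print(population_sd)  # 2.0
```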
variability
Describes how spread out the scores in a distribution are
Skewness
when data is not symmetrically distributed
Kurtosis
refers to the steepness/shallowness of the distribution
What does a small SD tell us about the distribution of scores?
Most scores cluster closely around the mean score
What does a big SD tell us about the distribution of scores?
Most scores are spread far from the mean score
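Two made-up datasets with the same mean show the contrast: one clusters tightly around the mean, the other is widely spread.

```python
import statistics

# Both datasets are made up and share a mean of 50
clustered = [49, 50, 50, 50, 51]
spread_out = [10, 30, 50, 70, 90]

small_sd = statistics.pstdev(clustered)
big_sd = statistics.pstdev(spread_out)

print(statistics.mean(clustered), statistics.mean(spread_out))  # 50 50
print(round(small_sd, 2))  # 0.63
print(round(big_sd, 2))    # 28.28
```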
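positive skew
Scores cluster at the low end: the peak sits to the left and the long tail extends to the right
negative skew
Scores cluster at the high end: the peak sits to the right and the long tail extends to the left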
When is skewness a problem?
A skewness value greater than +2 or less than -2 is a problem (a common rule of thumb)
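The cards do not say which skewness statistic the cut-off refers to. The sketch below assumes the simple moment-based coefficient (third central moment divided by the SD cubed); statistics packages often report an adjusted version, so values may differ slightly. The datasets and function names are made up.

```python
def skewness(scores):
    """Moment-based skewness: 0 for symmetric data, > 0 right tail, < 0 left tail."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    return m3 / m2 ** 1.5

def is_problem(value, cutoff=2):
    """Apply the +2 / -2 rule of thumb from the card."""
    return abs(value) > cutoff

symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 2, 2, 3, 10]   # positive skew: long tail of high scores
left_tail = [1, 8, 9, 9, 10, 10]   # negative skew: long tail of low scores

for name, data in [("symmetric", symmetric),
                   ("right tail", right_tail),
                   ("left tail", left_tail)]:
    value = skewness(data)
    print(f"{name}: skewness = {value:.2f}, problem = {is_problem(value)}")
```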
Leptokurtic
steep curve
Mesokurtic
normal curve
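The cards describe kurtosis by the shape of the curve. One common way to put a number on it, assumed here, is moment-based excess kurtosis: roughly 0 for a normal (mesokurtic) curve and above 0 for a leptokurtic one. The datasets are made up or simulated, and the function name is illustrative.

```python
import random

def excess_kurtosis(scores):
    """Fourth central moment over squared variance, minus 3 (so a normal curve is about 0)."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # variance
    m4 = sum((x - mean) ** 4 for x in scores) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

# Made-up leptokurtic data: most scores piled at the centre, a few far out
peaked = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]
peaked_kurtosis = excess_kurtosis(peaked)

# Simulated normal data: should come out close to 0 (mesokurtic)
random.seed(0)
normal_scores = [random.gauss(0, 1) for _ in range(100_000)]
normal_kurtosis = excess_kurtosis(normal_scores)

print(peaked_kurtosis)            # 2.0
print(round(normal_kurtosis, 1))  # close to 0
```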