Statistics Flashcards
data analysis
relies on data collection
experimental design
helps design experiments, determines sample size, increases accuracy
hypothesis statement
provides ways to test whether a hypothesised result could have occurred by chance
predictive modelling
used to predict unknown or future values from existing data
risk assessment
helps assess hazards
decision making
helps make decisions
consider 4 things when describing/summarising data
centrality, dispersion, replication, shape
centrality
average
dispersion
how spread out values are from average
replication
how many samples
shape
how evenly spread data is
measure of centrality/central tendency
compares differences between datasets
mean equation
$\bar{x} = \frac{\sum x}{n}$
median equation
position of the median in the sorted data: $\frac{1}{2}(n+1)$
n
how many samples there are (not sum)
x
each individual data value (so $\sum x$ is the sum of all values)
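A minimal Python sketch of the mean and median cards, assuming a small made-up sample. The names `data`, `mean` and `median` are illustrative and not from the cards.

```python
import math

def mean(values):
    # x-bar = (sum of x) / n
    return sum(values) / len(values)

def median(values):
    ordered = sorted(values)
    n = len(ordered)
    position = (n + 1) / 2  # 1-based position of the median
    lower = ordered[math.floor(position) - 1]
    upper = ordered[math.ceil(position) - 1]
    # odd n: lower and upper are the same value
    # even n: average of the two middle values
    return (lower + upper) / 2

data = [2, 4, 4, 5, 7, 9, 11]  # made-up sample, n = 7
print(mean(data))    # 6.0
print(median(data))  # 5.0
```

With an even number of values the position falls between two values, e.g. `median([1, 2, 3, 4])` averages the 2nd and 3rd values.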
measure of dispersion
standard deviation ($s$), variance ($s^2$)
standard deviation
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$
variance
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$; taking the square root of the variance gives the standard deviation
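A short Python sketch of variance and standard deviation, assuming the sample form that divides by n - 1. The function names and the sample are made up for illustration.

```python
import math

def sample_variance(values):
    n = len(values)
    x_bar = sum(values) / n
    # sum of squared deviations from the mean, divided by n - 1
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

def sample_std_dev(values):
    # standard deviation is the square root of the variance
    return math.sqrt(sample_variance(values))

data = [2, 4, 4, 5, 7, 9, 11]  # made-up sample
print(sample_variance(data))           # 10.0
print(round(sample_std_dev(data), 3))  # 3.162
```

Dividing by n instead of n - 1 would give the population versions, which are slightly smaller for the same data.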
what does interquartile range measure
spread of the middle 50% of the data; useful for data that's not normally distributed
minimum, lower quartile, median, upper quartile, maximum
0%, 25% (1st quartile), 50% (2nd quartile), 75% (3rd quartile), 100%
work out interquartile range
Q3 - Q1
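A sketch of the quartiles and interquartile range using Python's standard `statistics.quantiles`. Quartile conventions differ between textbooks and tools, so the default method used here is an assumption and other methods can give slightly different quartiles. The sample is made up.

```python
import statistics

data = sorted([2, 4, 4, 5, 7, 9, 11])  # made-up sample

# n=4 splits the data into four equal parts and returns the three cut points
q1, q2, q3 = statistics.quantiles(data, n=4)

five_number_summary = (data[0], q1, q2, q3, data[-1])
iqr = q3 - q1

print(five_number_summary)
print(iqr)  # 5.0
```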
confidence interval
range of values around the sample mean expected to contain the true mean at a stated confidence level (e.g. 95%)
mean absolute deviation
average absolute distance of the values from the mean
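A minimal Python sketch of mean absolute deviation, assuming it is taken around the mean and divided by n. The function name and sample are illustrative.

```python
def mean_absolute_deviation(values):
    n = len(values)
    x_bar = sum(values) / n
    # average of the absolute distances from the mean
    return sum(abs(x - x_bar) for x in values) / n

data = [2, 4, 4, 5, 7, 9, 11]  # made-up sample
print(round(mean_absolute_deviation(data), 3))  # 2.571
```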
qualitative data
ordinal, nominal
quantitative data
discrete, continuous
datum/cases
single value
what makes histogram
grouped frequency table
histogram
shows the spread and location of the data
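A sketch of how a grouped frequency table turns into a histogram, assuming integer values and an integer class width. The helper `grouped_frequency_table` and the class width of 5 are made up for illustration.

```python
from collections import Counter

def grouped_frequency_table(values, class_width):
    # map each value to the lower bound of its class, e.g. 7 -> 5 for width 5
    counts = Counter((v // class_width) * class_width for v in values)
    table = []
    # include empty classes so gaps in the data stay visible
    for lower in range(min(counts), max(counts) + class_width, class_width):
        table.append((lower, lower + class_width, counts[lower]))
    return table

data = [2, 4, 4, 5, 7, 9, 11]  # made-up sample
for lower, upper, frequency in grouped_frequency_table(data, class_width=5):
    # one bar per class; bar length is the class frequency
    print(f"{lower:>2} to <{upper:<2} | {'#' * frequency}")
```

Each row of the table is one class (lower bound, upper bound, frequency), and each printed bar is that class drawn as text.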
measure of spread
range/quartiles/standard deviation/variance
measure of location
mean/median/mode
right/left skewed
mean pulled to the right/left of the median by a long tail on that side
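A rough Python sketch of the skew card: compare the mean with the median. This is only a rule of thumb, not a formal skewness statistic, and it can mislead for some datasets. The function name and samples are made up.

```python
import statistics

def skew_direction(values):
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right"      # long tail of large values pulls the mean up
    if mean < median:
        return "left"       # long tail of small values pulls the mean down
    return "symmetric"

print(skew_direction([1, 2, 2, 3, 3, 3, 4, 20]))         # right
print(skew_direction([1, 17, 18, 18, 19, 19, 19, 20]))   # left
```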
function
describes relationship between input/output
input, output
x, f(x)
random variable
numerical outcome of a random event; discrete or continuous; described by functions
probability distribution
discrete, lists all random variable values and probability
probability mass function
discrete; each probability is between 0 and 1; probabilities of all outcomes sum to 1
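A small Python sketch of a probability mass function, assuming a fair six-sided die as the example. The dictionary `pmf` and the helper `is_valid_pmf` are illustrative names.

```python
from fractions import Fraction

# each face of a fair die has probability 1/6
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def is_valid_pmf(pmf):
    # every probability lies between 0 and 1, and they sum to exactly 1
    in_range = all(0 <= p <= 1 for p in pmf.values())
    return in_range and sum(pmf.values()) == 1

print(is_valid_pmf(pmf))  # True

# probability of an event = sum of the probabilities of its outcomes
p_even = sum(p for face, p in pmf.items() if face % 2 == 0)
print(p_even)  # 1/2
```

Fractions are used instead of floats so the sum is exactly 1 with no rounding error.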
probability density function
continuous; outputs are greater than or equal to 0; probability is found from the area under the curve
area under curve sums to
1
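A sketch of "probability is area under the curve", assuming a standard normal distribution as the example density and the trapezium rule as the numerical method. The helper `area_under_curve` and the limits of -8 to 8 are illustrative choices.

```python
from statistics import NormalDist

def area_under_curve(pdf, a, b, steps=10_000):
    # trapezium rule: approximate the area between a and b with thin strips
    width = (b - a) / steps
    total = (pdf(a) + pdf(b)) / 2
    for i in range(1, steps):
        total += pdf(a + i * width)
    return total * width

standard_normal = NormalDist(mu=0, sigma=1)

# almost all of the density lies between -8 and 8, so the area is close to 1
total_area = area_under_curve(standard_normal.pdf, -8, 8)
# probability that a value falls within one standard deviation of the mean
middle_area = area_under_curve(standard_normal.pdf, -1, 1)

print(round(total_area, 4))   # 1.0
print(round(middle_area, 4))  # 0.6827
```

The density itself is not a probability; only areas between two values are.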