Data Analysis: Hypothesis testing and comparing means Flashcards
What is data?
The actual pieces of information you collect in your study
What is a variable?
A measurement which varies between subjects, e.g. height or gender (not constant)
How can data be classified?
Into 2 types:
Categorical or numerical
Categorical: can be sorted into groups or categories, use bar charts and pie charts to represent
Can be further split up into:
- Nominal values: you can count but not order or measure, e.g. sex and eye colour
- Ordinal values: you can count and order but not measure, e.g. house numbers and swimming level
How do populations and samples relate to one another?
If your sample is chosen correctly, the sample data can represent the whole population and can be used to draw inferences about it
What is point estimation?
Where the sample data is used to estimate the parameters of a population
Statistics - calculated using sample data
Parameters - characteristics of population data
How do we choose which average and measure of spread to use?
1 - First look at the type of data you have (numerical or categorical)
2 - If numerical:
- for normally distributed data, use the mean as the average and the standard deviation as the spread
- for skewed data, use the median as the average and the IQR (interquartile range) as the spread
3 - If categorical:
- for ordinal data, use the median (IQR)
- for nominal data, use the mode (no measure of spread) - rare
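The choices above can be sketched in Python using only the standard library. This is an illustrative sketch, not a fixed recipe: the function name `summarise` and the `kind` labels are made up for this example, ordinal values are assumed to be coded as numbers (e.g. swimming level 1 to 5), and `statistics.quantiles` (Python 3.8+) uses one of several quartile conventions, so other software may report a slightly different IQR.

```python
import statistics


def summarise(values, kind):
    """Return (average, spread) for the data.

    `kind` is a hypothetical label for this sketch:
    "normal", "skewed", "ordinal" or "nominal".
    """
    if kind == "normal":
        # Normally distributed numerical data: mean and standard deviation
        return statistics.mean(values), statistics.stdev(values)
    if kind in ("skewed", "ordinal"):
        # Skewed numerical data or ordinal data: median and IQR
        q1, _, q3 = statistics.quantiles(values, n=4)
        return statistics.median(values), q3 - q1
    if kind == "nominal":
        # Nominal data: mode only, no measure of spread
        return statistics.mode(values), None
    raise ValueError("unknown kind: " + repr(kind))


if __name__ == "__main__":
    print(summarise([2, 4, 6], "normal"))
    print(summarise([1, 2, 3, 4, 5, 6, 7], "skewed"))
    print(summarise(["blue", "brown", "blue"], "nominal"))
```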
What is hypothesis testing?
A way for you to test the results of a survey or experiment to see if you have meaningful results
You are testing to see if your results are valid or if they are due to chance
If they are due to chance, your experiment won't be repeatable and will be of little use
Objective way of making decisions or inferences from sample data
What are the two hypotheses you can have?
Null hypothesis - H0
- assumes that there is no difference/effect/relationship
Research (alternative) hypothesis - HA
- assumes that there is a difference/effect/relationship
What are the types of error?
Type 1 - where there is no real difference but the study reports one (the null hypothesis is wrongly rejected; a false positive)
Type 2 - where there is a real difference but the study fails to detect it (the null hypothesis is wrongly not rejected; a false negative)
Which one is worse depends on the scenario - consider risks of each error
What test do we use to compare means?
t-tests
What are the types of t-tests and when do we use them?
Paired t-test - used for paired data, i.e. when we study the same individuals at two different times or under two different conditions
Independent samples t-test - used when data is collected from two separate groups
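A minimal sketch of how the two t statistics are calculated, using only the Python standard library. The function names are made up for this example. The independent version shown here is the classic pooled-variance (Student's) form, which additionally assumes the two groups have equal variances; that is an assumption of this sketch, not something stated on the card. Each function returns the t statistic and its degrees of freedom, which would then be compared against a t table (or passed to statistical software) to get a p-value.

```python
import math
import statistics


def paired_t(before, after):
    """Paired t-test statistic: works on the per-subject differences."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # Standard error of the mean difference
    se = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / se, n - 1


def independent_t(group_a, group_b):
    """Independent samples t statistic (pooled variance, equal variances assumed)."""
    na, nb = len(group_a), len(group_b)
    va, vb = statistics.variance(group_a), statistics.variance(group_b)
    # Pooled estimate of the common variance
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / se
    return t, na + nb - 2


if __name__ == "__main__":
    # Same four subjects measured twice (made-up numbers)
    print(paired_t([10, 12, 14, 16], [11, 14, 15, 18]))
    # Two separate groups (made-up numbers)
    print(independent_t([1, 2, 3], [4, 5, 6]))
```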
What does t-test assume?
Assumes the data are normally distributed (for a paired test, the paired differences)
How can we check to see if assumptions are met in t-tests and what tests do we carry out if they aren’t?
For an independent samples t-test, check using histograms of the data by group. If the data are not normally distributed, use the Mann-Whitney test (non-parametric)
For a paired t-test, check using a histogram of the paired differences. If they are not normally distributed, use the Wilcoxon signed rank test (non-parametric)
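To show what the two non-parametric alternatives work with, here is a rough standard-library sketch of their test statistics. The function names are made up for this example, and the tie handling (ties count as half in Mann-Whitney, tied ranks are averaged and zero differences dropped in Wilcoxon) follows common textbook conventions, which is an assumption here. In practice the smaller value of each returned pair is usually the one compared against a critical value table; statistical software does this step and gives the p-value.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b): U_a counts pairs where a > b, ties count as half."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    diffs = [a - b for a, b in zip(after, before) if a != b]  # drop zeros
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


if __name__ == "__main__":
    print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
    print(wilcoxon_signed_rank([5, 5, 5], [8, 4, 7]))
```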
What is ANOVA and what are the types of ANOVA?
ANalysis Of Variance
2 MAIN TYPES:
ONE WAY - when you want to test two or more groups to see if there's a difference between their means
TWO WAY (with or without replication):
- Without replication - when you have one group and you're double testing that same group (e.g. one group before and after medication)
- With replication - when you have two groups and the members of those groups are doing more than one thing (e.g. two groups of patients from different hospitals trying two different therapies)
What distribution do we use for one way ANOVA?
The F-distribution: one way ANOVA compares the means of two or more independent groups using it
Looks at all the data in the groups together
- looks at the variance within the groups, then at the overall variation between the groups (the F statistic is the ratio of between-group to within-group variance)
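The within-group and between-group comparison can be sketched as a short calculation of the one way ANOVA F statistic. This uses only the Python standard library; the function name is made up for this example and the numbers are invented. It returns F with its two degrees of freedom, which would then be looked up against the F-distribution to get a p-value.

```python
import statistics


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for two or more independent groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Variation between the groups: group means around the grand mean
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation within the groups: values around their own group mean
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )

    df_between = k - 1
    df_within = n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


if __name__ == "__main__":
    print(one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7]))
```

With exactly two groups, F equals the square of the pooled-variance independent samples t statistic, so the two tests agree in that case.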