Data Analysis Flashcards
Def of correlation
A change in one variable is associated with a change in the other variable
Def of causation
A change in one variable is responsible for a change in the other variable
Def of standard deviation
The spread of data around the mean value
(shown as error bars on graphs)
When are results precise (precision)?
Good precision means the result can be reproduced when the experiment is repeated
This means there is little variability between repeats
When is data unreliable (not precise)?
If the graph fluctuates up and down
If repeats are far away from the mean (large error bars)
Advantage of large sample size
- allows you to identify anomalies
- allows you to calculate a more reliable mean
- allows you to reduce the effect of anomalies on the mean
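The last point can be illustrated with a minimal sketch. The readings below are hypothetical (arbitrary units), with 30.0 included as a deliberate anomaly; the same anomaly shifts the mean far less when there are more repeats.

```python
from statistics import mean

# Hypothetical repeat readings; 30.0 is the anomaly in both samples.
small_sample = [10.0, 10.0, 30.0]        # 3 repeats
large_sample = [10.0] * 9 + [30.0]       # 10 repeats

small_mean = mean(small_sample)  # pulled strongly towards the anomaly
large_mean = mean(large_sample)  # anomaly has much less influence

print(round(small_mean, 2), round(large_mean, 2))  # 16.67 12.0
```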
Precision of measurements means…
How sensitive the instrument is
(i.e. how small the increments are that can be measured)
For a valid methodology, consider...
- comparing like to like
This ensures that no other factor affects the results: factors that might affect the outcome need to be controlled
If a factor cannot be controlled, it should be monitored
For a valid comparison you must
- compare two numbers directly
If the sample sizes are not equal, compare percentages instead so that the comparison is valid
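A minimal sketch of why percentages help when group sizes differ. The counts are hypothetical (patients who recovered in two groups of unequal size), and `percentage` is an illustrative helper, not something from the notes.

```python
def percentage(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total

# Hypothetical data: group B has more recoveries in raw numbers (45 vs 30),
# but it is also a much larger group.
group_a = percentage(30, 120)  # 30 of 120 recovered
group_b = percentage(45, 300)  # 45 of 300 recovered

print(f"A: {group_a:.1f}%  B: {group_b:.1f}%")  # A: 25.0%  B: 15.0%
```

Comparing the raw counts would suggest B did better; comparing percentages shows the opposite.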
How do you prevent bias?
Selecting patients/sample sites randomly
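One possible way to pick sample sites at random, sketched under the assumption that the study area is divided into a 10 x 10 grid (the grid size and the number of sites are made up for illustration). A random number generator chooses the coordinates, so the investigator cannot influence which sites are sampled.

```python
import random

# Hypothetical 10 x 10 sampling grid; each site is an (x, y) coordinate.
all_sites = [(x, y) for x in range(10) for y in range(10)]

# random.sample picks without replacement, so no site is chosen twice.
chosen_sites = random.sample(all_sites, k=5)

print(chosen_sites)
```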
Def of Accuracy
+ what do you need for accurate data?
The results are close to the ‘real value’
Requires
+ valid test, reliable data and precise measurements
+ no errors and no bias
When evaluating if data supports conclusion include:
- are the organisms tested the same as those in the conclusion?
- is the sample size large?
- is the test biased?
- are there statistical tests
- how large are error bars
- control variables controlled?
What is normal distribution?
A distribution that, when plotted, gives a smooth, symmetrical curve that peaks in the middle
(most values cluster around the central peak, with fewer values at the extremes on either side)
What is needed for a test to be statistically significant?
- need to be 95% confident that the results are not due to chance
We can never be 100% confident as biological systems show variation and on any day results may be different
Equation of standard deviation
$$S = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$$
N = total number of values
x = each individual value
x̅ = mean
S = standard deviation (SD)
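A minimal worked sketch of the standard deviation calculation, following the steps in the formula on this card (subtract the mean, square, sum, divide by N - 1, take the square root). The data values are hypothetical, and `standard_deviation` is an illustrative helper name.

```python
from math import sqrt

def standard_deviation(values):
    """Sample standard deviation using the N - 1 denominator."""
    n = len(values)                                   # N
    x_bar = sum(values) / n                           # mean
    sum_sq = sum((x - x_bar) ** 2 for x in values)    # sum of (x - mean)^2
    return sqrt(sum_sq / (n - 1))

# Hypothetical data set: mean is 5, sum of squared differences is 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
s = standard_deviation(data)

print(round(s, 2))  # 2.14
```

Python's built-in `statistics.stdev` also divides by N - 1, so it can be used to check a hand calculation.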
What is null hypothesis?
+ how do you confirm results with it?
This is always the opposite of the hypothesis and is stated in the negative:
there is no significant difference or association between the two samples. It assumes that any difference occurs due to chance.
+ a statistical test will tell you whether you can accept or reject this null hypothesis
Accepting = no significant difference between results
Rejecting = significant difference between results
Conclusion if rejecting the null hypothesis
The null hypothesis is rejected. There is a significant difference between the mean values (be specific here). There is a less than 5% probability that these results are due to chance.
Conclusion if accepting the null hypothesis
The null hypothesis is accepted. There is no significant difference between the mean values (be specific here). There is a more than 5% probability that these results are due to chance.
Practice Calc of standard deviation
See notes
What is the Chi-Squared test for?
Establishing whether the difference between observed and expected results is small enough to occur purely due to chance.
It can be used to test the null hypothesis
What are the criteria for a Chi-Squared test?
1- Sample size must be large enough (over 20)
2- use data that falls into discrete categories
3- only raw counts and not percentages or rates can be used
4- observed data and expected needed
Formula for chi-squared Test
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
O = observed value
E = expected value
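A minimal sketch of a chi-squared calculation, assuming the standard form (sum over all categories of (O - E)^2 / E). The counts are hypothetical: 100 offspring from a genetic cross where a 3:1 ratio is expected. The critical value of 3.84 (1 degree of freedom, 5% level) is taken from a standard chi-squared table and is not given on these cards.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical raw counts (not percentages); total sample size is 100.
observed = [70, 30]
expected = [75, 25]   # 3:1 ratio of 100 offspring

chi2 = chi_squared(observed, expected)

# Assumption: 2 categories gives 1 degree of freedom, for which the
# critical value at the 5% level is 3.84.
CRITICAL_VALUE = 3.84
reject_null = chi2 > CRITICAL_VALUE

print(round(chi2, 2), reject_null)  # 1.33 False
```

Here the calculated value is below the critical value, so the null hypothesis is accepted: the difference between observed and expected is small enough to be due to chance.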
What is a positive correlation
When one variable increases so does the other
What is negative correlation
When one variable increases, the other variable decreases
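The direction of a correlation can be checked numerically. The sketch below uses Pearson's correlation coefficient r, which is not covered on these cards and is brought in only for illustration: r close to +1 indicates a positive correlation, r close to -1 a negative one. All data values are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical measurements.
temperature = [10, 15, 20, 25, 30]
rate = [2, 4, 5, 7, 9]               # rises as temperature rises
time_taken = [50, 41, 33, 27, 20]    # falls as temperature rises

r_pos = pearson_r(temperature, rate)        # close to +1
r_neg = pearson_r(temperature, time_taken)  # close to -1

print(round(r_pos, 2), round(r_neg, 2))
```

A strong correlation in either direction still does not show causation.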