6.1 Stats Flashcards
6.1.1. Outline that error bars are a graphical representation of the variability of data.
The distribution of data values is often represented by plotting a single data point for the mean and adding error bars to show how the data are spread around it.
Error bars are a graphical representation of the variability of the data.
6.1.2. Calculate the mean and standard deviation of a set of values.
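As a minimal sketch of this calculation, the Python below works out the mean and the sample standard deviation (dividing by n - 1) step by step. The function names and the leaf-length values are made up for illustration, and using the sample form rather than the population form is an assumption.

```python
import math

def mean(values):
    # Sum of all values divided by how many there are.
    return sum(values) / len(values)

def sample_std_dev(values):
    # Square each deviation from the mean, add them up,
    # divide by (n - 1), then take the square root.
    m = mean(values)
    squared_deviations = [(x - m) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (len(values) - 1))

# Hypothetical data set (illustration only).
leaf_lengths_mm = [4.0, 6.0, 8.0, 10.0, 12.0]

print("mean:", mean(leaf_lengths_mm))
print("standard deviation:", round(sample_std_dev(leaf_lengths_mm), 2))
```

For these values the mean is 8.0 and the standard deviation is the square root of 10, about 3.16.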
6.1.3. State that the statistic standard deviation is used to summarize the spread of values around the mean.
For normally distributed data, about 68% of all values lie within ±1 standard deviation of the mean, about 95% lie within ±2 standard deviations, and about 99.7% lie within ±3 standard deviations.
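These percentages can be checked with a short sketch using `NormalDist` from Python's standard `statistics` module. The helper name `fraction_within` is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def fraction_within(k):
    # Fraction of values lying within k standard deviations of the mean.
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} SD: {fraction_within(k):.1%}")
```

This prints roughly 68.3%, 95.4% and 99.7%, matching the rounded figures above.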
6.1.4. Explain how the standard deviation is useful for comparing the means and the spread of data between two or more samples.
- A small standard deviation indicates that the data is clustered closely around the mean value.
- Conversely, a large standard deviation indicates a wider spread around the mean.
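To illustrate the two points above, here is a small sketch comparing two made-up samples that share the same mean but differ in spread. The sample values are hypothetical.

```python
from statistics import mean, stdev

# Two hypothetical samples with the same mean (10) but different spread.
sample_a = [9, 10, 10, 10, 11]   # clustered closely around the mean
sample_b = [2, 6, 10, 14, 18]    # spread widely around the mean

for name, sample in (("A", sample_a), ("B", sample_b)):
    print(f"sample {name}: mean = {mean(sample):.1f}, SD = {stdev(sample):.2f}")
```

Both means are 10, so the mean alone cannot tell the samples apart. The standard deviations (about 0.71 and 6.32) show that sample B is far more variable.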
6.1.5. Outline the meaning of the coefficient of variation.
The coefficient of variation is the ratio of the standard deviation to the mean expressed as a percentage.
CV = (standard deviation / mean) × 100
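A minimal sketch of the formula in Python follows. The function name and the two example data sets are assumptions chosen to show why a percentage is handy when the means differ a lot.

```python
def coefficient_of_variation(std_dev, mean):
    # Standard deviation as a percentage of the mean.
    return (std_dev / mean) * 100

# Hypothetical summary statistics (illustration only).
cv_small_species = coefficient_of_variation(std_dev=5.0, mean=50.0)
cv_large_species = coefficient_of_variation(std_dev=10.0, mean=200.0)

print(f"small species CV: {cv_small_species:.1f}%")
print(f"large species CV: {cv_large_species:.1f}%")
```

Here the second data set has the larger standard deviation (10 versus 5) but the smaller CV (5% versus 10%), so relative to its mean it is actually less variable.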
6.1.6. Deduce the significance of the difference between two sets of data using calculated values for t and the appropriate tables.
The t-test can be used to determine whether there is a significant difference between the means of two populations.
- For example, if you measure the weight of the inhabitants of two islands, the t-test formula will work out whether there is a significant difference, based on the difference between the two means and the degree of variation within each group.
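The island example can be sketched as follows. This assumes the equal-variance (pooled) form of the two-sample t-test, which the notes do not specify, and the weights are invented. The critical value 2.306 is the standard table entry for 8 degrees of freedom at p = 0.05 (two-tailed).

```python
import math
from statistics import mean, variance

def pooled_t_statistic(sample_1, sample_2):
    # Two-sample t statistic, assuming both groups have equal variance.
    n1, n2 = len(sample_1), len(sample_2)
    degrees_of_freedom = n1 + n2 - 2
    pooled_variance = (
        (n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)
    ) / degrees_of_freedom
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = (mean(sample_1) - mean(sample_2)) / standard_error
    return t, degrees_of_freedom

# Hypothetical weights in kg (illustration only).
island_a_kg = [60, 62, 64, 66, 68]
island_b_kg = [70, 72, 74, 76, 78]

t_value, df = pooled_t_statistic(island_a_kg, island_b_kg)

# Critical t from a t-table: df = 8, p = 0.05, two-tailed.
CRITICAL_T = 2.306
significant = abs(t_value) > CRITICAL_T

print(f"t = {t_value:.2f}, df = {df}, significant at p = 0.05: {significant}")
```

The calculated t has a magnitude of 5, which exceeds the table value of 2.306, so for this made-up data the difference between the means would be judged significant at the 5% level.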
6.1.7. Explain that the existence of a correlation does not establish that there is a causal relationship between two variables.
- Correlation describes the extent to which two variables are related. (Correlation analysis tells us whether or not there is an association between two variables.)
- It asks: are the two variables related in such a way that random chance cannot account for the relationship?
- Being able to determine mathematically how strongly two variables are related does not mean that correlation can be used to validate a cause-and-effect relationship between them.
Explain the 3 (main) outcomes possible when plotting correlation scatter graphs.
What is the Pearson correlation coefficient?
The “R” value.
It varies from +1 (perfect positive correlation) through 0 (no correlation) to –1 (perfect negative correlation).
The larger the magnitude of the “R” value (positive or negative), the stronger, or more significant, the correlation. The closer the value is to 0, the weaker the correlation.
As a rough guide, magnitudes greater than 0.8 are probably very significant, magnitudes between 0.5 and 0.8 are probably significant, and magnitudes less than 0.5 are probably insignificant.
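A minimal sketch of how the coefficient is calculated, using only the Python standard library. The function name `pearson_r` and the data are assumptions for illustration.

```python
import math

def pearson_r(xs, ys):
    # Covariation of x and y divided by the product of their spreads.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical paired measurements (illustration only).
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r = pearson_r(x_values, y_values)
print(f"r = {r:.2f}")
```

For this data r is about 0.77, a positive correlation that falls in the 0.5 to 0.8 band of the rough guide above.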
What is R²?
R² (also written r²) is the coefficient of determination obtained from a linear regression.
Linear regression describes how an independent variable is numerically related to the dependent variable; r² measures how well that straight-line relationship fits the data.
- The value r² is a fraction between 0.0 and 1.0, and has no units.
- An r² value of 0.0 means that knowing X does not help you predict Y. There is no linear relationship between X and Y.
- When r² equals 1.0, all points lie exactly on a straight line with no scatter. Knowing X lets you predict Y perfectly.
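As a sketch of where this value comes from, the code below fits a least-squares straight line and computes r² as one minus the ratio of the leftover scatter around the line to the total scatter around the mean of Y. The function name and the data are assumptions for illustration.

```python
def linear_fit_r_squared(xs, ys):
    # Least-squares line y = slope * x + intercept, plus its r-squared.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    # Scatter left over after fitting the line.
    ss_residual = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    # Total scatter of Y around its own mean.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_residual / ss_total

# Hypothetical paired measurements (illustration only).
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

slope, intercept, r_squared = linear_fit_r_squared(x_values, y_values)
print(f"y = {slope:.1f}x + {intercept:.1f}, r squared = {r_squared:.2f}")
```

For this data the fitted line is y = 0.6x + 2.2 with r² = 0.60, meaning the straight line accounts for about 60% of the variation in Y.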
What is the difference between R² and the R value?
What is “Uncertainty of measurement”?
This needs to be included in your IA data.
Uncertainty of measurement is the doubt that exists about the result of any measurement. You might think that well-made rulers, clocks and thermometers should be trustworthy and give the right answers. But for every measurement, even the most careful, there is always a margin of doubt. In everyday speech, this might be expressed as ‘give or take’, e.g. a stick might be two meters long ‘give or take a centimeter’. A measurement result is only complete if it is accompanied by a statement of the uncertainty in the measurement.
Error is the difference between the measured value and the ‘true value’ of the thing being measured.
Uncertainty is a quantification of the doubt about the measurement result.
What is a Dependent Variable?
The output variable that you monitor to see whether or not it is affected. (Measured variable / explained variable)
What is an Independent Variable?
The one variable that is changed by the scientist in a controlled study. (Manipulated variable)
What is a Controlled Variable?
Variables that the researcher or scientist will do their best to “control”.
Control means designing a method so that none of the identified controlled variables affect the results.
The scientist wants only the independent variable to have an effect on the dependent variable.