Evaluating Information Flashcards
What are statistics?
The science of collecting, organizing and interpreting data
The data that describes or summarizes something
What is a Statistical Population (referred to as population hereafter)?
The complete set of values
e.g. the heights of all adults in the human population, measured to the nearest cm
What is meant by sample?
A subset of the population
Used when it is impractical to measure the entire population (which is most of the time)
What are the Population Parameters?
The characteristics of the statistical population
What are Sample Statistics?
A summary of the data gathered from the sample
What is important about choosing a “Sample”?
MOST important step in any statistical analysis!
You want your sample to be representative of the population so that you can make inferences about the population
What is Sampling Bias?
When a study's design or conduct tends to favour certain results.
Examples of Bias:
- Choosing a sample from the basketball team to determine average height.
- Researcher has a personal stake in the outcome.
What is selection bias?
Researcher selects the participants
What is participation bias?
Participants volunteer to participate
What are 4 common sampling methods?
Simple Random
Systematic
Convenience
Stratified
What is a Simple Random Sample?
Every member of the population has an equal chance of being chosen
What is a Systematic Sample?
use a simple system (such as every 10th member of the population)
What is a Convenience Sample?
A sample is chosen because it is convenient (e.g. the students of a particular class)
What is a Stratified Sample?
Chosen from different subgroups, or strata, of the population. Within each stratum, simple random sampling is done.
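The four sampling methods can be sketched in a few lines of Python. This is only an illustration: the population of ID numbers, the two strata names and the sample sizes are all made up, and the "convenience" sample simply takes the first members listed.

```python
import random

# Hypothetical population: ID numbers 1..100, split into two made-up strata.
population = list(range(1, 101))
strata = {"juniors": population[:60], "seniors": population[60:]}

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simple random sample: every member has an equal chance of being chosen.
simple_random = rng.sample(population, 10)

# Systematic sample: every 10th member, starting from a random offset.
start = rng.randrange(10)
systematic = population[start::10]

# Convenience sample: whoever is easiest to reach (here, the first 10 listed).
convenience = population[:10]

# Stratified sample: a simple random sample drawn within each stratum.
stratified = {name: rng.sample(members, 5) for name, members in strata.items()}
```

Only the simple random and stratified samples rely on chance for every pick, which is why they are usually the better choice when a representative sample is needed.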
What is a placebo?
An inactive treatment (one with no active ingredient) given so that participants believe they are receiving the real treatment.
What is a placebo effect?
A perceived or actual improvement in a condition even though the person was given no actual treatment to benefit them (they were given a placebo)
What is a Single-Blind Study?
Participants do not know whether they are in the treatment or control (placebo) group
What is a Double-Blind Study?
Neither participants nor the researchers collecting the data know who is in the treatment group and who is in the control (placebo) group
What is a Case-Control Study?
An observational study that resembles an experiment because the sample naturally divides into two or more groups.
Cases
participants who engage in the behaviour under study (of their own choice) and hence they are like a treatment group.
Control
participants who do not engage in the behaviour under study and hence they are like a control group.
Give an example of a Case-Control Study.
Marijuana use and young adulthood problems
Tobacco smoking and lung cancer
What are studies based on people answering questions called?
Surveys and Opinion Polls
- Most common
- must be interpreted carefully
- margin of error is used
What is a margin of error?
The range around a survey result within which the true population value is likely to lie (e.g. 52% plus or minus 3%). It allows for random sampling variation, not for bias or miscalculation.
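For a poll that reports a percentage, the margin of error is often approximated with a standard formula for a sample proportion. The cards do not give a formula, so the sketch below is an assumption: it presumes a simple random sample and about 95% confidence (z = 1.96), and the function name and example numbers are hypothetical.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample; z=1.96 corresponds to roughly
    95% confidence. p_hat is the observed proportion, n the sample size.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical poll: 1,000 respondents, 50% answer "yes".
moe = margin_of_error(0.5, 1000)
print(f"margin of error: about {moe * 100:.1f} percentage points")
```

Larger samples shrink the margin of error, but only in proportion to the square root of the sample size.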
Name 4 descriptive statistics.
Sample size
Mean
Median
Mode
What is “Sample Size”?
The number of values “n”
What is the “Mean”?
The average
What is the “Median”?
The middle value
What is the “Mode”?
The most common value
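As a quick check of the four descriptive statistics, Python's standard `statistics` module can compute each one. The five values below are made up purely for illustration.

```python
import statistics

# Hypothetical sample values, made up for illustration.
values = [2, 3, 3, 5, 7]

n = len(values)                      # sample size: the number of values
mean = statistics.mean(values)       # the average: (2+3+3+5+7) / 5
median = statistics.median(values)   # the middle value once sorted
mode = statistics.mode(values)       # the most common value

print(n, mean, median, mode)
```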
Name the 3 Measures of Variability.
Variance
Standard Deviation
Standard Error
What is meant by Variance?
Can be viewed as the “spread” of data around the mean
The average of the squared differences from the Mean
What is Standard Deviation?
A statistic that tells you how tightly all the various examples are clustered around the mean in a set of data. When the examples are pretty tightly bunched together and the bell-shaped curve is steep, the standard deviation is small. When the examples are spread apart and the bell curve is relatively flat, that tells you you have a relatively large standard deviation.
How do you calculate Standard Deviation?
Work out the Mean (average the numbers)
Then for each number subtract the Mean and square the result (the squared difference)
Then work out the average of those squared differences (this is the Variance)
Then take the square root of the Variance
Calculate the Mean, the Variance and the Standard Deviation for the following:
You have 5 dogs, their heights are 600mm, 470mm, 170 mm, 430 mm, and 300mm.
To calculate Mean:
600+470+170+430+300 = 1,970, divided by 5 (“n”, the number of values) = 394 mm
To calculate the Variance:
Subtract the Mean from each of the original heights:
600-394 = 206
470-394 = 76
170-394 = -224
430-394 = 36
300-394 = -94
Square each of these differences:
206 squared is 42,436
76 squared is 5,776
-224 squared is 50,176
36 squared is 1,296
-94 squared is 8,836
Add them together; the total is 108,520. Divide this by “n”, the number of values (5), to get the Variance
108,520/5 = 21,704 mm² is the Variance
To calculate Standard Deviation take the square root of the Variance:
square root of 21,704 = 147.32 mm
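The arithmetic for the five dogs can be checked with a short Python sketch. It follows the same steps by hand and then compares against `statistics.pvariance` and `statistics.pstdev`, which treat the data as the whole population (dividing by n). The variable names are only illustrative.

```python
import math
import statistics

heights_mm = [600, 470, 170, 430, 300]

n = len(heights_mm)
mean = sum(heights_mm) / n                       # step 1: the Mean
squared_diffs = [(h - mean) ** 2 for h in heights_mm]  # step 2: squared differences
variance = sum(squared_diffs) / n                # step 3: average them (divide by n)
std_dev = math.sqrt(variance)                    # step 4: square root

# Cross-check against the standard library's population formulas.
assert variance == statistics.pvariance(heights_mm)
assert math.isclose(std_dev, statistics.pstdev(heights_mm))

print(f"mean={mean} mm, variance={variance} mm^2, sd={std_dev:.2f} mm")
```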
If the five dogs in the example are a sample of a larger population how do the calculations change?
Calculating Mean:
doesn’t change
Calculating Sample Variance:
When you have “n” data values that are a SAMPLE of the population you:
divide by n-1
So in the previous example 108,520 is divided by 4 (n-1, or 5-1 = 4) to equal 27,130 mm²
Calculating Sample Standard Deviation:
take the square root of the Sample Variance: square root of 27,130 = 164.71 mm (about 165 mm)
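The sample versions can be checked the same way. In Python's `statistics` module, `variance` and `stdev` divide by n-1, unlike `pvariance` and `pstdev`. The last line also computes the Standard Error of the mean as the sample standard deviation divided by the square root of n; the cards name Standard Error without defining it, so that formula is an addition here, not something taken from the cards.

```python
import math
import statistics

heights_mm = [600, 470, 170, 430, 300]
n = len(heights_mm)

sample_variance = statistics.variance(heights_mm)   # divides by n-1
sample_std_dev = statistics.stdev(heights_mm)       # square root of the above

# Standard error of the mean (standard formula, not given in the cards).
standard_error = sample_std_dev / math.sqrt(n)

print(f"sample variance={sample_variance} mm^2")
print(f"sample sd={sample_std_dev:.2f} mm, standard error={standard_error:.2f} mm")
```

The sample standard deviation comes out larger than the population figure because dividing by n-1 compensates for a sample tending to understate the spread of the full population.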
What is Statistics based on?
probabilities
What is an Alternative Hypothesis? Give an example.
Also called maintained hypothesis or research hypothesis. It states that there is a difference in our samples.
An example might be where water quality in a stream has been observed over many years and a test is made of the null hypothesis that there is no change in quality between the first and second halves of the data against the alternative hypothesis that the quality is poorer in the second half of the record.
What is a Null Hypothesis? Give an example.
The Null Hypothesis (H0) is the opposite of the Alternative Hypothesis (H1). It is the hypothesis that the researcher tries to disprove, reject or nullify. It states that there is no difference in our samples.
H1 Hypothesis: Tomato plants exhibit a higher rate of growth when planted in compost rather than in soil.
H0 Hypothesis: Tomato plants do not exhibit a higher rate of growth when planted in compost rather than soil.
For research you must have what?
A Null Hypothesis vs an Alternative Hypothesis. The Null Hypothesis means nothing will change; the Alternative Hypothesis means there is a change.
What does p-value mean?
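The p-value (probability) is the chance of seeing results at least as extreme as ours if the Null Hypothesis is true.
Rejecting a Null Hypothesis that is actually true is called a Type I Error; the significance level we choose (e.g. 0.05) is the Type I Error rate we are willing to accept
Higher p-values mean the data are consistent with the Null Hypothesis (we fail to reject it)
Lower p-values are evidence against the Null Hypothesis and in favour of the Alternative Hypothesis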
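One way to see what a p-value measures is an exact permutation test, sketched below for the tomato example. This technique is not mentioned in the cards and the growth figures are invented for illustration. The idea: if the Null Hypothesis is true, the "compost" and "soil" labels are interchangeable, so we count how many relabellings give a difference in means at least as large as the one observed.

```python
from itertools import combinations

# Hypothetical growth measurements (cm); invented for illustration only.
compost = [12, 14, 15, 17]
soil = [9, 10, 11, 13]

def mean(values):
    return sum(values) / len(values)

# Observed difference in mean growth (compost minus soil).
observed = mean(compost) - mean(soil)

pooled = compost + soil
as_extreme = 0
total = 0
# Under H0 the labels are interchangeable, so try every way of relabelling.
for idx in combinations(range(len(pooled)), len(compost)):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    total += 1
    # One-sided test: count relabellings at least as extreme as observed.
    if mean(group_a) - mean(group_b) >= observed - 1e-12:
        as_extreme += 1

p_value = as_extreme / total
print(f"{as_extreme} of {total} relabellings are as extreme: p = {p_value:.4f}")
```

With these made-up numbers only a small fraction of relabellings match the observed difference, so the p-value is low and the data count as evidence against H0.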
What are Type I and Type II Errors?
Type I - false positive
Type II - false negative
In statistical hypothesis testing, a type I error is the incorrect rejection of a true null hypothesis (a “false positive”), while a type II error is the failure to reject a false null hypothesis (a “false negative”).
What does a higher p-value mean? What about a lower p-value?
Higher p-values mean the data are consistent with the Null Hypothesis (we fail to reject it)
Lower p-values are evidence against the Null Hypothesis and in favour of the Alternative Hypothesis
What is experimental design?
A study design used to test cause-and-effect relationships between variables. The classic experimental design specifies an experimental group and a control group. The independent variable (the treatment) is administered to the experimental group and not to the control group, and both groups are measured on the same dependent variable.