[L7] Data from Independent Groups: Categorical Measures Flashcards
– can only take on one of a
limited number of values, often simply yes or no.
Categorical or Nominal data
– very rarely used as an appropriate measure of
central tendency.
* It does not tell much.
Mode
- Three main ways of showing the difference between two
proportions. (fourth one as well)
Absolute and Relative Change
All look very similar to each other and it is often not clear
which one people are talking about.
Absolute and Relative Change
- We usually describe the % of people in each group and the
differences between them.
- We also get on to the 95% CIs.
Summary Statistics
- Third way of showing the difference between two proportions.
Odds Ratio
Trickier than percentages and proportions but most
common way.
Odds Ratio (OR)
Odds are always presented as
“something to one”.
The first way to look at the data is to use the ___
Absolute Difference
The ___ in the percentage of people who
took the antibiotic is 15%.
absolute reduction
Reduction in antibiotic use as a ___ of those in
the leaflet group
Proportion
So giving a leaflet reduces the chances that the person
will take antibiotics by 24%. This is the ___
Relative Risk Decrease.
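A rough sketch of the two calculations in the cards above. The group percentages (62% without the leaflet, 47% with it) are illustrative assumptions, not figures quoted in these cards; they are chosen only because they reproduce the 15% and 24% mentioned. The relative figure is assumed to be taken against the group that did not get the leaflet, since that is the denominator that gives 24%.

```python
# Illustrative (assumed) proportions taking antibiotics in each group.
p_no_leaflet = 0.62
p_leaflet = 0.47

# Absolute difference: simple subtraction of the two proportions.
absolute_reduction = p_no_leaflet - p_leaflet

# Relative reduction: the absolute difference as a proportion of the
# comparison (no-leaflet) group.
relative_reduction = absolute_reduction / p_no_leaflet

print(f"absolute reduction: {absolute_reduction:.0%}")
print(f"relative reduction: {relative_reduction:.0%}")
```

The same absolute difference gives a different relative figure if the comparison group changes, which is one reason the two are easy to confuse.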
What are the odds of throwing a 6 on a die?
Tricky because it is…
counter-intuitive
The odds are 5 to 1, or 5:1. The event will not
happen ___ that it does happen.
5 times for each time
If we know the ____, p, we can calculate the odds
using the following formula:
probability
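A minimal sketch of the probability-to-odds step, assuming the usual definition odds = p / (1 - p). The function names are made up for illustration. Note that "5 to 1" for the die is the odds against a 6, which is (1 - p) / p.

```python
def odds_for(p):
    """Odds that the event happens: p / (1 - p)."""
    return p / (1 - p)


def odds_against(p):
    """Odds that the event does not happen: (1 - p) / p."""
    return (1 - p) / p


p_six = 1 / 6                # probability of throwing a 6
print(odds_for(p_six))       # about 0.2, i.e. 1 to 5
print(odds_against(p_six))   # about 5, i.e. 5 to 1
```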
Chi-square
Odds Ratio
We then express the change in the ___ using the odds
ratio, OR.
odds
Easier way of calculating the odds ratio.
OR = AD / BC
OR = AD / BC
distribution of frequency scores
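A small sketch of OR = AD / BC. It assumes the common layout in which A and B are the first row of the 2 x 2 table and C and D the second; the counts are invented for illustration.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2 x 2 table laid out as [[a, b], [c, d]]."""
    return (a * d) / (b * c)


# Hypothetical counts: rows are groups, columns are outcome yes / no.
a, b = 30, 70
c, d = 15, 85

# The same value the long way: odds in each row, then their ratio.
long_way = (a / b) / (c / d)

print(odds_ratio(a, b, c, d), long_way)
```

Both routes give the same number; AD / BC just skips the intermediate odds.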
- One more method of presenting the effect of an
intervention which is commonly used in medicine,
though less commonly used in psychology
Number Needed to treat (NNT)
Also known as NNH or
number needed to harm.
The NNT is very easy to calculate. It is simply
- NNT = 1 / Absolute Risk Difference
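As a quick sketch of the NNT formula, using the 15% absolute reduction from the earlier card as a proportion. Rounding up to a whole number of people is a common convention and is an assumption here, not something the cards state.

```python
import math


def number_needed_to_treat(absolute_risk_difference):
    """NNT = 1 / absolute risk difference (given as a proportion)."""
    return 1 / abs(absolute_risk_difference)


nnt = number_needed_to_treat(0.15)  # 15% absolute reduction
print(round(nnt, 2))                # 6.67
print(math.ceil(nnt))               # 7, rounded up to whole people
```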
- Although it is possible to compute confidence intervals for all the descriptive
statistics that we calculated, most of them are rarely used,
so the only one we are going to concentrate on is the ___
odds ratio.
ν (Greek letter nu) is pronounced ___
“new” or “noo”
ν is a bit like the ___ of the odds ratio
standard error
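The formula for ν does not appear in these cards. The sketch below assumes the standard large-sample approach, in which ν = sqrt(1/A + 1/B + 1/C + 1/D) is the standard error of the log odds ratio and the 95% limits are OR × exp(±1.96 ν). The counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Return (OR, lower limit, upper limit) for a 2 x 2 table [[a, b], [c, d]].

    Assumes nu = sqrt(1/a + 1/b + 1/c + 1/d), the usual standard error
    of the log odds ratio; z = 1.96 gives 95% limits.
    """
    or_ = (a * d) / (b * c)
    nu = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = or_ * math.exp(-z * nu)
    upper = or_ * math.exp(z * nu)
    return or_, lower, upper


or_, lower, upper = odds_ratio_ci(30, 70, 15, 85)  # hypothetical counts
print(or_, lower, upper)
print("distance below:", or_ - lower, "distance above:", upper - or_)
```

Because the limits are built on the log scale and then converted back, the upper limit sits further from the OR than the lower one does.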
First, when we had the confidence intervals of a mean,
the intervals were symmetrical, so the lower CL was the
same amount below the mean as the upper CL was
above it.
* This is not the case for the ___
odds ratio.
Second, the CIs do seem to be very ___.
wide
That’s just the way they are and upper CLs can stretch
into ___, with small sample sizes.
hundreds
If we want more certainty, we must have a ___
bigger sample.
There is only one way to calculate the probability value
given the plethora of ways of displaying the difference
between two proportions. (Sort of only one way)
* The test is called the ___
Chi-square test.
Odds ratio
raw score
t-test
standardized score (d)
Convert the OR to ___ to find out whether the difference is significant
chi-square
- Developed by Pearson and sometimes known as the
Pearson χ² test.
Chi-Square χ²
The first stage in the χ² test is to put the ___
values into a
table, but add totals to it.
- We have to calculate the ___ for each cell,
which are referred to as ___.
expected values; E
The E are the values that we would ___
expect if the null hypothesis were true
The expected values are given by: ___
- E = R x C / T
Where R refers to the ___, C the ___, and T the ___
total for a given row; total for a given column; grand total.
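The expected-value formula E = R x C / T can be sketched as follows. The observed counts are hypothetical and the function name is only for illustration.

```python
def expected_values(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2 x 2 table of observed counts.
observed = [[30, 70],
            [15, 85]]

for row in expected_values(observed):
    print(row)
```

The expected table keeps the same row and column totals as the observed one; only the spread of counts across the cells changes.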
O = ___
Observed value
All we need to know is the ___ so we can take
the differences and add them up. (Almost but not quite)
distance between the observed value and the expected value
The difference needs to take account of the ___
sample size
χ² = Σ (O - E)² / E
similar, deviation score
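A minimal sketch of χ² = Σ (O - E)² / E, with E computed as R x C / T for each cell. The table is made up for illustration.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # E = R x C / T
            chi2 += (o - e) ** 2 / e  # dividing by E scales for sample size
    return chi2


# Hypothetical counts; every expected value here is 15.
print(chi_square([[10, 20], [20, 10]]))
```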
Issues with χ² / complications: An assumption made by the χ² test is that all of the
expected values (in a 2 x 2 table) must be ___
greater than 5.
Degrees of Freedom (df)
- df = (number of rows – 1) x (number of columns – 1)
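The df formula is a one-liner. The p-value helper below is an addition not covered by the cards: it assumes 1 degree of freedom (a 2 x 2 table), where the upper-tail probability of a χ² value equals erfc(sqrt(χ² / 2)). For larger tables a statistics library or a printed table would be needed instead.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df = (number of rows - 1) x (number of columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square value; valid for 1 df only."""
    return math.erfc(math.sqrt(chi2 / 2))


print(degrees_of_freedom(2, 2))  # 1
print(p_value_df1(3.841))        # close to 0.05
```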
If the table is larger than 2 x 2, then ___ of the expected
values need to be above ___
80%; 5.
If our data do not satisfy this assumption we can use the
___ instead.
Fisher’s exact test
Second, when we have a 2 x 2 table the χ² test is a little bit
___, and statisticians do not like liberal tests.
liberal
A ___ is slightly more prone to say that a result is
statistically significant than it should be, so the Type I
Error rate is not 0.05, but a little bit higher than that.
liberal test
One approach to dealing with this is to use ___
Yates’ Correction for Continuity or Continuity Correction
The bars around a value mean ___, which means that if there is a minus sign,
ignore it.
“take the absolute value”
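The corrected formula is not reproduced in the cards. A sketch, assuming the usual form of Yates' correction, χ² = Σ (|O - E| - 0.5)² / E, where the bars are the absolute value. The counts are hypothetical.

```python
def chi_square_2x2(observed, yates=False):
    """Chi-square for a 2 x 2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total
            diff = abs(o - e)      # the bars: ignore any minus sign
            if yates:
                diff -= 0.5        # continuity correction (assumed form)
            chi2 += diff ** 2 / e
    return chi2


table = [[10, 20], [20, 10]]  # hypothetical counts
print(chi_square_2x2(table), chi_square_2x2(table, yates=True))
```

The corrected value is always the smaller of the two, which is what makes the corrected test more conservative.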
The problem with Yates’ correction is that it makes the
test a little conservative so the ___ is now ___
Type I Error rate; smaller than 0.05
This in itself is not a problem, but it means that the test is
now ___, when it should be giving one
less likely to give a significant result
In fact, the correction only matters when the sample size
is ____
relatively small.
So if the sample size is __, the correction makes little
difference.
large
If the sample size is small, there is a better test we can
use called the ___
Fisher’s Exact test.
The idea of Fisher’s exact test is that for some events we
can work out the exact probability of them occurring
without needing to use ___
test statistics and tables.
We can do the same thing for data which are in a ___
2 x 2 contingency table.
Fisher’s exact test gives the probability of getting the ___
exact result that was found in the study.
However, we are not just interested in the exact result,
we are interested in any result that is ___
more extreme than
the result we have.
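A sketch of the idea for a 2 x 2 table, assuming fixed row and column totals so that each possible table has a hypergeometric probability. The two-sided rule used here (add up every table that is no more probable than the observed one) is one common convention, not the only one. The counts and function names are hypothetical.

```python
from math import comb


def table_probability(a, b, c, d):
    """Exact probability of the table [[a, b], [c, d]] given its margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_exact_two_sided(a, b, c, d):
    """Sum the probabilities of all tables at least as extreme as observed."""
    row1, row2, col1 = a + b, c + d, a + c
    p_observed = table_probability(a, b, c, d)

    total = 0.0
    # x is the top-left cell; the margins fix the other three cells.
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, row2 - (col1 - x))
        if p <= p_observed * (1 + 1e-9):  # small tolerance for float ties
            total += p
    return total


print(table_probability(3, 1, 1, 3))       # probability of the exact result
print(fisher_exact_two_sided(3, 1, 1, 3))  # that result or more extreme
```

No test statistic is computed at any point: the probability comes straight from counting the possible tables.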
Fisher was inspired by a ___ who claimed she could tell exactly whether a cup of tea had been made with the tea or the milk poured first; this led him to devise the test.
colleague (Muriel Bristol)
With Fisher’s exact test there is no __, and no
need to look anything up in a table.
test statistic