Critical thinking Flashcards

1
Q

TEACHING BLOCK 1: week 1

Null hypothesis

Which significance level is needed to reject the null hypothesis?

A
  • states that there is no difference / no significant effect
  • significance level of 5% (p = 0.05) =
    threshold for rejecting the null hypothesis

- if the P (probability) value is less than 0.05 = reject the null hypothesis
- if P is greater than or equal to 0.05 = fail to reject (do not "accept") the null hypothesis
- the p value tells you how likely a result at least as extreme as the one observed would be if the null hypothesis were true

2
Q

If P>0.05 ?

A
  • P>0.05 = the result lies within the central 95% of the distribution expected under the null hypothesis
  • Not significant
  • no evidence of a difference between the mean of a
    sample and the population mean
3
Q

What does the mean + median show

A

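Mean = the arithmetic average (sum of values / number of values); it is pulled towards extreme values

Median = the middle value of the ordered data; comparing it with the mean reflects the shape (skew) of the distribution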

4
Q
  1. Standard deviation
  2. Statements of Probability + confidence intervals
A
  1. check formula for SD

2.
- x̄ ±1.96 x SD ≈ 95%
- x̄ ± 3 x SD ≈ 99.7%

x̄ = mean

5
Q

when Hypothesis testing do we prove/ disprove?

A
  • Rather than PROVE (+) something happens….
  • We need to DISPROVE (-) that something DOESN’T (-) happen
  • Thus making it LIKELY (-) to have occurred- although that’s only a probability

EG:
- Hypothesis: aggression between men + women = different
- Null hypothesis: aggression between men+ women = not different

= We need to disprove that the men and women in our sample are not different
Try to disprove the null hypothesis in order to invalidate it = if you can't, the null hypothesis stands (it has not been shown to be wrong).

6
Q

standard error of the mean (SEM)?
formula?

A
  • SEM = a measure of how much variation there is likely to be between the means of different samples from a population and the population mean itself
  • SEM = standard deviation / √n (n = number of observations in the sample)
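A minimal sketch of the SEM formula, assuming n means the number of observations in one sample and that the sample SD (n - 1 denominator) is the one intended. The function name and the readings are made up for illustration.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Return SD / sqrt(n) for one sample, using the sample SD."""
    n = len(sample)
    return statistics.stdev(sample) / math.sqrt(n)


readings = [4.0, 6.0, 8.0, 10.0, 12.0]  # hypothetical data
sem = standard_error_of_mean(readings)
print(round(sem, 3))
```

Larger samples give a smaller SEM for the same SD, because the SD is divided by √n.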
7
Q
  • Data in normal distributions (bell-shaped curves) is represented as mean + the deviations around it.
  • the mean of a sample has a standard error
  • What range do you expect 95% of the sample means to fall in?
A
  • 95% of the sample means will fall within mean +/- 2(SEM)
  • That means that 5% will be outside it (2.5% at each end)

EG
- if mean = 50 and SEM = 2, then “mean ± 2(SEM)” would be:
- 50 ± 2(2) = 50 ± 4
- = 95% of sample means would fall within the range of 46 - 54
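The worked example above can be reproduced in a few lines. This is only a sketch: the function name is made up, and the multiplier of 2 is the card's rounded value (1.96 is the more exact figure).

```python
def sample_mean_range(mean, sem, multiplier=2.0):
    """Return (low, high) for mean ± multiplier × SEM."""
    half_width = multiplier * sem
    return mean - half_width, mean + half_width


low, high = sample_mean_range(50, 2)
print(low, high)  # 46.0 54.0
```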

8
Q
  1. What’s the z-score that corresponds to the critical value for a 95% confidence level in a normal distribution?
  2. How does this apply to calculating a confidence interval for a normally distributed data set?
A
  1. 1.96

= 95% of the data points in a normally distributed data set lie within ±1.96 standard deviations of the mean
- Z score of 1.96 represents the 95% cutoff point of a normal distribution

  2. A sample mean that departs by more than about 2 (more precisely 1.96) standard errors from the population mean would be expected by chance in only about 5% of samples
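Where the 1.96 comes from can be checked with Python's standard library. A small sketch, assuming a two-sided interval (half of the leftover probability in each tail); the function name is made up.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Two-sided critical z for a given confidence level (e.g. 0.95)."""
    tail = (1 - confidence) / 2  # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)


print(round(critical_z(0.95), 2))  # 1.96
```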
9
Q

If the difference between means for population 1 + population 2 is greater than 1.96x the SEM (p < 0.05)

do u accept or reject null hypothesis?

A
  • reject null hypothesis

Either an unusual event has occurred or the null hypothesis is incorrect

As the difference between the means is greater than 1.96 x SEM = result is statistically significant at p < 0.05 = reject null hypothesis

10
Q

week 2

Steps of Planning + conducting a study

What’s the scientific method?

A
Planning + conducting a study:
  • Develop the research question (hypothesis)
  • Decide what to measure and how to measure it
  • Collect the data
  • Analyse the data
  • Interpret the results

Scientific method:
  • Make observations
  • Think of interesting questions
  • Formulate hypothesis
  • Develop testable predictions
  • Gather data to test predictions (refine/accept/alter/reject hypothesis)
  • Develop general theories
11
Q

Types of data

Quantitative?
Categorical?

conversion of quantitative to categorical data

A
  • Quantitative = How much
    EG: age, blood pressure, number of kids in a family, weight, height
  • Categorical = What Type
    EG: car types, genders, colours, blood group (e.g. AB)

Converting data:
* Height –> Tall/short
* Weight –> Underweight/normal/overweight/obese
* Blood pressure –> Hypertensive/normotensive

However- categorising a continuous variable reduces the amount of information available

12
Q

week 3

Sample Standard deviation?
Population standard deviation?

A

SD = a measure of how spread out the data points are from the mean

Sample SD: s = √[ Σ(xi – x̄)^2 / (n – 1) ]
xi: each data value
x̄: The sample mean
n: The number of observations in the sample

[Sample SD = use when your data is a sample taken from a larger population]

Population SD: σ = √[ Σ(xi – μ)^2 / N ]
xi: each data value
μ: The population mean
N: The total number of observations in the population

[Population SD = use when you have data for the entire population]
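The two formulas differ only in the denominator (n - 1 versus N). Python's standard library has both, so a quick comparison on made-up values looks like this:

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

sample_sd = statistics.stdev(values)       # divides by n - 1
population_sd = statistics.pstdev(values)  # divides by N

print(round(sample_sd, 3), round(population_sd, 3))
```

The sample SD always comes out slightly larger than the population SD for the same numbers, and the gap shrinks as n grows.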

13
Q

standard deviation symbol = σ

CALCULATING SD with
frequencies

A

Frequency SD = √[ Σf(x – x̄)^2 / Σf ]

f: Frequency of each data point (how often each value occurs)
x: Each data value
x̄: Mean of the data
Σf: Total frequency (the sum of all frequencies)

EG if the calculated SD was 8,796 and the mean was 5,700 = 5,700 +/- 8,796 [the range around the mean = most data lie within one SD (8,796) above/below the mean]
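A sketch of the frequency formula, assuming the mean is also frequency-weighted (Σfx / Σf) and that the denominator is Σf as written on the card. The function name and data are made up.

```python
import math


def frequency_sd(values, freqs):
    """Return (mean, SD) for data given as values with frequencies."""
    total = sum(freqs)  # Σf
    mean = sum(f * x for x, f in zip(values, freqs)) / total
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / total
    return mean, math.sqrt(variance)


# value 1 occurs once, value 2 twice, value 3 once
mean, sd = frequency_sd([1, 2, 3], [1, 2, 1])
print(mean, round(sd, 3))
```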

14
Q

Statements of Probability and confidence intervals

A

x̄ ±1.96 x SD ≈ 95%
= the range that contains about 95% of the data points in a normal distribution (for a 95% confidence interval of the mean, use the SEM in place of the SD)

x̄ ±3 x SD ≈ 99.7%
= refers to the percentage of data points that fall within three standard deviations (±3×SD) of the mean in a normal distribution

15
Q

week 4 [experimental design]

Accuracy v Precision?

A
  • Accuracy: measure of how close calculated values are to the accepted standard true value (trueness)
  • Precision is the closeness of 2 or more measurements to each other
  • Precision can also mean the resolution of a representation, typically defined by the number of decimal or binary digits.
16
Q

Precision
It consists of 3 levels:

A
  1. Intra-assay precision
    - Repeatability
    - Describes the precision of within-run replicates
    - It expresses the precision under the same operating conditions over a short interval of time.
  2. Inter-assay precision
    - Intermediate precision
    - Describes the precision of between-run replicates
    - Expresses within-laboratory variations: different days, different analysts, different equipment, etc.
  3. Reproducibility
    - Expresses the precision between laboratories (collaborative studies, usually applied to standardization of methodology).
17
Q

Inter-observer variability?

Intra-observer variability?

A

Inter-observer variability: Differences in measurements made by different people observing the same thing.

Intra-observer variability: Differences in measurements made by the same person observing the same thing at different times.

18
Q
  • Normal distribution?
  • Mean?
  • Standard deviation
A
  • Normal distributions (bell-shaped curves) are characterised by centrality (location) and dispersion (spread/scatter)
  • Mean – measures centrality (position of curve)
    = the sum of measurements / number of measurements
  • Standard deviation – measures dispersion/scatter (shape of curve)

SD = √[ Σ(x – x̄)^2 / (n – 1) ]
x: each data value
x̄: The sample mean
n: The total number of observations

19
Q

SD versus SEM

A
  • SD is the measure of variability of the observations about the mean
  • SEM = measure of the precision of an estimate of a population parameter (here, the mean)
  • SEM decreases as the sample size increases, because:

SEM = standard deviation / √n (n = number of observations in the sample)

20
Q

week 5

Effect of Outliers

A
  • Outliers = values which are clearly ‘out’ or ‘wrong’ compared to others in a set.
  • Usually arise from gross or systematic error.
  • Distort mean value or calibration curve.
  • Increase standard deviation
21
Q

Calculating mean and
SD including outlier

A

Reject a value as an outlier if it lies outside mean +/- 1.96 x SD

EG, if the suspected outlier is 19.0:
Including outlier: Mean = 15.86, SD = 1.57
15.86 + 1.96 x 1.57 = 18.94
15.86 - 1.96 x 1.57 = 12.78
= Can reject 19.0 as an outlier as it lies outside the range 12.78 - 18.94
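The same check as a short sketch, using the mean and SD from the example above (both calculated with the suspect value included). The function names are made up.

```python
def rejection_limits(mean, sd, z=1.96):
    """Return (low, high) = mean ± z × SD."""
    return mean - z * sd, mean + z * sd


def is_outlier(value, mean, sd, z=1.96):
    """True if the value lies outside mean ± z × SD."""
    low, high = rejection_limits(mean, sd, z)
    return value < low or value > high


low, high = rejection_limits(15.86, 1.57)
print(round(low, 2), round(high, 2))  # 12.78 18.94
print(is_outlier(19.0, 15.86, 1.57))  # True
```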

22
Q

What are Z Scores?

A
  • This is the number of standard deviations distance from the mean
  • High Z score = far from the mean
  • Low Z score= close to the mean
23
Q

How to convert data into a z score?

How to calculate where your data might fit into
a normal curve?

A
  1. Z score = (value - x̄) / standard deviation
    z = (x-μ) / σ
    x=value μ=mean σ=SD
  2. To calculate where your data might fit into
    a normal curve…

x̄ ± Z value x standard deviation
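A minimal sketch of the conversion, with the subtraction done before the division. The numbers are made up: a value of 70 in data with mean 50 and SD 10.

```python
def z_score(value, mean, sd):
    """Number of standard deviations the value lies from the mean."""
    return (value - mean) / sd


print(z_score(70, 50, 10))  # 2.0
```

A negative result means the value lies below the mean.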

24
Q

Dixon’s Q-test

A

Q = |xs - xc| / (xbiggest - xsmallest)

  • xs is suspected outlier
  • xc is the closest value to xs
  • xbiggest - xsmallest = range including outlier

–> Compare Q with value from table for given CL.
–> If Q > table then value is an outlier
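A sketch of the Q calculation, assuming the usual form Q = gap / range, where the gap is the distance from the suspected outlier to its closest neighbour and the range includes the outlier. The function name and data are made up, and the critical value has to be read from a Q table for the chosen confidence level and n (it is not built in here).

```python
def dixon_q(data, suspect):
    """Return Q = |suspect - closest value| / (largest - smallest)."""
    ordered = sorted(data)
    data_range = ordered[-1] - ordered[0]  # range including the outlier
    others = list(data)
    others.remove(suspect)  # compare against the remaining values
    closest = min(others, key=lambda v: abs(v - suspect))
    return abs(suspect - closest) / data_range


q = dixon_q([10, 11, 12, 13, 20], 20)
print(q)  # gap of 7 over a range of 10

# table_value is a placeholder: look it up in a Q table before using it
# is_outlier = q > table_value
```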

25
Q

Problems with Detecting Outliers

A
  • Removing data – always risky
  • Selection of outlier may be altered by
    subsequent measurements.
  • Only valid if single outlier – especially for small
    data sets
  • User has to identify potential outlier.
  • Q-test mathematically simpler. More clear cut.
  • 1.96 x SD test doesn’t need the look-up table
26
Q

week 6

Formula to calculate CI (confidence interval) from z-score?

A

CI = x̄ ±Z × (σ/√n)

x̄: The mean value
Z: The z-score appropriate for the confidence level
σ: The standard deviation
n: The sample size
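The CI formula as a short sketch, with Z defaulting to 1.96 for a 95% interval. The function name and the example numbers (mean 100, SD 15, n = 25) are made up.

```python
import math


def confidence_interval(mean, sd, n, z=1.96):
    """Return (low, high) = mean ± z × (SD / √n)."""
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin


low, high = confidence_interval(100, 15, 25)
print(round(low, 2), round(high, 2))  # 94.12 105.88
```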

27
Q

week 7

Research (/alternative) hypothesis?
EG
Null hypothesis?
EG

A
  • research hypothesis = states an expectation to be tested aka: alternative hypothesis [H_a]

Tomato plants exhibit a higher rate of growth when planted in compost rather than in soil

  • Investigator derives a statement that is the opposite of the research hypothesis = null hypothesis (in notation: H0) = states there will be no difference

Tomato plants do not exhibit a higher rate of growth when planted in compost rather than soil

28
Q

If significance tests generate 95% or 99% likelihood that the results do not fit the null hypothesis, then..

A

…. then null hypothesis is rejected, in favour of the alternative.

  • You have to prove that something NOT HAPPENING is NOT LIKELY
29
Q

Falsifiable?

A
  • Falsifiable = something can be logically contradicted by an empirical test.
  • A core element of a scientific hypothesis is that it must be capable of being proven false.
  • helps improve research - the H0 gets closer to the
    reality each time, even if it isn’t correct, it is better than the last H0.
30
Q

Z-score
- Z-score when you’re comparing 2 samples?

A

Difference in means, divided by the combined standard error of the two means

z = (x̄1 - x̄2) / √[ (σ1^2/n1) + (σ2^2/n2) ]

31
Q

When would u use the two‐sample
z-test?

A
  • The two‐sample z‐test needs the 2 population standard deviations, σ1 and σ2
  • If you don’t have these data = need a different test that uses sample standard deviations
32
Q

Z test versus T Test

A
  • If you know the population standard deviation = use Z tests
  • Z-test is a statistical hypothesis test that follows a normal distribution while T-test follows a Student’s T-distribution.
  • A T-test is appropriate when you are handling small samples (n < 30) while a Z-test is appropriate when you are handling moderate-large samples (n > 30).
  • T-test is more adaptable than Z-test since Z-test will often require certain conditions to be reliable.
  • T-test has many methods that will suit any need.
  • T-tests are more commonly used than Z-tests
  • Instead of n numbers you use degrees of freedom (n-1)
33
Q

Paired vs Unpaired T test

A

Paired
- 2 samples
- Come from same source
- Dependent

Unpaired
- 2 samples
- Different sources
- Independent

34
Q

Assumptions:

A
  • Two samples come from distributions that may vary in mean value but not in standard deviation
  • The observations are independent
  • The data are quantitative + normally distributed
35
Q

The t statistic for comparing two
unpaired groups is…

A
  • calculated by dividing the difference of the means by the standard error of those differences, taking into account how many subjects were in the test
  • T value = (x̄1 - x̄2) / √[ (S1^2/n1) + (S2^2/n2) ]

x̄1 = mean value for 1st group
x̄2 = mean value for 2nd group
S1 = standard deviation of 1st group
n1 = size of 1st group
S2 = standard deviation of 2nd group
n2 = size of 2nd group
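A sketch of the t value exactly as written on this card (each group's variance divided by its own group size, not pooled). The function name and the summary numbers are made up.

```python
import math


def unpaired_t(mean1, sd1, n1, mean2, sd2, n2):
    """t = (mean1 - mean2) / sqrt(sd1²/n1 + sd2²/n2)."""
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se_diff


t = unpaired_t(10, 2, 8, 8, 2, 8)
print(t)  # 2.0
```

The result is then compared with the table value for the chosen probability and degrees of freedom.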

36
Q

What degrees of freedom do u use when looking at a table for T test

A
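  • n-1 degrees of freedom for a one-sample or paired t-test (n = number of values or pairs)
  • n1 + n2 - 2 degrees of freedom for an unpaired t-test on two groups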
37
Q

what ‘probability’ would u use for 2 tailed v 1 tailed test

A

2 tailed t test - p = 0.05 in total (2.5% of the area under each tail of the bell curve)
1 tailed test - p = 0.025 gives the same critical t value, since only one tail is counted

38
Q

using the t test for comparing unpaired
means the SE diff is derived by pooling the
variances. How?

A

1: Find the standard deviation of sample 1 (s1) + sample 2 (s2)

2: Multiply the square of the SD of sample 1 (s1^2) by its degrees of freedom (n1 - 1)

3: Repeat for sample 2: s2^2 x (n2 - 1)

4: Add the two together, then divide by the total degrees of freedom (n1 + n2 - 2) to give the pooled variance
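The four steps can be sketched as below. The last function, which turns the pooled variance into the SE of the difference via √[pooled variance × (1/n1 + 1/n2)], is the standard follow-on step and is not spelled out on the card. Names and numbers are made up.

```python
import math


def pooled_variance(sd1, n1, sd2, n2):
    """Weight each sample variance by its degrees of freedom, then average."""
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * sd1 ** 2 + df2 * sd2 ** 2) / (df1 + df2)


def se_of_difference(sd1, n1, sd2, n2):
    """SE of the difference between two means, from the pooled variance."""
    return math.sqrt(pooled_variance(sd1, n1, sd2, n2) * (1 / n1 + 1 / n2))


print(pooled_variance(2, 5, 4, 7))  # (4×4 + 6×16) / 10 = 11.2
```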

39
Q

week 9

paired t-test

A
  • If we want to compare two alternative treatments
    or experiments
  • Crossover trials, randomised trials
  • Placebo effect
  • Simultaneous application
  • reduce incidental variation
  • compare the size of the difference between two means in relation to the amount of inherent variability (the random error, not related to treatment differences) in the data.

Assumptions:
- Quantitative data
- Differences are independent of each other
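The card does not give a formula, so the sketch below uses the standard paired form: take the difference within each pair, then divide the mean difference by its standard error, with n - 1 degrees of freedom (n = number of pairs). The function name and the before/after values are made up.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t, degrees of freedom) for paired measurements."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se_diff = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se_diff, n - 1


t, df = paired_t([10, 12, 9, 11], [12, 15, 10, 14])
print(round(t, 2), df)
```

Working on the within-pair differences is what removes the incidental variation between subjects.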

40
Q

week 10

statistical test

  • The x^2 (chi square) test
A
  • Tests whether the number of individuals in different categories fit a null hypothesis
  • Carried out on counts (actual numbers) only, not percentages or proportions
  • All X^2 tests are 2 sided
41
Q

which degree of freedom to use?

A
  • different for each table
  • Df: (number of rows-1) x (number of columns-1)

don’t include the ‘total’ rows/columns
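A sketch for a contingency table of counts, assuming the usual statistic Σ(observed - expected)^2 / expected with expected = row total × column total / grand total (that formula is not on the cards). The table holds only the observed cells, with no 'total' row or column. The function name and counts are made up.

```python
def chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df


stat, df = chi_square([[10, 20], [20, 10]])
print(round(stat, 2), df)  # 6.67 1
```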