Statistics Flashcards

Question

What are the mean and standard deviation of the standard normal distribution?

Answer 1

Mean 0 | Standard Deviation 1

Answer 2

Subtracting the mean and dividing by the standard deviation: | ((Any of the data values) - Mean) / Standard deviation
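
The standardisation formula above can be sketched in a few lines of Python. The function name `standardise`, the sample data, and the choice of the population standard deviation (denominator n) are assumptions made for illustration only.

```python
from statistics import mean, pstdev

def standardise(values):
    """Return the z-score of each value: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population SD (denominator n)
    return [(v - mu) / sigma for v in values]

# Hypothetical data with mean 5 and population SD 2.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
z_scores = standardise(scores)
print(z_scores)
```

After standardising, the values have mean 0 and standard deviation 1, which is what allows a single table of normal probabilities to be used.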

Answer 3

- To allow us to compare data
- To perform more advanced statistical tests
- If 0 is in the centre the centiles are easier to calculate
- There is only one table of probabilities for normal data

Answer 4

mean +/- 1.96 x Standard Deviation

Answer 5

Mean +/- 1.96 x Standard Deviation
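
Assuming these cards refer to the range that covers the central 95% of a normal distribution, the sketch below shows where the 1.96 multiplier comes from. The function name `central_95_range` and the example mean and SD are hypothetical.

```python
from statistics import NormalDist

# The multiplier that leaves 2.5% in each tail of the standard normal curve.
z_95 = NormalDist().inv_cdf(0.975)

def central_95_range(mean, sd):
    """Return (lower, upper) = mean -/+ z_95 * sd."""
    return mean - z_95 * sd, mean + z_95 * sd

# Hypothetical example: mean 100, SD 15.
lower, upper = central_95_range(100, 15)
print(round(z_95, 2), round(lower, 1), round(upper, 1))
```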

Answer 6

Shapiro-Wilk statistic

Answer 7

1. Logarithmic: Variances are proportional to the mean, fairly skewed data
2. Square root: Fairly skewed, counts
3. Reciprocal: Highly skewed data
4. Cube transformation: Data relating to volumes
5. Logit: Proportions

Answer 8

Calculate it with denominator n - 1, not n
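
Assuming this card is about estimating a population variance (or standard deviation) from a sample, the difference between the two denominators can be illustrated as follows. The function name `variance` and the data are hypothetical.

```python
def variance(values, sample=True):
    """Variance with denominator n - 1 (sample=True) or n (sample=False)."""
    n = len(values)
    m = sum(values) / n
    squared_deviations = sum((v - m) ** 2 for v in values)
    return squared_deviations / (n - 1 if sample else n)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
print(variance(data, sample=False))  # denominator n
print(variance(data, sample=True))   # denominator n - 1, slightly larger
```

The n - 1 version is always the larger of the two, and the gap shrinks as the sample grows.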

Answer 9

The range we would expect, given a certain level of confidence, to include the population parameter

Answer 10

The standard deviation around the mean

Answer 11

Standard deviation/ (square root of number of items)
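
A minimal sketch of this formula, assuming the standard deviation is estimated from the sample itself (denominator n - 1). The function name `standard_error` and the data are hypothetical.

```python
import math
from statistics import stdev

def standard_error(values):
    """Standard error of the mean: sample SD / sqrt(number of items)."""
    return stdev(values) / math.sqrt(len(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
se = standard_error(data)
print(round(se, 3))
```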

Answer 12

Sample mean +/- standard error of sample mean

Answer 13

When estimating the mean in normally distributed populations when the sample size is small and the population standard deviation is unknown.

Answer 14

To assess the validity of a claim about a population parameter.

Answer 15

Over 1.96 (+ve or -ve)

Answer 16

Rejecting a true null hypothesis

Answer 17

Accepting (failing to reject) a false null hypothesis

Answer 18

The level of significance = the probability of making a Type 1 error. Usually this is set at 5% (95% confidence level)

Answer 19

Increasing sample size

Answer 20

A measure of the difference between what is expected if the null hypothesis were true and what is observed.

Answer 21

(Observed value - Expected value) / Standard Error
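
The formula above, combined with the 1.96 cut-off from the earlier cards, can be sketched as below. The function name `z_statistic` and the numbers are hypothetical, and the 1.96 threshold assumes a two-sided test at the 5% level with a normally distributed statistic.

```python
def z_statistic(observed, expected, standard_error):
    """(Observed value - Expected value) / Standard Error."""
    return (observed - expected) / standard_error

# Hypothetical numbers: sample mean 52, hypothesised mean 50, standard error 1.
z = z_statistic(observed=52, expected=50, standard_error=1.0)
reject_null = abs(z) > 1.96  # two-sided 5% critical value from the cards
print(z, reject_null)
```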

Answer 22

Normal distribution for mean

Answer 23

T test for mean

Answer 24

Chi squared test for proportions

Answer 25

A statement of the conditions (value of the test statistic) for which the null hypothesis will be rejected.

Answer 26

Reject the Null hypothesis as 2.3 is higher than 1.96

Answer 27

The critical values depend on the sample size

Answer 28

The critical value approaches that of normality

Answer 29

An approximation to a T test when samples are known to have arisen from normal distributions with unequal variances.

Answer 30

To compare samples with unequal variances

Answer 31

It is the ratio of the sample variances and is used as part of the independent T test.

Answer 32

An estimation of the population variance

Answer 33

When np and n(1-p) are greater than 5

Answer 34

1. The confidence limits are p +/- 2.58 x standard error
2. p = 0.7 (70% success rate)
3. Standard error = √(p(1-p)/n)
4. Answer = 0.62 to 0.78
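
The calculation can be sketched as follows. The card does not state the sample size, so n = 200 is an assumption, chosen only because it reproduces the quoted limits. The function name is also hypothetical.

```python
import math

def proportion_confidence_limits(p, n, multiplier):
    """Return p -/+ multiplier * sqrt(p * (1 - p) / n)."""
    standard_error = math.sqrt(p * (1 - p) / n)
    margin = multiplier * standard_error
    return p - margin, p + margin

# p = 0.7 and the 2.58 multiplier come from the card.
# n = 200 is an ASSUMPTION: the card does not state the sample size.
lower, upper = proportion_confidence_limits(p=0.7, n=200, multiplier=2.58)
print(round(lower, 2), round(upper, 2))
```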

Answer 35

Steps involved:
1. Binomial situation with n = 200, p = 0.64 (128/200) and 1-p = 0.36.
2. np = 128, n(1-p) = 72. Since both of these are over 5 we can use the normal approximation.
3. To test the hypothesis that the treatment is 70% effective we test whether the mean number of successes is 140 (0.7 x 200).
4. Null hypothesis: mean number of successes = 140. Alternative hypothesis: mean number of successes is not 140.
5. 5% significance level.
6. Test statistic z = (np - X) / √(np(1-p)), where X is the population/reference number of successes.
7. z = (128 - 140) / √46.08 = -12 / 6.79 = -1.77
8. The magnitude of z is less than 1.96, therefore we accept the null hypothesis that the population mean = 140 and so the success rate = 70%.
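
The arithmetic in these steps can be checked with a short script. The first statistic follows the card, which uses the sample proportion in the standard deviation. The second is a common variant that uses the hypothesised proportion instead; it is not on the card and is included here only as an assumption for comparison. All variable names are hypothetical.

```python
import math

n = 200            # number of patients (from the card)
successes = 128    # observed successes (from the card)
p0 = 0.70          # claimed success rate under the null hypothesis

p_hat = successes / n          # 0.64
expected_successes = n * p0    # 140

# Step 2: both counts must exceed 5 for the normal approximation.
normal_ok = n * p_hat > 5 and n * (1 - p_hat) > 5

# Steps 6-7: the card uses the sample proportion in the standard deviation.
sd_card = math.sqrt(n * p_hat * (1 - p_hat))
z_card = (successes - expected_successes) / sd_card

# Variant (assumption, not on the card): use p0 in the standard deviation.
sd_null = math.sqrt(n * p0 * (1 - p0))
z_null = (successes - expected_successes) / sd_null

print(round(z_card, 2), round(z_null, 2))
print(abs(z_card) < 1.96, abs(z_null) < 1.96)
```

Both versions stay inside the -1.96 to 1.96 band, so the conclusion on the card is the same either way.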

Answer 36

1. Use the formula (p - p0) / √(p(1-p)/n)
2. Answer: -0.59
3. Accept the null hypothesis.

Answer 37

Gives us information about the nature of the relationship, e.g. linear. This enables predictions to be made.

Answer 38

The extent of the association between two variables

Answer 39

Correlation coefficient

Answer 40

Pearson's correlation coefficient (r)

Answer 41

Very strong positive correlation

Answer 42

R will also increase

Answer 43

1. At least one variable must be normally distributed.
2. They have been measured on a random sample.
3. The pairs of variables are independent.

Answer 44

It measures the proportion of the variation in the dependent variable (y) which is attributable to its linear relationship with variable x.
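
Assuming this card describes the coefficient of determination (r squared), the sketch below computes Pearson's r from sums of squares and then squares it. The function name `pearson_r` and the paired data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from sums of squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
r_squared = r ** 2  # proportion of variation in y linked to x
print(round(r, 3), round(r_squared, 2))
```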

Answer 45

That both variables are approximately normally distributed

Answer 46

```
y = a + bx
a = Intercept (value of y when x = 0)
b = Slope (change in y when x increases by one unit)
```
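
The card does not say how a and b are estimated. A minimal sketch, assuming ordinary least squares, is shown below. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for the line y = a + bx."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx             # slope: change in y per unit increase in x
    a = mean_y - b * mean_x   # intercept: value of y when x = 0
    return a, b

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)
predicted = a + b * 6  # prediction for a new x value of 6
print(round(a, 2), round(b, 2), round(predicted, 2))
```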

Answer 47

1. Correlation between x and y is significant.
2. For each value of the x variable the values of the y variable have a normal distribution.
3. The variances of the normal distributions are equal.

Answer 48

That the data is plausibly normally distributed

Answer 49

A measure of linearity between points in a normal plot

Answer 50

Likely to be normally distributed.

Answer 51

Experimental

Answer 52

Cross sectional

Answer 53

Treatment and control groups are being measured at the same time.

Answer 54

Sequential trial

Answer 55

A study where each subject acts as their own control

Answer 56

That there is no carry-over effect from one treatment to another.

Answer 57

When each individual subject who is receiving the treatment is matched for factors critical to the outcome with a subject in the control group.

Answer 58

A group of subjects who are initially disease free are followed over time. They are likely to be exposed to a range of factors, which will be noted and the information used to establish risk factors for the disease

Answer 59

- Large, costly and take many years to deliver results. | - Unsuitable for rare diseases

Answer 60

A retrospective study of diseased subjects. The range of factors to which they have been exposed is reviewed to establish risk factors for a disease.

Answer 61

Cheap | Quick

Answer 62

Prone to bias eg patients who have the disease are more likely to have thought about risk factors and hence remember them better.

Answer 63

A non-randomised study and therefore very susceptible to bias. The researcher decides which groups to put people in. Conclusions drawn are usually limited.

Answer 64

Fewer subjects required in a crossover trial.

Answer 65

1. Having a wash out period between successive treatments. | 2. Randomising the order of allocation of treatments

Answer 66

1. Point estimates of population parameters are derived from the corresponding sample statistics, e.g. the mean, SD, proportion of successes. So if the sample found 75% of people to be in favour of something, this is the point estimate for the proportion of the population which is in favour.
2. Interval estimation gives the interval in which we would expect, given a certain level of confidence, the population parameter to lie. The upper and lower values of the confidence interval are the confidence limits.

Answer 67

Normal distribution = Mean is best | Skewed = Median is best

Answer 68

1. The sample is small and the data not plausibly normal. | 2. Transformations to address the problem cannot be found.

Answer 69

Sign Test | Wilcoxon signed rank test

Answer 70

The number of observations above and below the hypothesised median. If the null hypothesis is true then these should be equal because the median is a middle observation.
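
A minimal sketch of a two-sided sign test built on these counts. Under the null hypothesis each observation is equally likely to fall above or below the hypothesised median, so the smaller count follows a Binomial(n, 0.5) distribution. The function name `sign_test`, the data, and the choice to drop values equal to the median are assumptions.

```python
from math import comb

def sign_test(values, hypothesised_median):
    """Two-sided sign test; returns (n_above, n_below, p_value)."""
    above = sum(v > hypothesised_median for v in values)
    below = sum(v < hypothesised_median for v in values)
    n = above + below  # values equal to the median are dropped
    k = min(above, below)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return above, below, min(1.0, 2 * tail)

# Hypothetical readings tested against a hypothesised median of 13.
readings = [12, 15, 9, 20, 18, 22, 14, 17, 19, 21]
above, below, p_value = sign_test(readings, hypothesised_median=13)
print(above, below, round(p_value, 3))
```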

Answer 71

Based on ranks and therefore incorporates some measure of the actual values

Answer 72

The probability, assuming the null hypothesis is true, that the statistical summary (such as the sample mean difference between two compared groups) would be the same as or more extreme than the actual observed results. When a P value is less than or equal to the significance level, you reject the null hypothesis.
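
As an illustration, the sketch below converts a z statistic into a two-sided P value, assuming the statistic follows a standard normal distribution under the null hypothesis. The value 2.3 is the test statistic quoted in Answer 26. The function name `two_sided_p_value` is hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability of a standard normal value at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

p = two_sided_p_value(2.3)   # the test statistic quoted in Answer 26
reject_null = p <= 0.05      # 5% significance level
print(round(p, 3), reject_null)
```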

Answer 73

Spearman Rank correlation

Answer 74

Wilcoxon paired test

Answer 75

Mann Whitney Test

Answer 76

Mean | Standard Deviation

Answer 77

Median (better for skewed data) | Inter quartile range (tells you of the variation around the median)

Answer 78

z/ One sample T test

Answer 79

Wilcoxon/Sign test

Answer 80

z/ Unpaired T test

Answer 81

Paired z/T test

Answer 82

Mann Whitney

Answer 83

Wilcoxon paired test

Answer 84

Chi squared

Answer 85

- 80% or more of the cells have an EXPECTED VALUE of greater than 5
- All expected frequencies are greater than 1
- The total sample size is greater than 20
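
Assuming these are validity conditions for a chi-squared test on a contingency table, they can be checked as sketched below. Expected counts are taken as row total x column total / grand total. The function names and the example tables are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_squared_is_valid(table):
    """Check the three conditions listed on the card."""
    expected = [e for row in expected_counts(table) for e in row]
    total = sum(sum(row) for row in table)
    share_above_5 = sum(e > 5 for e in expected) / len(expected)
    return share_above_5 >= 0.8 and min(expected) > 1 and total > 20

observed = [[12, 8], [5, 15]]  # hypothetical 2 x 2 table of observed counts
print(expected_counts(observed))
print(chi_squared_is_valid(observed))
```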


(115 cards)