Statistics Flashcards

1
Q

What is sampling?

A

randomly taking a small set of data from a large population so that the population can be investigated
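A minimal sketch of simple random sampling with Python's standard library. The population of 10,000 IDs, the sample size of 30 and the seed are illustrative assumptions, not values from the cards.

```python
import random

# Hypothetical population: 10,000 patient IDs (illustrative only).
population = list(range(1, 10001))

# A fixed seed makes the draw reproducible; drop it for a fresh sample each run.
rng = random.Random(42)

# sample() draws without replacement, so each ID can be picked at most once.
sample = rng.sample(population, k=30)

print(len(sample), "IDs drawn, first few:", sorted(sample)[:5])
```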

2
Q

What are the two aspects of sampling?

A

randomisation and size

3
Q

What are the two types of samples?

A

Independent or dependent

4
Q

How to decide a population size?

A

depends on the research aim

5
Q

What information is needed to estimate sample size?

A

the clinically meaningful difference expected in practice

standard deviation from previous/pilot studies

6
Q

How to decide sample size?

A

statistical power of 80% to detect the difference in means (the usual choice in medical research)

0.05 significance level
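The inputs listed on these cards (expected difference, standard deviation, power, significance level) can be combined in the common normal-approximation formula for comparing two means. This is a sketch under that assumption. The function name and the example numbers (SD of 10, difference of 5) are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_per_group(sd, difference, alpha=0.05, power=0.80):
    """Approximate n per group for comparing two means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    n = 2 * ((z_alpha + z_power) * sd / difference) ** 2
    return math.ceil(n)

# Hypothetical inputs: pilot-study SD of 10 units, clinically relevant difference of 5 units.
n = sample_size_per_group(sd=10, difference=5)
print(n)  # 63 per group
```

Halving the difference to be detected roughly quadruples the required sample size, which is why the clinically relevant difference has to be fixed before the study.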

7
Q

Ideal sample size

A

as a rule of thumb, larger than 10; ideally 30 or more

8
Q

What is a double blind experiment?

A

neither the participants nor the researchers running the experiment know which group (treatment or control) each participant is in

9
Q

How do we know if data is normally distributed?

A

skewness coefficient: if it is more than 2 standard errors away from zero, the data is not normally distributed
P-P plot: if the dots do not lie around the line, the data is not normally distributed
Kolmogorov-Smirnov test: if the significance is below 0.05, the data is not normally distributed
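A rough sketch of the skewness check, assuming the "2 standard errors" rule refers to the standard error of skewness. The formulas are the commonly used adjusted sample skewness and its standard error. The function name and both data sets are made up for illustration.

```python
import math
from statistics import mean, stdev

def skewness_check(data):
    """Return (skewness, standard error, looks_normal) for a list of numbers."""
    n = len(data)
    m, s = mean(data), stdev(data)
    # Adjusted Fisher-Pearson sample skewness coefficient.
    skew = n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)
    # Standard error of skewness for a sample of size n.
    se = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew, se, abs(skew) <= 2 * se

symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 30]
print(skewness_check(symmetric))  # skewness close to 0
print(skewness_check(skewed))     # skewness well above 2 standard errors
```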

10
Q

What is the null hypothesis?

A

the two groups (G1 and G2) are not statistically different

if the p value from the test is below the chosen significance level (0.05 or 0.01) then reject the null hypothesis

11
Q

When do you accept the null hypothesis?

A

p > 0.05 (strictly speaking we "fail to reject" the null hypothesis; this does not prove it is true)

12
Q

what is the t-test?

A

a parametric test based on the t distribution, which is similar in shape to the normal distribution
used to compare the means of two groups of data

13
Q

what is an applied situation for a t-test

A

to compare the means of two groups of data, typically with small sample sizes; the data should be normally distributed

14
Q

When is ANOVA used?

A

when more than two means are compared with one another; if there are just two then the t-test is used

15
Q

what is degree of freedom?

A

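n - 1, where n is the number of samples: the number of values that are free to vary once the mean is fixed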

16
Q

what is variance?

A

a descriptor that shows how far the samples are spread from the centre of the data (the mean)
equal to the standard deviation squared
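A small illustration with Python's `statistics` module, using made-up values. It shows the variance as the average squared distance from the centre, and checks that it equals the standard deviation squared.

```python
from statistics import mean, pvariance, variance, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values

centre = mean(data)
# Population variance: mean of the squared distances from the centre (divide by n).
pop_var = pvariance(data)
# Sample variance: same idea, but divide by the degrees of freedom, n - 1.
sample_var = variance(data)

print(centre, pop_var, sample_var)
# The variance is the standard deviation squared.
print(abs(stdev(data) ** 2 - sample_var) < 1e-9)
```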

17
Q

what is an independent sample t-test?

A

compares the means of two samples taken from different (independent) groups on the same variable
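A sketch of the independent-samples t statistic, assuming equal variances in the two groups (the classic Student's t-test). The function name and the data are hypothetical.

```python
import math
from statistics import mean, variance

def independent_t(group1, group2):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se, n1 + n2 - 2

t, df = independent_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
print(t, df)
```

Here t is about 4 with 8 degrees of freedom. That exceeds the two-sided 0.05 critical value of 2.306 for df = 8, so the null hypothesis would be rejected.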

18
Q

what is paired t-test?

A

compares two sample means from the same subjects, measuring the same variable at two different times, e.g. pre- and post-test
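A sketch of the paired t statistic, which works on the difference within each pair rather than on the two groups separately. The pre/post scores are invented for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic and degrees of freedom for matched measurements."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # Standard error of the mean difference.
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1

pre = [10, 12, 9, 11, 13]
post = [12, 15, 10, 14, 14]
t, df = paired_t(pre, post)
print(t, df)
```

Here t is about 4.47 with 4 degrees of freedom, above the two-sided 0.05 critical value of 2.776 for df = 4.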

19
Q

what is variance?

A

similar to standard deviation and is a descriptor to show how far away the data is from the center
standard deviation squared

20
Q

what are the two variances?

A

variance within groups shows the differences amongst samples in the same group, and variance between groups shows the differences between the groups

21
Q

what is the mean of squared differences between groups and within groups

A

between groups = mean of squared differences between each group mean and the total (grand) mean
within groups = mean of squared differences between the samples and their own group mean

22
Q

what is the F value?

A

ANOVA uses the F value to obtain the p value and test whether the difference between groups is significant

F value = MS between groups / MS within groups
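A sketch of the one-way ANOVA F ratio built from the two mean squares. The function name and the three small groups are hypothetical. Each sum of squares is divided by its degrees of freedom (number of groups - 1, and number of samples - number of groups).

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df1, df2) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    df1 = len(groups) - 1                # number of groups - 1
    df2 = len(all_values) - len(groups)  # number of samples - number of groups
    # Between groups: squared distance of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: squared distance of each sample from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / df1) / (ss_within / df2), df1, df2

f_value, df1, df2 = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_value, df1, df2)
```

For these groups MS between is 13 and MS within is 1, so F is about 13 with (2, 6) degrees of freedom, above the 0.05 critical value of 5.14.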

23
Q

What are the two tests before/after ANOVA?

A

before- contrast

after- post-hoc

24
Q

how do we describe non-numeric data?

A

nominal and ordinal data

frequencies, percentages, distribution, charts- pie, bar, histogram, tables etc.

25
Q

what is the chi test?

A

also called the Pearson chi-square test
compares the observed and expected frequencies in each category to test whether all categories contain similar proportions of values
in short: is there a difference between the measured and expected values?

26
Q

Is Chi squared a parametric or non-parametric test?

A

Non parametric

27
Q

What is the expected frequency in a Chi test?

A

either all categories are expected to be equal, or the expected values are theoretical; the theoretical values depend on your expectation

28
Q

What is the goodness of fit test?

A

compares the observed and expected frequencies in each category
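A sketch of the goodness-of-fit statistic, the sum of (observed - expected)^2 / expected over the categories. When no expected frequencies are supplied, it assumes all categories are expected to be equal. The function name and the counts are illustrative.

```python
def chi_square_statistic(observed, expected=None):
    """Pearson chi-square statistic; equal expected frequencies by default."""
    if expected is None:
        total = sum(observed)
        expected = [total / len(observed)] * len(observed)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [18, 22, 20, 40]  # hypothetical counts in four categories
chi2 = chi_square_statistic(observed)
df = len(observed) - 1
print(chi2, df)
```

With 100 observations spread over four categories, the expected count is 25 each and the statistic is about 12.32 with 3 degrees of freedom. That is above the 0.05 critical value of 7.815, so the observed frequencies differ significantly from the expected ones.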

29
Q

What is the asym sig?

A

calculating the chi-square value gives a number, and this number is then converted into the Asymp. Sig. (asymptotic significance, i.e. the p value)
if the Asymp. Sig. is below 0.05 the result is significant

30
Q

In the Chi square test which function of SPSS do they keep going on about?

A

crosstab function

31
Q

what are the applications and limitations of the chi test?

A

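deals with nominal/ordinal data across two or more variables
can be used to compare numeric data once it is grouped into categories, but quantitative information is lost
applied condition: almost no limitation in application (though the expected frequencies should not be too small, commonly at least 5 per cell)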

32
Q

can t-test be applied to stuff without normal distribution?

A

no it cannot

33
Q

how to describe normal distribution vs non-normal distribution?

A

normal- mean and standard deviation

non-normal- median and quartiles

34
Q

what are the two conditions in parametric test?

A

the data must be numeric and normally distributed

35
Q

what are the criteria used in non-parametric tests?

A

the number of signs

the total of ranks in groups

36
Q

what does the sign test show?

A

shows which sample in each pair is larger (the sign of the difference) rather than the size of the difference between the two
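A sketch of the sign test: count the plus and minus signs, drop ties, and get an exact two-sided p value from the binomial distribution with probability 0.5. The function name and the paired scores are hypothetical.

```python
from math import comb

def sign_test(before, after):
    """Return (plus, minus, two-sided p value) for paired measurements."""
    plus = sum(1 for b, a in zip(before, after) if a > b)
    minus = sum(1 for b, a in zip(before, after) if a < b)
    n = plus + minus  # ties carry no sign and are dropped
    k = min(plus, minus)
    # Probability of seeing k or fewer of the rarer sign if both signs are equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return plus, minus, min(1.0, 2 * tail)

before = [5, 6, 7, 5, 6, 8, 7, 6, 5, 7]
after = [6, 7, 8, 6, 7, 9, 8, 7, 4, 6]
plus, minus, p = sign_test(before, after)
print(plus, minus, p)  # 8 2 0.109375
```

Only the direction of each change is used, so eight increases of 1 unit count exactly the same as eight increases of 100 units.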

37
Q

What is the Wilcoxon signed-rank test for dependent data?

A

similar to the paired t-test but for data that is not normally distributed; it compares the medians of two related groups
a non-parametric test

38
Q

what is the mann whitney test?

A

a non-parametric test
can be used on ordinal data, or on numeric data that is not normally distributed
compares two independent groups
uses ranked data
does not require the data to follow any particular distribution

39
Q

talk about the limitations and applications of non-parametric tests

A

chi-square, Mann-Whitney and Wilcoxon are three common non-parametric tests
used for nominal/ordinal data and to compare two or more variables
can compare numeric data that is not normally distributed

almost no limitations in applications

40
Q

what is the difference between the Mann-Whitney and Wilcoxon tests?

A

both are non-parametric tests and involve the summation of ranks
Mann-Whitney: independent samples
Wilcoxon: matched/dependent samples
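A sketch of how the two rank sums are formed. The helper names and data are hypothetical, and tied values share the average of the ranks they occupy. The Mann-Whitney test ranks all values from both independent groups together. The Wilcoxon signed-rank test ranks the absolute differences within matched pairs.

```python
def average_ranks(values):
    """Rank values from 1 upwards; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def mann_whitney_u(group1, group2):
    """U statistic for two independent groups (the smaller of U1 and U2)."""
    ranks = average_ranks(list(group1) + list(group2))
    n1, n2 = len(group1), len(group2)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    return min(u1, n1 * n2 - u1)

def wilcoxon_signed_rank(before, after):
    """Rank sums of the positive and negative differences for matched pairs."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # zero differences dropped
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus

print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
print(wilcoxon_signed_rank([10, 12, 9, 11, 13], [12, 15, 10, 14, 14]))
```

A U of 0, or a rank sum of 0 on one side, means the two sets of values do not overlap at all in rank order.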

41
Q

what plot can be used to measure whether there is a trend between two variables?

A

scatter/dot plot

42
Q

what is a correlation coefficient and when is it accepted?

A

used to describe whether two variables have a linear relationship and how strong that relationship is
also called Pearson's correlation coefficient
accepted as significant if the p value is below 0.05

43
Q

what is the range of values for the correlation coefficient?

A

-1 to 1
the closer the coefficient is to 1, the stronger the positive correlation; the closer it is to -1, the stronger the negative correlation
a correlation of 0 indicates that there is no linear correlation, and the best-fit line is horizontal
the closer the absolute value is to 1, the more strongly correlated the two variables are
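A sketch of Pearson's correlation coefficient computed directly from its definition. The function name and data are illustrative. The three examples land on the two ends and the middle of the -1 to 1 range.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2, 4, 6, 8, 10])    # y rises with x: perfect positive correlation
r_down = pearson_r(x, [10, 8, 6, 4, 2])  # y falls as x rises: perfect negative correlation
r_none = pearson_r(x, [3, 1, 4, 1, 3])   # no linear trend
print(r_up, r_down, r_none)
```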

44
Q

when is correlation coefficient significant?

A

when the p value is <0.05 (*)

when the p value is<0.01 (**)

45
Q

describe what the R value <0 and >0 means

A

<0 - one value decreases while another increases

>0- both values increase

46
Q

what is the intercept?

A

the point at which the line cuts through the y axis

47
Q

what is linear regression?

A

linear regression assumes that there is a linear relationship between two variables: y = b1 + b2*x
independent variable - x
dependent variable - y
b1 - intercept
b2 - slope

48
Q

what is standard error of the estimate?

A

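a measure of the typical distance between the actual values and the values predicted by the regression line; it indicates the likely range of error around a predicted value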

49
Q

what are residuals?

A

the actual values minus the predicted values, i.e. the errors produced by the model; the larger they are, the worse the model
they are summarised by the sum of squared differences between the actual and predicted values (the residual sum of squares)
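A sketch of a least-squares line fit that produces the residuals, their sum of squares, and the standard error of the estimate. It uses the cards' notation of b1 for the intercept and b2 for the slope. The function name and data points are made up.

```python
import math

def fit_line(x, y):
    """Least-squares intercept (b1) and slope (b2) for y = b1 + b2 * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b2 = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    b1 = my - b2 * mx
    return b1, b2

x = [1, 2, 3, 4]
y = [3, 5, 7, 10]
b1, b2 = fit_line(x, y)

predicted = [b1 + b2 * a for a in x]
# Residual = actual value minus predicted value.
residuals = [actual - pred for actual, pred in zip(y, predicted)]
ss_res = sum(r ** 2 for r in residuals)
# Standard error of the estimate: typical size of a residual (n - 2 degrees of freedom).
se_estimate = math.sqrt(ss_res / (len(x) - 2))

print(b1, b2, ss_res, se_estimate)
```

For these points the fitted line is roughly y = 0.5 + 2.3x, with a residual sum of squares of about 0.30.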

50
Q

what are the signs for regression and correlation coefficients?

A

regression coefficient- b1 and b2

correlation coefficient- R

51
Q

can linear regression be used for non-linear stuff?

A

if the variables aren't linearly related then they can be transformed to a suitable form (e.g. by taking logarithms) and then linear regression can be used

52
Q

what are censored cases?

A

cases whose outcome cannot be determined during data collection, for reasons outside the factor being studied (e.g. the participant is lost to follow-up, or the study ends before the event occurs)

53
Q

what data is needed for survival analysis?

A

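status: whether the event occurred or the information stopped (the case was censored)
time: how long each case was followed
factor: the factor being studied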

54
Q

describe survival analysis based on data quality and time period

A

data quality- the more samples, the better the results; if the number of samples isn't enough then don't do the survival analysis
time period

55
Q

what is meta-analysis?

A

a method that uses data from multiple sources to analyse what is favoured by most studies
provides the whole picture on a debatable issue from multiple sources
mainly looks at the mean and effect size?
considers sample size as the weight in the analysis?
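A very simplified sketch of the weighting idea, assuming each study's effect size is weighted by its sample size. The function name, effect sizes and sample sizes are invented. Real meta-analyses more commonly weight by the inverse of each study's variance.

```python
def weighted_mean_effect(effects, sample_sizes):
    """Sample-size-weighted mean of study effect sizes."""
    total = sum(sample_sizes)
    return sum(e * n for e, n in zip(effects, sample_sizes)) / total

# Three hypothetical studies: effect sizes and their sample sizes.
effects = [0.2, 0.5, 0.8]
sizes = [100, 50, 50]

pooled = weighted_mean_effect(effects, sizes)
unweighted = sum(effects) / len(effects)
print(pooled, unweighted)
```

The pooled estimate of about 0.425 sits below the unweighted mean of 0.5 because the largest study reported the smallest effect.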

56
Q

what is the odds ratio and how do you calculate it?

A

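the odds ratio can be used as an estimate of the relative risk when the outcome is rare
calculated from a 2x2 table as (a x d) / (b x c), where a and b are the exposed cases and non-cases, and c and d are the unexposed cases and non-cases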
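A sketch of the odds ratio from a 2x2 table, next to the relative risk, to show why the two agree when the outcome is rare. The function names and all counts are hypothetical.

```python
def odds_ratio(exp_cases, exp_noncases, unexp_cases, unexp_noncases):
    """Odds of the outcome in the exposed group divided by the odds in the unexposed group."""
    return (exp_cases * unexp_noncases) / (exp_noncases * unexp_cases)

def relative_risk(exp_cases, exp_noncases, unexp_cases, unexp_noncases):
    """Risk of the outcome in the exposed group divided by the risk in the unexposed group."""
    risk_exposed = exp_cases / (exp_cases + exp_noncases)
    risk_unexposed = unexp_cases / (unexp_cases + unexp_noncases)
    return risk_exposed / risk_unexposed

# Common outcome: the odds ratio overstates the relative risk.
print(odds_ratio(20, 80, 10, 90), relative_risk(20, 80, 10, 90))
# Rare outcome: the odds ratio is close to the relative risk.
print(odds_ratio(2, 998, 1, 999), relative_risk(2, 998, 1, 999))
```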

57
Q

in spss which function will give you odds ratio?

A

RISK function

58
Q

when do you make a forest plot?

A

meta- analysis

59
Q

define regression

A

construct an equation to describe the relationship between two or more variables

60
Q

what does the residual tell us about the model?

A

the lower the residuals, the more accurate the predictions and the better the model

61
Q

degree of freedom 1 vs 2?

A

df1 = number of groups - 1

df2 = number of samples - number of groups

62
Q

what is the homogeneity of variance test?

A

to test for the equality of group variances
not dependent on the assumption of normality
if the significance of the test is above 0.05 the variances are similar; otherwise they are not