Statistics Flashcards

1
Q

What are the definitions of x̄ and SD? What do these indexes mean?

A

the mean is the average: the sum of all sample values divided by the number of samples

  • it describes where the data is centred

SD (standard deviation) gives the typical distance from the samples to the centre value (the mean)

  • it describes how spread out the data is
2
Q

What is

a) data frequency?
b) data distribution?
c) what relationship exists between the two terms?

A

a) how often similar data occurs (the number of times a sample value occurs)
b) the shape formed when the frequencies are plotted across the range of values
c) the frequencies construct the distribution, e.g. in a histogram
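
As a minimal sketch of how frequency builds a distribution, the Python below counts how often each value occurs and prints a text histogram. The `visits` data and the name `frequency_table` are made up for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Count how many times each sample value occurs, sorted by value."""
    return dict(sorted(Counter(values).items()))

# Hypothetical sample data (not from the deck)
visits = [2, 3, 3, 1, 2, 3, 4, 2, 3]
table = frequency_table(visits)

# Text histogram: one '#' per occurrence, so the shape shows the distribution
for value, count in table.items():
    print(value, "#" * count)
```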

3
Q

If a huge number of samples is collected from nature

a) what is the distribution?
b) what shape is it?
c) what value is at the peak?

A

a) normal distribution
b) bell shaped
c) the mean

4
Q

In a normal distribution (ND), what percentage of the data falls within

a) x̄ ± 1SD
b) x̄ ± 2SD
c) x̄ ± 3SD?

A

a) 68.27%
b) 95.45%
c) 99.73%

roughly 68, 95, 99.7
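
These percentages can be checked with a short Python sketch. It assumes a perfect normal distribution and uses the standard error function; the name `coverage_within` is made up for illustration.

```python
import math

def coverage_within(k):
    """Fraction of a normal distribution lying within k SD either side of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {coverage_within(k) * 100:.2f}%")
```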

5
Q

What types of file can be directly imported into SPSS?

A
  • Excel
  • txt (plain text)
  • data typed directly into Data View
6
Q

What are the 3 main types of data?

A
  • numerical
  • nominal: categories without rank eg gender
  • ordinal: categories with rank eg satisfaction
7
Q

In SPSS what data characteristics can be shown using the histogram?

A
  • frequency
  • distribution
8
Q

In SPSS what can users do with the

a) variable view?
b) data view?

A

a) define variables: the name (starting with a letter), the data type e.g. string/numeric, the width (how many positions), the number of decimals …
b) edit, calculate and analyse data

9
Q

How do we calculate the median?

A

the median is the middle sample value

  • reorder the values from smallest to biggest and pick the middle one (with an even number of values, take the average of the two middle ones)
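
A minimal Python sketch of this procedure, with the usual convention of averaging the two middle values when the count is even (the function name `median` is chosen for illustration):

```python
def median(values):
    """Sort the values and return the middle one (mean of the two middle ones if even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([5, 1, 3]))     # odd count: middle value
print(median([4, 1, 3, 2]))  # even count: average of the two middle values
```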
10
Q

how do we work out the mean?

A

the total of the numbers divided by how many numbers there are

11
Q

what are the maximum and minimum for sample data?

A

the highest and lowest values

they show us the data range

12
Q

In SPSS what can users do with the crosstab function?

A

show 2 variables in one table

run a chi-square test of the hypothesis

13
Q

In an error bar what do the

a) circles
b) dashes

represent?

A

a) mean
b) the spread around the mean: SD, SEM or 95% CI

14
Q

what characteristics of two variables can be shown using a scatter/dot plot?

A

the tendency of the data, or the relationship between the variables: a certain pattern or trend

15
Q

using SPSS what file types can be exported as output?

A
  • direct copy to word
  • export as excel, word, powerpoint, txt, graph only
16
Q

if two sets of sample data have different means, are their global (population) means significantly different? why?

A

it depends on the significance test

  • if not significant, the difference may come from your sampling rather than from the populations
  • if significant, the global means are considered different
17
Q

in a test of hypothesis, what are the primary/null hypothesis, H0, and the alternative hypotheses, H1 and H2?

A
  • H0 means that there is no significant difference
  • H1/H2 show significant difference
    • depending on group means
    • H1 = M1>M2 so G1>G2
    • H2 = M1<M2 so G1<G2
18
Q

in a test of hypothesis what significance levels are normally used?

A

low p<0.05 (significant)

high p>0.05 (not significant)

(0.01 would be used to emphasise a strong significant difference)

19
Q

when comparing two sets of sample data under what conditions are two groups of data considered to be significantly different?

A

p<0.05

20
Q

What do a single asterisk and double asterisk represent in terms of statistics?

A

* = p<0.05

** = p<0.01

21
Q

under what condition is a primary hypothesis accepted?

A

p>0.05: accept H0 (strictly, we fail to reject it)

22
Q

what main indexes will influence the results in test of hypothesis?

A

4 values

  • x̄, mean
  • SD, standard deviation
  • n, number of samples
  • p, probability
23
Q

in which situation can the t test be applied?

A

numeric data

normally distributed data

two groups

small sample size

to examine if two means are significantly different

24
Q

if a group of subjects are measured twice in a time interval, eg pre and post treatments, are the measured variables independent or dependent? what statistical method can be used to compare the means?

A
  • dependent data
  • paired t-test
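
As a hedged sketch of what a paired t-test computes, the Python below works on the pre/post differences for each subject. The `before` and `after` values and the name `paired_t` are invented for illustration; in practice SPSS reports t, the degrees of freedom and p for you.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, degrees of freedom) for paired measurements on the same subjects."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sem_diff = statistics.stdev(diffs) / math.sqrt(n)  # SEM of the differences
    return mean_diff / sem_diff, n - 1

# Hypothetical pre- and post-treatment scores for five subjects
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 15]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t value is then compared with the critical value from a t table for the same degrees of freedom.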
25
Q

if a group of subjects are treated in different conditions ie. each patient gets a different type of hip replacement, are the measured variables independent or dependent? what statistical methods can be used to compare?

A
  • independent
  • independent sample t test
26
Q

in what situations are the paired sample t test and the independent sample t test usually applied?

A
  • paired sample t test if the data is dependent
  • independent sample t test if the data is independent
27
Q

can the t test be applied if the data is not continuous?

A

no

a t test cannot be applied to non-numerical data; even ranked data is not continuous

28
Q

What is the standard error of mean?

A

a sample mean deviates from the actual mean of the population; the standard error of the mean estimates the typical size of this deviation

29
Q

How do we calculate standard error of mean?

A

SEM = SD/ √ n

30
Q

What is the confidence interval of x̄ ± 2SEM?

A

95% confidence interval

31
Q

How do SD and the number of samples influence SEM?

A

SEM = SD/√n, so if SD increases the SEM increases; if n increases the SEM decreases
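
A small Python sketch of the formula shows both effects. The SD and n values are arbitrary illustration numbers.

```python
import math

def sem(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

baseline = sem(10, 25)      # SD = 10, n = 25
bigger_sd = sem(20, 25)     # doubling SD doubles the SEM
bigger_n = sem(10, 100)     # four times the samples halves the SEM
print(baseline, bigger_sd, bigger_n)
```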

32
Q

What does ANOVA stand for? What situation is it suitable for?

A

Analysis of variance

numeric data, comparing multiple groups

33
Q

If p<0.05 is found from ANOVA result, does it mean that all groups of data are significantly different from each other?

A

no, it means there is a significant difference somewhere among the groups, but not necessarily between all groups

we need a post-hoc test to know which pairs of groups have a significant difference

34
Q

what does post hoc mean?

A

after ANOVA, if p<0.05, we run a post-hoc test on all groups to see which specific pairs are significantly different

35
Q

if data is ordinal what method can test the hypothesis?

A

chi-square

non-parametric test (Wilcoxon or Mann-Whitney)

36
Q

In chi square analysis what are the theoretical values used to compare with practical data?

A

all categories are expected to be equal in percentage and sample size, so that they all have the same chance

e.g. 50% for two categories, 33.3% for three
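
A hedged sketch of the comparison in Python: the observed counts are compared with equal expected counts. The counts and the name `chi_square_equal_expected` are invented for illustration.

```python
def chi_square_equal_expected(observed):
    """Chi-square statistic when every category is expected to have the same count."""
    expected = sum(observed) / len(observed)  # e.g. 33.3% of the total for 3 categories
    return sum((o - expected) ** 2 / expected for o in observed)

# Hypothetical counts for three categories (120 cases, so 40 expected in each)
observed = [30, 50, 40]
chi2 = chi_square_equal_expected(observed)
print(chi2)
```

The statistic is then compared with the critical value from a chi-square table, using degrees of freedom equal to the number of categories minus 1.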

37
Q

in using chi square analysis, assuming that multiple groups of data are significantly different, does this mean that any two of them will be significantly different?

A

not necessarily

one p value for multiple groups does not mean there is a significant difference between any two particular groups

we need to run a post-hoc test to check

38
Q

What is the difference between nominal and ordinal data?

A

ordinal is a category with a rank eg satisfaction whereas nominal is just a category like gender

39
Q

when comparing two or more groups of data, in what situations should users apply non-parametric methods?

A

non-numeric information, i.e. ordinal or nominal data

numeric data that is not normally distributed (ND)

40
Q

how do we use statistical equations and textbook tables, e.g. the t table or z table, to assess if two groups are significantly different?

A

we have an equation and we use it to calculate a value

  • e.g. for a t test, calculate the t value from our own data
  • use the t table from a textbook to find the t critical value and compare the two values
  • the t critical value defines a range
  • our t value will fall in p<0.05 or p>0.05
  • for p<0.05, our t value (ignoring its sign) needs to be larger than t critical (which is usually about 2)
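
The steps above can be sketched in Python for an independent two-sample t test. This assumes equal variances (pooled SD); the data are invented, and `T_CRITICAL` is the textbook t table value for 8 degrees of freedom at two-tailed p = 0.05.

```python
import math
import statistics

def independent_t(group1, group2):
    """Return (t, degrees of freedom) using the pooled-variance formula."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = statistics.variance(group1), statistics.variance(group2)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    t = (statistics.mean(group1) - statistics.mean(group2)) / se
    return t, n1 + n2 - 2

# Hypothetical measurements for two independent groups
group1 = [5, 6, 7, 8, 9]
group2 = [1, 2, 3, 4, 5]
t, df = independent_t(group1, group2)

T_CRITICAL = 2.306  # t table: df = 8, two-tailed, p = 0.05
significant = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, df = {df}, p<0.05: {significant}")
```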
41
Q

what are the main differences between parametric and non-parametric tests?

A

parametric tests use parameters (e.g. mean and SD) to analyse numeric data and calculate probability

non-parametric tests directly use non-numeric info (e.g. signs or ranks) to compare groups

42
Q

in non-parametric test methods, what kinds of information are used to assess differences between groups of data?

what types of data are used?

A
  • the number of signs
  • the total of ranks

the data is ordinal or ranked (scale data can be used if it is not ND)
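
As an illustration of "the total of ranks", the Python sketch below ranks two groups together and sums the ranks per group, which is the raw ingredient of a Mann-Whitney style comparison. Giving tied values the average rank is a common convention assumed here; the scores are invented.

```python
def rank_sums(group_a, group_b):
    """Rank all values together (ties share the average rank) and total the ranks per group."""
    pooled = sorted(group_a + group_b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # positions i..j-1 hold the same value; their ranks are i+1..j
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    return (sum(rank_of[v] for v in group_a),
            sum(rank_of[v] for v in group_b))

# Hypothetical satisfaction scores (1 = low, 5 = high) for two groups
sums = rank_sums([1, 2, 2, 3], [3, 4, 5, 5])
print(sums)
```

A large gap between the two rank totals suggests the groups differ; the test then converts that gap into a p value.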

43
Q

when a scatter/dot graph shows that two variables have a certain association can we say that they are linearly associated?

A

no, the plot only shows the trend; we then need to work out the correlation coefficient and use the significance level to confirm the trend

44
Q

if two variables have a linear correlation, what significance value would be expected from the test of hypothesis?

A

p<0.05

45
Q

what range should the correlation coefficient be kept within?

A

-1 to 1

  • max 1 (perfect positive correlation)
  • min -1 (perfect negative correlation)
  • 0 means no linear correlation
46
Q

is it possible that a correlation coefficient can be negative?

A

yes it is possible
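
A minimal Python sketch of Pearson's correlation coefficient shows a positive and a negative result. The data are invented; the function name `pearson_r` is chosen for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient; always between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # y goes up as x goes up
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # y goes down as x goes up
print(rising, falling)
```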

47
Q

to obtain a linear regression equation, what coefficients will be calculated or estimated?

A

y = b1 + b2x

  • b1 is the intercept
  • b2 is the slope
48
Q

what is the definition of residuals in linear regression?

A

the actual values of the dependent variable minus its predicted values

i.e. the errors produced by the model
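
A hedged Python sketch: fit y = b1 + b2x by ordinary least squares (the standard method, assumed here), then compute the residuals as actual minus predicted. The data points are invented.

```python
def fit_line(xs, ys):
    """Least-squares estimates of the intercept b1 and slope b2 in y = b1 + b2*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b2 = sxy / sxx
    b1 = mean_y - b2 * mean_x
    return b1, b2

def residuals(xs, ys, b1, b2):
    """Actual y minus predicted y for every data point."""
    return [y - (b1 + b2 * x) for x, y in zip(xs, ys)]

# Hypothetical data
xs = [1, 2, 3, 4]
ys = [3, 5, 6, 9]
b1, b2 = fit_line(xs, ys)
res = residuals(xs, ys, b1, b2)
print(b1, b2, res)
```

With a least-squares fit that includes an intercept, the residuals add up to zero (apart from rounding).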

49
Q

can the method of linear regression be extended to non-linear situations?

A

yes

replace the non-linear variable with a linear one, e.g. use x² or log x as a new variable so that the equation becomes linear in it
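
One possible reading of this, sketched in Python: if y is assumed to depend on x², create a new variable z = x² and fit an ordinary straight line of y against z. The data, the x² relationship and the function name are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares intercept b1 and slope b2 for y = b1 + b2*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b2 = sxy / sxx
    return mean_y - b2 * mean_x, b2

# Hypothetical data following y = 2 + 3*x^2 (a curve when plotted against x)
xs = [1, 2, 3, 4]
ys = [5, 14, 29, 50]

zs = [x ** 2 for x in xs]   # new variable z = x^2
b1, b2 = fit_line(zs, ys)   # y = b1 + b2*z is now a straight line
print(b1, b2)
```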

50
Q

why does sampling procedure have to be randomised?

A

so that data is representative of the population

51
Q

what is a censored case? what method is used to analyse such cases?

A

when collecting data, the outcome for some cases cannot be collected for a reason not related to the factor studied

  • analyse using survival analysis (big sample size, long time data)
52
Q

what is the main idea in meta analysis?

A

uses data from multiple sources, e.g. local research centres, publications

to analyse what is favoured by most studies

53
Q

What is statistics?

A

taking a population with huge amounts of data and sampling this to get important information from a smaller amount of data

54
Q

what is the sample mean formula?

A

x̄ = ( Σ xi ) / n

55
Q

What does xi mean?

A

each individual x value: xi is the i-th sample value, and Σ xi adds up all of them

56
Q

What is the equation for standard deviation?

A

SD = √( Σ ( xi – x̄ )² / ( n – 1 ) )
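
A minimal Python sketch of the sample mean and sample SD formulas, using invented data. The function names are chosen for illustration.

```python
import math

def sample_mean(xs):
    """Sum of all x values divided by n."""
    return sum(xs) / len(xs)

def sample_sd(xs):
    """Square root of the summed squared deviations from the mean, divided by n - 1."""
    mean = sample_mean(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))

# Hypothetical sample
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_mean(data), round(sample_sd(data), 3))
```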

57
Q

In what type of sample is it hard to have ND?

A

small samples

58
Q

In which type of distribution do we use SEM?

A

normal

59
Q

When distribution isn’t normal what average do we use?

A

median

60
Q

Which type of distribution is a boxplot used for?

A

not normal

61
Q

What do we plot normally distributed data on?

A

an error bar chart

62
Q

can we analyse the mean, median and SD for ordinal data?

A

no

so need non-parametric tests eg chi-square

63
Q

If we have dependent, non-numerical data which hypothesis test should be used?

A

wilcoxon signed-ranks test

64
Q

if we have independent, non-numeric data which test should we use?

A

Mann-Whitney

65
Q

In linear regression equation what is

a) dependent variable?
b) independent variable?
c) intercept?
d) slope?

which is the constant and which is the coefficient

A

a) y
b) x
c) b1 (constant)
d) b2 (coefficient)

66
Q

in a linear regression equation, what coefficients will be calculated or estimated?

A

the coefficient in front of the variable and a constant

67
Q

When we have a 95% CI what are the upper and lower limits of the mean?

A

x̄ ± 2SEM (upper limit = x̄ + 2SEM, lower limit = x̄ – 2SEM)
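
As a sketch, the limits can be computed in Python from a sample. The data are invented, and the multiplier 2 is the rounded rule of thumb used in these cards.

```python
import math
import statistics

def ci95(xs):
    """Approximate 95% confidence limits of the mean: mean minus/plus 2 SEM."""
    mean = statistics.mean(xs)
    sem = statistics.stdev(xs) / math.sqrt(len(xs))
    return mean - 2 * sem, mean + 2 * sem

# Hypothetical sample
data = [2, 4, 4, 4, 5, 5, 7, 9]
lower, upper = ci95(data)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```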

68
Q

What is the degree of freedom?

A

n-1

69
Q

What type of distribution is needed for ANOVA?

A

normal

70
Q

Which test can be used for hypothesis testing of percentage data?

A

chi square

71
Q

What is R?

When it is used, what does the data need to be?

A

pearson’s correlation coefficient

continuous

72
Q

Rewrite the linear regression equation in words: what does each value represent?

A

y= b1 + b2x

dependent variable = constant + coefficient (independent variable)