Statistics Flashcards by Niamh Fitzpatrick

What are the definitions of x̄ and SD? What do these indexes mean?

the mean is the average, so the sum of all samples / number of samples

it describes how data is concentrated

SD gives the average distance from samples to the centre value

it describes how data is separated
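
A minimal sketch of both indexes using Python's standard `statistics` module. The sample values are made up for illustration; `stdev` is the sample SD, which divides by n - 1.

```python
import statistics

# Made-up sample values, for illustration only.
samples = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(samples)  # sum of samples / number of samples
sd = statistics.stdev(samples)   # sample SD (divides by n - 1)

print("mean:", mean)
print("SD:", round(sd, 3))
```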

What is

a) data frequency?
b) data distribution?
c) what relationship exists between the two terms?

a) how often similar data occurs (number of times a sample value occurs)
b) the shape formed by the data frequencies
c) frequency constructs the distribution eg in a histogram
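
As a rough illustration of how frequency builds a distribution, the sketch below counts made-up values with `collections.Counter` and prints a simple text histogram.

```python
from collections import Counter

# Made-up sample values, for illustration only.
values = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Frequency: number of times each sample value occurs.
frequency = Counter(values)

# Distribution: the shape formed when the frequencies are laid side by side.
for value in sorted(frequency):
    print(value, "#" * frequency[value])
```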

If a huge number of samples are collected from nature

a) what is the distribution?
b) what shape is it?
c) what value is at the peak?

a) normal distribution
b) bell shaped
c) the mean

In ND, what is

a) x̄ ± 1SD
b) x̄ ± 2SD
c) x̄ ± 3SD equal to in %?

a) 68.27%
b) 95.45%
c) 99.73%

roughly 68, 95, 99.7
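
These percentages can be checked with `statistics.NormalDist` from the standard library. The helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution lying within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k) * 100:.2f}%")
```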

What types of file can be directly imported into SPSS?

excel
txt
direct input data in data view

What are the 3 main types of data?

numerical
nominal: categories without rank eg gender
ordinal: categories with rank eg satisfaction

In SPSS what data characteristics can be shown using the histogram?

frequency
distribution

In SPSS what can users do with the

a) variable view?
b) data view?

a) define variables: define name (starting with a letter), define type of data eg string/numeric, define width (how many positions), define number of decimals …
b) Edit, calculate and analyse data

How do we calculate the median?

median is the middle sample value

reorder the values from smallest to biggest and pick the middle one
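
A small sketch of the procedure. The card does not cover an even number of values; the usual convention, assumed here, is to take the mean of the two middle values.

```python
def median(values):
    """Middle value after sorting; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([7, 1, 5]))     # odd count: the single middle value
print(median([4, 1, 3, 2]))  # even count: mean of the two middle values
```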

how do we work out the mean?

the total of the numbers divided by how many numbers there are

what are the maximum and minimum for sample data?

highest and lowest values

they show us the data range

In SPSS what can users do with the crosstab function?

show 2 variables in one table

run chi-square as a hypothesis test

In an error bar what do the

a) circles
b) dashes

represent?

a) mean
b) SD, SEM, 95% CI

what characteristics from two variables can be shown using the scatter/dot?

tendency of the data or the relationship between variables: certain pattern or trend

using SPSS what file types can be exported as output?

direct copy to word
export as excel, word, powerpoint, txt, graph only

if two sets of sample data have different means are their global means significantly different? why?

depends on significance

if not significant, the difference is not coming from the population but from your sampling
if significant, then the global means are different

in test of hypothesis, what are primary/null hypothesis, H0 and alternative hypothesis, H1 and H2?

H0 means that there is no significant difference
H1/H2 show significant difference
- depending on group means
- H1 = M1>M2 so G1>G2
- H2 = M1<M2 so G1<G2

in test of hypothesis what significant levels are normally used?

low p <0.05

high p>0.05

(0.01 would be used to emphasise a strong significant difference)

when comparing two sets of sample data under what conditions are two groups of data considered to be significantly different?

p<0.05

What do a single asterisk and double asterisk represent in terms of statistics?

* = p<0.05

** = p<0.01

under what condition is a primary hypothesis accepted?

p>0.05: accept H0 (strictly, we fail to reject it)

what main indexes will influence the results in test of hypothesis?

4 values

x̄ ,mean
SD , standard deviation
n , no of samples
p , probability

in which situation can t test be applied?

numeric data

normal distribution data

two group

small sample size

to examine if two means are significantly different

if a group of subjects are measured twice in a time interval, eg pre and post treatments, are the measured variables independent or dependent? what statistical method can be used to compare the means?

dependent data
paired t-test
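
A minimal sketch of a paired t-test worked by hand: the test is run on the per-subject differences. The pre/post values are made up, and the critical value 2.776 is the standard t-table entry for df = 4, two-tailed, p = 0.05.

```python
import math
import statistics

# Made-up pre/post measurements for the same five subjects (illustration only).
pre = [140, 152, 138, 147, 150]
post = [135, 150, 136, 140, 146]

# Paired data: work with the difference for each subject.
differences = [after - before for before, after in zip(pre, post)]

mean_diff = statistics.mean(differences)
sem_diff = statistics.stdev(differences) / math.sqrt(len(differences))
t_value = mean_diff / sem_diff

T_CRITICAL = 2.776  # t table: df = n - 1 = 4, two-tailed, p = 0.05
significant = abs(t_value) > T_CRITICAL

print("t =", round(t_value, 3), "significant:", significant)
```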

if a group of subjects are treated in different conditions ie. each patient gets a different type of hip replacement, are the measured variables independent or dependent? what statistical methods can be used to compare?

- independent
- independent sample t test

what are usually applied situations for paired sample t test or independent sample t test?

- paired sample t test if dependent data
- independent sample t test if independent data

can t test be applied if data is not continuous?

no, a t test cannot be applied to non-numerical data; even ranked (ordinal) data is not continuous

What is the standard error of mean?

a sample mean deviates from the actual mean of the population; the standard error of the mean estimates the typical size of this deviation

How do we calculate standard error of mean?

SEM = SD/ √ n

What is the confidence interval of x̄ ± 2SEM?

95% confidence interval

How do SD and the number of samples influence SEM?

SEM = SD / √n, so if SD increases the SEM increases; if n increases the SEM decreases
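
The sketch below applies SEM = SD / √n to made-up samples, builds the approximate 95% interval x̄ ± 2SEM, and shows how SD and n pull the SEM in opposite directions. The function name `sem` is chosen for this example.

```python
import math
import statistics

def sem(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

# Made-up sample values, for illustration only.
samples = [10, 12, 14, 16, 18, 12, 14, 16, 14]

mean = statistics.mean(samples)
sample_sem = sem(statistics.stdev(samples), len(samples))

# Approximate 95% confidence interval of the mean.
ci_low = mean - 2 * sample_sem
ci_high = mean + 2 * sample_sem
print("mean:", mean, "SEM:", round(sample_sem, 3))
print("95% CI:", round(ci_low, 3), "to", round(ci_high, 3))

# Same SD, larger n: SEM falls. Same n, larger SD: SEM rises.
print(sem(6, 9), sem(6, 36), sem(12, 9))
```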

What does ANOVA stand for? What situation is it suitable?

Analysis of variance
numeric data, comparing multiple groups

If p<0.05 is found from ANOVA result, does it mean that all groups of data are significantly different from each other?

no, it means there is a significant difference somewhere among the groups, but not necessarily between all of them; a post-hoc test is needed to find which pairs of groups are significantly different

what does post hoc mean?

after ANOVA, if p<0.05 we do post-hoc tests for all groups to see which specific pairs are significantly different

if data is ordinal what method can test the hypothesis?

chi-square
non-parametric test (Wilcoxon or Mann-Whitney)

In chi square analysis what are the theoretical values used to compare with practical data?

all categories equal in percentage and sample size so that they all have same chance eg 50% for two, 33.3% for 3
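
A sketch of the comparison, assuming equal theoretical proportions as the card describes. The observed counts are made up; 5.991 is the standard chi-square table value for df = 2 at p = 0.05.

```python
# Made-up observed counts for three categories (illustration only).
observed = [30, 50, 40]

# Theoretical values: every category gets the same share of the total (33.3% each).
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

CHI_CRITICAL = 5.991  # chi-square table: df = categories - 1 = 2, p = 0.05
significant = chi_square > CHI_CRITICAL

print("expected:", expected)
print("chi-square:", chi_square, "significant:", significant)
```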

in using chi square analysis, assuming that multiple groups of data are significantly different, does this mean that any two of them will be significantly different?

not necessarily: one p value for multiple groups does not mean there is a significant difference between any two groups; we need to run post-hoc tests to check

What is the difference between nominal and ordinal data?

ordinal is a category with a rank eg satisfaction whereas nominal is just a category like gender

when comparing means for two+ groups of data what situations should users apply non-parametric methods?

non-numeric information, so ordinal or nominal data
numeric but not ND

how to use statistical equations and text book tables eg. t table or z-table, to assess if two groups are significantly different?

we have an equation and we use it to calculate a value
- eg t test: calculate the t value from our own data
- use the t table from the textbook to find the t critical value and compare the two values
- the t critical value defines a range
- our t value will fall in p<0.05 or p>0.05
- for p<0.05 our t value needs to be more than t critical (which is usually about 2)
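
The steps above can be sketched for an independent-samples t test. This version assumes equal variances (a pooled-variance t test), which the card does not specify. The data are made up, and 2.306 is the standard t-table value for df = 8, two-tailed, p = 0.05.

```python
import math
import statistics

# Made-up values for two independent groups (illustration only).
group_a = [5, 7, 8, 6, 9]
group_b = [3, 4, 6, 5, 2]
n_a, n_b = len(group_a), len(group_b)

# Step 1: calculate the t value from our own data (pooled variance assumed).
pooled_variance = (
    (n_a - 1) * statistics.variance(group_a)
    + (n_b - 1) * statistics.variance(group_b)
) / (n_a + n_b - 2)
standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
t_value = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

# Step 2: look up t critical in the table and compare.
T_CRITICAL = 2.306  # df = n_a + n_b - 2 = 8, two-tailed, p = 0.05
significant = abs(t_value) > T_CRITICAL  # True means p < 0.05

print("t =", round(t_value, 3), "significant:", significant)
```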

what are the main differences between parametric and non-parametric tests?

parametric tests use parameters to analyse numeric data and calculate probability
non-parametric tests directly use non-numeric info to compare groups

in non-parametric test methods, what kinds of information are used to assess differences between groups of data? what types of data are used?

- the number of signs
- the total of ranks

data is ordinal and rank (scale if not ND)
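
To illustrate "the total of ranks", the sketch below ranks two made-up groups together and sums the ranks per group, which is the raw ingredient of a rank-sum test. It assumes no tied values; real procedures give tied values their average rank.

```python
# Made-up values for two independent groups, with no ties (illustration only).
group_a = [3, 5, 8]
group_b = [1, 2, 4, 6]

# Rank all values together: the smallest value gets rank 1.
combined = sorted(group_a + group_b)
rank_of = {value: position for position, value in enumerate(combined, start=1)}

# The total of ranks in each group is what the test compares.
rank_sum_a = sum(rank_of[value] for value in group_a)
rank_sum_b = sum(rank_of[value] for value in group_b)

print("rank sum A:", rank_sum_a, "rank sum B:", rank_sum_b)
```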

when a scatter/dot graph shows that two variables have a certain association can we say that they are linearly associated?

no this shows the trend then we need to work out the correlation coefficient and use the significant level to confirm the trend

if two variables have a linear correlation, how much significant value would be expected after the test of hypothesis?

p<0.05

what range should the correlation coefficient be kept within?

-1 to 1
- max 1
- min -1 (0 means no linear correlation)

is it possible that a correlation coefficient can be negative?

yes it is possible
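
A hand-rolled sketch of Pearson's correlation coefficient on made-up data, showing one positive and one negative result. The function name `pearson_r` is chosen for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    covariance_sum = sum(a * b for a, b in zip(dx, dy))
    spread = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return covariance_sum / spread

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 5]   # tends to increase with x
falling = [5, 4, 3, 2, 1]  # decreases exactly as x increases

print(round(pearson_r(x, rising), 4))
print(round(pearson_r(x, falling), 4))
```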

to obtain a linear regression equation, what coefficients will be calculated or estimated?

y = b1 + b2x
- b1 is the intercept
- b2 is the slope

what is the definition of residuals in linear regression?

the actual values of dependent item minus its predicted values ie. the errors produced by the model
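
A least-squares sketch on made-up data that estimates the intercept b1 and slope b2 of y = b1 + b2x, then computes the residuals as actual minus predicted values.

```python
# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares estimates for y = b1 + b2 * x.
b2 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
b1 = mean_y - b2 * mean_x

# Residuals: actual value minus the value predicted by the model.
predicted = [b1 + b2 * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

print("intercept b1:", round(b1, 3), "slope b2:", round(b2, 3))
print("residuals:", [round(r, 3) for r in residuals])
```

With an intercept in the model, the least-squares residuals sum to zero (up to rounding), which is a quick sanity check on the fit.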

can the method of linear regression be extended to non-linear situations?

yes, replace the non-linear variable with a transformed one (eg x² or log x) so that the equation is linear in the new variable

why does sampling procedure have to be randomised?

so that data is representative of the population

what is the censored case? what method is used to analyse the case?

when collecting data some of the cases cannot be collected for some reason not related to the factor studied - analyse using survival analysis (big sample size, long time data)

what is the main idea in meta analysis?

uses multi-source data eg. local research centre, publication to analyse what is favoured by most studies

What is statistics?

taking a population with huge amounts of data and sampling this to get important information from a smaller amount of data

what is the sample mean formula?

x̄ = ( Σ xi ) / n

What does xi mean?

each individual x value (the i-th sample); Σ xi adds up all of the x values

What is the equation for standard deviation?

SD = √[ Σ (xi – x̄)² / (n – 1) ]

In what type of sample is it hard to have ND?

small samples

In which type of distribution do we use SEM?

normal

When distribution isn't normal what average do we use?

median

Which type of distribution is a boxplot used for?

not normal

What do we plot normal distribution on?

error bar

can we analyse the mean, median and SD for ordinal data?

no so need non-parametric tests eg chi-square

If we have dependent, non-numerical data which hypothesis test should be used?

wilcoxon signed-ranks test

if we have independent, non-numeric data which test should we use?

Mann-Whitney

In linear regression equation what is a) dependent variable? b) independent variable? c) intercept? d) slope? which is the constant and which is the coefficient

a) y
b) x
c) b1 (constant)
d) b2 (coefficient)

in a linear regression equation, what coefficients will be calculated or estimated?

the coefficient in front of the variable and a constant

When we have a 95% CI what are the upper and lower limits of the mean?

x̄ +/- 2SEM

What is the degree of freedom?

n-1

What type of distribution is needed for ANOVA?

normal

Which test can be used to hypothesis test percentage data?

chi square

What is R? When it is used, what does the data need to be?

Pearson's correlation coefficient
continuous

Rewrite the linear regression equation in terms of what each value represents

y = b1 + b2x
dependent variable = constant + coefficient × (independent variable)

Statistics Flashcards

(72 cards)