Statistics Flashcards
What is population/global data?
A huge dataset to be investigated, or an experimental set of data within a specific condition
What is sample data?
A small set of data from the population/global data
What is sampling?
Randomly taking samples from the population
What is mean?
The sum of all sample values divided by the number of samples
What is standard deviation?
Gives the average distance of the samples from the mean
How is standard deviation calculated?
Square root of the (sum of the squared distances from the mean, divided by the number of samples - 1)
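Worked example (a minimal sketch in Python; the sample values are hypothetical and chosen only for illustration). It squares each distance from the mean, sums them, divides by n - 1 and takes the square root:

```python
import math

# Hypothetical sample values, chosen only for illustration.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(samples)
mean = sum(samples) / n

# Sum of the squared distances from the mean.
sum_sq = sum((x - mean) ** 2 for x in samples)

# Sample standard deviation: divide by n - 1, then take the square root.
sd = math.sqrt(sum_sq / (n - 1))

print(mean, sd)
```

The standard library function statistics.stdev(samples) should give the same value.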
What is frequency?
How often a value (or a group of similar values) occurs in the data
What is distribution?
The shape constructed by data frequencies
True or false:
A histogram can show frequency?
True
What type of curve can be plotted on a histogram?
A normal distribution curve
What is normal distribution?
A distribution in which most data lie in the middle and less data lie at either end of the curve. It is symmetrical around the mean. It shows that data near the mean occur more frequently than data far from the mean
What does variable view in SPSS show?
The name of the variable
The type of data the variable is
A label to describe the variable
Three data types in statistics
Numeric data
Nominal data
Ordinal data
What is numeric data?
Numbers
e.g. body mass, weight, stature
What is nominal data?
Category data without rank
e.g. gender
What is ordinal data?
Categories with rank
e.g. feeling
What does data format in SPSS allow?
Subject information to be entered
What is the standard error of the mean?
The standard deviation of means.
Shows the range in which a global mean could fall
How is standard error of the mean calculated?
Standard deviation divided by the square root of n
What percentage of means fall within mean +/- 1SE?
About 68% of means
In what percentage of means is mean +/- 2SE?
95% of means
In what percentage of means is mean +/- 3SE?
About 99.7% of means
What mean +/- SE is indicative of population mean?
Mean +/- 2 standard error - 95%
What is the confidence interval?
The range within which the global mean is likely to fall
What is a 95% confidence interval?
Approximately mean +/- 2 standard errors (1.96 for large samples)
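Worked example (a minimal sketch in Python; the sample values are hypothetical). It computes the standard error of the mean as SD divided by the square root of n, then an approximate 95% confidence interval using the mean +/- 2 SE rule of thumb from these cards:

```python
import math
import statistics

# Hypothetical sample values, chosen only for illustration.
samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(samples)
mean = statistics.mean(samples)
sd = statistics.stdev(samples)      # sample SD (n - 1 in the denominator)

# Standard error of the mean: SD divided by the square root of n.
sem = sd / math.sqrt(n)

# Approximate 95% confidence interval: mean +/- 2 SE.
ci_low = mean - 2 * sem
ci_high = mean + 2 * sem

print(sem, (ci_low, ci_high))
```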
What is the median?
The middle value of the sample data
What are quartiles?
The sample values at 25%, 50% and 75%
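Worked example (a minimal sketch in Python; the data are hypothetical). Note that software packages use slightly different interpolation rules for quartiles, so other tools may report slightly different values for the same data:

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
data = [1, 3, 5, 7, 9, 11, 13]

# The median is the middle value of the sorted data.
median = statistics.median(data)

# quantiles(..., n=4) returns the three cut points at 25%, 50% and 75%.
q1, q2, q3 = statistics.quantiles(data, n=4)

print(median, q1, q2, q3)
```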
What can a simple bar diagram be used for?
Counting number of cases
Showing percentages of cases
Showing two variables together
What does a pie plot show?
The percentage of data
What does the box plot show?
The median, quartiles and extreme values within a category
What does an error bar show?
Shows the mean and the 95% confidence interval or SD of the data
What does a scatter/dot diagram show?
Shows the tendency of the data or the relationship between variables
What is independent sample data?
When the measurements have no effect on each other
What is dependent sample data?
When the measurements have an effect on each other
True or false:
Whether the data is dependent or independent depends on the subjects
False - it depends on the experimental situation
Why do we need a reasonable sample size?
Fewer samples could result in a larger difference between the sample (local) values and the population (global) values
What is statistical power?
The likelihood of a significance test detecting an effect when there actually is one
How is population mean range calculated?
Mean +/- 2 SEM
What does the letter W represent in statistics?
The difference between population means
What is subjective data?
Information that comes from opinions, feelings, perceptions etc
What is objective data?
Results produced from measurements
3 ways to determine whether the data is normally distributed
Skewness coefficient
P-P plot
Kolmogorov-Smirnov test
What is skewness coefficient?
A measure of the asymmetry of a distribution
True or false:
Between 2 sets of sample data the mean is sometimes the same?
False - in practice the means will almost never be exactly the same
What is a T-test?
A parametric test, based on the t statistic, used to analyse data that are normally distributed or close to normally distributed
True or false:
The samples should be in normal distribution for T-test to be applied
True
What does a T-test do?
Examines if two means are significantly different, therefore requires two groups of data
What is a paired-sample T-test?
A T-test used when there are two measurements for each subject
What is an independent T-test?
It compares the means of two independent groups
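Worked example (a minimal sketch in Python; the function name independent_t and the data are hypothetical). It computes only the t statistic for two independent groups, assuming equal variances (pooled variance). The p-value needs the t distribution, which would normally come from a statistics package:

```python
import math
import statistics


def independent_t(a, b):
    """Return (t, degrees of freedom) for two independent groups.

    Assumes equal variances in both groups (pooled-variance t-test).
    """
    na, nb = len(a), len(b)
    var_a = statistics.variance(a)   # sample variance, n - 1 denominator
    var_b = statistics.variance(b)
    df = na + nb - 2

    # Pooled variance: weighted average of the two group variances.
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / df

    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled * (1 / na + 1 / nb))

    t = (statistics.mean(a) - statistics.mean(b)) / se_diff
    return t, df


# Hypothetical measurements from two independent groups.
t, df = independent_t([1, 2, 3], [2, 3, 4])
print(t, df)
```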
What test is used for testing for differences among multiple groups of data?
ANOVA
What is variance?
A descriptor to show how far away samples are from the data centre
What are the two variances?
The variance within a group shows differences between samples
The variance between groups shows differences between groups
What shows the variance between groups?
The distances between the group means and the total mean
What indicates the variance within a group?
The size/diameter of a circle/group
What are the two squared differences?
Between the group mean and the total mean
Between the samples and the group mean
What does ANOVA do?
Uses the F value to see if there are differences between groups
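Worked example (a minimal sketch in Python; the function name one_way_f and the data are hypothetical). It builds the F value from the two squared differences: between the group means and the total mean, and between the samples and their group mean:

```python
def one_way_f(groups):
    """Return the one-way ANOVA F value for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between groups: squared distance of each group mean from the
    # total mean, weighted by the group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )

    # Within groups: squared distance of each sample from its group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    ms_between = ss_between / (k - 1)        # between-group variance
    ms_within = ss_within / (n_total - k)    # within-group variance
    return ms_between / ms_within


# Hypothetical data for three groups.
f_value = one_way_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_value)
```

A large F value means the variance between groups is large compared with the variance within groups.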
One-way ANOVA:
What is it?
What are the conditions?
What is the applied situation?
It is an extended t-test for comparing multiple groups of data together
It requires an independent factor and a quantitative dependent variable
The applied situation is in multi-group data
True or false:
For ANOVA, the data should be in normal distribution or similar to normal distribution
True
What does a Chi squared test do?
Compares the observed and expected frequencies in each category to test whether all categories contain similar proportions of values
Tests whether there are significant differences between groups
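Worked example (a minimal sketch in Python; the counts are hypothetical). It assumes the expected frequency is the same in every category and computes the chi-squared statistic as the sum of (observed - expected) squared, divided by expected:

```python
# Hypothetical observed counts in four categories.
observed = [18, 22, 20, 40]

# Assumption: every category is expected to hold the same share.
expected = [sum(observed) / len(observed)] * len(observed)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1   # degrees of freedom

print(chi2, df)
```

The statistic is then compared with the chi-squared critical value for df degrees of freedom (7.815 at the 0.05 level when df = 3).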
Examples of non-numeric data
Score system
Pain
Treatment type
Equipment type
What do non-parametric tests do?
Directly use non-numeric information from data to compare sample groups
What is the Wilcoxon signed-ranks method?
It has the null hypothesis that two related medians are similar
Allows us to compare a single median against a known value or two medians from the same individual group
What is Mann-Whitney test used for?
To compare two independent groups
Suitable for non-numeric data
Uses rank information
What is a scatter dot graph used to show?
Whether or not two variables are correlated
Shows trend and pattern
What is the correlation coefficient?
Used to describe whether two variables have a linear relationship and how strong the relationship is
It measures how variables are linearly related
The closer to/further from 1, the stronger the correlation between variables
Closer to 1 (in absolute value; -1 is a perfect negative correlation)
What does R>0 mean?
A variable increases while another variable increases
What does R<0 mean?
A variable increases when another variable decreases
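Worked example (a minimal sketch in Python; the function name pearson_r and the data are hypothetical). It computes the Pearson correlation coefficient directly from the distances of each value from its mean:

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [xi - mean_x for xi in x]
    dy = [yi - mean_y for yi in y]

    # Sum of products of the paired distances, scaled by both spreads.
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(r)   # positive: y tends to increase as x increases
```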
What value is indicative of variables having a linear relationship?
xxx <0.05
What is linear regression?
It is used to construct an equation to describe the relationship between two, or multiple, variables
What are residuals?
The differences between the practical (observed) values and the predicted values; the regression line is chosen to minimise the sum of the squares of these differences
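Worked example (a minimal sketch in Python for one predictor; the data are hypothetical). It fits the least-squares line y = intercept + slope * x, then takes each residual as the observed value minus the predicted value and sums their squares:

```python
# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares slope and intercept for a single predictor.
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
intercept = mean_y - slope * mean_x

# Residual: observed value minus the value predicted by the line.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Sum of squared residuals, the quantity the fit minimises.
rss = sum(e ** 2 for e in residuals)

print(slope, intercept, rss)
```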
How can we use linear regression in variables that are not linear?
By transforming them into a suitable form by using a linear coefficient
What is survival analysis?
Initially used to analyse how death rates changed with age
Can now be used to analyse effect of medical therapies e.g. implants
What are censored cases?
Cases for which the outcome cannot be collected or determined, for some reason not related to the factor studied
What is meta-analysis?
A method that combines data from multiple sources to find the result favoured by most of the studies
Why is it good to carry out meta-analysis?
Many studies have their own attitudes, report only the differences, and give different conclusions/opinions
Steps for meta-analysis
Select a factor
Collect data from multiple sources
Calculate parameters
Make a forest plot
Give conclusions
What is the odds ratio?
The odds of the event in one group divided by the odds in another group, where the odds are the number of cases divided by the number of non-cases
What is the relative ratio?
The risk of the event in one group divided by the risk in another group, where the risk is the number of cases with the event divided by the number of samples
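Worked example (a minimal sketch in Python; the counts and the group names "treated" and "control" are hypothetical). It compares two groups, computing odds as cases divided by non-cases and risk as cases divided by the number of samples:

```python
# Hypothetical 2x2 table: counts of cases and non-cases in two groups.
treated_cases, treated_non_cases = 20, 80
control_cases, control_non_cases = 10, 90

# Odds: number of cases divided by number of non-cases.
odds_treated = treated_cases / treated_non_cases
odds_control = control_cases / control_non_cases
odds_ratio = odds_treated / odds_control

# Risk: number of cases divided by the number of samples in the group.
risk_treated = treated_cases / (treated_cases + treated_non_cases)
risk_control = control_cases / (control_cases + control_non_cases)
relative_risk = risk_treated / risk_control

print(odds_ratio, relative_risk)
```

A value of 1 for either ratio would mean no difference between the two groups.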