Variables Flashcards
When is a Spearman’s rho correlation test used
Non-linear but monotonic relationships
Two ordinal, interval or ratio variables
Any distribution
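A minimal sketch of Spearman's rho, assuming it is computed as Pearson's r on the ranks of the data, with tied values given their average rank. The function names and the data are made up for illustration.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson correlation of the ranks of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]  # y = x**3: not linear, but always increasing
print(spearman_rho(x, y))
```

Because only the ordering matters, the curved relationship above still gives a rho of 1.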
What variables are used in simple linear regression
1 continuous predictor variable
1 continuous outcome variable
What does a z score show
How many standard deviations a value is from the mean
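A short sketch of the calculation, assuming the mean and standard deviation are already known. The numbers are hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical test with mean 100 and standard deviation 15
print(z_score(130, mean=100, sd=15))  # 2.0
print(z_score(85, mean=100, sd=15))   # -1.0
```

A negative z score means the value is below the mean.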
What is the empirical rule
In a normal distribution, about 68% of scores are within 1 standard deviation of the mean
95% are within 2
99.7% are within 3
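The percentages can be checked against the standard normal distribution. This sketch uses Python's built-in NormalDist to compute the share of values lying within k standard deviations either side of the mean; the helper name share_within is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k SDs either side of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```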
What do correlation tests do
Check if variables are related without hypothesising a cause and effect relationship
Composite variable
A combination of other variables
Created when data is analysed, not when it is measured
What is a correlation coefficient
A number between 1 and -1 that shows the strength and direction of a relationship between variables
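As one concrete case, Pearson's r can be computed by hand. This is a sketch with made-up data, not the only kind of correlation coefficient.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation: covariation of xs and ys scaled to lie in [-1, 1]."""
    mx, my = mean(xs), mean(ys)
    co_variation = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mx) ** 2 for x in xs))
    spread_y = sqrt(sum((y - my) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

hours_studied = [1, 2, 3, 4, 5]    # made-up data
exam_marks = [52, 55, 61, 64, 70]  # made-up data
print(round(pearson_r(hours_studied, exam_marks), 3))
```

The sign gives the direction of the relationship and the size gives its strength.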
What is homogeneity of variance
The assumption that the variance within each group being compared is similar across each group
Control variable
Variable that is kept constant throughout the experiment
Latent variable
A variable that is not measured directly but inferred via proxy
What is standard deviation
How much, on average, the data points differ from the mean of the sample
Shows how much variability is in the dataset
Why is a standard error important
It helps estimate how well your sample data represents the whole population
How is the standard deviation calculated
It is the square root of: the sum of the squared differences between each value and the population mean, divided by the number of values in the population
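The calculation can be sketched step by step and compared with Python's built-in pstdev. The data is made up.

```python
from math import sqrt
from statistics import pstdev

def population_sd(values):
    """Population standard deviation, following the steps in the definition."""
    mu = sum(values) / len(values)                  # population mean
    squared_diffs = [(x - mu) ** 2 for x in values] # squared distance from the mean
    return sqrt(sum(squared_diffs) / len(values))   # root of the average squared distance

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(data))  # 2.0
print(pstdev(data))         # same result from the standard library
```

For a sample rather than a whole population, the sum is usually divided by n - 1 instead (statistics.stdev).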
Continuous variable
A quantitative variable that can take any value within a range, so the number of possible values is infinite, eg height
Binary variables
Categorical variable with only two possible answers
When is a chi squared test used
With categorical independent and dependent variables
In place of Pearson’s r if the data does not meet its assumptions
Categorical variable
Represents data divided into categories, eg gender, hair colour
What are the three types of t tests
One sample
Two sample/independent
Paired
Dependent variable
Variable that is affected by the independent variable but not changed directly by the researcher
When is a point-biserial correlation test used
Linear relationships
One binary variable and one quantitative variable
Normal distribution
What is a linear relationship between variables
When a change in one variable goes with a proportional change in another, so the data points fall along a straight line
What three assumptions are made with parametric tests
Homogeneity of variance
Normality of data (bell curve) - only applies to quantitative data
Independence of observations - the observations included are not related to each other, eg no repeated measurements of the same subject
What are regression tests used for
Looking for cause and effect relationships
Used to estimate the effect of one or more continuous variables on another variable
What are residuals
The difference between the observed value and the mean value that a particular model predicts for that observation.
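A sketch of residuals for a straight-line model, assuming the line is fitted by ordinary least squares. The function names and data are made up for illustration.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

def residuals(xs, ys):
    """Observed value minus the value the fitted line predicts, per observation."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]
print([round(r, 2) for r in residuals(xs, ys)])
```

With a least-squares line that includes an intercept, the residuals always sum to zero (up to rounding).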
What variables are used in logistic regression
A continuous predictor variable
A binary outcome variable
What does autocorrelation show
The degree of correlation of a variable with itself between two successive time intervals
Quantitative (numerical) variable
Variable that records amounts and numbers, eg how tall, how old
Ordinal variables
Variables that can be ordered eg finishing place in a race
Alternative hypothesis
The research hypothesis that a relationship exists, eg that one variable influences another
When do you use a one sample t test
When comparing a group against a known standard value
Eg national population average
Eg comparing the acidity of a liquid to the neutral pH of 7
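A sketch of the one sample t statistic for the pH example, with made-up readings. It assumes the usual formula: the difference between the sample mean and the reference value, divided by the standard error of the mean.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, reference):
    """t statistic for the sample mean against a known reference value."""
    standard_error = stdev(sample) / sqrt(len(sample))
    return (mean(sample) - reference) / standard_error

ph_readings = [6.8, 7.1, 6.9, 6.7, 7.0]  # made-up measurements
t = one_sample_t(ph_readings, reference=7.0)
print(round(t, 2))
```

Turning t into a p value needs the t distribution with n - 1 degrees of freedom, which is not in the standard library, so that step is left out here.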
When do you use a two sample or independent t test
When studying groups from two separate samples
Eg from two towns
When do you use a paired t test
If the two sets of measurements come from the same group or population
Eg before and after an experiment takes place
What is a t-test used for
To compare the means of two groups
What is normal distribution
Data that is symmetrically distributed with no skew
Also known as a bell curve
When are ANOVA and MANOVA tests used
ANOVA - when comparing the means of one dependent variable across several groups (usually three or more), eg exam results from multiple schools
MANOVA - when there are two or more dependent variables, eg maths results, science results and English results, each recorded separately
What does a chi squared test show
How well sample data fits what is expected
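A sketch of the chi squared statistic for a goodness of fit check, with made-up counts. Each term compares an observed count with the count expected.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up example: 100 people choose between four options,
# and 25 per option would be expected if there were no preference
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]
print(chi_squared_statistic(observed, expected))
```

The larger the statistic, the worse the sample fits what was expected. It is 0 when the observed and expected counts match exactly.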
Three types of categorical variables
Binary
Nominal
Ordinal
Discrete variable
A quantitative variable that can only take separate, countable values, eg number of children
Null hypothesis
The hypothesis that assumes no relationship exists between two or more variables
What is the p value
The probability of getting a result at least as extreme as the one observed if the null hypothesis were true
A lower p value indicates greater statistical significance
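A sketch of a two-sided p value, assuming the test statistic is a z score that follows the standard normal distribution when the null hypothesis is true. The function name is made up for illustration.

```python
from statistics import NormalDist

def two_sided_p(z):
    """Probability of a z statistic at least this far from 0 under the null."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(two_sided_p(1.96), 3))  # 0.05
print(round(two_sided_p(0.5), 3))
```

The further the statistic is from 0, the smaller the p value.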
What do comparison statistics tests do
Look for differences among group means
When is a Pearson’s r correlation test used
With linear relationships
Two quantitative variables
Normal distribution
What does a standard error show
How different the population mean is likely to be from the sample mean
Shows how much the sample mean would vary if you were to repeat a study using new samples from the same population
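A sketch of the standard error of the mean, assuming the usual formula: the sample standard deviation divided by the square root of the sample size. The data is made up.

```python
from math import sqrt
from statistics import stdev

def standard_error(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))

sample = [4, 8, 6, 5, 7]  # made-up measurements
print(round(standard_error(sample), 3))
```

Larger samples give a smaller standard error, so the sample mean is a more precise estimate of the population mean.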
What variables are used in multiple linear regression
2 or more continuous predictor variables
1 continuous outcome variable
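What does Mahalanobis distance show
The distance between a data point and the centre (mean) of the data, taking the spread of and correlation between the variables into account
Useful for finding multivariate outliers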
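A sketch for the two-variable case, assuming the centre (means) and the covariance matrix of the data are already known. The function name and inputs are made up for illustration.

```python
from math import sqrt

def mahalanobis_2d(point, centre, cov):
    """Distance of point from centre for two variables.

    cov is the covariance matrix ((var_x, cov_xy), (cov_xy, var_y)).
    """
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy ** 2  # determinant of the covariance matrix
    # (dx, dy) times the inverse covariance matrix times (dx, dy)
    d_squared = (syy * dx ** 2 - 2 * sxy * dx * dy + sxx * dy ** 2) / det
    return sqrt(d_squared)

# With unit variances and no correlation it equals the ordinary distance
print(mahalanobis_2d((3, 4), (0, 0), ((1, 0), (0, 1))))  # 5.0
# A variable with more spread counts for less: 2 units is only 1 SD here
print(mahalanobis_2d((2, 0), (0, 0), ((4, 0), (0, 1))))  # 1.0
```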
What are three types of comparison tests
T test
ANOVA
MANOVA
Independent variable
Variable that is changed to see the effect on another variable
What is the Durbin-Watson test
A test statistic to detect autocorrelation in the residuals from a regression analysis.
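A sketch of the statistic itself: the sum of squared differences between successive residuals, divided by the sum of squared residuals. The residuals below are made up.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

print(durbin_watson([1, 1, -1, -1]))  # 1.0
print(durbin_watson([1, -1, 1, -1]))  # 3.0
```

The statistic lies between 0 and 4. Values near 2 suggest no autocorrelation, values towards 0 suggest positive autocorrelation and values towards 4 suggest negative autocorrelation.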
What does winsorizing data mean
Replacing an outlier or outliers with the next highest/lowest value that is not an outlier
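A sketch of winsorizing, assuming outliers are defined by hypothetical cut-offs low and high chosen by the analyst. How the cut-offs are chosen is not covered here.

```python
def winsorize(values, low, high):
    """Replace values outside [low, high] with the nearest value inside it."""
    inside = [v for v in values if low <= v <= high]
    floor, ceiling = min(inside), max(inside)
    # Clamp each value to the lowest and highest non-outlier values
    return [min(max(v, floor), ceiling) for v in values]

data = [3, 5, 6, 7, 8, 9, 50]  # 50 is treated as the outlier
print(winsorize(data, low=0, high=20))  # [3, 5, 6, 7, 8, 9, 9]
```

The outlier is kept in the dataset but pulled in to the closest value that is not an outlier, so the sample size stays the same.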
What is a univariate and multivariate outlier
Univariate - an outlier for just one variable
Multivariate - an unusual combination of values across two or more variables
When is a t test used
To determine if a process or treatment (change in one variable) actually affects the population of interest or if there is no relationship
When can a t test be used
When comparing the means of two groups only
Confounding variable
A variable that masks the true effect of another variable in an experiment
Can occur when another variable is closely related to a variable being studied but is not controlled for