Research methods Flashcards

Question

What is the Shapiro-Wilk test?

Answer 1

A normality test. The null hypothesis is that the data are normally distributed. If p > 0.05 we fail to reject the null hypothesis, so the data are consistent with a normal distribution.
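
A minimal sketch of running this test in R with the base function shapiro.test(). The sample is simulated purely for illustration and is not part of the flashcards.

```r
# Simulated sample (illustrative values only, not from the flashcards)
set.seed(1)
x <- rnorm(30, mean = 10, sd = 2)

# Shapiro-Wilk test: the null hypothesis is that x is normally distributed
result <- shapiro.test(x)

# If this p-value is above 0.05, we fail to reject the null hypothesis
result$p.value
```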

Answer 2

Tests whether the variance differs between multiple sets of data. If p > 0.05, there is no evidence of a difference in variance.
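
A hedged sketch of a variance test in R. It assumes the test meant here is Bartlett's test, which a later card names for checking homogeneity, and it uses R's built-in PlantGrowth dataset as stand-in data.

```r
# PlantGrowth is a built-in dataset: plant weight for three groups
# (assumption: Bartlett's test is the variance test this card refers to)
bt <- bartlett.test(weight ~ group, data = PlantGrowth)

# p > 0.05 means no evidence that the group variances differ
bt$p.value
```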

Answer 3

The data are normal with homogeneous variance, and therefore you can proceed with a t-test to analyse the data.

Answer 4

Tests whether the mean of a single sample is significantly different from a known or hypothesized mean.

Answer 5

used when comparing the means of two independent groups or populations and assesses whether the difference in means between the two groups is statistically significant

Answer 6

Used when comparing the means of two related groups or when each data point is paired. Determines whether there is a significant difference in the means.
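
The three t-tests described on the cards above (one-sample, independent two-sample and paired) can all be run with R's t.test(). This is a sketch using the built-in sleep dataset. Those data are really paired, so the independent-samples call is shown only to illustrate the syntax, and the hypothesized mean of 0 is an arbitrary choice.

```r
# sleep: built-in data, extra hours of sleep for 10 patients under two drugs
g1 <- sleep$extra[sleep$group == "1"]
g2 <- sleep$extra[sleep$group == "2"]

# One-sample t-test: is the mean different from a hypothesized value (here 0)?
one_sample <- t.test(sleep$extra, mu = 0)

# Independent two-sample t-test (R uses Welch's version by default;
# var.equal = TRUE gives the classic test that assumes equal variances)
two_sample <- t.test(g1, g2, var.equal = TRUE)

# Paired t-test: each patient was measured under both drugs
paired <- t.test(g1, g2, paired = TRUE)

c(one_sample$p.value, two_sample$p.value, paired$p.value)
```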

Answer 7

used when ANOVA indicates significant differences and can identify which group differs from others

Answer 8

Must make sure the data are normally distributed using the Shapiro-Wilk test.

Answer 9

Tukey's HSD (Honestly Significant Difference test)

Answer 10

If the p-value (Pr(>F)) is less than 0.05 you can reject the null hypothesis. This indicates that at least one group is significantly different from the others. You can now run a post-hoc test.

Answer 11

ExampleA <- aov(dependent_variable ~ independent_variable, data = ANOVA1)
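
A runnable sketch of the one-way ANOVA workflow from the cards above: fit with aov(), read Pr(>F) from summary(), then follow up with Tukey's HSD. The built-in PlantGrowth dataset stands in for the ANOVA1 data frame, which is not available here.

```r
# PlantGrowth (built-in) replaces the ANOVA1 data frame for illustration
model <- aov(weight ~ group, data = PlantGrowth)

# The ANOVA table; the Pr(>F) column holds the p-value
anova_table <- summary(model)[[1]]
p_value <- anova_table[["Pr(>F)"]][1]

# If p_value < 0.05, a post-hoc test shows which groups differ
posthoc <- TukeyHSD(model)
posthoc$group  # one row per pair of groups, with adjusted p-values
```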

Answer 12

Used to investigate the effects of two categorical independent variables on a continuous dependent variable.
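
A short sketch of a two-way ANOVA in R, assuming the built-in ToothGrowth dataset as example data: tooth length is the continuous dependent variable, and supplement type and dose are the two categorical independent variables.

```r
# ToothGrowth (built-in): len = tooth length, supp = supplement, dose = dose level
tg <- ToothGrowth
tg$dose <- factor(tg$dose)  # treat dose as categorical, not numeric

# supp * dose fits both main effects and their interaction
two_way <- aov(len ~ supp * dose, data = tg)
two_way_table <- summary(two_way)[[1]]
two_way_table
```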

Answer 13

non-parametric rank test for statistical hypothesis testing used either to test the location of a population based on a sample of data, or to compare the locations of two populations using two matched samples
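
A sketch of the matched-samples case using R's wilcox.test() with paired = TRUE. It uses the built-in sleep dataset as example data. Setting exact = FALSE is a choice made here so that R uses the normal approximation, because these data contain tied and zero differences.

```r
# sleep (built-in): the same 10 patients measured under two drugs
g1 <- sleep$extra[sleep$group == "1"]
g2 <- sleep$extra[sleep$group == "2"]

# Wilcoxon signed-rank test on the paired differences
signed_rank <- wilcox.test(g1, g2, paired = TRUE, exact = FALSE)
signed_rank$p.value
```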

Answer 14

Check the groups are independent. Check normality (Shapiro-Wilk test). Check homogeneity of variance (Bartlett test).

Answer 15

a relationship between one dependent variable and explanatory variables. They are used mainly for prediction and estimation

Answer 16

They use equations with a numerical dependent variable and one or more numerical or categorical independent (explanatory) variables.

Answer 17

1. Hypothesize the relationship between the variables. 2. Specify the probability distribution of the random error term. 3. Evaluate the fitted model. 4. Use the model for prediction and estimation.

Answer 18

Theory - theory of field, mathematical theory, previous research and common sense

Answer 19

Simple linear regression, when you have only one independent variable, and multiple linear regression, which uses two or more independent variables.

Answer 20

By using the summary function, summary(model), which provides detailed output including coefficients, R-squared, p-values and more.
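
A minimal sketch of fitting and summarising a regression model in R. The built-in mtcars dataset and the chosen variables (mpg, wt, hp) are assumptions for illustration only.

```r
# Simple linear regression: one independent variable
simple <- lm(mpg ~ wt, data = mtcars)

# Multiple linear regression: two or more independent variables
multiple <- lm(mpg ~ wt + hp, data = mtcars)

# summary() reports coefficients, R-squared, p-values and more
fit_summary <- summary(multiple)

coef(multiple)                           # intercept and slopes
fit_summary$r.squared                    # proportion of variance explained
fit_summary$coefficients[, "Pr(>|t|)"]   # p-value for each coefficient
```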

Answer 21

The intercept and coefficients of independent variables

Answer 22

This value measures the model's goodness of fit and represents the proportion of the variance in the dependent variable that is explained by the independent variable(s).

Answer 23

a low p value (<0.05) for the independent variable(s) suggests a significant relationship between variables

Answer 24

they are essential for visualising and summarising data. They provide a quick way to assess distribution of data and identify outliers

Answer 25

a graphical representation of the distribution of a dataset. It displays a 5 number summary of a set of data

Answer 26

the minimum, first quartile, median, third quartile and maximum

Answer 27

The box itself is the interquartile range and spans from Q1 to Q3, with the median (Q2) marked inside. The lines (whiskers) on either side extend to the most extreme values that lie within 1.5 x IQR of the box. Anything outside this range is an outlier.

Answer 28

Makes it easy to identify skewness, central tendency and spread in data. Good for visualizing non-normal distributions. Helps spot outliers. Easy to compare groups.

Answer 29

1. Order the data. 2. Calculate the quartiles. 3. Determine the IQR (Q3 - Q1). 4. Find the upper and lower limits. 5. Identify outliers. 6. Plot the boxplot.
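
The steps above can be sketched in R as follows. The data values are made up for illustration, and the limits use the usual 1.5 x IQR rule. Note that boxplot() computes its hinges with a slightly different quartile method than quantile(), so the two can differ a little on some datasets.

```r
# Illustrative values only (not from the flashcards)
x <- sort(c(2, 4, 5, 6, 7, 8, 9, 10, 30))   # order the data

# Quartiles: Q1, median (Q2) and Q3
q <- unname(quantile(x, c(0.25, 0.5, 0.75)))

# Interquartile range
iqr <- q[3] - q[1]

# Upper and lower limits
lower <- q[1] - 1.5 * iqr
upper <- q[3] + 1.5 * iqr

# Outliers are the points beyond the limits
outliers <- x[x < lower | x > upper]

# Draw the boxplot
boxplot(x, horizontal = TRUE)
```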

Answer 30

The Wilcoxon test and the Mann-Whitney U test

Answer 31

The median represents the centre of the data distribution whilst the box length represents the spread of the middle 50% of the data. The whiskers show the range of most of the data and outliers can be identified as individual points outside the whiskers

Answer 32

they operate under the assumption that the data is normally distributed

Answer 33

t-tests, ANOVA

Answer 34

A class of statistical procedures that do not rely on assumptions about the shape or form of the probability distribution from which the data were drawn.

Answer 35

You can use these tests with any numeric variables with any distribution

Answer 36

they use more information from available data which allows for more confidence of ruling out chance and finding real differences

Answer 37

data has to be normally distributed, interval or ratio level, and variance must be similar

Answer 38

A method suited for situations involving larger sample sizes where it provides reliable insights into the independence or association between categorical variables
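
A hedged sketch, assuming the method described here is the chi-squared test of independence, run with R's chisq.test(). The built-in HairEyeColor table is used as example data because its sample is large enough for all expected cell counts to be at least 5.

```r
# HairEyeColor (built-in) is a Hair x Eye x Sex table; collapse over Sex
hair_eye <- margin.table(HairEyeColor, c(1, 2))

# Chi-squared test of independence between hair colour and eye colour
chi <- chisq.test(hair_eye)

chi$expected   # expected counts under independence
chi$p.value    # a small p-value suggests the variables are associated
```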

Answer 39

non-parametric rank test for statistical hypothesis testing used either to test the location of a population based on a sample of data, or to compare the locations of two populations using two matched samples

Answer 40

Useful in smaller sample sizes or when dealing with 2x2 contingency tables where expected cell counts are low. It computes the exact probability of obtaining the observed distribution.
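
A sketch assuming the card refers to Fisher's exact test, which R provides as fisher.test(). The 2x2 counts and the group and outcome labels below are invented for illustration.

```r
# Hypothetical 2x2 table with small counts (illustrative values only)
tab <- matrix(c(3, 1, 1, 3), nrow = 2,
              dimnames = list(Group = c("A", "B"),
                              Outcome = c("Yes", "No")))

# Fisher's exact test computes the exact probability for the table
fisher <- fisher.test(tab)
fisher$p.value
```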

Answer 41

Similar to Wilcoxon test but for independent samples
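
A sketch assuming this card describes the Mann-Whitney U test. In R it is run with wilcox.test() on two independent samples (without paired = TRUE). Two groups from the built-in PlantGrowth dataset are used as example data, and exact = FALSE is a choice made here to use the normal approximation.

```r
# Two independent groups from the built-in PlantGrowth dataset
a <- PlantGrowth$weight[PlantGrowth$group == "trt1"]
b <- PlantGrowth$weight[PlantGrowth$group == "trt2"]

# Mann-Whitney U test (R calls it the Wilcoxon rank-sum test)
mw <- wilcox.test(a, b, exact = FALSE)
mw$p.value
```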