Lecture Week 3 Flashcards
6 reasons for conducting exploratory data analysis
- Check for data entry errors
- Obtain a descriptive analysis of the data
- Find patterns that are not obvious
- Analyse and deal with missing data
- Checking for outliers
- Checking assumptions
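The checks above can be sketched in a few lines. This is a minimal illustration in Python with pandas (not the course’s SPSS workflow), on a made-up score column containing a deliberate entry error and a missing value:

```python
import pandas as pd
import numpy as np

# Hypothetical dataset: exam scores with one entry error (590) and one missing value
df = pd.DataFrame({"score": [55, 61, 72, 68, 590, np.nan, 64, 70]})

summary = df["score"].describe()       # descriptive analysis of the data
n_missing = df["score"].isna().sum()   # how much missing data is there?

# Simple screen for outliers / entry errors: flag values beyond 1.5 * IQR
q1, q3 = df["score"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df.loc[
    (df["score"] < q1 - 1.5 * iqr) | (df["score"] > q3 + 1.5 * iqr), "score"
]
```

Here the screen flags the impossible score of 590 and counts one missing value, covering several of the reasons listed above in one pass.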
What is Central Tendency
Tendency for the values of a random variable to cluster round its mean, mode, or median.
Multiple Measures of Central Tendency
- Summary statistic that represents the center point value of a dataset
- Three most common measures of central tendency are the mean, median, and mode.
Multiple Measures of Variability
- Define how far away the data points tend to fall from the center.
- Range, Interquartile Range, Variance and Standard Deviation
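As a sketch in Python with NumPy/SciPy (rather than SPSS), the measures of central tendency and variability above can all be computed on a small made-up dataset:

```python
import numpy as np
from scipy import stats

data = np.array([2, 4, 4, 4, 5, 5, 7, 9])   # made-up scores

# Central tendency
mean = data.mean()                            # 5.0
median = np.median(data)                      # 4.5
mode = stats.mode(data, keepdims=False).mode  # 4 (most frequent value)

# Variability
data_range = data.max() - data.min()          # 7
iqr = stats.iqr(data)                         # interquartile range
variance = data.var(ddof=1)                   # sample variance (n - 1 denominator)
sd = data.std(ddof=1)                         # sample standard deviation
```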
Quantitative measures of Shape
- The distribution shape of quantitative data can be described because the values have a logical order, so the ‘low’ and ‘high’ end values on the x-axis of the histogram can be identified.
- Histograms
- Kurtosis
- Skewness
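A brief illustration (Python/SciPy, simulated data) of how skewness and kurtosis quantify shape:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
symmetric = rng.normal(size=5000)          # bell-shaped, symmetric
right_skewed = rng.exponential(size=5000)  # long right tail

skew_sym = stats.skew(symmetric)       # near 0 for symmetric data
skew_exp = stats.skew(right_skewed)    # clearly positive for a right tail
kurt_sym = stats.kurtosis(symmetric)   # excess kurtosis, near 0 for normal data
```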
Confidence Intervals
- For 95% confidence intervals, an average of 19 out of 20 contain the population parameter.
- Suppose you have a 95% confidence interval of [5, 10] for the mean.
- You can be 95% confident that the population mean falls between 5 and 10
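The interval is computed from the sample mean and its standard error. A minimal sketch in Python with SciPy’s t interval, on a simulated sample:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=7.5, scale=4.0, size=30)  # made-up sample of 30 scores

mean = sample.mean()
se = stats.sem(sample)  # standard error of the mean
# 95% confidence interval for the population mean, using the t distribution
lo, hi = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=se)
```

On average, 19 out of 20 intervals built this way will contain the true population mean (7.5 here).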
Normality & Sample Size
- Tests of Normality, and the ratio of Skewness divided by SE Skewness, are impacted by sample size
- If there is a really large sample, tests become hypersensitive
- Even trivial deviations from normality will violate the assumption of normality
- Can create a false positive
- The skewness statistic itself doesn’t really change though - it is its standard error that shrinks as the sample grows
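A small demonstration of the hypersensitivity point, assuming SciPy’s D’Agostino normality test as a stand-in for the tests discussed:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
base = rng.normal(size=100_000)
# A trivial deviation from Normality (skewness of only about 0.1)
mild = base + 0.02 * base ** 2

# With a huge sample the test flags even this tiny deviation (a false positive)...
stat_large, p_large = stats.normaltest(mild)
# ...while a modest sample of the very same data usually passes
stat_small, p_small = stats.normaltest(mild[:40])
```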
Monte Carlo Tests
- Simulated studies that change the characteristics of the data show that even gross deviations from Normality don’t impact the statistical significance of the tests.
- Most tests can withstand even gross deviations from Normality - they are called robust
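A miniature Monte Carlo study along these lines - a Python sketch with made-up settings, running a one-sample t-test on grossly non-normal exponential data where the null hypothesis is actually true:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_sims, n, alpha = 2000, 50, 0.05

# Population: exponential (grossly non-normal) with true mean 1,
# so H0: mean == 1 is true in every simulated study
false_positives = 0
for _ in range(n_sims):
    sample = rng.exponential(scale=1.0, size=n)
    _, p = stats.ttest_1samp(sample, popmean=1.0)
    false_positives += p < alpha

type1_rate = false_positives / n_sims  # stays close to the nominal 5%
```

The false-positive rate staying near 5% despite the skewed population is what “robust” means here.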
Central Limit Theorem
- If you have a population with mean μ and standard deviation σ and large random samples
- The distribution of the sample means will be approximately normally distributed.
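A quick simulation of the theorem (Python, hypothetical exponential population with mean 2):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Heavily skewed population: exponential with mu = 2 (its sigma is also 2)
mu, n, n_samples = 2.0, 100, 5000

# Draw 5000 independent samples of size 100 and take each sample's mean
sample_means = rng.exponential(scale=mu, size=(n_samples, n)).mean(axis=1)

center = sample_means.mean()              # close to mu
spread = sample_means.std(ddof=1)         # close to sigma / sqrt(n) = 0.2
skew_of_means = stats.skew(sample_means)  # far below the population's skew of 2
```

Even though the population is strongly right-skewed, the sample means pile up symmetrically around μ.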
What to do if you decide that data is not Normal?
- No simple answer
- You could do a transformation
- Sometimes transforming the data can fix a problem of Normality
- If transforming the data successfully produces normality, you can be confident in treating the transformed data as Normally Distributed
Transformation
Applying a simple mathematical operation to data to deal with violations of assumptions
Common Transformations
- Log 10
- Square Root
- Reciprocal
Log10 Transformation
- Base 10 Logarithm
- Formula in SPSS: Transform/Compute Variable/rename Target Variable/Numeric Expression: lg10(Variable)
Square Root Transformation
Formula in SPSS: Transform/Compute Variable/rename Target Variable/Numeric Expression: sqrt(variable)
Reciprocal Transformation
Formula in SPSS: Transform/Compute Variable/rename Target Variable/Numeric Expression: 1/Variable
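The three SPSS formulas have direct NumPy equivalents. A small sketch on made-up positive scores:

```python
import numpy as np
from scipy import stats

# Made-up positively skewed scores (all > 0, as these transformations require)
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 64.0])

log10_x = np.log10(x)  # SPSS: LG10(variable)
sqrt_x = np.sqrt(x)    # SPSS: SQRT(variable)
recip_x = 1.0 / x      # SPSS: 1/variable

# The log transformation pulls in the long right tail, reducing skewness
```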
Homogeneity of Variance
- This is problematic when we have an unbalanced sample
- Tested Using Levene’s Test
- Assumption underlying both t tests and F tests
- Population variances of two or more samples are considered equal.
- Corrected using a transformation called a Power Transformation
Levene’s Test
- Tests Homogeneity of Variance
- Must include a grouping variable - that is a variable which can be placed into groups (such as gender)
- Formula in SPSS: Analyse/Descriptive Statistics/Explore/Dependent List: variable/Factor List: grouping variable/Plots/Spread vs Level with Levene Test/Power Estimation
- In the output look at Test of Homogeneity of Variance and then at the Based on Mean row, alongside the group standard deviations
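SciPy offers the same test. A hedged sketch with two simulated groups of deliberately unequal variance (the `center="mean"` option mirrors the “Based on Mean” row of the SPSS output):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Two simulated groups (e.g. levels of a grouping variable such as gender)
# with deliberately unequal population variances
group_a = rng.normal(loc=10, scale=1, size=40)
group_b = rng.normal(loc=10, scale=5, size=40)

# center="mean" corresponds to the "Based on Mean" version of Levene's Test
stat, p = stats.levene(group_a, group_b, center="mean")
# A small p rejects equal variances: Homogeneity of Variance is violated
```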
Power Transformation
- Used to transform data when the assumption of Homogeneity of Variance has been violated
- This is raising figures to a Power Value such as squared or cubed
- In SPSS: Transform/Compute Variable/Rename: Target Variable/Numeric Expression: variable ** power value (e.g. variable ** 2)
Spread vs Level Plot
- The plot itself is of little direct use
- In the fine print it says: Power for Transformation
- This is the number to use when doing a Power Transformation to fix a problem with Levene’s Test
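A sketch of the idea in Python, using made-up groups whose variance grows with their mean; the power value of .5 is an assumption here, as if read from the plot’s fine print:

```python
import numpy as np

rng = np.random.default_rng(9)
# Made-up groups whose variance grows in proportion to their mean -
# a classic violation of Homogeneity of Variance
group_a = rng.normal(loc=20, scale=20 ** 0.5, size=50)
group_b = rng.normal(loc=200, scale=200 ** 0.5, size=50)

ratio_before = group_b.var(ddof=1) / group_a.var(ddof=1)  # far from 1

# Suppose the Spread vs Level fine print read "Power for Transformation: .500";
# apply it exactly as SPSS's  variable ** power
power = 0.5
ratio_after = (group_b ** power).var(ddof=1) / (group_a ** power).var(ddof=1)
```

After the power transformation the variance ratio sits near 1, so the groups now satisfy the assumption.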
General comments about transformations
- They are not a magic bullet
- They don’t cope with zeros (add a constant, such as 10, to every score first)
- They are unpredictable and can affect Normality and Homogeneity of Variance even if you weren’t planning to.
- Some data is “untransformable”
- Only use transformed figures to fix statistical tests
- Only report from the true data
Acceptable Skewness level to achieve Normality Assumption
Skewness statistic divided by its standard error should fall between -2 and +2; a ratio > +2 or < -2 indicates a violation of Normality
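This ratio can be computed directly. A sketch assuming the SPSS-style adjusted skewness statistic and its standard-error formula:

```python
import numpy as np
from scipy import stats

def skew_z(data):
    """Skewness statistic divided by its standard error."""
    n = len(data)
    skew = stats.skew(data, bias=False)  # adjusted skewness, as SPSS reports it
    # Standard error of skewness (the formula SPSS uses)
    se = np.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew / se

rng = np.random.default_rng(2)
z_normal = skew_z(rng.normal(size=100))       # typically within +/-2
z_skewed = skew_z(rng.exponential(size=100))  # well above +2
```

Ratios beyond ±2 (as for the exponential sample) indicate that the Normality assumption is in trouble.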