Practical W2: Basics of Statistics Flashcards
One of the first important things to do after collecting your data is to look at it graphically by making a
histogram
There are two main ways in which a distribution can deviate from normal - (2)
- skewness
- kurtosis
Diagram of positive and negative skew
If the skewness value in SPSS is between -1 and 1 then
it’s fine
If the skewness value in SPSS is less than -1 then
it is a negative skew = non-normal distribution
If the skewness value in SPSS is greater than 1 then
positive skew = non-normal distribution
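The cut-offs above can also be checked outside SPSS. This is a minimal sketch, assuming the bias-corrected sample skewness formula (which SPSS is generally understood to report). The function names `sample_skewness` and `classify_skewness` and the example scores are made up for illustration.

```python
import math

def sample_skewness(values):
    """Bias-corrected sample skewness (assumed to match the SPSS statistic)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    # Sample standard deviation (divides by n - 1).
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    cubed = sum(((x - mean) / sd) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

def classify_skewness(skew):
    """Apply the -1 / 1 rule of thumb from the cards above."""
    if skew < -1:
        return "negative skew (non-normal)"
    if skew > 1:
        return "positive skew (non-normal)"
    return "fine"

# Hypothetical scores: one very large value creates a long right tail.
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]
print(classify_skewness(sample_skewness(scores)))
```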
Diagram of skewness value shown in SPSS
Kurtosis is basically looking at how
‘pointy’ your histogram is
Kurtosis tells us how much of our data lies in the
ends/tails of our histogram which helps us to identify when outliers may be present in the data.
A distribution with positive kurtosis, where much of the data is in the tails, will be very
pointy or leptokurtic
A distribution with negative kurtosis, where less of the data is in the tails, will be more
flat or platykurtic
A normal distribution will have a kurtosis value of
0 (mesokurtic)
Characteristic of a negative skew
the tail points towards the lower values and the data is clustered at the higher values
Characteristic of a positive skew
the tail points towards the higher values and the data is clustered at the lower values
Diagram of mesokurtic (normal), leptokurtic and platykurtic distribution curves
If the kurtosis value in SPSS is between -2 and 2 then
all good, normal kurtosis
If the kurtosis value in SPSS is less than -2 then
platykurtic (non-normal, issue with kurtosis)
If the kurtosis value in SPSS is greater than 2 then
leptokurtic (non-normal, issue with kurtosis)
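The kurtosis cut-offs can be sketched the same way. This assumes the bias-corrected sample excess kurtosis (0 for a normal distribution), which is the convention the cards above imply for SPSS. The function names and both example data sets are hypothetical.

```python
import math

def sample_excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (assumed to match the SPSS statistic)."""
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 values")
    mean = sum(values) / n
    # Sample standard deviation (divides by n - 1).
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    fourth = sum(((x - mean) / sd) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction

def classify_kurtosis(kurt):
    """Apply the -2 / 2 rule of thumb from the cards above."""
    if kurt < -2:
        return "platykurtic (non-normal)"
    if kurt > 2:
        return "leptokurtic (non-normal)"
    return "normal kurtosis"

# Hypothetical data: an extreme outlier puts a lot of weight in the tail.
with_outlier = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]
# Hypothetical data: evenly spread values with no heavy tails.
evenly_spread = list(range(1, 11))

print(classify_kurtosis(sample_excess_kurtosis(with_outlier)))
print(classify_kurtosis(sample_excess_kurtosis(evenly_spread)))
```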
Diagram of kurtosis value shown in SPSS
Are the kurtosis and skewness values here fine?
Good, because the skewness is between -1 and 1 and the kurtosis is between -2 and 2.
Are the kurtosis and skewness values fine here?
Bad, because although the skewness is between -1 and 1, the kurtosis value of 2.68 is greater than 2, so it falls outside the -2 to 2 range.
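Both checks can be combined into one yes/no decision. A small sketch: the function name `normality_values_ok` is hypothetical, treating the boundary values as acceptable is an assumption, and the skewness of 0.5 is a made-up stand-in because the card only says it lies between -1 and 1.

```python
def normality_values_ok(skewness, kurtosis):
    """True only if both values fall inside the rule-of-thumb ranges."""
    skew_ok = -1 <= skewness <= 1   # boundaries counted as fine (assumption)
    kurt_ok = -2 <= kurtosis <= 2
    return skew_ok and kurt_ok

# Kurtosis of 2.68 comes from the card above; skewness of 0.5 is hypothetical.
print(normality_values_ok(0.5, 2.68))
```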
3 ways to transform your data to make it closer to a normal distribution - (3)
- exponential
- power
- log
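As an illustration of one of these, here is a minimal sketch of a log transformation pulling in a long right tail. Only the log option is shown, the data are invented, and the skewness helper assumes the bias-corrected sample formula. Note that a plain log only works when every value is greater than 0.

```python
import math

def sample_skewness(values):
    """Bias-corrected sample skewness (assumed formula, see lead-in)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    cubed = sum(((x - mean) / sd) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

# Hypothetical positively skewed data (all values must be > 0 for a log).
raw = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]
logged = [math.log(x) for x in raw]

raw_skew = sample_skewness(raw)
log_skew = sample_skewness(logged)
print(f"skewness before: {raw_skew:.2f}, after log: {log_skew:.2f}")
```

If the transformed skewness lands between -1 and 1, the transformed variable passes the rule of thumb even though the raw variable did not.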
There is a tertium quid, which prompts the saying that
correlation is not causation
What does tertium quid mean?
a third factor
The tertium quid is a variable that you may not have considered that
could be influencing your result
The tertium quid (third factor) is known as a
confounding variable
Example of a tertium quid you may not have considered that could be influencing your results - (2)
We find that drownings and ice cream sales are correlated, so we conclude that ice cream sales cause drowning. Are we correct?
No, since it is most likely that both are actually due to the weather: when it's hotter outside, people eat more ice cream and go more frequently to the pool or the beach to swim. The fact that more people go swimming is the reason why there are more drownings.
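The ice cream example can be simulated. In this sketch every number is invented: temperature drives both ice cream sales and drownings, and neither affects the other. The helpers `pearson_r` and `residuals` are hypothetical names written out with the standard library only.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def residuals(ys, xs):
    """What is left of ys after removing a least-squares straight line on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

random.seed(1)  # fixed seed so the sketch is repeatable
temperature = [random.uniform(10, 35) for _ in range(2000)]
# Both outcomes depend on temperature only (made-up coefficients and noise).
ice_cream = [2.0 * t + random.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + random.gauss(0, 2) for t in temperature]

raw_r = pearson_r(ice_cream, drownings)
# Correlation once the effect of temperature is removed from both variables.
partial_r = pearson_r(residuals(ice_cream, temperature),
                      residuals(drownings, temperature))
print(f"raw r = {raw_r:.2f}, r controlling for temperature = {partial_r:.2f}")
```

The raw correlation is strong, but it largely disappears once temperature is accounted for, which is what a confounding variable looks like in data.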
If one or both of the skewness and kurtosis values are out of range, then the assumptions for
parametric tests are not satisfied
Rule out the tertium quid (third factor) through
RCTs (randomised controlled trials), which even out confounding variables between groups
In an RCT, you randomly assign your participants to two or more groups - (2)
one group receives no intervention or experimental manipulation (your control)
the other group receives the intervention or treatment, and then you can directly compare the dependent variable between the groups
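Random assignment itself is simple to sketch. The function name `randomly_assign`, the group labels and the participant IDs below are all hypothetical. The idea is just to shuffle the participants and deal them out in turn so the groups end up the same size.

```python
import random

def randomly_assign(participants, groups=("control", "treatment"), seed=None):
    """Shuffle participants, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(person)
    return assignment

# Hypothetical participant IDs P01 to P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
assignment = randomly_assign(participants, seed=42)
for group, members in assignment.items():
    print(group, len(members), members)
```

Because chance alone decides who goes where, any third factor should be spread roughly evenly across the groups on average.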
To infer causation we need to
actively manipulate the variable we are interested in, and control against a group (condition) where this variable was not manipulated.
Example of a control condition in lesion studies - (2)
a double dissociation experiment, where one test is affected by a lesion in one area but not in a second area, and a different test is affected by a lesion in the second area but not in the first.
The only way we can actually infer causation is by comparing the two controlled situations: one where the cause (the lesion) is present and one where it is absent.
Another assumption for parametric tests is having
linearity/additivity
Linearity means that - (2)
the combined effect of several predictors should form a straight line, i.e. show a linear relationship
the data increases at a steady rate, like the graph
What does this graph show?
Your cost increases steadily as the number of chocolate bars increases
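A tiny sketch of what "increases steadily" means: each extra bar adds the same amount to the cost. The price per bar is a hypothetical value, since the graph itself is not reproduced here.

```python
PRICE_PER_BAR = 1.50  # hypothetical price, not taken from the graph

def total_cost(number_of_bars):
    """Linear relationship: cost grows by the same amount for every extra bar."""
    return number_of_bars * PRICE_PER_BAR

bars = list(range(0, 6))
costs = [total_cost(n) for n in bars]
# Difference in cost between each consecutive number of bars.
steps = [later - earlier for earlier, later in zip(costs, costs[1:])]
print(costs)
print(steps)
```

A constant step size is the signature of a straight-line relationship. If the steps grew or shrank as the number of bars went up, the relationship would not be linear.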