Practical W2: Basics of Statistics Flashcards

1
Q

One of the first things to do after collecting your data is to look at it graphically by making a

A

histogram

2
Q
A
3
Q

There are two main ways in which a distribution can deviate from normal - (2)

A
  • skewness
  • kurtosis
4
Q

Diagram of positive and negative skew

A
5
Q

If the skewness value in SPSS is between -1 and 1 then

A

it’s fine

6
Q

If the skewness value in SPSS is less than -1 then

A

it is a negative skew = non-normal distribution

7
Q

If the skewness value in SPSS is greater than 1 then

A

positive skew = non-normal distribution

8
Q

Diagram of skewness value shown in SPSS

A
9
Q

Kurtosis is basically looking at how

A

‘pointy’ your histogram is

10
Q

Kurtosis tells us how much our data lies around the

A

ends/tails of our histogram which helps us to identify when outliers may be present in the data.

11
Q

A distribution with positive kurtosis, so much of the data is in the tails, will be very

A

pointy or leptokurtic

12
Q

A distribution with negative kurtosis, so the data lies more in the middle, will be more

A

flat or platykurtic

13
Q

A normal distribution will have a kurtosis value of

A

0 (mesokurtic)

14
Q

Characteristic of a negative skew

A

the tail points towards the lower values and the data is clustered at the higher values

15
Q

Characteristic of a positive skew

A

the tail points towards the higher values and the data is clustered at the lower values

16
Q

Diagram of mesokurtic (normal) , leptokurtic and platykurtic distribution curve

A
17
Q

Kurtosis value in SPSS between -2 and 2 is

A

all good, normal kurtosis

18
Q

If the kurtosis value in SPSS is less than -2 then it shows

A

platykurtic (non-normal, issue with kurtosis)

19
Q

If the kurtosis value in SPSS is greater than 2 then it shows

A

leptokurtic (non-normal, issue with kurtosis)

20
Q

Diagram of kurtosis value in SPSS

A
21
Q

Are the kurtosis and skewness values here fine?

A

Good, because the skewness is between -1 and 1 and the kurtosis is between -2 and 2.

22
Q

Are the kurtosis and skewness values here fine?

A

Bad, because although the skewness is between -1 and 1, the kurtosis value of 2.68 is greater than 2, outside the acceptable range of -2 to 2.
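
The rule of thumb in the last few cards can be sketched in Python. This is only an illustration: the function names and the two small data sets are made up, and the formulas are the simple moment-based versions, so values from SPSS can differ slightly because it applies small-sample corrections.

```python
import math

def skewness(data):
    """Moment-based skewness: the mean cubed z-score (0 for a symmetric sample)."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sum(((x - mean) / sd) ** 3 for x in data) / n

def excess_kurtosis(data):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sum(((x - mean) / sd) ** 4 for x in data) / n - 3

def within_rule_of_thumb(data):
    """Cut-offs from these cards: skewness in [-1, 1] and kurtosis in [-2, 2]."""
    return abs(skewness(data)) <= 1 and abs(excess_kurtosis(data)) <= 2

# Hypothetical data: one symmetric sample, one with a long right tail.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
long_right_tail = [1, 1, 1, 1, 2, 2, 2, 3, 3, 40]

print(within_rule_of_thumb(symmetric))        # inside both ranges
print(within_rule_of_thumb(long_right_tail))  # positive skew well above 1
```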

23
Q

3 ways to transform your data to make it closer to a normal distribution - (3)

A
  1. exponential
  2. power
  3. log
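
As a rough sketch of why a log transformation can help, the Python below compares the skewness of some made-up, positively skewed reaction times before and after taking logs. The data and the helper function are assumptions for illustration only.

```python
import math

def skewness(data):
    """Moment-based skewness: the mean cubed z-score."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sum(((x - mean) / sd) ** 3 for x in data) / n

# Hypothetical reaction times in ms, with a long tail of slow responses.
reaction_times = [210, 230, 250, 260, 280, 300, 340, 420, 650, 1500]

# The log squeezes large values more than small ones, pulling the tail in.
logged = [math.log(x) for x in reaction_times]

print(skewness(reaction_times))
print(skewness(logged))  # smaller than the raw skewness for these data
```

A transformation reduces skew here but does not guarantee normality, so the transformed data should still be checked.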
24
Q

There is a tertium quid which prompts the saying that

A

correlation does not imply causation

25
Q

What is tertium quid a word for?

A

third factor

26
Q

The tertium quid is a variable that you may not have considered that

A

could be influencing your result

27
Q

The tertium quid (third factor) is known as a

A

confounding variable

28
Q

Example of a tertium quid you may not have considered that could be influencing your results - (2)

A

We find that drownings and ice cream sales are correlated, and we conclude that ice cream sales cause drowning. Are we correct?

No. Most likely both are due to the weather: when it’s hotter outside, people eat more ice cream and go more often to the pool or the beach to swim. The fact that more people go swimming is the reason why there are more drownings.
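
The third-variable problem can be illustrated with a small simulation. In this sketch, which uses entirely made-up numbers, temperature drives both ice cream sales and drownings, and neither variable influences the other, yet the two still come out correlated.

```python
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures: the third factor.
temperature = [random.uniform(10, 35) for _ in range(365)]

# Both outcomes depend on temperature plus independent noise.
ice_cream_sales = [20 * t + random.gauss(0, 50) for t in temperature]
drownings = [0.3 * t + random.gauss(0, 2) for t in temperature]

r = pearson_r(ice_cream_sales, drownings)
print(r)  # clearly positive, despite no causal link between the two
```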

29
Q

If one or both of the skewness and kurtosis values are out of range, then the assumptions for

A

parametric tests are not satisfied

30
Q

Rule out tertium quid (third factor) through

A

RCTs (randomised controlled trials), which even out confounding variables between groups

31
Q

In an RCT, you randomly assign your participants to two or more groups, where - (2)

A

one group receives no intervention or experimental manipulation (so your control),

other group will receive the intervention or treatment and then you can directly compare the dependent variables.
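
Random assignment itself is simple to sketch. The helper below is a hypothetical illustration, assuming two equal-sized groups and participants identified by number; real trials often use more elaborate schemes.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle the participants and split them into control and treatment."""
    rng = random.Random(seed)      # seed only so the example is repeatable
    shuffled = list(participants)  # copy, so the original order is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "treatment": shuffled[half:]}

# Twenty hypothetical participant IDs.
groups = randomly_assign(range(1, 21), seed=42)
print(groups["control"])
print(groups["treatment"])
```

Because chance decides who lands in each group, confounding variables should be spread roughly evenly between the groups on average.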

32
Q

To infer causation we need to

A

actively manipulate the variable we are interested in, and control against a group (condition) where this variable was not manipulated.

33
Q

Example of a control condition in lesion studies - (2)

A

A double dissociation experiment, where one test is affected by a lesion in one area but not by a lesion in a second area, and a different test is affected by a lesion in the second area but not the first.

The only way we can actually infer causation is by comparing two controlled situations: one where the cause (the lesion) is present and one where it is absent.

34
Q

Another assumption for parametric tests is having

A

linearity/additivity

35
Q

Linearity refers to the - (2)

A

combined effect of several predictors should form a straight line or show a linear relationship

the data increases at a steady rate like the graph

36
Q

What does this graph show?

A

Your cost increases steadily as the number of chocolate bars increases

37
Q

This graph shows a multiplicative/non-linear relationship (a sharp rather than steady change in the data), which violates the linearity assumption of

A

parametric tests

38
Q

What does this graph show?

A

might feel ok if you eat a few chocolate bars but after that the risk of you having a stomach-ache increases quite rapidly the more chocolates you eat.

39
Q

Why is it important to check for linearity in your data?

A

your statistical analysis will be wrong even if your other assumptions are correct because a lot of statistical tests are based on linear models.

40
Q

When we talk about additivity/linearity we are referring to the combined effect of

A

several predictors

41
Q

What is measurement error?

A

The discrepancy between the actual value we’re trying to measure and the number we use to represent that value.

42
Q

Example of measurement error - (2)

A

If I measure the length of a tree in cm and someone else in my research group measures the same tree using a different unit and gets a different value from me, that’s a measurement error.

This is an example of human error, but recording instrument failure is another possibility.

43
Q

What are the 2 types of measurement error? - (2)

A
  • Systematic measurement error
  • Random measurement error
44
Q

Measurement error can happen across all psychological experiments from…

A

recording instrument failure to human error.

45
Q

What is systematic measurement error?

A

when the error is proportional to the true value and affects the results of the experiment in a predictable direction

46
Q

What is example of systematic measurement error?

A

if I know I am 5ft2 but I’m told I’m 6ft when I get measured, this is a systematic error and fairly easy to identify. These errors usually happen when there is a problem with your experiment

47
Q

What is random measurement error and when does it usually occur? - (2)

A

when the measured values are inconsistent across repeated measures of a constant attribute or quantity,

so this error happens by chance and is more related to natural variability

48
Q

Example of random measurement error - (2)

A

my height is 5ft2 when I measure it in the morning but it’s 5ft when I measure myself in the evening.

This is because my measurements were taken at different times, so there would be some variability (for those of you who believe you shrink throughout the day).
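
The difference between the two kinds of error can be sketched with a simulation. Everything here is an assumption for illustration: the true height, the size of the errors, and the choice to model the systematic error as a constant offset (a proportional error would behave similarly).

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

TRUE_HEIGHT_CM = 157.5  # roughly 5 ft 2 in

def mean(values):
    return sum(values) / len(values)

# Systematic error: a miscalibrated ruler adds about 5 cm to every reading.
systematic = [TRUE_HEIGHT_CM + 5.0 + random.gauss(0, 0.2) for _ in range(1000)]

# Random error: readings scatter around the true value with no fixed direction.
random_only = [TRUE_HEIGHT_CM + random.gauss(0, 1.0) for _ in range(1000)]

bias_systematic = mean(systematic) - TRUE_HEIGHT_CM
bias_random = mean(random_only) - TRUE_HEIGHT_CM

print(bias_systematic)  # stays near +5 however many readings are averaged
print(bias_random)      # shrinks towards 0 as readings are averaged
```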

49
Q

Measurement error is completely different from variance, in the sense that variance is the

A

average spread of your data

50
Q

Variance is specifically the average squared deviation of

A

each number from the mean

51
Q

Variance helps us assess group differences to determine whether the populations that our samples come from

A

differ from each other

52
Q

How to calculate variance?

A
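
A minimal Python sketch of the usual calculation, assuming the sample version of the formula (dividing by N - 1 rather than N). The scores are made up for illustration.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data

# Step 1: the mean.
mean = sum(scores) / len(scores)

# Step 2: each score's deviation from the mean.
deviations = [x - mean for x in scores]

# Step 3: square the deviations and add them up (the sum of squares).
sum_of_squares = sum(d ** 2 for d in deviations)

# Step 4: divide by N - 1 to get the sample variance.
sample_variance = sum_of_squares / (len(scores) - 1)

# The standard deviation is the square root of the variance.
standard_deviation = math.sqrt(sample_variance)

print(mean, sum_of_squares, sample_variance, standard_deviation)

# Cross-check against the standard library's sample variance.
print(statistics.variance(scores))
```

Dividing by N instead of N - 1 gives the population variance, which is what `statistics.pvariance` computes.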
53
Q

Example of variance in line graph (orange dots and lines are variance)

A
54
Q
A
55
Q

The purpose of a control condition is to allow inferences about causality, as in Field’s quote:

A

only way to infer causality is through comparison of two controlled situations: one in which cause is present and one in which cause is absent

56
Q

What are residuals?

A

the difference between the observed value of the dependent variable and the value predicted by the model (in the simplest model, the mean).
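
As an illustration, the sketch below fits a straight line by ordinary least squares and computes the residuals as observed minus predicted. The chocolate-bar data and the `fit_line` helper are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical data: number of chocolate bars bought and total cost.
bars = [1, 2, 3, 4, 5]
cost = [1.1, 1.9, 3.2, 3.9, 5.1]

intercept, slope = fit_line(bars, cost)
predicted = [intercept + slope * x for x in bars]

# Residual = observed value minus the value the model predicts.
residuals = [y - p for y, p in zip(cost, predicted)]
print(residuals)
```

For a least-squares line with an intercept, the residuals always sum to zero (up to rounding), so it is their spread and pattern that matter when checking assumptions.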

57
Q

GLM assumption is that residuals will be

A

normally distributed - observed values of a variable will be normally distributed around the predicted value.

58
Q

Last assumption of GLM: Homoscedasticity which is that

A

residuals have constant variance at every level of x – for each level of the independent variable the amount of error or “noise” has a similar variance

59
Q

What is a dependent variable?

A

A dependent variable (or outcome variable) is a variable that is thought to be affected by changes in an independent variable.

60
Q

What is a confounding variable? - (2)

A

A confounding variable is a variable which has an unintentional effect on the dependent variable.

When carrying out experiments we attempt to control these extraneous variables; however, there is always the possibility that one of these variables is not controlled and if this affects the dependent variable in a systematic way, we call this a confounding variable.

61
Q

A predictor variable is a

A

variable that is thought to predict another variable.

62
Q

What is an independent variable? - (2)

A

An independent variable is a variable that is thought to be the cause of some effect.

This term is usually used in experimental research to denote a variable that the experimenter has manipulated.

63
Q

We cannot control for everything. With sales of chocolate bars, for example, we might expect other variables to affect the popularity of chocolate, so in a linear model (LM) we can add something called a - (4)

A

predictor variable: an additional variable that is related to your variable of interest.

For example, the time of year may be a predictor variable: over Easter you may see an increase in sales.

In a GLM you can plug in this predictor variable and any others to expand your model, using predictor variables (i.e. independent variables) that you may not be directly interested in.

When we have several predictors in a regression, it is a multiple regression.

64
Q

The central limit theorem tells us that if we have enough participants (typically more than 30), the sampling distribution of the mean approaches a

A

normal distribution

65
Q

The central limit theorem states that the sampling distribution of the mean approaches a normal distribution, as the sample size increases.

This fact holds especially true for

A

sample sizes over 30 (N > 30)

66
Q

As the sample size increases, the sample mean and standard deviation will be (CLT)

A

closer in value to the population mean μ and standard deviation σ .

67
Q

The central limit theorem tells us that no matter what the distribution of the population is, the shape of the sampling distribution will approach normality as the sample size (N)

A

increases
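
A small simulation can make this concrete. The sketch below is an illustration under assumed settings: the population is exponential (strongly skewed, mean 1), and the sample sizes and number of repetitions are arbitrary choices.

```python
import random
import statistics

random.seed(123)  # fixed seed so the run is repeatable

def sample_means(n, draws=2000):
    """Means of `draws` random samples of size n from a skewed population."""
    means = []
    for _ in range(draws):
        sample = [random.expovariate(1.0) for _ in range(n)]
        means.append(sum(sample) / n)
    return means

means_n5 = sample_means(5)
means_n50 = sample_means(50)

# Spread of the sampling distribution (an estimate of the standard error).
spread_n5 = statistics.stdev(means_n5)
spread_n50 = statistics.stdev(means_n50)

print(spread_n5, spread_n50)       # larger samples give a tighter cluster
print(statistics.mean(means_n50))  # close to the population mean of 1
```

Plotting a histogram of `means_n50` would also show a far more symmetric, bell-like shape than the skewed population the samples came from.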

68
Q

How is CLT useful? - (2)

A

a researcher never knows which mean in the sampling distribution is the same as the population mean,

but by selecting many random samples from a population the sample means will cluster together, allowing the researcher to make a very good estimate of the population mean.

69
Q

as the sample size (N) increases the (CLT)

A

sampling error will decrease

70
Q

In a normal distribution the values of skew and kurtosis are

A

0

71
Q

Definition of tertium quid

A

the possibility that an apparent relationship between two variables is actually caused by the effect of a third variable on them both (often called the third-variable problem)

72
Q

Definition of confounding variable

A

a variable (that we may or may not have measured) other than the predictor variables in which we’re interested that potentially affects an outcome variable.

73
Q

Confounding variable jeopardises the

A

reliability and validity of an experiment’s outcome

74
Q

Confounding variables can be measured using reliable and

A

unreliable scales

75
Q

A test can still measure a useful construct or variable but still not be

A

valid

76
Q

Internal consistency is - (2), with an example

A

It measures whether several items that propose to measure the same general construct produce similar scores.

e.g., participants agreed with a statement like “I enjoy rock music” and disagreed with a statement like “I hate rock music”

77
Q

DV or outcome variable is variable thought to be affected by changes in

A

independent variable

78
Q

An independent variable is a variable that is thought to be the cause of

A

some effect

79
Q

Reliability is whether an instrument can be

A

interpreted consistently across different situations

80
Q

What is the ‘fit’ of a model?

A

The ‘fit’ of the model is the degree to which a statistical model represents the data collected

81
Q

Counterbalancing can compensate for

A

practice effects, as it ensures that they produce no systematic variation between our conditions by counterbalancing the order in which a person participates in each condition

82
Q

Practice effects are an issue in what design?

A

repeated-measures design

83
Q

Giving participants a break between tasks is a technique used to compensate for

A

boredom effects

84
Q

The homogeneity of variance assumption is that the variance

A

within each of the populations is equal

85
Q

Residual variance helps us confirm how well a - (2)

A

regression line that we constructed fits the actual data set.

The smaller the variance, the more accurate the predictions are

86
Q

The coefficient of determination is the correlation

A

coefficient squared: the amount of variability in one variable shared by another
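
As a quick illustration, the sketch below computes Pearson's r from its definition and squares it. The revision-hours and exam-score data are made up, and `pearson_r` is a hypothetical helper.

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: hours of revision and exam score.
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 60, 68, 72]

r = pearson_r(hours, score)
r_squared = r ** 2  # coefficient of determination

print(r, r_squared)
```

Multiplying `r_squared` by 100 gives the percentage of variability in one variable that is shared with the other.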

87
Q

The sum of squares, variance and standard deviation are all measures of the

A

dispersion or spread of data around the mean

88
Q

The probability is p = 0.80 that a patient with a certain disease will be successfully treated with a new medical treatment. Suppose that the treatment is used on 40 patients. What is the “expected value” of the number of patients who are successfully treated?

Calculation

A

32, because 80% of 40 patients is 32 (40 × 0.80 = 32)
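
The same calculation as a short Python sketch, assuming each patient is an independent trial with the same success probability, so the expected count is N × p. The variable names are illustrative.

```python
import math

p_success = 0.80  # probability that one patient is successfully treated
n_patients = 40   # number of patients given the treatment

# Expected value of a count of successes: number of trials times probability.
expected_successes = n_patients * p_success

print(expected_successes)
```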

89
Q

The sum of squared errors is the sum of the

A

squared deviances

90
Q

Assumptions of parametric data - (4)

A
  1. Normally distributed data
  2. Homogeneity of variance: variances should be the same throughout the data.
  3. Data measured at least at interval level
  4. Independence