5. Review of Descriptive statistics and hypothesis Flashcards

1
Q

what are descriptive statistics?

A

Statistics used simply to describe the data collected, whether it is sample or population data. They let us screen the data and observe trends.

2
Q

what are inferential statistics?

A

use sample statistics to infer something about a population
to test whether a difference/relationship seen in sample data is sufficiently large to accept it may be real in the population
allows us to test hypotheses and make decisions based on sample data

3
Q

what do equations aim to do?

A

achieve specific things for specific purposes

4
Q

what are equations made up of?

A

subcomponents all of which do something useful for achieving that purpose

5
Q

what do equations produce?

A

numbers that are meaningful with respect to that purpose

6
Q

what is the first thing we want to do when we imagine a set of data?

A

have a look at its distribution and we might want to think about how to characterise that distribution numerically

7
Q

what are the characteristics of a data set?

A

central tendency, variability and shape

8
Q

what is central tendency?

A

mean
median
mode

9
Q

what is variability?

A
sum of squares
variance 
standard deviation
range
standard error
10
Q

what is the normal distribution?

A

a function that represents the distribution of many random variables as a symmetrical bell-shaped graph.

11
Q

what is modality (with regard to the shape of a distribution)?

A

the number of central clusters that a distribution possesses

12
Q

what are the two types of modality?

A

unimodal and bimodal

13
Q

unimodal

A

scores vary around one central point

14
Q

bimodal

A

scores vary around two “central” points

15
Q

what does kurtosis mean?

A

“Peakedness” - how tightly clustered are scores around the mean?

16
Q

skew

A

the symmetry of the tails of the distribution

17
Q

what are the characteristics of a “normal” curve?

A

distribution is unimodal
has moderate peakedness
and has symmetric tails.

18
Q

what does Sigma designate?

A

“The sum of” - so simply add them up

19
Q

what is the symbol for sigma?

A

∑

20
Q

what does ∑x mean?

A

the sum of all values of x

21
Q

what does the mean tell us?

A

something useful about the center of the data-set

22
Q

what does the mean not tell us?

A

it doesn't tell us anything about the variability around the mean

23
Q

what is the equation of the mean?

A

mean=M= (∑x)/n

24
Q

what is a simple way we can calculate how each participant’s score varies with respect to the mean?

A

subtract the mean from each participant’s score.

X-M

25
how does subtracting the mean from each participant's score characterise the data set as a whole?
On its own it does not: summing the deviations with sigma, ∑(X-M), always gives zero. This is because the mean has been subtracted from every score that contributes to it, so the positive and negative deviations cancel exactly.
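A small sketch of why ∑(X-M) is always zero. The scores below are made-up values used only for illustration.

```python
from statistics import mean

# Hypothetical scores, invented for this illustration
scores = [2, 4, 6, 8, 10]

m = mean(scores)                      # M = (∑x)/n
deviations = [x - m for x in scores]  # X - M for each participant
total = sum(deviations)               # ∑(X - M)

print(deviations)
print(total)
```

Whatever scores are used, the negative and positive deviations balance out, which is why the deviations are squared before summing.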
26
what is ^2 (to the power of 2) also known as?
squared
27
what does X^2 designate?
X squared or X * X (X multiplied by itself)
28
why is squaring handy?
because the square of a negative number is positive, so negative deviations no longer cancel out positive ones
29
what is the abbreviation of sum of squared deviation?
SS
30
what is another way to say "Sum of squared deviations"?
sum of squares
31
what is the equation for sum of squares or sum of squared deviations?
SS= ∑(X-M)^2
32
what does the sum of squares tell us?
it tells us something about the total variability in the data set, but does not really characterise the degree to which each participant varies around the mean
33
what is the abbreviation for variance?
SD^2 or σ^2
34
how do we calculate the variance?
by dividing the sum of squares by the number of observations minus 1. That is: σ^2 = SS/(n-1)
35
what is the complete equation for variance?
σ^2 = ∑(X-M)^2 / (n-1)
36
what happens when you take the square root of the variance?
we can calculate the standard deviation
37
what is the abbreviation for standard deviation?
σ or SD
38
what is the complete equation for the standard deviation?
σ = √( SS / (n-1) )
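The chain from sum of squares to variance to standard deviation can be sketched step by step. The scores are hypothetical, and the comparison with Python's `statistics` module is only a sanity check.

```python
from math import sqrt
import statistics

# Hypothetical scores, invented for this illustration
scores = [2, 4, 6, 8, 10]
n = len(scores)

m = sum(scores) / n                    # M = (∑x)/n
ss = sum((x - m) ** 2 for x in scores) # SS = ∑(X-M)^2
variance = ss / (n - 1)                # σ^2 = SS/(n-1)
sd = sqrt(variance)                    # σ = √(SS/(n-1))

print(ss, variance, sd)

# The standard library uses the same n-1 formulas
print(statistics.variance(scores), statistics.stdev(scores))
```

Each line mirrors one card: SS first, then division by n-1, then the square root.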
39
what is the standard deviation?
roughly the average amount by which scores vary around the mean. This is useful because any information about the degree of variability around the mean is important
40
what is the degrees of freedom?
the number of values in the final calculations of a statistic that are free to vary
41
what is the abbreviation of degrees of freedom?
df
42
what is the initial degrees of freedom equal to?
the number of observations
43
what is the abbreviation for the number of observations?
N
44
what is the equation for degrees of freedom when testing variability?
N-1
45
why do we subtract 1 from N when calculating variability (standard deviation) using degrees of freedom?
because when calculating the SD you first have to calculate the mean. In doing so, you use up one of your degrees of freedom. Therefore the df that remains for calculating the SD is N-1
46
what does using a degree of freedom where N-1 allow?
a more accurate estimate of the population parameters, which is what we want since we aim to make inferences
47
what is the usual chosen measure of central tendency?
the mean
48
what does the chosen measure of central tendency provide?
provides an estimate of the level of performance in each condition
49
what is the usual measure of variability?
standard deviation
50
what does the measure of variability tell us?
how reliable the estimate is
51
What can outliers or extreme scores do?
affect both the measures of central tendency (especially the mean) and variability
52
what is the golden rule when measuring central tendency?
a measure of central tendency without an accompanying measure of variability cannot be accurately interpreted
53
finish the sentence: Depending on the characteristics of the distribution the measure of central tendency may...
not be a good indicator of how the subjects performed
54
what is the measure of central tendency an estimate of?
effect size
55
what is the measure of variability an estimate of?
error
56
what is the general form that most of the statistical tests can use?
Stat = (Estimate of Effect Size) / (Estimate of Error). The result is known as the stat value
57
how do we do inferential statistics?
compare the stat value against an appropriate probability distribution
58
what can we infer if the stat value is sufficiently far from the center of the probability distribution?
that the stat value is significantly different from the mean of that distribution
59
what does it mean to be significantly different or significantly far?
when p
60
when do we decide our p value (or significance level)?
before we do our statistical test
61
what is the central tendency characteristic of a normal distribution?
mean = median = mode
62
what is the standard normal distribution?
A normal distribution with Mean = 0 and SD = 1, where every score or point on the distribution is associated with a probability of how often that score arises
63
what is the Z score?
a score expressed on the standardised normal distribution. It tells us how many SDs away from the mean (M) a particular score is
64
how can we calculate the z score?
if we know the mean (M) and the standard deviation (SD) of our data set, we can convert any score (X) to a Z-score simply by subtracting the mean and then scaling (i.e. dividing) by the SD
65
what is the equation for a z score?
Z= (X-M) / SD
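A minimal sketch of the conversion. The function name `z_score` and the example numbers are assumptions made for illustration.

```python
def z_score(x, m, sd):
    """Convert a raw score X to a Z score: Z = (X - M) / SD."""
    return (x - m) / sd

# Hypothetical example: a score of 70 in a data set with M = 50 and SD = 10
z = z_score(70, 50, 10)
print(z)  # the score sits this many SDs above the mean
```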
66
how does one find the Z score?
in the Z table at the back of any leading stats book, or using an online Z calculator
67
How do we know if a score is an outlier
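if its Z score is > 3 (a common rule of thumb, usually applied to the absolute value so that extreme low scores are also caught)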
68
what do we do with a Z > 3
we would exclude these scores from further analysis
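A sketch of this screening step, under two assumptions: the scores are invented, and the cut-off is applied to the absolute Z score so that extreme low scores would be flagged too.

```python
from statistics import mean, stdev

# Hypothetical scores, invented for this illustration; 95 is the extreme value
scores = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49, 51, 50, 52, 48, 95]

m = mean(scores)
sd = stdev(scores)  # sample SD, using n-1

# Flag any score whose Z score is more than 3 SDs from the mean
outliers = [x for x in scores if abs((x - m) / sd) > 3]
retained = [x for x in scores if abs((x - m) / sd) <= 3]

print(outliers)
print(len(retained))
```

Note that in small samples a single extreme score inflates the SD itself, so a Z above 3 may be impossible to reach; the rule works better with larger data sets.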
69
what is an appropriate estimate of effect size?
the difference between an individual's score and the mean of the distribution of the group of (individuals') scores. This is appropriate because we are treating the group as the population of interest
70
what is an appropriate estimate of error?
the SD of the distribution of the group of (individuals') scores. This is appropriate because we are essentially treating the group as the population of interest
71
what is finding a z score a case of?
hypothesis testing
72
what is the Z test asking?
does this particular individual belong to or differ from a particular population (of which we know the mean and SD)? More generally, we ask questions about a group of people, where the population mean and SD may be unknown
73
what do we need to do if we want to compare the mean of a group of people's scores? (This is normally the case in an experiment.)
we need to compare this against a distribution of group mean scores
74
what is another way to say a distribution of group mean scores?
a distribution of means
75
the larger the set of means...
the smaller the variability
76
what is the comparison between a distribution of sample means and a distribution of any given sample?
the distribution of sample means has a much lower variability. Its spread is smaller by a factor of the square root of the number of observations
77
what should we do if we want to test a sample mean?
compare it to a distribution of sample means. However, we do not need to collect a whole set of sample means to form the distribution that we test our particular sample mean against
78
what does the behaviour of a normal distribution allow us to do?
to make an estimate of the variance (error) of the distribution of sample means
79
What is the standard error of the mean?
the sample SD divided by the square root of the number of observations in the sample
80
what is the abbreviation of the standard error of the mean?
S_M (S with a subscript M)
81
what is the equation of the standard error of the mean?
S_M= σ / √n
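A short sketch of the standard error calculation from a sample. The scores are hypothetical and the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import stdev

# Hypothetical sample, invented for this illustration
sample = [4, 8, 6, 5, 7]
n = len(sample)

sd = stdev(sample)   # sample SD (n-1 in the denominator)
s_m = sd / sqrt(n)   # S_M = SD / √n

print(sd, s_m)
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.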
82
If we have a known population mean ( μ =100) and standard deviation (SD =10), we can determine whether a sample mean (M=104.75, n=20) is “significantly” different to the population. How can we do this?
using the Z equation: Z = (M-μ) / S_M = (104.75 - 100) / (10/√20) = 4.75 / 2.24 = 2.12. As Z = 2.12 falls inside the critical region (below -1.96 or above 1.96), we can reject the null hypothesis and say there is a significant difference
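The arithmetic in this card can be checked with a few lines. The numbers come from the card; the two-tailed p value is an extra step, and reading ±1.96 as a two-tailed .05 cut-off is an assumption about what the card intends.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 100, 10   # known population mean and SD (from the card)
M, n = 104.75, 20     # sample mean and sample size (from the card)

s_m = sigma / sqrt(n) # standard error of the mean
z = (M - mu) / s_m    # Z = (M - μ) / S_M

# Two-tailed probability of a Z at least this extreme under the standard normal
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

# Assumed decision rule: reject H0 if Z falls beyond ±1.96
significant = abs(z) > 1.96

print(round(s_m, 2), round(z, 2), significant)
```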
83
what is the z equation for determining whether a sample mean is significantly different to the population?
Z= (M-μ) / S_M
84
when would you use a one sample t-test?
sometimes the population parameters are not all known. Where the population mean is known but the SD is not, we can use a one-sample t-test
85
what is the equation for a one sample t-test?
t = (M-μ) / ( S / √n )
86
what is S in the one sample t-test equation?
S = estimated population standard deviation
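A sketch of the one-sample t calculation using only the standard library. The scores and the population mean of 100 are hypothetical; the resulting t would still need to be compared against a critical value from a t table with df = n-1.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample and hypothetical known population mean
scores = [102, 98, 110, 105, 95, 108]
mu = 100

n = len(scores)
M = mean(scores)
S = stdev(scores)            # estimated population SD (n-1 in the denominator)
t = (M - mu) / (S / sqrt(n)) # t = (M - μ) / (S / √n)
df = n - 1                   # one df used up in estimating the mean

print(M, df, t)
```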
87
what is not needed when estimating the population distribution?
the standard normal distribution (since one or more population parameter is unknown)
88
What is used instead of the standard normal distribution when estimating the population distribution?
a special family of distributions called t distributions
89
what are t distributions
approximations of the Z distribution, which change shape according to the size of the degrees of freedom
90
why do t distributions change shape according to the size of the degrees of freedom?
because the larger our sample, the more accurate our sample statistics estimate the population parameters
91
what is the degrees of freedom?
the number of observations (N) minus the number of estimates made (e.g. the mean)
92
what do we need when using a table of the t distribution?
need to know the df and need to specify if we want a one-tailed (p
93
what does a larger degrees of freedom do to a t distribution?
a larger df makes the distribution taller, and a smaller df makes it flatter
94
how do t distributions and normal distributions differ?
they are similar, but the t distribution is slightly different for each sample size. It gets closer to normal as the sample size increases.
95
why is there more error involved in a t distribution?
because we have estimated the population variance, so slightly more of the distribution lies in the tails
96
what does a smaller sample size of a t distribution indicate?
the smaller the df, the larger the critical t value that must be exceeded
97
what are the types of t tests
single sample t test | repeated measures t test
98
what is the equation for a single sample t test?
t= (M-μ) / S_M
99
what is the equation for a repeated measures t test?
same as single sample (t= (M-μ) / S_M ) but calculated from difference scores, not raw scores. Remember µ = 0 under H0 (no difference)
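A sketch of the repeated measures calculation. The before and after scores are hypothetical; the point is that the test is a single-sample t on the difference scores with μ = 0.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired scores from the same five participants
before = [10, 12, 9, 14, 11]
after = [12, 15, 10, 15, 13]

# Work on difference scores, not raw scores
diffs = [a - b for a, b in zip(after, before)]

n = len(diffs)
M_d = mean(diffs)             # mean difference
s_m = stdev(diffs) / sqrt(n)  # standard error of the mean difference
t = (M_d - 0) / s_m           # μ = 0 under H0 (no difference)

print(diffs, M_d, t)
```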
100
when dealing with a difference between means, what is something we need?
a corresponding distribution and error term
101
what is the equation of independent groups design t test?
t = (M_1 - M_2) / S_Diff
102
what is the equation for S_Diff?
S_Diff = √( S^2_M1 + S^2_M2 )
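A sketch of the independent groups t built from the two cards above. The group scores are hypothetical, and each squared standard error is taken as that group's variance divided by its n, following the standard error card.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical scores for two independent groups
group_1 = [5, 7, 8, 6, 9]
group_2 = [3, 4, 6, 5, 2]

# Squared standard error of each mean: S^2_M = variance / n
s2_m1 = variance(group_1) / len(group_1)
s2_m2 = variance(group_2) / len(group_2)

s_diff = sqrt(s2_m1 + s2_m2)                  # S_Diff = √(S^2_M1 + S^2_M2)
t = (mean(group_1) - mean(group_2)) / s_diff  # t = (M_1 - M_2) / S_Diff

print(s_diff, t)
```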
103
how do you calculate effect size for a z distribution?
individual score - sample mean
104
what is the error of a z distribution?
sample standard deviation
105
what is the general statistic form?
Stat = (Estimate of effect size) / (Estimate of error)
106
what is the equation for testing an individual against a sample?
Z=(X-M)/SD
107
what is the equation for testing a sample against a known population (where the population SD is unknown)?
t= (M-μ) /(S/√n)
108
what is the equation for testing a sample where population parameters are known?
Z= (M-μ) / (σ/√n)
109
what is the equation for testing two samples against each other?
t= (M_1-M_2) / S_Diff
110
what is the question of error and statistical significance?
Is the difference we see sufficiently large given the amount of associated error? Is it more likely to be an effect of the IV or just sampling error?
111
what are the three assumptions to be made when making a statistical test?
1. All observations are independent
2. Distributions are normally distributed
3. Variance of one group is not too much larger than the other
112
what is the assumption that all observations are independent?
usually a methodological question. Ensure no one person’s performance is affected by or affects someone else's
113
what is the assumption that distributions are normal?
Check histograms of both groups, plus skewness and kurtosis.
If the samples have N>30, the shape of the sample distribution is less important, as the theoretical distribution of the difference between the means will be normal.
114
what is the assumption that the variance of one group is not too much larger than the other (homogeneity of variance)?
If checking manually, a largest variance more than 4 times the smallest variance is problematic.
SPSS checks this automatically using Levene’s test.
Breaches of the homogeneity assumption can inflate Type 1 error.
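The manual check can be sketched as a simple variance ratio. The group scores are hypothetical, and this is only the rule of thumb from the card, not Levene's test.

```python
from statistics import variance

# Hypothetical scores for two groups
group_a = [5, 7, 8, 6, 9]
group_b = [1, 9, 3, 12, 5]

var_a = variance(group_a)
var_b = variance(group_b)

# Ratio of the largest variance to the smallest variance
ratio = max(var_a, var_b) / min(var_a, var_b)
problematic = ratio > 4  # rule of thumb: more than 4 times is a concern

print(var_a, var_b, ratio, problematic)
```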
115
what does a statistically significant result not prove?
that the IV caused the DV
116
what does causation and interpretation of results depend heavily on?
the nature and integrity of the research design
117
what does statistical significance indicate?
that the result seen is highly unlikely to have happened by chance alone.