Week 4 Flashcards

1
Q

What’s the importance of visuals in statistics?

A

-Data sets may have a similar mean yet a different spread, which we can’t see by looking solely at the numbers
-On a scatterplot, dots closer to the line = stronger relationship between variables

2
Q

What are histograms good at?

A

Showing the distribution of data, which can be:
-symmetric
-skewed right (+ skew: tail points right, like toes on right foot)
-skewed left (- skew: tail points left, like toes on left foot)

3
Q

How does a stem and leaf plot work?

A

Put the tens (stems) in a column and the units/end digits (leaves) in a list next to them:
5 | 8
6 | 26778
7 | 14555 (e.g. one value is 75)
-We can compare 2 data sets using one (back-to-back) stem and leaf plot.
-We can work out the mode (e.g., 75), median and mean from it.
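
As a rough sketch of how the plot above is built, the snippet below groups the same eleven values by their tens digit and then works out the mode, median and mean. The function name `stem_and_leaf` and the `|` separator are illustrative choices, not something from the lecture.

```python
from collections import defaultdict
from statistics import mean, median, mode

# Values read off the example plot above (stem = tens, leaf = units).
scores = [58, 62, 66, 67, 67, 68, 71, 74, 75, 75, 75]

def stem_and_leaf(values):
    """Return one text row per stem: the tens digit, then the sorted unit digits."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return [f"{stem} | {''.join(str(leaf) for leaf in leaves)}"
            for stem, leaves in sorted(stems.items())]

rows = stem_and_leaf(scores)
print("\n".join(rows))
print("mode:", mode(scores))            # most frequent value
print("median:", median(scores))        # middle value of the sorted data
print("mean:", round(mean(scores), 2))  # sum divided by count
```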

4
Q

What are boxplots useful for?

A

-Boxplots are useful for showing medians, ranges, interquartile ranges (IQRs), skewness etc.
-We can also compare 2 data sets using boxplots
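
A minimal sketch of the numbers a boxplot is drawn from, reusing the eleven values from the stem and leaf card. Quartiles can be computed under several conventions; this uses the default ("exclusive") method of Python's `statistics.quantiles`, so other software may give slightly different values.

```python
from statistics import quantiles

scores = [58, 62, 66, 67, 67, 68, 71, 74, 75, 75, 75]

# quantiles(..., n=4) returns the three quartile cut points: Q1, median, Q3.
q1, med, q3 = quantiles(scores, n=4)
iqr = q3 - q1                           # width of the box
data_range = max(scores) - min(scores)  # distance between the extremes

print("min:", min(scores), "Q1:", q1, "median:", med, "Q3:", q3, "max:", max(scores))
print("IQR:", iqr, "range:", data_range)
```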

5
Q

How can you tell if boxplots are skewed?

A

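right/+ly skewed = most data on the lower end of the scale, with the tail trailing toward the upper end (longer upper whisker)
left/-ly skewed = most data on the upper end of the scale, with the tail trailing toward the lower end (longer lower whisker)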

6
Q

What is a normal distribution like?

A

Normally distributed data fits nicely under a symmetric, bell-shaped curve.
This allows us to run better and more accurate statistical tests.

7
Q

Name the two ways in which a distribution can deviate from
normality

A

– Lack of symmetry (skewness)
– Pointiness (kurtosis)

8
Q

What is an example of frequency distribution?

A

■ Histograms
■ They’re individual frequency bars
■ Each bar gives the frequency of a given value e.g., we can count how many people have a healthy heart rate

9
Q

What is an example of probability distribution?

A

■ Bell curves
■ They’re smooth, but segmented by SDs
■ Area under the curve between two values is the probability that a value falls in that range
■ E.g., we can work out the likelihood of a person having a healthy heart rate
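
The "area under the curve" idea can be sketched with Python's built-in `NormalDist`. The mean (75), SD (10) and the "healthy" range (60 to 100 bpm) are made-up numbers for illustration only, not values from the lecture.

```python
from statistics import NormalDist

# Hypothetical population: resting heart rate ~ Normal(mean=75, sd=10).
heart_rate = NormalDist(mu=75, sigma=10)
low, high = 60, 100  # assumed "healthy" range, for illustration only

# Area under the curve between two values = probability of falling in that range.
p_healthy = heart_rate.cdf(high) - heart_rate.cdf(low)

# The same idea explains the SD segments: area within 1 SD of the mean.
p_within_1sd = heart_rate.cdf(85) - heart_rate.cdf(65)

print(round(p_healthy, 3))     # about 0.927
print(round(p_within_1sd, 3))  # about 0.683
```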

10
Q

True or false: outliers have a bigger impact on smaller sized samples

A

True
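
A quick way to see why, using made-up scores: add the same outlier to a small and a large sample and compare how far the mean moves. The helper name `mean_shift` is hypothetical.

```python
from statistics import mean

def mean_shift(sample, outlier):
    """How far the mean moves when one outlier is added to the sample."""
    return mean(sample + [outlier]) - mean(sample)

small = [70] * 5   # made-up sample of 5 scores
large = [70] * 50  # made-up sample of 50 scores
outlier = 130

small_shift = mean_shift(small, outlier)  # (350 + 130) / 6 - 70
large_shift = mean_shift(large, outlier)  # (3500 + 130) / 51 - 70

print(small_shift, round(large_shift, 2))
```

The single outlier drags the small sample's mean up by 10 points, but the large sample's mean by only about 1.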

11
Q

Define skewness

A

■ Skewness is deviation from symmetry.
■ A big difference between the mean, median and mode (visible on a histogram) = skewed data
■ Skewness means some extreme scores are affecting the mean.

12
Q

Define kurtosis (i.e. pointiness)

A

■ Kurtosis is a measure of the tailedness of a distribution
■ Tailedness = How often outliers occur
■ Three types = Mesokurtic (AKA zero, AKA normal); Leptokurtic (AKA positive/thin); Platykurtic (AKA negative/flat)

13
Q

What is Kurtosis: Leptokurtic

A

■ + kurtosis
■ High peak
■ Lepto = skinny (in the middle)
■ Fat tails (outliers): signifies either lots of outliers or occasional outliers which are very extreme

14
Q

What is Kurtosis: Platykurtic

A

■ Negative kurtosis
■ Flatter distribution
■ Platy = Broad (in the middle)
■ Skinny tails (outliers): signifies few outliers or outliers which are not so extreme

15
Q

Why is the distribution so important?

A

■ Tells us which measure of central tendency/dispersion best represents our sample:
-normal distribution = mean and standard deviation
-skewed data = median and ranges
■ Also tells us which inferential statistics we should use.

16
Q

How can you assess the distribution of data using SPSS/histograms?

A

-Perfectly normally distributed data has a skewness of 0.
-If the skewness statistic (ignoring its sign) is > twice its standard error = data likely skewed.
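
The rule of thumb above can be sketched as follows. The formulas are the adjusted sample skewness and its standard error as commonly documented for statistics packages; treat it as an assumption that they match your SPSS output exactly. The function names and both data sets are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def skewness(values):
    """Adjusted sample skewness (0 for perfectly symmetric data)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

def skewness_se(n):
    """Standard error of the skewness statistic; depends only on sample size."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def likely_skewed(values):
    """Rule of thumb: |skewness| more than twice its standard error."""
    return abs(skewness(values)) > 2 * skewness_se(len(values))

symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]           # made-up, symmetric
right_skewed = [1, 1, 1, 2, 2, 2, 3, 3, 20, 40]   # made-up, long right tail

print(likely_skewed(symmetric), likely_skewed(right_skewed))
```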

17
Q

Define standard error

A

The standard deviation of sample means (estimated as the sample SD divided by the square root of the sample size). I.e. how far the mean of our sample is likely to be from the population mean.
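
A minimal sketch of the usual estimate, SE = SD / √n, on made-up data. The function name `standard_error` is illustrative.

```python
from math import sqrt
from statistics import stdev

def standard_error(values):
    """Estimated standard error of the mean: sample SD divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores, mean = 5
se = standard_error(sample)
print(round(se, 3))
```

Because n is in the denominator, larger samples give a smaller standard error for the same spread.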

18
Q

What are the rules in producing a table?

A

– Labelled and titled.
– Placed at the top of the most appropriate page.
– Font and size should be the same as the main text.
– Logical and easy to understand.

19
Q

Define figures

A

All other visuals that are not tables, e.g. bar charts, scatter plots

20
Q

What 2 types of bar charts can you have?

A

1. Simple (bars separate)
2. Clustered (bars together)

21
Q

Explain what error bars are

A

-A visual representation of the variability within your data (the lines drawn on top of the bars in a bar chart).
-Error bars hint at statistical significance:
-two confidence intervals do not overlap = difference between the parameters will be significant.
-two confidence intervals do overlap = difference between the two parameters can be significant or non-significant.
■ But we have our p-values to tell us about statistical significance.

22
Q

What’s the most common confidence interval used in error bars?

A

-95% confidence intervals are the most common.
-The % is how often you expect intervals built this way to capture the true value (e.g., the population mean) if the study were repeated.
■ E.g., in 95 out of 100 repeated samples, the true value would fall between the upper and lower values specified by the confidence interval.
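
As a hedged sketch, a 95% confidence interval for a mean can be approximated as mean ± 1.96 × standard error. This uses the normal approximation; for small samples a t-based interval is wider, so treat the numbers as illustrative. Both groups are made-up data and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(values, level=0.95):
    """Normal-approximation CI for the mean: mean +/- z * standard error."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 when level = 0.95
    se = stdev(values) / sqrt(len(values))
    m = mean(values)
    return m - z * se, m + z * se

def intervals_overlap(a, b):
    """True if two (lower, upper) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

group_a = [2, 4, 4, 4, 5, 5, 7, 9]          # made-up scores, mean = 5
group_b = [10, 12, 12, 12, 13, 13, 15, 17]  # made-up scores, mean = 13

ci_a = confidence_interval(group_a)
ci_b = confidence_interval(group_b)
print(ci_a, ci_b, intervals_overlap(ci_a, ci_b))
```

Here the two intervals do not overlap, which is the error-bar pattern that hints at a significant difference.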

23
Q

When do the figures go in the appendix/results section?

A

Appendix = figures used to assess data (for you), e.g., stem and leaf plots, boxplots and histograms, which are used to assess distribution (e.g., skewness).

Results = figures used to visually present data analysis (for others), e.g., bar graphs and scatterplots, which are not used to assess data.

24
Q

Summarise this lecture

A

■ Visual aids are useful tools to assess/explain some aspects
of the data.
■ When assessing data, they can show measures of central
tendency/measures of spread.
(put in the appendix of reports.)
■ When explaining data, they can show differences or
relationships in the data.
(in the results section of reports.)