S1.2 Summary Statistics Flashcards

Calculate summary statistics for single data sets and use them in the interpretation of data.

1
Q

Statistics

Statistics is the science of collecting, classifying and analyzing information.

A
2
Q

Statistic

A statistic is a numerical value (such as the mean or range) calculated from a set of data.

A
3
Q

Measures of Location

A

The mean, median and mode are three summary statistics that represent the centre or the average of a set of data. They are called the measures of central tendency (or measures of location).

4
Q

Mean

(Measure of Location)

A

The mean is the arithmetic average of a set of data: the sum of all the scores divided by the number of scores.

5
Q

Median

(Measure of Location)

A

The median is the score which is located in the middle of an ordered set of data. The median divides the data into two equal groups.

6
Q

Mode

(Measure of Location)

A

The mode is the most frequently occurring score in a set of data. The mode is more useful for categorical data.
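As a rough illustration of the three measures of location, the sketch below uses Python's standard `statistics` module. The `scores` and `colours` lists are made-up data, not taken from the cards.

```python
import statistics

# Hypothetical test scores, invented for illustration.
scores = [4, 7, 7, 8, 9, 10, 11]

mean = statistics.mean(scores)      # sum of the scores / number of scores
median = statistics.median(scores)  # middle score of the ordered data
mode = statistics.mode(scores)      # most frequently occurring score

# The mode also works for categorical data, where a mean makes no sense.
colours = ["red", "blue", "red", "green", "red"]
favourite = statistics.mode(colours)

print(mean, median, mode, favourite)
```

For this data the mean and median coincide while the mode is lower, which shows that the three measures need not agree.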

7
Q

Quantiles

A

Quantiles are points in a set of ordered data which divide the data into equal groups. Commonly used quantiles are quartiles, deciles and percentiles.

8
Q

Quartiles

Quartiles (Q1, Q2, Q3) divide a data set into 4 equal groups.

A

Q3 separates the lower 75% of scores from the upper 25% of scores.
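A minimal sketch of finding the quartiles with Python's `statistics.quantiles`. The `scores` list is hypothetical. Textbooks and software use slightly different quartile conventions; this sketch relies on the function's default ("exclusive") method, which may not match the method used in a particular course.

```python
import statistics

# Hypothetical data: 11 scores, already in order.
scores = [2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 15]

# n=4 asks for the three cut points that split the data into 4 groups.
q1, q2, q3 = statistics.quantiles(scores, n=4)

print(q1, q2, q3)
```

Q2 is the same value as the median, since both split the ordered data into two equal halves.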

9
Q

Deciles

Deciles (D1, D2, … D9) divide a set of data into 10 equal groups.

A

D2 cuts off the lower 20% of scores from the upper 80% of scores.

10
Q

Percentiles

Percentiles (P1, P2, P3, … P99) separate a large set of data into 100 equal groups.
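Deciles and percentiles can be sketched the same way as quartiles. The example below assumes Python's `statistics.quantiles` with its default method, applied to made-up data (the integers 1 to 99). Splitting data into n groups takes n - 1 cut points.

```python
import statistics

# Hypothetical data: the integers 1 to 99.
scores = list(range(1, 100))

deciles = statistics.quantiles(scores, n=10)       # 9 cut points: D1 ... D9
percentiles = statistics.quantiles(scores, n=100)  # 99 cut points: P1 ... P99

d2 = deciles[1]        # D2: lower 20% of scores fall below this value
p50 = percentiles[49]  # P50: the same point as the median

print(len(deciles), len(percentiles), d2, p50)
```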

A
11
Q

Measures of Spread

A second important feature of a set of data is how spread out its scores are.

A

Three statistics which measure the spread of the scores in a set of data are the range, interquartile range (IQR) and standard deviation.

12
Q

Range

(Measure of Spread)

A

The range is the difference between the highest and lowest scores in a set of data.

13
Q

Interquartile Range

(Measure of Spread)

A
14
Q

Standard Deviation

(Measure of Spread)

A

Standard deviation measures how far the scores in a data set typically lie from the mean.
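The three measures of spread can be compared on one small data set. This is a sketch with hypothetical scores. It takes the range as highest minus lowest and the IQR as Q3 minus Q1 (the standard definitions), and it assumes the population form of standard deviation (`statistics.pstdev`); a course may use the sample form instead.

```python
import statistics

# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(scores) - min(scores)          # highest score - lowest score

q1, _, q3 = statistics.quantiles(scores, n=4)   # default quartile method
iqr = q3 - q1                                   # spread of the middle 50%

sd = statistics.pstdev(scores)                  # population standard deviation

print(data_range, iqr, sd)
```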

15
Q

Outlier

A

An outlier is a very high or a very low score in a set of data which is clearly apart from the other scores.

16
Q

Effect of Outliers

Outliers can affect the reliability of some measures of spread and location.
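To see which measures an outlier disturbs, the sketch below adds one extreme score to a small made-up data set and recomputes the mean, median and range.

```python
import statistics

# Hypothetical scores, then the same scores with one extreme value added.
scores = [4, 5, 5, 6, 7, 7, 8]
with_outlier = scores + [50]

mean_before = statistics.mean(scores)
mean_after = statistics.mean(with_outlier)

median_before = statistics.median(scores)
median_after = statistics.median(with_outlier)

range_before = max(scores) - min(scores)
range_after = max(with_outlier) - min(with_outlier)

print(mean_before, mean_after)      # the mean is pulled towards the outlier
print(median_before, median_after)  # the median barely moves
print(range_before, range_after)    # the range changes dramatically
```

In this example the mean and range shift a great deal while the median changes only slightly, which is why the median is often preferred when outliers are present.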

A
17
Q

Detecting Outliers

A
18
Q

Five Number Summary

A
19
Q

Boxplot

A boxplot (or box-and-whisker plot) is a plot of a five-number summary.
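A five-number summary is conventionally the minimum, Q1, median, Q3 and maximum. The sketch below computes it for hypothetical scores; `five_number_summary` is an illustrative helper name, and the quartiles use the default method of `statistics.quantiles`, which may differ from a textbook method.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, q2, q3, max(data)

# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 15]
summary = five_number_summary(scores)

print(summary)
```

In a boxplot, the box spans Q1 to Q3 with a line at the median, and the whiskers extend towards the minimum and maximum.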

A
20
Q

Cumulative Frequency Graph & Polygon

A

The CF polygon (ogive) starts from the beginning of the histogram and joins the top right-hand corner of each column.

21
Q

Estimating Quartiles

Q1, Q2, and Q3 can be estimated using the Cumulative Frequency Polygon (Ogive).
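Reading a quartile off the polygon amounts to linear interpolation along one of its straight segments. The sketch below works under that assumption with made-up grouped data; `boundaries`, `frequencies` and `estimate_quantile` are illustrative names.

```python
from itertools import accumulate

# Hypothetical grouped data: class boundaries and the frequency of each class.
boundaries = [0, 10, 20, 30, 40, 50]
frequencies = [4, 8, 16, 8, 4]  # 40 scores in total

# Cumulative frequency at each boundary; the polygon starts at zero.
cumulative = [0] + list(accumulate(frequencies))

def estimate_quantile(fraction):
    """Estimate the score below which the given fraction of the data lies."""
    target = fraction * cumulative[-1]
    for i in range(1, len(cumulative)):
        if cumulative[i] >= target:
            # Interpolate along the straight segment that crosses the target.
            lower, upper = boundaries[i - 1], boundaries[i]
            below = cumulative[i - 1]
            share = (target - below) / (cumulative[i] - below)
            return lower + share * (upper - lower)

q1, q2, q3 = (estimate_quantile(f) for f in (0.25, 0.5, 0.75))

print(q1, q2, q3)
```

These are estimates only, because grouped data hides where the scores sit inside each class.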

A
22
Q

Shape of a Distribution

A

The shape of a distribution (a set of data) shows how the data is spread. The distribution of a set of data can often be classified as being either symmetric, positively skewed or negatively skewed.

23
Q

Symmetric Distribution

A symmetric distribution is evenly spread either side of its centre.

A
24
Q

Positively Skewed

Positively skewed data has a higher propotion of low scores.

A
25
Q

Negatively Skewed

Negatively skewed data has a higher proportion of high scores.
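The direction of skew can also be checked numerically. The sketch below uses a moment-based skewness coefficient, which is one common measure and is not mentioned on the cards. The three data sets are hypothetical.

```python
import statistics

def skewness(data):
    """Third central moment divided by the second central moment to the power 1.5."""
    mean = statistics.mean(data)
    n = len(data)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

symmetric = [1, 2, 3, 4, 5]
positive = [1, 1, 1, 2, 2, 3, 8]  # mostly low scores, long tail to the right
negative = [1, 6, 7, 7, 8, 8, 8]  # mostly high scores, long tail to the left

for name, data in [("symmetric", symmetric), ("positive", positive), ("negative", negative)]:
    print(name, skewness(data))
```

A value of zero suggests symmetry, a positive value a tail of high scores, and a negative value a tail of low scores.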

A
26
Q

Samples & Populations

A

The population mean and standard deviation are called parameters. The sample mean and standard deviation are statistics used to estimate the values of the population parameters.
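A small simulation can make the distinction concrete. This sketch builds a hypothetical population of 10,000 random scores, then draws a sample of 100 and uses the sample statistics to estimate the population parameters. The population size, sample size and seed are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: 10,000 scores centred near 50 with spread near 10.
population = [random.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)       # population mean (a parameter)
sigma = statistics.pstdev(population)  # population standard deviation (a parameter)

# A random sample of 100 scores drawn from that population.
sample = random.sample(population, 100)
x_bar = statistics.mean(sample)        # sample mean (a statistic)
s = statistics.stdev(sample)           # sample standard deviation (a statistic)

print(mu, x_bar)
print(sigma, s)
```

Note that `statistics.pstdev` divides by the number of scores, while `statistics.stdev` divides by one less, which is the usual choice when a sample is used to estimate a population value.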