S1.2 Summary Statistics Flashcards by Ian Gillies

Statistics

Statistics is the science of collecting, classifying and analyzing information.

Statistic

A statistic is a numerical value (such as the mean or range) calculated from a set of data.

Measures of Location

The mean, median and mode are three summary statistics that represent the centre or the average of a set of data. They are called the measures of central tendency (or measures of location).

Mean

(Measure of Location)

The mean is the average score in a set of data: the sum of the scores divided by the number of scores.

Median

(Measure of Location)

The median is the score which is located in the middle of an ordered set of data. The median divides the data into two equal groups.

Mode

(Measure of Location)

The mode is the most frequently occurring score in a set of data. The mode is more useful for categorical data.
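
As a quick illustration of the three measures of location, the sketch below uses Python's standard `statistics` module on a small made-up list of scores (the data and the names `scores` and `location` are assumptions for illustration only).

```python
from statistics import mean, median, multimode

# Hypothetical scores, already in order.
scores = [2, 3, 3, 5, 7, 8, 14]

location = {
    "mean": mean(scores),        # sum of scores / number of scores
    "median": median(scores),    # middle score of the ordered data
    "mode": multimode(scores),   # list of the most frequent score(s)
}
print(location)
```

`multimode` returns a list because a data set can have more than one mode.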

Quantiles

Quantiles are points in a set of ordered data which divide the data into equal groups. Commonly used quantiles are quartiles, deciles and percentiles.

Quartiles

Quartiles (Q₁, Q₂, Q₃) divide a data set into 4 equal groups.

Q₃ separates the lower 75% of scores from the upper 25% of scores.

Deciles

Deciles (D₁, D₂, … D₉) divide a set of data into 10 equal groups.

D₂ cuts off the lower 20% of scores from the upper 80% of scores.

Percentiles

Percentiles (P₁, P₂, P₃, … P₉₉) separate a large set of data into 100 equal groups.
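
A minimal sketch of quartiles, deciles and percentiles, assuming Python's `statistics.quantiles` with its default interpolation method. Textbooks and calculators use slightly different quartile rules, so values computed by hand may differ a little. The data set (the integers 1 to 200) is made up for illustration.

```python
from statistics import median, quantiles

# Hypothetical ordered data: 1, 2, ..., 200.
scores = list(range(1, 201))

quartiles = quantiles(scores, n=4)      # Q1, Q2, Q3
deciles = quantiles(scores, n=10)       # D1 ... D9
percentiles = quantiles(scores, n=100)  # P1 ... P99

# n equal groups need n - 1 cut points.
print(len(quartiles), len(deciles), len(percentiles))

# Q2, D5 and P50 all mark the same point: the median.
print(quartiles[1] == deciles[4] == percentiles[49] == median(scores))
```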

Measures of Spread

A second important feature of a set of data is how spread out its scores are.

Three statistics which measure the spread of the scores in a set of data are the range, interquartile range (IQR) and standard deviation.

Range

(Measure of Spread)

The range is the difference between the highest and lowest scores in a set of data.

Interquartile Range

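(Measure of Spread)

The interquartile range (IQR) is the difference between the upper and lower quartiles, IQR = Q₃ − Q₁. It measures the spread of the middle 50% of scores.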

Standard Deviation

(Measure of Spread)

Standard deviation measures how far, on average, the scores in a data set are from the mean.
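
The three measures of spread can be sketched as follows. This assumes Python's `statistics` module, treats the made-up scores as a whole population (hence `pstdev`), and uses the module's default quartile rule, which may differ slightly from a textbook method.

```python
from statistics import pstdev, quantiles

# Hypothetical scores, already in order.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(scores) - min(scores)  # highest score minus lowest score
q1, _, q3 = quantiles(scores, n=4)      # lower and upper quartiles
iqr = q3 - q1                           # spread of the middle 50% of scores
sd = pstdev(scores)                     # population standard deviation

print(data_range, iqr, sd)
```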

Outlier

An outlier is a very high or a very low score in a set of data which is clearly apart from the other scores.

Effect of Outliers

Outliers can affect the reliability of some measures of spread and location.
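
To see the effect, the sketch below compares a made-up data set with and without one very high score. The helper `summarise` and the data are hypothetical, chosen only to show which statistics move.

```python
from statistics import mean, median

def summarise(data):
    """Return the mean, median and range of a list of scores."""
    return {
        "mean": mean(data),
        "median": median(data),
        "range": max(data) - min(data),
    }

scores = [4, 5, 5, 6, 7]      # hypothetical scores
with_outlier = scores + [45]  # same scores plus one outlier

before = summarise(scores)
after = summarise(with_outlier)
print(before)
print(after)
```

In this example the mean and range change a lot when the outlier is added, while the median barely moves.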

Detecting Outliers
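
This card does not state a detection rule. The sketch below assumes one widely taught convention: a score is flagged as an outlier if it lies more than 1.5 × IQR below Q₁ or above Q₃. The function name `find_outliers`, the multiplier `k` and the data are illustrative assumptions, and the quartiles use Python's default rule.

```python
from statistics import quantiles

def find_outliers(data, k=1.5):
    """Flag scores outside the fences Q1 - k*IQR and Q3 + k*IQR (assumed rule)."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

scores = [4, 5, 5, 6, 7, 45]  # hypothetical scores
print(find_outliers(scores))
```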

Five Number Summary

The five-number summary of a set of data consists of the lowest score, Q₁, the median (Q₂), Q₃ and the highest score.

Boxplot

A boxplot (or box-and-whisker plot) is a plot of a five-number summary.
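
The five values a boxplot displays (lowest score, Q₁, median, Q₃, highest score) can be computed as in this sketch. It assumes Python's default quartile rule, and `five_number_summary` is a hypothetical helper name.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (lowest, Q1, median, Q3, highest) for a list of scores."""
    q1, q2, q3 = quantiles(data, n=4)
    return (min(data), q1, q2, q3, max(data))

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical scores
summary = five_number_summary(scores)
print(summary)
```

The box of the plot runs from Q₁ to Q₃ with a line at the median, and the whiskers extend towards the lowest and highest scores.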

Cumulative Frequency Graph & Polygon

The CF polygon (ogive) starts at the bottom left-hand corner of the first column of the cumulative frequency histogram and joins the top right-hand corner of each column.

Estimating Quartiles

Q₁, Q₂, and Q₃ can be estimated using the Cumulative Frequency Polygon (Ogive).
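
Reading a quartile off the ogive amounts to linear interpolation along the polygon: find where the cumulative frequency reaches 25%, 50% or 75% of the total. The sketch below assumes grouped data given as class boundaries and frequencies. The names `ogive_points` and `estimate_quantile` and the data are hypothetical.

```python
from itertools import accumulate

def ogive_points(boundaries, frequencies):
    """Pair each class boundary with the cumulative frequency up to it."""
    cumulative = [0, *accumulate(frequencies)]
    return list(zip(boundaries, cumulative))

def estimate_quantile(points, fraction):
    """Interpolate along the ogive to the score at the given fraction of the data."""
    target = fraction * points[-1][1]
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("fraction must be between 0 and 1")

# Hypothetical grouped data: classes 0-10, 10-20, ..., 40-50.
boundaries = [0, 10, 20, 30, 40, 50]
frequencies = [4, 6, 10, 12, 8]

points = ogive_points(boundaries, frequencies)
q1 = estimate_quantile(points, 0.25)
q2 = estimate_quantile(points, 0.50)
q3 = estimate_quantile(points, 0.75)
print(q1, q2, q3)
```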

Shape of a Distribution

The shape of a distribution (a set of data) shows how the data is spread. The distribution of a set of data can often be classified as being either symmetric, positively skewed or negatively skewed.

Symmetric Distribution

A symmetric distribution is evenly spread either side of its centre.

Positively Skewed

Positively skewed data has a higher proportion of low scores.

Negatively Skewed

Negatively skewed data has a higher proportion of high scores.
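
One way to classify shape numerically, offered as a sketch rather than the method these cards use, is the sign of a skewness coefficient (the average cubed standardised deviation from the mean). The function names, the `tolerance` cut-off of 0.1 and the data sets are all assumptions for illustration.

```python
from statistics import mean, pstdev

def skewness(data):
    """Average of the cubed standardised deviations from the mean."""
    m = mean(data)
    s = pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)

def describe_shape(data, tolerance=0.1):
    """Classify a data set by the sign of its skewness (tolerance is arbitrary)."""
    g = skewness(data)
    if g > tolerance:
        return "positively skewed"
    if g < -tolerance:
        return "negatively skewed"
    return "symmetric"

print(describe_shape([1, 2, 3, 4, 5]))     # evenly spread about the centre
print(describe_shape([1, 1, 2, 2, 3, 9]))  # mostly low scores, long tail to the right
print(describe_shape([1, 7, 8, 8, 9, 9]))  # mostly high scores, long tail to the left
```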

Samples & Populations

The population mean and standard deviation are called parameters. The sample mean and standard deviation are statistics used to estimate the values of the population parameters.
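
The distinction can be sketched by simulating a population and drawing one random sample from it. The population (10 000 values generated around 50 with spread 10), the sample size of 30 and the seed are made-up assumptions. `pstdev` divides by N for a population, while `stdev` divides by n − 1 for a sample.

```python
import random
from statistics import mean, pstdev, stdev

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population and a random sample drawn from it.
population = [random.gauss(50, 10) for _ in range(10_000)]
sample = random.sample(population, 30)

# Parameters: calculated from the whole population.
mu = mean(population)
sigma = pstdev(population)

# Statistics: calculated from the sample, used to estimate the parameters.
x_bar = mean(sample)
s = stdev(sample)

print(f"population: mean={mu:.2f}, sd={sigma:.2f}")
print(f"sample:     mean={x_bar:.2f}, sd={s:.2f}")
```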

Calculate summary statistics for single data sets and use them in the interpretation of data. (26 cards)