Lesson Four: Describing Quantitative Data (Spread) Flashcards

1
Q

Spread of Distribution of Data:

A

Describes how far the observations tend to be from each other.

2
Q

Standard Deviation:

A

A measure of the spread in a distribution. It is the square root of the variance.

If data is close together, the standard deviation is relatively small.

If the data is spread apart, the standard deviation is relatively large.

If the standard deviation is zero, then all the values are identical and there is no spread in the data.

The standard deviation CANNOT be negative.

Formulas:
Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

Description of Population Standard Deviation = the square root of [ the sum of ( a value in the data set - the population mean ) squared, taken over all data points, divided by the number of data points in the population ]

Sample Standard Deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Description of Sample Standard Deviation = the square root of [ the sum of ( a value in the data set - the sample mean ) squared, taken over all data points, divided by the number of data points in the sample minus 1 ]

Dividing by n - 1 rather than n is what makes this the sample version. The quantity under the square root is the sample variance.

Excel Formula: =stdev.s() for a sample, or =stdev.p() for a population
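
As a quick check of the two formulas, here is a minimal Python sketch. The data values are made up for illustration, and the variable names are not from the lesson. Python's standard-library functions `statistics.pstdev` and `statistics.stdev` should give the same population and sample results.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values

mean = sum(data) / len(data)
# Squared distance of each value from the mean
squared_distances = [(x - mean) ** 2 for x in data]

# Population standard deviation: divide by N
population_sd = math.sqrt(sum(squared_distances) / len(data))

# Sample standard deviation: divide by n - 1
sample_sd = math.sqrt(sum(squared_distances) / (len(data) - 1))

print(population_sd)  # 2.0
print(sample_sd)      # a little larger, roughly 2.14
```

The sample version comes out slightly larger because it divides the same sum by a smaller number.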

3
Q

Variance:

A

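The average of the squared distances of each point from the mean. For a sample, the sum of squared distances is divided by n - 1 instead of n.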

Excel Formula: =var.s() for a sample, or =var.p() for a population
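
A minimal Python sketch of this calculation, using made-up values. It also shows that taking the square root of the variance brings the result back to the original units of the data. The names here are illustrative only.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example values

mean = sum(data) / len(data)
sum_of_squares = sum((x - mean) ** 2 for x in data)

population_variance = sum_of_squares / len(data)    # divide by N
sample_variance = sum_of_squares / (len(data) - 1)  # divide by n - 1

print(population_variance)             # 4.0
print(math.sqrt(population_variance))  # 2.0
```

Because the distances are squared, the variance is measured in squared units (for example, pounds squared for weights).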

4
Q

Data Resistance:

A

A measure is resistant when its value is not artificially inflated (or otherwise strongly changed) by outliers.

Standard Deviation and Variance are both NOT resistant. Both become artificially inflated by outliers.
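
To see the effect, the sketch below adds a single outlier to a small made-up data set and compares the sample standard deviation before and after. The median is included only as a point of comparison, since it is a well-known resistant measure. The data and names are assumptions for illustration.

```python
import statistics

data = [10, 12, 13, 14, 15, 16, 18]  # made-up example values
with_outlier = data + [100]          # same data plus one outlier

sd_before = statistics.stdev(data)
sd_after = statistics.stdev(with_outlier)

median_before = statistics.median(data)
median_after = statistics.median(with_outlier)

print(sd_before, sd_after)          # standard deviation grows more than tenfold
print(median_before, median_after)  # 14 14.5
```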

5
Q

Percentiles:

A

They indicate the values below which a certain percentage of the data in a data set is found.

Ex. The 90th percentile is the value at or below which 90% of the data falls.

Percentiles divide data into 100 equal groups. It takes 99 percentiles (cut points) to divide the data into 100 groups.

The word percent means per one hundred. Thus, the 89th percentile of the numbers 1 through 100 would include the numbers 1 through 89 but none above.

Formula: n = (P/100) * N
P=percentile
N= number of values in a data set (sorted from smallest to largest)
n= ordinal rank of a given value

Excel Formula: =percentile.inc(cell range reference, desired percentile (ex. written as .9 for the 90th percentile))
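
The rank formula above can be sketched in Python as follows. Rounding the rank up to a whole position is an assumption on my part (the "nearest-rank" convention), since the card does not say what to do when n is not a whole number. The function names are hypothetical.

```python
import math

def ordinal_rank(percentile, count):
    """n = (P / 100) * N, rounded up to a whole position (assumed convention)."""
    return math.ceil(percentile * count / 100)

def percentile_nearest_rank(values, percentile):
    """Return the value sitting at the ordinal rank of the given percentile."""
    ordered = sorted(values)  # the formula assumes smallest-to-largest order
    rank = ordinal_rank(percentile, len(ordered))
    return ordered[max(rank, 1) - 1]  # ranks are 1-based, list indexes are 0-based

numbers = list(range(1, 101))  # the numbers 1 through 100
print(ordinal_rank(89, len(numbers)))        # 89
print(percentile_nearest_rank(numbers, 89))  # 89
```

Excel's =percentile.inc() interpolates between neighboring values, so its answer can differ slightly from this nearest-rank sketch.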

6
Q

Quartiles:

A

These are special percentiles. They divide the data into four equal groups. Thus, there are three quartiles: 25th, 50th, and 75th.

7
Q

Five Number Summary:

A

This is a simple summary made up of the minimum, first quartile, median (or second quartile), third quartile, and the maximum.

Excel Formulas:
Minimum: =min(data range)

1st Quartile: =percentile.inc(data range, .25)

2nd Quartile or Median: =percentile.inc(data range, .50)

Third Quartile: =percentile.inc(data range, .75)

Maximum: =max(data range)

To use this, your data must be univariate (a single variable).

Ex. A list of weights is one variable. If you have a list of ages and want to compare ages to weights, it becomes bivariate data (two variables). A five-number summary describes one variable at a time, so it cannot summarize the matched pairs as a whole, although each variable can still be summarized on its own.

The data must also be one of these three things:

-Ordinal: Values can be placed in a meaningful order.

-Interval: Has values of equal intervals that mean something (ex. a thermometer can have intervals of ten degrees).

-Ratio: The same as interval, but the scale has a true zero that means none of the quantity is present. (Ex. an age or weight of zero means no age or no weight, so these are ratio. Zero degrees on a Celsius or Fahrenheit thermometer does not mean there is no temperature, so those scales are interval.)
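
The five-number summary described in this card can be sketched in Python with the standard library. The weights are made-up values for a single variable. My understanding is that `method="inclusive"` interpolates in the same way as =percentile.inc(), but treat that match as an assumption rather than a guarantee.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for one variable."""
    ordered = sorted(values)
    # n=4 asks for the three cut points that split the data into four groups
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return min(ordered), q1, median, q3, max(ordered)

weights = [120, 135, 150, 155, 160, 172, 180, 195, 210]  # made-up weights
print(five_number_summary(weights))
```

For these nine weights the summary is a minimum of 120, quartiles of 150, 160, and 180, and a maximum of 210.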

8
Q

Box and Whisker Chart or Boxplots:

A

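A graphical representation of the five-number summary. The box (the quartiles and the median) is resistant to outliers, although whiskers drawn all the way to the minimum and maximum are not.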

Formula:
1. Draw a number line.
2. Draw a vertical line segment above each of the quartiles.
3. Connect the tops and bottoms of the line segments, making a box.
4. Make a smaller mark above the values corresponding to the minimum and maximum.
5. Draw a line from the left side of the box to the minimum, and draw another line from the right side of the box to the maximum. These lines look like whiskers!

Excel Steps:
1. Place data into Excel, each point having its own cell.
2. Highlight the data you want to plot.
3. Go to the Insert ribbon in Excel and select the histogram icon from the “Charts” section. Select the Box and Whisker chart.
4. If you’re displaying more than one boxplot, put each additional data set in its own column in Excel and highlight all of the columns before following steps 2-3. Consider adding a legend to make the chart easier to understand. To do this, click the plus button at the top right-hand corner of the graph. Click “Add Chart Element”. Select “Legend” and choose where you’d like it to appear.
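
The hand-drawing steps above can also be imitated in plain text. The sketch below is only an illustration, not how Excel draws its chart: it maps the five-number summary onto a line of characters, with `-` for the whiskers, `=` for the box, and `|` for the median. The function name, the `width` parameter, and the data are all assumptions, and the function expects at least two different values.

```python
import statistics

def text_boxplot(values, width=41):
    """Render a one-line box plot: whiskers '-', box '=', median '|'."""
    ordered = sorted(values)
    low, high = ordered[0], ordered[-1]
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")

    def column(value):
        # Map a data value onto a character position on the number line.
        return round((value - low) / (high - low) * (width - 1))

    line = [" "] * width
    for i in range(column(low), column(high) + 1):
        line[i] = "-"           # whiskers span minimum to maximum
    for i in range(column(q1), column(q3) + 1):
        line[i] = "="           # box spans first to third quartile
    line[column(median)] = "|"  # median mark inside the box
    return "".join(line)

weights = [120, 135, 150, 155, 160, 172, 180, 195, 210]  # made-up weights
print(text_boxplot(weights))
```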
