Geography Exam Skills Flashcards

1
Q

What do descriptive statistics do?

A

DESCRIBE basic features of data - provide a SUMMARY of the results

First step in any data analysis

2
Q

What are the two types of descriptive statistics and what do they do?

A

Measures of CENTRAL TENDENCY - each gives a single value that attempts to describe a set of data by identifying the central position within the data set.
Measures of DISPERSION - describe the spread of data around a central value

3
Q

What are the three measures of central tendency and what do they show?

A

MEAN - average of the numbers, calculated by adding the values and dividing the total by the number of values in the data set.
MEDIAN - the middle number, found by putting the values into numerical order and finding the middle value (with an even number of values, take the mean of the two middle values)
MODE - most frequently appearing value; a data set can be bimodal (two modes), trimodal (three modes) or multimodal (4+)
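
A minimal sketch of the three measures, using Python's standard `statistics` module. The pebble lengths are invented for illustration and are not from the flashcards.

```python
from statistics import mean, median, multimode

# Hypothetical sample: pebble lengths in cm (invented for illustration)
pebbles = [4, 7, 5, 7, 9, 5, 12]

pebble_mean = mean(pebbles)        # sum of the values / number of values
pebble_median = median(pebbles)    # middle value once the data is ordered
pebble_modes = multimode(pebbles)  # every value tied for most frequent

print(pebble_mean, pebble_median, pebble_modes)
```

`multimode` returns a list, so a bimodal data set like this one gives two values rather than one.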

4
Q

What are the three measures of dispersion?

A

Range
Variance
Standard deviation

5
Q

When would you use and not use a mean value?

A

Can be calculated with continuous and discrete data

Cannot be calculated with categorical data, as the values cannot be summed (e.g. types of trees or groups)

6
Q

What are advantages and disadvantages of a mean?

A

Advantages: accurate summary when the data has a normal distribution; useful as an initial step for calculating other statistical measures, e.g. standard deviation

Disadvantages: can be skewed when there are outliers or extreme values, so will be unrepresentative of the data set; can also be unreliable if the data set is small.

7
Q

When would you use or not use a median?

A

The median is more appropriate than the mean in the presence of extreme values; it is the preferred measure of central tendency when the distribution is not symmetrical.

Less appropriate when there are variations in the data, as it is less sensitive to these changes and so may not present a true picture; it cannot be calculated with categorical data, as the values cannot be ordered.

8
Q

What are advantages and disadvantages of a median?

A

Advantages: not affected by extreme values and can indicate skew when compared to the mean; mean > median suggests positive skew and median > mean suggests negative skew.

Disadvantages: cannot be used for further calculation except interquartile range; does not give any information on the spread of values within the data set; can be misleading, e.g. data sets with different ranges can have the same median.
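
The effect of one extreme value on the mean and the median can be sketched as follows. The discharge readings and the helper `skew_direction` are hypothetical, written only to illustrate the mean-versus-median rule of thumb on this card.

```python
from statistics import mean, median

# Hypothetical river discharge readings (invented for illustration)
readings = [12, 14, 15, 15, 16, 18]
with_flood = readings + [120]  # the same data plus one extreme value


def skew_direction(values):
    """Rule of thumb only: compare the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "positive"
    if m < med:
        return "negative"
    return "none"


print(mean(readings), median(readings), skew_direction(readings))
print(mean(with_flood), median(with_flood), skew_direction(with_flood))
```

The single flood reading doubles the mean but leaves the median unchanged, which is why the median is preferred when extreme values are present.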

9
Q

When would you use and not use a mode?

A

Can be used with frequency or categorical data; it is the only measure of central tendency that works for categorical data.

There may be no modal value or several.

10
Q

What are advantages and disadvantages of a mode?

A

Advantages: simple to find

Disadvantages: it is not based on the whole data set unlike the median and mean

11
Q

What is a measure of dispersion?

A

How spread out are the values? E.g. two places may have the same mean temperature over a day, but one (such as a desert) has big differences in temperature, whereas the other (such as a tropical rainforest) has a narrow range of temperatures.

12
Q

Which is the easiest measure of dispersion, and how is it calculated?

A

The easiest is RANGE
Subtract the lowest value from the highest.
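
A short sketch of the range calculation, applied to two places with the same mean temperature but very different spreads. The temperature values and the function name `value_range` are invented for illustration.

```python
from statistics import mean

# Hypothetical temperatures over one day in degrees C (invented for illustration)
desert = [5, 12, 25, 38, 30, 10]
rainforest = [18, 19, 21, 22, 21, 19]


def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)


print(mean(desert), value_range(desert))
print(mean(rainforest), value_range(rainforest))
```

Both lists have a mean of 20, so only the range reveals how different the two places are.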

13
Q

How do you calculate STANDARD DEVIATION?

A
  1. Calculate the mean of the data set.
  2. Calculate the difference between each value and the mean.
  3. Square each difference (to eliminate negative values).
  4. Total the squared differences.
  5. Divide the total by the number of values minus 1 (n - 1).
  6. Take the square root of the result.
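
The six steps above can be sketched in Python as follows. The function name `sample_sd` and the litter depth values are hypothetical. The result is checked against `statistics.stdev`, which also divides by n - 1.

```python
from math import sqrt
from statistics import stdev


def sample_sd(values):
    n = len(values)
    mean = sum(values) / n                   # step 1: mean of the data set
    diffs = [x - mean for x in values]       # step 2: difference from the mean
    squared = [d ** 2 for d in diffs]        # step 3: square each difference
    total = sum(squared)                     # step 4: total the squares
    variance = total / (n - 1)               # step 5: divide by n - 1
    return sqrt(variance)                    # step 6: square root


# Hypothetical litter layer depths in cm (invented for illustration)
depths = [2, 4, 4, 4, 5, 5, 7, 9]

print(sample_sd(depths))
print(stdev(depths))  # library result for comparison
```

The value after step 5 is the variance, so the standard deviation is simply its square root.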
14
Q

What does the standard deviation show?

A

The larger the standard deviation, the greater the variation of the values from the mean (and vice versa).

15
Q

What are limitations of standard deviation?

A

If the data is positively or negatively skewed, SD is not an appropriate measure of dispersion.

It can be time-consuming to calculate.

Can only be compared between samples of comparable populations, e.g. you cannot compare the SD for litter layer depth with the SD for soil pH.

16
Q

What is dispersion?

A

A measure of how variable the data is within a sample

17
Q

What are dispersion graphs?

A

They show visually the spread of values in the data set; each value is plotted as an individual point against a vertical scale. If a horizontal axis is included, it carries a label for each data set; these labels can be quantitative (numeric) or qualitative (descriptive).

18
Q

Data in a dispersion graph can be divided into four equal parts; what are they called and what do they do?

A

QUARTILES
They show the positions of the median and the upper and lower quartiles

19
Q

What are Q1, Q2 and Q3?

A

Q1: LOWER quartile - 25% of values are below this
Q2: MEDIAN - 50% of values are above and 50% below this
Q3: UPPER quartile - 75% of values are below this

20
Q

What is the IQR?
What is an advantage?

A

The interquartile range is a measure of dispersion, calculated by finding the difference between the upper and lower quartiles
IQR = Q3 - Q1

It does not use the lowest or highest values (which are the most likely to be inaccurate or extreme), so it is more reliable than the range.
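
A minimal sketch of why the IQR is less affected by extreme values than the range, using `statistics.quantiles` from the Python standard library. The data and the helper name `iqr` are invented for illustration; one value in the second list is treated as a recording error.

```python
from statistics import quantiles


def iqr(values):
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3
    q1, _q2, q3 = quantiles(values, n=4)
    return q3 - q1


# Hypothetical pebble sizes in mm (invented for illustration)
clean = [3, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15]
with_error = clean[:-1] + [150]  # largest value replaced by an extreme one

print(max(clean) - min(clean), iqr(clean))
print(max(with_error) - min(with_error), iqr(with_error))
```

The range jumps from 12 to 147 because of one bad reading, while the IQR stays the same.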

21
Q

What does it mean when a data set has a high IQR, or a low IQR?

A

High IQR - the middle 50% of the data is very dispersed
Low IQR - the middle 50% of the data is less dispersed

22
Q

What are advantages and disadvantages of dispersion graphs?

A

ADVANTAGES: visually effective display of data; can easily compare dispersion of values between two locations; clustering can be seen easily; data range is easily identifiable.

DISADVANTAGES: ideally needs more than 20 values; the same scale must be used to compare two locations; outliers can cause scaling problems; only one variable is plotted, so it does not show a causative relationship; does not show comparison through time.

23
Q

How do you calculate IQR?

A
  1. Order the values from lowest to highest.
  2. Find the median: the value at position (N+1)/2.
  3. Find the lower quartile: the value at position (N+1)/4.
  4. Find the upper quartile: the value at position (3(N+1))/4.
  5. Interquartile range = upper quartile - lower quartile.
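
The position formulas can be sketched in Python as below. This assumes the values are ranked from lowest to highest, so that position (N+1)/4 falls in the lower quarter of the data, and it assumes linear interpolation when a position is not a whole number (the cards do not say how to handle that case). The function names and data are hypothetical.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position; interpolate if the position is fractional."""
    lower = int(position)            # whole-number part of the position
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


def quartiles(values):
    if len(values) < 3:
        raise ValueError("need at least 3 values for this sketch")
    ordered = sorted(values)                             # step 1
    n = len(ordered)
    q2 = value_at_position(ordered, (n + 1) / 2)         # step 2: median
    q1 = value_at_position(ordered, (n + 1) / 4)         # step 3: lower quartile
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)     # step 4: upper quartile
    return q1, q2, q3


# Hypothetical pebble sizes in mm (invented for illustration)
sizes = [15, 3, 9, 7, 12, 5, 11, 6, 14, 8, 10]
q1, q2, q3 = quartiles(sizes)
print(q1, q2, q3, q3 - q1)                               # step 5: IQR = Q3 - Q1
```

With 11 values the positions are 3, 6 and 9, so no interpolation is needed; with 4 values the positions would be 1.25, 2.5 and 3.75.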