Statistics Flashcards
What is random sampling?
Items are chosen at random, usually using computer-generated random numbers or random number tables. Every item has an equal chance of being selected.
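A minimal sketch of random sampling in Python, assuming the items are numbered sample sites (the site numbers, the sample size and the `random_sample` helper are made up for illustration):

```python
import random

def random_sample(items, k, seed=None):
    """Pick k different items so that every item has an equal chance of selection."""
    rng = random.Random(seed)  # the seed only makes the example repeatable
    return rng.sample(items, k)

sites = list(range(1, 101))  # hypothetical sample sites numbered 1 to 100
chosen = random_sample(sites, 10, seed=1)
print(chosen)
```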
What is stratified sampling?
The area under study is divided into homogeneous units (strata) and each unit is sampled randomly or systematically.
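A minimal sketch of stratified sampling, assuming two made-up units ("woodland" and "grassland") and random sampling within each one; the `stratified_sample` helper is hypothetical:

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Randomly sample the same number of items from every stratum."""
    rng = random.Random(seed)  # the seed only makes the example repeatable
    return {name: rng.sample(items, per_stratum) for name, items in strata.items()}

# Hypothetical study area split into two homogeneous units
strata = {
    "woodland": list(range(0, 40)),
    "grassland": list(range(40, 100)),
}
picked = stratified_sample(strata, 5, seed=1)
print(picked)
```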
What is systematic sampling?
Items are picked at a regular interval, e.g. every 30 metres.
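A minimal sketch of systematic sampling, assuming a hypothetical 300 metre transect recorded at 1 metre steps and sampled every 30 metres:

```python
def systematic_sample(items, interval, start=0):
    """Pick every `interval`-th item, beginning at position `start`."""
    return items[start::interval]

transect = list(range(0, 301))  # hypothetical positions in metres: 0, 1, ..., 300
points = systematic_sample(transect, 30)
print(points)  # [0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300]
```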
Outline a bar graph:
Easy to understand
Histograms are bars based on frequencies
Outline a line graph:
Used for continuous data
Can plot several lines on one graph
Can have different scales on each axis
Outline a pie chart:
The segments represent the share of the total value
Visual but can be difficult to read
How do you work out the mean?
Calculated by adding up all the values in a data set and dividing the total by the number of values in the data set.
What is the median?
This refers to the central value in the ranked data set.
What is the mode?
This is the most frequently occurring number in the data set.
What is the range?
The highest value minus the lowest value.
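The mean, median, mode and range can be checked with Python's standard `statistics` module. The data set below is made up for illustration:

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # made-up data set

mean = statistics.mean(data)        # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
median = statistics.median(data)    # middle of the ranked data: (3 + 5) / 2 = 4.0
mode = statistics.mode(data)        # most frequent value: 3
data_range = max(data) - min(data)  # 10 - 2 = 8

print(mean, median, mode, data_range)
```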
What happens when the median, mode and mean are the same value?
The data are symmetrically distributed, as in a normal distribution. However, most data sets are SKEWED, with differences between the 3 measures.
Outline how to work out interquartile range:
1) Put numbers in order.
2) Find out the median
3) Find the median of the values below the median (lower quartile) and the median of the values above it (upper quartile)
4) Take the lower quartile away from the upper quartile to get the interquartile range.
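The steps above can be sketched in Python as follows. This assumes the "median of each half" method described on the card (the middle value is left out of both halves when the number of values is odd); other quartile conventions can give slightly different answers. The `interquartile_range` helper and the data are made up for illustration:

```python
import statistics

def interquartile_range(values):
    """Interquartile range using the median of the lower and upper halves."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)       # 1) put the numbers in order
    half = len(ordered) // 2       # 2) the median splits the data into two halves
    lower_half = ordered[:half]    # values below the median
    upper_half = ordered[-half:]   # values above the median
    lower_quartile = statistics.median(lower_half)  # 3) median of each half
    upper_quartile = statistics.median(upper_half)
    return upper_quartile - lower_quartile          # 4) upper minus lower

data = [1, 3, 4, 6, 8, 9, 12]  # median 6, lower quartile 3, upper quartile 9
print(interquartile_range(data))  # 6
```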
What is standard deviation?
The standard deviation is a widely used measure of how far the values in a data set are dispersed from the mean.
A low SD indicates that the data is clustered around the mean whereas a high value indicates dispersion.
Outline how to calculate standard deviation:
1) Difference of each value from the mean is worked out.
2) Each of these values is squared (to remove negative values)
3) All the squared values are added together
4) This summed value is then divided by the number of values in the data set.
5) The square root of this value is found.
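The five steps above can be sketched in Python as follows. The data set and the `standard_deviation` helper are made up for illustration. Dividing by the number of values, as the card does, gives the population standard deviation; a sample standard deviation would divide by n - 1 instead.

```python
import math

def standard_deviation(values):
    """Population standard deviation, following the five steps on the card."""
    mean = sum(values) / len(values)
    differences = [x - mean for x in values]  # 1) difference from the mean
    squared = [d ** 2 for d in differences]   # 2) square each difference
    total = sum(squared)                      # 3) add the squares together
    variance = total / len(values)            # 4) divide by the number of values
    return math.sqrt(variance)                # 5) take the square root

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data set with mean 5
print(standard_deviation(data))  # 2.0
```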
What is the formula for standard deviation?
$SD = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$
n = number of values in the data set; x = each value in the data set; $\bar{x}$ (x with a line over it) = mean of all values in the data set
What are the advantages of using standard deviation?
Shows how much data is clustered around a mean value
It gives a better idea of how the data is distributed
What are the disadvantages of using standard deviation?
It doesn’t give you the full range of the data
It can be hard to calculate
It can only be used with data that can be plotted on a histogram, i.e. where an independent variable is plotted against its frequency.
What is the formula for Spearman’s rank?
$R_s = 1 - \dfrac{6 \sum d^2}{n^3 - n}$
n = number of paired values; d = difference between the two ranks of each pair
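A minimal sketch of the Spearman's rank calculation in Python. The helper names `ranks` and `spearman_rank` and the data are made up for illustration. Tied values are given their average rank here, which is an assumption; the formula is only exact when there are no ties.

```python
def ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over any run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman_rank(xs, ys):
    """Rs = 1 - (6 * sum of d squared) / (n cubed - n)."""
    n = len(xs)  # number of paired values
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * d_squared) / (n ** 3 - n)

xs = [1, 2, 3, 4, 5]  # made-up paired data
ys = [2, 1, 4, 3, 5]
print(spearman_rank(xs, ys))  # sum of d squared = 4, so Rs = 1 - 24/120 = 0.8
```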
What does the result mean from spearman’s rank?
If the value is greater than the critical value at the 0.05 level of significance, we can be 95% confident that there is a relationship between the two variables
- Greater than the critical value at the 0.01 level = 99% confidence
- If the value is below the critical value, the null hypothesis is accepted.
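The comparison can be sketched in Python as follows. The `interpret_spearman` helper is hypothetical, and the critical values in the usage example are placeholders: the real ones depend on the number of pairs and must be looked up in a published table. Comparing the size of Rs and ignoring its sign is an assumption, so that a strong negative relationship also counts.

```python
def interpret_spearman(rs, critical_05, critical_01):
    """Compare the size of Rs with critical values looked up for the sample size."""
    strength = abs(rs)  # assumption: a strong negative Rs is also a relationship
    if strength > critical_01:
        return "99% confident there is a relationship"
    if strength > critical_05:
        return "95% confident there is a relationship"
    return "below the critical value: accept the null hypothesis"

# Placeholder critical values, to be replaced with values from a table
result = interpret_spearman(0.70, critical_05=0.65, critical_01=0.79)
print(result)  # 95% confident there is a relationship
```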
What is GIS?
Geographical Information Systems
The use of computer software to link a variety of data to a location.
What are the uses of GIS?
For example, a company might need information on socio-economic groupings within a town in order to decide where best to locate its new store.
The GIS database for that application might also have data on the location of similar stores, the transport infrastructure & council tax rates.
What are the advantages of using GIS?
It allows information to be presented in a visual and easily understood format.
Different types of information can be overlain onto base maps of an area.
Advantages of using a bar chart?
- show relationships between 2 or more variables
- visually attractive
- Can show positive and negative values
- Simple to construct and read
Advantages of using pie charts?
- visually attractive
- shows proportion of components
- shows scale
Weaknesses of using bar graphs?
- Plotting too many bars makes it appear cluttered and less easy to interpret
- Using too many or too few classes can mask important patterns in the data
Weaknesses of using pie charts?
- Fewer than 3 segments looks simplistic
- If many segments are a similar size, it is hard to interpret