Research Skills 6: Introduction to analysing data Flashcards

Question 1

Q

Inspecting and Plotting your data

Answer

A

ALWAYS start by looking at your raw data before calculating statistics.
Look at the actual numbers.
Check for obvious mistakes, missing values and outliers.
Think about the best graphical representation for your data.
Graph the results and look at them.

Statistics only summarise and describe your data. They are not a substitute for the results themselves.
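As a minimal sketch of this first look, assume the raw data sit in a plain Python list where `None` or NaN marks a missing value. The numbers and the `is_missing` helper are made up for illustration.

```python
import math

# Hypothetical raw measurements; None and NaN stand in for missing values.
raw = [4.1, 3.9, None, 4.3, 41.0, 4.0, float("nan"), 4.2]

def is_missing(x):
    """True for None or NaN entries."""
    return x is None or (isinstance(x, float) and math.isnan(x))

missing_positions = [i for i, x in enumerate(raw) if is_missing(x)]
values = [x for x in raw if not is_missing(x)]

# Look at the actual numbers before computing any statistics.
print("raw data:", raw)
print("missing at positions:", missing_positions)
print("sorted values:", sorted(values))
print("smallest:", min(values), "largest:", max(values))
```

In the sorted listing the value 41.0 stands apart from the rest, which is the kind of observation worth checking against the original record before going further.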

Question 2

Q

Summarising Data: Descriptive Statistics

Answer

A

You can use descriptive statistics (e.g. average) to simplify your data

MEASURES OF “AVERAGE” (“Measures of central tendency”)

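- Mean
“Common average”, “arithmetic mean”. Add up all the observations and divide by the number of observations.

- Median
The middle value. Put all the observations in order of size. Find the middle value: the value which has the same number of observations larger than it as smaller than it. With an even number of observations, take the mean of the two middle values.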
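A short sketch of both measures using Python's standard `statistics` module. The observations are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import statistics

# Hypothetical observations, used only for illustration.
odd_count = [9, 2, 5, 10, 4]
even_count = [9, 2, 5, 4]

mean_odd = statistics.mean(odd_count)        # (9 + 2 + 5 + 10 + 4) / 5
median_odd = statistics.median(odd_count)    # sorted: 2 4 5 9 10, middle value
median_even = statistics.median(even_count)  # sorted: 2 4 5 9, mean of 4 and 5

print(mean_odd, median_odd, median_even)
```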

Question 3

Q

Disadvantages of descriptive stats

Answer

A

The mean is strongly affected by outliers.

The median is insensitive to outliers and to skewed distributions, so it can be the more sensible summary when either is present.
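To see the difference, the sketch below replaces one value in a small, made-up dataset with an outlier and recomputes both averages.

```python
import statistics

clean = [10, 11, 12, 13, 14]
with_outlier = [10, 11, 12, 13, 100]  # last value replaced by an outlier

for label, data in (("clean", clean), ("with outlier", with_outlier)):
    print(label, "mean:", statistics.mean(data), "median:", statistics.median(data))
```

The mean moves a long way towards the outlier while the median does not move at all.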

Question 4

Q

Name two measures of spread

Answer

A

1. Range

2. Standard Deviation

Question 5

Q

What is the Range?

Answer

A

The distance from the smallest to the largest value (largest minus smallest).

- But it only tells you about the largest and smallest values, and nothing about the spread of all the other observations.
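A small sketch of this limitation: two hypothetical datasets with the same smallest and largest values, but with the remaining observations arranged differently. The helper `value_range` is a name introduced here.

```python
import statistics

def value_range(data):
    """Return the largest observation minus the smallest."""
    return max(data) - min(data)

clustered = [1, 5, 5, 5, 9]   # most values sit at the centre
spread_out = [1, 2, 5, 8, 9]  # values spread across the interval

print(value_range(clustered), value_range(spread_out))
print(statistics.pstdev(clustered), statistics.pstdev(spread_out))
```

Both ranges are identical, yet the standard deviations differ, because the range ignores everything between the two extremes.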

Question 6

Q

What is Standard Deviation?

Answer

A

A mathematical measure of the spread of data around the mean.
Notice that SD is a measure of the spread.
It does not show the actual spread or range.
± 1 SD around the mean will include a lot of the data (about 68% if the data are normally distributed).
± 2 SD around the mean will include most of the data (about 95% if the data are normally distributed).
But some results will be even further out.
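As an illustration, the sketch below counts what fraction of a small, hypothetical dataset falls within 1 and 2 SD of its mean. It uses the sample SD (`statistics.stdev`, which divides by n - 1), and `fraction_within` is a helper introduced here.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
mean = statistics.mean(data)
sd = statistics.stdev(data)       # sample SD (divides by n - 1)

def fraction_within(k):
    """Fraction of observations within k SDs of the mean."""
    inside = [x for x in data if abs(x - mean) <= k * sd]
    return len(inside) / len(data)

within_1 = fraction_within(1)
within_2 = fraction_within(2)
print(mean, round(sd, 3), within_1, within_2)
```

With so few observations the fractions will not match the figures for a normal distribution exactly. The point is only that SD describes typical distance from the mean, not the extremes.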

Question 7

Q

What is Variance?

Answer

A

The average of the squared differences from the mean, or the square of the Standard Deviation (SD²)
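The definition can be checked directly on a small, hypothetical dataset. Dividing by n, as the wording "average" implies, gives the population variance. Statistical software often reports the sample variance instead, which divides by n - 1 (in Python, `statistics.variance`).

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
mean = sum(data) / len(data)

# "Average of the squared differences from the mean" (population variance).
variance_by_hand = sum((x - mean) ** 2 for x in data) / len(data)

population_sd = statistics.pstdev(data)  # population SD (divides by n)
sample_variance = statistics.variance(data)  # divides by n - 1 instead

print(variance_by_hand, population_sd ** 2, sample_variance)
```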

Question 8

Q

What is the Interquartile Range?

Answer

A

Put the observations in order of size, then divide them into four groups:
- top 25%
- next 25%
- next 25%
- bottom 25%

The interquartile range covers the middle two groups, i.e. the middle 50% of the observations. Used by population scientists with large datasets. Not useful with small numbers of observations.
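A minimal sketch using `statistics.quantiles` from the Python standard library (Python 3.8 or later). The data are hypothetical and deliberately tiny so the cut points are easy to follow. Different software uses slightly different conventions for quartiles, so other tools may give slightly different values.

```python
import statistics

data = list(range(1, 12))  # 1, 2, ..., 11 (hypothetical observations)

# quantiles(n=4) returns the three cut points between the four groups.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1  # width of the middle two groups
print(q1, q2, q3, iqr)
```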

Question 9

Q

Graphing Data: what are the two types of data?

Answer

A

1. Numerical (quantitative) data

2. Categorical data

Question 10

Q

What is the Standard Error of the Mean (s.e.m.)?

Answer

A

This does not measure the spread of the data. It measures our confidence in the estimate of the mean.
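The card does not give a formula. The standard one is s.e.m. = s.d. / √n, where n is the number of observations. A short sketch on hypothetical data:

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
n = len(data)
sd = statistics.stdev(data)      # spread of the individual observations
sem = sd / math.sqrt(n)          # uncertainty in the estimate of the mean
print(round(sd, 3), round(sem, 3))
```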

Question 11

Q

Standard Deviation vs SEM

Answer

A

Standard deviation is a measure of the spread in your data. As you get more data the spread will stay about the same, so the s.d. will change only slightly.
S.e.m. is a measure of the uncertainty in your estimate of the mean, and is used to build a confidence interval around it. As you get more data the s.e.m. gets smaller.
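The contrast can be sketched by simulation. The example assumes a hypothetical normal population with mean 100 and SD 15, and draws samples of increasing size from it. The seed is fixed only so the run is repeatable.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

results = {}
for n in (10, 100, 10_000):
    # Hypothetical population: normal with mean 100 and SD 15.
    sample = [random.gauss(100, 15) for _ in range(n)]
    sd = statistics.stdev(sample)
    results[n] = (sd, sd / math.sqrt(n))  # (s.d., s.e.m.)
    print(n, round(sd, 2), round(results[n][1], 3))
```

The s.d. column should hover around the population value however many observations are taken, while the s.e.m. column keeps shrinking.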

Question 12

Q

What is a disadvantage of SEM?

Answer

A

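Using ± 2 s.e.m. as a 95% confidence interval only works for large numbers of observations (more than about 20-30).
For small numbers of observations, the s.e.m. is too optimistic: the interval it gives is too narrow.
You can find the true confidence intervals using the t-distribution.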
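A sketch of the difference for a hypothetical sample of five observations. The t critical value is hard-coded from a standard t-table (2.776 for a two-sided 95% interval with 4 degrees of freedom), because the Python standard library does not provide the t-distribution. The constant name `T_CRIT_DF4` is introduced here.

```python
import math
import statistics

data = [4.8, 5.1, 5.0, 4.7, 5.4]  # hypothetical small sample, n = 5
n = len(data)
mean = statistics.mean(data)
sem = statistics.stdev(data) / math.sqrt(n)

# Two-sided 95% critical values. T_CRIT_DF4 is copied from a standard
# t-table for n - 1 = 4 degrees of freedom (an assumption of this sketch).
T_CRIT_DF4 = 2.776
z_crit = statistics.NormalDist().inv_cdf(0.975)  # about 1.96

z_half_width = z_crit * sem      # large-sample (normal) interval
t_half_width = T_CRIT_DF4 * sem  # small-sample (t) interval
print(f"normal-based: {mean:.2f} +/- {z_half_width:.2f}")
print(f"t-based:      {mean:.2f} +/- {t_half_width:.2f}")
```

The t-based interval is wider, which is the sense in which the plain s.e.m. rule is too optimistic for small samples.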

Question 13

Q

What are the properties of the Normal Distribution?

Answer

A

Among the properties of the normal distribution:
- it is symmetrical about the mean
- it extends to + and to – infinity
- however, ~95% of observations lie within ± 2 standard deviations of the mean
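These properties can be checked numerically with `statistics.NormalDist` from the Python standard library. The sketch uses the standard normal (mean 0, SD 1), and `probability_within` is a helper introduced here.

```python
import math
import statistics

standard_normal = statistics.NormalDist(mu=0, sigma=1)

def probability_within(k):
    """Probability of falling within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1 = probability_within(1)
within_2 = probability_within(2)

# Symmetry: the density is the same at equal distances either side of the mean.
symmetric = math.isclose(standard_normal.pdf(-1.3), standard_normal.pdf(1.3))

print(round(within_1, 4), round(within_2, 4), symmetric)
```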

Question 14

Q

What is the Normal Distribution?

Answer

A

The “normal distribution” is a particular mathematical distribution with two parameters, the mean and the standard deviation.
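For reference, the density of a normal distribution is usually written as follows. The symbols are notation added here, not taken from the card: μ is the mean, σ is the standard deviation and f(x) is the density at x.

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```

Changing μ shifts the curve along the axis and changing σ makes it wider or narrower. Nothing else about the shape can vary.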

Question 15

Q

What is the Central Limit Theorem?

Answer

A

If a variable is affected by a lot of different random factors,
each of which has a small effect,
and their effects are additive,
then its distribution will approximate to a normal distribution.
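A simulation sketch of this idea. Each observation is built as the sum of 50 small, independent random effects drawn uniformly from -1 to 1. The number of factors, their range and the helper names are assumptions made for illustration. If the sums are close to normal, roughly 68% of them should lie within 1 SD of the mean and roughly 95% within 2 SD.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def one_observation(n_factors=50):
    """Sum of many small, independent, additive random effects."""
    return sum(random.uniform(-1, 1) for _ in range(n_factors))

values = [one_observation() for _ in range(10_000)]
mean = statistics.fmean(values)
sd = statistics.stdev(values)

def fraction_within(k):
    """Fraction of simulated observations within k SDs of the mean."""
    return sum(abs(v - mean) <= k * sd for v in values) / len(values)

within_1 = fraction_within(1)
within_2 = fraction_within(2)
print(round(within_1, 3), round(within_2, 3))
```

No single factor here is normally distributed, yet the fractions should come out close to the normal figures.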

Question 16

Q

Summary I - Own your data

Answer

A

Look at the raw data
Plot the raw data
Think about what the data means

Question 17

Q

Summary II - Descriptive Statistics

Answer

A

Measures of average

Mean: works best for mathematicians
Median: sometimes gives a more sensible answer when there are outliers, or a skewed distribution

Measures of spread

Range (only tells you about smallest and largest observation)
Standard deviation (s.d.) (more useful measure of overall spread)
Variance (= s.d.²)
Interquartile range (only useful if large number of observations)

Question 18

Q

Summary III - Error Bars

Answer

A

Could mean anything
So must be defined in the figure legend
S.D. error bars are a measure of the spread of the data
S.E.M. error bars are an indication of your confidence in the estimate of the mean

Question 19

Q

Summary IV - Standard Error of the Mean

Answer

A

S.e.m. is used to give a confidence interval

We can be ~68% confident that the “true” mean is within ± 1 s.e.m. of the experimental mean
And ~95% confident that the “true” mean is within approx. ± 2 s.e.m. of the experimental mean

As the number of observations gets larger the s.e.m. gets smaller,
so our confidence in the estimate of the mean is higher
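The ± 1 and ± 2 s.e.m. statements can be checked by simulation. The sketch below assumes a hypothetical normal population whose true mean is known, repeats an "experiment" of 50 observations many times, and counts how often the true mean falls within 1 or 2 s.e.m. of the experimental mean. All constants are illustrative.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable
TRUE_MEAN, TRUE_SD = 50.0, 10.0  # hypothetical population
N_OBS, N_EXPERIMENTS = 50, 2000

hits_1 = hits_2 = 0
for _ in range(N_EXPERIMENTS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N_OBS)]
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / math.sqrt(N_OBS)
    distance = abs(mean - TRUE_MEAN)
    hits_1 += distance <= 1 * sem  # true mean within 1 s.e.m.
    hits_2 += distance <= 2 * sem  # true mean within 2 s.e.m.

coverage_1 = hits_1 / N_EXPERIMENTS
coverage_2 = hits_2 / N_EXPERIMENTS
print(coverage_1, coverage_2)
```

For a normal population and this sample size, the two fractions should come out near 0.68 and 0.95.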

(19 cards)