Lecture 2 - Describing & Summarising Data + Normal Distribution Flashcards by Charlie Davies

measures of central tendency:

mode (most frequent value), arithmetic mean (sum of the values divided by n, the number of values) & median (middle value in the ranked dataset)
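
A minimal sketch of the three measures using Python's standard `statistics` module. The dataset is made up for illustration and is not from the lecture.

```python
import statistics

# Hypothetical ranked dataset (assumption, for illustration only).
values = [2, 3, 3, 5, 7, 8, 14]

mode = statistics.mode(values)      # most frequent value
mean = statistics.mean(values)      # sum of the values divided by n
median = statistics.median(values)  # middle value of the ranked data

print(mode, mean, median)
```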

what measure of central tendency is affected the most by extreme values?

the mean is affected most by extreme values; the median is affected far less
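
A small sketch of this effect, assuming a made-up dataset to which one extreme value is added. It compares how far the mean and the median each move.

```python
import statistics

# Hypothetical data (assumption), with and without one extreme value.
data = [4, 5, 5, 6, 7]
with_outlier = data + [100]

# How much each measure moves once the extreme value is included.
mean_shift = statistics.mean(with_outlier) - statistics.mean(data)
median_shift = statistics.median(with_outlier) - statistics.median(data)

print(f"mean shifts by {mean_shift:.2f}, median shifts by {median_shift:.2f}")
```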

how do you round with your mean values?

you always round your mean values to one decimal place (e.g. 4.988 -> 5.0)

what do histograms primarily show?

frequency

what does a positive skew graph look like?

left-slanted bell shape: the peak sits towards the left and the long tail stretches to the right

what does a negative skew graph look like?

right-slanted bell shape: the peak sits towards the right and the long tail stretches to the left
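
One way to see skew in numbers rather than in a graph is to compare the mean with the median: in skewed data the mean is typically pulled towards the long tail. The sketch below assumes two made-up datasets, the second being a mirror image of the first.

```python
import statistics

# Hypothetical positively skewed data: most values are low, long tail to the right.
positive_skew = [1, 2, 2, 3, 3, 3, 4, 10, 20]

# Mirror image of the same data: long tail to the left (negative skew).
negative_skew = [21 - x for x in positive_skew]

pos_mean, pos_median = statistics.mean(positive_skew), statistics.median(positive_skew)
neg_mean, neg_median = statistics.mean(negative_skew), statistics.median(negative_skew)

print(pos_mean > pos_median)  # mean pulled towards the right-hand tail
print(neg_mean < neg_median)  # mean pulled towards the left-hand tail
```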

the more variability in our data…

… the less certain we can be about the estimates from the data, such as the mean

sum of squares:

total sum of squares = sum over all observations of (value in the sample - mean value of the sample)^2

what is the problem with the sum of squares equation?

the more data points you have, the bigger the sum of squares value will be
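
A short sketch of the sum of squares and of this problem, using made-up data. The helper name `sum_of_squares` is hypothetical. Repeating the same values ten times leaves the spread unchanged but makes the sum of squares ten times larger; dividing by the number of observations (which gives the variance) removes that dependence.

```python
def sum_of_squares(values):
    """Sum over all observations of (value - mean)^2."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)

# Hypothetical data (assumption): same spread, different number of points.
small = [2, 4, 6]
large = small * 10  # the same three values repeated ten times

ss_small = sum_of_squares(small)  # 8.0
ss_large = sum_of_squares(large)  # 80.0

# Dividing by the number of observations gives comparable values.
var_small = ss_small / len(small)
var_large = ss_large / len(large)

print(ss_small, ss_large, var_small, var_large)
```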

unreliability is proportional to:

variance

standard deviation equation:

standard deviation = √( sum of (each value - mean)^2 / size of population ); for a sample, divide by (n - 1) instead of the population size
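
A minimal sketch of this calculation on made-up data, worked step by step and then checked against `statistics.pstdev` (population form) from the Python standard library. `statistics.stdev` is the sample form, which divides by n - 1.

```python
import math
import statistics

# Hypothetical data (assumption, for illustration only).
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(values) / len(values)
ss = sum((x - mean) ** 2 for x in values)    # sum of squares
sd_population = math.sqrt(ss / len(values))  # divide by population size

print(sd_population)              # 2.0
print(statistics.pstdev(values))  # population SD from the library
print(statistics.stdev(values))   # sample SD (n - 1), slightly larger
```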

what does standard error of the mean calculate and how does it differ from standard deviation?

standard error measures the scatter of sample means around the population mean, whereas the standard deviation measures the scatter of the raw data values (observations) around their mean

Two Standard Error rules of thumb:

1) standard error is a measure of how confident we are that our sample mean is close to the population mean

2) in 95.5% of cases the population mean will fall within ca. 2 standard errors of the sample mean

Gaussian Distribution:

same as the normal distribution; it is a common continuous probability distribution

it is bell shaped, asymptotic at the extremes, and symmetrical around the mean with no skew: mean = median = mode

area under the curve is directly proportional to the relative frequency of observations and their probability (p)
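
These properties can be checked with `statistics.NormalDist` from the Python standard library. The sketch assumes a standard normal distribution (mean 0, standard deviation 1), chosen only for illustration.

```python
from statistics import NormalDist

# Standard normal distribution (assumption, chosen for illustration).
z = NormalDist(mu=0, sigma=1)

# Symmetry: mean, median and mode coincide.
print(z.mean, z.median, z.mode)

# Area under the curve = probability of an observation in that range.
below_mean = z.cdf(0)               # half the area lies below the mean
within_1_sd = z.cdf(1) - z.cdf(-1)  # roughly 68%
within_2_sd = z.cdf(2) - z.cdf(-2)  # roughly 95.5%

print(below_mean, round(within_1_sd, 4), round(within_2_sd, 4))
```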

what is the Gaussian (Normal) Distribution important for?

statistical analysis

describe the features of a box-&-whisker plot:

central line is the median, the top line of the box is the 3rd quartile, the bottom line of the box is the 1st quartile, and the box itself spans the interquartile range, with the whiskers extending to the largest and smallest data values

IQR equation:

IQR = [3rd Quartile] - [1st Quartile]
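
A sketch of the IQR on made-up data, using `statistics.quantiles` from the Python standard library. Quartile conventions differ between textbooks and software; `method="inclusive"` is an assumption made here, and other methods can give slightly different quartiles.

```python
import statistics

# Hypothetical ranked dataset (assumption, for illustration only).
values = [1, 3, 4, 6, 7, 9, 12, 15, 18]

# n=4 splits the data into quartiles; returns the three cut points.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q2, q3)  # 1st quartile, median, 3rd quartile
print(iqr)         # height of the box in a box-&-whisker plot
```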

what does the location of the median within a box plot give information regarding?

the placement of a median within the box plot gives information regarding skewness in a dataset

what are the variabilities and uncertainties for the following central tendencies?

1) mean

2) median

mean = variance, SD, SE of the mean, 95% confidence interval

median = interquartile range

standard error of the mean calculation:

standard error = SD / √n, where n is the number of observations in the sample
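
A minimal sketch of this calculation on a made-up sample, followed by the "ca. 2 standard errors" rule of thumb as an approximate range around the sample mean. It assumes the sample form of the SD (`statistics.stdev`, which divides by n - 1).

```python
import math
import statistics

# Hypothetical sample (assumption, for illustration only).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample SD (divides by n - 1)
se = sd / math.sqrt(n)         # standard error of the mean

# Rule of thumb: population mean lies within ca. 2 SE of the sample mean.
lower, upper = mean - 2 * se, mean + 2 * se

print(f"mean = {mean}, SE = {se:.3f}, range = {lower:.2f} to {upper:.2f}")
```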

continuous variable:

values within a range, can be measured (e.g. size: 130cm, 27cm etc)

discrete variable:

fixed values, integer, can be counted (e.g. no. of chromosomes)

ordinal variable:

n factor levels with implicit order (e.g. size class: small, medium & large)

nominal variable:

n factor levels without implicit order (e.g. eye colour: grey, blue, brown etc / treatment: sham vs. testosterone)

two types of numerical (quantitative) variable:

continuous (values within a range, measured) and discrete (fixed integer values, counted)

two types of categorical (qualitative) variable:

nominal (n factor levels without implicit order: eye colour, testosterone vs sham) and ordinal (n factor levels with implicit order: small, medium, large)

(26 cards)