Descriptive and Inferential Statistics Flashcards

Question 1

Q

What is descriptive statistics?

Answer

A

Deals with an entire dataset (e.g. a population). The goal is to summarise the raw data and represent it graphically, without extending conclusions beyond the observed dataset.

Question 2

Q

What is inferential statistics?

Answer

A

The goal is to make inferences beyond your data: to infer something about a population based on a smaller subset of it, the sample. This means estimating a population parameter from a sample statistic, or testing a hypothesis.

Question 3

Q

What is the arithmetic mean?

Answer

A

Add all the values and divide by the number of values. Sensitive to outliers.
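
A minimal sketch of the calculation, using made-up numbers, that also shows how a single extreme value pulls the mean upwards:

```python
values = [2, 3, 4, 5, 6]          # made-up data
with_outlier = values + [100]     # the same data plus one extreme value

# Add all the values, then divide by how many there are.
mean_plain = sum(values) / len(values)
mean_outlier = sum(with_outlier) / len(with_outlier)

print(mean_plain)     # 4.0
print(mean_outlier)   # 20.0
```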

Question 4

Q

What is the geometric mean?

Answer

A

Multiply the n values together and take the nth root of the product. Can reduce the effect of outliers.
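
A small sketch with made-up positive values, comparing the hand calculation with `statistics.geometric_mean` from the Python standard library (available from Python 3.8) and with the arithmetic mean of the same data:

```python
import math
from statistics import geometric_mean

values = [1, 10, 100]   # made-up positive values
n = len(values)

# Multiply the values together, then take the nth root of the product.
by_hand = math.prod(values) ** (1 / n)

# The standard library performs the same calculation.
from_library = geometric_mean(values)

# Arithmetic mean of the same data, for comparison.
arithmetic = sum(values) / n

print(by_hand, from_library, arithmetic)
```

Both geometric results are approximately 10, while the arithmetic mean is 37.0 because the value 100 dominates the sum.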

Question 5

Q

What is the weighted mean?

Answer

A

Multiply each value by its ‘weight’, add the products together and divide by the sum of all the weights.
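
As an illustration only, with hypothetical marks and weights:

```python
values = [70, 80, 90]    # three hypothetical exam marks
weights = [1, 2, 3]      # hypothetical weight given to each mark

# Multiply each value by its weight and add the products together.
weighted_sum = sum(v * w for v, w in zip(values, weights))

# Divide by the sum of the weights.
weighted_mean = weighted_sum / sum(weights)

print(weighted_mean)   # 500 / 6, roughly 83.3
```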

Question 6

Q

What measure of centrality is best for normally distributed data?

Answer

A

Mean, median or mode: all three coincide for a normal distribution, and the mean is usually the one reported.

Question 7

Q

What measure of centrality is best for negatively or positively skewed data?

Answer

A

Median (the three measures of centrality will not coincide)
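
To see that the three measures do not coincide on skewed data, here is a short sketch on made-up, positively skewed values:

```python
from statistics import mean, median, mode

data = [1, 2, 2, 3, 4, 5, 30]   # made-up, positively skewed values

print(mean(data))     # about 6.71, pulled up by the value 30
print(median(data))   # 3
print(mode(data))     # 2
```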

Question 8

Q

What measure of central tendency would we use for categorical data?

Answer

A

Mode (the most frequently occurring category)

Question 9

Q

Define variance

Answer

A

The average squared distance of an observation from the mean

Question 10

Q

How do you calculate variance?

Answer

A

Subtract the mean from each value, then square the result. Then work out the arithmetic mean of these squared differences.
‘sum of squared differences from the mean’, divided by the number of values
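
The steps above can be sketched as follows on made-up data, with `statistics.pvariance` from the standard library as a cross-check:

```python
from statistics import pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data
n = len(data)
mean = sum(data) / n              # 5.0

# Squared difference of each value from the mean.
squared_diffs = [(x - mean) ** 2 for x in data]

# Arithmetic mean of the squared differences: 32 / 8.
variance_by_hand = sum(squared_diffs) / n

print(variance_by_hand)   # 4.0
print(pvariance(data))    # same quantity from the standard library
```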

Question 11

Q

What is standard deviation?

Answer

A

Square root of variance.

Larger sd = wider spread of data

Question 12

Q

Why and how do we adjust the variance equation?

Answer

A

Divide by n-1 instead of n when the variance is estimated from a sample.

Dividing by n tends to underestimate the true population variance, so this brings the estimate closer to it.
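
A brief sketch of the two denominators on a made-up sample. In the Python standard library, `statistics.pvariance` divides by n and `statistics.variance` divides by n - 1:

```python
from statistics import pvariance, variance

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample
n = len(sample)
mean = sum(sample) / n
sum_sq = sum((x - mean) ** 2 for x in sample)   # 32.0

divide_by_n = sum_sq / n                 # 4.0
divide_by_n_minus_1 = sum_sq / (n - 1)   # about 4.57

print(divide_by_n, pvariance(sample))
print(divide_by_n_minus_1, variance(sample))
```

The n - 1 version is always the larger of the two, and the gap shrinks as n grows.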

Question 13

Q

Can we use variance and sd for all types of data?

Answer

A

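They can be calculated for any numerical data, but they only summarise the spread well when the data are approximately normally distributed.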

Question 14

Q

What measures of variation can we use for skewed data?

Answer

A

Quartiles (and the interquartile range), often displayed as box plots
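
A small sketch of quartiles on made-up skewed data, using `statistics.quantiles` from the standard library. Quartile values depend on the interpolation method; the default "exclusive" method is assumed here:

```python
from statistics import median, quantiles

data = [1, 2, 2, 3, 4, 5, 30]   # made-up, positively skewed values

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4)

# Interquartile range: the spread of the middle half of the data.
iqr = q3 - q1

print(q1, q2, q3, iqr)
```

The extreme value 30 has no effect on the interquartile range, which is why quartiles suit skewed data.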

Question 15

Q

What is the empirical rule?

Answer

A

States that, for normally distributed data, 68.27% of data values lie within +/- 1 sd of the mean

95.45% within +/- 2 sd
99.73% within +/- 3 sd
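
The percentages for a normal distribution can be checked with `statistics.NormalDist` from the standard library. The helper name `share_within` is introduced here for illustration only:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within k sd of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 2))
```

This prints 68.27, 95.45 and 99.73 for 1, 2 and 3 sd respectively.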

Question 16

Q

What is the standard error and how do we calculate it?

Answer

A

A value that tells us the precision of a sample-based estimate.
sd / square root of n, where n is the sample size
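
A minimal sketch for the standard error of a sample mean, on a made-up sample:

```python
import math
from statistics import stdev

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample
n = len(sample)

sd = stdev(sample)                  # sample sd (n - 1 in the denominator)
standard_error = sd / math.sqrt(n)  # sd / square root of n

print(standard_error)   # roughly 0.756
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.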

Question 17

Q

How would you represent continuous data?

Answer

A

histograms, box-plots, normal plots

Question 18

Q

What is a confidence interval?

Answer

A

A confidence interval describes the amount of uncertainty associated with a sample estimate of a population parameter. It describes the margin of error either side of our point estimate.
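
One common way to build such an interval for a mean is point estimate +/- z x standard error. The sketch below assumes a 95% level and the normal approximation; for a sample this small a t-distribution multiplier would usually be preferred, so treat the numbers as illustrative only:

```python
import math
from statistics import NormalDist, mean, stdev

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample
n = len(sample)

point_estimate = mean(sample)
standard_error = stdev(sample) / math.sqrt(n)

# z is about 1.96 for a 95% interval under the normal approximation.
z = NormalDist().inv_cdf(0.975)
margin_of_error = z * standard_error

lower = point_estimate - margin_of_error
upper = point_estimate + margin_of_error
print(lower, upper)
```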

Question 19

Q

What is a type 1 error?

Answer

A

A false positive - rejecting a null hypothesis that is actually true. ‘optimist’

Question 20

Q

What is a type 2 error?

Answer

A

A false negative - failing to reject a null hypothesis that is actually false. ‘pessimist’

Question 21

Q

What does a p-value represent?

Answer

A

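The probability of obtaining data at least as extreme as those observed, assuming the null hypothesis is true. It evaluates how compatible the sample data are with the null hypothesis. High p = data are likely with a true null. Low p = data are unlikely with a true null.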
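
As a hedged illustration, here is a two-sided p-value from a one-sample z-test. The null mean, the known population sd and the observed sample mean are all hypothetical numbers chosen to keep the arithmetic simple:

```python
import math
from statistics import NormalDist

mu_0 = 100          # hypothetical mean under the null hypothesis
sigma = 15          # population sd, assumed known
n = 25              # sample size
sample_mean = 106   # hypothetical observed sample mean

# How many standard errors the sample mean lies from the null mean: 6 / 3.
z = (sample_mean - mu_0) / (sigma / math.sqrt(n))

# Two-sided p-value: chance of a z at least this far from 0 if the null is true.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(z, round(p_value, 4))   # 2.0 0.0455
```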

Question 22

Q

In estimation statistics what measures of interest are there?

Answer

A

Mean, prevalence of a disease (proportion), regression line, relative risk (RR), odds ratio (OR)

Question 23

Q

What are the two categories of estimation?

Answer

A

Point: single value statistic e.g. estimated mean or proportion.
Interval: defined by two numbers, between which the population parameter is estimated to lie with a high degree of probability, e.g. a confidence interval.
