Statistics Flashcards

Question 1

Q

Central Limit theorem

Answer

A

When you repeatedly draw samples from an underlying population with unknown characteristics, the distribution of the sample means approximates a normal distribution, provided the sample size is sufficiently large (a common rule of thumb is n > 30).

In short: the distribution of sample means is approximately normal for sufficiently large samples.
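
A small simulation can make this concrete. The sketch below is only an illustration: the exponential population, the sample size of 50, the 2000 repeats, and the helper name sample_means are all assumptions chosen for the example, not part of the card.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable


def sample_means(n, repeats=2000):
    """Draw `repeats` samples of size n from a skewed (exponential)
    population and return the mean of each sample."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(repeats)
    ]


means = sample_means(n=50)
centre = statistics.fmean(means)
spread = statistics.stdev(means)

# Share of sample means within one standard deviation of their centre.
# For a normal distribution this share is roughly 68%.
within_one_sd = sum(abs(m - centre) <= spread for m in means) / len(means)
print(f"centre={centre:.3f} spread={spread:.3f} within 1 SD={within_one_sd:.1%}")
```

Even though the population itself is strongly skewed, the share of sample means within one standard deviation should land near the 68% expected of a normal distribution.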

Question 2

Q

The law of large numbers

Answer

A

For any random process, the difference between the sample mean and the underlying population mean tends to shrink as the sample size increases (the observed proportion approaches the theoretical probability).

In short: the larger your sample, the closer your sample mean tends to be to the population mean.
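
As a rough illustration, the sketch below rolls a simulated fair die (an assumption made for the example) and compares the running sample mean with the theoretical mean of 3.5 at a few sample sizes.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Fair six-sided die: the theoretical mean is (1 + 2 + ... + 6) / 6 = 3.5.
rolls = [random.randint(1, 6) for _ in range(100_000)]

for n in (10, 1_000, 100_000):
    gap = abs(statistics.fmean(rolls[:n]) - 3.5)
    print(f"n={n:>7}: |sample mean - 3.5| = {gap:.4f}")
```

The gap is not guaranteed to shrink at every step, but it tends towards zero as n grows.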

Question 3

Q

Inferential statistics

Answer

A

Inferring the characteristics of a population from a particular sample (e.g. for a sufficiently large sample, we can estimate the mean of the underlying population from the sample mean).

Question 4

Q

Standard deviation

Answer

A

It is a statistic that tells us how much individual values in a data set differ from the mean of that set. It measures the spread, or variability, of a set of numbers.
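
A minimal sketch with made-up numbers, using Python's statistics module. Note that there are two common versions: the population standard deviation divides by n, the sample standard deviation by n - 1.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values; their mean is 5

population_sd = statistics.pstdev(data)  # divides the squared deviations by n
sample_sd = statistics.stdev(data)       # divides by n - 1 instead

print(population_sd)  # 2.0
print(sample_sd)      # slightly larger, about 2.14
```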

Question 5

Q

Standard error of the mean

Answer

A

It measures the precision of the sample mean as an estimate of the population mean. It tells us how much the sample mean (the average of our sample data) is likely to differ from the true population mean (the average of all possible data points if we could measure them all). It is the standard deviation divided by the square root of the sample size.
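
The last sentence translates directly into code. The data below are made up for illustration, and the sample standard deviation (n - 1 in the denominator) is assumed.

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values

# Standard error of the mean = standard deviation / sqrt(sample size)
sem = statistics.stdev(sample) / math.sqrt(len(sample))
print(f"SEM = {sem:.3f}")
```

Because the sample size sits under a square root, quadrupling the sample size only halves the standard error.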

Question 6

Q

Z-score

Answer

A

A z-score, also known as a standard score, tells us how far a particular data point is from the mean, measured in standard deviations. It helps us understand how unusual or typical a particular value is within a data set. It is calculated by taking a specific data point, subtracting the sample mean, and then dividing the result by the standard deviation. A z-score close to 0 suggests a fairly typical data point.
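
A minimal sketch of the calculation with made-up numbers. The helper name z_score is hypothetical, and the population standard deviation is used here only because it gives round numbers; the sample standard deviation works the same way.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
mean = statistics.fmean(data)    # 5.0
sd = statistics.pstdev(data)     # 2.0 (population SD, chosen for round numbers)


def z_score(x):
    """Distance of x from the mean, in units of standard deviations."""
    return (x - mean) / sd


print(z_score(5))  # 0.0: exactly at the mean, a typical value
print(z_score(9))  # 2.0: two standard deviations above the mean
```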

Question 7

Q

Confidence intervals

Answer

A

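For a given sample, a confidence interval is a range of values around the sample mean that is likely to contain the population mean. With a 95% confidence interval, about 95% of the intervals constructed this way from repeated samples would contain the population mean.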
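
One common way to build such an interval is sample mean ± critical value × standard error. The sketch below uses the normal-approximation critical value (about 1.96) and made-up data; for a sample this small a t-distribution critical value would normally be preferred, so treat it as an illustration only.

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
mean = statistics.fmean(sample)
sem = statistics.stdev(sample) / math.sqrt(len(sample))

# Normal critical value that leaves 2.5% in each tail (about 1.96).
z = statistics.NormalDist().inv_cdf(0.975)

lower, upper = mean - z * sem, mean + z * sem
print(f"95% CI (normal approximation): [{lower:.2f}, {upper:.2f}]")
```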

Question 8

Q

QQ-plot

Answer

A

A QQ-plot, or quantile-quantile plot, is a type of plot used to compare the distribution of a data set to a theoretical distribution, most commonly a normal (bell curve) distribution. It’s a helpful visual tool for checking whether your data is normally distributed, which is often an important assumption in statistics.

Question 9

Q

Parametric tests

Answer

A

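Tests that assume the data follow a particular distribution, typically the normal distribution. They can be used on normally distributed data.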

Question 10

Q

Non-parametric tests

Answer

A

Tests that do not assume a particular distribution. They can be used on non-normally distributed data.

Question 11

Q

Quasi-experiment

Answer

A

Collection of data on two or more naturally occurring variables in the world (e.g. shoe size and breath-hold time), with no random assignment of subjects!

Question 12

Q

A full experiment

Answer

A

Systematic manipulation of one or more variables (the independent variables) to observe how they influence an outcome measure (the dependent variable).

Question 13

Q

T-test

Answer

A

Used when we want to test whether two means are different.
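
As a sketch of the idea, the code below computes Welch's t statistic (one variant of the two-sample t-test, which does not assume equal variances) for two made-up groups. The function name welch_t and the data are assumptions for illustration, and the sketch stops at the statistic without computing a p-value.

```python
import math
import statistics


def welch_t(a, b):
    """Difference between two group means, divided by the standard error
    of that difference (Welch's unequal-variance form)."""
    se = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.fmean(a) - statistics.fmean(b)) / se


group_a = [5.1, 4.9, 5.6, 5.8, 6.0]  # made-up measurements
group_b = [4.1, 4.4, 3.9, 4.6, 4.3]

t = welch_t(group_a, group_b)
print(f"t = {t:.2f}")
```

The further t is from 0, the less plausible it is that the two underlying means are equal.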

Question 14

Q

Regression

Answer

A

Used when we want to predict a continuous dependent variable from one or more continuous OR categorical independent variables.
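
The simplest case, one continuous predictor, can be sketched with ordinary least squares. The helper name fit_line and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data that lie exactly on y = 1 + 2x.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
prediction = intercept + slope * 4  # predicted y for a new x of 4
print(slope, intercept, prediction)
```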

Question 15

Q

Correlation test

Answer

A

Used when we want to test the relationship between two continuous variables.
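
The usual starting point is Pearson's correlation coefficient r, which ranges from -1 to 1. The sketch below computes r by hand for made-up data; the function name pearson_r is hypothetical, and the significance test itself is not shown.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y, scaled by the
    variation of each variable on its own."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return co / math.sqrt(var_x * var_y)


r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])    # perfectly increasing
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])  # perfectly decreasing
print(r_up, r_down)
```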

Question 16

Q

Degrees of freedom

Answer

A

It essentially tells us the number of values in a calculation that are free to vary.

In multiple regression, degrees of freedom are the number of data points minus the number of parameters (coefficients) estimated.
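
A quick worked example of that subtraction, with made-up numbers: 20 observations and a model with an intercept plus two slopes.

```python
n_observations = 20  # made-up number of data points
n_parameters = 3     # intercept + 2 slope coefficients (assumed model)

# Degrees of freedom = data points minus estimated parameters.
residual_df = n_observations - n_parameters
print(residual_df)  # 17
```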