Lesson 1 Flashcards

1
Q

What does the term ‘random variable’ refer to in statistics?

A

A variable whose value is subject to variations due to chance.

2
Q

True or False: The mean and median are always equal in a symmetric distribution.

A

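True, provided the mean exists: both equal the center of symmetry. (The Cauchy distribution, for example, is symmetric but has no defined mean.)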

3
Q

What is the formula for calculating the variance of a data set?

A

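Sum of (each data point - mean)^2 divided by the number of data points (population variance). For the sample variance, divide by the number of data points minus 1 instead.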
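
As a rough illustration of this formula, the sketch below computes the variance by hand. The data values are made up for the example, and the n - 1 version is included only for comparison; it is the usual sample-variance convention, not something this card states.

```python
# Illustrative, made-up data set (not from the lesson).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
squared_deviations = [(x - mean) ** 2 for x in data]

# Population variance: divide by the number of data points.
population_variance = sum(squared_deviations) / len(data)

# Sample variance (assumed convention): divide by n - 1 instead.
sample_variance = sum(squared_deviations) / (len(data) - 1)

print(population_variance, sample_variance)
```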

4
Q

In a normal distribution, what percentage of data falls within one standard deviation of the mean?

A

Approximately 68%
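
The 68% figure can be checked numerically. This is a minimal sketch that uses the error function from Python's standard library; the helper name `prob_within` is hypothetical and chosen only for this example.

```python
import math

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

within_one_sd = prob_within(1)
print(round(within_one_sd, 4))
```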

5
Q

What is the purpose of conducting a hypothesis test in statistics?

A

To determine if there is enough evidence to reject a null hypothesis.

6
Q

What is the formula for calculating the z-score of a data point in a normal distribution?

A

(Data point - mean) divided by standard deviation.
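
A minimal sketch of this formula, with made-up numbers (a score of 85 in a distribution with mean 70 and standard deviation 10); the function name `z_score` is an assumption for illustration.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical values: data point 85, mean 70, standard deviation 10.
z = z_score(85, 70, 10)
print(z)
```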

7
Q

What is the definition of the p-value in statistical hypothesis testing?

A

The probability of obtaining results at least as extreme as the observed results, assuming the null hypothesis is true.
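
To make the definition concrete, here is a hedged sketch for one specific case: a two-sided test whose test statistic is assumed to follow a standard normal distribution under the null hypothesis. The observed statistic of 1.96 is a made-up value, and `two_sided_p_value` is a hypothetical helper name.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. results at least as extreme as z."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical observed test statistic.
p_value = two_sided_p_value(1.96)
print(round(p_value, 3))
```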

8
Q

What does the term ‘confidence interval’ represent in statistics?

A

A range of values, computed from sample data, that is estimated to contain a population parameter at a stated confidence level (for example, 95%).

9
Q

What is the difference between correlation and causation in statistics?

A

Correlation indicates a relationship between two variables, while causation implies that one variable directly affects the other.

10
Q

What does the ‘central limit theorem’ state in statistics?

A

Regardless of the shape of the population distribution (provided it has finite variance), the sampling distribution of the mean of independent observations will be approximately normally distributed for large sample sizes.

11
Q

What is the formula for calculating the standard error of the mean?

A

Standard deviation divided by the square root of the sample size.
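
A small sketch of this calculation, assuming the standard deviation is estimated from the sample itself (the n - 1 version returned by `statistics.stdev`). The sample values are made up.

```python
import math
from statistics import stdev

# Illustrative, made-up sample.
sample = [12, 15, 11, 14, 13]

# Standard error of the mean: sample standard deviation / sqrt(sample size).
standard_error = stdev(sample) / math.sqrt(len(sample))
print(round(standard_error, 4))
```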

12
Q

In a regression analysis, what does the coefficient of determination (R^2) measure?

A

The proportion of the variance in the dependent variable that is predictable from the independent variable(s).

13
Q

What is the purpose of conducting a chi-squared test in statistics?

A

To determine if there is a significant association between two categorical variables.

14
Q

What is the formula for calculating the margin of error in a confidence interval?

A

Critical value multiplied by standard error.
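
As a sketch of this product, the example below assumes a 95% confidence level and a normal (z) critical value; a t critical value would be used instead for small samples with an estimated standard deviation. The standard error of 0.5 is a made-up number.

```python
from statistics import NormalDist

confidence = 0.95  # assumed confidence level

# Two-sided critical value from the standard normal distribution (about 1.96).
critical_value = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

standard_error = 0.5  # hypothetical value for illustration

margin_of_error = critical_value * standard_error
print(round(critical_value, 2), round(margin_of_error, 2))
```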

15
Q

What does the term ‘outlier’ refer to in statistics?

A

An observation that lies an abnormal distance from other values in a dataset.

16
Q

What is the difference between a Type I error and a Type II error in hypothesis testing?

A

Type I error occurs when the null hypothesis is true but is rejected, while Type II error occurs when the null hypothesis is false but is not rejected.

17
Q

In probability theory, what is the complement rule?

A

The probability of an event not occurring is equal to 1 minus the probability of the event occurring.

18
Q

What is the formula for calculating the coefficient of variation?

A

Standard deviation divided by the mean, multiplied by 100 to express it as a percentage.
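
A minimal sketch of this ratio on made-up data, assuming the sample standard deviation (`statistics.stdev`) is the one intended.

```python
from statistics import mean, stdev

# Illustrative, made-up data.
data = [10, 12, 8, 11, 9]

# Coefficient of variation as a percentage of the mean.
cv_percent = stdev(data) / mean(data) * 100
print(round(cv_percent, 2))
```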

19
Q

What is the purpose of a box plot in statistics?

A

To visually represent the five-number summary of a dataset and identify outliers.

20
Q

What is the difference between a population and a sample in statistics?

A

A population includes all members of a specified group, while a sample is a subset of the population used to make inferences about the population.

21
Q

What is the formula for calculating the odds ratio in a 2x2 contingency table?

A

(ad)/(bc), where a and b are the cell counts in the first row of the table and c and d are those in the second row.
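
The formula can be sketched as below. The layout is an assumption: a and b are taken as the first row of the table and c and d as the second row. The counts are made up, and `odds_ratio` is a hypothetical helper name.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table laid out as [[a, b], [c, d]] (assumed layout)."""
    return (a * d) / (b * c)

# Hypothetical counts: row 1 = (20, 80), row 2 = (10, 90).
ratio = odds_ratio(20, 80, 10, 90)
print(ratio)
```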

22
Q

What is the definition of a statistical parameter?

A

A measurable characteristic of a population, such as the mean or standard deviation.

23
Q

What does a p-value of 0.05 indicate in hypothesis testing?

A

That, if the null hypothesis is true, there is a 5% chance of observing results at least as extreme as those obtained. A value of 0.05 is also commonly used as the threshold for statistical significance.

24
Q

What is the formula for calculating the interquartile range of a dataset?

A

The difference between the third quartile (Q3) and the first quartile (Q1).
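
A short sketch using `statistics.quantiles` from the standard library. Quartile conventions differ between textbooks and tools, so the default "exclusive" method used here is an assumption; the data are made up.

```python
from statistics import quantiles

# Illustrative, made-up data.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 returns the three quartile cut points: Q1, Q2 (median), Q3.
q1, q2, q3 = quantiles(data, n=4)

iqr = q3 - q1
print(q1, q3, iqr)
```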

25
Q

In a two-sample t-test, what does the t-statistic measure?

A

The difference between the two sample means relative to the standard error of that difference, which reflects the variability within the samples.
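
As one possible illustration, the sketch below computes the unequal-variance (Welch) form of the statistic. The card does not say which variant is meant, so that choice is an assumption, as are the function name `welch_t` and the made-up samples.

```python
import math
from statistics import mean, variance

def welch_t(x, y):
    """Difference in sample means divided by the standard error of that difference."""
    standard_error = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / standard_error

# Illustrative, made-up samples.
group_a = [5, 7, 9]
group_b = [1, 3, 5]

t_stat = welch_t(group_a, group_b)
print(round(t_stat, 4))
```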