911 Flashcards

Week 1-4

1
Q

What is the definition of statistics according to Davidian and Louis?

A

Statistics is the science of learning from data, and of measuring, controlling, and communicating uncertainty. It thereby provides the navigation essential for controlling the course of scientific and societal advances.

2
Q

What are the main types of statistical paradigms?

A

Bayesian Statistics
Classical (Error) Statistics, also called Frequentist Statistics
Likelihood-Based Statistics
Akaikean-Information Criterion-Based Statistics

3
Q

What defines a research question?

A

A question that defines what a study hopes to learn, addressing how, why, when, and what.

4
Q

What are the types of research questions?

A

Descriptive: Describes what is going on or what exists.
Relational: Examines relationships between variables.
Causal: Determines if a variable affects one or more outcomes.

5
Q

What are characteristics of a researchable question?

A

Narrow and specific.
Can be answered by observable evidence (data).
Has significant relevance for guiding policy, theory, or practice.

6
Q

What is a hypothesis?

A

A specific, concise statement predicting the outcome of a study, indicating variables and the relationship to be examined.

7
Q

What are the types of hypotheses?

A

Null hypothesis
Alternative hypothesis
Hypothesis of difference
Hypothesis of point-prevalence
Hypothesis of association

8
Q

What is operationalization?

A

Defining a concept by specifying the activities or operations needed to measure it.

9
Q

What is the difference between independent and dependent variables?

A

Independent Variables (IV): Variables that predict or cause the dependent variable.
Dependent Variables (DV): Variables that are explained or predicted.

10
Q

What are levels of measurement in data?

A

Continuous data (Quantitative): interval and ratio scales
Categorical data (Qualitative): nominal and ordinal scales

11
Q

What are the types of data in research?

A

Primary Data: Data collected by the researcher (e.g., surveys, interviews).
Secondary Data: Data previously collected for other purposes (e.g., administrative records).

12
Q

What are the advantages and disadvantages of primary data?

A

Advantages: Collect the exact data needed, define and measure variables directly.
Disadvantages: Expensive, time-intensive, labor-intensive.

13
Q

What are the advantages of secondary data?

A

Less expensive
Large quantities available
Depth (multiple years) and breadth (many data elements)
Easy to access (some)

14
Q

What are some disadvantages of secondary data?

A

May not be collected for research purposes
May have data entry errors or omissions
May not measure the exact variables needed
Barriers to linking datasets (e.g., HIPAA)

15
Q

What is the difference between Missing Completely at Random (MCAR) and Missing at Random (MAR)?

A

MCAR: Missingness has no relationship with any observed or missing data.
MAR: Missingness is related to observed data but, given those data, not to the missing values themselves.

16
Q

What are the three types of statistical tests based on the purpose of analysis?

A
  1. Tests of Differences (e.g., T-test, ANOVA)
  2. Tests of Associations (e.g., Chi-square, correlation, regression)
  3. Parametric and Nonparametric tests (dependent on distribution and sample size)
17
Q

What are the main types of descriptive statistics?

A
  1. Measures of Frequency (e.g., Count, Percent, Frequency)
  2. Measures of Central Tendency (e.g., Mean, Median, Mode)
  3. Measures of Dispersion/Variation (e.g., Range, Standard Deviation, Variance)
  4. Measures of Position (e.g., Percentile Ranks, Quartile Ranks, Z-scores)
18
Q

What are the three measures of central tendency?

A

Mean: The arithmetic average.
Median: The middle score in a distribution.
Mode: The most frequently occurring score.
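
As a minimal sketch, the three measures can be computed with Python's standard `statistics` module. The `scores` list is made-up illustration data, not course data.

```python
import statistics

# Hypothetical quiz scores (illustration only)
scores = [2, 3, 3, 5, 7, 10]

mean_score = statistics.mean(scores)      # arithmetic average: 30 / 6
median_score = statistics.median(scores)  # middle of the sorted scores: (3 + 5) / 2
mode_score = statistics.mode(scores)      # most frequently occurring score

print(mean_score, median_score, mode_score)
```

With an even number of scores, the median is the average of the two middle values.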

19
Q

What is the range in descriptive statistics?

A

The range is the difference between the highest and lowest values in a dataset.

20
Q

What is the standard deviation?

A

The square root of the variance (the average squared deviation from the mean); it describes the typical amount by which scores differ from the mean.
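
The calculation can be sketched step by step in Python. The scores are hypothetical, and the sketch assumes the population formula (dividing by N); a sample estimate would divide by N - 1 instead, as `statistics.stdev` does.

```python
import math
import statistics

# Hypothetical scores (illustration only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)                   # 40 / 8
squared_devs = [(x - mean) ** 2 for x in scores]   # squared deviation of each score
variance = sum(squared_devs) / len(scores)         # average squared deviation
sd = math.sqrt(variance)                           # standard deviation

print(variance, sd)
# statistics.pstdev applies the same population (divide-by-N) formula
print(statistics.pstdev(scores))
```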

21
Q

How do you decide which measure of central tendency to use based on data type?

A

Nominal data: Use the mode.
Ordinal data: Use the median.
Interval/Ratio data (without outliers): Use the mean.
Skewed distribution (with outliers): Use the median.
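
A small sketch of why the median is preferred when outliers are present. The salary figures are invented for illustration.

```python
import statistics

# Hypothetical salaries in thousands (illustration only)
salaries = [30, 32, 35, 38, 40]
with_outlier = salaries + [400]  # add one extreme value

mean_before = statistics.mean(salaries)
median_before = statistics.median(salaries)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)  # mean and median agree without the outlier
print(mean_after, median_after)    # the mean is pulled far upward, the median barely moves
```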

22
Q

What are misleading graphs and examples of how they mislead?

A

Starting scale not at zero
Leaving scale off the graph
Selecting data that is accurate but misleading

23
Q

What is the primary goal of statistical inference?

A

To generalize results from a sample to a larger population beyond the observations in the sample.

24
Q

What is probability theory?

A

The foundation of inferential statistics, calculating the likelihood of an event occurring out of all possible outcomes.

25
Q

What are the three properties of probability distributions?

A

Probability of an event is between 0.0 and 1.0.
The total probability of all possible outcomes must add to 1.0.
Mutually exclusive events’ probabilities can be added together.

26
Q

What is the normal distribution?

A

A theoretical mathematical distribution used to evaluate the probability of events. It assumes a bell curve shape and is symmetrical, unimodal, and continuous.

27
Q

What percentage of scores fall within one, two, and three standard deviations from the mean in a normal distribution?

A

One standard deviation: about 68%
Two standard deviations: about 95%
Three standard deviations: about 99.7%
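
These proportions can be checked with `statistics.NormalDist` from the Python standard library. The helper name `share_within` is hypothetical, chosen for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```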

28
Q

What is a Z-score?

A

A standard normal deviate representing how many standard deviation units a score falls above or below the population mean (μ).
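
A minimal sketch of the calculation, z = (X - μ) / σ. The population values (μ = 100, σ = 15) are assumed for illustration only.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Assumed population: mean 100, standard deviation 15
z_high = z_score(130, mu=100, sigma=15)  # 2 SDs above the mean
z_low = z_score(85, mu=100, sigma=15)    # 1 SD below the mean

print(z_high, z_low)  # 2.0 -1.0
```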

29
Q

What is the Central Limit Theorem?

A

The theorem states that as the sample size grows (N ≥ 30 is a common rule of thumb), the distribution of the sample means approaches a normal distribution, regardless of the shape of the original population’s distribution, provided the population has a finite variance.
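
A small simulation can illustrate the idea. This sketch assumes a right-skewed exponential population (mean 1, SD 1); the sample size of 30 and the 5,000 repetitions are arbitrary choices for the demonstration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    """Mean of n draws from a right-skewed exponential population (mean 1, SD 1)."""
    return statistics.mean([random.expovariate(1.0) for _ in range(n)])

means = [sample_mean(30) for _ in range(5000)]

mean_of_means = statistics.mean(means)   # should be close to the population mean, 1
sd_of_means = statistics.stdev(means)    # should be close to 1 / sqrt(30), about 0.18

# For a normal shape, roughly 68% of sample means fall within 1 SD of their mean
within_one_sd = sum(abs(m - mean_of_means) <= sd_of_means for m in means) / len(means)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(within_one_sd, 3))
```

Even though individual draws are strongly skewed, the sample means cluster symmetrically around the population mean.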

30
Q

What is the sampling distribution of the mean?

A

A theoretical distribution of mean scores from all possible random samples of a given size from a population.

31
Q

What is the standard error of the mean?

A

The standard deviation of the sampling distribution of the mean, indicating how precisely the sample mean estimates the population mean.
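
In practice the standard error is usually estimated as s / √N, where s is the sample standard deviation. A minimal sketch with made-up data:

```python
import math
import statistics

# Hypothetical sample (illustration only)
sample = [4, 8, 6, 5, 3, 7, 6, 9]

n = len(sample)
s = statistics.stdev(sample)  # sample SD, using the N - 1 denominator
sem = s / math.sqrt(n)        # estimated standard error of the mean

print(s, round(sem, 4))
```

Larger samples shrink the standard error, because N appears in the denominator.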

32
Q

What is a confidence interval?

A

A range of values, computed from sample data, that is expected to contain the population mean with a stated level of confidence (e.g., 95% or 99%).
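
A sketch of a 95% interval for a mean. It assumes the population standard deviation is known (σ = 15), so a z critical value applies; the sample mean and sample size are invented for illustration.

```python
import math
from statistics import NormalDist

# Assumed values (illustration only)
sample_mean = 104.0
sigma = 15.0       # population SD, assumed known
n = 36
confidence = 0.95

standard_error = sigma / math.sqrt(n)                     # 15 / 6 = 2.5
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
margin = z_crit * standard_error

ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

print(round(ci_lower, 1), round(ci_upper, 1))  # 99.1 108.9
```

When σ is unknown and estimated from the sample, a t critical value would normally replace `z_crit`.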

33
Q

What is the relationship between variance and standard deviation?

A

Standard deviation is the square root of variance, providing a more interpretable measure of how much scores differ from the mean.

34
Q

What are the characteristics of a normal distribution?

A

Symmetrical
Unimodal
Continuous
Mean = median = mode
Asymptotic (tails never touch the baseline)

35
Q

What is hypothesis testing?

A

A process for assessing whether sample data provide enough evidence to reject a null hypothesis in favor of a research (alternative) hypothesis.

36
Q

What are the main steps in the hypothesis testing process?

A

Specify research hypothesis.
Specify null hypothesis.
Specify alpha (α) level.
Select sample and implement design.
Select and compute statistical test.
Compare observed significance level (p) with alpha.
Make a decision regarding the null hypothesis.
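
The steps above can be sketched with a one-sample Z-test. All numbers (μ₀ = 100, σ = 15, N = 36, sample mean 106) are assumptions made up for this illustration, and the test is two-tailed.

```python
import math
from statistics import NormalDist

# Steps 1-2: research hypothesis (mean differs from 100) and null hypothesis (mean = 100)
mu_0 = 100.0
sigma = 15.0   # population SD, assumed known

# Step 3: alpha level
alpha = 0.05

# Step 4: sample results (hypothetical)
n = 36
sample_mean = 106.0

# Step 5: compute the test statistic
z = (sample_mean - mu_0) / (sigma / math.sqrt(n))   # 6 / 2.5 = 2.4

# Step 6: observed significance level (two-tailed p) compared with alpha
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 7: decision about the null hypothesis
reject_null = p_value < alpha

print(round(z, 2), round(p_value, 4), reject_null)
```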

37
Q

What is a null hypothesis (H0)?

A

The hypothesis that there is no relationship between the hypothesized variables in the population.

38
Q

What is an alternative hypothesis (H1)?

A

The hypothesis that there is a relationship between the hypothesized variables in the population.

39
Q

What is a significance level (α)?

A

A probability value that provides the criterion for rejecting the null hypothesis, representing the probability of making a Type I error (rejecting the null when it is true).

40
Q

What is statistical significance?

A

It means that the observed result would be unlikely to occur by chance alone if the null hypothesis were true, as judged by a p-value below the chosen significance level (e.g., p < 0.05). It does not imply practical importance.

41
Q

What is effect size?

A

A measure of the strength of a statistical relationship, indicating the magnitude of the difference or association in the data; common indices include η² (eta squared), Cohen's d, and r.

42
Q

What are Type I and Type II errors in hypothesis testing?

A

Type I error (α): Rejecting the null hypothesis when it is true (false positive).
Type II error (β): Failing to reject the null hypothesis when it is false (false negative).

43
Q

What is the difference between a directional (one-tailed) and non-directional (two-tailed) hypothesis?

A

Directional (one-tailed): Specifies the direction of the relationship (e.g., “greater than”).
Non-directional (two-tailed): Does not specify the direction, just that a difference exists (e.g., “not equal to”).
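
The choice affects the p-value. A minimal sketch, assuming a hypothetical test statistic of z = 1.8 that lies in the predicted direction:

```python
from statistics import NormalDist

z = 1.8        # hypothetical test statistic
alpha = 0.05

p_one_tailed = 1 - NormalDist().cdf(z)   # area in the upper tail only
p_two_tailed = 2 * p_one_tailed          # area in both tails

print(round(p_one_tailed, 4), round(p_two_tailed, 4))
print(p_one_tailed < alpha, p_two_tailed < alpha)
```

Here the same result is significant under the one-tailed hypothesis but not under the two-tailed one, which is why the direction must be specified before looking at the data.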

44
Q

What is a Z-test?

A

A statistical test of whether a sample mean differs from a population mean, or whether two population means differ, when the population variance is known.

45
Q

What are the assumptions for conducting a Z-test?

A

Sample size greater than 30.
Data are independent.
Data are continuous and approximately normally distributed.
Data come from a probability sample.

46
Q

What is a one-sample T-test?

A

A parametric test used when the population standard deviation is unknown, testing whether a sample mean differs from a known or hypothesized population mean.
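
A sketch of the calculation, t = (M - μ₀) / (s / √N) with df = N - 1. The data and the hypothesized mean are made up. Because the Python standard library has no t distribution, the critical value 2.365 (two-tailed, α = 0.05, df = 7) is taken from a standard t table.

```python
import math
import statistics

# Hypothetical sample and hypothesized population mean (illustration only)
sample = [4, 8, 6, 5, 3, 7, 6, 9]
mu_0 = 4.0

n = len(sample)
sample_mean = statistics.mean(sample)
s = statistics.stdev(sample)                 # population SD is unknown, so estimate it
t_stat = (sample_mean - mu_0) / (s / math.sqrt(n))
df = n - 1

t_critical = 2.365                           # from a t table: two-tailed, alpha = 0.05, df = 7
reject_null = abs(t_stat) > t_critical

print(round(t_stat, 3), df, reject_null)
```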

47
Q

What are degrees of freedom (df)?

A

The number of independent scores that are free to vary in a statistical calculation, symbolized by “df.”

48
Q

What is the purpose of a T-test?

A

To test hypotheses about mean differences between a sample and population when the population variance is unknown.

49
Q

What is the difference between a Z-test and a one-sample T-test?

A

A Z-test is used when the population standard deviation is known, while a one-sample T-test is used when the population standard deviation is unknown and is estimated from the sample.