Final Exam - Weeks 5 & 6 Flashcards

1
Q

Calculating the sample size for a quantitative study (choosing how many people should be sampled) depends on what factors?

A
  • Research question
  • How data will be analyzed
  • Level of statistical significance
  • Statistical power
  • The effect size
2
Q

The level of statistical significance is usually chosen at ________ or _________.

A

0.05 or 0.01

3
Q

Why should the needed sample size be calculated before doing a study?

A

Make sure it is possible to recruit enough people into the study to detect a difference if it exists (have enough power to detect a difference).

4
Q

In quantitative sampling the goal is to select a representative sample so you can generalize to the larger study population = good _______________________

A

external validity

5
Q

Explain the 5 steps in sampling for quantitative studies.

A
  1. Define population: by specifying criteria
  2. Develop sampling plan: select the sampling method (simple random, systematic, stratified, or cluster) and try to minimize systematic error.
  3. Determine sample size: will depend on a number of factors, including the size of the population, the level of precision you require, and the amount of variability in the population.
  4. Implement sampling procedures
  5. Compare critical values of sample to population
6
Q

Explain systematic sampling.

A

Systematic sampling is a method of probability sampling where every nth member of a population is selected to be part of the sample. The process of selecting the sample involves selecting a random starting point from the population, and then selecting every nth element after that starting point.
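
A minimal Python sketch of this procedure, assuming a hypothetical roster of 100 participant IDs. The function name `systematic_sample` and the roster are illustrative only:

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start in the first interval, then take every k-th element."""
    k = len(population) // sample_size  # sampling interval ("every nth member")
    if k < 1:
        raise ValueError("sample_size cannot exceed the population size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random starting point in [0, k)
    return population[start::k][:sample_size]

roster = list(range(1, 101))  # hypothetical sampling frame of 100 IDs
sample = systematic_sample(roster, 10, seed=1)
print(sample)
```

With 100 IDs and a sample of 10, the interval is 10, so the selected IDs are always exactly 10 apart.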

7
Q

Explain stratified random sampling.

A

Stratified random sampling is a method of probability sampling where the population is divided into smaller subgroups, or strata, based on certain characteristics that are relevant to the study. Then, a random sample is selected from each stratum, proportional to the size of that stratum within the population. This ensures that each subgroup is represented in the sample, allowing for more accurate estimates of population characteristics.
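
The proportional allocation described above can be sketched in Python as follows. The strata names and sizes are hypothetical, and `stratified_sample` is an illustrative helper, not a library function:

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Draw a simple random sample from each stratum, proportional to its size."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding may make the total differ
        # slightly from total_n when the shares are not whole numbers.
        n_stratum = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n_stratum)
    return sample

# Hypothetical population of 1,000 split into three strata (60% / 30% / 10%)
strata = {
    "urban": [f"U{i}" for i in range(600)],
    "suburban": [f"S{i}" for i in range(300)],
    "rural": [f"R{i}" for i in range(100)],
}
sample = stratified_sample(strata, total_n=50, seed=42)
print({name: len(chosen) for name, chosen in sample.items()})
```

Each stratum contributes in proportion to its share of the population: 30, 15, and 5 members out of 50.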

8
Q

_______________ sampling has the advantage of ensuring that each subgroup is represented in the sample, allowing for more accurate estimates of population characteristics. It also reduces sampling error and increases precision.

A

Stratified random sampling.

9
Q

Explain cluster sampling.

A

Cluster sampling is a method of probability sampling where the population is divided into clusters or groups, and a random sample of those clusters is selected for inclusion in the study. This method is often used when the population is spread out over a large geographical area or is difficult to access.
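
A minimal Python sketch of one-stage cluster sampling, assuming a hypothetical region of 20 districts with 50 children each. All names are illustrative:

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include every member of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical data: 20 districts, 50 children per district
districts = {
    f"district_{d}": [f"d{d}_child{i}" for i in range(50)]
    for d in range(1, 21)
}
chosen, children = cluster_sample(districts, n_clusters=4, seed=7)
print(chosen, len(children))
```

The random selection happens at the level of clusters, not individuals. A multistage design would add a second random draw of children within each chosen district.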

10
Q

Suppose you want to conduct a study on the health of children in a particular region of a country. The region is divided into 20 school districts, and you want to select a sample of 400 children. What type of sampling might you use?

A

Cluster sampling (multistage sampling)

11
Q

When it is not feasible to get a probability sample, or a sampling frame of the population is not available then ___________ methods are used.

A

non-random.

12
Q

What is a sampling frame?

A

A sampling frame is a list or representation of the population from which a sample is selected. It serves as a basis for identifying and selecting the sample.

13
Q

Name some types of non-probability sampling.

A

Convenience sampling, purposive sampling, snowball sampling.

14
Q

Explain convenience sampling.

A

Individuals or units are selected for inclusion in the study based on their availability and willingness to participate. It is one of the easiest and least expensive methods of sampling, but it is also one of the least reliable.

15
Q

Explain purposive sampling.

A

Involves selecting individuals or units for inclusion in the study based on specific criteria or purpose. In this method, the researcher selects the sample based on their knowledge and understanding of the population and the research question (deliberate selection of participants based on certain criteria).

16
Q

Explain snowball sampling

A

Technically this type of sampling is a subtype of purposive sampling. Snowball sampling involves selecting participants based on referrals from other participants. The researcher starts with a few individuals who meet the selection criteria and then asks them to refer others who may also meet the criteria.

17
Q

What is methodological rigor and why is it important?

A

Methodological rigor refers to the degree to which a research study is designed and conducted to high standards of quality, such that the study can be trusted to produce valid and reliable results.

Methodological rigor is essential in research because it ensures that the study is conducted in a way that minimizes bias and maximizes the accuracy and generalizability of the findings. This includes designing the study with appropriate research questions and hypotheses, selecting appropriate research methods and measures, using appropriate statistical analyses, and ensuring that the sample is representative and the data is reliable.

18
Q

Explain what an a priori hypothesis is.

A

A hypothesis that is based on assumed principles and deductions from the conclusions of previous research, and that is generated before a new study takes place.

19
Q

The two levels of quantitative statistical analysis are:

A
  1. Descriptive: used to summarize the data
  2. Inferential: used to draw conclusions about population parameters based on data from sample
20
Q

Frequency distribution, measures of central tendency, and measures of dispersion/variability are all what type of measure?

A

Descriptive measurements.

21
Q

What are measures of central tendency?

A

Statistics that describe the location of the center of the distribution of numerical and ordinal measurements (mean, median, mode).

22
Q

What are measures of dispersion/variability?

A

Statistics that describe the degree of dispersion, or spread, among scores (range, standard deviation, variance, coefficient of variation, percentiles).

23
Q

Explain the difference between median, mean, and mode.

A

The mean is also known as the average and is calculated by adding up all the values in a set of data and dividing by the total number of values in the set.

The median is the middle value in a set of data that has been arranged in numerical order. If the data set has an even number of values, the median is calculated as the average of the two middle values.

The mode is the most frequently occurring value in a set of data. It can be used to describe the most common value in a data set.
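
A short worked example using Python's standard `statistics` module. The scores are made up for illustration:

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]  # hypothetical data set, already sorted

mean_score = statistics.mean(scores)      # sum of 30 divided by 6 values
median_score = statistics.median(scores)  # even count: average of 3 and 5
mode_score = statistics.mode(scores)      # 3 appears twice, more than any other value

print(mean_score, median_score, mode_score)
```

Here the mean is 5, the median is 4, and the mode is 3, so the three measures need not agree.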

24
Q

Explain normal distribution.

A

Normal distribution is a statistical concept used to describe the distribution of a set of data. It is a continuous probability distribution that is symmetrical around the mean, with most of the data falling close to the mean and fewer data points occurring away from the mean.

The normal distribution is characterized by two parameters: the mean and the standard deviation. The mean is the center of the distribution, and the standard deviation measures the spread or variability of the data around the mean.
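
To illustrate how the two parameters define the distribution, here is a small sketch using `statistics.NormalDist` from the Python standard library. The mean of 100 and standard deviation of 15 are arbitrary example values:

```python
from statistics import NormalDist

dist = NormalDist(mu=100, sigma=15)  # example parameters, chosen arbitrarily

# Proportion of values expected within 1 and 2 standard deviations of the mean
within_1_sd = dist.cdf(115) - dist.cdf(85)
within_2_sd = dist.cdf(130) - dist.cdf(70)

print(round(within_1_sd, 4), round(within_2_sd, 4))
```

Roughly 68% of values fall within one standard deviation of the mean and roughly 95% within two, whatever the mean and standard deviation are.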

25
Q

Explain skewed distribution.

A

A skewed distribution is a type of distribution in which the data is not evenly distributed around the mean, and is not symmetrical around the center of the distribution. One tail of the distribution is longer than the other, resulting in a lopsided or asymmetrical shape.

There are two types of skewed distributions: positively skewed (or right-skewed) and negatively skewed (or left-skewed).

Skewed distributions can occur for various reasons, such as outliers or extreme values in the data, or because the data is inherently skewed by nature.

26
Q

Skewed distributions can have a significant impact on statistical analysis, as they can affect the accuracy of measures of central tendency, such as the mean, median, and mode. In a skewed distribution, the mean is typically shifted towards the direction of the longer tail, and is not representative of the typical value of the data. Therefore, in such cases, the ________________ may be a more appropriate measure of central tendency.

A

median

27
Q

When conducting statistical analysis using continuous data, it is essential to examine the distribution of the data to determine whether it is appropriate to use certain statistical methods. Why?

A

This is because the distribution of the data can have a significant impact on the validity and reliability of statistical tests.

In general, parametric statistical methods assume that the data is normally distributed, meaning that it follows a bell-shaped curve, with most of the data clustering around the mean and fewer data points at the extremes. If the data is not normally distributed, it may be skewed or have outliers, which can affect the validity of statistical tests.

28
Q

What is continuous data?

A

Continuous data is data that can take on any value within a range, and is typically measured on a scale, such as time, weight, or height.

29
Q

If the data is __________________ distributed, parametric methods such as t-tests, ANOVA, and Pearson’s correlation can be used with confidence, because these methods assume normality. However, if the data is not _________________ distributed, it may be necessary to use non-parametric methods, such as the Wilcoxon rank-sum test, the Kruskal-Wallis test, or Spearman’s rank correlation, which do not require normality assumptions.

A

normally

30
Q

Explain symmetrical distribution.

A

In statistics, a symmetrical distribution refers to a probability distribution where the values of the variable are evenly distributed on both sides of the mean, resulting in a mirror image when the distribution is folded in half. In other words, the left half of the distribution is a mirror image of the right half.

Examples of symmetrical distributions include the normal distribution (also known as the Gaussian distribution) and the uniform distribution. In a normal distribution, the mean, median, and mode are all equal and the distribution is perfectly symmetrical around the mean. In a uniform distribution, all values have an equal probability of occurring, resulting in a symmetrical distribution.

31
Q

When data is _______________ distributed, the mean and median are equal, and the standard deviation can be used to determine the variability of the data.

A

symmetrically

32
Q

For skewed distribution why is it more appropriate to use the range of values and the median to describe the distribution of the data?

A

The range of values represents the spread of the data and gives an idea of how far apart the values are from each other. The median, on the other hand, is the middle value in the data set and is not affected by extreme values.

Using the median and range of values allows us to describe the central tendency and variability of the data in a more robust and accurate way than relying solely on the mean. For example, if the data is positively skewed, the mean will be greater than the median, and if it is negatively skewed, the mean will be less than the median. In such cases, using the median and range of values can give a more accurate picture of the distribution.
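
A small Python illustration with made-up, positively skewed data (one extreme value in the right tail):

```python
import statistics

# Hypothetical incomes in thousands; 250 is an extreme value
incomes = [30, 32, 35, 36, 38, 40, 250]

mean_income = statistics.mean(incomes)      # pulled toward the long right tail
median_income = statistics.median(incomes)  # middle value, unaffected by 250
value_range = (min(incomes), max(incomes))

print(round(mean_income, 2), median_income, value_range)
```

The mean (about 65.9) is larger than every value except the outlier, while the median (36) still reflects a typical value.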

33
Q

The median is not affected by extreme values, true or false?

A

True.

34
Q

What are measures of association?

A

Measures of association are statistical techniques used to describe and quantify the relationship between two or more variables. These measures help determine the strength and direction of the association between variables. There are different types of measures of association, including correlation coefficients, contingency coefficients, and regression coefficients.

35
Q

A symmetrical distribution of numerical data that is used in most statistical tests; it is usually used if the sample size is >100.

A

Normal distribution.

36
Q

A probability distribution similar to the standard normal distribution, used to test hypotheses involving numerical data for small sample sizes (N<30).

A

T-distribution.

37
Q

The distribution is not normal, so non-parametric methods are used.

A

Skewed distribution.

38
Q

Explain correlation coefficients and in particular Pearson’s correlation coefficient.

A

A measure of association. Correlation coefficients are used to describe the strength and direction of the relationship between two continuous variables.

Pearson’s correlation is used with interval data and reflects the linear relationship between two variables.

+1 = perfect positive correlation (linear relationship)
-1 = perfect negative correlation (linear relationship)
0 = no linear relationship
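
The three reference values can be checked with a minimal hand-written implementation. `pearson_r` is an illustrative helper and the data are invented:

```python
import math

def pearson_r(x, y):
    """Pearson's r: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])  # y rises with x in a straight line
r_negative = pearson_r(x, [10, 8, 6, 4, 2])  # y falls with x in a straight line

# y = x squared on a symmetric range: clearly related, but not linearly
r_none = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_positive, r_negative, r_none)
```

The last case shows why 0 means "no linear relationship" rather than "no relationship": the variables are perfectly related by a curve, yet r is 0.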

39
Q

Explain Spearman Rank Correlation (what is it?).

A

A measure of association. It is a non-parametric measure of the strength and direction of the association between two variables. It is used when the variables are measured on an ordinal or continuous scale and the assumption of normality or linearity is violated.
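
One common way to compute it is to replace each value by its rank (averaging tied ranks) and then apply Pearson's formula to the ranks. A minimal sketch with illustrative helper names and invented data:

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Spearman's rho as Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Monotonic but non-linear relationship: y is x cubed
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Because the ranks of x and y match exactly, rho is 1 even though the relationship is not a straight line, while Pearson's r on the raw values is below 1.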

40
Q

Are odds ratio and relative risk measures of association?

A

Yes, both are considered measures of association, as they measure the relationship between two categorical variables.
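
Both can be computed from a 2x2 table of exposure by outcome. The counts below are invented for illustration:

```python
# Hypothetical 2x2 table
#              outcome   no outcome
# exposed        a=30       b=70
# unexposed      c=10       d=90
a, b, c, d = 30, 70, 10, 90

risk_exposed = a / (a + b)      # 30 of 100 exposed people had the outcome
risk_unexposed = c / (c + d)    # 10 of 100 unexposed people had the outcome
relative_risk = risk_exposed / risk_unexposed

odds_exposed = a / b            # odds of the outcome among the exposed
odds_unexposed = c / d          # odds of the outcome among the unexposed
odds_ratio = odds_exposed / odds_unexposed

print(round(relative_risk, 2), round(odds_ratio, 2))
```

In this table the relative risk is about 3.0 and the odds ratio about 3.86. A value of 1 for either measure would indicate no association.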

41
Q

Explain ANOVA.

A

ANOVA stands for analysis of variance. It is a statistical method for comparing the means of two or more groups or populations. ANOVA tests whether the means of the groups are significantly different from each other, based on the variation in the data.

42
Q

_____________ is commonly used when comparing the means of more than two groups, and is an extension of the t-test for two groups. __________ can be used with both categorical and continuous variables, and can also be used to determine whether there is an interaction effect between two or more variables.

A

ANOVA

43
Q

When you do ANOVA, you get a test statistic called the ______________. This _____________ is associated with a particular p-value, which tells you whether the means are statistically different.

A

F statistic.

44
Q

The basic idea behind ANOVA is to divide the total variation in the data into two components: _________________ and __________________.

A

Variation between groups and variation within groups.

If the variation between groups is sufficiently greater than the variation within groups, then there is evidence that the means of the groups are significantly different from each other.

45
Q

Explain the F-statistic.

A

ANOVA produces an F-statistic, which is calculated by dividing the variation between groups by the variation within groups. If the F-statistic is large enough, indicating that the variation between groups is significantly larger than the variation within groups, then we reject the null hypothesis of no difference between the groups.
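
A minimal sketch of the one-way ANOVA calculation, assuming three small made-up groups. `f_statistic` is an illustrative helper; looking up the p-value for the resulting F is left out because it needs the F-distribution, which the Python standard library does not provide:

```python
def f_statistic(groups):
    """One-way ANOVA F: mean square between groups / mean square within groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation between groups: how far each group mean is from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation within groups: how far each value is from its own group mean
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]  # hypothetical scores
f_value = f_statistic(groups)
print(f_value)
```

Here the between-group sum of squares is 24 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, giving F = 12 / 1 = 12.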

46
Q

If the populations from which the data to be analyzed by ANOVA are drawn violate one or more of the assumptions, the results of the analysis may be incorrect or misleading. What are these assumptions?

A
  • Independence: measures are independent of each other (no association), if not true then one-way ANOVA is not appropriate.
  • Normality: measures are normally distributed, if not true then ANOVA may not be able to detect a difference when it exists.
47
Q

The __________ the sample, the less variance and the more power to detect differences.

A

larger

48
Q

____________________ determines whether a null hypothesis can be rejected.

A

Statistical significance.

49
Q

The researcher calculates a p-value (based on the distribution of values), which is the probability of observing an effect at least as large as the one found, given that the ______________ is true.

A

Null hypothesis.

50
Q

The null hypothesis is rejected if the p-value is __________ than the significance level.

A

less

51
Q

If a research question states:
“Do adults 65 and older who receive occupational therapy services on restorative care units have higher levels of leisure activity participation post-discharge than those who do not receive these services?”

What might the null hypothesis and alternative hypothesis be?

A

H0 (null hypothesis): There is no difference in leisure activities in the first month post-discharge between older adults who receive OT services and those who do not.

HA (alternative hypothesis): Older adults who receive OT services will participate in significantly more leisure activities post-discharge in the first month than those who do not receive OT services.

52
Q

The lower the p-value, the ______________ the statistical significance of the observed difference.

A

Greater.

53
Q

A _________ is a measure of the probability that an observed difference could have occurred just by random chance.

A

p-value

54
Q

P-value greater than or equal to 0.05

A

Not enough evidence to reject the null hypothesis, statistically non-significant.

55
Q

P-value less than 0.05

A

Enough evidence to reject the null hypothesis, statistically significant.

56
Q

Sample size needed to test a hypothesis depends on four factors:

A
  1. Expected difference between the two groups
  2. Variability of the data
  3. Power of the study (usually at least 80%)
  4. Level of significance accepted (usually 5%)
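
As a rough illustration of how these four factors combine, the sketch below uses the standard normal-approximation formula for comparing two group means with a two-sided test, n per group = 2 (z_alpha/2 + z_power)^2 * sd^2 / difference^2. The formula choice, the function name, and the example numbers are assumptions for illustration, not taken from the course material:

```python
import math
from statistics import NormalDist

def n_per_group(expected_diff, sd, power=0.80, alpha=0.05):
    """Approximate sample size per group for a two-sided comparison of two means."""
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the significance level
    z_power = z.inv_cdf(power)            # value corresponding to the desired power
    n = 2 * (z_alpha + z_power) ** 2 * sd ** 2 / expected_diff ** 2
    return math.ceil(n)                   # round up to whole participants

# Hypothetical study: expect a 5-point difference, standard deviation of 10
n = n_per_group(expected_diff=5, sd=10)
print(n)
```

With these inputs the approximation gives 63 participants per group. A smaller expected difference, more variable data, higher power, or a stricter significance level all increase the required sample size.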
57
Q

In research the sample size should be large enough to avoid Type I and Type II error, explain these types of error.

A

Type I error is the incorrect rejection of a true null hypothesis.

Type II error is the failure to reject a false null hypothesis.

58
Q

p-value=0.07

Is this result statistically significant?

A

No, the p-value is larger than 0.05.