Statistics Flashcards

1
Q

What purpose does a basic knowledge of statistics serve for urologists?

A

It allows urologists to make informed decisions about treatments, read the medical literature, and conduct their own research and quality improvement studies.

2
Q

Figure 1: Example of Shrinking Confidence Intervals with Increasing Sample Size

A
3
Q

Explain the relationship between sample size and the estimation of the population.

A

As the sample size increases, the sample becomes a better estimate of the population. The larger the sample size, the more precise the statistics and the smaller the sampling error.

4
Q

Define “standard error.”

A

Standard error represents the standard amount of error expected in a sample given the sample size. It’s a measure of sampling error.
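
A minimal sketch of the idea, assuming the most common case: the standard error of the mean, computed as the sample standard deviation divided by the square root of the sample size. The card gives no formula, so both the formula choice and the scores below are assumptions made for illustration.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    return statistics.stdev(values) / math.sqrt(n)


# Hypothetical scores, made up for illustration only.
scores = [18, 22, 25, 20, 27, 24, 19, 23, 26]
print(f"n={len(scores)}, standard error={standard_error(scores):.2f}")
```

Because n sits in the denominator, the same spread of scores yields a smaller standard error as more subjects are added.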

5
Q

What do confidence intervals represent?

A

A confidence interval is a range of values around a sample estimate that is likely to contain the true population value. For a 95% confidence interval, if many samples were drawn and an interval calculated from each, about 95% of those intervals would contain the population value.

6
Q

How do the widths of confidence intervals change as sample size changes?

A

Smaller samples produce wider confidence intervals due to increased sampling error. As the sample size increases, the width of the confidence intervals decreases.
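
The shrinking width can be sketched numerically. This is an illustration under stated assumptions, not the curriculum's calculation: it uses a normal approximation (mean plus or minus about 1.96 standard errors), and the mean of 50 and standard deviation of 10 are hypothetical values.

```python
import math
from statistics import NormalDist


def ci_95(mean, sd, n):
    """Approximate 95% CI for a mean using the normal distribution."""
    z = NormalDist().inv_cdf(0.975)  # about 1.96
    margin = z * sd / math.sqrt(n)   # z times the standard error
    return mean - margin, mean + margin


# Hypothetical mean and SD; only the sample size changes.
for n in (10, 100, 1000):
    low, high = ci_95(mean=50.0, sd=10.0, n=n)
    print(f"n={n:5d}  95% CI: {low:.1f} to {high:.1f}  width={high - low:.1f}")
```

Each tenfold increase in sample size narrows the interval by a factor of about 3.2 (the square root of 10).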

7
Q

Describe the significance of the confidence interval in relation to the IQ example provided.

A

In the first scenario with n=9, the confidence interval ranged from 99 to 119 (95% CI: 99-119). Since this range contains 100, there’s no statistical difference. However, with a larger sample of 100 students, the 95% CI shrank to 107-111, indicating a significant difference from the average IQ of 100.

8
Q

Explain the relationship between a 95% confidence interval and its associated p-value.

A

A 95% confidence interval corresponds to a significance level of p < 0.05: a value that falls outside the 95% CI differs from the sample estimate at p < 0.05, meaning there is less than a 5% chance that it belongs to that specific distribution.

9
Q

What’s the primary issue with small sample sizes when it comes to statistical significance?

A

Small sample sizes can produce confidence intervals too wide to detect differences that are actually present, leading to a Type II error (a false negative).

10
Q

Define “statistical power” and explain its importance in research.

A

Statistical power is the probability of finding statistical significance when a true difference is present. The power of a study is primarily determined by its sample size: larger studies can detect smaller differences because their confidence intervals are narrower.

11
Q

How does the concept of “statistical power” relate to sample size and the resources needed for research?

A

Larger studies have the power to detect small differences since the width of their confidence intervals will be narrow. However, large sample sizes can be expensive, so researchers plan the number of subjects needed before starting a study, considering the predicted effect of the study and the desired width of the confidence interval.

12
Q

What is the chance of finding statistical significance for an appropriately powered study?

A

By convention, an appropriately powered study has an 80% chance of finding statistical significance if the predicted difference is actually present.
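
As a rough sketch of how the 80% figure feeds into planning, the function below estimates subjects per group for comparing two independent means. It is not taken from the curriculum: it assumes a two-sided test at alpha = 0.05, a normal approximation, and an effect size expressed as a standardized difference (difference in means divided by the standard deviation). The name `n_per_group` is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate subjects per group for a two-group comparison of means.

    Normal-approximation formula: n = 2 * ((z_alpha + z_power) / d) ** 2,
    where d is the standardized effect size.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)          # about 0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller predicted effects require more subjects for the same power.
for d in (0.8, 0.5, 0.2):
    print(f"effect size {d}: about {n_per_group(d)} subjects per group")
```

Dedicated power-analysis software typically uses the t distribution and may report slightly larger numbers.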

13
Q

Define the term “dependent variable” (DV).

A

The variable being measured for change, influenced by the treatment or other variables in the study, also called the outcome variable.

14
Q

What is the “independent” variable in research?

A

It’s either the variable manipulated by the researcher (e.g., group assignment in experimental designs) or the variable(s) believed to influence the dependent variable.

15
Q

In a study comparing a new drug vs. a placebo for treating erectile dysfunction, identify the independent and dependent variables.

A

Independent Variable: Group assignment (new treatment vs. placebo). Dependent Variable: Measure of erectile function.

16
Q

In a study analyzing the impact of age and diabetes on urinary flow rate, specify the independent and dependent variables.

A

Independent Variables: Age and diabetes. Dependent Variable: Urinary flow rates.

17
Q

What is an “operational” definition in research?

A

It’s how a researcher chooses to define or measure the variables in the study.

18
Q

How can erectile function be measured on a continuum?

A

Using the Erectile Function Domain (EFD) of the International Index of Erectile Function (IIEF), where lower scores indicate poorer function and higher scores indicate better function.

19
Q

When using the EFD, how can erectile function be operationalized as a continuous variable?

A

By using the EFD score itself. EFD scores range from 6 to 30.

20
Q

When using the EFD, how can erectile function be operationalized as a binary variable?

A

By determining if a subject has ED or not based on the EFD. A score less than 26 indicates ED, whereas a score equal to or greater than 26 means no ED.
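
The cutoff in this answer can be sketched as a small function. The 6-30 range and the threshold of 26 come from the cards; the function name `has_ed` and the example scores are hypothetical.

```python
def has_ed(efd_score):
    """Binary operationalization: True means ED, False means no ED.

    Uses the cutoff stated on the card (score below 26 indicates ED)
    and the score range stated on the card (6 to 30).
    """
    if not 6 <= efd_score <= 30:
        raise ValueError("EFD score must be between 6 and 30")
    return efd_score < 26


# Hypothetical scores: the same data viewed as continuous and as binary.
scores = [12, 25, 26, 30]
print([(score, has_ed(score)) for score in scores])
```

The continuous score keeps the full range of information, while the binary version reduces each subject to one of two categories.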

21
Q

What does it mean when a variable is measured as a binary variable?

A

The variable is defined in a way where only two outcomes are possible (e.g., a condition exists or it doesn’t).

22
Q

How does the measurement of the dependent variable influence the choice of statistical tests?

A

The way the dependent variable is measured will decide the type of statistical tests used to analyze the data.

23
Q

Which statistical measures are associated with continuous variables?

A

Continuous variables produce descriptive statistics such as means, standard deviations, and variances.

24
Q

What are the statistical tests appropriate for analyzing data with a continuous dependent variable?

A

T-tests, Analysis of Variance (ANOVA), correlation coefficients, and linear multiple regression.

25
Q

Which descriptive statistics are produced by binary variables?

A

Frequency (or the number of subjects in each category), often reported as percent.

26
Q

Can binary outcomes report statistics like means, standard deviations, or variance? Why or why not?

A

No. Binary outcomes are summarized as frequencies and percentages, so means, standard deviations, and variances are not reported for them.

27
Q

List the statistical procedures appropriate when the dependent variable is measured as a binary variable.

A

Chi-square, Fisher’s exact test, point-biserial correlation coefficients, and logistic regression.

28
Q

Why might it be crucial for a urologist to understand the difference between continuous and binary variables in research?

A

Understanding the difference allows the urologist to properly interpret research findings, choose appropriate statistical tests, and ensure the validity of the results in clinical decision-making.

29
Q

Fig 2. Decision Tree for Statistical Tests

A
30
Q

What are the two basic types of study designs discussed in the AUA Core Curriculum Statistics?

A

Group Designs and Correlational Designs.

31
Q

In which type of study design is the researcher comparing two or more groups to assess differences in the dependent variable?

A

Group Designs.

32
Q

What type of design is considered an “experimental design” and allows researchers to conclude cause and effect due to random assignment and controlling for important elements?

A

Randomized controlled trials (RCTs).

33
Q

What are “Quasi-experimental designs”?

A

These are group designs that compare two or more groups but do not use random assignment or control for important elements. They cannot conclude cause and effect.

34
Q

Describe “repeated-measures designs” as mentioned in the AUA Core Curriculum.

A

These designs follow one group over time and assess it two or more times. An example is a pre-post design, where one group is evaluated before and after an intervention. These designs can’t establish cause and effect because there’s no separate control group.

35
Q

What is the main focus of Correlational Designs?

A

The relationship between two or more variables within a group, without comparing groups.

36
Q

Provide a simple example of a correlational design.

A

Finding the association between age and erectile function.

37
Q

How can correlational designs explore multiple independent variables?

A

They can investigate the relationship between multiple independent variables and one dependent variable, such as how age and diabetes relate to erectile function.

38
Q

When the dependent variable is assessed as a continuous variable in a group design, which two primary statistical tests should be considered?

A

T-tests and Analysis of Variance (ANOVA).

39
Q

What determines the use of a t-test over an ANOVA?

A

The number of groups or repeated measures being compared. Use a t-test for two groups or two repeated measures, and an ANOVA for three or more.

40
Q

What are the non-parametric alternatives to t-tests?

A

Mann-Whitney U test (for independent measures) and Wilcoxon signed-rank test (for related samples).

41
Q

When would you use the independent measures t-test?

A

When comparing means from two independent or different groups.

42
Q

Describe the scenario where a related samples t-test would be appropriate.

A

For a repeated-measures research design, like examining changes in a group before and after an intervention.

43
Q

What’s the non-parametric test to use when the assumption for a related samples t-test isn’t met?

A

Wilcoxon signed-rank test.

44
Q

In what situations is an ANOVA used?

A

When a study is comparing three or more groups or repeated-measures.

45
Q

Which non-parametric test is a substitute for an ANOVA when normality cannot be assumed?

A

Kruskal-Wallis one-way analysis of variance.

46
Q

How is a repeated-measures ANOVA different from the standard independent-measures ANOVA?

A

Repeated-measures ANOVA deals with three or more repeated measures on the same sample, often used in a pre-post design with additional follow-ups.
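
The test-selection rules on the preceding cards (continuous dependent variable, group designs) can be collected into one lookup. This is a study aid sketched from the cards, not the curriculum's own decision tree; the function and parameter names are hypothetical, and the one branch the cards do not cover is flagged rather than filled in.

```python
def choose_test(n_conditions, repeated_measures=False, parametric=True):
    """Pick a test for a continuous dependent variable in a group design.

    n_conditions: number of groups, or number of repeated measurements.
    repeated_measures: True if the same subjects are measured each time.
    parametric: False if assumptions such as normality are not met.
    """
    if n_conditions < 2:
        raise ValueError("need at least two groups or measurements")
    if n_conditions == 2:
        if parametric:
            return "related samples t-test" if repeated_measures else "independent measures t-test"
        return "Wilcoxon signed-rank test" if repeated_measures else "Mann-Whitney U test"
    if parametric:
        return "repeated-measures ANOVA" if repeated_measures else "ANOVA"
    if repeated_measures:
        # The cards name no non-parametric option for this case.
        return "not covered by these cards"
    return "Kruskal-Wallis one-way analysis of variance"


print(choose_test(2))                          # two independent groups
print(choose_test(2, repeated_measures=True))  # pre-post design
print(choose_test(3, parametric=False))        # three groups, non-normal data
```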

47
Q

What is a multifactorial ANOVA?

A

An ANOVA that incorporates more than one independent variable, with each independent variable labeled as a “factor.”

48
Q

In a 3 × 2 multifactorial ANOVA, what do “3” and “2” represent?

A

“3” represents the three levels of one independent variable (e.g., treatment group assignment), and “2” represents two levels of another independent variable (e.g., presence of diabetes).

49
Q

What are the two main effects produced by a multifactorial ANOVA?

A

The determination of statistical significance between treatment groups and the determination of statistical significance based on another factor (e.g., diabetes yes/no).

50
Q

Define “interaction” in the context of a multifactorial ANOVA.

A

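An interaction occurs when the effect of one independent variable on the dependent variable depends on the level of another independent variable (e.g., the treatment effect differs between subjects with and without diabetes).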

51
Q

What is the Pearson correlation coefficient used for?

A

The Pearson correlation coefficient (“r”) is used to assess the relationship or association between two variables.

52
Q

What is the range of values a correlation coefficient can have?

A

The value of a correlation coefficient ranges from -1 to +1. Zero indicates no linear relationship, while -1 and +1 indicate a perfect negative or positive linear relationship.

53
Q

What is the interpretation guide for the strength of the Pearson correlation coefficient?

A

Small: r=0.10; Medium: r=0.30; Large: r=0.50.

54
Q

How is the coefficient of determination obtained and what does it represent?

A

By squaring the correlation coefficient (r²). Multiplied by 100, it gives the percentage of variance in one variable that is explained by the other.
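
A short worked example of the squaring step, applied to the small, medium, and large benchmarks from the earlier card. The function name is hypothetical.

```python
def percent_variance_explained(r):
    """Coefficient of determination (r squared), expressed as a percentage."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return (r ** 2) * 100


for label, r in (("small", 0.10), ("medium", 0.30), ("large", 0.50)):
    pct = percent_variance_explained(r)
    print(f"{label}: r={r:.2f} -> {pct:.0f}% of variance explained")
```

Even a "large" correlation of 0.50 leaves three quarters of the variance unexplained.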

55
Q

When should you use Spearman’s rank correlation coefficient instead of the Pearson correlation coefficient?

A

When the variables are measured in rank order or when underlying parametric assumptions for the Pearson correlation coefficient (e.g., normal distribution) are not met.

56
Q

What is the primary difference between multiple linear regression and logistic regression?

A

In multiple linear regression, the dependent/outcome variable is continuous. In logistic regression, the dependent/outcome variable is binary.

57
Q

For regression models, what is the general rule of thumb regarding the number of subjects per predictor variable?

A

There should be at least 15 subjects per predictor variable.

58
Q

What is the main aim of regression in assessing the association between age and vascular co-morbidities on erectile function?

A

Regression aims to estimate the independent contribution of age and the independent contribution of vascular co-morbidities to erectile function, while controlling for their mutual association.

59
Q

How are multifactorial ANOVA and multiple regressions related?

A

Both are multivariable models. ANOVA tests are special cases of multiple regression. ANOVA models are often the first choice for analyzing data from group designs, but analyses that would be expected to use ANOVA are sometimes run with multiple regression instead.

60
Q

When is the Chi-square test used in group designs concerning a binary dependent variable?

A

The Chi-square test is used when exploring differences between groups where the dependent variable is binary, e.g., ED vs. no ED, to indicate significant differences in percentages between groups.

61
Q

Under which condition should one use Fisher’s exact test instead of the Chi-square test?

A

Fisher’s exact test should be used when the number of subjects in any subgroup (or cell) is below five.
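
A minimal sketch of this rule for a 2 × 2 table, under stated assumptions. It follows the card in checking observed cell counts; it computes the Pearson chi-square statistic without a continuity correction; and it obtains the p-value from the identity for one degree of freedom, p = erfc(sqrt(chi2 / 2)). The counts are hypothetical, and Fisher's exact test itself is not implemented here.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2 x 2 table (df = 1)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))  # valid for df = 1 only
    return chi2, p_value


def suggested_test(table):
    """Apply the card's rule: any cell below five points to Fisher's exact test."""
    smallest_cell = min(min(row) for row in table)
    return "Fisher's exact test" if smallest_cell < 5 else "Chi-square test"


# Hypothetical counts. Rows: treatment, placebo. Columns: ED, no ED.
table = [[15, 35], [30, 20]]
chi2, p = chi_square_2x2(table)
print(suggested_test(table), f"chi2={chi2:.2f}", f"p={p:.4f}")
```

In practice a statistics package would handle both tests, along with tables larger than 2 × 2.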

62
Q

For group designs with a binary dependent variable, which statistical tests remain appropriate as the number of groups or repeated measures increases?

A

Chi-square or Fisher’s exact test.

63
Q

What is the purpose of logistic regression in group designs with binary dependent variables?

A

Logistic regression is used when considering multiple independent variables in relation to a binary dependent variable. It is similar to multiple regression but focuses on a binary outcome variable.

64
Q

What is a point-biserial correlation, and when is it used?

A

It is a correlation coefficient used when the independent variable is continuous and the dependent variable is binary, e.g., the relationship between age and erectile function (measured as ED vs. no ED). It is similar to Pearson correlation in interpretation.

65
Q

In logistic regression analysis concerning a binary dependent variable, how would you control for the relationship between multiple independent variables?

A

The regression analysis will control for relationships between the independent variables and provide the association between each independent/predictor variable and the dependent variable.