Advanced Statistics Flashcards

1
Q

Survival analysis is used when:

A
  1. considering long-term effects
  2. understanding prognosis and treatment effectiveness.
2
Q

All questions regarding survival analysis will have a ______ component.

A

Time

3
Q

What type of distribution does survival time data typically have?

A

positive skewness

4
Q

What are the characteristics of censored observations?

A
  • Include subjects who have not reached the terminal event by the end of the study
  • Lead to incomplete data
  • Lead to underestimation of event occurrences.
5
Q

What are the important characteristics of the Kaplan-Meier curve?

A
  1. Analyzes the probability of an event at specific time intervals
  2. Generates a step function whose survival estimate changes each time a patient reaches a terminal event
  3. Accounts for censored observations
  4. Often reported with confidence intervals to better estimate population parameters
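The step-function behavior described above can be sketched in a few lines of Python. This is a minimal illustration of the product-limit (Kaplan-Meier) estimate, not a replacement for a statistics package. The function name `kaplan_meier` and the six-subject data set are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the terminal event was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share this time point.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # The estimate only steps down when an event occurs.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without causing a step.
        at_risk -= removed
    return curve


# Hypothetical data: 6 subjects, two of them censored (event = 0).
times = [2, 3, 3, 5, 8, 9]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored subjects still count in the risk set up to their last follow-up time, which is how the estimate accounts for them.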
6
Q

What is the median survival time?

A

the point at which the cumulative survival function = 0.5 (50th percentile)
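As a rough sketch, the median can be read off a K-M step function as the first time the estimate drops to 0.5 or below. The helper name `median_survival` and the curve values are hypothetical.

```python
def median_survival(curve):
    """curve: [(time, survival_probability)] in time order, e.g. from a K-M estimate."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # the curve never reaches 0.5: median not reached during follow-up


# Hypothetical K-M curve.
curve = [(2, 0.83), (3, 0.67), (5, 0.44), (9, 0.0)]
median = median_survival(curve)
print(median)
```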

7
Q

The mean survival time is estimated as:

A

the area under the K-M curve
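Because the K-M curve is a step function, its area is a sum of rectangles. The sketch below assumes a hypothetical curve and a cutoff `t_max`. If the curve has not reached 0 by `t_max`, the result is a mean restricted to that follow-up window.

```python
def area_under_km(curve, t_max):
    """Area under a K-M step function from time 0 to t_max.

    curve: [(time, survival_probability)] in time order; S(t) = 1 before the first step.
    """
    area = 0.0
    prev_t, prev_s = 0.0, 1.0
    for t, s in curve:
        if t > t_max:
            break
        area += prev_s * (t - prev_t)  # rectangle up to the next step
        prev_t, prev_s = t, s
    area += prev_s * (t_max - prev_t)  # last rectangle up to the cutoff
    return area


# Hypothetical K-M curve that reaches 0 at t = 6.
curve = [(2, 0.8), (4, 0.5), (6, 0.0)]
mean = area_under_km(curve, t_max=6)
print(round(mean, 2))
```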

8
Q

What is a hazard rate? How is it estimated?

A

how rapidly a subject who has survived to a given time will experience the terminal event
estimated from the slope of a line fitted to the K-M curve (a steeper decline means a higher hazard)
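For reference, the usual textbook definition can be written as below. The symbols are not from the cards and are assumed here: T is the time to the terminal event, S(t) is the survival function, and h(t) is the hazard rate.

```latex
h(t) \;=\; \lim_{\Delta t \to 0}
      \frac{P\left(t \le T < t + \Delta t \mid T \ge t\right)}{\Delta t}
     \;=\; -\frac{d}{dt} \ln S(t)
```

Under this definition the hazard is the negative slope of the log of the survival curve, which is why a steeper drop in the curve corresponds to a higher hazard.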

9
Q

What is a hazard ratio?

A

compares the hazard (the rate at which a particular terminal event happens) between 2 groups.

10
Q

The hazard ratio will tell us if a group is:

A

reaching the terminal event faster (HR > 1), slower (HR < 1), or at the same rate (HR = 1) as the comparison group.

11
Q

When is a log-rank test at its best?

A

When the hazards are proportional (the hazard ratio between groups stays constant over time)

12
Q

What are the limitations of the log-rank test?

A
  1. can only test one variable at a time
  2. cannot control for confounders or other risk factors
  3. cannot include interaction terms
13
Q

What would be a null and alternative hypothesis for survival analyses?

A

null: the groups have identical survival distribution curves
alternative: the groups have different survival distribution curves
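One common way to test these hypotheses for two groups is the log-rank test. The sketch below computes its chi-square statistic (1 degree of freedom) from observed minus expected events in group A. The function name `logrank_chi2` and both data sets are hypothetical, and the code assumes the groups are not degenerate (the variance must be non-zero).

```python
def logrank_chi2(times_a, events_a, times_b, events_b):
    """Chi-square statistic (1 df) for the two-group log-rank test.

    times_*  : lists of follow-up times
    events_* : lists of 1 (event observed) or 0 (censored)
    """
    all_times = times_a + times_b
    all_events = events_a + events_b
    event_times = sorted({t for t, e in zip(all_times, all_events) if e == 1})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e == 1)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        # Expected events in A if both groups share one survival curve.
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance


# Hypothetical data: group A fails early, group B fails late, no censoring.
chi2 = logrank_chi2([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
print(round(chi2, 2))
```

A statistic above 3.84 (the 0.05 critical value of chi-square with 1 degree of freedom) would lead to rejecting the null hypothesis of identical curves.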

14
Q

The Cox proportional hazards model is better than the log-rank test if you want to do what?

A

control for confounding variables

15
Q

What is the goal of the factor analysis?

A
  1. Seeks to understand whether and to what extent items from a survey or scale reflect specific constructs
  2. Provides information about reliability, item quality, and construct validity
  3. Is highly sensitive in identifying problematic items and assessing the number of factors
16
Q

What are the 2 types of factor analysis?

A

exploratory factor analysis
confirmatory factor analysis

17
Q

Factor analysis allows us to ___________ complex variables.

A

simplify

18
Q

Factor analysis functions to find _____________ between variables.

A

similarities

19
Q

What are the 4 uses and goals of exploratory factor analysis?

A
  1. explore the possible underlying factor structure of a set of observed variables
  2. identify the underlying factor structure
  3. describe and identify the number of factors
  4. provide a means of explaining variation amongst variables.
20
Q

What assumptions must we make to run an exploratory factor analysis?

A
  1. continuous data
  2. normal distribution
  3. sample size is large enough (greater than 200, with more than 3 observations per variable)
  4. correlation between variables is greater than 0.2
21
Q

What are the limitations of exploratory factor analysis?

A
  1. subjective analysis
  2. variables may not always be generalizable to the population
  3. no causal inferences can be made
22
Q

A confirmatory factor analysis will test the hypothesis that a relationship exists between:

A

the observed variables and an identified factor.

23
Q

What are the limitations of a confirmatory factor analysis?

A

sample size must be large
very sensitive to outliers and missing data.

24
Q

What is a cluster analysis?

A

an exploratory data analysis tool for organizing observed data into meaningful clusters based on combinations of variables.

25
Q

What is agglomerative hierarchical clustering?

A

bottom-up: starts with each data point as its own cluster and merges it with others to form larger groups.
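A minimal sketch of the bottom-up idea on one-dimensional data follows. It assumes single linkage (distance between the closest members of two clusters), which is only one of several possible merge rules. The function name `agglomerative` and the data are hypothetical.

```python
def agglomerative(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain (single linkage)."""
    clusters = [[p] for p in points]  # every point starts as its own cluster
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters


# Hypothetical data with three visually obvious groups.
groups = agglomerative([1.0, 1.2, 5.0, 5.1, 9.0], n_clusters=3)
print(groups)
```

Where to stop merging (here `n_clusters=3`) is a choice made by the analyst.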

26
Q

What is divisive hierarchical clustering?

A

top-down: starts with the whole data set and partitions it step by step.

27
Q

What are the limitations of hierarchical clustering?

A

arbitrary decisions (subjective)
consideration of data types
misinterpretation possible

28
Q

What is non-hierarchical clustering?

A

data points are grouped into non-overlapping subsets (clusters) such that each object is in exactly one cluster.

29
Q

What is the most widely used clustering?

A

K-means clustering

30
Q

What are the limitations of K-means clustering?

A

subjective (the number of clusters, K, is chosen by the analyst)

31
Q

How does K-means clustering work?

A

data is classified into K clusters, and each data point is assigned to the cluster with the nearest mean.
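The assign-then-update loop can be sketched on one-dimensional data as follows. This is a minimal illustration, assuming fixed starting centroids and a fixed number of iterations. The function name `k_means` and the data are hypothetical.

```python
def k_means(points, centroids, n_iter=10):
    """Alternate between assigning points to the nearest mean and updating the means."""
    for _ in range(n_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assignment step: index of the nearest current mean.
            nearest = min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            clusters[nearest].append(p)
        # Update step: each mean moves to the average of its cluster
        # (an empty cluster keeps its previous mean).
        centroids = [
            sum(c) / len(c) if c else centroids[k]
            for k, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical data with K = 2 and arbitrary starting means.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means(points, centroids=[0.0, 5.0])
print(centroids, clusters)
```

Both K and the starting means are inputs chosen by the analyst, and different choices can produce different clusters.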

32
Q

What type of cluster analysis would you use if you have categorical and continuous data?

A

2-step clustering or a hybrid approach

33
Q

What are some benefits of 2-step clustering?

A
  1. allows for the ability to create clusters on both categorical and continuous variables.
  2. number of clusters is automatically determined
  3. makes the analysis of a large data set very efficient
34
Q

The cluster quality validation index measures:

A

how well the general goal of clustering is achieved.
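The card does not name a specific index. One widely used measure of this kind is the silhouette coefficient, shown here as an assumption rather than as the index the card refers to. For each point it compares the mean distance to its own cluster (cohesion) with the mean distance to the nearest other cluster (separation). The function name `silhouette` and the data are hypothetical.

```python
def silhouette(clusters):
    """Mean silhouette coefficient for 1-D clusters.

    Assumes at least two clusters, each holding at least two points.
    """
    scores = []
    for i, own in enumerate(clusters):
        for idx, p in enumerate(own):
            others_in_own = own[:idx] + own[idx + 1:]
            # a: mean distance to the other members of the same cluster.
            a = sum(abs(p - q) for q in others_in_own) / len(others_in_own)
            # b: mean distance to the nearest other cluster.
            b = min(
                sum(abs(p - q) for q in other) / len(other)
                for j, other in enumerate(clusters)
                if j != i
            )
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical clusterings of the same kind of data: one well separated, one mixed up.
score = silhouette([[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]])
poor = silhouette([[1.0, 10.0], [2.0, 11.0]])
print(round(score, 2), round(poor, 2))
```

Values near 1 suggest compact, well-separated clusters. Values near 0 or below suggest overlapping or misassigned points.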
