Multivariate analyses Flashcards

1
Q

Multivariable analysis

A
  • used for data with one dependent outcome variable but more than one independent variable
  • multivariable analysis determines the relative contributions of different causes to a single event or outcome
2
Q

Multivariate analysis

A

-used for data with more than one dependent outcome variable as well as more than one independent variable

3
Q

Multiple regression

A

-used if both the dependent and independent variables consist of continuous data

4
Q

Logistic regression

A

-used if the dependent variable consists of dichotomous categorical data (two outcomes)

5
Q

Cox proportional hazards model

A

-used if the dependent variable also includes a time factor (e.g. a survival curve)

6
Q

Log-linear analysis

A

-used if the dependent variable consists of nominal categorical data (i.e. more than two outcomes)

7
Q

Analysis of variance (ANOVA)

A

-used to analyse a continuous dependent variable with categorical independent variables

8
Q

Analysis of covariance (ANCOVA)

A

-used if there are both categorical and continuous independent variables

9
Q

Path analysis

A
  • an extension of multiple regression
  • examines situations in which there are several final dependent variables with ‘chains’ of influence, i.e. variable A influences variable B, which in turn affects variable C
10
Q

Cluster analysis

A
  • a multivariate tool used to organise variables into relatively homogeneous groups or ‘clusters’
  • involves the generation of a similarity matrix
  • produces a dendrogram
11
Q

Canonical correlation

A
  • multivariate tool used to explore the relationship between two sets of variables
  • involves the computation of eigenvalues
12
Q

Discriminant function analysis

A
  • a multivariate technique used to detect which of several variables best discriminates between two or more groups
  • similar to multivariate analysis of variance (MANOVA)
13
Q

Factor analysis

A
  • refers to a set of statistical methods used to detect underlying patterns in the relationships among a number of observed variables
  • aims to identify whether the correlations between a set of multiple observed variables can be summarised in terms of a smaller number of underlying, latent, unobserved variables called ‘factors’
14
Q

Two main types of factor analysis

A
  1. exploratory factor analysis
  2. confirmatory factor analysis

15
Q

Exploratory factor analysis

A
  • used for the preliminary investigation of a set of multiple observed variables
  • doesn’t make assumptions about the compositions of underlying latent variables or factors
16
Q

Applications of exploratory factor analysis

A
  1. data reduction when multiple (over 25) variables have been measured
  2. classification of symptoms into meaningful concepts
  3. definition of subscales of new measures
17
Q

Confirmatory factor analysis

A
  • method for testing whether a specified factor structure remains valid with a new dataset
  • primarily used for assessing the construct validity of questionnaires or tests.
18
Q

Conducting a factor analysis

A
1. construct a correlation matrix
2. extraction
3. rotation
4. define the factors to be retained
5. labelling
19
Q

Extraction in factor analysis

A

-common factor analysis and principal components analysis are most frequently used

20
Q

Rotation

A

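-reorients the extracted factors so that each variable loads clearly on as few factors as possible, making the factors easier to interpret; the eigenvalues are then examined to decide which factors to retain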

21
Q

Eigenvalue

A

-the amount of total variance explained by each factor

22
Q

Kaiser rule

A

-only factors with eigenvalues greater than 1 are retained
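
The eigenvalue and Kaiser rule cards can be illustrated with a minimal sketch. The eigenvalues below are made-up illustrative numbers, not results from any real dataset, and the function names are hypothetical. The sketch assumes the eigenvalues come from a correlation matrix, in which case they sum to the number of observed variables.

```python
# Hypothetical eigenvalues for six observed variables (illustrative values only).
eigenvalues = [2.8, 1.4, 0.8, 0.5, 0.3, 0.2]


def kaiser_retain(values):
    """Return the 1-based factor numbers whose eigenvalue is greater than 1."""
    return [i for i, ev in enumerate(values, start=1) if ev > 1]


def proportion_of_variance(values):
    """Return each factor's share of the total variance."""
    total = sum(values)
    return [ev / total for ev in values]


retained = kaiser_retain(eigenvalues)
shares = proportion_of_variance(eigenvalues)

print("Factors retained under the Kaiser rule:", retained)
for number, share in zip(retained, shares):
    print(f"Factor {number} explains {share:.1%} of the total variance")
```

In this made-up example only the first two factors pass the cut-off of 1.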

23
Q

Scree plot

A
  • plot the component numbers against their eigenvalues
  • choose the number of components at the ‘elbow’, i.e. the bend before the plot levels off on the right side

24
Q

Labelling

A

-there is a general consensus that variables with a factor loading greater than or equal to 0.40 are probably making a significant contribution to that factor, in contrast to those with smaller factor loadings

25
Q

Defining the factors to be retained

A
  • a step in factor analysis
  • eigenvalues are used to work out which factors to keep

26
Q

Path analysis

A
  • refers to causal modelling and prediction beyond simple regression
  • independent variables are exogenous variables and dependent variables are endogenous
  • arrows display presumed causal relations
27
Q

Arrows in path analysis

A
  • a single-headed arrow points from a putative cause to its effect
  • a double-headed curved arrow indicates correlation only, with no presumed predictive or causal link
28
Q

Path coefficient

A
  • used in path analysis
  • indicates the direct effect of a variable assumed to be a cause on another variable assumed to be an effect
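
As a small worked sketch of how path coefficients combine along a chain: in a simple recursive model with standardised coefficients, the indirect effect of A on C through B is usually taken as the product of the coefficients along the path, and the total effect is the direct effect plus the indirect effect. The coefficient values and variable names below are hypothetical.

```python
# Hypothetical standardised path coefficients (illustrative values only).
p_ab = 0.5  # direct effect of A on B
p_bc = 0.4  # direct effect of B on C
p_ac = 0.2  # direct effect of A on C

# Indirect effect of A on C via B: product of the coefficients along the chain.
indirect_a_to_c = p_ab * p_bc

# Total effect of A on C: direct effect plus indirect effect.
total_a_to_c = p_ac + indirect_a_to_c

print(f"Indirect effect of A on C via B: {indirect_a_to_c:.2f}")
print(f"Total effect of A on C: {total_a_to_c:.2f}")
```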

29
Q

Stratification

A

-often used to control or analyse the effect of confounder variables

30
Q

Stratum

A

-a sub-group within a sample often defined by the presence or absence of a variable of interest

31
Q

Mantel-Haenszel procedure

A

-applies a weighting to each stratum to produce a summary score, giving an adjusted relative risk (RR)
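
The weighting can be sketched as follows for cohort-style data. This is a minimal illustration using the usual Mantel-Haenszel risk-ratio estimator; the counts are invented and the function names are hypothetical. Each stratum is given as (exposed cases, exposed total, unexposed cases, unexposed total).

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel adjusted relative risk across strata.

    strata: list of (exposed_cases, exposed_total, unexposed_cases, unexposed_total)
    """
    numerator = 0.0
    denominator = 0.0
    for a, n1, c, n0 in strata:
        n = n1 + n0  # stratum size
        numerator += a * n0 / n
        denominator += c * n1 / n
    return numerator / denominator


def crude_rr(strata):
    """Unadjusted relative risk, ignoring the stratifying variable."""
    a = sum(s[0] for s in strata)
    n1 = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    n0 = sum(s[3] for s in strata)
    return (a / n1) / (c / n0)


# Invented counts: two strata of a suspected confounder (e.g. younger, older).
strata = [
    (10, 100, 5, 100),   # stratum-specific RR = 0.10 / 0.05
    (40, 200, 10, 100),  # stratum-specific RR = 0.20 / 0.10
]

adjusted = mantel_haenszel_rr(strata)
crude = crude_rr(strata)
print(f"Crude RR: {crude:.2f}")
print(f"Mantel-Haenszel adjusted RR: {adjusted:.2f}")
```

In these invented data the crude RR is larger than the adjusted RR because exposure is more common in the higher-risk stratum, which is the kind of confounding that stratification is meant to control.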

32
Q

Standardisation

A
  • another method of stratification used in large data sets for public health statistics to produce adjusted rates
  • age is the most often standardised variable
  • a hypothetical ‘world standard population’ is often used
33
Q

Direct standardisation

A
  • stratum specific rates from study sample are applied to the standard population
  • summary score is produced from this data
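
A minimal sketch of direct standardisation, assuming two age strata. All counts, the stratum labels and the function name are invented for illustration, and the ‘standard population’ here is not a real reference population.

```python
import math


def direct_standardised_rate(study_events, study_population, standard_population):
    """Apply the study's stratum-specific rates to the standard population."""
    expected_events = 0.0
    for stratum, standard_size in standard_population.items():
        stratum_rate = study_events[stratum] / study_population[stratum]
        expected_events += stratum_rate * standard_size
    return expected_events / sum(standard_population.values())


# Invented study data: deaths and person counts per age stratum.
study_events = {"young": 10, "old": 50}
study_population = {"young": 10_000, "old": 5_000}

# Invented standard population with a different age structure.
standard_population = {"young": 60_000, "old": 40_000}

crude_rate = sum(study_events.values()) / sum(study_population.values())
adjusted_rate = direct_standardised_rate(
    study_events, study_population, standard_population
)

print(f"Crude rate per 1,000: {crude_rate * 1000:.1f}")
print(f"Age-standardised rate per 1,000: {adjusted_rate * 1000:.1f}")
```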
34
Q

Indirect standardisation

A
  • stratum specific rates from the standard population are applied to the study sample
  • this gives the expected number of events
  • the observed number of events is divided by the expected number to arrive at a standardised ratio, e.g. the standardised mortality ratio (SMR)
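
A minimal sketch of indirect standardisation, computing a standardised mortality ratio as observed deaths divided by expected deaths. The rates, counts and function names are invented for illustration.

```python
import math


def expected_events(standard_rates, study_population):
    """Apply the standard population's stratum-specific rates to the study sample."""
    return sum(
        standard_rates[stratum] * size for stratum, size in study_population.items()
    )


def standardised_mortality_ratio(observed, standard_rates, study_population):
    """SMR = observed deaths / expected deaths."""
    return observed / expected_events(standard_rates, study_population)


# Invented stratum-specific death rates in the standard population.
standard_rates = {"young": 0.002, "old": 0.02}

# Invented study sample sizes per stratum and total observed deaths.
study_population = {"young": 10_000, "old": 5_000}
observed_deaths = 150

expected = expected_events(standard_rates, study_population)
smr = standardised_mortality_ratio(observed_deaths, standard_rates, study_population)

print(f"Expected deaths: {expected:.0f}")
print(f"SMR: {smr:.2f}")
```

An SMR above 1 means more deaths were observed in the study sample than would be expected from the standard population's rates.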
35
Q

Key learning points

A
  • stratification is useful only for known confounders
  • adjustment can be applied to the odds ratio (OR) as well as the RR
  • multivariate techniques such as regression can also be used for analysing confounders