Compositional Data Flashcards
Compositional Data
- vectors of non-negative components that sum to a fixed constant (e.g. 1 or 100%)
- can be proportions, percentages, counts/frequencies
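A minimal sketch of the closure operation that turns counts into proportions or percentages. The helper name `closure` and the example counts are illustrative assumptions, not taken from these notes.

```python
import numpy as np

def closure(x, total=1.0):
    """Rescale non-negative parts so that they sum to `total`."""
    x = np.asarray(x, dtype=float)
    return total * x / x.sum(axis=-1, keepdims=True)

counts = np.array([30, 50, 20])    # made-up counts for three parts
props = closure(counts)            # proportions, sum to 1
pct = closure(counts, total=100)   # percentages, sum to 100
```

All three vectors describe the same composition, because only the relative sizes of the parts differ from nothing.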
Relative Property
- only ratios/relative proportions between components are meaningful
- total sum value is arbitrary - absolute scale does not matter
Spurious Correlation
- misleading, typically negative correlation between components induced by the sum constraint
- a change in one component forces changes in the others
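The effect can be illustrated with a small simulation, sketched here under assumed settings (independent gamma-distributed amounts, a fixed seed, 5000 samples): the raw amounts are uncorrelated by construction, but their proportions are not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three independent positive amounts (assumed gamma-distributed for illustration)
raw = rng.gamma(shape=5.0, scale=1.0, size=(5000, 3))

# Close each sample so its parts sum to 1
closed = raw / raw.sum(axis=1, keepdims=True)

corr_raw = np.corrcoef(raw, rowvar=False)        # off-diagonals close to 0
corr_closed = np.corrcoef(closed, rowvar=False)  # off-diagonals clearly negative
```

The negative off-diagonal entries of `corr_closed` come from the closure alone, not from any relationship in the underlying amounts.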
Scale Invariance
- ratios between components unchanged under rescaling
multiplying every component by the same constant leaves all ratios unchanged
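A quick check of this property, as a sketch with made-up values: the matrix of all pairwise ratios is the same before and after rescaling.

```python
import numpy as np

x = np.array([2.0, 3.0, 5.0])   # made-up composition
scaled = 100.0 * x              # same composition on a different scale

# Matrix of all pairwise ratios x_i / x_j
ratios = x[:, None] / x[None, :]
ratios_scaled = scaled[:, None] / scaled[None, :]

same = np.allclose(ratios, ratios_scaled)
```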
Subcompositional Coherence
- relationships between parts remain valid even when analysing only a subset of components
A,B,C - analysing just A,B should not give conflicting conclusions
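As a sketch with made-up numbers: the ratio A/B is the same in the full composition and in the re-closed subcomposition, which is why ratio-based analyses stay coherent.

```python
import numpy as np

full = np.array([0.2, 0.3, 0.5])     # parts A, B, C (made-up values)
sub = full[:2] / full[:2].sum()      # subcomposition A, B re-closed to sum to 1

ratio_full = full[0] / full[1]       # A/B in the full composition
ratio_sub = sub[0] / sub[1]          # A/B in the subcomposition
```

The individual proportions change (A goes from 0.2 to 0.4), but the ratio A/B does not.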
Subcompositional Dominance
- the distance between two compositions should be at least as large as the distance between any of their subcompositions
two samples compared on just A,B should not appear further apart than when compared on A,B,C
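In the compositional-data literature this principle is commonly stated in terms of distance: removing parts should never increase the distance between two compositions. A sketch using the Aitchison distance (Euclidean distance between centered log-ratio vectors), with made-up compositions and illustrative helper names:

```python
import numpy as np

def clr(x):
    """Centered log-ratio: log parts minus their mean log."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def aitchison_distance(x, y):
    """Euclidean distance between the clr vectors of two compositions."""
    return np.linalg.norm(clr(x) - clr(y))

x = np.array([0.2, 0.3, 0.5])   # made-up compositions over A, B, C
y = np.array([0.6, 0.3, 0.1])

d_full = aitchison_distance(x, y)           # using A, B, C
d_sub = aitchison_distance(x[:2], y[:2])    # using only A, B
```

Because `clr` is scale invariant, the subcompositions do not need to be re-closed before computing `d_sub`.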
Permutation Invariance
- order of the components should not affect the analysis
A,B,C should give the same results as B,A,C
Ternary Diagram
- graphical visualisation of three-component compositions
- near vertex - high concentration of that component
- near centre - equal proportions of all components
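A sketch of how a three-part composition maps to a point in the triangle, assuming one particular (arbitrary) vertex layout; the function name `ternary_xy` is illustrative.

```python
import numpy as np

# Assumed vertex positions for parts A, B, C of an equilateral triangle
VERTICES = np.array([
    [0.0, 0.0],               # A
    [1.0, 0.0],               # B
    [0.5, np.sqrt(3) / 2],    # C
])

def ternary_xy(comp):
    """Map a 3-part composition to 2D as a weighted average of the vertices."""
    comp = np.asarray(comp, dtype=float)
    comp = comp / comp.sum(axis=-1, keepdims=True)   # close to sum 1
    return comp @ VERTICES

centre = ternary_xy([1, 1, 1])            # equal parts: centroid of the triangle
near_c = ternary_xy([0.05, 0.05, 0.90])   # mostly C: close to the C vertex
```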
ALR
additive log-ratio
* takes the log of each component's ratio to one chosen reference component
* dependent on choice of divisor
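A minimal ALR sketch with made-up values; the function name `alr` and its `ref` argument (index of the reference part) are illustrative assumptions. Changing the reference part changes the coordinates.

```python
import numpy as np

def alr(x, ref=-1):
    """Additive log-ratio: log(x_i / x_ref) for every part except the reference."""
    x = np.asarray(x, dtype=float)
    ref = ref % x.shape[-1]                  # normalise negative indices
    logx = np.log(x)
    others = np.delete(logx, ref, axis=-1)   # drop the reference part
    return others - logx[..., [ref]]

x = np.array([0.2, 0.3, 0.5])   # parts A, B, C
alr_c = alr(x, ref=2)           # log(A/C), log(B/C)
alr_a = alr(x, ref=0)           # log(B/A), log(C/A)
```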
CLR
centered log-ratio
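* takes the log of each component divided by the geometric mean of all components
* all components are retained, so the coordinates sum to zero and the covariance matrix is singular - a problem for methods that need a full-rank covariance, including many robust estimators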
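A sketch of the CLR and of the singularity it causes, using simulated Dirichlet compositions (the distribution, seed, and sample size are assumptions for illustration only).

```python
import numpy as np

def clr(x):
    """Centered log-ratio: log(x_i / geometric mean of x)."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
comps = rng.dirichlet([2.0, 3.0, 5.0], size=200)   # simulated 3-part compositions

z = clr(comps)
row_sums = z.sum(axis=1)              # each clr vector sums to zero
cov = np.cov(z, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)     # ascending; smallest is numerically zero
```

The near-zero smallest eigenvalue means `cov` cannot be inverted, which is the practical consequence of the zero-sum constraint.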
ILR
isometric log-ratio
* expresses the composition in coordinates with respect to an orthonormal basis
* gives D-1 orthogonal coordinates free of the sum constraint, so the covariance is non-singular
* harder to interpret
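A sketch of one possible ILR, assuming a Helmert-type orthonormal basis. Other bases are equally valid and give different coordinates; `helmert_basis` and `ilr` are illustrative names.

```python
import numpy as np

def clr(x):
    """Centered log-ratio: log parts minus their mean log."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)

def helmert_basis(d):
    """Return a (d, d-1) matrix with orthonormal, zero-sum columns."""
    basis = np.zeros((d, d - 1))
    for i in range(1, d):
        basis[:i, i - 1] = 1.0 / i     # first i parts share equal weight
        basis[i, i - 1] = -1.0         # contrasted against part i
        basis[:, i - 1] *= np.sqrt(i / (i + 1))   # scale column to unit length
    return basis

def ilr(x, basis=None):
    """Isometric log-ratio: project the clr vector onto an orthonormal basis."""
    x = np.asarray(x, dtype=float)
    if basis is None:
        basis = helmert_basis(x.shape[-1])
    return clr(x) @ basis

x = np.array([0.2, 0.3, 0.5])   # made-up compositions
y = np.array([0.6, 0.3, 0.1])
V = helmert_basis(3)
d_ilr = np.linalg.norm(ilr(x) - ilr(y))   # distance in ilr coordinates
d_clr = np.linalg.norm(clr(x) - clr(y))   # Aitchison distance
```

The two distances agree, which is the "isometric" part of the name. Each coordinate is a contrast between groups of parts, which is why interpretation depends on the basis chosen.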
Rounded Zeros
- represent values that fall below some detection limit
- not true zero values
- arise from rounding or limited measurement precision
- treated by replacing the zeros with small positive values
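One common replacement scheme is multiplicative replacement, sketched below. The value of `delta` is a made-up placeholder; in practice it is usually chosen relative to the detection limit.

```python
import numpy as np

def multiplicative_replacement(x, delta):
    """Replace zeros by `delta` and shrink non-zero parts so the sum stays 1."""
    x = np.asarray(x, dtype=float)
    x = x / x.sum(axis=-1, keepdims=True)           # close to sum 1
    zeros = x == 0
    n_zeros = zeros.sum(axis=-1, keepdims=True)     # zeros per composition
    return np.where(zeros, delta, x * (1.0 - n_zeros * delta))

x = np.array([0.0, 0.3, 0.7])                       # first part below detection
replaced = multiplicative_replacement(x, delta=0.01)   # hypothetical delta
```

The non-zero parts are scaled by a common factor, so their ratios (here 0.3 to 0.7) are preserved and log-ratio transforms become applicable.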
Structural Zeros
- true zeros
- actual zero or absence of component
- informative: the absence itself carries important information
- handled with model-based approaches rather than by replacing the zeros