CAP Data Flashcards
Line intercept sampling
Sampling method where elements in a region are selected if a chosen line segment (transect) intersects the element
Theoretical sampling
Sampling method where individuals are added to the sample based on the results of data already collected
Non-standard values data transformation
Identify categories represented by multiple categorical values and replace with a standard value
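A minimal sketch of this transformation, assuming a hand-built lookup table. The mapping `STANDARD_VALUES` and the helper `standardize` are hypothetical names chosen for illustration, not part of any library.

```python
# Hypothetical mapping from observed variants to one standard value per category.
STANDARD_VALUES = {
    "usa": "US",
    "u.s.": "US",
    "united states": "US",
    "uk": "GB",
    "united kingdom": "GB",
}

def standardize(value: str) -> str:
    """Return the standard value for a known variant, else the trimmed input."""
    key = value.strip().lower()
    return STANDARD_VALUES.get(key, value.strip())

raw = ["USA", " u.s. ", "United States", "UK", "France"]
cleaned = [standardize(v) for v in raw]
```

Unknown values such as "France" pass through unchanged here; a stricter pipeline might flag them for review instead.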
Principal Component Analysis (PCA)
Dimensionality reduction method that uses an orthogonal transformation to map a data set into a new coordinate system where the first coordinate captures the most variance, the second coordinate captures the second-most variance, and so on
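To make the definition concrete, here is a sketch restricted to two-dimensional data, where the eigenvalues of the covariance matrix have a closed form. The function name `pca_2d` is hypothetical; practical work would normally rely on a numerical library rather than this hand-rolled version.

```python
import math

def pca_2d(points):
    """Return (components, variances) for 2-D points, ordered by variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)

    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mid = (a + c) / 2
    half_gap = math.hypot((a - c) / 2, b)
    lam1, lam2 = mid + half_gap, mid - half_gap

    # Unit eigenvector for the larger eigenvalue; the second is orthogonal to it.
    if b != 0:
        vx, vy = b, lam1 - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    first = (vx / norm, vy / norm)
    second = (-first[1], first[0])
    return [first, second], [lam1, lam2]

# Points lying on the line y = x: all variance falls along one direction.
components, variances = pca_2d([(1, 1), (2, 2), (3, 3), (4, 4)])
```

For these points the first component points along the diagonal and the second carries essentially no variance, which is the ordering the definition describes.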
Data volume
The quantity of data stored in a warehouse
Probability proportionate to size sampling
Sampling method where the probability of an individual being chosen for the sample is proportional to the size of its subpopulation
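A rough sketch of the idea, assuming draws are made with replacement and that the units being drawn are the subpopulations themselves. The helper `pps_sample` and the district data are hypothetical.

```python
import random

def pps_sample(units, sizes, k, seed=None):
    """Draw k units with replacement, each with probability proportional to its size."""
    rng = random.Random(seed)
    return rng.choices(units, weights=sizes, k=k)

# Hypothetical subpopulations and their sizes.
districts = ["north", "south", "east"]
populations = [100, 300, 600]
draws = pps_sample(districts, populations, k=1000, seed=0)
```

Over many draws, "east" should appear roughly six times as often as "north", mirroring the ratio of their sizes.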
Smoothing data transformation
Apply a simple moving average or a LOESS regression to the data
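A simple moving average can be sketched in a few lines. The function name `moving_average` is an illustrative choice, and this version returns only the windows that fit entirely inside the series.

```python
def moving_average(values, window):
    """Return the simple moving average over each full window of the series."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

smoothed = moving_average([1, 2, 3, 4, 5], window=3)
```

A wider window smooths more aggressively but shortens the output and blurs short-lived changes.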
Panel sampling
Sampling method where individuals randomly chosen for an experiment are asked for information repeatedly, in waves of data collection
Stratified sampling
Sampling method where the population is divided into subpopulations and individuals are randomly chosen for the sample from each of these subpopulations
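One way to sketch this, assuming an equal number of individuals is drawn from every subpopulation (proportional allocation is another common choice). The helper `stratified_sample` and the example population are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, per_stratum, seed=None):
    """Randomly draw per_stratum individuals from each subpopulation."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in individuals:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        # Take the whole stratum if it is smaller than the requested count.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical population of (id, region) pairs: 20 urban and 10 rural.
population = [(i, "urban" if i % 3 else "rural") for i in range(30)]
chosen = stratified_sample(population, lambda p: p[1], per_stratum=4, seed=0)
```

Each region contributes four individuals regardless of its size, so the smaller rural group is not crowded out.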
Binning data transformation
Divide the values of a continuous variable into discrete intervals
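As an illustration, the sketch below uses equal-width intervals, which is only one of several binning schemes (equal-frequency bins are another). The function name `equal_width_bins` and the sample ages are assumptions made for the example.

```python
def equal_width_bins(values, n_bins):
    """Assign each value a bin index from 0 to n_bins - 1 using equal-width intervals."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    if width == 0:
        return [0 for _ in values]
    # The maximum value is clamped into the last bin.
    return [min(int((v - low) / width), n_bins - 1) for v in values]

ages = [18, 22, 35, 47, 51, 64, 78]
bins = equal_width_bins(ages, n_bins=3)
```

With these ages the intervals are 20 years wide, starting at 18.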
Data strategy
A plan designed to improve the enterprise’s acquisition, storage, management, sharing, and use of data
Statistical uncertainty
Natural randomness in a process that affects each experimental trial
Voluntary sampling
Sampling method where individuals choose to join the sample
Skewness data transformation
Transform the distribution using a function such as a logarithm, a square root, or an inverse
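The effect of such a transformation can be checked with a short sketch. It assumes strictly positive data (required by the logarithm) and uses a moment-based skewness measure; the helper `skewness` and the income figures are hypothetical.

```python
import math

def skewness(values):
    """Return the population (moment-based) skewness of the values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed data with a long upper tail.
incomes = [1, 2, 2, 3, 3, 4, 5, 8, 20, 100]
logged = [math.log(v) for v in incomes]
```

Comparing `skewness(incomes)` with `skewness(logged)` shows the logarithm pulling in the long right tail.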
Normalization data transformation
Scale the data to remove differences in magnitude between continuous variables; examples include min-max scaling, z-scores, and decimal scaling
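Two of the named techniques, min-max scaling and z-scores, can be sketched as follows. The function names and the height data are illustrative, and both functions assume the values are not all identical (otherwise they would divide by zero).

```python
import statistics

def min_max(values):
    """Rescale values linearly onto the interval [0, 1]."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

def z_scores(values):
    """Center values on the mean and scale by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

heights_cm = [150, 160, 170, 180, 190]
scaled = min_max(heights_cm)
standardized = z_scores(heights_cm)
```

Min-max scaling preserves the original spacing within fixed bounds, while z-scores express each value as a number of standard deviations from the mean.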
Structured data
Information organized into a formatted repository so that its elements are easily searchable by basic algorithms
Interval scale
Items in the scale are differentiated by degree of difference with no absolute zero as part of the scale
Data value
The worth of data stored in and extracted from a warehouse
Master data
Data objects agreed on and shared across the enterprise
Precision of measurements
The closeness of agreement between independent measurements; imprecision comes primarily from random error