General Definitions Flashcards
DCOVA Framework
Define, Collect, Organize, Visualize, and Analyze data.
What are data?
In statistics, data are “the values associated with a trait or property that help distinguish the occurrences of something”.
Variable
A characteristic of an item or individual.
Descriptive Statistics
Refer to methods that primarily help summarize and present data.
Inferential Statistics
Refer to methods that use data collected from a small group to reach conclusions about a larger group.
Big Data
Collections of data that cannot be easily browsed or analyzed using traditional methods.
Categorical Variables
Take categories as their values (also known as qualitative variables).
Numerical Variables
Have values that represent a counted or measured quantity (also known as quantitative variables).
Discrete Variables
(Numerical variables) Have numerical values that arise from a counting process (e.g., number of items purchased).
Continuous Variables
(Numerical variables) Have numerical values that arise from a measuring process. Those values depend on the precision of the measuring instrument used (e.g., distance from home to store).
Primary Data Source
Data collected on your own.
Secondary Data Source
Data collected by someone else.
Population
Consists of all the items or individuals about which you want to reach conclusions.
Parameter
A measure that describes a population. When you analyze data from a population, you compute a parameter.
Sample
A portion of a population selected for analysis.
Statistics
A statistic is a measure that describes a sample. When you analyze data from a sample, you compute a statistic.
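As a minimal sketch of the parameter/statistic distinction, the Python below treats a small made-up list as the entire population and draws a sample from it. The values, the seed, the sample size, and the names `population` and `sample` are illustrative assumptions, not part of the original cards.

```python
import random
from statistics import mean

# Hypothetical population: every value we want to reach conclusions about.
population = [12, 15, 9, 20, 14, 11, 18, 16, 13, 17]

# Parameter: computed from the entire population.
population_mean = mean(population)

# Statistic: computed from a sample (a portion of the population).
rng = random.Random(0)  # fixed seed so the run is repeatable
sample = rng.sample(population, 4)
sample_mean = mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The sample mean will usually differ from the population mean; inferential statistics is about reasoning from the former to the latter.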
Structured Data
Data that follow an organizing principle or plan, typically a repeating pattern (e.g., a table of rows and columns).
Unstructured Data
Data having very little or no repeating structure or organization.
Frame
A complete or partial listing of the items that make up the population from which the sample will be selected.
Nonprobability Sample
Selecting items or individuals without knowing their probabilities of selection.
ADVANTAGES: convenience, speed, and low cost.
CANNOT be used for statistical inference.
Probability Sample
Selecting items or individuals based on known probabilities.
Convenience and Judgement Samples
Subcategories of nonprobability sample.
In a convenience sample you select items that are easy, inexpensive, or convenient to sample.
In a judgement sample you collect the opinions of preselected experts in the subject matter.
Simple Random Sample
Subcategory of probability sample. It is the most elementary sampling technique. Every item from a frame has the same chance of selection as every other item, and every sample of a fixed size has the same chance of selection as every other sample of that size.
Sampling with replacement means that after you select an item, you return it to the frame, where it has the same probability of being selected again.
Sampling without replacement means that once you select an item, you cannot select it again.
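The two sampling modes can be sketched with Python's standard `random` module. The frame contents, the seed, and the sample size of 4 are assumptions made up for illustration.

```python
import random

frame = ["A", "B", "C", "D", "E", "F"]  # hypothetical frame of six items
rng = random.Random(42)  # fixed seed so the run is repeatable

# With replacement: each draw is made from the full frame,
# so the same item can appear more than once.
with_replacement = rng.choices(frame, k=4)

# Without replacement: once an item is selected it cannot be selected again,
# so all items in the result are distinct.
without_replacement = rng.sample(frame, k=4)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

`random.choices` draws each item independently, while `random.sample` never returns the same position of the frame twice.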
Systematic Sample
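Subcategory of probability sample. In a systematic sample, you partition the N items in the frame into n groups of k items, where k = N/n (rounded to the nearest integer). You then choose the first item at random from the first k items in the frame and select the remaining n - 1 items by taking every kth item thereafter.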
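A minimal Python sketch of systematic sampling follows. It assumes the usual textbook selection rule (a random start among the first k items, then every kth item after it) and rounds N/n down so the indices never run past the end of the frame. The function name `systematic_sample` and the 40-item frame are hypothetical.

```python
import random

def systematic_sample(frame, n, rng=None):
    """Return n items: a random start among the first k items, then every k-th item."""
    rng = rng or random.Random()
    N = len(frame)
    if n <= 0 or n > N:
        raise ValueError("n must be between 1 and len(frame)")
    k = N // n  # group size; assumption: N/n is rounded down to an integer
    start = rng.randrange(k)  # random position within the first group of k items
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 41))  # hypothetical frame of N = 40 items
sample = systematic_sample(frame, n=8, rng=random.Random(1))
print(sample)  # 8 items spaced k = 5 apart
```

Only the starting position is random; once it is fixed, the rest of the sample is determined by the spacing k.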