STATS Flashcards
Variable
A characteristic that varies between subjects or objects (e.g., age, gender, income).
Data
Plural term representing the collection of all observations of variables.
Dataset
A collection of data, typically organized for research or analysis.
Categorical (Qualitative)
Described in words, e.g., car colors.
Numerical (Quantitative)
Can be counted or measured, e.g., income.
Discrete
Countable variables, like the number of customers.
Continuous
Measurable variables that can take any value within a range, like time or weight.
Time Series
Data measured over time (e.g., stock prices over months).
Frequency
Refers to how often a data value occurs, expressed as a count or a percent.
Position
Refers to where a value falls relative to the rest of the data, e.g., the median, quartiles, or percentiles.
Boxplots
Useful visual tools that depict the median, quartiles, and outliers of data.
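A minimal sketch of the numbers a boxplot displays, assuming the common 1.5 * IQR rule for flagging outliers (the card does not name a rule). The function name `boxplot_summary` and the sample values are made up for illustration.

```python
import statistics

def boxplot_summary(values):
    """Return the quartiles, median, and outliers a boxplot would draw."""
    data = sorted(values)
    # Quartiles via the "inclusive" method; other conventions give slightly different cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: points beyond 1.5 * IQR from the box are treated as outliers.
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [x for x in data if x < low_fence or x > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

# Hypothetical data with one unusually large value.
summary = boxplot_summary([2, 4, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

Here the value 30 lies far above the upper fence, so it is the only point reported as an outlier.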
Sampling
Critical to ensuring that inferences drawn from the sample reflect the entire population. Sampling methods fall into two categories: probability sampling and non-probability sampling.
Probability Sampling
Preferred for accurate population estimation.
Simple Random Sampling
Each member of the population has an equal chance of being selected.
Systematic Sampling
Selects every nth individual in a list.
Cluster Sampling
The population is divided into clusters, and a random sample of clusters is selected.
Stratified Sampling
The population is divided into strata, and samples are taken from each stratum.
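The four probability sampling methods can be sketched side by side. This is an illustration under stated assumptions, not a definitive implementation: the systematic sample uses a random starting point, the cluster sample keeps every member of each chosen cluster, and the stratified sample draws the same number from each stratum. All function names and the numbered population are hypothetical.

```python
import random

def simple_random_sample(population, k, rng):
    # Every member has an equal chance of being selected.
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    # Assumption: pick a random start, then take every `step`-th individual.
    start = rng.randrange(step)
    return population[start::step]

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters; assumption: keep all members of each chosen cluster.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def stratified_sample(strata, per_stratum, rng):
    # Draw a random sample from every stratum.
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical population of 100 numbered individuals
groups = {f"group_{i}": population[i * 25:(i + 1) * 25] for i in range(4)}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
cluster = cluster_sample(groups, 2, rng)
stratified = stratified_sample(groups, 5, rng)
```

Note the difference between the last two: cluster sampling leaves some groups out entirely, while stratified sampling represents every group.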
Non-Probability Sampling
Often biased and not ideal for statistical inferences.
Convenience Sampling
Selecting individuals based on convenience.
Volunteer Sampling
Individuals volunteer to participate.
Snowball Sampling
Existing participants recruit future subjects.
Questionnaire
A primary method for gathering data from samples. Care must be taken to ensure validity (accuracy) and reliability (consistency) of the data.
Hypothesis Testing
Used to test claims or hypotheses about a population parameter based on sample data.
Correlation Analysis
Measures the strength and direction of relationships between two variables.
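One common measure is the Pearson correlation coefficient r, which ranges from -1 to 1: the sign gives the direction and the magnitude gives the strength of a linear relationship. The card does not name a specific coefficient, so treating it as Pearson's r is an assumption, and the data below is made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (proportional to the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: scores rise with hours, errors fall with hours.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 6, 8, 10]
errors = [10, 8, 6, 4, 2]

r_positive = pearson_r(hours, scores)  # perfectly linear, increasing
r_negative = pearson_r(hours, errors)  # perfectly linear, decreasing
```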
Regression Analysis
Determines how one variable influences another (e.g., predicting sales based on advertising spend).
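A minimal sketch of simple linear regression by least squares, assuming a single predictor and a straight-line relationship. The function name `fit_line` and the advertising and sales figures are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: advertising spend and sales, in the same arbitrary units.
ad_spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]

slope, intercept = fit_line(ad_spend, sales)
predicted_sales = intercept + slope * 5  # prediction for a spend of 5
```

In this made-up data every extra unit of spend adds two units of sales, so the fitted slope is 2 and the intercept is 1.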
Classical Probability
Classical probability assumes that all outcomes of an experiment are equally likely. It is used when the total number of possible outcomes is known and each outcome is equally likely, so the probability of an event is the number of favorable outcomes divided by the total number of outcomes.
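As a worked example, classical probability reduces to counting favorable outcomes over total outcomes. The sketch below assumes a fair six-sided die; the function name `classical_probability` is hypothetical.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Favorable outcomes divided by total outcomes, assuming all are equally likely."""
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))

die = [1, 2, 3, 4, 5, 6]  # assumption: a fair die, so each face is equally likely
p_even = classical_probability({2, 4, 6}, die)
print(p_even)  # 1/2
```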
Empirical Probability
Empirical probability estimates the probability of an event from actual data or experiments rather than from theoretical predictions. It is used when the outcomes are not known to be equally likely or the total number of possible outcomes is unknown, so probabilities are determined by observation.
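As a worked example, empirical probability is the relative frequency of an event in the observed data. The delivery records below are invented for illustration, and the function name `empirical_probability` is hypothetical.

```python
from collections import Counter

def empirical_probability(observations, event):
    """Relative frequency: times the event was observed divided by total observations."""
    counts = Counter(observations)
    return counts[event] / len(observations)

# Hypothetical record of ten deliveries.
observed = [
    "late", "on time", "on time", "late", "on time",
    "on time", "on time", "on time", "late", "on time",
]
p_late = empirical_probability(observed, "late")  # 3 late out of 10 observations
```

The estimate depends on the data collected; more observations generally give a more stable estimate.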