Topic 5 & 6 Flashcards
Joint Probability Distribution & Fundamental Sampling Distributions and Data Descriptions
the probability distribution of two random variables, X and Y, considered together as a pair. The random variables may be either discrete or continuous.
joint probability distribution
referring to discrete random variables
Joint Probability Mass Function
referring to continuous random variables
Joint Probability Density Function
is the probability distribution of a subset of a collection of random variables, focusing on one variable at a time.
marginal distribution
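As a small illustration of the joint and marginal distribution cards, the sketch below takes a made-up joint probability mass function for two discrete variables X and Y and sums it over one variable to get each marginal. The table values and the helper name `marginal` are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical joint PMF: keys are (x, y) pairs, values are P(X = x, Y = y).
joint_pmf = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

def marginal(joint, axis):
    """Sum the joint PMF over the other variable (axis 0 -> X, axis 1 -> Y)."""
    totals = defaultdict(float)
    for outcome, prob in joint.items():
        totals[outcome[axis]] += prob
    return dict(totals)

marginal_x = marginal(joint_pmf, 0)  # P(X = 0) and P(X = 1)
marginal_y = marginal(joint_pmf, 1)  # P(Y = 0) and P(Y = 1)
```

Each marginal sums to 1, just like the joint table it came from.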
is the entire set of observations or elements that we are interested in studying
Population
is a subset of the population selected for analysis.
Sample
consistently overestimates or underestimates certain characteristics of the population, leading to inaccurate inferences
biased sampling procedure
the process of using data from a sample to draw conclusions or make predictions about a population.
statistical inference
eliminates any possibility of bias in the sampling procedure, ensuring that every element of the population has an equal chance of being selected
Random Sample
any function of the random sample/variable
statistic
is a characteristic or measure that describes the entire population
parameter
is the probability distribution of a statistic
sampling distribution
states that if we take a sufficiently large sample of size n from a population with mean μ and finite variance σ^2 (whatever the shape of its distribution, and whether the population is finite or infinite), the sampling distribution of the sample mean X̄ will be approximately normal, with mean μ and variance σ^2 / n.
Central Limit Theorem (CLT)
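The Central Limit Theorem can be checked with a quick simulation. The sketch below draws many samples from a skewed population and looks at the mean and variance of the resulting sample means. The exponential population (mean 1, variance 1), the sample size, and the number of repetitions are all assumptions picked for the illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 50              # size of each sample
num_samples = 5000  # how many samples to draw

# Population: exponential with rate 1, so mu = 1 and sigma^2 = 1 (skewed, not normal).
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)     # should be close to mu = 1
var_of_means = statistics.pvariance(sample_means)  # should be close to sigma^2 / n = 0.02
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is strongly skewed.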
Common Sampling Distributions
- t - distribution
- F - distribution
used to estimate population parameters when the sample size is small or the population's standard deviation is unknown. It is similar to the normal distribution but has heavier tails, which account for the extra uncertainty when working with smaller samples.
t - distribution
used when you want to compare the variances of two populations to see if they are significantly different.
F - distribution
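When two variances are compared, the quantity that is usually referred to the F-distribution is the ratio of the two sample variances. The sketch below computes that ratio and its degrees of freedom, assuming two independent samples from normal populations. The data values are made up for the example.

```python
import statistics

# Hypothetical measurements from two independent samples.
sample_a = [4.1, 5.3, 6.0, 4.8, 5.5, 6.2]
sample_b = [5.0, 5.1, 4.9, 5.2, 5.0, 4.8]

var_a = statistics.variance(sample_a)  # sample variance, divides by n - 1
var_b = statistics.variance(sample_b)

f_statistic = var_a / var_b          # ratio of sample variances
df_numerator = len(sample_a) - 1     # degrees of freedom for sample_a
df_denominator = len(sample_b) - 1   # degrees of freedom for sample_b
```

A ratio far from 1 suggests the population variances differ. How far is "far" depends on the F critical value for the two degrees of freedom, which would come from a table or a statistics library.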
It is a principle that helps estimate unknown values about a population using the data from a sample
Point Estimation
is a number or value that is in some sense a reasonable value (a good guess) of the true population parameter.
point estimate
is a range of values, derived from sample data, that is likely to contain the true population parameter (such as the population mean or proportion).
confidence interval
If the population standard deviation σ is known, we use the standard normal distribution (also known as the z-distribution). If σ is unknown, we use the t-distribution to account for the extra uncertainty introduced by using the sample standard deviation s.
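The choice between z and t can be sketched as two confidence intervals for a mean built from the same sample. The data, the "known" σ of 0.2, and the helper name `mean_confidence_interval` are assumptions for illustration. The t critical value 2.262 is the standard table entry for 95% confidence with 9 degrees of freedom, hardcoded because Python's standard library has no t quantile function.

```python
import math
import statistics
from statistics import NormalDist

def mean_confidence_interval(data, critical_value, std_dev):
    """Return (lower, upper) = x_bar -/+ critical_value * std_dev / sqrt(n)."""
    n = len(data)
    x_bar = statistics.fmean(data)
    margin = critical_value * std_dev / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample of n = 10 measurements.
data = [9.8, 10.2, 10.4, 9.9, 10.1, 10.0, 9.7, 10.3, 10.2, 9.9]

# Case 1: sigma assumed known (0.2 is made up), so use the z critical value.
z_crit = NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence
z_interval = mean_confidence_interval(data, z_crit, std_dev=0.2)

# Case 2: sigma unknown, so use the sample standard deviation s and a
# t critical value with n - 1 = 9 degrees of freedom (95% confidence).
s = statistics.stdev(data)
t_crit = 2.262
t_interval = mean_confidence_interval(data, t_crit, std_dev=s)
```

For the same standard deviation, the t interval is wider than the z interval because the t critical value is larger, which reflects the heavier tails of the t-distribution.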
a range of values used when the experimenter is interested in the possible value of a future observation rather than a population parameter.
Prediction intervals
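One common form of prediction interval, for a single future observation from a normal population with known σ, is x̄ ± z · σ · sqrt(1 + 1/n). The sketch below computes it. The data, the value of σ, and the function name `prediction_interval_known_sigma` are assumptions for the example.

```python
import math
import statistics
from statistics import NormalDist

def prediction_interval_known_sigma(data, sigma, confidence=0.95):
    """Interval for one future observation: x_bar -/+ z * sigma * sqrt(1 + 1/n)."""
    n = len(data)
    x_bar = statistics.fmean(data)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma * math.sqrt(1 + 1 / n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample; sigma = 0.2 is assumed known for the illustration.
data = [9.8, 10.2, 10.4, 9.9, 10.1, 10.0, 9.7, 10.3, 10.2, 9.9]
lower, upper = prediction_interval_known_sigma(data, sigma=0.2)
```

The extra 1 under the square root comes from the variability of the future observation itself, so a prediction interval is wider than a confidence interval for the mean at the same confidence level.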