3: Probability distributions Flashcards

Question 1

Q

Quantile

Answer

A

A specific value that marks off a particular part of a data set. A quantile determines what proportion of the values in a distribution lie above or below a certain limit.
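A minimal sketch of quantiles in Python, using only the standard library. The sample values and the choice of a standard normal distribution are assumptions made for illustration.

```python
from statistics import NormalDist, quantiles

data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15]  # hypothetical sample

# Quartiles: three cut points that split the sorted data into four parts.
q1, median, q3 = quantiles(data, n=4)

# Theoretical quantile: the value below which 85% of a standard normal lies.
z85 = NormalDist(mu=0, sigma=1).inv_cdf(0.85)

# Mapping the quantile back through the CDF recovers the probability.
below = NormalDist(mu=0, sigma=1).cdf(z85)

print(q1, median, q3)
print(round(z85, 3), round(below, 2))
```

The middle quartile is the median: half of the values lie below it and half above.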

Question 2

Q

Inference

Answer

A

Drawing conclusions about a population from a sample

Question 3

Q

Probability + calculation

Answer

A

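The chance of something happening (always between 0 and 1). For a continuous distribution such as the normal, a probability is the area under the density curve over the range of interest; the total area is 1.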

For instance, if the probability of a value being less than 1.8 is 0.85 (85%), then the probability of it being greater than 1.8 is 1 - 0.85 = 0.15 (15%).
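The complement rule in the example can be sketched with a normal distribution. The mean and standard deviation below are hypothetical, so the probabilities come out close to, but not exactly, 0.85 and 0.15.

```python
from statistics import NormalDist

# Hypothetical normal distribution (e.g. heights in metres).
heights = NormalDist(mu=1.70, sigma=0.10)

p_below = heights.cdf(1.8)  # area under the curve to the left of 1.8
p_above = 1 - p_below       # the remaining area, to the right of 1.8

print(round(p_below, 3), round(p_above, 3))
```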

Question 4

Q

probability distribution

Answer

A

describes the chance of different outcomes of a random variable

Question 5

Q

Normal distribution + shape

Answer

A

A continuous, symmetric probability distribution in which most data points cluster around the mean, while the rest taper off toward either extreme. Bell (hill) shaped.

Question 6

Q

Poisson distribution

Answer

A

For counts. A discrete probability distribution over non-negative whole numbers that can be right skewed. It has only one parameter, the rate parameter λ (lambda): the average rate at which events occur, which is also the mean number of events.

The Poisson distribution models the number of events in a fixed interval of time or space when the events are independent (one doesn’t affect the other) and happen at a constant average rate.
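As an illustration, the Poisson probability of seeing exactly k events is P(X = k) = λ^k · e^(−λ) / k!. The sketch below evaluates it for a hypothetical rate of λ = 2 events per interval; the function name is made up for this example.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean number of events is lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 2.0  # hypothetical rate: 2 events per interval on average
probs = [poisson_pmf(k, lam) for k in range(6)]

for k, p in enumerate(probs):
    print(k, round(p, 4))
```

With λ = 2, counts of 1 and 2 are equally likely and are the most probable outcomes; higher counts become steadily rarer, which gives the right tail.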

Question 7

Q

Binomial distribution

Answer

A

Models the number of successes in a fixed number of independent trials, each with two possible outcomes: success or failure. Used for ratios, fractions and binary data. Can be skewed, left or right. Two parameters: the probability of success and the number of trials.

Difference from the Poisson: the binomial distribution deals with a fixed number of trials and a constant probability of success, while the Poisson distribution deals with the rate of events over time or space and is often used when the number of trials is very large or not fixed.
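A rough sketch of how the two distributions relate: with many trials and a small success probability, binomial probabilities are close to Poisson probabilities with λ = n·p. The values of n and p and the function names are assumptions chosen for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean number of events is lam."""
    return lam**k * exp(-lam) / factorial(k)

n, p = 1000, 0.003  # hypothetical: many trials, rare success
lam = n * p         # matching Poisson mean

# Largest difference between the two probabilities over k = 0..10.
max_gap = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(11))
print(max_gap)
```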

Question 8

Q

Variance

Answer

A

Shows the extent to which observations deviate from one another (large variance = large differences within the group).
Measures the spread between numbers in a data set (how far each number is from the mean, and so from every other number in the set), calculated as the average squared deviation from the mean.
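A small worked example of the calculation, on made-up data. It computes the sample variance by hand (dividing by n − 1) and compares it with `statistics.variance`.

```python
from statistics import mean, variance

data = [4, 8, 6, 5, 3, 7]  # hypothetical observations

m = mean(data)                              # 5.5
squared_devs = [(x - m) ** 2 for x in data] # squared distance from the mean
sample_var = sum(squared_devs) / (len(data) - 1)

print(sample_var)       # 3.5
print(variance(data))   # same value from the standard library
```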

Question 9

Q

Random Variable

Answer

A

A variable whose outcome (value) is subject to a random process, i.e. determined by chance. For example, when flipping a coin the result (heads or tails) is random and not influenced by anything else. A random variable can be either discrete (taking specific, separate values) or continuous (taking any value in a continuous range).

Random variables represent measurable properties of random processes, and their distributions give insight into variability.

Question 10

Q

properties of Random variable

Answer

A

We cannot predict the value of a random variable with absolute precision, because the outcome will differ from one sample to the next.
Functions based on random variables are also random variables.
The function calculating the mean uses random variables, so the mean is itself a random variable: new samples can give different means.
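The last point can be sketched with a short simulation: drawing repeated samples from the same population gives a different mean each time. The population (normal with mean 10 and SD 2), the sample size and the seed are all assumptions for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the run is repeatable

def draw_sample_mean(n):
    """Draw n values from a hypothetical normal population and return their mean."""
    return mean(rng.gauss(10, 2) for _ in range(n))

# Five new samples of 25 observations each give five different means.
sample_means = [draw_sample_mean(25) for _ in range(5)]
print([round(m, 2) for m in sample_means])
```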

Question 11

Q

Statistics and RV

Answer

A

Measures like the mean, variance, and standard deviation, when calculated from a sample, are random variables themselves and have distributions. How good these estimates are is measured by the standard error (SE).

Question 12

Q

standard deviation (SD)

Answer

A

Tells you how much the data itself varies. A measure of spread.

The average amount of variability in your dataset: it tells you, roughly, how far each value lies from the mean on average.

SD = √variance. For a fitted model: residual SD = √(residual standard error²) = √(residual variance).

Question 13

Q

standard error (SE)

Answer

A

A measure of the uncertainty in a sample statistic such as the mean or a slope.

SE gets smaller as the sample size increases, because more data provides a better estimate of the population parameter, leading to reduced variability in the estimate.
SE = SD / √n (n = sample size)
or SE = coefficient / t-value
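Both formulas can be checked with a few lines of Python. The data, the slope and the t-value are hypothetical numbers chosen for illustration.

```python
from math import sqrt
from statistics import stdev

data = [4, 8, 6, 5, 3, 7]  # hypothetical observations

sd = stdev(data)            # spread of the data itself
se = sd / sqrt(len(data))   # uncertainty in the sample mean

def se_of_mean(sd, n):
    """Standard error of the mean for a given SD and sample size."""
    return sd / sqrt(n)

# Same SD, larger sample: the standard error shrinks.
se_25 = se_of_mean(sd, 25)
se_100 = se_of_mean(sd, 100)

# From model output: SE = coefficient / t-value (hypothetical slope and t).
slope, t_value = 2.0, 4.0
se_slope = slope / t_value

print(round(sd, 3), round(se, 3), round(se_25, 3), round(se_100, 3), se_slope)
```

Quadrupling the sample size from 25 to 100 halves the standard error, because n enters the formula under a square root.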

Question 14

Q

Difference SE and SD(stdev)

Answer

A

SD tells you how much the data itself varies: it is a measure of spread.
SE is a measure of the uncertainty in a sample statistic such as the mean or a slope.

Question 15

Q

Degrees of Freedom + formula

Answer

A

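The amount of independent information you have available to calculate a statistic. It’s calculated as the sample size minus the number of parameters estimated. When only the mean is estimated, df = n - 1 (n = sample size).

When the degrees of freedom run out (df = 0), the model is too complicated for the number of observations.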
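The bookkeeping can be sketched as a one-line function. The function name and the example models are hypothetical.

```python
def degrees_of_freedom(n_obs, n_params):
    """Sample size minus the number of parameters estimated."""
    return n_obs - n_params

df_mean = degrees_of_freedom(10, 1)       # estimating one mean from 10 observations
df_line = degrees_of_freedom(10, 2)       # intercept and slope from 10 observations
df_exhausted = degrees_of_freedom(3, 3)   # three parameters, only three observations

print(df_mean, df_line, df_exhausted)
```

In the last case nothing is left over to estimate the residual variation, which is what "running out" of degrees of freedom means.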

Question 16

Q

Describe the relation between a random variable, degree of freedom and a statistical model;

Answer

A

A random variable represents outcomes of a random process, a degree of freedom refers to the number of independent values that can vary in a calculation, and a statistical model uses random variables and degrees of freedom to estimate parameters and make inferences about data

Question 17

Q

Residuals

Answer

A

The difference between the actual outcome and the outcome predicted by the model; the sample estimate of the error.

Residuals represent the differences between the observed data points and the predicted values from a model. Residual = Observed value - Predicted value.
They show how well the model fits the data: smaller residuals mean a better fit, while larger residuals indicate that the model is not fully capturing the data’s pattern.
Residuals are also a reflection of the random, unexplained variability in the data (also referred to as error or noise).

They correspond to the random part of a model, which accounts for the unpredictable or unexplained variation (error).
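A minimal sketch of computing residuals, assuming a straight-line model fitted by ordinary least squares. The x and y values are made up for illustration.

```python
from statistics import mean

x = [1, 2, 3, 4, 5]              # hypothetical predictor values
y = [2.1, 3.9, 6.2, 7.8, 10.1]   # hypothetical observed outcomes

# Ordinary least squares estimates of slope and intercept.
x_bar, y_bar = mean(x), mean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

predicted = [intercept + slope * xi for xi in x]

# Residual = observed value - predicted value.
residuals = [yi - pi for yi, pi in zip(y, predicted)]
print([round(r, 3) for r in residuals])
```

For a least-squares line with an intercept, the residuals sum to zero (up to rounding), so their spread rather than their total is what describes the fit.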

Question 18

Q

What does it mean that for the Poisson distribution, the mean is equal to the variance

Answer

A

In a Poisson distribution, which counts the number of events happening in a specific time or space, the mean (average number of events, λ) is the same as the variance (the spread of the data, also λ). This means that as the average number of events goes up, the variation in how many events actually occur also increases. This property is important for understanding how Poisson distributions work, especially for rare events.
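The property can be checked numerically by computing the mean and variance directly from the Poisson probabilities. The values of λ, the cut-off at 100 events and the function names are assumptions for this sketch.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean number of events is lam."""
    return lam**k * exp(-lam) / factorial(k)

def mean_and_variance(lam, k_max=100):
    """Mean and variance computed from the probabilities of 0..k_max events."""
    probs = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    m = sum(k * p for k, p in enumerate(probs))
    v = sum((k - m) ** 2 * p for k, p in enumerate(probs))
    return m, v

for lam in (0.5, 3.0, 8.0):
    m, v = mean_and_variance(lam)
    print(lam, round(m, 6), round(v, 6))
```

For each λ the computed mean and variance both come out equal to λ, so a higher average count goes together with a wider spread of counts.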

Question 19

Q

Skew

Answer

A

Refers to the asymmetry of a probability distribution. It indicates which side of the distribution has the longer tail: right (positive) skew means a long tail toward high values, left (negative) skew a long tail toward low values.

Binomial:
1. Right Skewed (Positive Skew)
Occurs when the probability of success is small. When p is low, there are many more ways to get a low number of successes than to get a high number. Thus, the distribution has a longer tail on the right side.
2. Left Skewed (Negative Skew)
Occurs when the probability of success is large. When p is high, there are many more ways to get a high number of successes than to get a low number. Therefore, the distribution has a longer tail on the left side.

The normal distribution is symmetric and cannot be skewed. The Poisson distribution can only be right skewed, and noticeably so only when the mean (λ) is low. In this case, there are more occurrences of low counts (like 0 or 1 events), and the tail on the right side (for larger counts) is longer.
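The binomial cases above can be illustrated by computing skewness (the third standardized moment) directly from the probabilities. The number of trials and the values of p are hypothetical, and the function names are made up for this sketch.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_skewness(n, p):
    """Skewness computed from the probabilities of 0..n successes."""
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    m = sum(k * q for k, q in enumerate(probs))
    var = sum((k - m) ** 2 * q for k, q in enumerate(probs))
    third = sum((k - m) ** 3 * q for k, q in enumerate(probs))
    return third / var ** 1.5

skew_low_p = binomial_skewness(10, 0.1)   # positive: right skewed
skew_mid_p = binomial_skewness(10, 0.5)   # about zero: symmetric
skew_high_p = binomial_skewness(10, 0.9)  # negative: left skewed

print(round(skew_low_p, 3), round(skew_mid_p, 3), round(skew_high_p, 3))
```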

Question 20

Q

Difference variance and Residuals

Answer

A

Variance shows the differences between the data points themselves (their spread around the mean), while residuals show the difference between the value predicted by a model and the real, observed value.

(20 cards)