Statistics Task 2 Flashcards
The probability distribution of a continuous random variable takes the form of a density curve. What does that mean?
A probability distribution is a display that links each outcome of an experiment with its probability of occurrence.
If a variable can take on any value between its minimum value and its maximum value, it is called a continuous variable. The probability distribution of a continuous random variable is called a continuous probability distribution. The probability density function describes the continuous probability distribution.
The best way to plot it is a density curve that shows the probability distribution. The area under the density curve is equal to 100 percent of all probabilities. As we usually use decimals for probabilities, we can also say that the area is equal to 1.
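The claim that the area under a density curve equals 1 can be checked numerically. The following is a minimal sketch, assuming a standard normal density as the example curve; the helper names normal_pdf and area_under_curve are made up for this illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def area_under_curve(f, lower, upper, steps=10_000):
    """Approximate the integral of f from lower to upper (trapezoid rule)."""
    width = (upper - lower) / steps
    total = 0.5 * (f(lower) + f(upper))
    for i in range(1, steps):
        total += f(lower + i * width)
    return total * width

# The range -10..10 covers practically all of the standard normal curve.
total_area = area_under_curve(normal_pdf, -10.0, 10.0)
print(round(total_area, 6))  # very close to 1
```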
How do you determine the probabilities of values in a continuous random variable?
Since, in the case of a continuous random variable, the probability that X takes on any particular value x is 0, finding P(X = x) is not going to work. With continuous random variables, probability only makes sense for intervals (a, b) on X, that is, P(X < value) or P(value1 < X < value2).
To do that, we have to use a probability density function.
Probability density function
The probability density function of a continuous random variable X with support S is an integrable function f(x) satisfying:
- f(x) is positive everywhere in the support S, that is, f(x) > 0 for all x in S
- The area under the curve f(x) over the support S is 1
- If f(x) is the probability density function of X, then the probability that X belongs to A, where A is some interval, is given by the integral of f(x) over that interval.
If we were to determine P(X = value), we would have to specify infinitely small intervals. Therefore, P(X = value) = 0.
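To illustrate how interval probabilities behave, here is a small sketch using statistics.NormalDist from the Python standard library. The standard normal distribution and the helper prob_between are assumptions chosen for the example. The probability of an interval is the difference of two cumulative probabilities, and it shrinks toward 0 as the interval around a single value gets narrower.

```python
from statistics import NormalDist

X = NormalDist(mu=0.0, sigma=1.0)  # assumed example distribution

def prob_between(dist, a, b):
    """P(a < X < b): the area under the density curve between a and b."""
    return dist.cdf(b) - dist.cdf(a)

p_below_1 = X.cdf(1.0)                  # P(X < 1)
p_between = prob_between(X, -1.0, 1.0)  # P(-1 < X < 1)
print(round(p_below_1, 4), round(p_between, 4))

# Narrower and narrower intervals around the value 1:
for half_width in (0.1, 0.001, 0.00001):
    print(half_width, prob_between(X, 1.0 - half_width, 1.0 + half_width))
```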
The distribution of IQ is symmetrical and single-peaked with a mean of 100 and a standard deviation of 15. If we were to take a sample from this population with a size of N=1, the IQ score of the one individual would be a random variable. Describe the probability distribution, the expected value and the standard deviation for this random variable (expected value and standard deviation: see the next cards).
The IQ score of the one individual is a random variable because it can change whenever we draw a new random sample.
The probability distribution for this random variable would have exactly the same shape as the population distribution. The graph would display a normal distribution (an example of a density curve) with only one peak. Values closer to the mean occur more frequently than those that are further away.
Part II Describe expected value
Any normal distribution has two parameters: the expected value and the standard deviation. The expected value of the distribution of the sample mean always equals the population mean; whether n=1 or n=25, the mean stays the same. It holds that
μ_xbar = μ.
Here the expected value of the distribution is 100. Therefore, the sample mean is an unbiased estimator of μ.
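A small simulation can make the statement about the expected value more tangible. This is only a sketch under assumptions: it draws IQ scores from a normal distribution with mean 100 and standard deviation 15, and the function names and the number of repetitions are chosen for illustration. The average of many sample means should land close to 100 for n=1 as well as for n=25.

```python
import random
from statistics import fmean

random.seed(0)  # fixed seed so the sketch is repeatable

MU, SIGMA = 100, 15  # population mean and standard deviation from the card

def sample_mean(n):
    """Draw one random sample of size n and return its mean."""
    return fmean([random.gauss(MU, SIGMA) for _ in range(n)])

def average_of_sample_means(n, repetitions=20_000):
    """Average the sample mean over many repeated samples of size n."""
    return fmean([sample_mean(n) for _ in range(repetitions)])

avg_n1 = average_of_sample_means(1)
avg_n25 = average_of_sample_means(25)
print(round(avg_n1, 1), round(avg_n25, 1))  # both close to 100
```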
Part III What is unbiased/biased?
-> an estimate is unbiased if, averaged across an infinite number of estimations, it would yield the parameter again. That is why we can always use the sample mean to estimate the expected value without bias, as long as the sample is random.
-> we obtain a biased estimate if we use an inappropriate estimator, or if the estimate comes from a non-random sample
Part IV Describe standard deviation
The standard deviation of the distribution, σ_xbar, determines the width of the curve. For N=1 it equals the population standard deviation, here 15; in general σ_xbar = σ / √N, so the curve becomes narrower as N grows. The shape of the distribution can best be explained with the central limit theorem: if N is large enough, the sampling distribution of x-bar will become approximately normal, regardless of the shape of the population distribution.
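How the width of the curve depends on the sample size can be sketched with a simulation. It relies on the standard result σ_xbar = σ / √N and again assumes a normal IQ population with mean 100 and standard deviation 15; the helper name and the number of repetitions are illustrative choices.

```python
import math
import random
from statistics import fmean, pstdev

random.seed(1)  # fixed seed so the sketch is repeatable

MU, SIGMA = 100, 15

def sd_of_sample_means(n, repetitions=5_000):
    """Standard deviation of the sample mean across many samples of size n."""
    means = [
        fmean([random.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(repetitions)
    ]
    return pstdev(means)

for n in (1, 25):
    simulated = sd_of_sample_means(n)
    theoretical = SIGMA / math.sqrt(n)  # 15 for n=1, 3 for n=25
    print(n, round(simulated, 2), round(theoretical, 2))
```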
X is a random variable with a specific expected value. If X2 is a linear transformation of X, what can you say about the expected value of X2? Try to link the answer to this question to the z-score transformation.
In general, the expected value of a function of a random variable, E[g(X)], is not the same as the function of the expected value, g(E[X]). The case in which they are always the same is when g is a linear transformation g(x) = a + bx.
-> the expected value of a linear transformation of X is just the linear transformation of the expected value of X: E(a + bX) = a + bE(X)
This relates to z-score transformations because Z = (X - μ) / σ is itself a linear transformation of X (with a = -μ/σ and b = 1/σ), so E(Z) = (E(X) - μ) / σ = 0. The transformation converts all the scores to a different scale of measurement, but the meaning of the scores stays the same; the expected value is simply expressed on the new scale.
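The rule for linear transformations can be checked exactly on a small discrete distribution. The distribution and the constants a and b below are hypothetical, chosen only to keep the arithmetic easy to follow. The sketch also shows that a non-linear function such as squaring does not commute with the expected value.

```python
import math
from fractions import Fraction

# Hypothetical discrete distribution: value -> probability
outcomes = {1: Fraction(1, 4), 2: Fraction(1, 2), 5: Fraction(1, 4)}

def expected_value(dist, g=lambda x: x):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(p * g(x) for x, p in dist.items())

mu = expected_value(outcomes)  # 5/2

# Linear transformation g(x) = a + b*x: E[g(X)] equals g(E[X]).
a, b = 3, 2
e_linear = expected_value(outcomes, lambda x: a + b * x)
print(e_linear, a + b * mu)  # both 8

# Non-linear transformation g(x) = x^2: E[g(X)] differs from g(E[X]).
e_square = expected_value(outcomes, lambda x: x * x)
print(e_square, mu ** 2)  # 17/2 versus 25/4

# The z-score is linear in X, so its expected value is (mu - mu) / sigma = 0.
sigma = math.sqrt(expected_value(outcomes, lambda x: (x - mu) ** 2))
e_z = expected_value(outcomes, lambda x: (x - mu) / sigma)
print(round(e_z, 12))
```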
If X and Y are two random variables, what can you say about the expected value of X+Y?
Suppose we have two random variables: X with a mean of μx and Y with a mean of μy.
-> the mean of the sum of these variables, μ(x+y), is given by the following equation:
E(X+Y) = E(X) + E(Y)
The same applies to the mean of the difference between these random variables, μ(x-y).
-> if X and Y are random variables, then
E(X-Y) = E(X) - E(Y)
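Both equations can be verified exactly on a small joint distribution. The joint probabilities below are hypothetical and deliberately make X and Y dependent, to show that these two rules for expected values do not need independence.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y); X and Y are dependent here.
joint = {
    (0, 1): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
    (1, 3): Fraction(1, 2),
}

def expectation(g):
    """Expected value of g(X, Y) under the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

e_x = expectation(lambda x, y: x)         # 3/4
e_y = expectation(lambda x, y: y)         # 2
e_sum = expectation(lambda x, y: x + y)   # 11/4
e_diff = expectation(lambda x, y: x - y)  # -5/4

print(e_sum, e_x + e_y)
print(e_diff, e_x - e_y)
```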
If X and Y are two independent random variables, what can you say about the variance of X-Y?
The concept of variance describes how strongly scores deviate from the mean. If X and Y are two independent random variables, then the variance of (X+Y) and the variance of (X-Y) are described by the following equation:
var(X+Y) = var(X-Y) = var(X) + var(Y)
-> so we can say that the variance of X-Y is equal to the variance of X+Y
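As a concrete check, the variances can be computed exactly for two independent fair dice. The dice are an assumed example and the helper name variance is illustrative. Enumerating all 36 equally likely pairs gives the same variance for the sum and for the difference.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

def variance(values_with_probs):
    """Variance of a discrete distribution given as (value, probability) pairs."""
    pairs = list(values_with_probs)
    mean = sum(prob * value for value, prob in pairs)
    return sum(prob * (value - mean) ** 2 for value, prob in pairs)

p_single = Fraction(1, 6)   # one fair die
p_pair = Fraction(1, 36)    # two independent fair dice

var_x = variance((x, p_single) for x in faces)
var_sum = variance((x + y, p_pair) for x, y in product(faces, repeat=2))
var_diff = variance((x - y, p_pair) for x, y in product(faces, repeat=2))

print(var_x, var_sum, var_diff)  # 35/12 35/6 35/6
```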
What do we mean by statistic?
A statistic is a value computed from a sample, that is, from a portion of a population. It is used as a way to understand the data that is collected about us and the world.
Estimator: used to guess something about a population. E.g. if we know the sample mean, we can use it to guess what the population mean is.
What do we mean by a parameter?
A parameter is a value that tells us something about a whole population. A parameter never changes, because everyone in the population is surveyed to find it.
Give an example of a statistic
60% of US residents agree with the latest health care proposal. It is not possible to actually ask hundreds of millions of people whether they agree. Researchers have to take samples and estimate the rest.
Give an example of a parameter
40% of 1211 students at a particular elementary school got below a 3 on a standardized test. We know this because we have each and every student's test score.
Is a mean a statistic or a parameter?
The mean is an example that can serve as both a statistic and a parameter. Unless we have the entire population, a parameter cannot be calculated. However, it can be estimated with a sample from the population. Thus a statistic is used to estimate a parameter. The larger the sample, the better our estimate.
How is a statistic related to a random variable?
Because a statistic is a summary of information about a parameter obtained from the sample, the value of a statistic depends on the particular sample that was drawn from the population. Its value changes randomly from one random sample to the next.
Therefore a statistic is a random variable.
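The difference between a fixed parameter and a randomly varying statistic can be sketched as follows. The population of 1,000 simulated scores is purely hypothetical; in real research the full population is usually not available.

```python
import random
from statistics import fmean

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 scores.
population = [random.gauss(100, 15) for _ in range(1_000)]
parameter = fmean(population)  # fixed: computed from the whole population

# Five different random samples of size 25 give five different statistics.
statistics_from_samples = [
    fmean(random.sample(population, 25)) for _ in range(5)
]

print(round(parameter, 2))
print([round(s, 2) for s in statistics_from_samples])
```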
What do we mean by the term sampling distribution?
A sampling distribution is the distribution of a statistic across all the random samples of a given size that could be drawn from the population; it is often shown as a graph. Some common statistics used for sampling distributions include the mean, the range, the standard deviation of the sample and the variance.
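For a very small population, the sampling distribution of the mean can be listed completely. The sketch below assumes a hypothetical population of four values and samples of size 2 drawn with replacement; it counts how often each sample mean occurs and compares the mean of all sample means with the population mean.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]  # hypothetical tiny population
n = 2                      # sample size

# All 16 equally likely samples of size 2 (sampling with replacement).
samples = list(product(population, repeat=n))
sample_means = [Fraction(sum(s), n) for s in samples]

# Sampling distribution: each possible sample mean with its probability.
sampling_distribution = Counter(sample_means)
for value in sorted(sampling_distribution):
    print(value, Fraction(sampling_distribution[value], len(samples)))

mean_of_means = sum(sample_means) / len(sample_means)
population_mean = Fraction(sum(population), len(population))
print(mean_of_means, population_mean)  # 5 5
```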
Discuss the difference between the population distribution, the distribution of sample scores and the sampling distribution. To what does each distribution relate?
Part 1: Population distribution
The population distribution is the distribution of the whole set of values, or individuals, we are interested in. The shape is fixed: it can be normal, skewed, multimodal, ... The common parameters are:
the population mean μ
and the population standard deviation σ.
-> it relates to the population; the parameters of the population distribution are of interest.
-> unknown in the beginning; estimated by means of a sample if it is large enough
-> z-scores -> Z = (X - μ) / σ
Discuss the difference between the population distribution, the distribution of sample scores and the sampling distribution. To what does each distribution relate?
Part 2: Distribution of sample scores
The distribution of sample scores is the distribution of X in a random sample. It is considered the starting point of our analysis. Its shape varies depending on the sample (EO). Samples of small size can differ a lot from each other. Large samples are more likely to closely resemble the population. Common estimates are: the sample mean X-bar and the sample standard deviation Sx.
-> it relates to the sample: the distribution of scores for a single SRS (simple random sample)
-> it is always known because it is the sample we have drawn.
-> z-scores -> Z = (X - X-bar) / Sx
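The sample version of the z-score can be sketched with a handful of made-up scores. The sample values are hypothetical, and the sketch assumes that Sx is the sample standard deviation with n - 1 in the denominator, which is what statistics.stdev computes.

```python
from statistics import fmean, stdev

sample = [4, 8, 6, 5, 7]  # hypothetical sample scores

x_bar = fmean(sample)  # sample mean X-bar
s_x = stdev(sample)    # sample standard deviation Sx (n - 1 denominator)

# Z = (X - X-bar) / Sx for every score in the sample
z_scores = [(x - x_bar) / s_x for x in sample]
print([round(z, 2) for z in z_scores])

# After the transformation the scores have mean 0 and standard deviation 1.
print(round(fmean(z_scores), 10), round(stdev(z_scores), 10))
```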
Discuss the difference between the population distribution, the distribution of sample scores and the sampling distribution. To what does each distribution relate?
Part 3: sampling distribution
The sampling distribution is the distribution of all the X-bar values we could obtain from random samples of a given size. We can see it as a theoretical distribution. It is based on the idea that infinite repetition is possible.
The shape is equal to that of the population for N=1, but if N is large enough, it becomes almost normal, even if the population is not (CLT). Common parameters are: the expected value (