Chapter 3: Fundamentals of Statistics Flashcards

1
Q

What is a random variable?

A

This is one that takes on numerical values and has an outcome that is determined by an experiment.

In other words, a random variable is defined as a variable that takes an observed random (and not deterministic) value.

For this chapter we denote random variables by uppercase letters (usually W, X, Y, and Z), whereas outcomes of random variables are denoted by the corresponding lowercase letters (w, x, y, and z).

2
Q

What is an example of a random variable?

A

▶ Outcome of a fair coin.
▶ Outcome of a fair die

We are interested in the number of Tails when tossing a fair coin twice.
▶ The sample space of the two coin tosses is:
▶ Ω = {HH, TH, HT, TT}
▶ We define X to be the number of Tails among the two tosses. Then X can take on 3 possible values:
X = 0 if HH
X = 1 if TH or HT
X = 2 if TT

▶ The sample space of X is: ΩX = {0, 1, 2}

3
Q

What is a Bernoulli random variable?

A

This is a random variable that can only take on the values zero and one.

A Bernoulli random variable is sometimes called a binary random variable.

4
Q

What is a discrete random variable?

A

This is one that takes only a finite number of values.

A bernoulli random variable is the simplest example of a discrete random variable.

Other examples:
▶ number of tails when tossing a coin twice,
▶ number of students registering for a class.

5
Q

What is an example of a Bernoulli random variable?

A

The coin flipping example.

If the coin is ‘fair’, then P(X = 1) = 1/2 (read as ‘the probability that X equals one is one-half’).

Because probabilities must sum to one, P(X = 0) = 1/2 also.

6
Q

What symbol usually represents an unknown probability?

A

θ (theta)

e.g. the probability of any particular customer showing up can be any number between zero and one:
P(X = 1) = θ
P(X = 0) = 1 - θ

If θ = 0.75 then there is a 75% chance that a customer shows up after making a reservation and a 25% chance that the customer does not show up.

7
Q

How is any discrete random variable usually depicted?

A

It’s usually depicted/described by listing its possible values and the associated probability that it takes on each value.

If X takes on the k possible values {x1, …, xk}, then the probabilities p1, p2, …, pk are defined by:

pj = P(X = xj), j = 1, 2, …, k
where each pj is between 0 and 1, and
p1 + p2 + … + pk = 1

8
Q

What are probability density functions (pdf)?

A

The pdf of X summarises the information concerning the possible outcome of X and the corresponding probabilities:
f(xj) = pj, j = 1, 2, …, k

9
Q

Suppose that X is the number of free throws made by a basketball player out of two attempts, so that X can take on the three values {0, 1, 2}.

Assume the pdf of X is given by:
f(0) = 0.2
f(1) = 0.44
f(2) = 0.36

What is the probability that the player makes at least one free throw?

A

At least one free throw = P(X ≥ 1)

P(X ≥ 1) = P(X = 1) + P(X = 2)
P(X ≥ 1) = 0.44 + 0.36
P(X ≥ 1) = 0.80
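The same calculation can be sketched in a few lines of Python (the dict representation of the pdf is an assumption for illustration, not from the text):

```python
# Minimal sketch: the pdf of X stored as a plain dict {value: probability}.
pdf = {0: 0.20, 1: 0.44, 2: 0.36}

# P(X >= 1) = P(X = 1) + P(X = 2)
p_at_least_one = sum(p for x, p in pdf.items() if x >= 1)
```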

10
Q

How do you draw a probability density function (pdf)?

A

f(x) on the y axis
x on the x axis

Probabilities are drawn as vertical lines starting from 0 on the x axis and rising to their respective heights on the y axis (shown on page 48 of the textbook).

11
Q

What are the types of random variables?

A

A discrete random variable takes on a finite number of values. For example:
▶ number of tails when tossing a coin twice,
▶ number of students registering for a class.

A continuous random variable takes on any value in a real interval. For example:
▶ Time to complete an assignment.
▶ Wages
▶ Return on stock market

12
Q

What is a continuous random variable?

A

A variable X is a continuous random variable if it takes on any real value with zero probability.

The idea is that a continuous random variable can take on so many possible values that we cannot count them or match them up with the positive integers.

For example:
▶ Time to complete an assignment.
▶ Wages
▶ Return on stock market

13
Q

How are pdfs used for continuous random variables?

A

We use the pdf of a continuous random variable only to compute probabilities of events involving a range of values, because it makes no sense to discuss the probability that a continuous random variable takes on a particular value.

e.g. if a and b are constants where a < b, the probability that X lies between the numbers a and b, P(a ≤ X ≤ b), is the area under the pdf between points a and b.

To find this value you compute the integral of the function f between points a and b.
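This integral can be approximated numerically. A hedged sketch, using the midpoint rule and the uniform(0, 1) density as an assumed example (not from the text):

```python
# Approximate P(a <= X <= b) as the area under a pdf f between a and b.
def prob_between(f, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Assumed example density: uniform on [0, 1].
uniform = lambda x: 1.0 if 0.0 <= x <= 1.0 else 0.0
p = prob_between(uniform, 0.2, 0.7)   # exact answer is 0.5
```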

14
Q

How do you draw a pdf for continuous random variables?

A

f(x) on the y axis
x on the x axis

The area underneath the function represents probability, meaning the entire area under the pdf must always equal one (example on page 49 of the textbook).

15
Q

What are cumulative distribution functions (cdf) used for?

A

When computing probabilities for continuous random variables, it is easiest to work with the cdf.

If X is any random variable, then its cdf is defined for any real number x by:
F(x) = P(X ≤ x)

16
Q

What are two important properties of cdfs that are useful for computing probabilities?

A

For any number c, P(X > c) = 1 - F(c)

For any numbers a < b, P(a < X ≤ b) = F(b) - F(a)
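Both properties can be checked on a small discrete example. A sketch, reusing the free-throw pdf from an earlier card (the dict layout is an assumption for illustration):

```python
# Discrete pdf of X: number of free throws made.
pdf = {0: 0.20, 1: 0.44, 2: 0.36}

def F(x):
    """cdf: F(x) = P(X <= x)."""
    return sum(p for v, p in pdf.items() if v <= x)

p_greater = 1 - F(0)        # property 1: P(X > 0) = 1 - F(0)
p_interval = F(2) - F(0)    # property 2: P(0 < X <= 2) = F(2) - F(0)
```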

17
Q

What are some useful (univariate) distributions?

A

Some discrete distributions that are useful in modeling social phenomena include the:
▶ Bernoulli, for binary outcomes (e.g. pass/fail in a test)
▶ Binomial, for independent repetitions of Bernoulli “trials” (e.g., number of successes in throwing a basketball)
▶ Poisson, for count variables (e.g. number of clicks on a site in a minute)

Some continuous distributions that are useful include the:
▶ Normal, for measurement errors
▶ Exponential, for waiting time for first occurrence (e.g., of first patient)
▶ Student t, Chi-square χ2, F distribution for testing hypotheses

18
Q

What is a joint distribution?

A

This describes events involving more than one random variable.

▶ We study these because we are usually interested in phenomena that involve more than one random variable:
▶ e.g. wage and gender, temperature and covid infection, etc.
▶ Therefore we study joint distributions.
▶ The joint distribution is described by the joint CDF, or the joint PDF.

19
Q

What is an example of joint distributions?

A

For example, the joint distribution of two discrete RVs:
▶ X : # of women among two customers
▶ Y : number of items bought

Is given by the table:

fXY             X
           0     1     2
      0  0.05  0.10  0.03
Y     1  0.21  0.11  0.19
      2  0.08  0.15  0.08

▶ Each cell is the joint probability Pr(X = x ∩ Y = y ).
▶ For example, Pr(X = 0 ∩ Y = 0) = .05.

20
Q

What are marginal distributions?

A

This gives the probabilities of various values of the variables in a subset without reference to the values of the other variables.

The marginal distributions of each RV can be obtained from the joint distribution:
fY(y) = Pr(Y = y) = ∑x Pr(Y = y ∩ X = x)
fX(x) = Pr(X = x) = ∑y Pr(X = x ∩ Y = y)

21
Q

Find Pr(Y = 0), given that the marginal PDF of Y is obtained by
computing the probabilities:
▶ Pr(Y = 0)
▶ Pr(Y = 1)
▶ Pr(Y = 2)

A

Pr (Y = 0) = ∑x=0,1,2 Pr (Y = 0 ∩ X = x)

(according to the table)
When Y = 0, the joint probabilities are 0.05 (X = 0), 0.10 (X = 1) and 0.03 (X = 2).

Pr (Y = 0) = Pr ({Y = 0 ∩ X = 0}) + Pr ({Y = 0 ∩ X = 1})+ Pr ({Y = 0 ∩ X = 2})

= 0.05 + 0.1 + 0.03 = 0.18
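The marginalisation sums can be sketched in Python; the joint table is the one from the customer example (the dict layout is an assumed representation):

```python
# joint[(x, y)] = Pr(X = x and Y = y), from the table in the example.
joint = {
    (0, 0): 0.05, (1, 0): 0.10, (2, 0): 0.03,
    (0, 1): 0.21, (1, 1): 0.11, (2, 1): 0.19,
    (0, 2): 0.08, (1, 2): 0.15, (2, 2): 0.08,
}

def f_Y(y):
    """Marginal Pr(Y = y) = sum over x of Pr(Y = y and X = x)."""
    return sum(p for (x, yy), p in joint.items() if yy == y)

def f_X(x):
    """Marginal Pr(X = x) = sum over y of Pr(X = x and Y = y)."""
    return sum(p for (xx, y), p in joint.items() if xx == x)
```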

22
Q

When are random variables independent?

A

▶ Two RVs are independent when knowing the value of one does not change the distribution (the probabilities) of the other.

▶ Formally, two RVs are independent if and only if the joint distribution is the product of the marginal distributions:
fXY(x, y) = fX(x) · fY(y)

▶ Independence is symmetric: If Y is independent of X then X is independent of Y.

23
Q

How do you check the independence of random variables?

A

▶ To check independence, we need to check for all pairs (x, y ) if Pr (Y = y ∩ X = x) = Pr (Y = y) · Pr (X = x) .
▶ In our example independence clearly does not hold, since

e.g.
Pr(Y = 0 ∩ X = 1) ≠ Pr(Y = 0) · Pr(X = 1)
0.10 ≠ 0.18 × 0.36

(according to table)
Pr (Y = 0 ∩ X = 1) = 0.1 (when y = 0 and x = 1)
Pr (Y = 0) = 0.18
Pr (X = 1) = 0.36
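The all-pairs check can be sketched directly on the joint table (the dict layout is an assumed representation of the example table):

```python
# Check independence: every cell must equal the product of its marginals.
joint = {
    (0, 0): 0.05, (1, 0): 0.10, (2, 0): 0.03,
    (0, 1): 0.21, (1, 1): 0.11, (2, 1): 0.19,
    (0, 2): 0.08, (1, 2): 0.15, (2, 2): 0.08,
}
xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})
fX = {x: sum(joint[(x, y)] for y in ys) for x in xs}
fY = {y: sum(joint[(x, y)] for x in xs) for y in ys}

independent = all(
    abs(joint[(x, y)] - fX[x] * fY[y]) < 1e-9 for x in xs for y in ys
)
# For this table, independence fails (e.g. 0.10 != 0.36 * 0.18).
```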

24
Q

What are conditional distributions?

A

A conditional distribution is the distribution of values of one variable when you fix the values of other variables. This type of distribution allows you to assess the behaviour of your variable of interest under specific conditions, hence the name.

This information is summarized by the conditional probability density function, defined by:
fY|X(y|x) = fXY(x, y)/fX(x)
for all values of x such that fX(x) > 0

25
Q

How are conditional distributions depicted with discrete random variables?

A

The conditional probability density function [fY|X(y|x) = fXY(x, y)/fX(x)] is most easily interpreted when X and Y are discrete; then:

fY|X (y|x) = P(Y = y|X = x)

The right-hand side is read as ‘the probability that Y = y given that X = x’.

26
Q

How are conditional distributions depicted with continuous random variables?

A

When Y is continuous, fY|X(y|x) is not interpretable directly as a probability, for the reasons discussed earlier, but conditional probabilities are found by computing areas under the conditional pdf.

27
Q

What are the few aspects of distributions of random variables that we will focus on?

A
  • Measures of central tendency (the expected value, the median);
  • Measures of variability or spread (variance and standard deviation); and
  • Measures of association between two random variables (covariance and correlation).
28
Q

What is the expected value (EV)?

A

If X is a random variable, the expected value (or expectation) of X (denoted E(X) and sometimes μX or simply μ) is a weighted average of all possible values of X.

AKA - the mean.

▶ Expected value or Expectation of a RV:
E(X) = ∑x x · Pr(X = x) for a discrete RV
E(X) = ∫ x · f(x) dx for a continuous RV

29
Q

What determines the weighted average of all possible values of X?

A

The Probability Density Function.

30
Q

If X is a continuous random variable, then what is the expected value defined as?

A

E(X) is defined as an integral:
E(X) = ∫ x · f(x) dx

Note: the integral runs from −∞ to ∞.

31
Q

Example of EV on discrete random variables:
▶ A fair die is tossed:
▶ You win $2 if the result is 1
▶ You win $1 if the result is a 6
▶ but otherwise you lose $1

What is the expected value/expectation from playing this game?

A

X:   $2    $1    −$1
p:   1/6   1/6   4/6

E(X) = $2 · (1/6) + $1 · (1/6) − $1 · (4/6) ≈ −$0.17

▶ On average you will lose 17 cents per play by playing this game.
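The same expectation in a short Python sketch (the list-of-pairs layout is an assumed representation):

```python
# Winnings paired with their probabilities for the die game.
outcomes = [(2.0, 1 / 6), (1.0, 1 / 6), (-1.0, 4 / 6)]

# E(X) = sum of value * probability = -1/6, about -$0.17 per play.
ev = sum(x * p for x, p in outcomes)
```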

32
Q

What are the rules for calculating EV?

A

When calculating expectations we should be aware of the following rules:
1. E(a) = a for any real constant a
2. E(a · X) = a · E(X)
3. E(a + b · X) = a + b · E(X)
4. E(r(X) + h(X)) = E(r(X)) + E(h(X))
5. E(a · X + b · Y) = a · E(X) + b · E(Y)

▶ Note that in general
E(r(X)) ≠ r(E(X))
For example
E(log(X)) ≠ log(E(X))
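Linearity, and the warning that E(log(X)) differs from log(E(X)), can be checked numerically. A sketch on an assumed discrete pdf with positive support (so log is defined):

```python
import math

pdf = {1: 0.20, 2: 0.44, 4: 0.36}   # assumed example pdf

def E(g):
    """Expectation of g(X) under the discrete pdf."""
    return sum(g(x) * p for x, p in pdf.items())

lhs = E(lambda x: 5 + 3 * x)        # E(a + b*X)
rhs = 5 + 3 * E(lambda x: x)        # a + b*E(X): equal by linearity

e_log = E(math.log)                 # E(log(X))
log_e = math.log(E(lambda x: x))    # log(E(X)): NOT the same in general
```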

33
Q

What is the median?

A

This is another measure of central tendency.

34
Q

What is the median?

A

This is another measure of central tendency.

If X is continuous, then the median of X, say m, is the value such that one-half of the area under the pdf is to the left of m and one-half is to the right of m.

35
Q

When X is discrete and takes on an odd, finite number of values, how is the median obtained?

A

When X is discrete and takes on an odd, finite number of values, the median is obtained by ordering the possible values of X and then selecting the value in the middle.

For example, if X can take on the values {-4, 0, 2, 8, 10, 13, 17}, then the median value of X is 8.

36
Q

Is median (Med(X)) or mean (E(X)) better?

A

They are different, but neither is better than the other as a measure of central tendency.

They are both valid ways to measure the centre of the distribution of X.

In one special case, the median and EV are the same: if X has a symmetric distribution about the value μ, then μ is both the EV and the median.

37
Q

What is variance?

A

▶ The variance is a measure of the dispersion of the RV around its mean.
▶ It is defined as: σ² = Var(X) = E((X − E(X))²)
▶ The variance measures the expected squared distance of X from its mean.
▶ The variance is a nonnegative real number, measured in the square of the units in which X is measured.
▶ We often find its square root, the standard deviation, more useful:

σ = √σ²

38
Q

Why is variance important to know?

A

Because it is a measure of variability, it is used because measures of central tendency do not tell us everything we want to know about the distribution of a random variable.

39
Q

What are the two important properties of variance?

A
  1. Var(X) = 0 if, and only if, there is a constant c such that P(X = c) = 1, in which case E(X) = c.

This first property says that the variance of any constant is zero, and if a random variable has zero variance then it is essentially constant.

  2. For any constants a and b, Var(aX + b) = a²Var(X)

This means that adding a constant to a random variable does not change the variance, but multiplying a random variable by a constant increases the variance by a factor equal to the square of that constant.
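Both properties can be verified numerically. A sketch on an assumed discrete pdf (the values of a and b are arbitrary illustrations):

```python
# Verify Var(aX + b) = a^2 * Var(X) and Var(constant) = 0.
pdf = {0: 0.20, 1: 0.44, 2: 0.36}   # assumed example pdf

def mean(pdf):
    return sum(x * p for x, p in pdf.items())

def var(pdf):
    mu = mean(pdf)
    return sum(p * (x - mu) ** 2 for x, p in pdf.items())

a, b = 3.0, 5.0
shifted = {a * x + b: p for x, p in pdf.items()}   # pdf of aX + b
```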

40
Q

What is standard deviation?

A

This is denoted as sd(X) and is simply the positive square root of the variance: sd(X) = + √var(X)

The standard deviation is sometimes denoted σx or σ.

41
Q

What are the properties of standard deviation?

A
  1. For any constant c, sd(c) = 0
  2. For any constants a and b, sd(aX + b) = |a|sd(X)
42
Q

How do you standardise a random variable?

A

We define a new random variable Z by subtracting off the mean of X (μ) and dividing by its standard deviation (σ):

Z = (X − μ)/σ

This can be written as: Z = aX + b, where a = 1/σ and b = −μ/σ

Therefore:
- E(Z) = aE(X) + b = (μ/σ) − (μ/σ) = 0
- Var(Z) = a²Var(X) = σ²/σ² = 1
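A quick numerical check of the standardisation, on an assumed discrete pdf:

```python
import math

# Standardise an assumed discrete RV; Z should have mean 0 and variance 1.
pdf = {0: 0.20, 1: 0.44, 2: 0.36}
mu = sum(x * p for x, p in pdf.items())
sigma = math.sqrt(sum(p * (x - mu) ** 2 for x, p in pdf.items()))

z_pdf = {(x - mu) / sigma: p for x, p in pdf.items()}   # pdf of Z
ez = sum(z * p for z, p in z_pdf.items())
vz = sum(p * (z - ez) ** 2 for z, p in z_pdf.items())
```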

43
Q

What is covariance and correlation?

A

They are measures of the linear relationship between two variables - AKA Measures of Association.

While the joint pdf of two random variables completely describes the relationship between them, it is useful to have summary measures of how, on average, two random variables vary with one another.

44
Q

What is covariance?

A

▶ Covariance measures the degree of linear association between the variables
▶ Cov (X , Y) = E((X − E(X)) · (Y − E(Y)))

45
Q

What are the rules for calculating covariance?

A
  1. Cov(a · X , b · Y) = a · b · Cov(X , Y) for any constants a, b.
  2. Cov(X1 + X2, Y) = Cov(X1, Y) + Cov(X2,Y)
  3. Cov(X, X) = Var(X)
  4. Cov (X , Y ) = E(X · Y ) − E(X ) · E(Y)
46
Q

What property shows how covariance is related to the notion of independence?

A

Property Cov. 1:
If X and Y are independent, then Cov(X, Y) = 0

It’s important to remember that the converse of cov.1 is not true: this means that zero covariance between X and Y does not imply that X and Y are independent.

47
Q

What property shows covariances between linear functions?

A

Property Cov. 2:
For any constants a1, b1, a2 and b2,
Cov(a1X + b1, a2Y + b2) = a1a2cov(X, Y)

48
Q

What property shows the absolute value of the covariance between any two random variables is bounded by the product of their standard deviations?

A

Property Cov. 3:
|cov(X,Y)| ≤ sd(X)sd(Y)

This is known as the Cauchy-Schwarz inequality.

49
Q

Why do we use the correlation coefficient?

A

How we measure variables may have no bearing on how strongly they are related, BUT the covariance between them does depend on the units of measurement.

E.g. the covariance between education and earnings depends on whether earnings are measured in dollars or thousands of dollars, or whether education is measured in months or years.

SUMMARY: the magnitude of Cov(X, Y) depends on the units of X and Y; this is why we often use the correlation coefficient.

50
Q

What is the correlation coefficient?

A

The fact that covariance depends on units of measurement is a deficiency that is overcome by the correlation coefficient between X and Y:

ρXY = Corr(X, Y) = Cov(X, Y)/[sd(X) · sd(Y)]
= σXY/(σX · σY)

which is a unit-free measure of their linear association.
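Covariance and correlation can be computed directly from a joint pdf. A sketch using the joint table from the earlier customer example (the dict layout is an assumed representation):

```python
import math

joint = {  # joint[(x, y)] = Pr(X = x and Y = y)
    (0, 0): 0.05, (1, 0): 0.10, (2, 0): 0.03,
    (0, 1): 0.21, (1, 1): 0.11, (2, 1): 0.19,
    (0, 2): 0.08, (1, 2): 0.15, (2, 2): 0.08,
}
EX = sum(x * p for (x, y), p in joint.items())
EY = sum(y * p for (x, y), p in joint.items())
cov = sum((x - EX) * (y - EY) * p for (x, y), p in joint.items())
sdX = math.sqrt(sum((x - EX) ** 2 * p for (x, y), p in joint.items()))
sdY = math.sqrt(sum((y - EY) ** 2 * p for (x, y), p in joint.items()))
rho = cov / (sdX * sdY)   # unit-free; always lies in [-1, 1]
```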

51
Q

What is the correlation coefficient sometimes denoted as?

A

The correlation coefficient between X and Y is sometimes denoted ρXY (and is sometimes called the population correlation).

52
Q

What are the properties of correlation?

A

It can be shown that ρXY ∈ [−1, 1]. In particular:
▶ ρXY = 0 : No correlation (but this does not imply independence)
▶ ρXY = 1 : Perfect positive correlation
- We can write Y = a + b · X with b > 0
▶ ρXY = −1 : Perfect negative correlation
- We can write Y = a + b · X with b < 0

▶ Two independent RVs are always uncorrelated. The
converse is NOT ALWAYS TRUE

53
Q

What is the 3rd important property of variance?

A

For constants a and b,

Var(aX + bY) = a²Var(X) + b²Var(Y) + 2ab·Cov(X, Y)

It follows immediately that, if X and Y are uncorrelated (so that cov(X,Y) = 0) then:

Var(X + Y) = Var(X) + Var(Y)
And
Var(X - Y) = Var(X) + Var(Y)

54
Q

What is the normal distribution?

A

A normally distributed random variable is a continuous random variable that can take any real value.

▶ Its probability density function is a bell shape.

55
Q

How is a normal distribution written out?

A

X ∼ N(μ, σ²)

  • N = normal
  • μ = E(X)
  • σ² = Var(X)

The pdf is written mathematically as:
f(x) = [1/(σ√(2π))] · exp[−(x − μ)²/(2σ²)], −∞ < x < ∞

56
Q

Why is the mean the same as the median for a normal distribution?

A

Because a normal distribution is symmetrical.

57
Q

How do you standardise a random variable distribution?

A

Z = (X - μ)/σ

E(Z) = 0
V(Z) = 1

E.G. If X ∼ N(10, 4)
Then Z = (X − 10)/2 ∼ N(0, 1)

58
Q

How do you calculate the following?

Suppose X ∼ N(8, 4)
What is P(4 < X < 12) = ?

A

First we normalise X:
Z = (X - 8)/2 ∼ N(0, 1)

P(4 < X < 12) = P((4 − 8)/2 < (X − 8)/2 < (12 − 8)/2)
= P(−2 < Z < 2) ≈ 0.954
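The numeric answer can be checked with the standard-library statistics module (Python 3.8+):

```python
from statistics import NormalDist

X = NormalDist(mu=8, sigma=2)     # X ~ N(8, 4): variance 4, so sd = 2
p = X.cdf(12) - X.cdf(4)          # P(4 < X < 12) = P(-2 < Z < 2)
```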

59
Q

What are some other useful distributions?

A
  • Chi-square distribution
  • t distribution
  • F distribution
60
Q

What is the chi-square distribution?

A

X = ∑(i=1 to n) Zi², where the Zi are independent standard normal random variables.

It is used to describe the distribution of a sum of squared standard normal random variables.

It is also used to test the goodness of fit of a distribution of data, to test whether data series are independent, and to estimate confidence intervals around the variance and standard deviation of a random variable from a normal distribution.

61
Q

What is a T distribution?

A

A T-distribution is similar to the standard normal distribution, but with a more pointed peak and fatter tails.

T = Z/√(X/n), where Z ∼ N(0, 1) and X ∼ χ² with n degrees of freedom, independent of Z.

62
Q

What is the F Distribution?

A

This is used for hypothesis testing in the context of multiple regression analysis.

The graph has a peak that leans towards the f(x) axis.

F = (X1/k1)/(X2/k2), where X1 ∼ χ²(k1) and X2 ∼ χ²(k2) are independent.

63
Q

What are conditional distributions?

A

Pr(Y = y|X = x) means the probability of Y = y given X = x.

Probability that 2 items are bought given that both customers are women: Pr(Y = 2|X = 2)

We fix the value of X.

The conditional PDF of one variable, given (a value of) the other variable, is:
fY|X=x(y) = fXY(x, y)/fX(x)
fX|Y=y(x) = fXY(x, y)/fY(y)

64
Q

How do you compute joint and marginal distributions in a table?

A

You add up each row/column and get the totals. Those totals are the marginals fY and fX.

Example

fXY            X            fY
           0     1     2
      0  0.05  0.10  0.03 | 0.18
Y     1  0.21  0.11  0.19 | 0.51
      2  0.08  0.15  0.08 | 0.31
_________________________________
fX       0.34  0.36  0.30 | 1.00

65
Q

How do you compute conditional distributions in a table?

A

You get the conditional distribution of Y given X.

JOINT and MARGINAL Distributions

fXY            X            fY
           0     1     2
      0  0.05  0.10  0.03 | 0.18
Y     1  0.21  0.11  0.19 | 0.51
      2  0.08  0.15  0.08 | 0.31
_________________________________
fX       0.34  0.36  0.30 | 1.00

CONDITIONAL Distribution of Y given X:

fY|X                    X
               0               1               2
      0  .05/.34 = .147  .10/.36 = .278  .03/.30 = .100
Y     1  .21/.34 = .618  .11/.36 = .306  .19/.30 = .633
      2  .08/.34 = .235  .15/.36 = .417  .08/.30 = .267
_______________________________________________________
Total          1               1               1
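Each column of the conditional table can be reproduced by dividing a column of the joint table by its marginal. A sketch for the X = 1 column (the dict layout is an assumed representation):

```python
# Conditional pdf of Y given X = 1: f_{Y|X}(y|1) = f_{XY}(1, y) / f_X(1).
joint = {
    (0, 0): 0.05, (1, 0): 0.10, (2, 0): 0.03,
    (0, 1): 0.21, (1, 1): 0.11, (2, 1): 0.19,
    (0, 2): 0.08, (1, 2): 0.15, (2, 2): 0.08,
}
fX1 = sum(p for (x, y), p in joint.items() if x == 1)   # marginal Pr(X = 1)

cond = {y: joint[(1, y)] / fX1 for y in (0, 1, 2)}      # sums to 1
```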

66
Q

What is the conditional expectation?

A

The mean of the conditional distribution of Y given that another variable X takes on a value x is the conditional expectation (or conditional mean) of Y given X = x and is denoted by:
E(Y|X = x)

67
Q

What is the conditional expectation function (CEF)?

A

E(Y|X = x) generally changes as we change x. That is, as we allow X to vary, we get a function of X, known as the conditional expectation function (CEF), denoted E(Y|X).