Statistics & Financial Modelling Flashcards

Question

Probability Distributions for Discrete Random Variables

Answer 1

Let X be a discrete random variable and x be one of its possible values.

* The probability that random variable X takes the specific value x is denoted P(X = x).
* The **probability distribution function** of a random variable is a representation of the probabilities for all the possible outcomes.
* It can be shown algebraically, graphically, or with a table.

Answer 2

* 0 ≤ P(x) ≤ 1 for any value of x
* The individual probabilities sum to 1: Σ P(x) = 1, summed over all possible values x

Answer 3

The **cumulative probability function**, denoted F(x₀), shows the probability that X does not exceed the value x₀: **F(x₀) = P(X ≤ x₀)**, where the function is evaluated at all values of x₀.

Answer 4

The derived relationship between the probability distribution and the cumulative probability distribution. Let X be a random variable with probability distribution P(x) and cumulative probability distribution F(x₀). Then F(x₀) = Σ P(x), where the sum is over all possible values x with x ≤ x₀.

Answer 5

Derived properties of cumulative probability distributions for discrete random variables. Let X be a discrete random variable with cumulative probability distribution F(x₀). Then

1. 0 ≤ F(x₀) ≤ 1 for every number x₀
2. If x₀ \< x₁, then F(x₀) ≤ F(x₁)

Answer 6

Expected Value (or mean) of a discrete random variable X: E[X] = μ = Σ x·P(x), summed over all possible values x.

Answer 7

If P(x) is the probability function of a discrete random variable X, and g(X) is some function of X, then the expected value of function g is E[g(X)] = Σ g(x)·P(x), summed over all possible values x.
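
A minimal Python sketch of this expectation, assuming the usual definition E[g(X)] = Σ g(x)·P(x). The distribution `dist` and the helper name `expected_value` are hypothetical, chosen only for illustration.

```python
# Hypothetical probability distribution P(x) for a discrete random variable X
# (illustrative values only; the probabilities must sum to 1).
dist = {0: 0.25, 1: 0.50, 2: 0.25}


def expected_value(dist, g=lambda x: x):
    """Return E[g(X)], the sum of g(x) * P(x) over all possible values x."""
    return sum(g(x) * p for x, p in dist.items())


# With g(x) = x this is the mean E[X].
mean = expected_value(dist)

# With g(x) = (x - mean)^2 this is the variance of X.
variance = expected_value(dist, lambda x: (x - mean) ** 2)

print(mean, variance)
```

Passing a different `g` gives the expected value of any other function of X without changing the helper.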

Answer 8

* Let random variable X have mean μ_x and variance σ²_x
* Let a and b be any constants.
* Let Y = a + bX
* Then the mean and variance of Y are μ_Y = E(a + bX) = a + bμ_x and σ²_Y = Var(a + bX) = b²σ²_x
* so that the standard deviation of Y is σ_Y = |b|σ_x

Answer 9

* Let a and b be any constants.
* a) E(a) = a and Var(a) = 0, i.e., if a random variable always takes the value a, it will have mean a and variance 0
* b) E(bX) = bμ_x and Var(bX) = b²σ²_x, i.e., the expected value of b·X is b·E(X)

Answer 10

* Consider only two outcomes: “**success**” or “**failure**”
* Let **p** denote the probability of success
* Let **1 – p** be the probability of failure
* Define random variable X: x = 1 if success, x = 0 if failure
* Then the **Bernoulli probability distribution** is P(0) = 1 – p and P(1) = p

Answer 11

The mean is μ_x = p. The variance is σ²_x = p(1 – p).

Answer 12

* A fixed number of observations, n
  * e.g., 15 tosses of a coin; ten light bulbs taken from a warehouse
* Two mutually exclusive and collectively exhaustive categories
  * e.g., head or tail in each toss of a coin; defective or not defective light bulb
  * Generally called “success” and “failure”
  * Probability of success is P, probability of failure is 1 – P
* Constant probability for each observation
  * e.g., the probability of getting a tail is the same each time we toss the coin
* Observations are independent
  * The outcome of one observation does not affect the outcome of the other

Answer 13

P(x) = [n! / (x!(n – x)!)] · p^x · (1 – p)^(n – x)

* P(x) = probability of x successes in n trials, with probability of success p on each trial
* x = number of “successes” in sample (x = 0, 1, 2, ..., n)
* n = sample size (number of independent trials or observations)
* p = probability of “success”
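
A short Python sketch of the binomial probability, assuming the standard formula P(x) = C(n, x)·p^x·(1 – p)^(n – x). The function name `binomial_pmf` is hypothetical, and p = 0.5 is an assumed fair coin for the 15-toss example.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# 15 tosses of a coin, assuming a fair coin (p = 0.5).
n, p = 15, 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]

# The probabilities over x = 0, 1, ..., n should sum to 1.
print(sum(probs))
```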

Answer 14

The shape of the binomial distribution depends on the values of p and n

Answer 15

* “n” trials in a sample taken from a finite population of size N
* Sample taken without replacement
* Outcomes of trials are dependent
* Concerned with finding the probability of “X” successes in the sample where there are “S” successes in the population
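
A Python sketch of this setting, assuming the standard hypergeometric formula P(x) = C(S, x)·C(N – S, n – x) / C(N, n). The function name `hypergeometric_pmf` and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeometric_pmf(x, n, N, S):
    """Probability of x successes in a sample of n drawn without
    replacement from a population of N that contains S successes."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)


# Hypothetical example: population of 10 items, 4 of them successes,
# sample of 3 drawn without replacement, probability of exactly 2 successes.
p_two = hypergeometric_pmf(2, n=3, N=10, S=4)
print(p_two)
```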

Answer 16

Properties of Joint Probability Distributions of Discrete Random Variables

Let X and Y be discrete random variables with joint probability distribution P(x, y).

1. 0 ≤ P(x, y) ≤ 1 for any pair of values x and y
2. The sum of the joint probabilities P(x, y) over all possible pairs of values must be 1

Answer 17

* The covariance measures the strength of the linear relationship between two variables
* If two random variables are statistically independent, the covariance between them is 0
* The converse is not necessarily true

Answer 18

* Let random variable X be the price for stock A
* Let random variable Y be the price for stock B
* The market value, W, for the portfolio is given by the linear function W = aX + bY (a is the number of shares of stock A, b is the number of shares of stock B)
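
A Python sketch of the mean and variance of W = aX + bY, assuming the standard results μ_W = aμ_x + bμ_y and σ²_W = a²σ²_x + b²σ²_y + 2ab·Cov(X, Y). The function name and all input numbers are hypothetical.

```python
def portfolio_mean_variance(a, b, mean_x, mean_y, var_x, var_y, cov_xy):
    """Mean and variance of the portfolio value W = a*X + b*Y."""
    mean_w = a * mean_x + b * mean_y
    var_w = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy
    return mean_w, var_w


# Hypothetical inputs: 10 shares of A, 20 shares of B.
mean_w, var_w = portfolio_mean_variance(
    a=10, b=20,
    mean_x=25.0, mean_y=40.0,
    var_x=81.0, var_y=121.0,
    cov_xy=40.0,
)
print(mean_w, var_w, var_w ** 0.5)  # mean, variance, standard deviation
```

A negative covariance would reduce the portfolio variance, which is why the covariance term matters when combining stocks.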

Answer 19

* A continuous random variable is a variable that can assume any value in an interval
  * thickness of an item
  * time required to complete a task
  * temperature of a solution
  * height, in inches
* These can potentially take on any value, depending only on the ability to measure accurately.

Answer 20

The probability density function, f(x), of random variable X has the following properties:

Answer 21

The uniform distribution is a probability distribution that has equal probabilities for all equal-width intervals within the range of the random variable

Answer 22

* The mean of X, denoted μ_x, is defined as the expected value of X: μ_x = E[X]
* The variance of X, denoted σ_x², is defined as the expectation of the squared deviation, (X – μ_x)², of a random variable from its mean: σ_x² = E[(X – μ_x)²]

Answer 23

1. Bell Shaped
2. Symmetrical
3. Mean, Median and Mode are Equal

* Location is determined by the mean, μ
* Spread is determined by the standard deviation, σ
* The random variable has an infinite theoretical range: –∞ to +∞
* The normal distribution closely approximates the probability distributions of a wide range of random variables
* Distributions of sample means approach a normal distribution given a “large” sample size
* Computations of probabilities are direct and elegant
* The normal probability distribution has led to good business decisions for a number of applications

Answer 24

By varying the parameters μ and σ, we obtain different normal distributions

Answer 25

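The formula for the normal probability density function is f(x) = [1 / (σ√(2π))] · e^(–(x – μ)² / (2σ²)), where μ is the mean and σ is the standard deviation.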

Answer 26

Note that the distribution is the same, only the scale has changed. We can express the problem in original units (X) or in standardized units (Z)

Answer 27

* The Standard Normal Distribution table in the textbook (Appendix Table 1) shows values of the cumulative normal distribution function.
* For a given Z-value a, the table shows F(a) (the area under the curve from negative infinity to a).

Answer 28

To find **P(a \< X \< b)** when X is distributed normally:

* Draw the normal curve for the problem in terms of X;
* Translate X-values to Z-values;
* Use the Cumulative Normal Table.

Answer 29

Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(X \< 8.6)

Answer 30

Suppose X is normal with mean 8.0 and standard deviation 5.0. Now find P(X \> 8.6)
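
A Python sketch of both calculations, standardising to Z = (X – μ)/σ and evaluating the cumulative normal with the error function instead of a printed table. The function name `normal_cdf` is hypothetical.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Cumulative probability P(X <= x) for a normal random variable."""
    z = (x - mu) / sigma              # translate the X-value to a Z-value
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma = 8.0, 5.0

p_below = normal_cdf(8.6, mu, sigma)  # P(X < 8.6), Z = 0.12
p_above = 1 - p_below                 # P(X > 8.6) is the complement

print(round(p_below, 4), round(p_above, 4))
```

The two probabilities add to 1 because the upper tail is the complement of the cumulative probability.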

Answer 31

Steps to find the X value for a known probability:

1. Find the Z value for the known probability
2. Convert to X units using the formula: x_a = μ + z_a·σ

**Example:**

* Suppose X is normal with mean 8.0 and standard deviation 5.0.
* Now find the X value so that only 20% of all values are below this X
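
The two steps can be sketched in Python with the standard library's `statistics.NormalDist`, whose `inv_cdf` method plays the role of reading the table in reverse. Variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0

# Step 1: Z value with 20% of the standard normal distribution below it.
z = NormalDist().inv_cdf(0.20)

# Step 2: convert to X units with x = mu + z * sigma.
x = mu + z * sigma

print(round(z, 2), round(x, 2))
```

A table lookup rounds Z to two decimals (about –0.84), so a hand calculation gives roughly 3.80 rather than the slightly more precise value computed here.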

Answer 32

Not all continuous random variables are normally distributed. It is important to evaluate how well the data is approximated by a normal distribution.

Answer 33

* Arrange data from low to high values;
* Find cumulative normal probabilities for all values;
* Examine a plot of the observed values vs. cumulative probabilities (with the cumulative normal probability on the vertical axis and the observed data values on the horizontal axis);
* Evaluate the plot for evidence of linearity.

Answer 34

* Recall the binomial distribution:
  * n independent trials
  * probability of success on any given trial = p
* Random variable X:
  * X_i = 1 if the i^th trial is “success”
  * X_i = 0 if the i^th trial is “failure”

**E[X] = μ = np**

**Var(X) = σ² = np(1 – p)**
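
A Python sketch of using this mean and variance to approximate a binomial probability with a normal distribution. It assumes n is large and applies no continuity correction; the function name and the example values n = 200, p = 0.4 are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def binomial_normal_approx_cdf(k, n, p):
    """Approximate P(X <= k) for a binomial X using a normal distribution
    with mean n*p and variance n*p*(1 - p). No continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(k)


# Hypothetical example: n = 200 trials, p = 0.4, so the mean is 80
# and the variance is 48.
approx = binomial_normal_approx_cdf(80, n=200, p=0.4)
print(approx)
```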

Answer 35

Used to model the length of time between two occurrences of an event (the time between arrivals)

Answer 36

A financial portfolio can be viewed as a linear combination of separate financial instruments

Answer 37

Use the Z value to look up the value of F(Z) in the table.

Answer 38

Contents of this chapter:

* Confidence Intervals for the **Population Mean, μ**
  * when Population Variance σ² is Known
  * when Population Variance σ² is Unknown
* Confidence Intervals for the **Population Proportion, P** (large samples)
* Confidence interval estimates for the **variance** of a normal population
* Finite population corrections
* Sample-size determination

Answer 39

* An **estimator** of a population parameter is
  * a random variable that depends on sample information . . .
  * whose value provides an approximation to this unknown parameter
* A specific value of that random variable is called an **estimate**

Answer 40

* A point estimate is a single number
* A confidence interval provides additional information about variability

Answer 41

* Suppose there are several unbiased estimators of θ
* The **most efficient estimator** or the **minimum variance unbiased estimator** of θ is the unbiased estimator with the **smallest variance**

Answer 42

How much uncertainty is associated with a point estimate of a population parameter?

* An interval estimate provides more information about a population characteristic than does a point estimate
* Such interval estimates are called **confidence interval estimates.**
* An interval gives a range of values:
  * Takes into consideration variation in sample statistics from sample to sample
  * Based on observation from 1 sample
  * Gives information about closeness to unknown population parameters
  * Stated in terms of level of confidence
  * Can never be 100% confident

Answer 43

* If P(a \< θ \< b) = 1 – α, then the interval from a to b is called a 100(1 – α)% confidence interval of θ.
* The quantity 100(1 – α)% is called the confidence level of the interval
* α is between 0 and 1
* In repeated samples of the population, the true value of the parameter θ would be contained in 100(1 – α)% of intervals calculated this way.
* The confidence interval calculated in this manner is written as a \< θ \< b with 100(1 – α)% confidence

Answer 44

The general form for all confidence intervals is: point estimate ± margin of error

Answer 45

* Assumptions
  * Population variance σ² is known
  * Population is normally distributed
  * If population is not normal, use large sample

Answer 46

Commonly used confidence levels are 90%, 95% and 99%
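
For a two-sided interval based on the normal distribution, each of these confidence levels corresponds to a Z value that cuts off α/2 in each tail. The Python sketch below assumes that two-sided setup; the function name `z_critical` is hypothetical.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Z value leaving alpha/2 in each tail, where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


z_values = {level: z_critical(level) for level in (0.90, 0.95, 0.99)}

for level, z in z_values.items():
    print(f"{level:.0%} confidence: z = {z:.3f}")
```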

Answer 47

If the population standard deviation σ is unknown, we can **substitute the sample standard deviation, s**. This introduces extra uncertainty, since s is variable from sample to sample, so we **use the t distribution** instead of the normal distribution.

Answer 48

* Goal: Form a confidence interval for the population variance, σ²
* The confidence interval is based on the sample variance, s²
* Assumed: the population is normally distributed

Answer 49

If the sample size is more than 5% of the population size (and sampling is without replacement) then a finite population correction factor must be used when calculating the standard error. Suppose sampling is without replacement and the sample size is large relative to the population size. Assume the population size is large enough to apply the central limit theorem. Apply the finite population correction factor when estimating the population variance.

Answer 50

Consider a simple random sample of size n from a population of size N. The quantity to be estimated is the population total Nμ. An unbiased estimation procedure for the population total Nμ yields the point estimate N·x̄.

Answer 51

A hypothesis is a claim (assumption) about a population parameter:

* population mean
  * Example: The mean monthly cell phone bill of this city is μ = $52
* population proportion
  * Example: The proportion of adults in this city with cell phones is p = .88

Answer 52

* Is the opposite of the null hypothesis
  * e.g., The average number of TV sets in U.S. homes is not equal to 3 (H₁: μ ≠ 3)
* Challenges the status quo
* Never contains the “=”, “≤” or “≥” sign
* May or may not be supported
* **Is generally the hypothesis that the researcher is trying to support**

Answer 53

* Defines the unlikely values of the sample statistic if the null hypothesis is true
* Defines **rejection region** of the sampling distribution
* Is designated by **α** (level of significance)
* Typical values are 0.01, 0.05, or 0.10
* Is selected by the researcher at the beginning
* Provides the **critical value(s)** of the test

Answer 54

**Type I Error**

* Reject a true null hypothesis
* Considered a serious type of error
* _The probability of Type I Error is **α**_
  * Called **level of significance** of the test
  * Set by researcher in advance

**Type II Error**

* Fail to reject a false null hypothesis
* _The probability of Type II Error is β_

Answer 55

* p-value: Probability of obtaining a test statistic more extreme (≤ or ≥) than the observed sample value given H₀ is true
* Also called observed level of significance
* Smallest value of α for which H₀ can be rejected

Answer 56

There is only one critical value, since the rejection area is in only one tail

Answer 57

* Involves categorical variables
* Two possible outcomes
  * “Success” (a certain characteristic is present)
  * “Failure” (the characteristic is not present)
* Fraction or proportion of the population in the “success” category is denoted by P
* Assume sample size is large

Answer 58

p-Value Solution

Answer 59

If the true mean is μ\* = 50, the probability of Type II Error = β = 0.1539, and the power of the test = 1 – β = 1 – 0.1539 = 0.8461.

Answer 60

**Goal:** Form a confidence interval for the difference between two population means, μ_x – μ_y

* Different populations
  * Unrelated
  * Independent
    * Sample selected from one population has no effect on the sample selected from the other population
* Normally distributed

Answer 61

Goal: Test hypotheses for the difference between two population proportions, **P_x – P_y**

Assumptions: Both sample sizes are large, **nP(1 – P) \> 5**

Answer 62

* A test with low power can result from:
  * Small sample size
  * Large variances in the underlying populations
  * Poor measurement procedures
* If sample sizes are large it is possible to find significant differences that are not practically important
* Researchers should select the appropriate level of significance before computing p-values

Answer 63

* Compared two dependent samples (paired samples)
  * Performed paired sample t test for the mean difference
* Compared two independent samples
  * Performed z test for the differences in two means
  * Performed pooled variance t test for the differences in two means
* Compared two population proportions
  * Performed z-test for two population proportions
* Performed F tests for the difference between two population variances
  * Used the F table to find F critical values

Answer 64

* The coefficients b₀ and b₁, and other regression results in this chapter, will be found using a computer
  * Hand calculations are tedious
  * Statistical routines are built into Excel
  * Other statistical analysis software can be used

Answer 65

* b₀ is the estimated average value of y when the value of x is zero (if x = 0 is in the range of observed x values)
* b₁ is the estimated change in the average value of y as a result of a one-unit change in x

Answer 66

The coefficient of determination, R², for a simple regression is equal to the simple correlation squared: R² = r²
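
A Python sketch that checks this identity numerically on a small, hypothetical data set. It assumes the usual least-squares estimates b₁ = S_xy / S_xx and b₀ = ȳ – b₁x̄, and computes R² as 1 – SSE/SST.

```python
from math import sqrt

# Hypothetical sample data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sums of squares and cross-products about the means.
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

# Least-squares slope and intercept.
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Coefficient of determination from the residuals: R^2 = 1 - SSE / SST.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
r_squared = 1 - sse / s_yy

# Simple correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)

print(r_squared, r ** 2)  # the two values agree up to rounding error
```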


Quantitative Finance