Class Test 2 Flashcards

1
Q

What is the period of a Poisson distribution?

A

Any quantity, such as time or length, as long as the rate is fixed for the experiment.

2
Q

What does a Poisson distribution model?

A

The number of times an event happens in a specified period.
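As a minimal sketch of this card, the snippet below evaluates the standard Poisson probability P(X = k) = e^(-lam) * lam^k / k!. The function name `poisson_pmf` and the rate of 3 events per period are hypothetical, chosen only for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson variable averaging lam events per period."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Hypothetical example: on average 3 events happen per period.
lam = 3.0
probs = [poisson_pmf(k, lam) for k in range(50)]

print(f"P(X = 2) = {poisson_pmf(2, lam):.4f}")
print(f"Sum of P(X = k) for k = 0..49: {sum(probs):.6f}")
```

The probabilities over all counts should sum to (almost exactly) 1, which is a quick sanity check on the formula.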

3
Q

What are the two main properties of Poisson random variables?

A

The probability that an event occurs is the same for intervals of the same size.
Events in non-overlapping intervals occur independently of one another.

4
Q

Define random variable.

A

A random variable is a variable which assumes numerical values representing the outcome of an experiment.

5
Q

What is a continuous random variable?

A

A random variable which can assume any value in an interval, i.e. an uncountably infinite number of numerical values.

6
Q

What is the probability of exactly one value, P(X = x), for any continuous random variable?

A

0

7
Q

What does p.d.f. stand for?

A

probability density function

8
Q

Define the conditions for p.d.f.

A

The output of the pdf is always greater than or equal to zero.

Probabilities are given by the area under the curve f(x). Hence integrating f(x) from -infinity to infinity must equal 1.

9
Q

Cumulative distribution function for uniform distribution?

A

0 if x < a
(x - a) / (b - a) if a <= x <= b
1 if x > b
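The three cases above translate directly into code. This is a small sketch; the function name `uniform_cdf` and the example interval from 0 to 10 are assumptions made for illustration.

```python
def uniform_cdf(x: float, a: float, b: float) -> float:
    """P(X <= x) for X ~ U(a, b), following the three cases on the card."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


# Hypothetical example: X ~ U(0, 10).
for x in (-1.0, 2.5, 5.0, 12.0):
    print(x, uniform_cdf(x, 0.0, 10.0))
```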

10
Q

Cumulative distribution function for exponential distribution.

A

P(X <= x) = 1 - e^(-lambda * x)

11
Q

For an exponential random variable, if we have already waited a units for the event, what is the probability that we have to wait at least another b units?

A

The same as the probability of waiting more than b units from the start, because of the memoryless property: P(X > a + b | X > a) = P(X > b).
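The memoryless property can be checked numerically from the exponential CDF, P(X <= x) = 1 - e^(-lambda * x). The sketch below uses hypothetical values for lambda, a and b; the helper name `exp_survival` is also an assumption.

```python
import math


def exp_survival(x: float, lam: float) -> float:
    """P(X > x) = 1 - CDF = e^(-lam * x) for an exponential variable."""
    return math.exp(-lam * x)


# Hypothetical values: rate 0.5, already waited a = 2, wait another b = 3.
lam, a, b = 0.5, 2.0, 3.0

# P(X > a + b | X > a) = P(X > a + b) / P(X > a)
conditional = exp_survival(a + b, lam) / exp_survival(a, lam)
unconditional = exp_survival(b, lam)

print(conditional, unconditional)
```

Both numbers should agree up to floating-point rounding, whatever values of a and b are chosen.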

12
Q

Explain the relationship between a Poisson and an exponential random variable.

A

The time between consecutive events of a Poisson process follows an exponential distribution with the same rate lambda.
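One way to see this link is that "no events in a period of length t" (a Poisson statement) and "the wait for the next event exceeds t" (an exponential statement) describe the same thing, so their probabilities should match. The sketch below checks this with hypothetical values of lambda and t; the function names are assumptions.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """P(N = k) for a Poisson count with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def exp_cdf(x: float, lam: float) -> float:
    """P(T <= x) for an exponential waiting time with rate lam."""
    return 1.0 - math.exp(-lam * x)


# Hypothetical values: 2 events per unit time, window of length 1.5.
lam, t = 2.0, 1.5

p_no_events = poisson_pmf(0, lam * t)    # Poisson count over [0, t] is 0
p_wait_longer = 1.0 - exp_cdf(t, lam)    # exponential wait exceeds t

print(p_no_events, p_wait_longer)
```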

13
Q

Differentiate between Poisson and exponential.

A

Exponential measures wait time between events.

Poisson measures events per period of time.

14
Q

Differentiate between exponential and Weibull distributions.

A

Weibull generalises the exponential but drops the memoryless property. So the probability of waiting another b units after already waiting a units is, in general, not the same as the probability of waiting b units.

Allows the failure probability to vary with time.

15
Q

What is a uniform distribution X~U(a,b)?

A

A distribution on the interval from a (lower bound) to b (upper bound) in which all values are equally likely.

16
Q

What is a Weibull distribution X~Weibull(a, β)?

A

a is the shape parameter, which controls the rate at which the probability density decreases with respect to X.

β is the scale parameter which determines the size of the values of X for which the distribution is most concentrated.

17
Q

Cumulative distribution function for Weibull.

A

P(X <= x) = 1 - e^(-(x / β)^a)
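A short sketch of the Weibull CDF, with shape a and scale β as on the card. It also illustrates two points from the earlier cards: with a = 1 the Weibull reduces to an exponential with rate 1 / β, and with a different from 1 the memoryless property no longer holds. The function name and all numeric values are hypothetical.

```python
import math


def weibull_cdf(x: float, a: float, beta: float) -> float:
    """P(X <= x) = 1 - e^(-(x / beta)^a) for x >= 0, shape a, scale beta."""
    if x < 0:
        return 0.0
    return 1.0 - math.exp(-((x / beta) ** a))


# With shape a = 1 the CDF matches an exponential with rate 1 / beta.
beta = 2.0
weibull_shape_one = weibull_cdf(3.0, 1.0, beta)
exponential = 1.0 - math.exp(-(1.0 / beta) * 3.0)

# With shape a = 2 the wait is no longer memoryless:
# compare P(X > 1 + 1 | X > 1) with P(X > 1).
conditional = (1 - weibull_cdf(2.0, 2.0, 1.0)) / (1 - weibull_cdf(1.0, 2.0, 1.0))
unconditional = 1 - weibull_cdf(1.0, 2.0, 1.0)

print(weibull_shape_one, exponential)
print(conditional, unconditional)
```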

18
Q

What is the effect of each of the following on the graph?
(a) Increase mean
(b) Decrease mean
(c) Increase variance
(d) Decrease variance

A

(a) Shape the same, but location shifts to the right.
(b) Shape the same, but location shifts to the left.
(c) Shape flattened, more spread out, but location the same.
(d) Shape narrowed, more concentrated, but location the same.

19
Q

What is a standard normal distribution?

A

A normal distribution with mean 0 and variance 1.

20
Q

How do you calculate each of the following for a distribution symmetric about 0, such as the standard normal?
(a) P(X < -a)
(b) P(X > a)
(c) P(X > -a)

A

(a) P(X < -a) = P(X > a) = 1 - P(X < a)
(b) P(X > a) = 1 - P(X < a)
(c) P(X > -a) = P(X < a)
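These identities rely on the symmetry of the standard normal about 0. A minimal check using `statistics.NormalDist` from the Python standard library (Python 3.8 or later) is sketched below; the cut-off a = 1.25 is a hypothetical value.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1
a = 1.25          # hypothetical cut-off value

p_below_a = Z.cdf(a)              # P(X < a), the value a Z table gives
p_below_minus_a = Z.cdf(-a)       # (a) P(X < -a)
p_above_a = 1 - Z.cdf(a)          # (b) P(X > a)
p_above_minus_a = 1 - Z.cdf(-a)   # (c) P(X > -a)

print(p_below_minus_a, 1 - p_below_a)   # (a) agrees with 1 - P(X < a)
print(p_above_minus_a, p_below_a)       # (c) agrees with P(X < a)
```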

21
Q

Z table formula?

A

Z = (X - μ) / σ
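As an illustration of why the Z formula is useful, the sketch below standardises a value from a hypothetical normal distribution (mean 100, standard deviation 15) and confirms that looking up Z in the standard normal gives the same probability as working with X directly.

```python
import math
from statistics import NormalDist

# Hypothetical example: X ~ Normal(mean 100, standard deviation 15).
mu, sigma = 100.0, 15.0
x = 130.0

z = (x - mu) / sigma                       # standardise: Z = (X - mu) / sigma
p_from_z = NormalDist().cdf(z)             # P(Z < z) from the standard normal
p_direct = NormalDist(mu, sigma).cdf(x)    # P(X < x) computed directly

print(z, p_from_z, p_direct)
```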

22
Q

For a very large n, which other distribution can be approximated by the normal distribution?

A

Binomial

23
Q

What is a sample statistic?

A

A sample statistic summarises a random sample, so it is itself subject to randomness.

T(x) = T(x1, ..., xn)

where x = (x1, ..., xn) is a random sample

24
Q

On what does the distribution of the sample statistic depend?

A

sample size n

25
What is a statistical estimator?
A sample statistic that is used to estimate a parameter θ of the population.
26
What are the properties of T(X)?
Unbiasedness: how close the average of T is to the true parameter.
Standard deviation: the precision of T. Low variability is good when T is centred on the true value.
27
How are estimators denoted?
With a ^ (hat) symbol above the letter.
28
Central Limit Theorem Formula. Explain.
X bar ~ Normal(μ, σ^2 / n), approximately.
This is the approximate distribution of the sample mean, even when we don't know the type of distribution of X.
29
If X is not normal does the CLT still apply?
Yes, if n is sufficiently large then the sample mean X bar will be approximately normal, whatever the distribution of X (provided it has finite variance).
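A small simulation can make the Central Limit Theorem concrete. The sketch below draws repeated samples from an exponential population (clearly not normal, with mean 1 and variance 1) and checks that the sample means have mean close to μ and variance close to σ^2 / n. The sample size, number of repetitions and random seed are all hypothetical choices.

```python
import random
from statistics import mean, variance

rng = random.Random(0)   # fixed seed so the run is repeatable

n = 50          # size of each sample
reps = 2000     # number of samples drawn
mu, sigma_sq = 1.0, 1.0  # mean and variance of an exponential with rate 1

# Draw many samples and record each sample mean.
sample_means = [
    mean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

print("mean of sample means:", mean(sample_means), "expected about", mu)
print("variance of sample means:", variance(sample_means),
      "expected about", sigma_sq / n)
```

A histogram of `sample_means` would look roughly bell shaped, even though the individual exponential observations are strongly right-skewed.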
30
Expected value of continuous distribution?
E[X] = integral of x * f(x) dx over all x
31
Variance of continuous distribution?
Var(X) = E[X^2] - (E[X])^2, where E[X^2] = integral of x^2 * f(x) dx over all x
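The two integrals can be approximated numerically. The sketch below uses a simple midpoint rule on a hypothetical uniform distribution on the interval from 0 to 10, for which the exact answers are E[X] = 5 and Var(X) = 100 / 12. The helper name `integrate` and the step count are assumptions.

```python
def integrate(g, lo: float, hi: float, steps: int = 10_000) -> float:
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * width) for i in range(steps)) * width


# Hypothetical example: X ~ U(0, 10), so f(x) = 1 / (b - a) on [a, b].
a, b = 0.0, 10.0


def f(x: float) -> float:
    return 1.0 / (b - a)


expected = integrate(lambda x: x * f(x), a, b)           # E[X]
expected_sq = integrate(lambda x: x**2 * f(x), a, b)     # E[X^2]
var = expected_sq - expected**2                          # E[X^2] - (E[X])^2

print(expected, var)
```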
32
What is a confidence interval?
A confidence interval for a population parameter is an interval, calculated from the sample, which contains the true parameter value with a stated level of confidence (e.g. 95%).
33
Formula for confidence interval by central limit theorem. What is a?
[X bar - Z(a/2) * s / root(n), X bar + Z(a/2) * s / root(n)]
a is the decimal value for the probability we allow outside the interval, so the confidence level is 1 - a (e.g. a = 0.05 for 95%).
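The interval formula can be sketched in a few lines using only the standard library. The function name `z_confidence_interval`, the simulated data and the seed are hypothetical; `NormalDist().inv_cdf(1 - a / 2)` is used here to obtain Z(a/2) instead of reading it from a table.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def z_confidence_interval(sample, alpha=0.05):
    """CLT-based interval: x_bar -/+ Z(alpha/2) * s / root(n)."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)                          # sample standard deviation
    z = NormalDist().inv_cdf(1 - alpha / 2)    # Z(alpha/2)
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical data: 100 simulated measurements.
rng = random.Random(1)
sample = [rng.gauss(50, 10) for _ in range(100)]

low95, high95 = z_confidence_interval(sample, alpha=0.05)
low99, high99 = z_confidence_interval(sample, alpha=0.01)
print("95% interval:", low95, high95)
print("99% interval:", low99, high99)
```

A smaller a (higher confidence) gives a wider interval, which is the trade-off the formula expresses.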
34
What is the question which hypothesis testing seeks to answer?
Is the relationship observed in the sample clear enough to be called statistically significant, or could it have been due to chance?
35
What is the null hypothesis (H0)?
Usually says that nothing changed or happened. The status quo, or what is currently believed.
36
What is the alternative hypothesis (Ha)?
The complementary hypothesis. It is often the alternative that challenges the status quo; in an experiment it is what the researcher wants to show.
37
What does a p-value do?
It measures, assuming that the null hypothesis is true, how likely it would be to obtain results at least as extreme as those we have observed.
38
What are the two outcomes of hypothesis testing?
Fail to reject the null hypothesis. Reject the null hypothesis.
39
Distinguish between a type 1 and type 2 error.
Type 1: Rejecting the null hypothesis H0 when it is in fact true (false positive).
Type 2: Failing to reject the null hypothesis H0 when it is in fact false (false negative).
40
What is a Z-test? Formula.
Hypothesis testing on the sample mean.
z = (x bar - μ) / (σ / root(n))
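A minimal sketch of a Z-test follows, assuming a two-sided alternative and a known population standard deviation. The function name `z_test` and the numbers in the example are hypothetical.

```python
import math
from statistics import NormalDist


def z_test(x_bar: float, mu0: float, sigma: float, n: int):
    """Return the z statistic and the two-sided p-value under H0: mean = mu0."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical example: sample mean 52 from n = 100, testing mu0 = 50
# with known sigma = 10.
z, p_value = z_test(x_bar=52.0, mu0=50.0, sigma=10.0, n=100)
print(z, p_value)

alpha = 0.05
print("reject H0" if p_value < alpha else "fail to reject H0")
```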
41
How do you know if the failure rate of a Weibull distribution increases or decreases with time?
If a is greater than one it increases with time.
If a is less than one it decreases with time.
42
When making a 99% confidence interval, what is the decimal Z score used?
2.58 (2.576 rounded up to two decimal places).
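The Z score for a given confidence level can be recovered from the inverse CDF of the standard normal, which shows where the rounded table value comes from. This is a sketch using the standard library only.

```python
from statistics import NormalDist

confidence = 0.99
alpha = 1 - confidence                       # probability left outside
z = NormalDist().inv_cdf(1 - alpha / 2)      # Z(alpha / 2)

print(z)             # unrounded critical value
print(round(z, 2))   # value as it would appear to two decimal places
```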
43
Explain why you would use a t-test rather than a z-test?
The population standard deviation is unknown, so we estimate it with the standard deviation of the sample. Also valid for smaller samples.
44
If you have the statistics summary generated by Python, how do you find (a) the fitted line, (b) the correlation coefficient, (c) the coefficient of determination, and (d) a t-test to assess the utility of the linear regression model?
Fitted line: Y = Intercept + slope (x-coefficient) * X
Correlation coefficient: root(R^2), with the same sign as the slope
Coefficient of determination: R^2
T-test: Take the slope (x-coefficient) and find its associated p-value. If p-value < a, we have a significant result: we reject the null hypothesis that the slope coefficient is equal to zero, meaning that there is a significant linear relationship between X and Y (e.g. between house size and price).
45
How do you calculate Sxx and Sxy of the data?
Sxx = sum of all (xi - X bar)^2
Sxy = sum of all (xi - X bar)(yi - Y bar)
46
How do you calculate B0 (intercept) and B1 (slope, e.g. the advertising coefficient)?
B1 (slope) = Sxy / Sxx
B0 (intercept) = y bar - B1 * x bar
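The Sxx, Sxy, B1 and B0 formulas can be applied by hand to a tiny data set. The five (x, y) pairs below are hypothetical, chosen so that the arithmetic is easy to follow.

```python
# Hypothetical data: five (x, y) observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sxx = sum of (xi - x_bar)^2, Sxy = sum of (xi - x_bar)(yi - y_bar)
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx            # slope
b0 = y_bar - b1 * x_bar     # intercept

print(f"Sxx = {s_xx}, Sxy = {s_xy}")
print(f"fitted line: Y = {b0:.2f} + {b1:.2f} * X")
```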
47
Which goes on the x-axis in a scatterplot: the independent variable or the dependent variable?
Independent Variable
48
Give a formula for the correlation coefficient.
r = Sxy / root(Sxx * Syy)
49
If we have a scatterplot and histogram of the residuals, what can we say about the model's assumptions?
To satisfy the model's assumptions:
Histogram: centered at 0, with a bell-shaped form showing that the residuals are approx. normal.
Scatterplot: residuals are randomly spread around zero, meaning that the assumption of constant variance is satisfied.
50
What is v? How is it calculated?
Degrees of freedom. v = n - 1 (for a one-sample t-test).
51
Explain the difference between a deterministic model and a probabilistic model.
Deterministic: The value of one variable Y is completely determined by the value of another variable X.
Probabilistic: Allows for unexplained variation or random error. Consists of a deterministic component and a random error component.
52
What is the general form of the probabilistic model?
Y = deterministic component + random error
Y is the dependent variable.
X is the explanatory variable (it appears in the deterministic component).
53
What type of variables are X and Y?
X is an independent, predictor or explanatory variable.
Y is a dependent or response variable.
54
Explain the different values of the correlation coefficient.
r near 0: no linear correlation
r close to 1: strong positive correlation
r close to -1: strong negative correlation
55
What form does the deterministic component of the probabilistic model take?
B0 + B1X
56
What assumptions are made about the random error component of the probabilistic model?
- Error follows a normal distribution.
- Has mean 0.
- Variance is sigma^2, constant for all X. (The spread either side of the line doesn't increase or decrease, so the residuals form a rectangle-shaped band.)
57
How is the test statistic calculated if I wanted to find p-value for linear regression?
t = B1 / root(MSE / Sxx)
58
Give formula for MSE.
MSE = SSE / (n-2)
59
Explain how to calculate SSE.
Take the difference between the fitted line and the observed value, square it, and add it to the result of the same calculation for every point.
I.e. SSE = sum from i = 1 to n of (observed point i - estimated point i)^2
60
What do SSE and MSE stand for?
SSE: Sum of Squared Errors
MSE: Mean Squared Error
61
What is the null and alternative hypothesis for linear regression?
H0: B1 = 0; Y does not depend linearly on X
HA: B1 != 0; There is a linear relationship between the two variables
62
What is coefficient of determination? How is it calculated?
R^2 = (correlation coefficient)^2 = (Sxy)^2 / (Sxx * Syy)
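The regression formulas from the last few cards (r, R^2, SSE, MSE and the t statistic for the slope) can be chained together on one small data set. The sketch below uses hypothetical data and plain Python; it stops at the t statistic, which would then be compared with a t table. The use of n - 2 degrees of freedom for that comparison is an assumption here, chosen to match the MSE divisor.

```python
import math

# Hypothetical data: five (x, y) observations.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b1 = s_xy / s_xx                       # slope
b0 = y_bar - b1 * x_bar                # intercept

r = s_xy / math.sqrt(s_xx * s_yy)      # correlation coefficient
r_squared = s_xy ** 2 / (s_xx * s_yy)  # coefficient of determination

# SSE: squared gaps between each observed y and its fitted value.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)

t = b1 / math.sqrt(mse / s_xx)         # test statistic for H0: B1 = 0

print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
print(f"SSE = {sse:.4f}, MSE = {mse:.4f}, t = {t:.4f}")
```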