Stats Flashcards
The purpose of experimentation?
Comparing Alternatives
Identifying the Significant Inputs (Factors) affecting an Output (Response) - separating the
vital few from the trivial many
Achieving an Optimal Process Output (Response)
Reducing Variability
Minimizing, Maximizing, or Targeting an Output
Achieve product & process robustness
Standard Deviation = ?
properties?
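s = sqrt( sum((xi - xbar)^2) / (n - 1) ), where xbar is the sample mean and n is the sample size (dividing by n - 1 gives the sample standard deviation)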
Always positive or 0
Affected by outliers
Same units as the data.
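A minimal Python sketch of the standard deviation card, assuming the n - 1 (sample) form of the formula. The data values and the name sample_sd are hypothetical, chosen only for illustration.

```python
import math
import statistics

def sample_sd(values):
    """Sample standard deviation using the n - 1 denominator."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical data set
sd = sample_sd(data)

# Adding a single extreme value inflates the standard deviation.
data_with_outlier = data + [100]     # hypothetical outlier
sd_with_outlier = sample_sd(data_with_outlier)
```

The hand-written function should agree with statistics.stdev from the standard library, which also uses the n - 1 denominator.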
Define Random Variable
A random variable is a variable whose possible values are numerical outcomes of a random phenomenon. There are two types of random variables: discrete and continuous.
Define Probability Distribution
A probability distribution is a list of the possible values of a random variable together with their probabilities.
Define Binomial Distribution
A binomial distribution is a frequency distribution of the possible number of successful outcomes in a given number of trials in each of which there is the same probability of success.
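As a rough illustration of the binomial card, the sketch below computes the probability of each possible number of successes. The values n = 10 and p = 0.5 and the name binomial_pmf are assumptions made for the example.

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each with the same success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5   # hypothetical: 10 fair coin flips
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
```

The list pmf pairs each possible count 0..n with its probability, so it is also a small example of a probability distribution for a discrete random variable.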
State and define the two types of random variables.
Discrete random variable: a random variable that can take only a countable number of values.
Continuous random variable: a random variable that takes on values within an interval, or has so many possible values that they might as well be considered continuous.
Properties of normal distribution?
- Bell curve
- Area = unity
- Symmetrical
- Middle = mean
- Inflection points, where the curve turns from concave down to concave up, are 1 SD away from the mean u
- Almost all values (about 99.7%) are within 3 SD of the mean
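The "within 3 SD" property can be checked numerically. This is a small sketch using statistics.NormalDist from the Python standard library; the helper name prob_within is hypothetical.

```python
from statistics import NormalDist

dist = NormalDist(mu=0, sigma=1)

def prob_within(k):
    """Area under the normal curve within k SD of the mean."""
    return dist.cdf(k) - dist.cdf(-k)

within_1_sd = prob_within(1)   # roughly 0.68
within_2_sd = prob_within(2)   # roughly 0.95
within_3_sd = prob_within(3)   # roughly 0.997
```

Because the result depends only on the number of SDs from the mean, the same proportions hold for any normal distribution, not just the one with mean 0 and SD 1.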
What is the standard normal distribution?
- mean = 0
- SD = 1
- The normal random variable of a standard normal distribution is called a standard score or z-score.
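A z-score is usually computed as z = (x - u) / SD, which converts a value from any normal distribution to the standard normal scale. The numbers below (a score of 85 on a scale with mean 70 and SD 10) are hypothetical.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above the mean."""
    return (x - mean) / sd

z = z_score(85, mean=70, sd=10)        # hypothetical exam score

# Proportion of the distribution that lies below this value.
proportion_below = NormalDist().cdf(z)
```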
What is a sampling distribution?
The sampling distribution of the sample mean gives all the possible values of the sample mean and quantifies how often they occur.
How to build the sampling distribution of the sample mean?
1) Take a sample of values from random variable X (population)
2) Calculate the mean of the sample
3) Repeat steps 1) and 2) over and over again
All the sample means xbar result in a new population which is denoted using random variable Xbar
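The steps above can be sketched as a simulation. The population here is a made-up, skewed (exponential) one, and the sample size of 40 and 2000 repeats are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population X: 10,000 draws from a skewed distribution.
population = [random.expovariate(1.0) for _ in range(10_000)]

def sample_means(population, n, repeats):
    """Repeat 'take a sample of size n, compute its mean' many times."""
    return [
        statistics.mean(random.sample(population, n))  # steps 1 and 2
        for _ in range(repeats)                        # step 3
    ]

means = sample_means(population, n=40, repeats=2000)

centre_of_means = statistics.mean(means)
spread_of_means = statistics.stdev(means)
```

The list means plays the role of Xbar: its centre should sit close to the population mean, and its spread should be noticeably smaller than the spread of the individual values.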
Describe the properties of the mean of the sampling distribution
- A sampling distribution represents averages that are based on samples and NOT individual values from a population
- A sampling distribution is nothing but a distribution, so it has its own shape, centre and variability
- The mean of the sampling distribution of Xbar is denoted as u_xbar and equals the population mean u
How does the standard error of a sampling distribution vary with SD and n?
As n increases, the standard error decreases
As SD increases, the standard error increases
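Both relationships follow from the usual formula for the standard error of the sample mean, SD / sqrt(n). A short sketch with hypothetical numbers:

```python
import math

def standard_error(sd, n):
    """Standard error of the sample mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)

baseline = standard_error(sd=10, n=25)     # 10 / 5
larger_n = standard_error(sd=10, n=100)    # quadrupling n halves the standard error
larger_sd = standard_error(sd=20, n=25)    # doubling SD doubles the standard error
```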
What is the shape of a sampling distribution if:
- the distribution of X is normal?
- the distribution of X is unknown or not normal?
- Xbar is normal
- Xbar can be approximated with a normal distribution for sufficiently large n, according to the Central Limit Theorem (CLT)
What does the Central Limit Theorem state?
If you have a population with mean u and standard deviation SD and take sufficiently large random samples (usually n > 30) from it, then the sample means will be approximately normally distributed.
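One way to see the theorem at work is to sample from a clearly non-normal population and check that the sample means behave like a normal variable. The sketch below assumes a Uniform(0, 1) population, n = 36, and 5000 repeats, all chosen only for illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

N = 36          # sample size, above the usual n > 30 rule of thumb
REPEATS = 5000  # number of samples drawn

# Uniform(0, 1) population: not normal, mean 0.5, SD sqrt(1/12).
pop_mean = 0.5
pop_sd = math.sqrt(1 / 12)
expected_se = pop_sd / math.sqrt(N)

sample_means = [
    statistics.mean([random.random() for _ in range(N)])
    for _ in range(REPEATS)
]

# For a normal distribution, about 68% of values lie within 1 SD of the mean.
within_one_se = sum(
    abs(m - pop_mean) <= expected_se for m in sample_means
) / REPEATS
```

If the sample means are approximately normal, within_one_se should come out near 0.68 and the spread of sample_means should be close to expected_se.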
Define confidence intervals.
A range of values so defined that there is a specified probability that the value of a parameter lies within it.
e.g.
Confidence interval = sample statistic ± margin of error
What is the margin of error affected by?
Confidence level - z*
Sample size - n
Variation in population - SD
Margin of error = z* × (SD / sqrt(n))
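The formula above can be turned into a small calculation. This sketch assumes a two-sided interval with z* taken from the standard normal distribution; the sample mean of 100, SD of 15, and n of 100 are hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(confidence, sd, n):
    """z* times SD / sqrt(n), with z* for a two-sided interval."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sd / math.sqrt(n)

sample_mean = 100   # hypothetical sample statistic
moe = margin_of_error(confidence=0.95, sd=15, n=100)

# Confidence interval = sample statistic ± margin of error
low, high = sample_mean - moe, sample_mean + moe

# A higher confidence level widens the interval; a larger sample narrows it.
moe_99 = margin_of_error(confidence=0.99, sd=15, n=100)
moe_big_n = margin_of_error(confidence=0.95, sd=15, n=400)
```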
Define confidence level.
The probability that the value of a parameter falls within a specified range of values.
Corresponds to the percentage of the time the result would be correct if numerous random samples were taken
What is a hypothesis test?
What are the two types of hypothesis?
It is a procedure that uses data from a sample to test a claim about the population.
1) Null hypothesis (Ho) - Ho is assumed true unless data and statistics demonstrate otherwise.
2) Research (or alternative) hypothesis - Ha
State and describe the two types of hypothesis
When is one chosen over the other?
1) Null hypothesis (Ho) - Ho is assumed true unless data and statistics demonstrate otherwise.
2) Research (or alternative) hypothesis (Ha) - Population parameter is not equal to, larger than, or smaller than the value stated in Ho.
Ho is rejected in favour of Ha when we have found a statistically significant result.
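A minimal sketch of one such test, assuming a two-sided one-sample z-test with a known population SD and a significance level of 0.05. The test choice, the significance level, and all the numbers are assumptions for illustration, not something these cards specify.

```python
import math
from statistics import NormalDist

def z_test_two_sided(sample_mean, mu0, sd, n):
    """Test Ho: u = mu0 against Ha: u != mu0 using a z statistic."""
    z = (sample_mean - mu0) / (sd / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: Ho says u = 100; the sample of 100 has mean 103.
z, p_value = z_test_two_sided(sample_mean=103, mu0=100, sd=15, n=100)

alpha = 0.05                  # assumed significance level
reject_h0 = p_value < alpha   # statistically significant result?
```

Here z works out to 2.0 and the p-value to about 0.046, so under these assumptions Ho would be rejected in favour of Ha.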
What does the correlation coefficient tell you about the bivariate data set?
r = -1 - perfect negative linear relationship
r approaching -1 - strong negative linear relationship
r approaching 0 - weak or no linear relationship
r approaching 1 - strong positive linear relationship
r = 1 - perfect positive linear relationship.
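For reference, a short sketch of how r (the Pearson correlation coefficient) can be computed for a bivariate data set. The function name pearson_r and the data values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired values xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4]
r_positive = pearson_r(xs, [2, 4, 6, 8])   # y rises exactly in step with x
r_negative = pearson_r(xs, [8, 6, 4, 2])   # y falls exactly in step with x
r_none = pearson_r(xs, [1, -1, -1, 1])     # symmetric pattern, no linear trend
```

The three results correspond to the r = 1, r = -1, and r near 0 cases listed above.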