Correlation And Hypothesis Testing Flashcards by Phoebe L

What type of correlation if the PMCC is close to 1?

Positive linear correlation

What type of correlation if the PMCC is close to -1?

Negative linear correlation

What does r mean (hypothesis testing)?

PMCC for a sample

What does ρ (rho, often typed as p) mean (hypothesis testing)?

PMCC for the whole population

“Test for a positive correlation”
Which tail do you use?

Positive (upper) one tailed

“Evidence for some correlation” “No evidence for some correlation”
Which tail is used?

Two tailed (halve the significance level for each tail)

Hypotheses for negative (lower) one tail

H0: ρ = 0
H1: ρ < 0
(PMCC: if the sample value r is less than the negative critical value from the table, reject H0, so there’s enough evidence of negative correlation)

“Test for a negative correlation”
Which tail is used?

Negative (lower) one tail

Hypotheses for positive (upper) one tail

H0: ρ = 0
H1: ρ > 0
(PMCC: if the sample value r is greater than the critical value from the table, reject H0, so there’s enough evidence of positive correlation)

Hypotheses for some correlation, two tail

H0: ρ = 0
H1: ρ ≠ 0
(PMCC: if r < negative critical value from the table, or r > positive critical value from the table, reject H0, so there’s enough evidence of some correlation)
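
The three decision rules for a PMCC test can be sketched in Python. This is only an illustration under stated assumptions: the function name `pmcc_test` and its parameters are hypothetical, and the critical value used below is a placeholder standing in for a value read from a PMCC table for the relevant sample size and significance level.

```python
def pmcc_test(r, critical_value, tail):
    """Return True if H0: rho = 0 is rejected.

    r              -- sample PMCC
    critical_value -- positive critical value read from a PMCC table
    tail           -- "upper" (H1: rho > 0), "lower" (H1: rho < 0)
                      or "two" (H1: rho != 0)
    """
    if tail == "upper":
        return r > critical_value
    if tail == "lower":
        return r < -critical_value
    if tail == "two":
        # Reject if r falls in either tail of the critical region.
        return r < -critical_value or r > critical_value
    raise ValueError("tail must be 'upper', 'lower' or 'two'")


# Placeholder critical value, for illustration only.
cv = 0.5494
print(pmcc_test(0.62, cv, "upper"))   # True: reject H0
print(pmcc_test(-0.62, cv, "upper"))  # False: do not reject H0
print(pmcc_test(-0.62, cv, "two"))    # True: reject H0
```

For a two-tailed test the critical value passed in would be the one looked up at half the stated significance level.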

To find the critical value

Use PMCC table

If the value is within the critical region…

It’s significant, meaning you reject H0, so there’s enough evidence to suggest there’s a negative correlation/positive correlation/some correlation/an increase/a decrease etc

Test statistic

Used to test the hypothesis. It could be the result of the experiment or a statistic calculated from the sample

Null hypothesis H0

Hypothesis you assume to be correct

Alternative hypothesis H1

Tells you about the parameter if your assumption is shown to be wrong

Hypothesis test

A hypothesis is a statement made about the value of a population parameter. A hypothesis test uses a sample to determine whether or not to reject H0

Critical value

The first value to fall inside the critical region

Critical regions

A region of the probability distribution such that, if the test statistic falls within it, you reject the null hypothesis

Acceptance region

The region in which we do not reject (i.e. we accept) the null hypothesis

“Test for an increase/improvement in…”
Which tail is used?

Upper one tail

“Test for a decrease/an over-estimate….”
Which tail is used?

Lower one tail

“Test for a change in….”
Which tail is used?

Two tailed

PMCC on calculator (from a given table)

Menu 6
2
3 .Type values
Optn
4

Critical value on calculator

Menu 7
Scroll down 1
2 (for testing whether a given variable is significant), 1 (for finding the critical region)

When to use binomial or cumulative probability?

Binomial: P(X = 4). Cumulative: P(X < 4), P(X ≤ 4) etc
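
The difference between a single binomial probability and a cumulative one can be sketched in Python. The helper names `binomial_pmf` and `binomial_cdf` and the values n = 10, p = 0.5 are assumptions chosen for illustration, not taken from the cards.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summing the individual probabilities."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


n, p = 10, 0.5  # illustrative values only
print(binomial_pmf(4, n, p))  # P(X = 4)
print(binomial_cdf(4, n, p))  # P(X <= 4)
print(binomial_cdf(3, n, p))  # P(X < 4), which equals P(X <= 3)
```

Because X only takes whole-number values, a strict inequality such as P(X < 4) is evaluated as P(X ≤ 3).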

Binomial distribution/probability

1. Menu 7 2. 1 3. 2

Cumulative probability

Use table 1. Menu 7 2. Scroll opt 1 3. 2 (testing a variable)

Comment on the suitability of the binomial distribution model

The probability is lower/higher than the expected value which suggests the model is not accurate

Suggest one improvement for the distribution model

A non uniform distribution

Requirements of a binomial distribution

1: The number of observations n is fixed. 2: Each observation is independent. 3: Each observation represents one of two outcomes ("success" or "failure"). 4: The probability of success is fixed for every observation.

Requirements of a normal distribution

1) The mean, median and mode are exactly the same. 2) The distribution is symmetric about the mean—half the values fall below the mean and half above the mean. 3) The distribution can be described by two values: the mean and the standard deviation.

Finding mean from binomial distribution

np

Finding variance from binomial distribution

np(1-p) (since p lies between 0 and 1, 1-p is never negative)
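
The formulas np for the mean and np(1-p) for the variance of a binomial distribution can be checked numerically. A minimal sketch, assuming illustrative values n = 20 and p = 0.3 that do not come from the cards:

```python
from math import comb

n, p = 20, 0.3  # illustrative values only

# Full probability distribution of X ~ B(n, p).
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Mean and variance computed directly from the definition.
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(mean, n * p)                # both are 6 (up to rounding)
print(variance, n * p * (1 - p))  # both are 4.2 (up to rounding)
```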

Later it was discovered that the local scout group visited the supermarket that afternoon to buy food for their camping trip. (f) Comment on the validity of the model used to obtain the answer to part (e), giving a reason for your answer

The model assumes the 20 customers are independent. The members of the scout group may invalidate this, so the binomial distribution would not be valid

When testing a value against a hypothesis to see if there’s change/improvement.

- decrease: P(X ≤ 8) - change/increase: P(X ≥ 8)

P value for two tailed test

Multiply the one-tail probability by 2
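
A sketch of the tail probabilities for a binomial test, and of doubling the smaller tail to get a two-tailed p-value. The values n = 20, p = 0.5 under H0 and the observed count of 8 are assumptions chosen for illustration.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n, p0, observed = 20, 0.5, 8  # illustrative values only

lower_tail = binomial_cdf(observed, n, p0)          # P(X <= 8): test for a decrease
upper_tail = 1 - binomial_cdf(observed - 1, n, p0)  # P(X >= 8): test for an increase

# Two-tailed p-value: double the smaller tail, capped at 1.
two_tailed = min(1.0, 2 * min(lower_tail, upper_tail))

print(round(lower_tail, 4), round(upper_tail, 4), round(two_tailed, 4))
```

Each probability is then compared with the significance level: a value below it falls in the critical region.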

PMCC measures…

how strong the correlation between two variables is.
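
As a rough sketch of what the PMCC measures, the following Python computes r from paired data using the standard formula r = Sxy / √(Sxx × Syy). The function name `pmcc` and the data sets are made up for illustration.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]
print(pmcc(xs, [2, 4, 6, 8, 10]))  # points on an increasing straight line
print(pmcc(xs, [10, 8, 6, 4, 2]))  # points on a decreasing straight line
print(pmcc(xs, [2, 1, 4, 3, 5]))   # scattered but generally increasing
```

Values near 1 or -1 indicate strong linear correlation; values near 0 indicate little or no linear correlation.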

Z for normal distribution

Z = (X - μ)/σ

X̄ ~ N

(μ, (σ/√n)^2)
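
The distribution of a sample mean, X̄ ~ N(μ, (σ/√n)^2), can be sketched with Python's `statistics.NormalDist`. The values μ = 50, σ = 4 and n = 16 are assumptions chosen so that the standard deviation of the sample mean comes out as 1.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 4, 16  # illustrative values only

# Standard deviation of the sample mean is sigma / sqrt(n).
standard_error = sigma / sqrt(n)
sample_mean_dist = NormalDist(mu, standard_error)

# Probability that the mean of 16 observations is below 51.
p = sample_mean_dist.cdf(51)
print(standard_error, round(p, 4))
```

The sample mean is less spread out than a single observation, which is why the same cut-off of 51 gives a larger probability here than it would for X itself.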

Normal distributions

X ~ N(μ, σ^2)

Normal distribution significant figures

- table = 4 d.p. - calculator = 3 d.p. (State whether you’re using a table or calculator)

Standard normal distribution

Z ~ N(0, 1^2), Z = (X - μ)/σ

Normal to standard

X ~ N(50, 4^2): P(X < 53) = P(Z < (53 - 50)/4) = P(Z < 0.75) = Φ(0.75)
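
The standardisation on this card, with mean 50 and standard deviation 4, can be checked in Python. This is a sketch using the standard library's `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 50, 4
x = 53

# Standardise: Z = (X - mu) / sigma.
z = (x - mu) / sigma

# P(Z < z) from the standard normal, and P(X < x) from the original distribution.
p_from_z = NormalDist(0, 1).cdf(z)
p_direct = NormalDist(mu, sigma).cdf(x)

print(z, round(p_from_z, 4), round(p_direct, 4))
```

Both routes give the same probability, which is the point of converting to the standard normal distribution.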

The Central Limit Theorem

For large samples, sample means (e.g. mean full time and mean part time) can be modelled as approximately Normal

State an assumption you’ve used (when using variance)

Variance of sample = variance of pop.

Test whether or not there is evidence that the PMCC is positive

Positive upper tail

Two conditions under which the normal distribution may be used as an approximation to the binomial distribution

Number of trials is large and probability of success is close to 0.5
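
A sketch comparing an exact binomial probability with its normal approximation N(np, np(1-p)). The values n = 100, p = 0.5 and the cut-off 55 are assumptions chosen for illustration, and the continuity correction (using 55.5 in place of 55) is standard practice assumed here rather than something stated on the card.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # illustrative: n is large and p is close to 0.5
k = 55

# Exact P(X <= 55) for X ~ B(100, 0.5).
exact = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Approximating distribution: mean np, standard deviation sqrt(np(1-p)).
approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))

# Continuity correction: P(X <= 55) becomes P(Y < 55.5).
approx = approx_dist.cdf(k + 0.5)

print(round(exact, 4), round(approx, 4))
```

With these values the two probabilities agree closely, which is what the two conditions are meant to ensure.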

If the difference in means is large compared with the standard deviations

The standard deviations are small compared with the difference in mean temperatures, making it more likely that the difference in means is significant

Explain why it is reasonable to model the daily mean pressure for Beijing, during May to August using a normal distribution.

It’s bell shaped

Give a reason why we cannot say there is no chance of a hurricane in Beijing during May to August.

The tails of a Normal distribution are infinite.

When to use upper and lower bounds for distribution values?

Only when using the normal approximation N(np, np(1-p))

How to show that the distribution of T is not a discrete uniform distribution?

Show that the probabilities of the outcomes aren’t equal

y = ax^n

log y = log a + n log x

y = ab^x

log y = log a + x log b
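
Both log rules turn a curve into a straight line, so the constants can be read off from the gradient and intercept. A minimal sketch in Python, assuming made-up constants a = 3, n = 2 and b = 2 and using base-10 logs:

```python
from math import log10

xs = [1, 2, 4, 8]

# Case 1: y = a * x**n, so log y against log x is a line with gradient n.
a, n = 3.0, 2.0  # hypothetical constants
ys = [a * x**n for x in xs]
gradient = (log10(ys[-1]) - log10(ys[0])) / (log10(xs[-1]) - log10(xs[0]))
intercept = log10(ys[0]) - gradient * log10(xs[0])
recovered_n = gradient
recovered_a = 10**intercept

# Case 2: y = a * b**x, so log y against x is a line with gradient log b.
b = 2.0  # hypothetical constant
ys_exp = [a * b**x for x in xs]
gradient_exp = (log10(ys_exp[-1]) - log10(ys_exp[0])) / (xs[-1] - xs[0])
recovered_b = 10**gradient_exp

print(recovered_n, recovered_a, recovered_b)
```

In the first case the horizontal axis is log x; in the second it is x itself.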

State, giving a reason, whether or not the correlation coefficient is consistent with Tess’s suggestion

Since r is close to -1 it is consistent (i.e. there is strong negative correlation)

The linear regression equation is w = 10 755 - 171t. Give an interpretation of the gradient of this regression equation

As t increases by 1 unit, w decreases by 171 units

The variables have a negative correlation. Given that the humidity was high on a particular day, what would you expect the number of hours of sunshine to be?

Lower than average

Explain why this normal distribution may not be a good model for T?

The model suggests a non-negligible probability of T values < 0, which is impossible

Give an interpretation of the correlation

Describe the correlation in context, using the variables

When to use N(np, np(1-p)), i.e. mean np and standard deviation √(np(1-p))

When the question asks for a suitable approximation or a normal approximation

What data should be used when asked about 'typical' or 'average'?

Mean and median (location of the data)

What data to use when asked about how 'spread out' the data is?

- Calculate standard deviation, range & interquartile range - describe variability of the data

Describe the shape of the data

- how many peaks or modes - symmetric or asymmetric - skew (is there a long tail to the left or right)

What type of distribution do we have for a sample of data?

Frequency distribution

What type of distribution do we have for the entire population?

Probability distribution

Relationship between mean and median regarding symmetry

If it's symmetric, we can expect the mean and median to be about the same

What is unimodal?

One peak

Examples of variables that are positively skewed

Waiting times Household income

Examples of variables that are negatively skewed

Satisfaction measures Retirement age

Examples of variables that are symmetric

Height Weight

When is the Poisson distribution used?

- to describe rare events & discrete occurrences over an interval of time - independent (in non-overlapping intervals) - the range is from 0 onwards - constant expected no. of occurrences

Examples of when the Poisson distribution would be used

- no. of random arrivals per some time interval (customer arrivals at a store on weekday mornings) - queuing theory - rare blood disease
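
A sketch of Poisson probabilities using the standard formula P(X = k) = e^(-λ) λ^k / k!, which is not stated on the cards. The function name `poisson_pmf` and the rate of 3 arrivals per interval are assumptions chosen for illustration.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) when X counts events occurring at a constant mean rate lam."""
    return exp(-lam) * lam**k / factorial(k)


lam = 3  # hypothetical mean number of arrivals per interval

# Probabilities for 0, 1, 2, ... arrivals; the range starts at 0 and has no upper limit.
probs = [poisson_pmf(k, lam) for k in range(50)]

print(round(probs[0], 4))   # probability of no arrivals in the interval
print(round(sum(probs), 6))  # the probabilities sum to 1 (up to rounding)
```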

Correlation And Hypothesis Testing Flashcards

(72 cards)