Exam Revision Flashcards

1
Q

Statistics

A

Statistics is the branch of mathematics that examines ways to process and analyse data. Statistics provides procedures to collect and transform data in ways that are useful to business decision makers. To understand anything about statistics, you first need to understand the meaning of a variable.

2
Q

4 fundamental terms of statistics

A

Population
Sample
Parameter
Statistic

3
Q

Population

A

A population consists of all the members of a group about which you want to
draw a conclusion.

4
Q

Sample

A

A sample is the portion of the population selected for analysis

5
Q

Parameter

A

A parameter is a numerical measure that describes a characteristic of a
population (measures used to describe a population). Greek letters (e.g. μ, σ)
refer to parameters.

6
Q

Statistic

A

A statistic is a numerical measure that describes a characteristic of a sample
(measures calculated from sample data). Roman letters (e.g. S) refer to
statistics.

7
Q

2 types of statistics

A

Descriptive statistics

Inferential statistics

8
Q

Descriptive statistics

A

Collecting, summarising and presenting data

9
Q

Inferential statistics

A

Drawing conclusions about a population based on sample data/results (i.e. estimating a parameter based on a statistic), for example through hypothesis testing.

10
Q

2 types of data

A

Categorical (defined categories)

Numerical (quantitative)

11
Q

2 types of numerical variables

A

Discrete (counted items)

Continuous (measured characteristics)

12
Q

4 levels of Measurement and Measurement Scales from highest to lowest

A

Ratio data
Interval data
Ordinal data
Nominal data

13
Q

Ratio data

A

Differences between measurements are meaningful and a true zero
exists

14
Q

Interval data

A

Differences between measurements are meaningful but no true zero
exists (zero is arbitrary, so values can be negative, e.g. temperature in °C)

15
Q

Ordinal data

A

Ordered categories (rankings, order or scaling)

16
Q

Nominal data

A

Categories (no ordering or direction)

17
Q

4 measures used to describe data

A

Central tendency
Quartiles
Variation
Shape

18
Q

4 measures of central tendency

A

Arithmetic mean
Median
Mode
Geometric mean

19
Q

5 measures of variation

A
Range 
Interquartile range 
Variance
Standard deviation 
Coefficient of variation
20
Q

1 measure of shape

A

Skewness

21
Q

Arithmetic mean

A

Arithmetic mean is summing up the observations and dividing by the number of observations.

22
Q

Median and mean: sensitivity to extreme values

A

The median is not sensitive to extreme values and the mean is sensitive to extreme values.
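A minimal sketch of this difference, using made-up salary figures (the data are hypothetical and only there to illustrate the point):

```python
from statistics import mean, median

# Hypothetical data: five salaries in $000, then the same set plus one extreme value.
salaries = [48, 50, 52, 54, 56]
with_outlier = salaries + [500]

# Without the extreme value the two measures agree.
print(mean(salaries), median(salaries))

# With it, the mean is pulled far upward while the median barely moves.
print(mean(with_outlier), median(with_outlier))
```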

23
Q

Sigma

A

Capital sigma (Σ) is shorthand for adding up the values. (Lower-case sigma, σ, denotes the population standard deviation.)

24
Q

Median

A

In an ordered array, the median is the middle number (50% above and 50% below). Its main advantage over the arithmetic mean is that it is not affected by extreme values.

25
Q

Mode

A

A measure of central tendency. Value that occurs most often (the most frequent). Not affected by extreme values. Never use the mode by itself, always use in conjunction with median or mean. Unlike mean and median, there may be no unique (single) mode for a given data set. Used for either numerical or categorical (nominal) data.

26
Q

Quartiles

A

Quartiles split the ranked data into four segments, with an equal number of values per segment. The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger. The second quartile, Q2, is the same as the median (50% are smaller, 50% are larger). Only 25% of the observations are greater than the third quartile, Q3

27
Q

Measures of variation

A

Measures of variation give information on the spread or variability of the data values

28
Q

Interquartile range

A

Like the median, Q1 and Q3, the IQR is a resistant summary measure (resistant to the presence of extreme values). It avoids outlier problems because the highest- and lowest-valued observations are left out of the calculation. IQR = 3rd quartile - 1st quartile, i.e. IQR = Q3 - Q1.
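A short sketch of the quartile and IQR calculation, assuming Python's `statistics.quantiles` with its default "exclusive" method. Quartile conventions differ between textbooks and software, so other tools may give slightly different values; the data set is hypothetical.

```python
from statistics import quantiles

# Hypothetical ordered array of nine observations.
data = [11, 12, 13, 16, 16, 17, 18, 21, 22]

# n=4 splits the ranked data into four segments and returns Q1, Q2, Q3.
q1, q2, q3 = quantiles(data, n=4)

# The IQR only uses Q1 and Q3, so the extreme observations do not affect it.
iqr = q3 - q1
print(q1, q2, q3, iqr)
```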

29
Q

Sample variance

A

Measures average scatter around the mean, in squared units. The deviations from the mean are squared because some are negative and some are positive, so they would otherwise cancel out. The sample variance is the sum of the squared differences between each value and the sample mean, divided by n - 1.

30
Q

Sample standard deviation

A

Most commonly used measure of variation. Shows variation about the mean. Has the same units as the original data. It can be considered a measure of uncertainty.
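As a rough illustration, the sample variance and standard deviation can be computed by hand and checked against the standard library. The data are hypothetical, and the n - 1 divisor is the usual sample convention.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Hypothetical sample of eight observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]

xbar = mean(data)

# Squaring stops positive and negative deviations from cancelling out.
squared_devs = [(x - xbar) ** 2 for x in data]

# Sample variance: sum of squared deviations divided by n - 1 (squared units).
s2 = sum(squared_devs) / (len(data) - 1)

# Sample standard deviation: square root of the variance (original units).
s = sqrt(s2)

print(s2, s)
print(variance(data), stdev(data))  # library versions for comparison
```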

31
Q

Coefficient of variation

A

Measures relative variation i.e. shows variation relative to mean. Can be used to compare two or more sets of data measured in different units. Always expressed as percentage (%)
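A small sketch of the calculation, CV = (S / mean) × 100%, applied to two hypothetical data sets measured in different units:

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Sample standard deviation expressed as a percentage of the mean."""
    return stdev(data) / mean(data) * 100

# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 60, 70, 80, 85]

# The CV is unit-free, so the two results can be compared directly.
print(round(coefficient_of_variation(heights_cm), 1))
print(round(coefficient_of_variation(weights_kg), 1))
```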

32
Q

The Z score

A

The difference between a given observation and the mean, divided by the standard deviation. A Z score of 2.0 means that a value is 2.0 standard deviations from the mean. A Z score above 3.0 or below -3.0 is considered an outlier
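One possible sketch of the Z score and the ±3.0 outlier rule, using the sample mean and sample standard deviation of a hypothetical data set:

```python
from statistics import mean, stdev

def z_scores(data):
    """Z = (X - mean) / standard deviation, for every observation."""
    xbar, s = mean(data), stdev(data)
    return [(x - xbar) / s for x in data]

def outliers(data, cutoff=3.0):
    """Observations whose Z score is above +cutoff or below -cutoff."""
    return [x for x, z in zip(data, z_scores(data)) if abs(z) > cutoff]

# Hypothetical data: 19 ordinary values and one extreme value.
data = [9, 10, 11] * 6 + [10, 40]
print(outliers(data))
```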

33
Q

The shape of a distribution

A

Describes how data are distributed. A distribution's shape is either symmetric or skewed.

34
Q

Left skewed and right skewed

A

When the data is left (negatively) skewed, the distance between Q1 and Q2 is greater than the distance between Q2 and Q3. The reverse applies for right (positively) skewed data. If the data is symmetric, the distances are the same.

35
Q

What does a box and whisker plot show

A

A box-and-whisker plot shows location, spread and shape.

36
Q

Population variance

A

the average of the squared deviations of values from the mean

37
Q

Population standard deviation

A

Shows variation about the mean. Is the square root of the population variance. Has the same units as the original data.

38
Q

Covariance

A

The sample covariance measures the linear relationship between two numerical variables. Its sign shows the direction of the relationship, but because it is affected by units of measurement, its size cannot be used to judge the relative strength of the relationship. No causal effect is implied.

39
Q

Correlation

A

Measures the relative strength of the linear relationship between two variables

40
Q

Features of correlation coefficient

A

Also called Standardised Covariance i.e. invariant to units of measure. Ranges between –1 and 1. The closer to –1, the stronger the negative linear relationship
The closer to 1, the stronger the positive linear relationship. The closer to 0, the weaker the linear relationship
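A minimal sketch contrasting covariance and correlation. The helper names and the data are hypothetical; the point is that changing the units of X changes the covariance but leaves the correlation coefficient unchanged.

```python
from statistics import mean, stdev

def sample_covariance(xs, ys):
    """Sum of cross-products of deviations, divided by n - 1."""
    xbar, ybar = mean(xs), mean(ys)
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Covariance standardised by both sample standard deviations."""
    return sample_covariance(xs, ys) / (stdev(xs) * stdev(ys))

# Hypothetical paired observations; xs in metres, then the same values in centimetres.
xs_m = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
xs_cm = [x * 100 for x in xs_m]

print(sample_covariance(xs_m, ys), sample_covariance(xs_cm, ys))  # differs by a factor of 100
print(correlation(xs_m, ys), correlation(xs_cm, ys))              # the same value
```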

41
Q

5 number summary

A

Numerical data summarised by quartiles: X_smallest, Q1, Median, Q3, X_largest

42
Q

3 approaches to assessing probability

A

a priori
Empirical
Subjective

43
Q

a priori

A

Classical probability. Based on prior knowledge

44
Q

Empirical

A

Empirical probability. Based on observed data

45
Q

Subjective

A

Subjective probability. Based on individual judgment or opinion about the probability of occurrence

46
Q

Probability

A

a numerical value that represents the chance, likelihood, possibility that an event will occur (always between 0 and 1)

47
Q

Discrete probability

A

A discrete random variable can only take certain (countable) values. A discrete probability distribution gives the probability of each of those values.

48
Q

4 essential properties of the binomial distribution

A

A fixed number of observations

Two mutually exclusive and collectively exhaustive events

Constant probability for each observation

Observations are independent
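When these four properties hold, the probability of x events of interest in n observations follows the binomial formula P(X = x) = C(n, x) p^x (1 - p)^(n - x). A short sketch, with `binomial_pmf` as a hypothetical helper name:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent observations, each with constant probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Example: probability of exactly 2 events of interest in 4 observations when p = 0.5.
print(binomial_pmf(2, 4, 0.5))

# The probabilities over all possible values of x add up to 1.
print(sum(binomial_pmf(x, 10, 0.3) for x in range(11)))
```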

49
Q

Index numbers

A

Index numbers allow relative comparisons over time. Index
numbers are reported relative to a Base Period Index. Base period index = 100 by
definition. Used for an individual item or measurement.
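A sketch of a simple price index for a single item, assuming the usual form (price in period t / price in base period) × 100. The prices and the function name are hypothetical.

```python
def simple_price_index(prices, base_period):
    """Index each period's price relative to the base period (base = 100)."""
    base = prices[base_period]
    return {period: price / base * 100 for period, price in prices.items()}

# Hypothetical prices of one item over three years.
prices = {2016: 2.00, 2017: 2.10, 2018: 2.30}
index = simple_price_index(prices, base_period=2016)
print(index)
```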

50
Q

Which price index to use

A

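The Paasche index is more accurate because it uses current-period quantities, but it is more difficult to calculate because those quantities are often hard to obtain.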

51
Q

Characteristics of the normal distribution

A

Bell-shaped

Symmetrical

Mean, median and mode are equal

Central location is determined by the mean

Spread is determined by the standard deviation (the population standard deviation, σ)

The random variable x has an infinite theoretical range

52
Q

What is the height of the curve a measure of

A

Probability density (probabilities themselves are areas under the curve, not heights)

53
Q

What must the area under the curve be

A

1

54
Q

Calculate descriptive numerical measures to determine normality

A

Do the mean and median have similar values? (Remember there may be no unique mode or there may be multiple modes.)
Is the interquartile range approximately 1.33 times the standard deviation?
Is the range approximately 6 times the standard deviation?

55
Q

Calculate standard deviation to determine normality

A

Do approximately 2/3 of the observations lie within the mean ± 1 standard deviation?
Do approximately 80% of the observations lie within the mean ± 1.28 standard deviations?
Do approximately 95% of the observations lie within the mean ± 2 standard deviations?
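These benchmarks can be checked against the theoretical normal distribution. The sketch below uses Python's `statistics.NormalDist`; the helper name `share_within` is hypothetical. The theoretical IQR comes out at roughly 1.35 standard deviations, close to the 1.33 rule of thumb used on the previous card.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Proportion of a normal population within mean ± k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 1.28, 2):
    print(k, round(share_within(k), 4))

# Interquartile range of a normal distribution, measured in standard deviations.
iqr_in_sd = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)
print(round(iqr_in_sd, 2))
```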

56
Q

Continuous probability density function

A

Mathematical expression that defines the distribution of the values for a continuous random variable.

57
Q

Sampling distribution

A

A sampling distribution is a distribution of all of the possible values of a statistic for a given size sample selected from a population.

58
Q

Standard error of the mean

A

Different samples of the same size from the same population will yield different sample means.
A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean. Note that the standard error of the mean decreases as the sample size increases.
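A minimal sketch of the standard error of the mean, σ/√n, with a hypothetical population standard deviation of 10:

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean: population standard deviation over root n."""
    return sigma / sqrt(n)

# Quadrupling the sample size halves the standard error.
print(standard_error(10, 25))
print(standard_error(10, 100))
```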

59
Q

If the population is not normal

A

We can apply the Central Limit Theorem, which states that regardless of the shape of the population distribution, as long as the sample size is large enough (generally n ≥ 30) the sampling distribution of X̄ will be approximately normally distributed with mean μ and standard deviation (standard error) σ/√n.

60
Q

Sampling Distribution of the Proportion

A

Selecting all possible samples of a certain size, the distribution of all possible sample proportions is the sampling distribution of the proportion.

61
Q

Simple random sampling

A

Every individual or item from the frame (N) has an equal chance of being selected (1/N).

Selection may be with replacement or without replacement.

Samples can be obtained from a table of random numbers or computer random number generators.

Simple to use but may not be a good representation of the population’s underlying characteristics.

62
Q

Systematic sampling

A

Divide frame of N individuals into n groups of k individuals: k = N/n.

Randomly select one individual from the 1st group.

Select every kth individual thereafter.

Like simple random sampling, simple to use but may not be a good representation of the population’s underlying characteristics.
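The steps above can be sketched as follows. The function name and the frame of 100 numbered items are hypothetical, and k is rounded down when N is not an exact multiple of n.

```python
import random

def systematic_sample(frame, n, rng=random):
    """Pick a random start in the first group, then every kth item after it."""
    k = len(frame) // n        # group size k = N / n (rounded down)
    start = rng.randrange(k)   # random position within the first group
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame of N = 100 items, sample of n = 10, so k = 10.
frame = list(range(100))
sample = systematic_sample(frame, 10, random.Random(1))
print(sample)
```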

63
Q

Stratified sampling

A

Divide population into two or more subgroups (called strata) according to some common characteristic.

A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes – called proportionate stratified sampling.

Samples from subgroups are combined into one.

64
Q

Stratified sampling pros

A

More efficient than simple random sampling or systematic sampling because of assured representation of items across entire population.
Homogeneity of items within each stratum provides greater precision in the estimates of underlying population parameters.

65
Q

Cluster samples

A

Population is divided into several ‘clusters’, each representative of the population e.g. postcode areas, electorates etc.

A simple random sample of clusters is selected:
All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique.

66
Q

Cluster sampling pros and cons

A

Pro: more cost effective than simple random sampling, especially if the population is geographically widespread.
Con: often requires a larger sample size than simple random sampling or stratified sampling for the same level of precision.

67
Q

Survey errors

A

Coverage error – appropriate or adequate frame?
Non-response error – results in non-response bias.
Measurement error – ambiguous wording, halo effect or respondent error.
Sampling error – always exists and is the difference between sample statistic and population parameter.

68
Q

Point estimate

A

A point estimate is the value of a single sample statistic.

69
Q

Confidence interval

A

A confidence interval provides a range of values constructed around the point estimate.

70
Q

Confidence interval estimation

A

An interval gives a range of values: Takes into consideration variation in sample statistics from sample to sample. Based on observations from 1 sample.
Gives information about closeness to unknown population parameters.
Stated in terms of level of confidence. Can never be 100% confident.

71
Q

A relative frequency interpretation

A

In the long run, 90%, 95% or 99% of all the confidence intervals that can be constructed (in repeated samples) will contain the unknown true parameter.

72
Q

Confidence Interval for μ (σ Known) assumptions

A

Assumptions:
Population standard deviation σ is known
Population is normally distributed
If population is not normal, use Central Limit Theorem.

73
Q

Will the true average always be in the middle of the confidence interval

A

Not necessarily. The interval is a good but not perfect measure.

74
Q

Confidence interval for μ (σ Unknown)

A

If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S.
This introduces extra uncertainty, since S is variable from sample to sample.

So we use the Student t distribution instead of the normal distribution:
The t value depends on the degrees of freedom, which equal the sample size minus 1, i.e. d.f. = n - 1.

The d.f. is the number of observations that are free to vary after the sample mean has been calculated.

75
Q

Degrees of freedom

A

Number of observations that are free to vary after the sample mean has been calculated

76
Q

Confidence interval example interpretation

A

We are 95% confident that the true proportion of left-handers in the population is between 0.1651 and 0.3349, i.e.:

Although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed from repeated samples of size 100 in this manner will contain the true proportion.
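A sketch of how an interval like this could be produced. The card does not state the sample proportion; a value of 0.25 (25 left-handers out of 100) is assumed here because it is the midpoint of the stated interval. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Confidence interval for a proportion: p ± Z * sqrt(p(1 - p) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Assumed inputs: sample proportion 0.25 from a sample of size 100.
low, high = proportion_ci(0.25, 100)
print(round(low, 4), round(high, 4))
```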

77
Q

Sampling error

A

The required sample size can be found to reach a desired margin of error (e) with a specified level of confidence (1 - alpha).

The margin of error is also called a sampling error:
The amount of imprecision in the estimate of the population parameter.
The amount added and subtracted to the point estimate to form the confidence interval.
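A rough sketch of the sample size calculation for estimating a mean, assuming the standard formula n = (Zσ/e)² with σ known or estimated. The inputs (σ = 45, e = 5, 90% confidence) are hypothetical, and the result is always rounded up to the next whole observation.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence=0.95):
    """Smallest n giving a margin of error of at most e at the chosen confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)  # always round up

print(sample_size_for_mean(sigma=45, e=5, confidence=0.90))
```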

78
Q

Rule for rounding confidence intervals

A

Always round up (sideways)

79
Q

Hypothesis

A

A hypothesis is a statement (assumption) about a population parameter

80
Q

The Null Hypothesis, H0

A

States the belief or assumption in the current situation (status quo)

Begin with the assumption that the null hypothesis is true
(similar to the notion of innocent until proven guilty)

Refers to the status quo

Always contains the ‘=’, ‘≤’ or ‘≥’ sign

May or may not be rejected

Is always about a population parameter; e.g. μ, not about a sample statistic

81
Q

The Alternative Hypothesis, H1

A

Is the opposite of the null hypothesis
e.g. The average number of TV sets in Australian
homes is not equal to 3 ( H1: μ ≠ 3 )

Challenges the status quo

Can only contain the ‘<’, ‘>’ or ‘≠’ sign

May or may not be proven

Is generally the claim or hypothesis that the researcher is trying to prove

82
Q

Errors in making decisions (Hypothesis testing)

A

Type I error
Reject a true null hypothesis
Considered a serious type of error

Type II error
Fail to reject a false null hypothesis

83
Q

The probability of errors

A

The probability of Type I error is α
Called level of significance of the test; e.g. 0.01, 0.05, 0.10
Set by the researcher in advance

The probability of Type II error is β

84
Q

p-value approach to testing

A

p-value: Probability of obtaining a test statistic equal to or more extreme
( ≤ or ≥ ) than the observed sample value, given H0 is true

Also called observed level of significance
Smallest value of α for which H0 can be rejected
Obtain the p-value from Table E.2 or computer
If p-value < α, reject H0
If p-value ≥ α, do not reject H0
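A minimal sketch of the p-value approach for a Z test, using the standard normal distribution in place of a printed table. The function names and the test statistic of -2.6 are hypothetical.

```python
from statistics import NormalDist

def p_value_z(z_stat, tail="two"):
    """p-value of a Z test statistic for a lower-, upper- or two-tail test."""
    std_normal = NormalDist()
    if tail == "lower":
        return std_normal.cdf(z_stat)
    if tail == "upper":
        return 1 - std_normal.cdf(z_stat)
    return 2 * (1 - std_normal.cdf(abs(z_stat)))

def decide(p_value, alpha=0.05):
    """Decision rule: reject H0 only when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"

# Hypothetical lower-tail test with an observed Z statistic of -2.6.
p = p_value_z(-2.6, tail="lower")
print(round(p, 4), decide(p, alpha=0.05))
```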

85
Q

Regression analysis

A

Regression analysis is used to:
predict the value of a dependent variable (Y) based on the value of at least one independent variable (X)
explain the impact of changes in an independent variable on the dependent variable

86
Q

Dependent variable (y)

A

Dependent variable (Y): the variable we wish to predict or explain (response variable)

87
Q

Independent variable (x)

A

Independent variable (X): the variable used to explain the dependent variable (explanatory variable)

88
Q

Simple linear regression

A

Only one independent variable, X
Relationship between X and Y is described by a linear function
Changes in Y are assumed to be caused by changes in X

89
Q

b0 and b1

A

b0 and b1 are obtained by finding the values of b0 and b1 that minimise the sum of the squared differences between actual values (Y) and predicted values (Ŷ)

90
Q

b0

A

b0 is the estimated average value of Y when the value of X is zero

91
Q

b1

A

b1 is the estimated change in the average value of Y as a result of a one-unit change in X
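A sketch of the least-squares calculation for simple linear regression, assuming the usual closed-form solution b1 = SSXY / SSX and b0 = Ȳ - b1 X̄. The function name and data are hypothetical.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (b0, b1) minimising the sum of squared differences between Y and predicted Y."""
    xbar, ybar = mean(xs), mean(ys)
    ssxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    ssx = sum((x - xbar) ** 2 for x in xs)
    b1 = ssxy / ssx          # estimated change in average Y per one-unit change in X
    b0 = ybar - b1 * xbar    # estimated average Y when X is zero
    return b0, b1

# Hypothetical data lying exactly on the line Y = 3 + 2X.
b0, b1 = least_squares([1, 2, 3, 4], [5, 7, 9, 11])
print(b0, b1)
```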

92
Q

Coefficient of Determination, r^2

A

The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable
The coefficient of determination is also called r-squared and is denoted as r^2
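As an illustration, r-squared can be computed as 1 - SSE/SST, where SSE is the unexplained (residual) variation and SST is the total variation in Y. The function name and data below are hypothetical.

```python
from statistics import mean

def r_squared(xs, ys):
    """Share of the total variation in Y explained by the fitted line."""
    xbar, ybar = mean(xs), mean(ys)
    b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
          / sum((x - xbar) ** 2 for x in xs))
    b0 = ybar - b1 * xbar
    predicted = [b0 + b1 * x for x in xs]
    sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # sum of squared residuals
    sst = sum((y - ybar) ** 2 for y in ys)                  # total variation in Y
    return 1 - sse / sst

# Hypothetical data.
r2 = r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(round(r2, 4))
```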

93
Q

ASSUMPTIONS OF REGRESSION

A

Linearity of the relationship

Independence of error values

Normality of error values

Constant variance of the errors (equal variance)

Check these assumptions by examining residuals

94
Q

residual for observation

A

The residual for observation i, ei, is the difference between its observed and predicted value

95
Q

Idea of the multiple regression model

A

Examine the linear relationship between 1 dependent variable (Y) and 2 or more independent variables (Xi).

96
Q

Why we need Adjusted r^2

A

r^2 never decreases when a new X variable is added to the model.
This can be a disadvantage when comparing models.

What is the net effect of adding a new variable?
We lose a degree of freedom when a new X variable is added.
Did the new X variable add enough explanatory power to offset the loss of one degree of freedom?

97
Q

Adjusted r^2

A

Shows the proportion of variation in Y explained by all X variables adjusted for the number of X variables used.

Penalises excessive use of unimportant independent variables.
Smaller than r^2
Useful in comparing among models.
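A sketch of the adjustment, assuming the standard formula adjusted r^2 = 1 - (1 - r^2)(n - 1)/(n - k - 1), where n is the sample size and k the number of X variables. The inputs are hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Adjust r^2 for the number of independent variables k and sample size n."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical model: r^2 = 0.6 from n = 5 observations.
print(round(adjusted_r2(0.6, n=5, k=1), 4))

# With the same r^2, an extra X variable lowers the adjusted value.
print(round(adjusted_r2(0.6, n=5, k=2), 4))
```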

98
Q

F Test for Overall Significance of the Model:

A

Shows if there is a linear relationship between all of the X variables considered together and Y.

99
Q

multiple regression assumptions

A

The errors are normally distributed.

Errors have a constant variance.

The model errors are independent.

100
Q

Using dummy variables

A

A dummy variable is a categorical explanatory variable with two levels:
yes or no, on or off, male or female
coded as 0 or 1

Regression intercepts are different if the variable is significant.

Assumes equal slopes for other variables.

If more than two levels, the number of dummy variables needed is number of levels minus 1.

101
Q

Time-series data and plot

A

Numerical data obtained at regular time intervals.
The time intervals can be annually, quarterly, daily, hourly etc.
A time-series plot is a two-dimensional plot of time series data.
The vertical axis measures the variable of interest.
The horizontal axis corresponds to the time periods.

102
Q

Classical Multiplicative Time-series Model Components

A

Trend component
Seasonal component
Cyclical component
Irregular component

103
Q

Trend component

A

Long-run increase or decrease over time (overall upward or downward movement).
Data taken over a long period of time.
Trend can be upward or downward.
Trend can be linear or non-linear.

104
Q

Seasonal component

A

Short-term regular wave-like patterns.
Observed within 1 year.
Often monthly or quarterly.

105
Q

Cyclical component

A

Long-term wave-like patterns.
Usually occur every 2-10 years.
Often measured peak to peak or trough to trough.

106
Q

Irregular component

A

Unpredictable, random, ‘residual’ fluctuations.

Due to random variations of:
Nature.
Accidents or unusual events.

‘Noise’ in the time series.

Usually short duration and non-repeating.

107
Q

Smoothing the Annual Time Series – Moving Averages

A

A series of arithmetic means over time.

Calculate moving averages to get an overall impression of the pattern of movement over time.

Moving averages can be used for smoothing: averages of consecutive time-series values for a chosen period of length (L).

Result dependent upon choice of L (length of period for computing means).
Examples:
For a 5 year moving average, L = 5.
For a 7 year moving average, L = 7 etc.
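A short sketch of an L-period moving average. The annual values are hypothetical, and each average would normally be plotted against the middle year of its window.

```python
def moving_average(values, L):
    """Arithmetic means of each run of L consecutive time-series values."""
    return [sum(values[i:i + L]) / L for i in range(len(values) - L + 1)]

# Hypothetical annual series of seven values, smoothed with L = 5.
annual = [4, 6, 5, 3, 7, 8, 9]
print(moving_average(annual, 5))
```

A larger L gives a smoother series but leaves fewer averages, since no value can be computed for the first and last (L - 1)/2 years.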

108
Q

PHOTOS 1-8

A

Frequency distribution, histogram and graphing

109
Q

PHOTO 9

A

CV

110
Q

PHOTO 10

A

SKEWNESS

111
Q

PHOTOS 11-12

A

EMPIRICAL RULE

112
Q

PHOTOS 13-14

A

BOX AND WHISKER

113
Q

PHOTOS 15-18

A

BAYES THEOREM

114
Q

PHOTOS 19-22

A

INVESTMENT RETURNS

115
Q

PHOTOS 23-24

A

PORTFOLIO RETURN AND RISK

116
Q

PHOTO 25

A

INDEX NUMBERS INTERPRETATION

118
Q

PHOTOS 26-27

A

NORMAL PROBABILITY PLOT

119
Q

PHOTOS 28-29

A

TUTORS NORMAL DISTRIBUTION EXAMPLE

120
Q

PHOTO 30

A

STANDARD ERROR OF THE MEAN

121
Q

PHOTO 31

A

SAMPLING DISTRIBUTION PROPERTIES

122
Q

PHOTOS 32-33

A

CENTRAL LIMIT THEOREM

123
Q

PHOTO 34

A

CONFIDENCE INTERVAL ESTIMATION PROCESS

124
Q

PHOTOS 35-36

A

CONFIDENCE INTERVAL EXAMPLE

125
Q

PHOTOS 37-41

A

DETERMINING SAMPLE SIZE

126
Q

PHOTO 42

A

OUTCOMES AND PROBABILITIES OF HYPOTHESIS TESTING

127
Q

PHOTO 43

A

2 TAIL TESTS

128
Q

PHOTO 44-45

A

P VALUE 2 TAIL TESTS

129
Q

PHOTOS 46-47

A

1 TAIL TESTS

130
Q

PHOTO 48

A

P VALUE 1 TAIL

131
Q

PHOTOS 49-50

A

HYPOTHESIS TESTING FOR THE PROPORTION

132
Q

PHOTOS 51-52

A

SIMPLE REGRESSION MODEL AND EQUATION

133
Q

PHOTOS 53-58

A

SIMPLE REGRESSION EXAMPLE

134
Q

PHOTO 59

A

INTERPOLATION V EXTRAPOLATION

135
Q

PHOTO 60

A

EXAMPLES OF R2

136
Q

PHOTOS 61-62

A

COMPARING STANDARD ERRORS

137
Q

PHOTOS 63-65

A

F TEST FOR SIGNIFICANCE

138
Q

PHOTO 66

A

CONFIDENCE INTERVAL ESTIMATE FOR THE SLOPE

139
Q

PHOTOS 67-68

A

MULTIPLE REGRESSION MODEL AND EQUATION

140
Q

PHOTOS 69-73

A

MULTIPLE REGRESSION EXAMPLE

141
Q

PHOTO 74-75

A

ADJUSTED R2

142
Q

PHOTOS 76-79

A

SIGNIFICANCE F TEST MULTIPLE

143
Q

PHOTOS 80-83

A

ARE INDIVIDUAL VARIABLES SIGNIFICANT

144
Q

PHOTOS 84-85

A

CONFIDENCE INTERVAL ESTIMATE FOR THE SLOPE MULTIPLE

145
Q

PHOTOS 86-91

A

DUMMY VARIABLES

146
Q

PHOTOS 92-94

A

INTERACTION BETWEEN VARIABLES

147
Q

PHOTOS 95-96

A

TREND AND SEASONAL COMPONENT

148
Q

PHOTOS 97-98

A

MULTIPLICATIVE TIME SERIES MODEL

149
Q

PHOTOS 99-102

A

MOVING AVERAGES

150
Q

PHOTO 103

A

LEAST SQUARES TREND FITTING

151
Q

PHOTO 104

A

QUADRATIC FORM TREND FORECASTING

152
Q

PHOTOS 105-106

A

EXPONENTIAL TREND FORECASTING

153
Q

PHOTOS 107-108

A

MODEL SELECTION

154
Q

PHOTO 109

A

RESIDUAL ANALYSIS FORECASTING

155
Q

PHOTO 110

A

FORECASTING WITH SEASONAL DATA

156
Q

PHOTOS 111-114

A

QUARTERLY MODEL

157
Q

As an aid to the establishment of personnel requirements, the director of a hospital wishes to estimate the mean number of people who are admitted to the emergency room during a 24-hour period. The director randomly selects 64 different 24-hour periods and determines the number of admissions for each. For this sample, X̄ = 19.8 and s^2 = 25. Which of the following assumptions is necessary in order for a confidence interval to be valid?

A

No assumptions are necessary, because the sample size (n = 64) is large enough for the Central Limit Theorem to apply.

158
Q

It is desired to estimate the average total compensation of CEOs in the Service industry. Data were randomly collected from 18 CEOs and the 95% confidence interval was calculated to be ($2,181,260, $5,836,180). Which of the following interpretations is correct?

A

We are 95% confident that the average total compensation of all CEOs in the Service industry falls in the interval $2,181,260 to $5,836,180.

159
Q

The power of a statistical test is

A

the probability of rejecting H0 when it is false.

160
Q

Statistical independence determination

A

A and B are statistically independent if P(A ∩ B) = P(A) × P(B).

161
Q

Implications of increasing the sample size (sampling distributions - normal distribution)

A

With the sample size increasing from n = 25 to n = 100, more sample means
will be closer to the distribution mean. The standard error of the sampling
distribution of size 100 is much smaller than that of size 25, so the likelihood
that the sample mean will fall within 0.2 minutes of the mean is much
higher for samples of size 100 (probability = 0.8413) than for samples of
size 25 (probability = 0.3830).

162
Q

A market researcher states that she has 95% confidence that the mean monthly sales of a product are between $170,000 and $200,000. Explain the meaning of this statement.

A

If all possible samples of the same size n are taken, 95% of them include the true
population average monthly sales of the product within the interval developed.
Thus you are 95% confident that this sample is one that does correctly estimate
the true average amount.

163
Q

When can you assume that the sampling distribution is approx normal

A

No assumption about the shape of the population is needed. Since the population
standard deviation is known and n = 50, from the Central Limit Theorem, we may
assume that the sampling distribution of X̄ is approximately normal.

164
Q

What does reducing the confidence level do to the confidence interval

A

The reduced confidence level narrows the width of the confidence interval.

165
Q

A stationery store wants to estimate the mean retail value of greeting cards that it has in its inventory. A random sample of 20 greeting cards indicates a mean value of $4.95 and a standard deviation of $0.82.

Interpret the confidence interval and how is this helpful in estimating the value of total inventory

A

The store owner can be 95% confident that the population mean retail value
of greeting cards that the store has in its inventory is somewhere between
$4.56 and $5.34. The store owner could multiply the ends of the confidence
interval by the number of cards to estimate the total value of his inventory.
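A sketch that reproduces this interval from the sample figures in the question. The critical value t = 2.0930 (19 degrees of freedom, 95% confidence) is hard-coded from a t table because the Python standard library has no t distribution, and rounding the limits outward is an assumption made here to match the limits quoted on the card.

```python
from math import ceil, floor, sqrt

n, xbar, s = 20, 4.95, 0.82   # sample size, sample mean, sample standard deviation
t_crit = 2.0930               # t table value for d.f. = n - 1 = 19, 95% confidence

# Margin of error: t * S / sqrt(n)
margin = t_crit * s / sqrt(n)
low, high = xbar - margin, xbar + margin

# Assumption: round the limits outward to whole cents.
lower = floor(low * 100) / 100
upper = ceil(high * 100) / 100
print(lower, upper)
```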

166
Q

Interpret a proportion confidence interval

A

You are 95% confident that the population proportion of employers who
have used a recruitment service within the past two months to find new
staff is between 0.17 and 0.24.
You are 99% confident that the population proportion of employers who
have used a recruitment service within the past two months to find new
staff is between 0.17 and 0.25.

167
Q

What happens to the confidence interval when you increase the level of confidence

A

When the level of confidence is increased, the confidence interval becomes
wider. The loss in precision reflected as a wider confidence interval is the
price you have to pay to achieve a higher level of confidence.

168
Q

When do you reject the null hypothesis

A

Decision rule: Reject H0 if the test statistic is smaller than the lower critical value or greater than the upper critical value

169
Q

p value interpretation

A

photo in favourites on phone 31/5/2018

170
Q

Interpretation of hypothesis testing answer

A

There is enough evidence to conclude the population mean delivery time
has been reduced below the previous value of 25 minutes, at the 5% level
of significance.

171
Q

p-value = 0.0047 interpretation

A

Since the p-value = 0.0047 is less than alpha, there is
enough evidence to conclude the population mean delivery time has been
reduced below the previous value of 25 minutes.

172
Q

What does increasing the sample size do in regards to hypothesis testing and proportions

A

A larger sample size implies that there is more information about the population and reduces the standard error (variation) of the sample proportion

173
Q

Conditions of hypothesis testing when it isn't exactly normal

A

The samples used need to be random. The sample size must be large enough that the conditions np > 5 and n(1-p) > 5 are met

174
Q

What do you need to know to perform the t test on the population mean

A

You must assume that the observed sequence in which the data were collected is random and that the data are approximately normally distributed

175
Q

forecasting questions

A

Photos on phone album