Exam Revision Flashcards

1
Q

Statistics

A

Statistics is the branch of mathematics that examines ways to process and analyse data. Statistics provides procedures to collect and transform data in ways that are useful to business decision makers. To understand anything about statistics, you first need to understand the meaning of a variable.

2
Q

4 fundamental terms of statistics

A

Population
Sample
Parameter
Statistic

3
Q

Population

A

A population consists of all the members of a group about which you want to
draw a conclusion.

4
Q

Sample

A

A sample is the portion of the population selected for analysis

5
Q

Parameter

A

A parameter is a numerical measure that describes a characteristic of a population (a measure used to describe a population). Greek letters (e.g. μ, σ) refer to parameters.

6
Q

Statistic

A

A statistic is a numerical measure that describes a characteristic of a sample (a measure calculated from sample data). Roman letters (e.g. X̄, S) refer to statistics.

7
Q

2 types of statistics

A

Descriptive statistics

Inferential statistics

8
Q

Descriptive statistics

A

Collecting, summarising and presenting data

9
Q

Inferential statistics

A

Drawing conclusions about a population based on sample data/results (e.g. estimating a parameter from a statistic, or hypothesis testing).

10
Q

2 types of data

A

Categorical (defined categories)

Numerical (quantitative)

11
Q

2 types of numerical variables

A

Discrete (counted items)

Continuous (measured characteristics)

12
Q

4 levels of Measurement and Measurement Scales from highest to lowest

A

Ratio data
Interval data
Ordinal data
Nominal data

13
Q

Ratio data

A

Differences between measurements are meaningful and a true zero
exists

14
Q

Interval data

A

Differences between measurements are meaningful but no true zero exists (zero is arbitrary, so values can be negative, e.g. temperature in °C)

15
Q

Ordinal data

A

Ordered categories (rankings, order or scaling)

16
Q

Nominal data

A

Categories (no ordering or direction)

17
Q

4 measures used to describe data

A

Central tendency
Quartiles
Variation
Shape

18
Q

4 measures of central tendency

A

Arithmetic mean
Median
Mode
Geometric mean

19
Q

5 measures of variation

A
Range 
Interquartile range 
Variance
Standard deviation 
Coefficient of variation
20
Q

1 measure of shape

A

Skewness

21
Q

Arithmetic mean

A

Arithmetic mean is summing up the observations and dividing by the number of observations.

22
Q

Median and mean: extreme values

A

The median is not sensitive to extreme values and the mean is sensitive to extreme values.

23
Q

Sigma

A

Capital sigma (Σ) is shorthand for adding up the values.

24
Q

Median

A

In an ordered array, the median is the middle number (50% above and 50% below). Its main advantage over the arithmetic mean is that it is not affected by extreme values.
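To see that resistance in action, the sketch below adds one extreme value to a small data set and compares how far the mean and the median move. The numbers are made up purely for illustration.

```python
from statistics import mean, median

# Hypothetical data, chosen only so the arithmetic is easy to check by hand.
values = [2, 3, 4, 5, 6]
with_outlier = values + [100]  # same data plus one extreme value

mean_before, median_before = mean(values), median(values)
mean_after, median_after = mean(with_outlier), median(with_outlier)

# The mean is dragged towards the extreme value; the median barely moves.
print("mean:  ", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```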

25
Mode
A measure of central tendency. Value that occurs most often (the most frequent). Not affected by extreme values. Never use the mode by itself, always use in conjunction with median or mean. Unlike mean and median, there may be no unique (single) mode for a given data set. Used for either numerical or categorical (nominal) data.
26
Quartiles
Quartiles split the ranked data into four segments, with an equal number of values per segment. The first quartile, Q1, is the value for which 25% of the observations are smaller and 75% are larger. The second quartile, Q2, is the same as the median (50% are smaller, 50% are larger). Only 25% of the observations are greater than the third quartile, Q3
27
Measures of variation
Measures of variation give information on the spread or variability of the data values
28
Interquartile range
Like the median, Q1 and Q3, the IQR is a resistant summary measure (resistant to the presence of extreme values). It avoids outlier problems because the highest- and lowest-valued observations are left out of the calculation. IQR = 3rd quartile – 1st quartile, i.e. IQR = Q3 - Q1
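A small sketch of IQR = Q3 - Q1 compared with the range, using made-up data. It assumes the "inclusive" quartile convention from Python's `statistics.quantiles`; textbooks and software use slightly different quartile rules, so hand-calculated quartiles may differ a little.

```python
from statistics import quantiles

def iqr(data):
    """Return Q3 - Q1 using the 'inclusive' quartile convention (an assumption)."""
    q1, _q2, q3 = quantiles(data, n=4, method="inclusive")
    return q3 - q1

# Hypothetical data: identical except for one extreme value.
clean = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]

range_clean = max(clean) - min(clean)
range_outlier = max(with_outlier) - min(with_outlier)
iqr_clean = iqr(clean)
iqr_outlier = iqr(with_outlier)

# The range reacts strongly to the extreme value; the IQR does not change.
print("range:", range_clean, "->", range_outlier)
print("IQR:  ", iqr_clean, "->", iqr_outlier)
```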
29
Sample variance
Measures the average squared scatter around the mean, so its units are the square of the original units. The deviations are squared because some are negative and some are positive, and they would otherwise cancel out. The sample variance is the sum of the squared differences from the sample mean, divided by n - 1.
30
Sample standard deviation
Most commonly used measure of variation. Shows variation about the mean. Has the same units as the original data. It can be considered a measure of uncertainty.
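The sample variance and sample standard deviation can be worked through step by step as below. The data are hypothetical; the step-by-step result is compared against Python's built-in `statistics.variance` and `statistics.stdev`, which also divide by n - 1.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical sample.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
x_bar = sum(data) / n                            # sample mean
squared_devs = [(x - x_bar) ** 2 for x in data]  # squaring stops +/- deviations cancelling
sample_var = sum(squared_devs) / (n - 1)         # divide by n - 1 for a sample
sample_sd = sqrt(sample_var)                     # back in the original units

print(sample_var, variance(data))
print(sample_sd, stdev(data))
```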
31
Coefficient of variation
Measures relative variation i.e. shows variation relative to mean. Can be used to compare two or more sets of data measured in different units. Always expressed as percentage (%)
32
The Z score
The difference between a given observation and the mean, divided by the standard deviation. A Z score of 2.0 means that a value is 2.0 standard deviations from the mean. A Z score above 3.0 or below -3.0 is considered an outlier
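A minimal sketch of the Z score and the ±3.0 outlier rule described on this card. The function names and the example numbers are illustrative only.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

def is_outlier(x, mean, sd, cutoff=3.0):
    """Flag values whose Z score is above +cutoff or below -cutoff."""
    return abs(z_score(x, mean, sd)) > cutoff

# Hypothetical distribution with mean 50 and standard deviation 10.
print(z_score(70, 50, 10))     # 2.0 standard deviations above the mean
print(is_outlier(70, 50, 10))  # inside the +/-3.0 band
print(is_outlier(85, 50, 10))  # Z = 3.5, outside the band
```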
33
The shape of a distribution
Describes how data are distributed. A distribution is either symmetric or skewed
34
Left skewed and right skewed
When the data is left or negatively skewed the distance between the q1 and q2 is greater than the distance between q2 and q3. The reverse applies for right or positively skewed data. If the data is symmetric the distances are the same
35
What does a box and whisker plot show
Box and whisker plot show location, spread and shape.
36
Population variance
the average of the squared deviations of values from the mean
37
Population standard deviation
shows variation about the mean. is the square root of the population variance. has the same units as the original data
38
Covariance
The sample covariance measures the direction of the linear relationship between two numerical variables. Its size is affected by the units of measurement, so it cannot be used to judge relative strength. No causal effect is implied.
39
Correlation
Measures the relative strength of the linear relationship between two variables
40
Features of correlation coefficient
Also called standardised covariance, i.e. invariant to units of measure. Ranges between –1 and 1. The closer to –1, the stronger the negative linear relationship. The closer to 1, the stronger the positive linear relationship. The closer to 0, the weaker the linear relationship.
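The sketch below illustrates why the correlation coefficient is called a standardised covariance: rescaling one variable (for example metres to centimetres) changes the covariance but leaves the correlation unchanged. The data and helper names are hypothetical.

```python
from math import sqrt

def sample_cov(x, y):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

def correlation(x, y):
    """Covariance divided by the product of the two sample standard deviations."""
    return sample_cov(x, y) / sqrt(sample_cov(x, x) * sample_cov(y, y))

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
x_scaled = [100 * v for v in x]  # same measurements in a smaller unit

cov, cov_scaled = sample_cov(x, y), sample_cov(x_scaled, y)
r, r_scaled = correlation(x, y), correlation(x_scaled, y)

print(cov, cov_scaled)  # covariance changes with the units
print(r, r_scaled)      # correlation does not
```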
41
5 number summary
Numerical data summarised by quartiles. Xsmallest Q1 Median Q3 Xlargest
42
3 approaches to assessing probability
a priori Empirical Subjective
43
a priori
Classical probability. Based on prior knowledge
44
Empirical
Empirical probability. Based on observed data
45
Subjective
Subjective probability. Based on individual judgment or opinion about the probability of occurrence
46
Probability
a numerical value that represents the chance, likelihood, possibility that an event will occur (always between 0 and 1)
47
Discrete probability
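A discrete random variable can only take certain (countable) values, and a discrete probability distribution assigns a probability to each of those values.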
48
4 essential properties of the binomial distribution
A fixed number of observations. Two mutually exclusive and collectively exhaustive events. Constant probability for each observation. Observations are independent.
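When those four properties hold, the probability of exactly x events of interest in n observations is C(n, x) · p^x · (1 - p)^(n - x). A short sketch of that formula, with a hypothetical example of n = 4 observations and p = 0.5:

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent observations with constant probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Hypothetical example: 4 observations, probability 0.5 each.
n, p = 4, 0.5
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

for x, prob in distribution.items():
    print(x, prob)

# The outcomes are collectively exhaustive, so the probabilities sum to 1.
print(sum(distribution.values()))
```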
49
Index numbers
Index numbers allow relative comparisons over time. Index numbers are reported relative to a Base Period Index. Base period index = 100 by definition. Used for an individual item or measurement.
50
Which price index to use
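The Paasche index is more accurate because it uses current-period quantities, but it is harder to produce because those quantities must be updated every period.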
51
Characteristics of the normal distribution
Bell-shaped. Symmetrical. Mean, median and mode are equal. Central location is determined by the mean. Spread is determined by the standard deviation (the population standard deviation, σ). The random variable X has an infinite theoretical range.
52
What is the height of the curve a measure of
Probability density. Probabilities themselves are given by areas under the curve.
53
What must the area under the curve be
1
54
Calculate descriptive numerical measures to determine normality
Do the mean and median have similar values? (Remember there may be no unique mode or there may be multiple modes.) Is the interquartile range approximately 1.33 times the standard deviation? Is the range approximately 6 times the standard deviation?
55
Calculate standard deviation to determine normality
Do approximately 2/3 of the observations lie within mean ± 1 standard deviation? Do approximately 80% of the observations lie within mean ± 1.28 standard deviations? Do approximately 95% of the observations lie within mean ± 2 standard deviations?
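These benchmark percentages come straight from the standard normal distribution. As a rough check, the sketch below computes the share of a normal distribution lying within ± k standard deviations of the mean, using `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def share_within(k):
    """Proportion of a normal distribution within +/- k standard deviations."""
    return Z.cdf(k) - Z.cdf(-k)

shares = {k: share_within(k) for k in (1, 1.28, 2)}
for k, share in shares.items():
    print(f"within +/- {k} sd: {share:.4f}")
```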
56
Continuous probability density function
Mathematical expression that defines the distribution of the values for a continuous random variable.
57
Sampling distribution
A sampling distribution is a distribution of all of the possible values of a statistic for a given size sample selected from a population.
58
Standard error of the mean
Different samples of the same size from the same population will yield different sample means. A measure of the variability in the mean from sample to sample is given by the Standard Error of the Mean. Note that the standard error of the mean decreases as the sample size increases.
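The standard error of the mean is σ/√n, which is why it shrinks as the sample size grows. A tiny sketch, assuming a hypothetical population standard deviation of σ = 2:

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean for samples of size n."""
    return sigma / sqrt(n)

# Hypothetical population standard deviation.
sigma = 2
for n in (25, 100, 400):
    print(n, standard_error(sigma, n))  # quadrupling n halves the standard error
```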
59
If the population is not normal
We can apply the Central Limit Theorem, which states that regardless of the shape of the population distribution, as long as the sample size is large enough (generally n ≥ 30) the sampling distribution of the sample mean (X̄) will be approximately normal, with mean μ and standard error σ/√n.
60
Sampling Distribution of the Proportion
Selecting all possible samples of a certain size, the distribution of all possible sample proportions is the sampling distribution of the proportion.
61
Simple random sampling
Every individual or item from the frame (N) has an equal chance of being selected (1/N). Selection may be with replacement or without replacement. Samples can be obtained from a table of random numbers or computer random number generators. Simple to use but may not be a good representation of the population’s underlying characteristics.
62
Systematic sampling
Divide frame of N individuals into n groups of k individuals: k = N/n. Randomly select one individual from the 1st group. Select every kth individual thereafter. Like simple random sampling, simple to use but may not be a good representation of the population’s underlying characteristics.
63
Stratified sampling
Divide population into two or more subgroups (called strata) according to some common characteristic. A simple random sample is selected from each subgroup, with sample sizes proportional to strata sizes – called proportionate stratified sampling. Samples from subgroups are combined into one.
64
Stratified sampling pros
More efficient than simple random sampling or systematic sampling because of assured representation of items across entire population. Homogeneity of items within each stratum provides greater precision in the estimates of underlying population parameters.
65
Cluster samples
Population is divided into several ‘clusters’, each representative of the population e.g. postcode areas, electorates etc. A simple random sample of clusters is selected: All items in the selected clusters can be used, or items can be chosen from a cluster using another probability sampling technique.
66
Cluster sampling pros
More cost effective than random sampling, especially if population is geographically widespread. Often requires a larger sample size compared to simple random sampling or stratified sampling for same level of precision.
67
Survey errors
Coverage error – appropriate or adequate frame? Non-response error – results in non-response bias. Measurement error – ambiguous wording, halo effect or respondent error. Sampling error – always exists and is the difference between sample statistic and population parameter.
68
Point estimate
A point estimate is the value of a single sample statistic.
69
Confidence interval
A confidence interval provides a range of values constructed around the point estimate.
70
Confidence interval estimation
An interval gives a range of values: Takes into consideration variation in sample statistics from sample to sample. Based on observations from 1 sample. Gives information about closeness to unknown population parameters. Stated in terms of level of confidence. Can never be 100% confident.
71
A relative frequency interpretation
In the long run, 90%, 95% or 99% of all the confidence intervals that can be constructed (in repeated samples) will contain the unknown true parameter.
72
Confidence Interval for μ (σ Known) assumptions
Assumptions: Population standard deviation σ is known Population is normally distributed If population is not normal, use Central Limit Theorem.
73
Will the true average always be in the middle of the confidence interval
Not necessarily. The interval is centred on the sample mean, which is a good but not perfect estimate of the true average.
74
Confidence interval for μ (σ Unknown)
If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S. This introduces extra uncertainty, since S varies from sample to sample, so we use the Student t distribution instead of the normal distribution. The t value depends on the degrees of freedom, equal to the sample size minus 1 (d.f. = n - 1): the number of observations that are free to vary after the sample mean has been calculated.
75
Degrees of freedom
Number of observations that are free to vary after the sample mean has been calculated.
76
Confidence interval example interpretation
We are 95% confident that the true proportion of left-handers in the population is between 0.1651 and 0.3349 (16.51% to 33.49%). That is, although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed from repeated samples of size 100 in this manner will contain the true proportion.
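The interval on this card can be reproduced with the usual formula p ± Z·√(p(1 - p)/n). The card gives only the interval and n = 100; the sample proportion of 0.25 used below is an assumption, inferred from the midpoint of the interval.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)          # margin of error
    return p_hat - margin, p_hat + margin

# p_hat = 0.25 is an assumption (midpoint of the interval on the card); n = 100 is from the card.
lower, upper = proportion_ci(0.25, 100)
print(round(lower, 4), round(upper, 4))
```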
77
Sampling error
The required sample size can be found to reach a desired margin of error (e) with a specified level of confidence (1 - alpha). The margin of error is also called a sampling error: The amount of imprecision in the estimate of the population parameter. The amount added and subtracted to the point estimate to form the confidence interval.
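For a mean with known σ, the standard sample size formula is n = (Z·σ / e)², rounded up to the next whole number. A sketch with hypothetical values (σ = 45, e = 5), not taken from the cards:

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence=0.95):
    """Smallest n giving a margin of error of at most e at the given confidence level."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)  # always round up, never down

# Hypothetical inputs: sigma = 45, desired margin of error e = 5.
print(sample_size_for_mean(45, 5, confidence=0.90))
print(sample_size_for_mean(45, 5, confidence=0.95))
```

Rounding up guarantees the margin of error is no larger than e; rounding down would slightly exceed it.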
78
Rule for rounding confidence intervals
Always round up (sideways)
79
Hypothesis
A hypothesis is a statement (assumption) about a population parameter
80
The Null Hypothesis, H0
States the belief or assumption in the current situation (the status quo). Begin with the assumption that the null hypothesis is true (similar to the notion of innocent until proven guilty). Always contains the ‘=’, ‘≤’ or ‘≥’ sign. May or may not be rejected. Is always about a population parameter, e.g. μ, not about a sample statistic.
81
The Alternative Hypothesis, H1
Is the opposite of the null hypothesis, e.g. the average number of TV sets in Australian homes is not equal to 3 (H1: μ ≠ 3). Challenges the status quo. Can only contain the ‘<’, ‘>’ or ‘≠’ sign. May or may not be proven. Is generally the claim or hypothesis that the researcher is trying to prove.
82
Errors in making decisions (Hypothesis testing)
Type I error: reject a true null hypothesis. Considered a serious type of error. Type II error: fail to reject a false null hypothesis.
83
The probability of errors
The probability of a Type I error is alpha. Called the level of significance of the test, e.g. 0.01, 0.05, 0.10. Set by the researcher in advance. The probability of a Type II error is β.
84
p-value approach to testing
p-value: Probability of obtaining a test statistic equal to or more extreme (≤ or ≥) than the observed sample value, given H0 is true. Also called the observed level of significance. Smallest value of alpha for which H0 can be rejected. Obtain the p-value from Table E.2 or computer. If p-value < alpha, reject H0. If p-value >= alpha, do not reject H0.
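Instead of a table lookup, the p-value for a Z test can be computed from the standard normal distribution. The sketch below assumes a Z test statistic is already in hand; the value Z = -2.6 and the helper names are hypothetical.

```python
from statistics import NormalDist

def p_value(z, tail="two"):
    """p-value for a Z test statistic; tail is 'lower', 'upper' or 'two'."""
    cdf = NormalDist().cdf
    if tail == "lower":
        return cdf(z)
    if tail == "upper":
        return 1 - cdf(z)
    return 2 * (1 - cdf(abs(z)))

def decide(p, alpha=0.05):
    """Apply the rule on the card: reject H0 only when p-value < alpha."""
    return "reject H0" if p < alpha else "do not reject H0"

# Hypothetical lower-tail test with Z = -2.6.
p = p_value(-2.6, tail="lower")
print(round(p, 4), decide(p, alpha=0.05))
```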
85
Regression analysis
Regression analysis is used to: predict the value of a dependent variable (Y) based on the value of at least one independent variable (X); explain the impact of changes in an independent variable on the dependent variable.
86
Dependent variable (y)
Dependent variable (Y): the variable we wish to predict or explain (response variable)
87
Independent variable (x)
Independent variable (X): the variable used to explain the dependent variable (explanatory variable)
88
Simple linear regression
Only one independent variable, X. Relationship between X and Y is described by a linear function. Changes in Y are assumed to be related to changes in X.
89
b0 and b1
b0 and b1 are the values that minimise the sum of the squared differences between actual values (Y) and predicted values (Ŷ)
90
b0
b0 is the estimated average value of Y when the value of X is zero
91
b1
b1 is the estimated change in the average value of Y as a result of a one-unit change in X
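For simple linear regression the least-squares values have a closed form: b1 = Σ(X - X̄)(Y - Ȳ) / Σ(X - X̄)² and b0 = Ȳ - b1·X̄. A minimal sketch with hypothetical data:

```python
def least_squares(x, y):
    """Return (b0, b1) minimising the sum of squared differences between Y and predicted Y."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    b1 = sxy / sxx           # estimated change in average Y per one-unit change in X
    b0 = y_bar - b1 * x_bar  # estimated average Y when X is zero
    return b0, b1

# Hypothetical observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = least_squares(x, y)
predicted = [b0 + b1 * a for a in x]
print(round(b0, 4), round(b1, 4))
```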
92
Coefficient of Determination, r2
The coefficient of determination is the proportion of the total variation in the dependent variable that is explained by variation in the independent variable. It is also called r-squared and is denoted r^2.
93
ASSUMPTIONS OF REGRESSION
Linearity of the relationship. Independence of error values. Normality of error values. Constant variance of the errors. Check these assumptions by examining residuals.
94
residual for observation
The residual for observation i, ei, is the difference between its observed and predicted value
95
Idea of the multiple regression model
Examine the linear relationship between 1 dependent (Y) & 2 or more independent variables (Xi).
96
Why we need Adjusted r^2
r2 never decreases when a new X variable is added to the model. This can be a disadvantage when comparing models. What is the net effect of adding a new variable? We lose a degree of freedom when a new X variable is added. Did the new X variable add enough explanatory power to offset the loss of one degree of freedom?
97
Adjusted r^2
Shows the proportion of variation in Y explained by all X variables, adjusted for the number of X variables used. Penalises excessive use of unimportant independent variables. Never larger than r^2. Useful when comparing models.
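The usual formula is adjusted r^2 = 1 - (1 - r^2)(n - 1)/(n - k - 1), where n is the sample size and k the number of X variables. The sketch below uses made-up r^2 values to show a new variable raising r^2 slightly while adjusted r^2 falls.

```python
def adjusted_r2(r2, n, k):
    """Adjusted r^2 for a model with k independent variables fitted to n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical comparison on n = 20 observations:
# adding a second X variable nudges r^2 from 0.600 to 0.605.
one_variable = adjusted_r2(0.600, n=20, k=1)
two_variables = adjusted_r2(0.605, n=20, k=2)

print(round(one_variable, 4), round(two_variables, 4))
# Here the small gain in r^2 does not offset the lost degree of freedom.
```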
98
F Test for Overall Significance of the Model:
Shows if there is a linear relationship between all of the X variables considered together and Y.
99
multiple regression assumptions
The errors are normally distributed. Errors have a constant variance. The model errors are independent.
100
Using dummy variables
A dummy variable is a categorical explanatory variable with two levels (yes or no, on or off, male or female), coded as 0 or 1. Regression intercepts are different if the variable is significant. Assumes equal slopes for other variables. If there are more than two levels, the number of dummy variables needed is the number of levels minus 1.
101
Time-series data and plot
Numerical data obtained at regular time intervals. The time intervals can be annually, quarterly, daily, hourly etc. A time-series plot is a two-dimensional plot of time series data. The vertical axis measures the variable of interest. The horizontal axis corresponds to the time periods.
102
Classical Multiplicative Time-series Model Components
Trend component Seasonal component Cyclical component Irregular component
103
Trend component
Long-run increase or decrease over time (overall upward or downward movement). Data taken over a long period of time. Trend can be upward or downward. Trend can be linear or non-linear.
104
Seasonal component
Short-term regular wave-like patterns. Observed within 1 year. Often monthly or quarterly.
105
Cyclical component
Long-term wave-like patterns. Usually occur every 2-10 years. Often measured peak to peak or trough to trough.
106
Irregular component
Unpredictable, random, ‘residual’ fluctuations. Due to random variations of: Nature. Accidents or unusual events. ‘Noise’ in the time series. Usually short duration and non-repeating.
107
Smoothing the Annual Time Series – Moving Averages
A series of arithmetic means over time. Calculate moving averages to get an overall impression of the pattern of movement over time. Moving averages can be used for smoothing: averages of consecutive time-series values for a chosen period of length (L). Result dependent upon choice of L (length of period for computing means). Examples: For a 5 year moving average, L = 5. For a 7 year moving average, L = 7 etc.
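A moving average of length L is just the mean of each run of L consecutive values. A small sketch with a hypothetical annual series and L = 3:

```python
def moving_average(series, L):
    """Means of every run of L consecutive values in the series."""
    return [sum(series[i:i + L]) / L for i in range(len(series) - L + 1)]

# Hypothetical annual values.
series = [4, 6, 5, 3, 7, 9, 8]

smoothed = moving_average(series, L=3)
print(smoothed)  # len(series) - L + 1 averages, so the smoothed series is shorter
```

A larger L smooths more but leaves fewer averaged points.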
108
PHOTOS 1-8
Frequency distribution, histogram and graphing
109
PHOTO 9
CV
110
PHOTO 10
SKEWNESS
111
PHOTOS 11-12
EMPIRICAL RULE
112
PHOTOS 13-14
BOX AND WHISKER
113
PHOTOS 15-18
BAYES THEOREM
114
PHOTOS 19-22
INVESTMENT RETURNS
115
PHOTOS 23-24
PORTFOLIO RETURN AND RISK
116
PHOTO 25
INDEX NUMBERS INTERPRETATION
117
GO OVER DECISION MAKING FLASHCARDS AND PHOTOS
118
PHOTOS 26-27
NORMAL PROBABILITY PLOT
119
PHOTOS 28-29
TUTORS NORMAL DISTRIBUTION EXAMPLE
120
PHOTO 30
STANDARD ERROR OF THE MEAN
121
PHOTO 31
SAMPLING DISTRIBUTION PROPERTIES
122
PHOTOS 32-33
CENTRAL LIMIT THEOREM
123
PHOTO 34
CONFIDENCE INTERVAL ESTIMATION PROCESS
124
PHOTOS 35-36
CONFIDENCE INTERVAL EXAMPLE
125
PHOTOS 37-41
DETERMINING SAMPLE SIZE
126
PHOTO 42
OUTCOMES AND PROBABILITIES OF HYPOTHESIS TESTING
127
PHOTO 43
2 TAIL TESTS
128
PHOTO 44-45
P VALUE 2 TAIL TESTS
129
PHOTOS 46-47
1 TAIL TESTS
130
PHOTO 48
P VALUE 1 TAIL
131
PHOTOS 49-50
HYPOTHESIS TESTING FOR THE PROPORTION
132
PHOTOS 51-52
SIMPLE REGRESSION MODEL AND EQUATION
133
PHOTOS 53-58
SIMPLE REGRESSION EXAMPLE
134
PHOTO 59
INTERPOLATION V EXTRAPOLATION
135
PHOTO 60
EXAMPLES OF R2
136
PHOTOS 61-62
COMPARING STANDARD ERRORS
137
PHOTOS 63-65
F TEST FOR SIGNIFICANCE
138
PHOTO 66
CONFIDENCE INTERVAL ESTIMATE FOR THE SLOPE
139
PHOTOS 67-68
MULTIPLE REGRESSION MODEL AND EQUATION
140
PHOTOS 69-73
MULTIPLE REGRESSION EXAMPLE
141
PHOTO 74-75
ADJUSTED R2
142
PHOTOS 76-79
SIGNIFICANCE F TEST MULTIPLE
143
PHOTOS 80-83
ARE INDIVIDUAL VARIABLES SIGNIFICANT
144
PHOTOS 84-85
CONFIDENCE INTERVAL ESTIMATE FOR THE SLOPE MULTIPLE
145
PHOTOS 86-91
DUMMY VARIABLES
146
PHOTOS 92-94
INTERACTION BETWEEN VARIABLES
147
PHOTOS 95-96
TREND AND SEASONAL COMPONENT
148
PHOTOS 97-98
MULTIPLICATIVE TIME SERIES MODEL
149
PHOTOS 99-102
MOVING AVERAGES
150
PHOTO 103
LEAST SQUARES TREND FITTING
151
PHOTO 104
QUADRATIC FORM TREND FORECASTING
152
PHOTOS 105-106
EXPONENTIAL TREND FORECASTING
153
PHOTOS 107-108
MODEL SELECTION
154
PHOTO 109
RESIDUAL ANALYSIS FORECASTING
155
PHOTO 110
FORECASTING WITH SEASONAL DATA
156
PHOTOS 111-114
QUARTERLY MODEL
157
As an aid to the establishment of personnel requirements, the director of a hospital wishes to estimate the mean number of people who are admitted to the emergency room during a 24-hour period. The director randomly selects 64 different 24-hour periods and determines the number of admissions for each. For this sample, X̄ = 19.8 and s^2 = 25. Which of the following assumptions is necessary in order for a confidence interval to be valid?
No assumptions are necessary (the sample of n = 64 is large enough for the Central Limit Theorem to apply)
158
It is desired to estimate the average total compensation of CEOs in the Service industry. Data were randomly collected from 18 CEOs and the 95% confidence interval was calculated to be ($2,181,260, $5,836,180). Which of the following interpretations is correct?
We are 95% confident that the average total compensation of all CEOs in the Service industry falls in the interval $2,181,260 to $5,836,180.
159
The power of a statistical test is
the probability of rejecting H0 when it is false.
160
Statistical independence determination
Events A and B are statistically independent if P(A ∩ B) = P(A) × P(B).
161
Implications of increasing the sample size (sampling distributions - normal distribution)
With the sample size increasing from n = 25 to n = 100, more sample means will be closer to the distribution mean. The standard error of the sampling distribution for n = 100 is half that for n = 25, so the likelihood that the sample mean will fall within ± 0.2 minutes of the mean is much higher for samples of size 100 (probability = 0.6827) than for samples of size 25 (probability = 0.3830).
162
A market researcher states that she has 95% confidence that the mean monthly sales of a product are between $170,000 and $200,000. Explain the meaning of this statement.
If all possible samples of the same size n are taken, 95% of the intervals built from them will include the true population average monthly sales of the product. Thus you are 95% confident that this sample is one that does correctly estimate the true average amount.
163
When can you assume that the sampling distribution is approx normal
The population does not need to be normal. Since the population standard deviation is known and n = 50, from the Central Limit Theorem we may assume that the sampling distribution of X̄ is approximately normal.
164
What does reducing the confidence level do to the confidence interval
The reduced confidence level narrows the width of the confidence interval.
165
A stationery store wants to estimate the mean retail value of greeting cards that it has in its inventory. A random sample of 20 greeting cards indicates a mean value of $4.95 and a standard deviation of $0.82. Interpret the confidence interval and how is this helpful in estimating the value of total inventory
The store owner can be 95% confident that the population mean retail value of greeting cards that the store has in its inventory is somewhere between $4.56 and $5.34. The store owner could multiply the ends of the confidence interval by the number of cards to estimate the total value of his inventory.
166
Interpret a proportion confidence interval
You are 95% confident that the population proportion of employers who have used a recruitment service within the past two months to find new staff is between 0.17 and 0.24. You are 99% confident that the population proportion of employers who have used a recruitment service within the past two months to find new staff is between 0.17 and 0.25.
167
What happens to the confidence interval when you increase the level of confidence
When the level of confidence is increased, the confidence interval becomes wider. The loss in precision reflected as a wider confidence interval is the price you have to pay to achieve a higher level of confidence.
168
When do you reject the null hypothesis
Decision rule: Reject H0 if the test statistic is smaller than the lower critical value or greater than the upper critical value
169
p value interpretation
photo in favourites on phone 31/5/2018
170
Interpretation of hypothesis testing answer
There is enough evidence to conclude the population mean delivery time has been reduced below the previous value of 25 minutes, at the 5% level of significance.
171
p- value = 0.0047 interpretation
Since p- value = 0.0047 is less than alpha there is enough evidence to conclude the population mean delivery time has been reduced below the previous value of 25 minutes.
172
What does increasing the sample size do in regards to hypothesis testing and proportions
A larger sample size implies that there is more information about the population and reduces the standard error (variation) of the sample proportion
173
Conditions of hypothesis testing when it isn't exactly normal
The samples used need to be random. The sample size must also be large enough that the conditions np > 5 and n(1 - p) > 5 are met
174
What do you need to know to perform the t test on the population mean
You must assume that the observed sequence in which the data were collected is random and that the data are approximately normally distributed
175
forecasting questions
Photos on phone album