Midterm 2 Flashcards

1
Q

Know how sample distribution is related to the population distribution

A

Population Distribution: Frequency distribution of all elements in a population, described by parameters such as population mean (μ) and standard deviation (σ).
Sample Distribution: Distribution derived from a sample, described by sample statistics such as sample mean (x̄) and standard deviation (s).
Parameters (μ, σ) describe the population, while statistics (x̄, s) describe the sample.

2
Q

Know the difference between statistics (e.g., sample mean x̄, sample variance s², sample correlation r) and parameters (e.g., population mean μ, population variance σ², population correlation ρ). (Lesson 6-1 slides, pp.14-19)

A

Population distribution - frequency distribution of all elements (people) of the population
- A smooth line
- Population Mean denoted by the Greek letter μ
- Population Standard Deviation denoted by the Greek letter σ
- μ and σ are called parameters; they are unknown – we can only guess about them

Sample distribution - frequency distribution (histogram) of all elements (people) in your sample
- Known exactly once you have your sample
- Not a smooth line
- Sample Mean denoted by 𝑋̅ (XBAR)
- Sample Standard Deviation denoted by s
- 𝑋̅ (XBAR) and s are called statistics; they are known – we use them to guess about the population distribution

3
Q

Know why use sample distribution to study population distribution in statistics. (Lesson 6-1 slides, pp.11-13)

A
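  • Descriptive statistics describes the data in a sample
  • Inferential statistics uses data from samples to make inferences/generalizations about a population
  • Measuring every element of a population is usually impractical, so the known sample distribution is used to estimate the unknown population distribution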
4
Q

Know the concept of outliers and how to identify outliers. (Lesson 6-1 slides, pp.34, 39)

A
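  • An outlier is an observation that is unusual or rare compared with the rest of the data

How to identify outliers?

Sorting the data to find extreme values (practical when the data set is small)

Using z-scores (only if the population distribution is normal)
Flag observations with z > +3 or z < -3

Graphing the data
Histogram
Scatter plots
Boxplots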
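
The z-score rule above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' method: the function name `z_score_outliers`, the order values, and the choice to use the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

def z_score_outliers(values, threshold=3.0):
    """Return the values whose z-score is above +threshold or below -threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: treat the data as the whole group
    if sigma == 0:
        return []           # all values identical, so nothing can be unusual
    return [x for x in values if abs((x - mu) / sigma) > threshold]

# Hypothetical data: 19 typical order values and one extreme value.
orders = [50, 52, 48, 51, 49, 50, 53, 47, 50, 52,
          49, 51, 48, 50, 52, 49, 51, 50, 48, 500]
outliers = z_score_outliers(orders)
print(outliers)
```

One caveat on the threshold: when the standard deviation is computed from the data itself, as here, no z-score can exceed the square root of n - 1, so with 10 or fewer observations nothing is ever flagged at ±3. Sorting or graphing is the better check for small data sets.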

5
Q

What is normal distribution?

A

The Normal Distribution is a Probability Distribution
An example of population distribution

6
Q

Which parameters are used to define a normal distribution?

A

The mean (center) and standard deviation (spread) determine the shape of the curve:
Mean = μ, Standard Deviation = σ
Knowing μ and σ, you can compute probability values

Normal Distribution is denoted as N(μ, σ²)
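
To illustrate how μ and σ are enough to compute probability values, here is a small sketch using `NormalDist` from Python's standard `statistics` module. The test-score setting and the numbers (μ = 70, σ = 10) are hypothetical.

```python
from statistics import NormalDist

# Hypothetical example: test scores assumed to follow N(70, 10^2).
# NormalDist takes the standard deviation (sigma), not the variance.
scores = NormalDist(mu=70, sigma=10)

p_below_85 = scores.cdf(85)                  # P(X <= 85)
p_between = scores.cdf(80) - scores.cdf(60)  # P(60 <= X <= 80)

print(round(p_below_85, 4))
print(round(p_between, 4))
```

Note the difference in convention: the notation N(μ, σ²) lists the variance, while `NormalDist` expects the standard deviation.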

7
Q

What is the shape of normal distribution? (Lesson 6-1 slides, pp.22)

A

“Bell shaped” and Symmetric

8
Q

Know how to compute Z score.

A
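
The standard z-score formula, with a worked example. The numbers (x = 85, μ = 70, σ = 10) are hypothetical and chosen only for illustration:

```latex
z = \frac{x - \mu}{\sigma}
\qquad\text{e.g.}\qquad
z = \frac{85 - 70}{10} = 1.5
```

Here the observation lies 1.5 standard deviations above the mean.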
9
Q

What is the distribution of Z score? (Lesson 6-1 slides, pp.30-32)

A

Intuition: measures how far an observation is from the mean
How many standard deviations an observation is away from the mean
A way of measuring how unusual an observation is
Distribution: if X follows N(μ, σ²), then Z follows the standard normal distribution N(0, 1)

10
Q

What is sampling frame? (Lesson 6-1 slides, pp.44)

A

Sampling Frame – the source material or device from which the sample may be drawn
o Working population
o Mailing lists – Database Marketers
o Phone book

11
Q

What are the three types of sampling errors? (Lesson 6-1 slides, pp.43-45)

A
  • Sampling Frame Error
    o A frame error occurs when the wrong sub-population is used to select a sample
    o Example: in the 1936 presidential election between Franklin D. Roosevelt and Alf Landon, the Literary Digest poll favored Landon, while Gallup predicted Roosevelt’s win with a much smaller sample
    o How to reduce: understand the research question before selecting the sample
  • Random Sampling Error
    o The error caused by a particular sample not being representative of the population of interest due to random variation
    o Even randomized samples will have some sampling error, since a sample is only an approximation of the whole population
    o How to reduce: use random selection and increase the sample size
  • Nonresponse Error
    o Happens when there is a significant difference between those who responded to the survey and those who did not
    o How to reduce: design a better survey (e.g., funnel approach, projective techniques, counter-biasing statement, etc.)
12
Q

What are the commonly used sampling techniques (e.g., 4 probability sampling techniques, 4 nonprobability sampling techniques)?

A
Probability sampling techniques:
  • Simple Random Sampling
    o Every member of the population has an equal likelihood of being selected
  • Systematic Sampling
    o Sample members are selected from a larger population according to a random starting point but with a fixed, periodic interval
  • Stratified Sampling
    o The population is partitioned into subpopulations (strata), and a sample is drawn from each stratum
    o The strata should be collectively exhaustive and mutually exclusive: every element in the population must be assigned to one and only one stratum
    o Ensures each subgroup within the population receives proper representation within the sample
  • Cluster Sampling
    o The total population is divided into clusters, a random sample of clusters is selected, and data are collected from the members of the selected clusters
_________________________________________________________
Nonprobability sampling techniques:
  • Convenience Sampling
    o Obtaining the people that are most conveniently available
    o E.g., mall intercept interviews
  • Judgement/Expert Sampling
    o Experienced individual selects the sample based on judgment about appropriate characteristics
    o Used when the population of interest is very small or specific
  • Quota Sampling
    o Selects sample such that various subgroups are represented. Similar to stratified sampling, but with nonprobability sampling within each quota
  • Snowball Sampling
    o An initial group of respondents is first selected.
    o After being interviewed, these respondents are asked to identify others who belong to the target population of interest.
    o Subsequent respondents are selected based on the referrals.
    o Commonly used in network marketing, e.g., who are the members of your network? Who among your friends influences your decisions the most?
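
The four probability sampling techniques listed earlier in this answer can be sketched with Python's standard `random` module. This is an illustrative sketch under stated assumptions: the function names, the customer IDs, the age strata, and the store clusters are all hypothetical, and cluster sampling is taken here to mean randomly selecting whole clusters and keeping all of their members (one-stage cluster sampling).

```python
import random

def simple_random_sample(population, n, rng):
    """Every element has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Random starting point, then every k-th element (fixed interval k)."""
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a random sample from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep all of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

rng = random.Random(42)          # fixed seed so the sketch is repeatable
customers = list(range(1, 101))  # hypothetical customer IDs 1..100

srs = simple_random_sample(customers, 10, rng)
systematic = systematic_sample(customers, 10, rng)
strata = {"under_30": customers[:40], "30_plus": customers[40:]}
stratified = stratified_sample(strata, 5, rng)
stores = {f"store_{i}": customers[i * 20:(i + 1) * 20] for i in range(5)}
clustered = cluster_sample(stores, 2, rng)
```

The nonprobability techniques are not sketched because their selection step depends on convenience, judgment, or referrals rather than on a random draw.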

13
Q

Given the acceptable margin of error and acceptable confidence level, how to determine the sample size required for the study? (Lesson 6-1 slides, pp.51-58)

A

Factors: Acceptable margin of error, confidence level, and population variability.
Example formula: For a 90% confidence level with known standard deviation σ, n = (1.645 × σ / E)², where E is the acceptable margin of error.
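
A minimal sketch of the sample-size calculation, assuming the standard formula n = (z × σ / E)² for estimating a mean, where z is the two-sided critical value for the chosen confidence level, σ is the population standard deviation, and E is the acceptable margin of error. The function name and the numbers (σ = 15, E = 2) are hypothetical.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, margin_of_error, confidence):
    """n = (z * sigma / E)^2, rounded up to a whole respondent."""
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)

# Hypothetical inputs: sigma = 15, acceptable margin of error = 2.
n_90 = required_sample_size(sigma=15, margin_of_error=2, confidence=0.90)
n_95 = required_sample_size(sigma=15, margin_of_error=2, confidence=0.95)
n_99 = required_sample_size(sigma=15, margin_of_error=2, confidence=0.99)
print(n_90, n_95, n_99)
```

The required sample size grows with the confidence level and with population variability, and it shrinks as the acceptable margin of error widens.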

14
Q

Know the 4 steps to transform responses to numeric data, especially how to code responses in open-ended and closed-ended questions. (Lesson 7-1 slides, pp.6-11)

A

Validation & Editing:
Check for fraud, screening, procedural adherence, and completeness.

Coding:
Closed-ended: Assign numerical values to responses.
Open-ended: Group responses, assign codes, and tag.

Data Capture:
Convert responses into machine-readable format (e.g., SPSS).

Logical Cleaning:
Use software to detect errors and inconsistencies.
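
As a rough illustration of the coding step for a closed-ended question, the sketch below maps answer labels to numeric codes. The code book, the missing-value code 99, and the sample answers are all hypothetical; real code books are defined per questionnaire.

```python
# Hypothetical code book for a closed-ended satisfaction question.
CODE_BOOK = {
    "Very dissatisfied": 1,
    "Dissatisfied": 2,
    "Neutral": 3,
    "Satisfied": 4,
    "Very satisfied": 5,
}
MISSING = 99  # assumed code for blank or unrecognized answers

def code_response(answer):
    """Translate one raw answer into its numeric code."""
    return CODE_BOOK.get(answer.strip(), MISSING)

raw_answers = ["Satisfied", "Very satisfied", "", "Neutral ", "satisfied?"]
coded = [code_response(a) for a in raw_answers]
print(coded)
```

Open-ended answers need an extra manual step first: grouping similar responses into categories, after which each category receives a code in the same way.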

15
Q

Know which type of chart needs to be used under different circumstances. (Lesson 7-1 slides, pp.15-17)

A

Line Chart: Time trends.
Pie Chart: Proportions.
Bar Chart: Comparisons between groups.
Stacked/Clustered Bar Chart: Multiple group comparisons.

16
Q

What is sampling distribution?

A

In statistics, a sampling distribution or finite-sample distribution is the probability distribution of a given random-sample-based statistic.

17
Q

Know central limit theorem (i.e., when sample size n is large enough, according to CLT, the sampling distribution of the sample mean approximately follows a normal distribution N(μ, σ²/n), where μ is the population mean and σ² is the population variance, no matter what shape the population distribution is.) (Lesson 8-1 slides, pp.10-13, 15-17)

A

For large sample sizes, the sampling distribution of the sample mean approaches normality regardless of population shape.
Mean = μ, variance = σ^2/n.
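
A small simulation can make this concrete. The sketch below is an illustration, not part of the slides: it assumes a strongly skewed population (exponential with rate 1, so μ = 1 and σ² = 1), draws many samples of size n = 50, and checks that the sample means center on μ with variance close to σ²/n.

```python
import random
from statistics import mean, pvariance

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 50                  # size of each sample
num_samples = 5000      # number of repeated samples

# Population: exponential with rate 1, so mu = 1 and sigma^2 = 1 (skewed).
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)        # expected to be close to mu = 1
var_of_means = pvariance(sample_means)    # expected to be close to 1 / 50 = 0.02
print(round(mean_of_means, 3), round(var_of_means, 4))
```

A histogram of `sample_means` would look roughly bell shaped even though the population itself is far from normal.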

18
Q

Know the difference between point estimate and interval estimate (i.e., confidence interval).

A

Point Estimate: Single value estimate (e.g., x̄).

Interval Estimate: Range of values with a confidence level.

19
Q

What is the advantage of interval estimate compared to point estimate? (Lesson 8-1 slides, pp.19-22)

A

Advantage: Accounts for sampling variability.

20
Q

Know how to construct a confidence interval for population mean at 99%, 95% and 90% confidence level.

A

Interpretation: “We are X% confident that the interval contains the true mean.”
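
A minimal sketch of constructing the interval, assuming the large-sample form x̄ ± z × s/√n with z critical values of about 1.645, 1.96, and 2.576 for 90%, 95%, and 99% confidence. The function name and the sample figures (x̄ = 50, s = 10, n = 100) are hypothetical; with small samples and an unknown population standard deviation, t critical values are normally used instead.

```python
import math
from statistics import NormalDist

def confidence_interval(sample_mean, sample_sd, n, confidence):
    """Return (lower, upper) for the population mean, using a z critical value."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sample_sd / math.sqrt(n)  # margin of error
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 50, standard deviation 10, n = 100.
intervals = {
    level: confidence_interval(50, 10, 100, level)
    for level in (0.90, 0.95, 0.99)
}
for level, (low, high) in intervals.items():
    print(f"{level:.0%}: ({low:.2f}, {high:.2f})")
```

A higher confidence level gives a wider interval: being more certain of capturing the true mean costs precision.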

21
Q

Know how to interpret the confidence interval in plain words. (Lesson 8-1 slides, pp.28-33)

A
22
Q

Know when to use one sample test in a given context. (Lesson 8-1 slides, pp.37-38)

A
23
Q

Know how to formulate hypotheses (null and alternative). (Lesson 8-1 slides, pp.42-43)

A
24
Q

Know how to calculate the t test statistic for One-Sample Mean Difference. (Lesson 8-1 slides, pp.47-49)

A
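
A minimal sketch of the calculation, assuming the standard one-sample statistic t = (x̄ − μ₀) / (s / √n) with n − 1 degrees of freedom, where μ₀ is the hypothesized population mean. The function name, the ratings, and μ₀ = 3.5 are hypothetical.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)  # stdev uses n - 1
    return (mean(sample) - mu0) / standard_error

# Hypothetical satisfaction ratings; test against a hypothesized mean of 3.5.
ratings = [4, 5, 3, 4, 4, 5, 4, 3, 5, 4]
t_stat = one_sample_t(ratings, mu0=3.5)
degrees_of_freedom = len(ratings) - 1
print(round(t_stat, 4), degrees_of_freedom)
```

The t statistic and its degrees of freedom are the two inputs needed to look up a p-value.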
25
Q

Know how to find p-value online and make a statistical decision based on p-value and the selected significance level. (Lesson 8-1 slides, pp.54-57)

A
26
Q

Know how to interpret a statistical decision as a business/managerial decision/conclusion. (Lesson 8-1 slides, pp.58)

A
27
Q

Know how to run one-sample test and read the output table in SPSS. (SPSS Lab Session Quiz 2)

A
28
Q

Know when to use two sample test in a given context. (Lesson 8-2 slides, pp.7)

A
29
Q

Know how to formulate hypotheses (null and alternative). (Lesson 8-2 slides, pp.14)

A
30
Q

Know how to calculate the t test statistic for Two-Sample Mean Difference. (Lesson 8-2 slides, pp.15-16)

A
31
Q

Know how to find p-value online and make a statistical decision based on p-value and the selected significance level. (Lesson 8-2 slides, pp.17-18)

A
32
Q

Know how to interpret a statistical decision as a business/managerial decision/conclusion. (Lesson 8-2 slides, pp.18)

A
33
Q

Know how to run two-sample test and read the output table in SPSS. (SPSS Lab Session Quiz 2)

A
34
Q

Know when to use ANOVA in a given context. (Lesson 8-2 slides, pp.20)

A
35
Q

Know how to formulate hypotheses (null and alternative). (Lesson 8-2 slides, pp.22)

A
36
Q

Know the intuition of the F-test for ANOVA. (Lesson 8-2 slides, pp.23-24)

A
37
Q

Know how to interpret a statistical decision as a business/managerial decision/conclusion. (Lesson 8-2 slides, pp.27)

A
38
Q

Know how to run ANOVA test and read the output table in SPSS (Lesson 8-2 slides, pp.25; SPSS Lab Session Quiz 2)

A
39
Q

Know when to use which approach in bivariate analysis. (Lesson 9-1 slides, pp.8)

A
40
Q

Know the meaning of a correlation coefficient and its basic property. (Lesson 9-1 slides, pp.11)

A
41
Q

Know the difference between linear association and non-linear association. (Lesson 9-1 slides, pp.16)

A
42
Q

Know how to interpret a positive, negative and no correlation.

A
43
Q

Know how to identify whether a correlation is positive or negative from a scatter plot. (Lesson 9-1 slides, pp.13-15)

A
44
Q

Know the hypotheses (null and alternative) of the hypothesis testing underlying correlation analysis. (Lesson 9-1 slides, pp.12)

A
45
Q

Know the concept of simple and multiple linear regression and when they can be useful. (Lesson 9-1 slides, pp.27, 30, 49)

A
46
Q

Know how to write out linear regression model and estimated regression line equation. (Lesson 9-1 slides, pp.32, 38, 50, 55)

A
47
Q

Know how to interpret the intercept and slope of the estimated regression line equation in plain words. (Lesson 9-1 slides, pp.32, 39, 55)

A
48
Q

Know how to run the correlation test and read the output table in SPSS. (Lesson 9-1 slides, pp.19-20; SPSS Lab Session Quiz 3)

A
49
Q

Know how to formulate regression model with categorical independent variables and how to interpret their coefficient estimate. (Lesson 9-1 slides, pp.42-43, 45-47)

A
50
Q

Know how to formulate hypotheses (null and alternative) of the hypothesis testing on the intercept and slope parameters underlying linear regression estimation. (Lesson 9-1 slides, pp.37, 56-58)

A
51
Q

Know what the R square means. (Lesson 9-1 slides, pp.53)

A
52
Q

Know how to run linear regression and interpret the output table in SPSS. (Lesson 9-1 slides, pp.35-38, 53-55; SPSS Lab Session Quiz 3)

A