L7 - Introduction to inferential statistics Flashcards by Phuong Anh Vu

Measure of central tendency (location)

Mean: average value
Median: exact middle value
Mode: most frequently occurring value
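
As a quick illustration of the three measures, here is a minimal sketch using Python's standard `statistics` module. The data set (`cups`, cups of coffee per day) is made up for this example and is not from the lecture.

```python
import statistics

# Hypothetical data: cups of coffee per day reported by nine respondents.
cups = [1, 2, 2, 3, 3, 3, 4, 5, 13]

mean_value = statistics.mean(cups)      # arithmetic average
median_value = statistics.median(cups)  # middle value of the sorted data
mode_value = statistics.mode(cups)      # most frequently occurring value

print(mean_value, median_value, mode_value)
```

The single large value (13) pulls the mean above the median and mode, which is why the three measures can disagree for skewed data.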

Measure of dispersion (variability / spread)

Range and standard deviation

Range

The spread of data - distance between the min and max values of the variable.
Can be used to describe the variability of answers to open-ended questions (respondents define the range by their answers).

Standard deviation

Describes the average distance of the distribution values from the mean.
> Indicates how useful the mean is as a typical value: the smaller the standard deviation, the more representative the mean.
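
The two measures of dispersion can be computed side by side. This is a minimal sketch with made-up `scores`; it shows both the sample standard deviation (divides by n - 1) and the population standard deviation (divides by n), since the card does not say which one is meant.

```python
import statistics

# Hypothetical scores from eight respondents.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(scores) - min(scores)     # distance between max and min
sample_sd = statistics.stdev(scores)       # sample SD, divides by n - 1
population_sd = statistics.pstdev(scores)  # population SD, divides by n

print(data_range, round(sample_sd, 3), population_sd)
```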

Role of Descriptive analysis

+ Provide summary measures of typical or average values
+ Present data in a digestible format
+ Provide preliminary insights about the distribution of values for each variable
+ Help detect errors in the coding process

Population (Malhotra, 2010)

the complete set of individuals or objects of interest

Sample (Malhotra, 2010)

a subset of the population from which information is gathered

Parameter (Malhotra, 2010)

the true value of a variable
a fixed value that refers to the population and is unknown
> It is the same from sample to sample

Sample statistic (Malhotra, 2010)

value of a variable that is estimated from a sample.

- It is hoped to be close to the parameter of the population of which the sample is a subset.

Point estimate (Malhotra, 2010)

a single value that is obtained from sample data and is used as the best guess of the corresponding population parameter
> It differs from sample to sample

Confidence interval

a range within which the true population parameter is expected to fall, at a given level of confidence.
CI = sample statistic ± k × standard error

Standard error parameter (k)

the number of standard errors desired for the estimate (ex: k = 1.96 for a 95% CI)
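
The formula CI = sample statistic ± k × standard error can be sketched for a sample mean as follows. The function name `confidence_interval` and the `ratings` data are hypothetical, and the standard error is assumed to be the sample standard deviation divided by the square root of n.

```python
import math
import statistics

def confidence_interval(sample, k=1.96):
    """Return (lower, upper) for the mean: point estimate +/- k * standard error."""
    point_estimate = statistics.mean(sample)
    # Assumed standard error of the mean: s / sqrt(n).
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    margin = k * standard_error
    return point_estimate - margin, point_estimate + margin

# Hypothetical satisfaction ratings from ten respondents.
ratings = [4.1, 3.8, 4.4, 4.0, 3.9, 4.3, 4.2, 3.7, 4.0, 4.1]
lower, upper = confidence_interval(ratings)
print(round(lower, 3), round(upper, 3))
```

Note that k = 1.96 comes from the normal distribution; with a sample this small, a multiplier taken from the t distribution would be somewhat larger.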

Hypothesis (Hair, 2017)

an unproven supposition that tentatively explains certain facts or phenomena. It is developed prior to data collection.
> Tests are designed to disprove the null hypothesis.

Null hypothesis

If the null hypothesis is not rejected, we do not have to change the status quo. If we cannot reject it, we conclude only that it may be true.

Steps in hypothesis testing (slide)

1) Formulate the hypothesis
2) Decide on test, test statistic
3) Select a significance level
4) Statistical decision (reject or do not reject Ho)
5) Conclusion
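
The five steps above can be sketched with a one-sample t-test. Everything specific here is an assumption for illustration: the hypothesized mean of 3.0, the `sample` data, and the choice of a two-tailed test. The critical value 2.262 is the standard t-table entry for df = 9 at α = 0.05 (two-tailed), hard-coded because the Python standard library has no t distribution.

```python
import math
import statistics

# Step 1: formulate the hypotheses. H0: mu = 3.0, H1: mu != 3.0 (two-tailed).
hypothesized_mean = 3.0
sample = [3.2, 3.5, 2.9, 3.8, 3.6, 3.1, 3.4, 3.7, 3.3, 3.5]  # hypothetical data

# Step 2: choose the test and compute the test statistic (one-sample t-test).
n = len(sample)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(n)
t_statistic = (sample_mean - hypothesized_mean) / standard_error

# Step 3: select a significance level and look up the critical value.
alpha = 0.05
critical_value = 2.262  # t table, df = n - 1 = 9, two-tailed, alpha = 0.05

# Step 4: statistical decision.
reject_h0 = abs(t_statistic) > critical_value

# Step 5: conclusion.
if reject_h0:
    print(f"t = {t_statistic:.3f}: reject H0, the mean differs from {hypothesized_mean}")
else:
    print(f"t = {t_statistic:.3f}: do not reject H0")
```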

Test the hypothesis based on 4 factors:

Type of hypothesis
Number of variables
Scale of measurement
Distribution assumptions

Three types of hypothesis

Specific population characteristics
Contrasts / Comparisons
Associations / Relationships

2 types of Distribution assumptions

Parametric (interval scale, normal bell-shaped distribution) and Nonparametric (nominal and ordinal scale) statistics.

Appropriate statistics by type of scale: measure of location, measure of spread, and statistical technique

Nominal: mode, none, Chi-square
Ordinal: median, percentile, Chi-square
Interval: mean, standard deviation, t-test and ANOVA

Comparing means with Independent vs. Related samples

Means are from independent samples: two different groups are compared (ex: coffee consumption of females vs. males)
Means are from related samples: two measures are taken on the same group (ex: coffee consumption and milk tea consumption of the same females). Since the sample is the same, it is called a paired sample.
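
To see why the distinction matters, here is a minimal sketch that computes a t statistic both ways on the same numbers. The helper names `independent_t` and `paired_t` and the data are hypothetical. The independent version uses the unequal-variance (Welch) form of the standard error, which is my choice for the sketch and not something the lecture specifies.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """t statistic for two independent samples (unequal-variance form)."""
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error

def paired_t(first, second):
    """t statistic for related samples: a one-sample test on the differences."""
    differences = [x - y for x, y in zip(first, second)]
    standard_error = statistics.stdev(differences) / math.sqrt(len(differences))
    return statistics.mean(differences) / standard_error

# Hypothetical cups per week. Read as independent: females vs. males.
# Read as paired: coffee vs. milk tea for the same five females.
first_measure = [3, 4, 2, 5, 4]
second_measure = [2, 3, 2, 3, 2]

t_independent = independent_t(first_measure, second_measure)
t_paired = paired_t(first_measure, second_measure)
print(round(t_independent, 3), round(t_paired, 3))
```

The mean difference is the same in both cases (1.2), but the paired test works on within-person differences, so its standard error and t statistic differ from the independent-samples version.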

Test statistic

serves as the decision maker, since the decision to reject or not reject Ho depends on its magnitude (how close the sample result comes to Ho)
t-test: a univariate hypothesis test using the t distribution, which is used when the standard deviation is unknown and the sample size is small.

Frequency distribution (Malhotra, 2013; Hair, 2017)

a mathematical distribution whose objective is to obtain a count of the number of responses associated with different values of one variable and to express these counts in percentage terms.
descriptive statistics are used to accomplish this task.

Role of frequency distribution (Malhotra, 2013)

Determine the extent of item nonresponse.
Indicate the extent of illegitimate responses.
Detect outlier cases with extreme value.
Indicate the shape of empirical distribution of the variable. By constructing a histogram, we can examine whether the observed distribution is consistent with the assumed distribution.
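
A frequency distribution and the first two roles listed above can be sketched with `collections.Counter`. The `responses` list is invented: it assumes a 1 to 5 rating question where `None` marks a skipped item and 9 is an out-of-range code.

```python
from collections import Counter

# Hypothetical responses to a 1-5 rating question.
responses = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9, None]
valid_codes = range(1, 6)

# Extent of item nonresponse.
answered = [r for r in responses if r is not None]
nonresponse_count = len(responses) - len(answered)

# Counts and percentages for each value of the variable.
counts = Counter(answered)
total = len(answered)
frequency_table = {
    value: (count, 100 * count / total)
    for value, count in sorted(counts.items())
}

# Illegitimate responses: values outside the valid codes.
illegitimate = [value for value in frequency_table if value not in valid_codes]

for value, (count, percent) in frequency_table.items():
    print(f"{value}: {count} ({percent:.1f}%)")
print("nonresponse:", nonresponse_count, "illegitimate:", illegitimate)
```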

One-tailed and two-tailed test differences

It is a one-tailed test when the alternative hypothesis is expressed directionally (< or >).
It is a two-tailed test when the alternative hypothesis is not expressed directionally (≠).

Type I error

occurs when the sample results lead to rejecting the null hypothesis when it is in fact true.
> Significance level (α): the probability of making a Type I error (ex: α = 0.05).

Type II error

occurs when the sample results lead to not rejecting the null hypothesis when it is in fact false. Its probability is denoted β.

Power of a test ( 1 - β )

the probability of rejecting null hypothesis when it is in fact false and should be rejected.
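
The link between α, β and power can be made concrete with a small calculation. This sketch swaps in a one-tailed z test with a known population standard deviation, because that case has a closed-form answer using only the standard library; it is not the t-test from the lecture. The function name `z_test_power` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(true_shift, sigma, n, alpha=0.05):
    """Power of an upper one-tailed z test of H0: mu = mu0 vs H1: mu > mu0.

    true_shift is the real difference (mu - mu0); sigma is assumed known.
    """
    standard_normal = NormalDist()
    critical_value = standard_normal.inv_cdf(1 - alpha)
    shift_in_standard_errors = true_shift / (sigma / math.sqrt(n))
    # Probability that the test statistic exceeds the critical value.
    return 1 - standard_normal.cdf(critical_value - shift_in_standard_errors)

power = z_test_power(true_shift=0.5, sigma=1.0, n=25)
beta = 1 - power  # probability of a Type II error
print(round(power, 3), round(beta, 3))

# With no real difference, the rejection probability equals alpha (Type I error).
print(round(z_test_power(true_shift=0.0, sigma=1.0, n=25), 3))
```

Under these assumptions, a larger sample or a larger true difference raises the power and lowers β.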

p value

the probability of observing a value of the test statistic as extreme as, or more extreme than, the value actually observed, assuming that the null hypothesis is true. (It is compared with α.)

Reject Ho when:

- |t statistic| > |critical value|
- or the p value of the t statistic < significance level (α)
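
The two rejection rules are equivalent, which a short sketch can show. Because the Python standard library has no t distribution, this example uses the standard normal distribution as a large-sample stand-in for the t distribution; that substitution, the function name `two_tailed_decision`, and the example statistics are all assumptions.

```python
from statistics import NormalDist

def two_tailed_decision(test_statistic, alpha=0.05):
    """Apply both rejection rules to a two-tailed test (normal approximation)."""
    standard_normal = NormalDist()
    critical_value = standard_normal.inv_cdf(1 - alpha / 2)  # about 1.96
    p_value = 2 * (1 - standard_normal.cdf(abs(test_statistic)))
    reject_by_critical_value = abs(test_statistic) > critical_value
    reject_by_p_value = p_value < alpha
    return p_value, reject_by_critical_value, reject_by_p_value

for statistic in (2.3, 1.5):
    p_value, by_critical, by_p = two_tailed_decision(statistic)
    print(statistic, round(p_value, 4), by_critical, by_p)
```

Both rules give the same decision for each statistic, since the p value falls below α exactly when the statistic lies beyond the critical value.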

Coefficient of variation (CV)

- The ratio of the standard deviation to the mean, expressed as a percentage.
- It shows the variability in relation to the mean of the population.
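
A minimal sketch of the CV calculation follows. The function name `coefficient_of_variation` and the data are hypothetical, and the sample standard deviation is used as an estimate since population values are rarely known.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean."""
    return 100 * statistics.stdev(values) / statistics.mean(values)

# Hypothetical spending figures, in dollars and then in cents.
spending_dollars = [2, 4, 4, 4, 5, 5, 7, 9]
spending_cents = [100 * value for value in spending_dollars]

cv_dollars = coefficient_of_variation(spending_dollars)
cv_cents = coefficient_of_variation(spending_cents)
print(round(cv_dollars, 2), round(cv_cents, 2))
```

The CV is the same in both units, which is what makes it useful for comparing variability across variables measured on different scales.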

Statistics associated with frequency distribution

Measures of location, Measures of dispersion, Measures of shape

Measures of shape

The shape is assessed by examining skewness and kurtosis.

Skewness (Malhotra, 2013)

- Assesses the distribution’s symmetry about the mean. In a symmetric distribution, mode = mean = median.
- Skewness is the tendency of the deviations from the mean to be larger in one direction than in the other.

Kurtosis (Malhotra, 2013)

- A measure of the relative peakedness or flatness of the curve defined by the frequency distribution.
- For a normal distribution, kurtosis = 0. More peaked: > 0. Flatter: < 0.
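
Skewness and kurtosis can be sketched from central moments. The moment-based formulas below (third moment over the 1.5 power of the second, and fourth moment over the squared second minus 3) are one common convention; statistical packages often apply small-sample corrections, so their output may differ slightly. Function names and data are hypothetical.

```python
def central_moment(values, order):
    """Average of (value - mean) ** order over all values."""
    mean = sum(values) / len(values)
    return sum((value - mean) ** order for value in values) / len(values)

def skewness(values):
    """0 for symmetric data, > 0 for a long right tail, < 0 for a long left tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """0 for a normal distribution, > 0 more peaked, < 0 flatter."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]

print(skewness(symmetric), round(skewness(right_skewed), 3))
print(round(excess_kurtosis(symmetric), 3))
```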

Calculate t-test

t = (sample statistic - hypothesized parameter value) / standard error of the statistic

When to use F-test (Malhotra, 2010)

In two independent samples test: Using F test as the statistical test of the equality of the variances of two populations.

t-distribution

- It is similar to the normal distribution in appearance, but it has more area in the tails and less in the center.
- As the number of degrees of freedom (df) increases, the two distributions become more similar.

(37 cards)