Lesson 3 - Quantitative Analysis (Statistics) Flashcards

1
Q

Population

A

the entire set of some entity under study. All planners preparing for the 2011 AICP exam would be a population.

2
Q

Sample

A

a subset of the population. For example, 25 candidates out of the total number of planners preparing for the 2011 AICP exam.

3
Q

Descriptive Statistics

A

summarize and describe the characteristics of a dataset, such as a population or a sample drawn from it.

4
Q

Inferential Statistics

A

determine characteristics of a population based on observations made on a sample from that population. We infer things about the population based on what is observed in the sample.

5
Q

Central tendency

A

the typical or representative value of a dataset. There are several ways to report central tendency, including mean, median, and mode.

The appropriate measure of central tendency depends on the data type and the situation.

6
Q

Mean

A

the average of a distribution. The mean of [2, 3, 4, 5] is 3.5.

7
Q

Weighted mean

A

a mean in which each value is multiplied by a weight before averaging. It is used when greater importance is placed on specific entries, or when a frequency distribution assigns a representative value to each class.
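As a minimal sketch of the weighted mean, the snippet below multiplies each value by its weight and divides by the total weight. The scores and weights are hypothetical and only illustrate the arithmetic.

```python
# Hypothetical values and weights, chosen only to illustrate the calculation.
scores = [80, 90, 70]
weights = [0.5, 0.3, 0.2]


def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


print(round(weighted_mean(scores, weights), 2))
```

With equal weights, the result matches the ordinary mean.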

8
Q

Median

A

the middle number of a ranked distribution. The median of [2, 3, 4, 6, 7] is 4.

9
Q

Mode

A

the most frequent number in a distribution. The modes of [1, 2, 3, 3, 5, 6, 7, 7] are 3 and 7. There can be more than one mode for a data set.
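The examples on the Mean, Median, and Mode cards can be checked with Python's standard `statistics` module. This is only a quick sketch for verifying the arithmetic; the variable names are illustrative.

```python
import statistics

mean_value = statistics.mean([2, 3, 4, 5])
median_value = statistics.median([2, 3, 4, 6, 7])
# multimode returns every value tied for most frequent, in first-seen order.
modes = statistics.multimode([1, 2, 3, 3, 5, 6, 7, 7])

print(mean_value)    # 3.5
print(median_value)  # 4
print(modes)         # [3, 7]
```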

10
Q

Nominal data

A

is classified into mutually exclusive groups that lack intrinsic order. Race, social security number, and sex are examples of nominal data. Mode is the only measure of central tendency that can be used for nominal data.

11
Q

Ordinal data

A

has values that are ranked so that inferences can be made regarding the magnitude. However, ordinal data has no fixed interval between values. Educational attainment or a letter grade on a test are examples of ordinal data. Mode and median are the only measures of central tendency that can be used for ordinal data.

12
Q

Interval data

A

has an ordered relationship with equal intervals between values but no true zero point, so ratios are not meaningful. For temperature, 60 degrees is not twice as warm as 30 degrees. Mean is usually the best measure of central tendency for interval data; where the data is skewed, the median can be used.

13
Q

Ratio data

A

has an ordered relationship, equal intervals, and a true zero point, so ratios between values are meaningful. Distance is an example of ratio data because 3.2 miles is twice as long as 1.6 miles. Any measure of central tendency can be used for this type of data.

14
Q

Qualitative Variables

A

can be nominal or ordinal.

15
Q

Quantitative Variables

A

can be interval or ratio.

16
Q

Continuous Variables

A

can have an infinite number of values, such as 1.1111.

17
Q

Dichotomous Variables

A

can only have two possible values, such as unemployed or employed, which are coded as 0 and 1.

18
Q

Hypothesis Test

A

allows for a determination of possible outcomes and the interrelationship between variables.

19
Q

Null Hypothesis

A

shown as H0, is a statement that there is no difference or effect. For example, a null hypothesis could be that traffic calming has no impact on traffic speed.

20
Q

Alternate Hypothesis

A

designated as H1, proposes that a relationship exists: traffic calming reduces traffic speed.

21
Q

Normal distribution (data)

A

is one that is symmetrical around the mean. This is a bell curve.

22
Q

Distribution skewed to the right

A

has a few high numbers (outliers) that pull the mean to the right. For example, if there are three $20 million homes in your community, it is likely to skew the mean home value to the right.

23
Q

Distribution skewed to the left

A

has a few low numbers (outliers) that pull the mean to the left. When taking the AICP exam, for instance, a few people may give up and walk out resulting in a few very low scores, which would skew the mean score to the left.

24
Q

Range (dispersion)

A

the simplest measure of dispersion. The range is the difference between the highest and lowest scores in a distribution. If the ages of respondents in a neighborhood survey run from 18 to 62 years old, the range is 62 - 18 = 44.

25
Q

Variance (dispersion)

A

the average squared difference of scores from the mean score of a distribution. Variance describes how far the numbers in a distribution lie from the mean.

26
Q

Standard Deviation (dispersion)

A

is the square root of the variance. For instance, to describe the spread in wages among three employees at a planning department, we calculate the mean, variance, and standard deviation. If the employees earn $10, $20, and $35 per hour, the mean is $21.67. Employee 1 makes $21.67 - $10 = $11.67 less than the mean; employee 2 makes $21.67 - $20 = $1.67 less than the mean; and employee 3 makes $35 - $21.67 = $13.33 more than the mean.

To compute the variance, we first square each difference and sum the results: (11.67)^2 + (1.67)^2 + (13.33)^2 = 136.19 + 2.79 + 177.69 = 316.67. We then divide 316.67 by the number of observations minus 1, which gives 316.67/(3-1) = 158.33 (in squared dollars).

The standard deviation is simply the square root of the variance. In this case, the square root of 158.33 is $12.58.
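The wage example can be reproduced with Python's standard `statistics` module, as a rough check of the arithmetic. `statistics.variance` and `statistics.stdev` both use the sample formula, dividing by the number of observations minus 1, which matches the calculation in this card.

```python
import statistics

wages = [10, 20, 35]  # hourly wages from the card, in dollars

mean_wage = statistics.mean(wages)
variance = statistics.variance(wages)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(wages)      # square root of the sample variance

print(round(mean_wage, 2), round(variance, 2), round(std_dev, 2))
```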

27
Q

Coefficient of Variation (dispersion)

A

measures the relative dispersion from the mean and is measured by taking the standard deviation and dividing by the mean.
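As an illustration, the coefficient of variation for three hourly wages of $10, $20, and $35 can be sketched as below. The helper name `coefficient_of_variation` is made up for this example, and the sample standard deviation is assumed.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return statistics.stdev(values) / statistics.mean(values)


cv = coefficient_of_variation([10, 20, 35])
print(f"{cv:.2f}")
```

Because the result has no units, it can be used to compare the spread of datasets measured on different scales.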

28
Q

Standard Error (dispersion)

A

is the standard deviation of a sampling distribution. Standard errors indicate the degree of sampling fluctuation. The larger the sample size the smaller the standard error.
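For the common case of the standard error of a sample mean, the relationship with sample size can be sketched as the standard deviation divided by the square root of n. The standard deviation of 12 and the sample sizes below are hypothetical.

```python
import math


def standard_error(std_dev, n):
    """Standard error of the mean: standard deviation / sqrt(sample size)."""
    return std_dev / math.sqrt(n)


# Quadrupling the sample size halves the standard error.
for n in (36, 144, 576):
    print(n, standard_error(12, n))
```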

29
Q

Confidence Interval (dispersion)

A

gives an estimated range of values which is likely to include an unknown population parameter. The width of the confidence interval gives us an idea of how uncertain we are about the unknown parameter. A wide interval may indicate that we need more data before we can make a definitive statement. You frequently see confidence intervals reported in polls. For example, 42% of California residents support one presidential candidate, 36% support another candidate, and 22% are undecided, +/- 3%. The 3% is the margin of error, so the confidence interval for the first candidate runs from 39% to 45%.
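A plus-or-minus figure of about 3% for a poll result can be reproduced roughly with the normal approximation for a proportion. This is a sketch under assumptions: the sample size of 1,000 is hypothetical (the poll example does not state one), and z = 1.96 corresponds to roughly 95% confidence.

```python
import math


def proportion_interval(p, n, z=1.96):
    """Return (low, high) for a sample proportion p from n respondents.

    Uses the normal approximation; z = 1.96 is roughly 95% confidence.
    """
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# n = 1000 is an assumed sample size, not a figure from the card.
low, high = proportion_interval(0.42, 1000)
print(f"{low:.3f} to {high:.3f}")
```

Increasing `n` narrows the interval, which is why a wide interval suggests more data is needed.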

30
Q

Chi Square (testing)

A

a non-parametric test statistic that provides a measure of the amount of difference between two frequency distributions. Chi Square is commonly used for probability distributions in inferential statistics. The Chi Square distribution is used to test the goodness of fit of an observed distribution to a theoretical one.
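A goodness-of-fit statistic can be sketched by summing (observed - expected)^2 / expected over the categories. The counts below are hypothetical, and this sketch stops at the statistic; turning it into a probability requires the Chi Square distribution with the appropriate degrees of freedom.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts of commuters by mode: drive, transit, walk.
observed = [30, 50, 20]
expected = [25, 50, 25]

chi2 = chi_square_statistic(observed, expected)
print(chi2)  # 2.0
```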

31
Q

z-score (testing)

A

a measure of the distance, in standard deviation units, from the mean. This allows one to determine the likelihood, or probability, that a given value would occur.
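A z-score is the value minus the mean, divided by the standard deviation. The sketch below uses hypothetical numbers (a score of 85 against a mean of 70 and a standard deviation of 10) and assumes the scores are normally distributed when converting the z-score to a probability.

```python
from statistics import NormalDist


def z_score(x, mean, std_dev):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / std_dev


z = z_score(85, 70, 10)
# Share of a standard normal distribution that falls below this z-score.
p_below = NormalDist().cdf(z)

print(z, round(p_below, 4))
```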

32
Q

t-test (testing)

A

allows a comparison of the means of two groups to determine how likely it is that the difference between them occurred by chance. In order to conduct a t-test, one needs to know the number of subjects in each group, the difference between the means of each group, and the standard deviation for each group.
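The three inputs named above are enough to compute a t statistic. The sketch below uses the unequal-variances (Welch) form, which is an assumption here since the card does not say which variant is meant; the traffic speeds are hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent groups with unequal variances."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Hypothetical mean speeds (mph) on streets without and with traffic calming.
t = welch_t(mean1=30, sd1=4, n1=16, mean2=25, sd2=3, n2=9)
print(round(t, 3))
```

The larger the absolute value of t, the less likely it is that the difference between the means arose by chance.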

33
Q

ANOVA (testing)

A

an analysis of variance. It studies the relationship between two variables: the first must be nominal and the second interval.

34
Q

Correlation (testing)

A

tests the strength of the relationship between variables. The Correlation Coefficient indicates the type and strength of the relationship between variables, ranging from -1 to 1. The closer to -1 or 1, the stronger the relationship between the variables; values near 0 indicate a weak relationship. For example, you would expect a strong positive correlation between score on the AICP exam and hours of study. Squaring the correlation coefficient results in r^2, the coefficient of determination.
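A Pearson correlation coefficient can be sketched directly from its definition. The hours and scores below are invented for illustration and are not real exam data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical hours of study and exam scores.
hours = [10, 20, 30, 40, 50]
scores = [52, 60, 71, 78, 90]

r = pearson_r(hours, scores)
print(round(r, 3), round(r ** 2, 3))
```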

35
Q

Regression (testing)

A

a test of the effect of independent variables on a dependent variable. A regression analysis explores the relationship between variables. For example, AICP Exam Score depends on number of hours studied, years of experience, and educational attainment. The result could show that for every 50 hours studied the score increases by 10%.
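As a simplified sketch, the code below fits a least-squares line with a single independent variable (hours studied). The example in this card uses several independent variables, which would require multiple regression; the data here is invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical hours of study and exam scores.
hours = [10, 20, 30, 40, 50]
scores = [52, 60, 71, 78, 90]

slope, intercept = fit_line(hours, scores)
print(round(slope, 2), round(intercept, 2))
```

The slope is the estimated change in score for each additional hour studied, and the intercept is the predicted score at zero hours.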

36
Q

Sampling Error (testing)

A

occurs when a sample taken from a larger population is not fully representative of that population as a whole, so estimates based on the sample differ from the true population values.

37
Q

Nonsampling error

A

is one that cannot be explained by the representativeness of the sample. A nonsampling error can occur as a result of respondents misunderstanding a question or misreporting their answer, and can also include missing values.