Research & Assessment Methods Flashcards

1
Q

What are the 3 steps of the statistical process?

A
  1. Collect data (sampling, surveys)
  2. Describe & summarize data, look for patterns (descriptive statistics, exploratory data analysis)
  3. Interpret the data (inferential stats, statistical modeling)
2
Q

Statistical Inference

A

testing theories/hypotheses about the data, using probability theory to draw a conclusion

3
Q

4 Sampling Methods:

A
  1. Sampling Frame
  2. Probability Sampling
  3. Non-Probability Sampling
  4. Implementation
4
Q

Sampling Frame

A

the population of interest from which you actually draw the sample (in practice, a list of its members)

(eg. sampling within the frame of customers at a specific bookstore)

5
Q

Probability Sampling

A

most sophisticated, rigorous, and defensible method of sampling

take a subset of a population in an organized manner:
-simple random
-systematic (eg. every 20th phone # in the phone book)
-stratified (eg. by age, education)
-cluster (eg. households that all live in the same neighborhood)

6
Q

Non-Probability Sampling

A

-Convenience Sampling (eg. snowball - you interview people & let them refer you to someone else)
-Volunteered (Volunteered Geographic Information -VGI)

7
Q

Implementation (Sampling Method)

A

Mail, telephone, web, in-person

8
Q

What are the 4 main scales of data measurement?

A
  1. Nominal Scale
  2. Ordinal Scale
  3. Interval Scale
  4. Ratio Scale (the gold standard)
9
Q

What is Nominal Scale data measurement?

A

Categories with no inherent order; the labels are just names, so which label you use doesn’t matter

10
Q

What is Ordinal Scale data measurement?

A

Ordered categories, ranking only

11
Q

What is Interval Scale data measurement?

A

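Continuous, but only absolute differences are meaningful; there is no true zero, so ratios are not (eg. temperature in °F: 80° is not “twice as hot” as 40°)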

12
Q

What is Ratio Scale data measurement?

A

The gold standard: there is a true zero, so both absolute and relative differences (ratios) are meaningful (eg. income, distance)

13
Q

What are the 4 Types of Variables?

A
  1. Qualitative Variables
  2. Quantitative Variables
  3. Discrete Variables
  4. Continuous Variables
14
Q

What are qualitative variables?

A

categories: nominal data (no order) or ordinal data (with a ranking)

Qualitative research can not provide a generalized understanding (such as a population trend), but it can provide a deeper understanding of a given topic.

15
Q

What are quantitative variables?

A

interval or ratio scale

16
Q

What are discrete variables?

A

only a finite (or countable) number of distinct values (eg. count of events, # of accidents on a certain street)

-special case is binary or dichotomous (where you have only two values eg. 0 or 1)

17
Q

What are continuous variables?

A

can take any value within a range, so there is an infinite # of possible values (positive and negative)

18
Q

In a survey design, what is sampling frame?

A

A sampling frame is the population of interest from which the sample is actually drawn.

19
Q

What are the differences between nominal, ordinal, and interval data?

A

Nominal data are categories (eg. ice cream type - think “name” = nominal)

Ordinal data are ranked (think “order” = ordinal)

Interval data are continuous (but only the differences between values have meaning)

20
Q

What are discrete variables?

A

Discrete variables can only take a finite (or countable) # of distinct values.

A special case is a dichotomous variable, which can only take two values (often 0 and 1)

21
Q

What is a distribution?

A

A statistical distribution describes how the values of a variable (field) are distributed. In other words, it shows which values are common and which are uncommon (which values are likely to be observed).

You can represent a distribution graphically or mathematically.

22
Q

What is a histogram?

A

A graph that divides the values into bins (ranges of values) and shows how many observations fall within each range.
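
As a rough illustration of binning, the sketch below counts how many observations fall in each range and prints a text histogram. The data values and the bin width of 5 are made up for the example.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Count observations per bin; each bin is keyed by its lower edge."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical observations (not from the flashcards)
values = [1, 2, 2, 3, 5, 6, 6, 7, 9, 10]
bins = histogram_counts(values, bin_width=5)

# One row per bin: the bar length is the number of observations in that range
for lower, count in bins.items():
    print(f"{lower}-{lower + 5}: {'#' * count}")
```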

23
Q

What is a density curve?

A

A density curve is a graphical representation of a numerical distribution where the outcomes are continuous.

Ideally, you try to replace discrete approximation (histogram) with continuous representation. Density curves can be added to a plot of discrete graphed information.

24
Q

What is a box plot/box and whisker graph?

A

Based on the ranking of observations.

While a histogram is based on bins of values, the box plot is based on a ranking from low to high.

Quartiles: 25th percentile, median/mid-point/50th percentile, 75th percentile

25
Q

What is the interquartile range?

A

a measure of spread shown in the box and whisker plot. It is the range from the 25th percentile to the 75th percentile (the length of the box, edge to edge, not whisker to whisker)

26
Q

The box plot helps us to identify _________.

A

Outliers

27
Q

Outliers are…

A

An outlier is an observation that is “extreme” / outside the reasonable range of the distribution.

Two common rules of thumb: an observation that is more than 2 standard deviations from the mean, or an observation that is outside the fences in a box plot (typically 1.5 × the interquartile range beyond the quartiles).
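
A minimal sketch of the box plot rule, assuming the usual convention that the fences sit 1.5 × IQR beyond the quartiles (the cards do not state the multiplier). The data are hypothetical, and `method="inclusive"` is just one of several accepted ways to compute quartiles.

```python
from statistics import quantiles

def box_plot_summary(values):
    """Return quartiles, IQR, fences, and the observations outside the fences."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr  # 1.5 x IQR is an assumed, conventional choice
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr,
            "fences": (lower_fence, upper_fence), "outliers": outliers}

# Hypothetical observations with one extreme value
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 40]
summary = box_plot_summary(data)
print(summary)
```

Here only the value 40 falls outside the fences, so it is the one flagged as an outlier.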

28
Q

What does a histogram show?

A

A histogram shows the distribution of a variable visualized as a bar chart.

29
Q

What are the 2 hypotheses in a Hypothesis Test?

A

(inferential statistics)

  1. Null Hypothesis
  2. Alternative Hypothesis
30
Q

What is a null hypothesis?

A

This is a reference statement that we typically want to reject.

Typically it’s a value, often 0.

31
Q

What is the alternative hypothesis?

A

the main purpose is to help to provide evidence for rejecting the null hypothesis. You don’t (NEVER) “accept” the alternative hypothesis, you use it to reject the null.

It is the research hypothesis, a statement one wants to find support for.

32
Q

How do you reject the null hypothesis?

A

Find evidence in the data (a statistic like the average)

If we observe that the value of the statistic (eg. mean) is very far from the null value, then we reject the null.

33
Q

What is a Type I error?

A

Rejecting the null hypothesis when it is actually true (a false positive). The chance/probability of making this wrong decision is the significance level of the test.

34
Q

What are the 3 types of Test Statistics?

A
  1. Z-Score
  2. T-Test
  3. Chi-square test
35
Q

What is z-score?

A

(x-mean)/standard deviation

subtract the mean from the observed value and divide by the standard deviation (the result is the standardized value).

then compare the z-score to the standard normal distribution
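
The formula can be sketched in a few lines. The scores are made up, and the population standard deviation (`pstdev`) is used here as an assumption; a sample standard deviation would give slightly different values.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return (x - mean) / standard deviation for each observation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed)
    return [(x - mu) / sigma for x in values]

# Hypothetical scores: mean is 5 and standard deviation is 2
scores = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(scores)
print(z)
```

A score of 9 is two standard deviations above the mean, so its z-score is 2.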

36
Q

What does a t-test test?

A

A t-test is a test on the difference between the means between two subgroups. The null hypothesis would be that the mean scores for the two groups are the same, the alternative hypothesis is that they are different.

Eg. comparing housing prices in an area with programs vs. in an area with no programs - is there a significant difference?

Eg. APA is interested in finding out whether the scores of candidates who take the AICP test right after school under the AICP Candidate Pilot Program are different from the scores of the candidates who had the required years of planning experience.

Answers the question: are the 2 groups part of the same population?
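
A minimal sketch of the test statistic for two groups, using Welch's version of the t statistic (which does not assume equal variances; the cards do not say which variant is meant). Both groups of scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Difference in means divided by its estimated standard error."""
    standard_error = sqrt(variance(group_a) / len(group_a)
                          + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error

# Hypothetical test scores for two groups of candidates
pilot_program = [10, 12, 14, 16, 18]
experienced = [8, 9, 10, 11, 12]
t = welch_t(pilot_program, experienced)
print(round(t, 3))
```

The statistic is then compared with the t distribution: the further it is from 0, the stronger the evidence against the null hypothesis that the two means are the same.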

36
Q

What is a chi-square test?

A

a test on goodness of fit

difference between observed and expected values

often used to see if the values/elements of a table are related or unrelated (if the observed values are too far away from the expected values, reject the null)
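
As a sketch of "observed vs. expected", the statistic sums the squared differences scaled by the expected values. The counts are invented, and the critical value 7.815 is the standard table value for 3 degrees of freedom at the 0.05 level.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical survey: 100 responses across 4 categories,
# with the null hypothesis that all categories are equally likely
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]

statistic = chi_square(observed, expected)
CRITICAL_0_05_DF3 = 7.815  # chi-square table value, df = 3, alpha = 0.05
reject_null = statistic > CRITICAL_0_05_DF3
print(statistic, reject_null)
```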

37
Q

When is the “alternative hypothesis” accepted?

A

NEVER.

The null hypothesis is rejected.

38
Q

What is bivariate analysis (two variables)?

A

Analysis of the relationship between two variables, for example with a scatter plot, a correlation coefficient, or a regression.

39
Q

What is a scatter plot?

A

It shows the relationship between 2 variables - one on the x-axis and one on the y-axis.

You then get an idea of the linear association between 2 variables.

40
Q

What is correlation?

A

NOT causation.

correlation is a linear association between 2 variables.

For continuous variables, use Pearson Correlation Coefficient.

For rank variables, use a rank correlation coefficient (Spearman’s or Kendall’s).
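
A small sketch contrasting the two ideas, with made-up data. Spearman's coefficient is computed here as Pearson's coefficient applied to ranks, and the simple `ranks` helper assumes there are no tied values.

```python
from math import sqrt
from statistics import mean

def pearson(x, y):
    """Linear correlation between two continuous variables."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ranks(values):
    """Rank from 1 (smallest); assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(x, y):
    """Rank correlation: Pearson's coefficient computed on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y always rises with x, but not in a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r = pearson(x, y)
rho = spearman(x, y)
print(round(r, 4), rho)
```

Because the relationship is increasing but curved, the rank correlation is exactly 1 while the linear correlation is slightly below 1.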

41
Q

What is regression analysis?

A

an analysis of the relationship between a dependent variable and one or more explanatory variables, to determine whether those factors impact the dependent variable.

  1. where we have a model with a dependent variable on the left side and explanatory variable on the right side,
  2. then we put them in a formula (linear relationship formula)
  3. estimate coefficients
  4. measure how well the model fits the data
  5. use it to predict values and analyze residuals

Eg. A regression analysis is used to help to understand the degree to which a variety of programs are working: one-time grants to homeowners to catch up on mortgage payments, placing a temporary three year freeze on property taxes, and homeowner counseling - whether they are reducing the foreclosure rate.

42
Q

What are the two types of correlation coefficients?

A
  1. Pearson Correlation (for continuous variables)
  2. Rank Correlation: Spearman or Kendall (for ranks)
43
Q

What is being estimated in regression analysis?

A

Regression is an analysis of the relationship between a dependent variable and one or more explanatory variables, to determine whether those factors impact the dependent variable. What is estimated is:

The coefficients in a linear equation that relates the dependent variable to a number of explanatory variables.

The coefficients are the intercept and the slopes.
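
A minimal sketch of estimating the intercept and slope for a single explanatory variable by ordinary least squares, plus R-squared as a measure of fit. The data are hypothetical, and a real analysis with several explanatory variables would normally use a statistics package.

```python
from statistics import mean

def fit_line(x, y):
    """Estimate intercept and slope by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def r_squared(x, y, intercept, slope):
    """Share of the variation in y explained by the fitted line."""
    y_bar = mean(y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical data: x is the explanatory variable, y the dependent variable
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope = fit_line(x, y)
fit = r_squared(x, y, intercept, slope)
print(intercept, slope, fit)
```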

44
Q

What are the 4 categories of research/analysis/ assessment methods?

A
  1. Population analysis
  2. Economic analysis
  3. Statistical analysis
  4. Spatial analysis
45
Q

What does the APA mean by Spatial Analysis?

A

in particular, GIS mapping and interpretation

46
Q

What is GIS?

A

Geographic Information System

a spatial system/database that creates, manages, analyzes, and maps all types of data

integrates location data with all types of descriptive information (attributes)

47
Q

Where does GIS data come from?

A

GPS, remotely sensed satellite imagery, LiDAR (light detection and ranging)

48
Q

What are the 2 dominant spatial data types in GIS?

A
  1. Raster - grids, pixels
  2. Vector - points, lines, polygons
49
Q

What is the Raster spatial data type?

A

raster = grids, pixels

cover a continuous surface
(remotely sensed data)

50
Q

What is the Vector spatial data type?

A

vector = geometric shapes

objects, geographic shapes (simple features)

-points (city location), lines (road), polygons (boundary)
-networks (interlinking lines as roads)
-DEM (conveys elevation; note that DEMs are usually stored as raster grids)
-Triangular Irregular Networks (TIN), conveys elevation

51
Q

What are Digital Elevation Models (DEM)?

A

a type of 2.5D (dimension) data that is not quite 3D but conveys elevations; usually stored as a raster grid of elevation values (the TIN is the vector alternative)

52
Q

What are the two important aspects of geographic information?

A
  1. Location information (where)
  2. Attribution information (values)
53
Q

What is geodesy?

A

a way of measuring and representing the Earth and its gravitational field

54
Q

What is datum?

A

a mathematical approximation of the shape of Earth’s surface as an ellipsoid/geoid (not a sphere) to compute location

there are coordinates in a reference system with latitude (y) and longitude (x) in angular degrees, minutes, and seconds

model for setting coordinates
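
Coordinates given in degrees, minutes, and seconds are often converted to decimal degrees before use in a GIS. The sketch below shows the arithmetic; the function name and the example coordinates are illustrative only.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    South latitudes and west longitudes become negative.
    """
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value

# Hypothetical location: 41°52'55" N, 87°37'40" W
lat = dms_to_decimal(41, 52, 55, "N")   # latitude (y)
lon = dms_to_decimal(87, 37, 40, "W")   # longitude (x)
print(round(lat, 6), round(lon, 6))
```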

55
Q

What is WGS84 and NAD83?

A

WGS84 = World Geodetic System 1984

NAD83 = North American Datum of 1983

They are datums (reference systems used for setting coordinates).

56
Q

What is a projection?

A

converting the 3D Earth’s surface (ellipsoid) to a 2D flat map (plane)

changing angular degree measurements to x,y Euclidean measurements

conic, cylindrical

projections distort at least one of the following:
-shape
-distance
-direction
-land area

57
Q

What are the 4 types of projections?

A
  1. Conformal: shape (keep the shape the same)
  2. Equidistant (keep the distance right)
  3. True direction
  4. Equal area

You use different projections for different purposes.

58
Q

What is the mercator projection used for?

A

Navigation

Angles and directions are true (local shapes are preserved); sizes/areas are distorted, increasingly so toward the poles.
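
To see where the size distortion comes from, here is a sketch of the spherical Mercator formulas. Treating the Earth as a sphere with a mean radius is a simplification made for this example; real GIS software works from an ellipsoid and a datum.

```python
from math import cos, log, pi, radians, tan

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical simplification

def mercator_xy(lat_deg, lon_deg, radius=EARTH_RADIUS_M):
    """Project latitude/longitude (degrees) to Mercator x, y (meters)."""
    lat, lon = radians(lat_deg), radians(lon_deg)
    x = radius * lon
    y = radius * log(tan(pi / 4 + lat / 2))
    return x, y

def mercator_scale_factor(lat_deg):
    """Linear stretch at a given latitude (1 at the equator)."""
    return 1 / cos(radians(lat_deg))

for latitude in (0, 30, 60):
    k = mercator_scale_factor(latitude)
    # Lengths are stretched by k, so areas are exaggerated by about k squared
    print(latitude, round(k, 3), round(k ** 2, 3))
```

At 60° latitude lengths are stretched by a factor of about 2, so areas appear roughly 4 times too large, which is why high-latitude regions look oversized on a Mercator map.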

59
Q

What does the Robinson projection not violate/distort?

A

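Nothing is preserved exactly. The Robinson projection is a compromise projection: it slightly distorts shape, area, distance, and direction so that no single property is badly distorted.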

61
Q

What are the 4 characteristics of geographic information that are affected by projections, and what are the corresponding projections?

A

-shape (conformal)
-distance (equidistant)
-direction (true direction)
-area (equal area)

62
Q

What is the purpose of map classification?

A

To group the observations into categories that correspond with a given color or shade on the map.

63
Q

What are the map classifications of a choropleth map?

A

-equal interval
-quantile
-natural breaks
-unique value maps

64
Q

What is the difference between a map showing equal interval vs. a quantile distribution?

A

An equal interval map has the same value difference between categories, but the number of observations in each category can differ. (eg. histogram)

A quantile map has an equal number of observations in each category, but the value difference is not constant.
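
The difference shows up clearly with skewed data. This sketch classifies the same hypothetical values both ways with 5 classes; the function names and the data are illustrative only.

```python
def equal_interval_counts(values, k):
    """Split the value range into k equal-width classes and count members."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    counts = [0] * k
    for v in values:
        index = min(int((v - lo) / width), k - 1)  # top value goes in last class
        counts[index] += 1
    return counts

def quantile_classes(values, k):
    """Split the ranked observations into k classes of (nearly) equal size."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]

# Hypothetical values for 10 census tracts, with two very high values
incomes = [10, 12, 15, 18, 20, 22, 25, 30, 90, 100]

interval_counts = equal_interval_counts(incomes, 5)
quantile_groups = quantile_classes(incomes, 5)
print(interval_counts)
print(quantile_groups)
```

With equal intervals most tracts land in the lowest class and some classes are empty; with quantiles every class holds two tracts, but the top class spans a much wider range of values.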

65
Q

Where does the idea of an overlay come from in spatial analysis?

A

Ian McHarg and his book Design with Nature - overlay different suitability maps and then identify the most suitable location.

66
Q

Which planner is recognized as the pioneer of overlay analysis?

A

Ian McHarg, in Design with Nature

67
Q

What is the buffer spatial analysis?

A

put a circular area around a point (eg. adding a 1,000 ft. buffer around points/locations of liquor stores). You could then add a zoning ordinance to prevent location of liquor stores within 1,000 ft. of libraries or schools.
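
A point buffer check reduces to a distance comparison. This sketch assumes the coordinates are already in a projected system measured in feet, so straight-line (Euclidean) distance is appropriate; the site names and locations are made up.

```python
from math import hypot

def within_buffer(point, center, radius):
    """True if point lies inside the circular buffer around center."""
    return hypot(point[0] - center[0], point[1] - center[1]) <= radius

# Hypothetical projected coordinates, in feet
school = (0, 0)
proposed_stores = {"Site A": (600, 700), "Site B": (900, 900)}

BUFFER_FT = 1000
for name, location in proposed_stores.items():
    status = "inside buffer" if within_buffer(location, school, BUFFER_FT) else "outside buffer"
    print(name, status)
```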

68
Q

What is location intelligence as spatial analysis?

A

the finding of optimal locations for new facilities

map markets, demographics, locations of competitors, etc… to identify the best location

69
Q

What is a heat map used for?

A

shows the intensity of events, often used to portray crimes, incidents, pothole densities

70
Q

What is stratified sampling?

A

When you subdivide the population into at least 2 different subgroups (strata) whose members share the same characteristics, then draw a random sample from each subgroup.

a form of probability sampling
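
A minimal sketch of the idea: group the population by a characteristic, then sample at random within each group. The population records, the `age_group` field, and the choice of 3 per stratum are all hypothetical.

```python
import random

def stratified_sample(population, key, per_stratum, rng):
    """Group units into strata by key, then draw a random sample from each."""
    strata = {}
    for unit in population:
        strata.setdefault(key(unit), []).append(unit)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical population of 20 residents in two age groups
population = [
    {"id": i, "age_group": "under 40" if i % 2 else "40 and over"}
    for i in range(20)
]

rng = random.Random(1)  # fixed seed so the example is repeatable
sample = stratified_sample(population, lambda u: u["age_group"], 3, rng)
for stratum, members in sample.items():
    print(stratum, [m["id"] for m in members])
```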

71
Q

What is systematic sampling?

A

Every 20th phone # in the phone book

Choose every 7th person in a group

a form of probability sampling

In a systematic random sample everyone in the population has an equal chance of being selected.

eg. A researcher picked a random point in the forest and a random direction, went to that starting point, and counted the number of frogs closest to it. He then took ten steps in the chosen direction and counted the frogs again, repeating the process of taking ten steps in the same direction and counting until he had counted the number of frogs from 50 walks.
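
A sketch of the "every 20th entry" idea: choose a random starting point within the first interval, then take every 20th unit after it. The numbered frame of 100 entries is hypothetical.

```python
import random

def systematic_sample(frame, step, rng):
    """Pick a random start in the first interval, then every step-th unit."""
    start = rng.randrange(step)
    return frame[start::step]

# Hypothetical sampling frame: 100 numbered phone book entries
frame = list(range(1, 101))
sample = systematic_sample(frame, 20, random.Random(0))
print(sample)
```

The random start is what gives every entry the same chance of being selected.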

72
Q

What is cluster sampling? THINK NEIGHBORHOODS

A

Divide the population into smaller groups/sections or clusters then randomly select among those clusters to form a sample (eg. neighborhoods represent clusters)

In a cluster sample, the population is divided into clusters and only some of the clusters are sampled. This tends to increase sampling error because the members of a cluster are often similar to one another; for example, households in the same neighborhood.

a form of probability sampling

often used to study large populations, particularly those that are widely geographically dispersed
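
A minimal sketch, assuming one-stage cluster sampling in which every household in a selected neighborhood is surveyed. The neighborhood names and household IDs are invented.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters, then keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical neighborhoods (clusters) and their households
neighborhoods = {
    "Northside": ["N1", "N2", "N3"],
    "Riverside": ["R1", "R2"],
    "Old Town": ["O1", "O2", "O3", "O4"],
    "Hilltop": ["H1", "H2"],
}

sample = cluster_sample(neighborhoods, 2, random.Random(42))
print(sample)
```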

73
Q

You have calculated a measure of dispersion around the mean. This was calculated as the average of the sum of the squared deviations from the mean. This is the:

A

The VARIANCE is a numerical value used to indicate how widely individuals in a group vary.
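
The calculation described in the question can be sketched directly, using made-up values and dividing by the number of observations (the population variance). The standard library's `pvariance` is shown as a cross-check.

```python
from statistics import mean, pvariance

def population_variance(values):
    """Average of the squared deviations from the mean."""
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

# Hypothetical observations with mean 7
data = [3, 5, 7, 9, 11]
result = population_variance(data)
print(result, pvariance(data))
```

The standard deviation is the square root of this value.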

74
Q

What is a choropleth map?

A

the kind of map that would work best to represent the percentage of residents below the poverty line in each census tract

Choropleth maps are the best way to link statistical data to discrete geographic areas using different colors or shades of the same color.

75
Q

Correlation coefficients whose magnitudes are between _______ indicate variables which can be considered very highly correlated.

A

0.9 and 1.0 (or even 0.7-0.9)

Eg. A correlation value of -0.85 would indicate a high correlation between the two variables.

-1 indicates a perfectly negative linear correlation between two variables
0 indicates no linear correlation between two variables
1 indicates a perfectly positive linear correlation between two variables

76
Q

What is TIGER data?

A

TIGER stands for Topologically Integrated Geographic Encoding and Referencing system; it is the U.S. Census Bureau’s geographic spatial data.

TIGER files include roads, census blocks, census tracts. NOT demographic data.

However, the core TIGER/Line Files and Shapefiles do contain geographic entity codes (GEOIDs). Demographic data are not embedded in the TIGER files themselves, but the GEOIDs let you link the files to the Census Bureau’s demographic data.

77
Q

What type of map uses scales of 1:25,000; 1:50,000; and 1:100,000?

A

United States Geological Survey (USGS) topographic

78
Q

What is Applied research?

A

Applied research refers to a non-systematic process of providing solutions to specific problems or issues. These problems or issues can be on an individual, group, or societal level.

a type of examination that uses basic research to find practical solutions for existing problems. These can include challenges in the workplace, education and society. This research type uses empirical methodologies, such as experiments, to collect further data in an area of study.

79
Q

What is Translational research?

A

Translational research aims to make findings from basic science useful for practical applications that enhance human health and well-being.

Eg. You are organizing a regional research summit to bring together professionals in the region with researchers at the university. You are organizing a panel that will focus on basic research and how it can be taken into practice. This is translational research.

It is practiced in a wide variety of fields such as environmental science, as well as the health, behavioral, and social sciences.

80
Q

What is Basic research?

A

Basic research, or fundamental research, is a type of investigation focused on improving the understanding of a particular phenomenon, study or law of nature.

This type of research examines data to find the unknown and fulfill a sense of curiosity. Usually, these involve “how,” “what” and “why” questions to explain occurrences.

Basic research looks at how processes or concepts work. Information obtained from basic research often creates a foundation for applied studies.

Eg. A study to discover the components making up human DNA
Eg. A study assessing whether stress levels make people more aggressive

81
Q

What is Community-engaged research?

A
82
Q

What is sample selection bias?

A

a type of bias caused by choosing non-random data for statistical analysis. It is the bias that results from the failure to ensure the proper randomization of a population sample.

Sample selection bias can occur when the availability of data is influenced by the selection process.

Sampling bias is a bias in which a sample is collected in such a way that some members of the intended population have a lower or higher sampling probability than others. It results in a biased sample of a population in which all individuals, or instances, were not equally likely to have been selected.

83
Q

What is a non-sampling error?

A

A non-sampling error relates to happenings during data collection that create unreliable data.

eg. respondents misinterpreted the intention of a survey question. The planner intended this to mean neighborhood, while respondents interpreted this in multiple ways.

84
Q

The range is a proper summary of the spread in the distribution of which of the following types of data: (1) nominal; (2) ordinal; (3) interval; (4) ratio?

A

Range = interval and ratio data.

The range is only appropriate when the intervals between data points are meaningful.

85
Q

What is the Hedonic Regression Model/Hedonic Pricing Method?

A

the use of a regression model to estimate the influence that various factors have on the price of a good, or the demand for a good.

86
Q

A median is a proper summary of the central tendency of the following types of data:
(1) nominal
(2) ordinal
(3) interval
(4) ratio

A

The median is appropriate for:
ordinal/ranks,
interval and
ratio data,

but NOT for categories (nominal data).

87
Q

Inferential statistics….

A

use probability theory to determine characteristics of a population based on observations made on a sample from that population.

We infer things about the population based on what is observed in the sample. Inferential statistics are about rejecting the null hypothesis, never proving it.