Lesson 4 Flashcards

0
Q

What are the four principles of experimental design? Explain each term.

A

Control - limiting what data is used.

Randomization - relying on chance

Replication - considering similar experiments.

Blocking - exploiting natural and logical groupings of variables

1
Q

List and describe the four market identification steps. Discuss the elements within each step.

A

1) problem identification - the initial questions provided by the client

Identify the client and intended users
Intended use
Purpose 
Date of value
Property characteristics
Assumptions 

2) scope - the solution strategy identified by the appraiser

Breadth
Complexity
Detail
Emphasis

3) similarity - a measure of relative equivalency in aspect of time, space, and utility

Transaction - contract terms
Time - market conditions
Location
Utility - physical, legal, financial characteristics

4) data decisions - defining and refining available data

Information 
Database
Dataset
Information set
Illustrative set

5) optimality - research and analysis may lead to a reconsideration of each of the steps

2
Q

What are the 5 steps of data reduction? Explain each step

A

Database - numbers and facts available or attainable

Data frame - excludes what is not statistically useful

Dataset (market dataset) - only what is relevant to that market area

Information set - data controlled for analysis and comprehension

Illustrative set - data chosen for its ability to exemplify the solution

3
Q

What are the four dimensions of similarity?

A

Transaction - rights conveyed, financing, sale conditions

Time - market conditions

Space - adjacent and proximate influences, neighborhood, district, market segment, illustrative comparables

Utility - property characteristics: physical, legal, and financial

4
Q

What are the 2 major reasons why an analyst might need to go back and expand an original dataset or data frame?

A

1) more data is needed

2) the optimal use is not what was originally asserted when the appraisal problem was identified

5
Q

What are the four dimensions of comparability, and why are they important to identifying the right market?

A

Transaction elements

Time

Space

Utility characteristics

They are important because their relative independence enables simple regression and centers comparison methods

6
Q

The transaction dimension has three elements. How do they differ substantially from the other three dimensions?

A

The elements of transaction terms do not measure any attribute of the property, just the contract and the motivation for the sale.

7
Q

Utility encompasses the two fundamental types of property benefits: amenities and income. What are the three forms of utility?

A

Similar to optimal use issues, the three forms of utility are physical characteristics, legal permissibility, and financial characteristics.

8
Q

While neighborhoods and districts are important to real estate economics, what economic division is the one we primarily rely on for defining our datasets? How is it defined?

A

We primarily rely on market segment. Market segment is defined as a homogeneous market as characterized by a set of similarity variables.

9
Q

Name two major types of changes in market conditions.

A

Trends and event impacts

10
Q

How is information better than data?

A

Information is data that has been organized to make it useful and understandable

11
Q

How is an information set different from a market dataset?

A

An information set is a subset of the data set, useful for a particular analytic method

12
Q

What statistic is most useful in identifying the best unit of comparison?

A

The main statistical measures useful in identifying the best unit of comparison are the correlation between sale prices and that particular variable, and the COV of the variable itself.

13
Q

Why do you save sales deleted from your dataset in a different file?

A

Helps with documentation and enables audit

14
Q

Appraisal has historically used less intensive data analysis methods. Why? How has this changed today?

A

Historically appraisal has used less intensive data analysis methods because data was poor in quality, difficult to obtain, and difficult to analyze. Today, good quality data is much more easily available and computing power and software have made it much simpler to analyze effectively

15
Q

What is the difference between reliability and credibility?

A

Reliability relates to low variability and is measurable.

Credibility is more subjective and is evaluated in terms of the intended use of the appraisal. It includes appropriateness of the model used and mathematical reliability.

16
Q

Why do we use multiple regression analysis to solve all valuation problems?

A

It is not used for all valuation problems. Comparison of means and two-variable statistics can also be used.

17
Q

What are the four appraisal principles of experimental design? Give a one sentence explanation of each.

A

1) control - limiting data to homogeneous groups
2) randomization - relying on control (bracketing and balancing) so that elements are distributed by chance
3) replication - relying on similar experiments or similarity of missing information
4) blocking - exploiting “natural” groupings of variables

18
Q

List the sequence of five data group subsets we have learned regarding data reduction

A
Five subsets in data reduction: 
Database
Data frame
Market dataset 
Information set
Illustrative set
19
Q

Which of the above sets is most similar to the traditional three comparable sales?

A

The illustrative set best serves the simplicity aspects of the traditional “3 comps” report.

20
Q

Which is better as a unit of comparison, continuous variables or categorical variables?

A

A continuous measure variable, used as a unit of comparison, can significantly improve the precision of an analysis - there is less flexibility with a categorical (discrete) variable.

21
Q

Which is better for control and blocking, continuous variables or categorical variables?

A

For control and blocking, the best are binary variables. Next best are categorical (discrete) variables with more than two choices

22
Q

The four dimensions of property similarity:

1) enable determination of scope
2) often have high correlation within each dimension
3) generally have high correlation across dimensions
4) are all variants on locational advantages

A

2) the four dimensions of property similarity generally have low correlation across dimensions, but can have high correlation within dimensions

23
Q

Which of the following is described as the biggest appraisal mistake?

1) not measuring the property correctly
2) errors in identifying the transaction elements
3) identifying the wrong market segment
4) applying an adjustment the wrong way

A

3

24
Q

Which one of the following is described as the second biggest mistake?

1) not measuring the property correctly
2) errors in identifying the transaction elements
3) identifying the wrong market segment
4) applying an adjustment the wrong way

A

2

25
Q

What must be considered in identifying similarity?

1) the four dimensions of comparison
2) location and subject characteristics
3) transaction elements and subject characteristics
4) utility and age of the property

A

1

26
Q

In any given project, the depth of scope and detail of your analysis will depend on:

1) the challenges of the particular assignment
2) any agreement you may have with your client
3) your desire to produce a superior work product
4) all of the above

A

4)

27
Q

The appraisal process is designed to:

1) validate sale prices in a fair market
2) make data meaningful for decision-making
3) clarify scope of work to redirect client problems
4) turn good information into data explanations

A

2) the entire appraisal process is designed to take data, enhance it, and reduce it so that it is meaningful

28
Q

Property characterization:

1) is the start of the appraisal process
2) provides an alternative to finding the right market
3) may cause you to return to your market identification decision
4) can only be valid with full and complete hypothesis testing on all variables of interest

A

3

29
Q

Credibility:

1) means having a high reliability and low correlation among dimensions
2) involves getting the model right, as well as the right scope of work
3) depends on competence and good communication
4) means freedom from random error

A

3

30
Q

Which of the following is correct regarding evaluation of market conditions?

1) trends can be evaluated using a price index or scatter graph
2) events are best analyzed by paired group methods alone
3) the closure of a major employer in a market area necessitates a trend impact analysis
4) a steady market increase necessitates an event impact analysis

A

1

31
Q

The scope of the research element is composed of four fundamental dimensions. Which of the following is not one of the four dimensions?

1) transaction conditions
2) range of characteristics
3) space
4) complexity

A

4

32
Q

The four principles of valuation experimental design are:

1) control, randomization, replication, and blocking
2) balance, bracketing, replication, and control
3) control, randomization, balance, and blocking
4) dimensions, bracketing, estimation, and control

A

1

33
Q

Replication means:

1) Other similar experiments should produce similar results
2) “averaging out” the effects of uncontrollable sources of variation
3) you can reply to challenges with reproducible data
4) taking advantage of natural grouping of similar variables

A

1

34
Q

The difference between a market dataset and an information set is:

1) control, randomization, replication, and blocking
2) that some data must be added due to presence of outliers
3) the degree of iteration necessary
4) that an information set is designed for use with a specific analysis method

A

4

35
Q

What is considered to be a strong and perfect correlation?

A

Strong: 0.8 to 1 in magnitude (the sign only gives the direction)

Perfect: exactly 1 in magnitude

36
Q

What is the coefficient of determination? How is it measured?

A

It is used to determine how well the regression line predicts the selling price.

R2 = 0: none of the variation in sale prices is explained by the model.

R2 = 1: all the deviations from the average sale price are explained by the regression equation, and the sum of the squared errors equals 0.
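The two extremes in this answer can be checked with a small Python sketch. It uses the standard definition R2 = 1 - SSE/SST; the sale prices and the function name are made up for illustration and do not come from the lesson.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_actual = sum(actual) / len(actual)
    # SSE: squared errors between observed and predicted prices
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # SST: squared deviations of observed prices from their mean
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - sse / sst

# Hypothetical sale prices (illustrative only)
prices = [200_000, 250_000, 300_000, 350_000]

# A model that predicts every sale exactly leaves no squared error
perfect = r_squared(prices, prices)

# A model that always predicts the average price explains no variation
mean_price = sum(prices) / len(prices)
baseline = r_squared(prices, [mean_price] * len(prices))
```

Here `perfect` corresponds to the R2 = 1 case and `baseline` to the R2 = 0 case.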

37
Q

What are the six key statistics used in evaluating regression results?

Describe and discuss what is a good measurement?

A

COD

SEE - how good the best fit is, in terms of how large the differences are between the regression line and the actual sample observations - the lower the value the better - a dollar figure

COV - SEE expressed as a % of the mean

Correlation coefficient

T-statistic - measure of the significance or importance of a regression variable in explaining differences in the dependent variable - should be greater than 1.64

F-statistic - shows the overall quality of the regression - its significance (p-value) should be less than 0.05

38
Q

After running a regression, you find that the model yields an SEE of 5,000. Is this a good result? What are the problems with using SEE as a measure of “goodness of fit”?

A

SEE is an absolute measure, meaning its size alone does not tell us very much. In order to use SEE to analyze “goodness of fit”, we must convert it to a COV by dividing the SEE by the mean. The COV tells us how our model is doing in relative (percentage) terms. A target COV for a good model is less than 10%.
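A short Python sketch of the conversion, assuming two hypothetical markets with the same SEE of 5,000 but very different mean sale prices (the mean prices are invented for illustration):

```python
TARGET_COV_PERCENT = 10  # target from the answer above: COV under 10%

def cov_percent(see, mean_sale_price):
    """SEE expressed as a percentage of the mean sale price."""
    return round(see / mean_sale_price * 100, 2)

# Same SEE, two hypothetical mean sale prices
cov_low_priced = cov_percent(5_000, 40_000)
cov_high_priced = cov_percent(5_000, 400_000)

low_priced_ok = cov_low_priced < TARGET_COV_PERCENT
high_priced_ok = cov_high_priced < TARGET_COV_PERCENT
```

Under these assumptions the same SEE misses the 10% target in the lower-priced market and meets it comfortably in the higher-priced one, which is why SEE alone cannot answer the question.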

39
Q

Explain the difference between simple linear regression and multiple regression?

A

Multiple regression includes two or more independent variables; simple linear regression includes only one. Multiple regression is difficult to depict spatially since it involves three or more dimensions, while simple linear regression is in two dimensions and can be readily displayed in a graph. Multiple regression is also much more difficult for clients and other real estate professionals to understand.

40
Q

Based on your regression analysis of condominium sales you determine:

Market Value = $42,000 + $70(living area in sqft) + $5,000(number of bathrooms)

What do the coefficients in the equation represent? How many bathrooms would you expect a 1,000 sqft condo that sells for $119,500 to have?

A

The $70 coefficient represents the value that each additional sqft of living area contributes, while the $5,000 coefficient represents the value of each additional bathroom. These coefficients can also be used as adjustment factors. Using the equation, a 1,000 sqft condo selling for $119,500 would have 1.5 bathrooms: $119,500 = $42,000 + $70(1,000 sqft) + $5,000(number of bathrooms).
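The arithmetic can be verified with a short Python sketch. The coefficients come from the equation in the question; the function and constant names are illustrative.

```python
# Coefficients from the regression equation in the question
INTERCEPT = 42_000
PER_SQFT = 70
PER_BATHROOM = 5_000

def predicted_value(living_area_sqft, bathrooms):
    """Market value implied by the regression equation."""
    return INTERCEPT + PER_SQFT * living_area_sqft + PER_BATHROOM * bathrooms

def implied_bathrooms(sale_price, living_area_sqft):
    """Rearrange the equation to solve for the number of bathrooms."""
    return (sale_price - INTERCEPT - PER_SQFT * living_area_sqft) / PER_BATHROOM

# (119,500 - 42,000 - 70,000) / 5,000 = 1.5
bathrooms = implied_bathrooms(119_500, 1_000)
```

Plugging the result back in, `predicted_value(1_000, bathrooms)` returns the $119,500 sale price.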

41
Q

Additive multiple regression includes a major assumption that the impact of the coefficient for a specific independent variable Xi is independent of the impact of other variables. In other words, the impact of one independent variable on the dependent variable Y is assumed not to be related to changes in another independent variable. When this assumption turns out to be false, what problem do we have? How can this issue be overcome?

A

Multicollinearity. This issue can be identified during initial data exploration. If two independent variables show high correlations, there is a potential for problems in the model. It may be necessary to exclude one or more variables from the model and re-test the regression. The tolerance and VIF statistics are tests for multicollinearity which can be applied to regression models. A low tolerance (less than 0.3) and a high VIF (greater than 3.33) outcome is a warning sign that multicollinearity exists.

42
Q

You conduct a regression analysis of detached single family housing prices, and then use the regression formula to calculate predicted values for your data set and the residuals. What kind of results should you expect when you analyze the descriptive statistics for the residuals?

A

The residuals should have a mean of zero because the regression model's function is to find the line of best fit that limits each observation's residual from this line. The median, however, may be positive or negative depending on the skewness in the distribution of predicted values.
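A minimal Python sketch of this behaviour, fitting an ordinary least-squares line with an intercept to a small invented sample of living areas and sale prices (the data and the helper name `fit_line` are assumptions for illustration):

```python
from statistics import median

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical living areas (sqft) and sale prices
sizes = [1000, 1200, 1500, 1800, 2500]
prices = [150_000, 165_000, 210_000, 230_000, 340_000]

intercept, slope = fit_line(sizes, prices)
# Residual = observed price minus the price predicted by the line
residuals = [p - (intercept + slope * s) for s, p in zip(sizes, prices)]

mean_residual = sum(residuals) / len(residuals)   # zero up to rounding error
median_residual = median(residuals)               # need not be zero
```

In this sample the mean residual is zero apart from floating-point rounding, while the median residual is positive.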

43
Q

The first step in testing for multicollinearity is conducted during data screening, where the correlation of each of the independent variables is determined. What other steps can be taken to ensure that multicollinearity is not present in your model?

A

After creating the model you should examine both the tolerance and VIF statistics for each variable, where tolerance = 1/VIF. If the tolerance of any of the variables is less than 0.3 (equivalently, VIF is greater than 3.333), multicollinearity exists and the model should be revised.
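A rough Python sketch of the check, limited to the simplest case of two independent variables. It assumes the usual definition tolerance = 1 - R2, where R2 comes from regressing one independent variable on the others; with only two variables that R2 is just their squared correlation. The data and function names are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two variables."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def tolerance_and_vif(xs, ys):
    """Two-variable case: tolerance = 1 - r^2 and VIF = 1 / tolerance."""
    tolerance = 1 - pearson_r(xs, ys) ** 2
    return tolerance, 1 / tolerance

# Hypothetical independent variables that move closely together
living_area = [900, 1100, 1300, 1500, 1700, 1900]
bedrooms = [2, 2, 3, 3, 4, 4]

tolerance, vif = tolerance_and_vif(living_area, bedrooms)
# Warning thresholds from the answer above
flagged = tolerance < 0.3 or vif > 3.333
```

With these made-up numbers the VIF comes out near 11.7, well past the warning threshold, so one of the two variables would be a candidate for removal before re-testing the regression.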

44
Q

A high SEE indicates:

1) a better result
2) a worse result
3) that multiple regression analysis is not a viable option
4) multicollinearity exists

A

2

45
Q

A high VIF indicates:

1) multicollinearity is not present
2) multicollinearity is present
3) the tolerance is also high
4) the correlation coefficient is significant

A

2

46
Q

A COV under 10% indicates:

1) a good result
2) a poor result
3) that multiple regression analysis is not a viable option
4) multicollinearity exists

A

1

47
Q

Consider the following statistics for two samples of apartment rents versus suite size for rental apartments in Waterloo:
Dataset A, luxury high-rise concrete construction - R2 of 0.732 and SEE of 6,000
Dataset B, older 3-story frame walk-up construction - R2 of 0.635 and SEE of 8,673

Rents are much higher in Dataset A than in Dataset B. In which dataset would a regression equation more accurately predict the apartment rent?

1) Dataset A since the R2 is higher and SEE lower than dataset B.
2) Dataset B since the R2 is lower and SEE higher than dataset A.
3) Both datasets would have equal statistical reliability.
4) Impossible to determine because the R2 and SEE are based on absolute values, so relative comparisons are not possible

A

1
The R2 value is higher in sample A than in sample B, and the SEE is lower. Therefore dataset A appears to explain more variation and to do so with less error. SEE is an absolute measure, which makes comparisons difficult without more information about the samples. However, because we know that rents are higher in dataset A, its lower SEE is even more convincing. Consider the alternative: if the SEE for dataset A were higher, we would not be able to be certain whether it was higher because of higher rents or because of more error. Because the SEE for dataset A is in fact lower, we can conclude the predictive error is in fact lower.

48
Q

The standard error of the estimate is a good statistical tool for measuring:

1) a mathematical expression of the best fit of ordered pairs
2) the percentage of variation in Y that can be explained by the regression line
3) the amount of dispersion of the observed data around the regression line
4) none of the above

A

3)
The SEE is a measure of the amount of dispersion of the observed data around the regression line. R2 represents the percentage of the variation in the dependent variable that is explained by the regression equation.

49
Q

Consider a model where the dependent variable is sale price and the independent variable is age of building. This resulted in the following regression equation: Y= 100,500-960X and an R2 of 0.8. What can you conclude about these results?

1) Each year adds $960 to value
2) Weak negative correlation with 64% of the variation in sale price explained by building age.
3) Strong negative correlation with 80% of the variation in sale price explained by building age
4) A 1 year old building is worth $101,460.

A

3)
The negative sign of the regression coefficient indicates negative correlation. The R2 at 0.8 indicates a strong correlation. Each year of age reduces value by $960.
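The numbers in this card can be checked with a short Python sketch. It assumes X is building age in years and relies on the fact that, in a one-variable regression, the correlation coefficient is the square root of R2 carrying the sign of the slope. The names are illustrative.

```python
from math import copysign, sqrt

# Values from the regression equation Y = 100,500 - 960X and its R2
INTERCEPT = 100_500
SLOPE = -960
R_SQUARED = 0.8

def predicted_price(age_years):
    """Sale price predicted for a building of the given age."""
    return INTERCEPT + SLOPE * age_years

# Correlation coefficient: sqrt(R2) with the sign of the slope (about -0.89)
correlation = copysign(sqrt(R_SQUARED), SLOPE)

# 100,500 - 960 = 99,540, so option 4 is wrong as well as option 1
one_year_old = predicted_price(1)
```

A correlation of roughly -0.89 falls in the strong range, which is consistent with answer 3.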

50
Q

What is the advantage of multiple regression over simple linear regression?

1) helps deal with non-linear relationships
2) provides the analyst an opportunity to account for additional sources of predictive error
3) accounts for the economic reality that many variables may affect the dependent variable
4) multicollinearity becomes increasingly possible

A

3)
Simple regression only considers one independent variable. However, in reality, many independent variables may affect the dependent variable.

51
Q

If a dataset has high correlation among all the variables, does this mean that it will always have a high likelihood of predicting the sale price for any combination of the variables within the model parameters?

1) yes the regression has accounted for virtually all the variation in the dependent variable.
2) not always, since there are other factors such as sampling technique, sample size, and COV which should be considered
3) yes since there is no longer any residual error
4) no since only two of the variables are highly correlated

A

2)
The analyst must consider a variety of factors affecting the model, such as sampling technique or extrapolation issues, before determining the regression equation will produce acceptable predictions

52
Q

What would your reaction be if the bathroom variable had a t-statistic of 0.105 and all other statistics for the remaining variables were unchanged?

1) the bathroom variable may offer no benefit to the model
2) we can no longer be confident that the bathroom coefficient value is correct
3) our confidence in the significance of the bedrooms variable is improved
4) both (1) and (2)

A

4)
A t-statistic below the critical value of 2 means we can no longer be confident (at a 95% level) that the variable coefficient is different from zero. This means we cannot be confident that its value is correct, and the variable may be removable without affecting the model.
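As a rough illustration in Python: a t-statistic is the coefficient divided by its standard error. The coefficient and standard error below are hypothetical, chosen only so that the ratio matches the 0.105 in the question; the critical value of 2 is the one cited in the answer.

```python
CRITICAL_T = 2.0  # approximate 95% critical value cited in the answer above

def t_statistic(coefficient, standard_error):
    """t-statistic of a regression coefficient."""
    return coefficient / standard_error

def is_significant(t_value, critical=CRITICAL_T):
    """True when the coefficient is distinguishable from zero at this level."""
    return abs(t_value) >= critical

# Hypothetical bathroom coefficient and standard error (illustrative only)
bathroom_t = t_statistic(525, 5_000)
keep_bathroom = is_significant(bathroom_t)
```

With a t-statistic this far below 2, `keep_bathroom` is `False`, matching options (1) and (2).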