CHAPTER 2 QUESTIONS Flashcards
Descriptive statistics describe the distribution of a data set in terms of _________________
Descriptive statistics describe the distribution of a data set in terms of both central tendency and dispersion.
Measures of dispersion include _ _ _ _ _
Measures of dispersion include the standard deviation, coefficient of variation (standard deviation divided by the mean), coefficient of dispersion, and range.
The median is the ____ percentile or midpoint in the distribution, with half of the sales prices less than it and half greater than it.
The median is a frequently used measure of central tendency in both assessment and single property appraisals.
This is because _ _ _ _ _ _
The ______ percentile is the first quartile and the ______ percentile is the third quartile. These represent the cut-off points for the lowest one-fourth and lowest three-fourths of the data, respectively.
The median is the 50th percentile or midpoint in the distribution, with half of the sales prices less than it and half greater than it.
The median is a frequently used measure of central tendency in both assessment and single property appraisals. This is because the mean can be significantly influenced by outliers; for S_PRICE, the influence of some very high-priced sales has resulted in a mean that is larger than the median.
The 25th percentile is the first quartile and the 75th percentile is the third quartile. These represent the cut-off points for the lowest one-fourth and lowest three-fourths of the data, respectively.
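As a rough illustration of the points above, the following sketch compares the mean, median, and quartiles for a small set of sale prices. The prices are hypothetical (they are not the S_PRICE data), and quartile values can differ slightly depending on the interpolation method a statistics package uses.

```python
import statistics

# Hypothetical sale prices; the last one is a very high-priced outlier.
sale_prices = [200_000, 210_000, 220_000, 230_000, 240_000, 250_000, 1_500_000]

mean_price = statistics.mean(sale_prices)
median_price = statistics.median(sale_prices)  # 50th percentile

# quantiles(n=4) returns the three quartile cut points:
# 25th percentile (first quartile), 50th percentile, 75th percentile (third quartile).
q1, q2, q3 = statistics.quantiles(sale_prices, n=4, method="inclusive")

print(f"mean   = {mean_price:,.0f}")
print(f"median = {median_price:,.0f}")
print(f"Q1 = {q1:,.0f}, Q3 = {q3:,.0f}")
```

In this made-up sample the single outlier pulls the mean well above the median, while the median and quartiles barely move.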
Although not shown, the coefficient of variation (COV) can be computed by _ _ _ _ _
Although not shown, the coefficient of variation (COV) can be computed by dividing the standard deviation by the mean.
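A minimal sketch of that calculation, using hypothetical sale prices and the sample standard deviation (the variable names are illustrative only):

```python
import statistics

# Hypothetical sample of sale prices.
sale_prices = [200_000, 210_000, 220_000, 230_000, 240_000]

mean_price = statistics.mean(sale_prices)
std_dev = statistics.stdev(sale_prices)   # sample standard deviation
cov = std_dev / mean_price                # coefficient of variation
price_range = max(sale_prices) - min(sale_prices)

print(f"COV = {cov:.1%}, range = {price_range:,}")
```

Because the COV is a ratio, it lets you compare dispersion between data sets whose means differ.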
After finding the mean for a sample, it is a good idea to ask whether the figure is representative of the whole population. This involves calculating something called the __________, then determining ___________
After finding the mean for a sample, it is a good idea to ask whether the figure is representative of the whole population. This involves calculating something called the standard error of the mean, then determining confidence intervals around that number.
The _____________, often referred to as _______ OR ___, is a measure of how well the mean for a particular sample estimates the mean for the whole population.
The standard error of the mean, often referred to as standard error or SE, is a measure of how well the mean for a particular sample estimates the mean for the whole population.
SEx = s / √n, where:
SEx = estimate of the standard error of the mean
s = standard deviation of the sample
√n = the square root of the sample size
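The estimate can be sketched in a few lines: divide the sample standard deviation by the square root of the sample size. The sale prices below are hypothetical.

```python
import math
import statistics

# Hypothetical sample of sale prices.
sale_prices = [200_000, 210_000, 220_000, 230_000, 240_000]

s = statistics.stdev(sale_prices)   # standard deviation of the sample
n = len(sale_prices)                # sample size
se_mean = s / math.sqrt(n)          # estimated standard error of the mean

print(f"SE of the mean = {se_mean:,.2f}")
```

Since the sample size sits under a square root, quadrupling the sample only halves the standard error.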
The standard error of the mean can be used to form confidence intervals around the mean. Adding and subtracting one standard error to and from the mean produces a range of values that typically encompasses approximately ____% of the possible means for the population overall.
The standard error of the mean can be used to form confidence intervals around the mean. Adding and subtracting one standard error to and from the mean produces a range of values that typically encompasses approximately 68% of the possible means for the population overall.
As stated above, the 95% confidence interval for the mean is closely approximated by _ _ _ _ _ _
As stated above, the 95% confidence interval for the mean is closely approximated by adding and subtracting the value of two standard errors of the mean from the mean value.
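A short sketch of these two rules of thumb follows. The function name and sample data are hypothetical, and the result is only a reasonable approximation when the data is close to normally distributed.

```python
import math
import statistics

def mean_confidence_intervals(values):
    """Return (mean, ~68% interval, ~95% interval) using the 1-SE and 2-SE rules of thumb."""
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    ci_68 = (mean - se, mean + se)            # mean plus/minus one standard error
    ci_95 = (mean - 2 * se, mean + 2 * se)    # mean plus/minus two standard errors
    return mean, ci_68, ci_95

# Hypothetical sample of sale prices.
sale_prices = [200_000, 210_000, 220_000, 230_000, 240_000]
mean, ci_68, ci_95 = mean_confidence_intervals(sale_prices)

print(f"mean = {mean:,.0f}")
print(f"approx. 68% CI: {ci_68[0]:,.0f} to {ci_68[1]:,.0f}")
print(f"approx. 95% CI: {ci_95[0]:,.0f} to {ci_95[1]:,.0f}")
```

Because the same amount is added and subtracted, the mean always sits at the centre of the interval.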
TRUE OR FALSE?
The mean will always be in the centre of the range of the 95% confidence interval, but this is not true for the median.
ANSWER: TRUE
While the median value will always be within its 95% confidence interval, the median will not always be at the centre of the confidence interval range. This is because _ _ _ _ _
While the median value will always be within its 95% confidence interval, the median will not always be at the centre of the confidence interval range.
This is because the confidence interval for the median is found by counting values greater than and less than the median rather than by addition and subtraction of a value.
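The counting idea can be sketched as follows. This version uses the binomial distribution to decide how many ranked values to count in from each end, which is one common approach; it is an assumption here, and the course text may use a different counting rule. The function name and sale prices are hypothetical.

```python
from math import comb
import statistics

def median_confidence_interval(values, confidence=0.95):
    """Approximate confidence interval for the median, found by counting ranked values."""
    data = sorted(values)
    n = len(data)
    alpha = 1 - confidence
    # Find the largest rank k (1-based) such that the chance of fewer than k
    # observations falling below the true median is at most alpha / 2.
    cumulative = 0.0
    k = 0
    for i in range(n // 2 + 1):
        cumulative += comb(n, i) * 0.5 ** n
        if cumulative > alpha / 2:
            break
        k = i + 1
    if k == 0:
        raise ValueError("sample too small for this confidence level")
    # Count k values in from the bottom and k values in from the top.
    return data[k - 1], data[n - k]

# Hypothetical skewed sale prices: many lower-priced sales, a few high-priced ones.
sale_prices = [150_000, 160_000, 170_000, 180_000, 190_000, 200_000,
               210_000, 230_000, 260_000, 400_000, 900_000]

median_price = statistics.median(sale_prices)
low, high = median_confidence_interval(sale_prices)
print(f"median = {median_price:,}, interval = {low:,} to {high:,}")
```

The interval endpoints are actual observations picked by rank, so with skewed data the upper endpoint can lie much further from the median than the lower one.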
The ASR shows how accurately assessed values relate to actual sale prices by _ _ _ _ _
The ASR shows how accurately assessed values relate to actual sale prices by dividing the assessed value (the value predicted by the model) by the actual sales price.
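As a small worked example of that division, with hypothetical figures and a hypothetical function name:

```python
def assessment_sales_ratio(assessed_value, sale_price):
    """Return the ASR: assessed value divided by the actual sale price."""
    if sale_price <= 0:
        raise ValueError("sale price must be positive")
    return assessed_value / sale_price

# Hypothetical property: assessed (model-predicted) value vs. actual sale price.
asr = assessment_sales_ratio(assessed_value=285_000, sale_price=300_000)
print(f"ASR = {asr:.2f}")
```

An ASR below 1 means the assessed value was lower than the sale price, an ASR above 1 means it was higher, and an ASR of exactly 1 means the two matched.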
A “normal” distribution means _ _ _ _ _ _
A “normal” distribution means that the data is evenly spread out on either side of the mean, with the bulk of the observations near the mean and trailing off on either side.
NOTE ONLY
A normal distribution is required in order to accurately estimate confidence intervals of the mean and to carry out probability estimates for sample data: e.g., “68% of the data will fall between the points ___ and ____, which are one standard deviation on either side of the mean”. See Figure 2.1 for an illustration. However, if the data is not normal, these measures will not be completely reliable.
Quite often, real estate data is not normally distributed, especially residential sales data, as it tends to have a large number of lower-priced sales and then a few high-priced sales that skew the curve. When this is the case, the median is a better indicator of central tendency than the mean and the confidence intervals of the median are preferable to those around the mean.
_____________ allow you to use mathematical and logical operations to create new variables from existing ones.
For example, a database may include information on sale price and house size (square footage). If you were interested in creating a variable that represented sale price per square foot, you would create a transformation for such a variable.
Transformations allow you to use mathematical and logical operations to create new variables from existing ones. For example, a database may include information on sale price and house size (square footage). If you were interested in creating a variable that represented sale price per square foot, you would create a transformation for such a variable.
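A minimal sketch of such a transformation in plain Python is shown below. The field names and records are hypothetical; a statistics package would normally do this through its own transformation or compute-variable feature.

```python
# Hypothetical database records with sale price and house size.
sales = [
    {"sale_price": 300_000, "square_feet": 1_500},
    {"sale_price": 450_000, "square_feet": 2_000},
]

# Transformation: create a new variable from two existing ones.
for record in sales:
    record["price_per_sqft"] = record["sale_price"] / record["square_feet"]

for record in sales:
    print(record)
```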