Week 3 Chapter 4 Flashcards

1
Q

sample variance

A

s^2 = SS/(n-1)

2
Q

population variance

A

σ^2 = SS/N

3
Q

population standard deviation

A

σ = √(σ^2) = √((SS)/N)

4
Q

sample standard deviation

A

s = √(s^2) = √((SS)/(n-1))
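
Worked example (a sketch, not part of the original cards): the four formulas above applied to one made-up set of scores, treated first as a whole population and then as a sample. The scores and the helper name `sum_of_squares` are illustrative assumptions; Python's standard `statistics` module is used only as a cross-check.

```python
import math
import statistics

def sum_of_squares(scores):
    """SS: sum of squared deviations from the mean of the scores."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

scores = [1, 9, 5, 8, 7]        # hypothetical data, mean = 6
ss = sum_of_squares(scores)     # 25 + 9 + 1 + 4 + 1 = 40
n = len(scores)

# Treating the scores as a complete population (divide by N)
pop_variance = ss / n                   # σ^2 = SS/N
pop_sd = math.sqrt(pop_variance)        # σ = √(SS/N)

# Treating the same scores as a sample (divide by n - 1)
sample_variance = ss / (n - 1)          # s^2 = SS/(n-1)
sample_sd = math.sqrt(sample_variance)  # s = √(SS/(n-1))

print(pop_variance, sample_variance)
```

The same SS gives a larger variance when it is divided by n - 1 than when it is divided by N.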

5
Q

central tendency

A

statistical measure that identifies a single score as the center of a distribution, the value most typical or representative of the whole group

6
Q

median

A

midpoint in a list of scores listed in order from smallest to largest

7
Q

mode

A

score or category that has the greatest frequency in a frequency distribution

8
Q

bimodal

A

a distribution with two modes: two scores (or categories) that share the greatest frequency or form two distinct peaks

9
Q

multimodal

A

a distribution with more than two modes (more than two scores or categories with the greatest frequency)
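
Worked example (a sketch with made-up scores): the median and mode cards can be checked with Python's standard `statistics` module. `statistics.multimode` (Python 3.8+) returns every score tied for the greatest frequency, so the length of its result separates one mode, two modes, and more than two.

```python
import statistics
from collections import Counter

scores = [2, 3, 3, 3, 4, 5, 6, 6, 6, 7]   # hypothetical data

median = statistics.median(scores)      # midpoint of the ordered list
modes = statistics.multimode(scores)    # all scores tied for top frequency
counts = Counter(scores)                # frequency of each score

if len(modes) == 1:
    shape = "unimodal"
elif len(modes) == 2:
    shape = "bimodal"
else:
    shape = "multimodal"

print(median, modes, shape)
```

Note that `multimode` only reports exact ties. Two peaks of unequal height (a major and a minor mode) have to be read from the frequencies in `counts` or from a graph.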

10
Q

major mode

A

the taller peak in a distribution that has two modes with unequal frequencies

11
Q

minor mode

A

the shorter peak in a distribution that has two modes with unequal frequencies

12
Q

line graph

A

diagram used when values on horizontal axis are measured on an interval or ratio scale

13
Q

variability

A

provides a quantitative measure of the differences between scores in a distribution and describes the degree to which the scores are spread out or clustered together. It also helps us determine which outcomes are likely and which are very unlikely to be obtained. This aspect of variability will play an important role in inferential statistics. Variability can also be viewed as measuring predictability, consistency, or even diversity.

14
Q

if the scores in a distribution are all the same, then there is

A

no variability. If there are small differences between scores, then the variability is small, and if there are large differences between scores, then the variability is large.

15
Q

predictability, consistency, and diversity are all concerned with

A

the differences between scores or between individuals, which is exactly what is measured by variability.

16
Q

a good measure of variability serves two purposes

A
  1. Variability describes the distribution of scores. Specifically, it tells whether the scores are clustered close together or are spread out over a large distance. Usually, variability is defined in terms of distance. It tells how much distance to expect between one score and another, or how much distance to expect between an individual score and the mean. For example, we know that the heights for most adult males are clustered close together, within 5 or 6 inches of the average. Although more extreme heights exist, they are relatively rare.
  2. Variability measures how well an individual score (or group of scores) represents the entire distribution. This aspect of variability is very important for inferential statistics, in which relatively small samples are used to answer questions about populations. For example, suppose that you selected a sample of one adult male to represent the entire population. Because most men have heights that are within a few inches of the population average (the distances are small), there is a very good chance that you would select someone whose height is within 6 inches of the population mean. For men’s weights, on the other hand, there are relatively large differences from one individual to another. For example, it would not be unusual to select an individual whose weight differs from the population average by more than 30 pounds. Thus, variability provides information about how much error to expect if you are using a sample to represent a population.
17
Q

three different measures of variability:

A

the range, standard deviation, and the variance.

18
Q

range

A

the distance covered by the scores in a distribution, from the smallest score to the largest score.

19
Q

One commonly used definition of the range simply measures the difference between the largest score (Xmax)

A

and the smallest score (Xmin). Range = Xmax - Xmin. By this definition, scores having values from 1 to 5 cover a range of 4 points.

20
Q

the complete set of proportions is bounded by 0 at one end and

A

by 1 at the other. The proportions cover a range of 1 point. This definition works well for variables with precisely defined upper and lower boundaries. For example, if you are measuring proportions of an object, like pieces of a pizza, you can obtain values such as (1/8), (1/4), (1/2), (3/4). Expressed as decimal values, the proportions range from 0 to 1. You can never have a value less than 0 (none of the pizza) and you can never have a value greater than 1 (all of the pizza).

21
Q

An alternative definition of the range is often used when the scores are measurements of a continuous variable. In this case, the range can be defined as the

A

difference between the upper real limit (URL) for the largest score (Xmax) and the lower real limit (LRL) for the smallest score (Xmin). Range = URL for Xmax - LRL for Xmin. According to this definition, scores having values from 1 to 5 cover a range of 5.5 - 0.5 = 5 points.

22
Q

When the scores are whole numbers, the range can also be defined as the

A

number of measurement categories. If every individual is classified as either 1, 2, or 3 then there are three measurement categories and the range is 3 points.

23
Q

Defining the range as the number of measurement categories also works for discrete variables that are measured with

A

numerical scores. For example, if you are measuring the number of children in a family and the data produce values from 0 to 4, then there are five measurement categories (0, 1, 2, 3, and 4) and the range is 5 points. By this definition, when the scores are all whole numbers, the range can be obtained by: Xmax - Xmin + 1
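
Worked example (a sketch): the three definitions of the range from the preceding cards, applied to whole-number scores from 1 to 5. The function names and the `unit` parameter (the size of one measurement unit, assumed to be 1) are illustrative, not from the source.

```python
def range_simple(scores):
    """Range = Xmax - Xmin."""
    return max(scores) - min(scores)

def range_real_limits(scores, unit=1):
    """Range = URL for Xmax - LRL for Xmin (real limits lie half a unit out)."""
    half = unit / 2
    return (max(scores) + half) - (min(scores) - half)

def range_categories(scores):
    """Number of measurement categories for whole-number scores."""
    return max(scores) - min(scores) + 1

scores = [1, 3, 5, 2, 4]   # hypothetical whole-number scores from 1 to 5

print(range_simple(scores))       # 4
print(range_real_limits(scores))  # 5.0
print(range_categories(scores))   # 5
```

For whole-number scores the real-limits definition and the category-count definition agree; the simple difference is one point smaller.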

24
Q

The problem with using the range as a measure of variability is that

A

it is completely determined by the two extreme values and ignores the other scores in the distribution. Thus, a distribution with one unusually large (or small) score will have a large range even if the other scores are all clustered close together. Because the range does not consider all the scores in the distribution, it often does not give an accurate description of the variability for the entire distribution. For this reason, the range is considered to be a crude and unreliable measure of variability. Therefore, in most situations, it does not matter which definition you use to determine the range.

25
Q

The standard deviation is the most commonly used and the most important measure of

A

variability

26
Q

Standard deviation uses the mean of the distribution as a reference point and measures variability by considering

A

the distance between each score and the mean. In simple terms, the standard deviation provides a measure of the standard, or average, distance from the mean, and describes whether the scores are clustered closely around the mean or are widely scattered.

27
Q

variance

A

equals the mean of the squared deviations. Variance is the average squared distance from the mean.

28
Q

deviation

A

A deviation, or deviation score, is the difference between a score and the mean, and is calculated as: deviation = X - μ. A deviation score is often represented by a lowercase letter x.

29
Q

there are two parts to a deviation score:

A

the sign (+ or −) and the number. The sign (+ or −) tells the direction from the mean—that is, whether the score is located above (+) or below (−) the mean, and the number gives the actual distance from the mean. For example, a deviation score of −6 corresponds to a score that is below the mean by a distance of 6 points.

30
Q

the deviation scores add up to

A

zero. This should not be surprising if you remember that the mean serves as a balance point for the distribution. The total of the distances above the mean is exactly equal to the total of the distances below the mean. Thus, the total for the positive deviations is exactly equal to the total for the negative deviations, and the complete set of deviations always adds up to zero.
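
Worked example (a sketch with made-up scores): computing each deviation and confirming that the positive and negative deviations balance, so the full set adds up to zero.

```python
scores = [8, 1, 3, 0]                    # hypothetical population
mu = sum(scores) / len(scores)           # mean = 3.0

deviations = [x - mu for x in scores]    # X - μ for each score

total_above = sum(d for d in deviations if d > 0)    # distances above the mean
total_below = -sum(d for d in deviations if d < 0)   # distances below the mean

print(deviations)                 # [5.0, -2.0, 0.0, -3.0]
print(total_above, total_below)   # 5.0 5.0
print(sum(deviations))            # 0.0
```

With a mean that is not a whole number, floating-point rounding can leave the sum a hair away from zero, so a tolerance check is safer than testing for exact equality in that case.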

31
Q

Because the sum of the deviations is always zero, the mean of the deviations is also

A

zero and is of no value as a measure of variability. Specifically, the mean of the deviations is zero if the scores are closely clustered and it is zero if the scores are widely scattered. (You should note, however, that the constant value of zero is useful in other ways. Whenever you are working with deviation scores, you can check your calculations by making sure that the deviation scores add up to zero.)

32
Q

the process of squaring deviation scores does more than simply get rid of plus and minus signs. It results in a measure of variability based on squared distances. Although variance is valuable for some of the inferential statistical methods

A

the concept of squared distance is not an intuitive or easy-to-understand descriptive measure. For example, it is not particularly useful to know that the squared distance from New York City to Boston is 26,244 miles squared. The squared value becomes meaningful, however, if you take the square root. Therefore, we continue the process one more step.

33
Q

Standard deviation

A

is the square root of the variance and provides a measure of the standard, or average distance from the mean.

34
Q

Because the standard deviation and variance are defined in terms of distance from the mean, these measures of variability are used only with

A

numerical scores that are obtained from measurements on an interval or a ratio scale. Recall from Chapter 1 that these two scales are the only ones that provide information about distance; nominal and ordinal scales do not. Also, recall from Chapter 3 that it is inappropriate to compute a mean for ordinal data and it is impossible to compute a mean for nominal data. Because the mean is a critical component in the calculation of standard deviation and variance, the same restrictions that apply to the mean also apply to these two measures of variability. Specifically, the mean, the standard deviation, and the variance should be used only with numerical scores from interval or ratio scales of measurement.

35
Q

SS, or sum of squares, is the

A

sum of the squared deviation scores.

36
Q

To find the sum of the squared deviations, the formula instructs you to perform the following sequence of calculations:

A
  1. Find each deviation score: (X - μ)
  2. Square each deviation score: (X - μ)^2
  3. Add the squared deviations.
    The result is SS, the sum of the squared deviations.
    Definitional formula: SS = Σ(X - μ)^2
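
Worked example (a sketch with made-up scores): the three steps above, carried out one at a time so each intermediate column is visible.

```python
scores = [1, 0, 6, 1]                     # hypothetical population
mu = sum(scores) / len(scores)            # population mean = 2.0

deviations = [x - mu for x in scores]     # step 1: X - μ
squared = [d ** 2 for d in deviations]    # step 2: (X - μ)^2
ss = sum(squared)                         # step 3: SS = Σ(X - μ)^2

for x, d, sq in zip(scores, deviations, squared):
    print(f"X = {x}   X - μ = {d:+.1f}   (X - μ)^2 = {sq:.1f}")
print("SS =", ss)                         # SS = 22.0
```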
37
Q

Although the definitional formula is the most direct method for computing SS, it can be awkward to use. In particular, when the mean is not a whole number, the deviations all contain decimals or fractions, and the calculations become difficult. In addition, calculations with decimal values introduce the opportunity for rounding error, which can make the result less accurate. For these reasons, an alternative formula has been developed for computing SS. The alternative, known as the

A

computational formula, performs calculations with the scores (not the deviations) and therefore minimizes the complications of decimals and fractions.

38
Q

Computational formula to find SS for the population

A

The first part of this formula directs you to square each score and then add the squared values, ΣX^2. In the second part of the formula, you find the sum of the scores, ΣX, then square this total and divide the result by N. Finally, subtract the second part from the first.
SS = ΣX^2 - ((ΣX)^2/N)

39
Q

the two formulas produce exactly the same value for SS. Although the formulas look different, they are in fact equivalent. The definitional formula provides the most direct representation of the concept of SS; however, this formula can be awkward to use, especially if the mean includes a fraction or decimal value. If you have a small group of scores and the mean is a whole number, then the definitional formula is fine; otherwise

A

the computational formula is usually easier to use.
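
Worked example (a sketch): both SS formulas applied to made-up scores whose mean is not a whole number (μ = 3.25). The function names are illustrative. Exact fractions from Python's standard `fractions` module are used so the comparison is not affected by rounding.

```python
from fractions import Fraction

def ss_definitional(scores):
    """SS = Σ(X - μ)^2, working with the deviations."""
    mu = Fraction(sum(scores), len(scores))
    return sum((x - mu) ** 2 for x in scores)

def ss_computational(scores):
    """SS = ΣX^2 - (ΣX)^2 / N, working with the scores themselves."""
    n = len(scores)
    return sum(x ** 2 for x in scores) - Fraction(sum(scores) ** 2, n)

scores = [1, 2, 4, 6]    # ΣX = 13, ΣX^2 = 57

print(float(ss_definitional(scores)))    # 14.75
print(float(ss_computational(scores)))   # 57 - 169/4 = 14.75
```

The computational version never forms the decimal deviations (-2.25, -1.25, 0.75, 2.75), which is where hand calculation tends to go wrong.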

40
Q

A sample statistic is said to be biased if, on the average, it consistently

A

overestimates or underestimates the corresponding population parameter.

41
Q

The goal of inferential statistics is to use the limited information from samples to draw general conclusions about populations. The basic assumption of this process is that samples should be representative of the populations from which they come. This assumption poses a special problem for variability because

A

samples consistently tend to be less variable than their populations. Notice that a few extreme scores in the population tend to make the population variability relatively large. However, these extreme values are unlikely to be obtained when you are selecting a sample, which means that the sample variability is relatively small. The fact that a sample tends to be less variable than its population means that sample variability gives a biased estimate of population variability. This bias is in the direction of underestimating the population value rather than being right on the mark.

42
Q

Fortunately, the bias in sample variability is consistent and predictable, which means it can be

A

corrected. For example, if the speedometer in your car consistently shows speeds that are 5 mph slower than you are actually going, it does not mean that the speedometer is useless. It simply means that you must make an adjustment to the speedometer reading to get an accurate speed. In the same way, we will make an adjustment in the calculation of sample variance. The purpose of the adjustment is to make the resulting value for sample variance an accurate and unbiased representative of the population variance.

43
Q

the sample formula has exactly the same structure as the population formula and instructs you to find the sum of the squared deviations using the following three steps:

A
1. Find the deviation from the mean for each score: 
deviation = X - M
2. Square each deviation: 
squared deviation = (X - M)^2
3. Add the squared deviations: 
SS = Σ(X - M)^2 (Definitional formula)
44
Q

Computational formula to find SS for the sample

A

SS = ΣX^2 - ((ΣX)^2/n)

45
Q

the sample formulas divide by n − 1 unlike the population formulas, which divide by N

A

This is the adjustment that is necessary to correct for the bias in sample variability. The effect of the adjustment is to increase the value you will obtain. Dividing by a smaller number (n − 1 instead of n) produces a larger result and makes sample variance an accurate and unbiased estimator of population variance.

46
Q

the formulas for sample variance and standard deviation were constructed so that the sample variability provides a good estimate of population variability. For this reason, the sample variance is often called

A

estimated population variance, and the sample standard deviation is called estimated population standard deviation. When you have only a sample to work with, the variance and standard deviation for the sample provide the best possible estimates of the population variability.

47
Q

degrees of freedom

A

For a sample of n scores, the degrees of freedom, or df, for the sample variance are defined as df = n - 1. The degrees of freedom determine the number of scores in the sample that are independent and free to vary.
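
Worked example (a sketch with made-up numbers): why only n - 1 scores are free to vary once the sample mean is known. The first n - 1 scores can take any values, but the last one is then forced by the requirement that the scores total n × M.

```python
n = 3
sample_mean = 5
free_scores = [3, 4]     # the first n - 1 scores, chosen arbitrarily

# The total must equal n * M = 15, so the final score is fixed.
last_score = n * sample_mean - sum(free_scores)
sample = free_scores + [last_score]
df = n - 1

print(last_score)   # 8
print(df)           # 2
```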

48
Q

To calculate sample variance (mean squared deviation), we find the sum of the squared deviations (SS) and divide by the number of scores that are free to vary. This number is n - 1 = df. Thus, the formula for sample variance is

A

s^2 = sum of squared deviations/number of scores free to vary = SS/(n - 1)

49
Q

unbiased

A

A sample statistic is unbiased if the average value of the statistic is equal to the population parameter. (The average value of the statistic is obtained from all the possible samples for a specific sample size, n.)
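
Worked example (a sketch): checking unbiasedness by listing every possible sample. The tiny population (0, 3, 9) and the sample size n = 2, drawn with replacement, are assumptions chosen to keep the enumeration short. Exact fractions avoid rounding.

```python
from fractions import Fraction
from itertools import product

population = [0, 3, 9]    # hypothetical population, μ = 4
n = 2                     # sample size

def ss(scores):
    """Sum of squared deviations from the mean of the given scores."""
    mean = Fraction(sum(scores), len(scores))
    return sum((x - mean) ** 2 for x in scores)

pop_variance = ss(population) / len(population)    # σ^2 = 42/3 = 14

# All 9 possible samples of size 2, sampling with replacement
samples = list(product(population, repeat=n))

# Average of the sample variances, dividing SS by n - 1 versus by n
mean_unbiased = sum(ss(s) / (n - 1) for s in samples) / len(samples)
mean_biased = sum(ss(s) / n for s in samples) / len(samples)

print(pop_variance, mean_unbiased, mean_biased)    # 14 14 7
```

Dividing by n - 1 gives sample variances that average exactly the population variance; dividing by n gives an average that underestimates it.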

50
Q

Although no individual sample is likely to have a mean and variance exactly equal to the population values, both the sample mean and the sample variance, on average, do provide

A

accurate estimates of the corresponding population values.

51
Q

Adding a constant to each score

A

does not change the standard deviation.

52
Q

Multiplying each score by a constant causes the standard deviation to be

A

multiplied by the same constant
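
Worked example (a sketch with made-up scores): the two transformation rules from the last two cards, checked with `statistics.pstdev`. The constants 10 and 3 are arbitrary.

```python
import math
import statistics

scores = [2, 4, 4, 6]                    # hypothetical data
sd = statistics.pstdev(scores)           # original standard deviation

shifted = [x + 10 for x in scores]       # add a constant to every score
scaled = [x * 3 for x in scores]         # multiply every score by a constant

sd_shifted = statistics.pstdev(shifted)  # unchanged
sd_scaled = statistics.pstdev(scaled)    # 3 times the original

print(sd, sd_shifted, sd_scaled)
```

Adding a constant moves every score and the mean by the same amount, so the distances from the mean stay the same. For a negative multiplier, the standard deviation is multiplied by the constant's absolute value, since a standard deviation is never negative.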

53
Q

Standard deviation is primarily a descriptive measure; it describes how variable, or

A

how spread out, the scores are in a distribution. Behavioral scientists must deal with the variability that comes from studying people and animals. People are not all the same; they have different attitudes, opinions, talents, IQs, and personalities. Although we can calculate the average value for any of these variables, it is equally important to describe the variability. Standard deviation describes variability by measuring distance from the mean. In any distribution, some individuals will be close to the mean, and others will be relatively far from the mean. Standard deviation provides a measure of the typical, or standard, distance from the mean.

54
Q

As a rule of thumb, roughly 70% of the scores in a distribution are located within a distance of one standard deviation from the mean, and almost all of the scores (roughly 95%) are

A

within two standard deviations of the mean. In this example, the standard distance from the mean is s = 4 points so your image should have most of the boxes within 4 points of the mean, and nearly all of the boxes within 8 points.
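
Worked example (a sketch on simulated data, not the example this card refers to): counting how many scores fall within one and two standard deviations of the mean. The scores are drawn from a normal distribution with an assumed mean of 50 and a standard deviation of 4; the rule of thumb fits roughly normal distributions and can fail for strongly skewed ones.

```python
import random
import statistics

random.seed(0)                                            # reproducible draw
scores = [random.gauss(50, 4) for _ in range(10_000)]     # simulated scores

mean = statistics.mean(scores)
sd = statistics.stdev(scores)

# Proportion of scores within 1 and within 2 standard deviations of the mean
within_1 = sum(abs(x - mean) <= sd for x in scores) / len(scores)
within_2 = sum(abs(x - mean) <= 2 * sd for x in scores) / len(scores)

print(f"within 1 SD: {within_1:.1%}")
print(f"within 2 SD: {within_2:.1%}")
```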

55
Q

the relative position of a score depends in

A

part on the size of the standard deviation.

56
Q

In general, low variability means that existing patterns can be seen clearly, whereas high variability tends to

A

obscure any patterns that might exist.

57
Q

In the context of inferential statistics, the variance that exists in a set of sample data is often classified as error variance. This term is used to indicate that the sample variance represents

A

unexplained and uncontrolled differences between scores. As the error variance increases, it becomes more difficult to see any systematic differences or patterns that might exist in the data. An analogy is to think of variance as the static that occurs on a radio station or a cell phone when you enter an area of poor reception. In general, variance makes it difficult to get a clear signal from the data. High variance can make it difficult or impossible to see a mean difference between two sets of scores, or to see any other meaningful patterns in the results from a research study.

58
Q

Is it possible to obtain a negative value for the variance or the standard deviation?