Measurement Flashcards
mean
expected value, measure of central tendency
standard deviation
In statistics and probability theory, standard deviation (represented by the Greek letter sigma, σ) shows how much variation or dispersion exists from the average (mean), or expected value. A low standard deviation indicates that the data points tend to be very close to the mean; a high standard deviation indicates that the data points are spread out over a large range of values.
The standard deviation of a random variable, statistical population, data set, or probability distribution is the square root of its variance.
Square root of the variance
variance
The variance is a measure of how far a set of numbers is spread out. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean (expected value). In particular, the variance is one of the moments of a distribution.
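The variance of a random variable X is its second central moment, the expected value of the squared deviation from the mean μ = E[X]: $\operatorname{Var}(X) = E[(X - \mu)^2]$
For a sample, the variance is estimated as $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$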
s^2 = the sample variance. Σ = summation, which means the sum of every term in the equation after the summation sign. x_i = a sample observation; this represents every term in the set. x̄ = the mean; this represents the average of all the numbers in the set. n = the sample size; you can think of this as the number of terms in the set.
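The sample variance formula on this card can be sketched in Python as below. The function names and the data set are made up for illustration; only the formula itself comes from the card.

```python
from math import sqrt


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_std(values):
    """The standard deviation is the square root of the variance."""
    return sqrt(sample_variance(values))


# Hypothetical data: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
variance = sample_variance(data)  # 32 / 7
std_dev = sample_std(data)
```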
frequency distribution
Example: Newspapers
These are the numbers of newspapers sold at a local shop over the last 10 days:
22, 20, 18, 23, 20, 25, 22, 20, 18, 20
Let us count how many of each number there is:
Papers Sold    Frequency
18             2
19             0
20             4
21             0
22             2
23             1
24             0
25             1
It is also possible to group the values. Here they are grouped in 5s:
Papers Sold    Frequency
15-19          2
20-24          7
25-29          1
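The newspaper counts above can be reproduced with a short Python sketch. The variable names are illustrative, and the group boundaries (starting at 15, width 5) are taken from the grouped table.

```python
from collections import Counter

papers_sold = [22, 20, 18, 23, 20, 25, 22, 20, 18, 20]

# Ungrouped frequencies, including values in the range that never occurred.
counts = Counter(papers_sold)
frequency = {
    value: counts.get(value, 0)
    for value in range(min(papers_sold), max(papers_sold) + 1)
}

# Grouped frequencies in bins of width 5, starting at 15.
grouped = Counter()
for value in papers_sold:
    low = 15 + ((value - 15) // 5) * 5
    grouped[f"{low}-{low + 4}"] += 1

for label, count in sorted(grouped.items()):
    print(label, count)
```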
percentile
The 20th percentile is the value (or score) below which 20 percent of the observations may be found
If a score is in the 86th percentile, it is higher than 86% of the other scores.
1 Sort the test scores so they are in order from lowest to highest score. Normally this is done by entering the scores in a computer spreadsheet and then clicking on the sort command. You can do this manually by listing the possible scores on the test in order and then making a hash mark beside the appropriate score for each test.
2 Start to calculate the percentile of your test score (as an example we’ll stick with your score of 87). The formula to use is (L / N) × 100 = P, where L is the number of tests with scores less than 87, N is the total number of test scores (here 150), and P is the percentile. Count the number of test scores that are less than 87. We’ll assume the number is 113. This gives us L = 113 and N = 150.
3 Divide out L/N to get the decimal equivalent (113/150 = 0.753). Multiply this by 100 (0.753 × 100 = 75.3).
4 Discard the digits to the right of the decimal point. For 75.3 this leaves 75. This is the percentile of a score of 87 and means you did better than 75% of the people who took the test. Not bad at all!
5 Calculate the score which is at a given percentile. Let’s say you want to know what the median test score is (the test score for which 50% of the students scored less and 50% scored as much or higher). We use the same variables but a slightly different equation. The formula is (P / 100) × N = L. In our example, P = 50 and N = 150, so we have (50 / 100) × 150 = 75.
6 Count the number of test scores starting with the lowest until you get to 75. The next higher score (#76) is the score at the 50th percentile.
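The six steps above can be sketched in Python as follows. The function names are hypothetical, and the score list is made up purely so that it matches the card's counts (113 scores below 87, 150 scores in total).

```python
def percentile_rank(scores, score):
    """Percentile of `score`: (L / N) x 100, with the decimals discarded."""
    below = sum(1 for s in scores if s < score)  # L
    return below * 100 // len(scores)            # integer division drops the decimals


def score_at_percentile(scores, percentile):
    """Count L = (P / 100) x N scores from the lowest, then take the next one."""
    ordered = sorted(scores)
    count = percentile * len(ordered) // 100     # L
    index = min(count, len(ordered) - 1)         # score number L + 1, 0-based
    return ordered[index]


# Made-up scores chosen only to match the card: 113 below 87, 150 in total.
scores = [50] * 113 + [87] + [95] * 36
rank = percentile_rank(scores, 87)
median_score = score_at_percentile(scores, 50)
print(rank)
```

This follows the counting rule on the card; other definitions of percentile (for example ones that interpolate between scores) can give slightly different answers.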
calculate z-score
z = (raw score - population mean) / (population standard deviation), that is, $z = \frac{x - \mu}{\sigma}$
Or
z = deviation / standard deviation
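As a minimal sketch, the z-score formula translates directly to Python. The raw score, mean, and standard deviation used here are made-up numbers for illustration.

```python
def z_score(raw_score, population_mean, population_std):
    """Deviation from the mean, expressed in standard deviation units."""
    return (raw_score - population_mean) / population_std


# Hypothetical values: a score of 87 in a population with mean 75 and SD 8.
z = z_score(87, 75, 8)
print(z)
```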
covariance
1 Calculate the mean, or average, of the first variable. This is done by adding all the data points and then dividing by the number of data points. For example, take the data sets {1, 3, 3, 5} and {12, 12, 11, 7} for the variables X and Y respectively. The mean for the first variable, X, is: (1 + 3 + 3 + 5) / 4 = 3.
2 Calculate the mean for the second variable the same way you did for the first. Continuing the example, the mean for the second variable, Y, is: (12 + 12 + 11 + 7) / 4 = 10.5.
3 Multiply each data point for the first variable by the corresponding data point for the second variable. Again for the example: {12 x 1, 12 x 3, 11 x 3, 7 x 5} = {12, 36, 33, 35}.
4 Calculate the mean of the data set you created in the step above. This is the mean (XY). Continuing the example: (12 + 36 + 33 + 35) / 4 = 29.
5 Multiply the mean of X by the mean of Y. For the example, this is 3 x 10.5 = 31.5.
6 Subtract the product of the means of X and Y, as calculated in Step 5, from the mean (XY), as calculated in Step 4. This will give you the covariance. Finishing the example: 29 - 31.5 = -2.5. This is a negative covariance, indicating that when one variable increased, the other tended to decrease.
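The covariance steps above can be sketched in Python using the card's own data. The function name is illustrative. Note that this shortcut, mean(XY) minus mean(X) times mean(Y), divides by n, so it gives the population covariance rather than the n - 1 sample version.

```python
def covariance(xs, ys):
    """Population covariance: mean(XY) - mean(X) * mean(Y)."""
    n = len(xs)
    mean_x = sum(xs) / n                                # step 1
    mean_y = sum(ys) / n                                # step 2
    products = [x * y for x, y in zip(xs, ys)]          # step 3
    mean_xy = sum(products) / n                         # step 4
    return mean_xy - mean_x * mean_y                    # steps 5 and 6


x = [1, 3, 3, 5]
y = [12, 12, 11, 7]
cov_xy = covariance(x, y)
print(cov_xy)
```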
correlation
Pearson correlation
Let us call the two sets of data “x” and “y” (in our case Temperature is x and Ice Cream Sales is y):
Step 1: Find the mean of x, and the mean of y
Step 2: Subtract the mean of x from every x value (call them "a"), do the same for y (call them "b")
Step 3: Calculate: a × b, a^2 and b^2 for every value
Step 4: Sum up a × b, sum up a^2 and sum up b^2
Step 5: Divide the sum of a × b by the square root of [(sum of a^2) × (sum of b^2)]
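The five Pearson steps can be sketched in Python as below. The card does not list the temperature and ice cream sales figures, so the two short lists here are made-up numbers used only to exercise the steps.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation following the five steps on the card."""
    mean_x = sum(xs) / len(xs)                          # step 1
    mean_y = sum(ys) / len(ys)
    a = [x - mean_x for x in xs]                        # step 2
    b = [y - mean_y for y in ys]
    sum_ab = sum(ai * bi for ai, bi in zip(a, b))       # steps 3 and 4
    sum_a2 = sum(ai ** 2 for ai in a)
    sum_b2 = sum(bi ** 2 for bi in b)
    return sum_ab / sqrt(sum_a2 * sum_b2)               # step 5


# Hypothetical data, not the temperature/sales values the card refers to.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)  # 6 / sqrt(10 * 6)
```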
DsqCAR
Deviation from Mean
(Squared)
Collect (sum) the squared deviations across subjects,
Average the sum of squares (variance),
Root: take the square root (standard deviation), roughly the average distance from the mean
Calculate standard deviation
9, 2, 5, 4, 12, 7, 8, 11, 9, 3, 7, 4, 12, 5, 4, 10, 9, 6, 9, 4
Calculate the sample standard deviation of the length of the crystals.
Calculate the mean of the data. Add up all the numbers and divide by the total number of data points. (9+2+5+4+12+7+8+11+9+3+7+4+12+5+4+10+9+6+9+4) / 20 = 140/20 = 7
Subtract the mean from each data point (or the other way around, if you prefer; you will be squaring this number, so it does not matter if it is positive or negative), then square the result.
(9 - 7)^2 = (2)^2 = 4
(2 - 7)^2 = (-5)^2 = 25
(5 - 7)^2 = (-2)^2 = 4
(4 - 7)^2 = (-3)^2 = 9
(12 - 7)^2 = (5)^2 = 25
(7 - 7)^2 = (0)^2 = 0
(8 - 7)^2 = (1)^2 = 1
(11 - 7)^2 = (4)^2 = 16
(9 - 7)^2 = (2)^2 = 4
(3 - 7)^2 = (-4)^2 = 16
(7 - 7)^2 = (0)^2 = 0
(4 - 7)^2 = (-3)^2 = 9
(12 - 7)^2 = (5)^2 = 25
(5 - 7)^2 = (-2)^2 = 4
(4 - 7)^2 = (-3)^2 = 9
(10 - 7)^2 = (3)^2 = 9
(9 - 7)^2 = (2)^2 = 4
(6 - 7)^2 = (-1)^2 = 1
(9 - 7)^2 = (2)^2 = 4
(4 - 7)^2 = (-3)^2 = 9
Calculate the mean of the squared differences, dividing by n - 1 = 19 because this is a sample. (4+25+4+9+25+0+1+16+4+16+0+9+25+4+9+9+4+1+4+9) / 19 = 178/19 = 9.368
This value is the sample variance. The sample variance is 9.368.
The sample standard deviation is the square root of the sample variance. Use a calculator to obtain this number. (9.368)^(1/2) = 3.061
The sample standard deviation is 3.061.
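The worked crystal example can be checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the n - 1 (sample) denominator. The variable names are illustrative.

```python
import statistics

crystal_lengths = [9, 2, 5, 4, 12, 7, 8, 11, 9, 3,
                   7, 4, 12, 5, 4, 10, 9, 6, 9, 4]

mean_length = statistics.mean(crystal_lengths)            # 140 / 20
sample_variance = statistics.variance(crystal_lengths)    # 178 / 19
sample_std = statistics.stdev(crystal_lengths)            # square root of the variance

print(mean_length, round(sample_variance, 3), round(sample_std, 3))
```

For the population versions (dividing by n instead of n - 1), the same module provides `pvariance` and `pstdev`.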
Psychometrics
Psychometrics: a field that studies the design, operation, interpretation/use and evaluation of a measurement instrument or an assessment program using scientific research methodology.
Hubley & Zumbo (2013): “a field of study that focuses on the theory and techniques associated primarily with the measurement of constructs as well as the development, interpretation, and evaluation of tests and measures.”
Two types of measurement error
Random measurement error is the difference between a person’s true score and the observed score that is caused by sporadic and unknown sources in the process of measurement. Random measurement errors are noise: they can neither be identified nor explained, but they can be quantified.
Systematic measurement error is the difference between a person’s true score and the observed score that is caused by certain aspects of the system of measurement operation. Systematic errors are biases: they can be identified, explained, and quantified. A typical example of systematic measurement error is rater (judge) subjectivity.
Reliability
Reliability is the degree of consistency across an infinite number of independent and identical repeated measures of the same phenomenon from the same individual.
If the operation is perfectly reliable, then there will be no difference between one score and another over the infinite repeated measures.
standard error of measurement and true score
In reality, there will be some variation among repeated measures of the same phenomenon from the same individual. The standard deviation of this variation over an infinite number of repeated measures is referred to as the standard error of measurement (SEM), and the mean of the repeated measures is referred to as the true score.
Spearman’s Theoretical Reliability
Ratio of the variance of the true score to the variance of the observed score: Var(true score) / Var(observed score)
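A minimal sketch of this ratio in Python follows. It assumes the classical test theory model, which the card does not state explicitly: observed score = true score + random error, with error uncorrelated with the true score, so the observed variance is the sum of the two. The variance values are made up for illustration.

```python
from math import sqrt


def reliability(var_true, var_observed):
    """Spearman's theoretical reliability: Var(true score) / Var(observed score)."""
    return var_true / var_observed


# Hypothetical variances; Var(observed) = Var(true) + Var(error) is an assumption.
var_true = 80.0
var_error = 20.0
var_observed = var_true + var_error

r = reliability(var_true, var_observed)

# Under the same assumption, the SEM equals SD(observed) * sqrt(1 - reliability),
# which is just the square root of the error variance.
sem = sqrt(var_observed) * sqrt(1 - r)
```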