Chapter 15- Assessments and Grading Flashcards
standardized tests
tests given under uniform conditions and scored according to uniform procedures, teachers do not have much say in selecting these tests
classroom assessment
selected and created by teachers and can take many different forms: unit tests, essays, portfolios, projects, performances, oral presentations; critical because teaching involves making many kinds of judgements
measurement
an evaluation expressed in quantitative (number) terms, tells how much, how often, or how well by providing scores, ranks, or ratings, compares one student’s performance on a task to a standard or to the performance of other students
assessment
procedures used to obtain information about student performance, broader than testing and measurement, includes all kinds of ways to sample and observe students’ skills, knowledge, and abilities, formal or informal
formative assessment
ungraded testing used before or during instruction to aid in planning and diagnosis, purpose is to guide teacher in planning and improving instruction and to help students improve learning, helps form instruction and provides feedback
pretest
type of formative test for assessing students’ knowledge, readiness, and abilities, identifies areas of weakness, not graded
summative assessment
testing that follows instruction and assesses achievement, purpose is to inform the teacher and the student about the level of accomplishment attained, provides a summary for accomplishment
norm-referenced testing
testing in which scores are compared with the average performance of others, people who have taken the test provide a backdrop for determining the meaning of an individual’s score
norm group
large sample of students serving as a comparison group for scoring tests, can determine if the test score is above, below, or around the average for the group the person belongs to, four types: class/school, school district, national samples, international samples, formed so that a variety of demographic samples are included, tend to encourage competition and do not say whether a student is ready to move on to the next level
criterion-referenced testing
testing in which scores are compared to a set performance standard, benchmark, or minimum passing score, measures the accomplishment of very specific objectives, results tell exactly what the student can or can’t do, best for teaching basic skills, standard can be arbitrary and based on the teacher’s experience
reliability
consistency of test results, scores are reliable if a test gives a consistent and stable reading of a person’s abilities from one occasion to the next; test-retest reliability: giving the same test on two separate occasions indicates stability; alternate-form reliability: a group of people take two equivalent forms of a test and the scores on both are comparable; split-half reliability: indicates the internal consistency or precision of the test by comparing performance on half of the test questions with performance on the other half
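A minimal sketch of the split-half idea. The card only says to compare performance on the two halves; using a Pearson correlation as the comparison, and splitting into odd- and even-numbered items, are assumptions made here for illustration. The names `pearson` and `split_half_reliability` and the sample data are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def split_half_reliability(item_scores):
    """item_scores holds one list per student of per-item scores (e.g. 0 or 1).

    Assumption: the two halves are the odd- and even-numbered items.
    """
    first_half = [sum(row[0::2]) for row in item_scores]
    second_half = [sum(row[1::2]) for row in item_scores]
    return pearson(first_half, second_half)

# Hypothetical data: three students, four items each.
students = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
print(split_half_reliability(students))
```

A value near 1 suggests the two halves rank students the same way; the same `pearson` helper could compare two administrations (test-retest) or two forms (alternate-form).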
standard error of measurement
hypothetical estimate of variation in scores if testing were repeated, the more reliable a test is, the less error there will be in the score we observe, a reliable test is defined as one with a small standard error of measurement
confidence interval
range of scores within which an individual’s particular score is likely to fall, calculated using the standard error of measurement to identify a range of scores above and below the actual test score, width of the interval represents how much a student’s score might vary due to errors of measurement
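A small sketch of how such a band could be built from the standard error of measurement. The function name `confidence_band`, the multiplier `n_sems`, and the numbers used are all assumptions for illustration; the card does not say how many standard errors to use.

```python
def confidence_band(observed_score, sem, n_sems=1.0):
    """Return (low, high) around an observed score.

    n_sems is an assumed multiplier: how many standard errors of
    measurement to extend above and below the observed score.
    """
    margin = n_sems * sem
    return (observed_score - margin, observed_score + margin)

# Hypothetical values: observed score 80, standard error of measurement 3.
low, high = confidence_band(80, 3)
print(low, high)
```

A more reliable test has a smaller standard error, so the same call would return a narrower band.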
true score
the score the student would get if the measurement were completely accurate and error-free, confidence interval may not include the true score
validity
degree to which a test measures what it is intended to measure, to have validity, the decisions and inferences based on the test must be supported by evidence, judged in relation to a particular use or purpose, content-related evidence: test measures the skills covered in the course so test questions are about the important topics, criterion-related evidence: scores correlate with academic performance in school, construct-related evidence: demonstrated when the results of a test correlate with the results of another well-established, valid measure of that same construct, a test must be reliable before it can be valid, but reliability does not guarantee validity
assessment bias
qualities of an assessment instrument that offend or unfairly penalize a group of students because of the students’ gender, SES, race, ethnicity, etc., can arise from many factors such as content, language, or examples that might distort the performance of a group, the questions asked may centre on experiences and facts more familiar to students from the dominant culture than to students from minority groups
culture-fair or culture-free testing
a test without cultural bias, attempts have been unsuccessful: minority students scored the same or worse, culture cannot be separated from cognition: every student’s learning is embedded in their own culture, and every test question emerges from some kind of cultural knowledge
objective testing
multiple-choice, matching, true/false, short-answer, and fill-in tests, scoring answers does not require interpretation, variety in question types lowers students’ anxieties because the entire grade does not depend on one type of question that a particular student may find difficult
stem
the question part of a multiple-choice item
distractors
wrong answers offered as choices in a multiple-choice item
authentic assessments
assessment procedures that test skills and abilities as they would be applied in real-life situations, test those capabilities and habits we think are essential, and test them in context, make tests replicate the challenges at the heart of each academic discipline
performance assessments
any form of assessment that requires students to carry out an activity or produce a product in order to demonstrate learning, portfolios and exhibitions
portfolio
a collection of student work in an area, showing growth, self-reflection, and achievement, contains anything that demonstrates learning in the area being taught and assessed, portfolios often display unfinished pieces, criterion-referenced
exhibition
a performance test or demonstration of learning that is public and usually takes an extended time to prepare, helps students understand the qualities of good work and recognize those qualities in their own productions and performances, criterion-referenced
scoring rubrics
rules that are used to determine the quality of a student’s performance, shouldn’t be too specific or too general, focused on worthwhile skills that can be taught and assessed, achieve reliability not because they capture underlying agreement among raters, but because the rubrics limit options and thus limit variability in scoring
norm-referenced grading
assessment of students’ achievement in relation to one another, based on comparison with others who also took the course
grading on the curve
norm-referenced grading that compares students’ performance to an average level, distributes grade proportions based on the normal or bell-shaped curve, only a few students receive very high or very low grades and most receive grades between a C+ and B-
criterion-referenced grading
assessment of each student’s mastery of course objectives, grade represents accomplishments, may represent a certain number of objectives met satisfactorily, criteria are usually spelled out in advance
mean
arithmetical average, one way to describe central tendency, a score that is representative of the whole distribution of scores
central tendency
typical score for a group of scores
median
middle score of a group of scores, one way to describe central tendency, half the scores are higher, half the scores are lower
mode
most frequently occurring score, one way to describe central tendency
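A short sketch of the three measures of central tendency using Python's standard `statistics` module. The score list is hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical set of test scores.
scores = [70, 85, 85, 90, 100]

print(mean(scores))    # arithmetical average of all scores
print(median(scores))  # middle score once the scores are sorted
print(mode(scores))    # most frequently occurring score
```

The three values can differ for the same group of scores, which is why each is only "one way" to describe the typical score.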
standard deviation
measure of how widely scores vary from the mean, the larger the standard deviation, the more spread out the scores are in the distribution, distributions with small SDs have less variability
variability
degree of difference or deviation from the mean, distributions with small SDs have less variability
range
distance between the highest and lowest scores in a group
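A minimal sketch comparing two hypothetical groups of scores that share the same mean but differ in variability. Using the population standard deviation (`pstdev`) rather than the sample version is an assumption; the cards do not say which one is meant. `score_range` is a hypothetical helper name.

```python
from statistics import mean, pstdev

def score_range(scores):
    """Distance between the highest and lowest scores."""
    return max(scores) - min(scores)

# Two hypothetical groups with the same mean (50) but different spread.
narrow = [48, 50, 52]
wide = [30, 50, 70]

for group in (narrow, wide):
    print(mean(group), pstdev(group), score_range(group))
```

The group with the larger standard deviation also has the larger range here, but the range depends only on the two extreme scores while the standard deviation uses every score.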
normal distribution
the most commonly occurring distribution, in which scores are distributed evenly around the mean, bell-shaped curve that describes many naturally occurring physical and social phenomena, many scores fall in the middle, giving the curve its bell appearance, mean is the midpoint, about 68% of scores fall within ±1 SD of the mean (34% + 34%), roughly 95% fall within ±2 SD (about 14% + 34% + 34% + 14%)
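The 34% and 14% figures on this card are rounded. A small sketch using `statistics.NormalDist` (Python 3.8+) shows the share of scores within a given number of standard deviations of the mean; `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

def share_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    standard_normal = NormalDist()  # mean 0, SD 1
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

print(round(share_within(1), 4))  # share within 1 SD on either side
print(round(share_within(2), 4))  # share within 2 SDs on either side
```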
percentile rank
percentage of those in the norming sample who scored at or below an individual’s score, shows the percentage of students in the norm group that scored at or below a particular raw score, e.g. if you score the same as or better than three quarters of the students in the norm group, you are at the 75th percentile
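A minimal sketch of the "at or below" definition. The function name `percentile_rank` and the tiny norm group are hypothetical; real norming samples are far larger.

```python
def percentile_rank(raw_score, norm_scores):
    """Percentage of the norm group scoring at or below raw_score."""
    at_or_below = sum(1 for s in norm_scores if s <= raw_score)
    return 100 * at_or_below / len(norm_scores)

# Hypothetical norm group of four students.
norm_group = [50, 60, 70, 80]
print(percentile_rank(70, norm_group))
```

A raw score of 70 is at or above three of the four norm-group scores, so it sits at the 75th percentile of this group.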
grade-equivalent score
measure of grade level based on comparison with norming samples from each grade, obtained from separate norm groups for each grade level, generally listed as numbers like 8.3, 10.2, the whole number is the grade, and the decimal is the tenths of the year, high score means superior mastery of material at the grade level, do not mean the same thing at every grade level
standard scores
scores based on the standard deviation
Z score
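standard score indicating the number of standard deviations above or below the mean, e.g. if the test mean is 78 and you scored 70, your z score is negative because you scored below the mean (it would be -2 if the SD were 4), a z score of 0 means you scored exactly at the mean
z = (raw score - mean) / SD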
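The formula subtracts the mean from the raw score first and then divides by the SD. A short sketch, where the SD of 4 is an assumed value chosen only to make the arithmetic concrete:

```python
def z_score(raw_score, mean, sd):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw_score - mean) / sd

# Hypothetical test: mean 78, SD 4 (the SD is an assumption).
print(z_score(70, 78, 4))  # below the mean, so the z score is negative
print(z_score(78, 78, 4))  # exactly at the mean
print(z_score(86, 78, 4))  # above the mean, so the z score is positive
```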
t score
standard score with a mean of 50 and a standard deviation of 10, t score of 50 means average performance
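Because a T score has a mean of 50 and an SD of 10, it can be obtained from a z score by multiplying by 10 and adding 50. A minimal sketch, with `t_score` as a hypothetical helper name:

```python
def t_score(z):
    """Convert a z score to a T score (mean 50, SD 10)."""
    return 50 + 10 * z

print(t_score(0))     # average performance
print(t_score(-2))    # two SDs below the mean
print(t_score(1.5))   # one and a half SDs above the mean
```

One practical effect of the rescaling is that typical T scores are positive whole-looking numbers rather than small negative or fractional values.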
stanine scores
whole-number scores from 1 to 9, each representing a range of raw scores, 5 is the mean, SD is 2, provide a method of considering a student’s rank because each of the nine scores includes a specific range of percentile scores in the normal distribution, advantageous because it encourages teachers to view a student’s score in more general terms instead of making fine distinctions based on a few point differences
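A rough sketch of one way to approximate a stanine from a z score, using the mean of 5 and SD of 2 given on the card: scale the z score, round to the nearest whole number, and keep the result between 1 and 9. This rounding rule is an assumption for illustration; published tests assign stanines from fixed percentile bands, so results near a band boundary may differ.

```python
from math import floor

def stanine_from_z(z):
    """Approximate stanine (1 to 9) from a z score.

    Assumption: round 5 + 2z to the nearest whole number (halves round up),
    then clamp to the 1..9 range.
    """
    rounded = floor(5 + 2 * z + 0.5)
    return max(1, min(9, rounded))

print(stanine_from_z(0))    # at the mean
print(stanine_from_z(1))    # one SD above the mean
print(stanine_from_z(-3))   # far below the mean, clamped to the lowest stanine
```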
high-stakes testing
standardized tests whose results have powerful influences when used by school administrators, other officials, or employers to make decisions, mismatches between what is taught and what is tested tend to occur, testing narrows curriculum, tests used have to be reliable, valid for the purposes used, and free of bias
accountable
making teachers and schools responsible for student learning, usually by monitoring learning with high-stakes testing
achievement tests
standardized tests measuring how much students have learned in a given content area