5. Measuring Variables and Sampling Flashcards
nominal scale
variables that are categorical in nature. They have no innate ranking or order, and only serve to describe the type of an attribute. If only two types are possible, the variable can also be called binary or boolean.
(eg. gender, marital status, memory strategy, personality type, college major, type of therapy, experimental condition [treatment vs. control] )
variable
a condition or characteristic that can take on different values
ordinal
variables that have an innate ranking, but the distance between the rankings is not necessarily the same.
(eg. position in the family, placement in a race, strength of agreement or disagreement)
interval
a variable which can be quantitatively measured, where the distance between rankings is the same, but the measurement of 0 does not imply the absence of the thing being measured. (ie. an arbitrary 0 point)
(eg. temperature scales (Fahrenheit or Celsius), IQ scores)
ratio measurement
a quantitative measurement with an absolute zero point
(eg. age, Kelvin temperature measurement, response time, height, income)
What are the four distinguishing characteristics of the four levels of measurement?
The four levels of measurement can be distinguished by asking: does the variable describe a type rather than a quantity (nominal); is there a rank order, but with unequal or unknown distances between increments (ordinal); are the distances between increments equal, but with an arbitrary zero point (interval); or are the increments equal and the zero point absolute (ratio)?
What are the two necessary properties for good measurement?
reliability and validity
reliability
the consistency (or stability) of the measurements
validity
the extent to which the measurements actually measure what the researchers intend for them to measure. Accuracy of the interpretations, inferences or actions made on the basis of the measurements.
reliability coefficient
a specific correlation coefficient that acts as an index of the reliability of a measure. The stronger and more positive, the better. (ie. > 0.70 is generally considered acceptable)
What are the types of reliability tests?
test-retest reliability
equivalent-forms reliability
internal consistency
interrater reliability
test-retest reliability
refers to the consistency of measures over time. To evaluate this reliability, the second set of measurements should be taken without any treatment condition present, so that real change is not mistaken for unreliability. It is challenging to determine the appropriate time interval, and generally the longer the time interval, the lower the reliability coefficient will be.
equivalent forms reliability
refers to the consistency of measurement across two different research instruments, designed to measure the same thing. In order to have a high reliability coefficient, individuals should have similar scores on each instrument.
(eg. ACT / SAT or GRE/MAT )
internal consistency reliability
the consistency with which the items on a single instrument measure a single construct. Generally, as the length of the instrument grows, so too does its internal consistency reliability. Measuring this requires only one administration of the instrument. (generally reported as coefficient alpha or Cronbach's alpha)
(eg. personality tests)
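Coefficient alpha can be computed from a single administration: alpha = (k / (k - 1)) x (1 - sum of item variances / variance of total scores), where k is the number of items. A minimal sketch with made-up data for a hypothetical 4-item scale:

```python
from statistics import pvariance

def cronbach_alpha(items):
    """Coefficient (Cronbach's) alpha for a list of item-score columns.

    `items` is a list of lists: items[i][p] = person p's score on item i.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each person's total score
    item_var = sum(pvariance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_var / pvariance(totals))

# Hypothetical 4-item scale answered by 5 people
items = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 3, 5, 3],
    [3, 4, 2, 5, 5],
]
alpha = cronbach_alpha(items)
```

Because the four items rank the five respondents almost identically, alpha is high, consistent with the items measuring a single construct.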
interrater reliability
the consistency or degree of agreement between two or more scorers, raters, judges, etc. (also measured by inter-observer agreement - the percentage of time that different observers’ ratings are in agreement)
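Inter-observer agreement is the simplest index here: the percentage of observations on which the raters give the same rating. A sketch with hypothetical ratings from two judges:

```python
# Hypothetical yes/no ratings of 10 observations by two judges
rater_a = ["yes", "no", "yes", "yes", "no", "yes", "no", "no", "yes", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "no", "yes", "yes", "yes"]

# Percentage of observations on which the two raters agree
agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = 100 * agreements / len(rater_a)  # → 80.0
```

The two judges disagree on 2 of 10 observations, giving 80% agreement; more sophisticated indices (e.g. Cohen's kappa) additionally correct for chance agreement.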
How does a researcher obtain evidence of reliability in measurement?
Depending on the type of reliability they seek to determine :
- repeat the measure on the same group after a time interval - measures test-retest reliability
- measure the degree of agreement between different judges of the same observation - measures interrater reliability
- measure the degree of consistency between items that measure the same construct - measures internal consistency reliability
- measure the degree to which individual scores are consistent across two different instruments - measures equivalent-forms reliability
operational definition
the way a construct is defined, represented, and measured in a research study. (ie. set the definition for the construct for the purposes of the study)
(eg. “disadvantaged people” may be operationalized as “individuals who have incomes below the poverty level for the past six months”)
What is the key challenge in creating an operational definition?
An operational definition must be specific enough to be VALID for the study you are conducting, but not so specific that it lacks ecological validity and therefore has no relevance in the outside world.
“Is your definition truly representative?”
validation
the continuous process of gathering evidence regarding the soundness of inferences made from measurements
content validity
judgement by experts of the degree to which the items, tasks, or questions on a test adequately represent the construct. (ie. prima facie - do they appear to measure it? is anything excluded? is anything extraneous included?) Generally requires the judgement of multiple experts.
multidimensional construct
a construct consisting of two or more dimensions, as contrasted with a unidimensional construct. (ie. the MBTI test measures personality type on the basis of four dimensions - E/I, N/S, F/T, P/J)
Factor analysis
a statistical analysis procedure used to determine the number of dimensions present in a set of items. This is important because it is necessary in order to correctly interpret the resulting scores of the instrument.
homogeneity
the degree to which a set of items measures a single construct or trait.
validity coefficient
a correlation coefficient used in validity evidence
criterion-related validity
the degree to which scores predict or relate to a known criterion such as a future performance or an already-established test
predictive validity
the degree to which scores obtained at one time correctly predict the scores on a criterion at a later time
concurrent validity
the degree to which test scores obtained at one time correctly relate to the scores on a known criterion obtained at approximately the same time
convergent validity evidence
validity evidence based on the degree to which the focal test scores correlate with independent measures of the same construct
discriminant validity evidence
validity evidence based on the degree to which the focal test scores do not correlate with measures of different constructs
known groups validity evidence
the degree to which groups that are known to differ on a construct actually differ according to the test used to measure the construct
How does one obtain evidence of validity based on content?
experts on the construct examine the instrument and determine whether the contents adequately represent the construct
norming group
the reference group upon which reported reliability and validity evidence is based.
What are some sources of information about psychometric tests and measures?
APA PsycINFO, PsycARTICLES
How do researchers gather validity evidence based on internal structure?
They use factor analysis, which indicates how many constructs are present in the set of items, and they assess the homogeneity of each item set by calculating item-to-total correlations and coefficient alpha.
How do researchers obtain evidence of validity based on relations to other variables?
They determine if the scores are related to a known criterion by collecting concurrent and predictive validity evidence.
sample
a set of elements selected from a population
sampling
the process of drawing a sample from a population
representative sample
a sample that resembles the population
equal probability of selection method
sampling method in which each individual element has an equal probability of selection into the sample
statistic
a numerical characteristic of sample data
parameter
a numerical characteristic of a population
sampling error
differences between sample values and the true population parameter
census
collection of data from everyone in the population
sampling frame
a list of all the elements in a population
response rate
the percentage of the individuals selected for a sample who actually participate in the research study
biased sample
a sample that is nonrepresentative
proximal similarity
generalization to people, places, settings, and contexts that are similar to those described in the research study
simple random sampling
a popular and basic equal probability selection method, where every individual has an equal chance of being included in the sample. (eg. random number generator)
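A minimal sketch of simple random sampling, using a hypothetical sampling frame of 100 numbered people and the standard-library random number generator:

```python
import random

# Hypothetical sampling frame: everyone in a population of 100
population = list(range(1, 101))

# Simple random sample of 10: every individual has an equal
# chance of selection, drawn without replacement
sample = random.sample(population, k=10)
```

Because `random.sample` draws without replacement and uniformly, this is an equal probability of selection method.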
stratified random sampling
division of population elements into mutually exclusive groups and then selection of a random sample from each group
stratification variable
the variable on which the population elements are divided for the purpose of stratified sampling
proportional stratified sampling
where the stratified samples drawn match the proportions of the stratification variables in the population.
(ie. 20% are women, so 2 of the 10 in the sample are women)
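The proportional allocation above is simple arithmetic: each stratum's sample quota is the overall sample size times that stratum's share of the population. A sketch with a hypothetical population of 100 (20 women, 80 men) and a sample of 10:

```python
import random

# Hypothetical population of 100, stratified by gender:
# ids 0-19 are women (20%), ids 20-99 are men (80%)
strata = {"women": list(range(20)), "men": list(range(20, 100))}
n = 10  # desired total sample size
pop_size = sum(len(members) for members in strata.values())

sample = []
for group, members in strata.items():
    k = round(n * len(members) / pop_size)  # proportional allocation
    sample.extend(random.sample(members, k))
```

With these numbers the allocation is exact: 2 women and 8 men, matching the 20/80 split in the population.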
disproportional stratified sampling
stratified sampling where the sample proportions are made to be different from the population proportions on the stratification variable.
cluster random sampling
sampling method where clusters are randomly selected (eg. neighborhoods, schools, workplaces, etc.)
one stage cluster sampling
clusters are randomly selected and all the elements in the selected clusters constitute the sample
two stage cluster sampling
clusters are randomly selected, and a random sample of elements is drawn from each of the selected clusters
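Two-stage cluster sampling can be sketched in a few lines; the schools and students below are hypothetical:

```python
import random

# Hypothetical frame of 6 schools (clusters), each with its students
schools = {
    "A": ["A1", "A2", "A3", "A4"],
    "B": ["B1", "B2", "B3", "B4"],
    "C": ["C1", "C2", "C3", "C4"],
    "D": ["D1", "D2", "D3", "D4"],
    "E": ["E1", "E2", "E3", "E4"],
    "F": ["F1", "F2", "F3", "F4"],
}

# Stage 1: randomly select 3 of the 6 clusters
chosen = random.sample(list(schools), k=3)

# Stage 2: randomly sample 2 students within each selected cluster
sample = [s for school in chosen for s in random.sample(schools[school], k=2)]
```

One-stage cluster sampling would stop after stage 1 and take every student in the chosen schools instead.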
systematic sampling
the sampling method where one determines the sampling interval (k), randomly selects an element between 1 and k, and then selects every kth element thereafter. (the sampling interval is the population size divided by the desired sample size)
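The procedure above is mechanical enough to sketch directly; the frame of 100 elements is hypothetical:

```python
import random

population = list(range(1, 101))  # hypothetical frame of N = 100 elements
n = 10                            # desired sample size
k = len(population) // n          # sampling interval: k = N / n = 10

start = random.randint(1, k)      # random start between 1 and k
sample = population[start - 1::k] # then every kth element thereafter
```

Note the periodicity risk: if the frame itself cycles with period k (e.g. every 10th listing is a department head), the sample will be systematically biased.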
periodicity
a problematic situation in systematic sampling that can occur if there is a cyclical pattern in the sampling frame
What are some strengths and weaknesses of the major random sampling techniques?
- systematic sampling - simple and convenient to carry out, but vulnerable to periodicity in the sampling frame
- cluster random sampling - practical and inexpensive when the population is dispersed or no element-level frame exists, but produces more sampling error for a given sample size
- stratified random sampling - guarantees representation of each stratum and can reduce sampling error, but requires knowing the stratification variable for every element
- simple random sampling - unbiased and simple to analyze, but requires a complete sampling frame and can be impractical for dispersed populations
convenience sampling
use of individuals who are readily available, volunteer, or are easily recruited for inclusion in a sample (eg. psychology 101 students)
quota sampling
a researcher decides on the desired sample size, and quotas for groups identified in the sample. The researcher then fills the sample by convenience according to those quotas. (eg. 50% men and women in sample from psychology 101 course, even though course is 80% women)
purposive sampling
a researcher specifies the characteristics of the population of interest and then locates individuals who have those characteristics
snowball sampling
each sampled person is asked to identify other potential participants with the inclusion characteristic. (particularly useful for hard-to-find populations)
What are the key characteristics of the different types of non random sampling methods?
- convenience sample - uses whoever is readily available; quick and cheap, but prone to bias
- quota sample - fills predetermined group quotas, but selection within each quota is still by convenience
- purposive sample - deliberately selects individuals who have the specified characteristics of interest
- snowball sample - participants recruit further participants; useful for hard-to-find populations
random selection
selection of participants using a random sampling method
random assignment
placement of participants into experimental conditions on the basis of a chance process.
What is the purpose of random selection?
The purpose of random selection is to obtain a sample that is representative of the population.
What is the purpose of random assignment?
The purpose of random assignment is to produce equivalent groups for use in the experiment.
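Random assignment is distinct from random selection: it places already-recruited participants into conditions by a chance process. A sketch with 20 hypothetical participants and two conditions:

```python
import random

participants = [f"P{i}" for i in range(1, 21)]  # 20 hypothetical participants
random.shuffle(participants)                    # chance process

# Deal the shuffled participants into two groups that are
# equivalent (on average) on all characteristics
treatment = participants[:10]
control = participants[10:]
```

Shuffling then splitting guarantees equal group sizes while keeping every assignment equally likely.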
sample size calculator
a statistical program used to provide a recommended sample size
what are some general suggestions in determining the ideal sample size?
- if the population is less than 100 people, try to include the entire population
- try to get a relatively large sample size when possible
- examine other research studies on the topic to see what has been done before
- use a sample size calculator
- use larger sample sizes when you want to break down the data into multiple subcategories.
- use larger sample sizes when you want to obtain a relatively narrow confidence interval
- certain statistical techniques require larger or smaller sample sizes. (see page 268)
mixed sampling
use of a combination of quantitative and qualitative sampling methods
maximum variation sampling
identification/selection of a wide range of cases for data collection and analysis (eg. psychotherapy clients with high, medium, and low self-esteem)
extreme case sampling
identification/selection of cases from the extremes or poles of a dimension (ie. highest and lowest rank in a class)
homogeneous sample selection
identification and selection of a small and homogeneous group or a set of homogeneous cases for intensive study
(eg. adolescent girls for a focus group on diet and body images)
typical case sampling
finding what is believed to be the typical or average case
critical case sampling
identification/selection of particularly important cases
negative case sampling
cases that you believe will probably disconfirm your generalizations so that you can make sure that you are not just selectively finding cases to support your theory.