Exam 1 Terms Flashcards
What is a case?
An individual unit that is often a person, place, or thing. A row of data usually represents a case.
What are variables?
A variable is a characteristic or measurement that describes the cases. Typically, a column of data represents a variable. Examples are height, weight, age, temperature, and time.
What are the two main types of variables?
categorical and quantitative
What is a categorical variable?
A variable that places each case into one of two or more categories. (ex. gender)
What is a quantitative variable?
A variable that measures a numerical quantity. (ex. GPA, pulse rate, height)
What are the two subsets of quantitative variables?
Continuous and discrete
What is a continuous variable?
A type of quantitative variable that can take on an infinite set of values within some range. (ex. temperature, life expectancy, food calories)
What is a discrete variable?
A type of quantitative variable that can take only a countable set of separate values, usually whole-number counts. (ex. number of babies born in a pregnancy, number of courses you are taking next semester)
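The terms so far (case, variable, categorical, continuous, discrete) can be seen together in a tiny data set. This is only a sketch: the student records and the variable names below are made up for illustration.

```python
# Each dictionary is one case (a row); each key is a variable (a column).
# All values here are hypothetical.
students = [
    {"major": "Biology", "gpa": 3.41, "courses_next_semester": 4},
    {"major": "History", "gpa": 2.87, "courses_next_semester": 5},
    {"major": "Biology", "gpa": 3.90, "courses_next_semester": 3},
]

# How each variable would be classified using the definitions above.
variable_types = {
    "major": "categorical",
    "gpa": "quantitative, continuous",
    "courses_next_semester": "quantitative, discrete",
}

for variable, kind in variable_types.items():
    values = [case[variable] for case in students]  # one column of data
    print(f"{variable} ({kind}): {values}")
```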
What is a population?
The entire set of cases.
What is a sample?
A subset of the population. We collect data for the sample.
What is a parameter?
A number that describes the population. (ex. population mean, mean GPA for an entire class)
What is a statistic?
A number that describes the sample. (ex. sample mean, mean GPA for a selection of students in a class)
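The difference between a parameter and a statistic can be sketched with simulated data. The class of 200 students and their GPAs are invented here; in practice the parameter is usually unknown and the statistic is used to estimate it.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Hypothetical population: GPAs for every student in a class of 200.
population_gpas = [round(rng.uniform(2.0, 4.0), 2) for _ in range(200)]

# Parameter: a number that describes the whole population.
population_mean = statistics.mean(population_gpas)

# Statistic: a number that describes a sample drawn from that population.
sample_gpas = rng.sample(population_gpas, 25)
sample_mean = statistics.mean(sample_gpas)

print(f"parameter (population mean GPA): {population_mean:.2f}")
print(f"statistic (sample mean GPA):     {sample_mean:.2f}")
```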
What is statistical inference?
The process of using data from a sample to gain information about the population.
Why should we take random samples?
A sample that is not selected at random may be prone to bias. The goal is to obtain a sample that is representative of the population.
What is a representative sample?
A subset of the population from which data are collected that accurately reflects the population.
What is bias?
The systematic favoring of certain outcomes.
What is sampling bias?
Systematic favoring of certain outcomes due to the methods employed to obtain the sample.
What is simple random sampling? Why do we do it?
A method of obtaining a sample where every member of the population has an equal chance of being selected (similar to drawing names from a hat). Samples are selected without replacement.
SRS is done to avoid sampling bias and to obtain a sample that’s representative of a population.
ex. if we wanted to research how long PSU students sleep at night, it would be best to randomly select students for the sample rather than only surveying students in an 8 AM class.
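Drawing names from a hat can be sketched with Python's standard `random.sample`, which selects without replacement. The list of 1,000 student ID numbers is a hypothetical stand-in for a real roster.

```python
import random

# Hypothetical list of the whole population: ID numbers for 1,000 students.
student_ids = list(range(1, 1001))

rng = random.Random(42)  # seeded only so the sketch is repeatable

# random.sample draws without replacement, so no student is picked twice,
# and every student has the same chance of ending up in the sample.
srs = rng.sample(student_ids, k=50)

print(sorted(srs)[:10])  # first few selected IDs
```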
What is a convenience sample?
A sample obtained by choosing the cases that are easiest to access. These samples are NOT random and they may NOT represent the intended population.
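A small simulation can show how a convenience sample may miss the population value. The numbers are assumptions made up for this sketch: 200 students in 8 AM classes who average about 6 hours of sleep, and 800 other students who average about 7.5 hours.

```python
import random
import statistics

rng = random.Random(3)  # seeded only so the sketch is repeatable

# Hypothetical population of 1,000 students (hours of sleep per night).
early_class = [rng.gauss(6.0, 0.5) for _ in range(200)]
other_students = [rng.gauss(7.5, 0.5) for _ in range(800)]
population = early_class + other_students

population_mean = statistics.mean(population)

# Convenience sample: only students from the 8 AM classes.
convenience_mean = statistics.mean(rng.sample(early_class, 50))

# Simple random sample: 50 students drawn from the whole population.
srs_mean = statistics.mean(rng.sample(population, 50))

print(f"population mean:         {population_mean:.2f}")
print(f"convenience sample mean: {convenience_mean:.2f}")
print(f"random sample mean:      {srs_mean:.2f}")
```

Under these assumptions the convenience sample underestimates average sleep, while the random sample lands much closer to the population mean.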
Besides convenience sampling, what are other sources of bias?
- non-response bias
- response bias
What is non-response bias?
Bias that arises when individuals who do not participate in a study differ from those who do participate. Reasons for not participating:
- inability to contact individual
- individual chose not to participate
What is response bias?
Bias that arises when individuals participate but do not respond truthfully.
- may do so to align with social norms
- may do so to appease the researcher
What is a confounding variable?
A third variable that may explain the association between two other variables.
Ex. when ice cream sales increase, so do shark attacks. This is association only, not causation. Temperature is a confounding variable here: as it increases, more people buy ice cream and more people go to the beach.
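The ice cream and shark attack example can be simulated. In this sketch both quantities depend only on temperature plus random noise, yet they still end up associated with each other. All numbers, including the "shark attack index", are invented for illustration.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)

rng = random.Random(1)  # seeded only so the sketch is repeatable

# Confounding variable: daily temperature for one hypothetical year.
temps = [rng.uniform(50, 95) for _ in range(365)]

# Neither variable depends on the other; both depend on temperature.
ice_cream_sales = [10 * t + rng.gauss(0, 50) for t in temps]
shark_attack_index = [0.1 * t + rng.gauss(0, 1) for t in temps]

r = pearson(ice_cream_sales, shark_attack_index)
print(f"correlation between ice cream sales and shark attacks: {r:.2f}")
```

The correlation comes out clearly positive even though the code never lets ice cream sales influence shark attacks, or the reverse.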
What are the two main types of studies?
observational and experimental
What is an observational study?
Researchers simply observe the data as they occur, without controlling any of the variables.
These studies almost always have confounding variables.
Because of this, observational studies can almost never be used to establish cause and effect.
ex. Question: Does coffee cause hyperactivity in college students?
A researcher randomly samples students and surveys them about their coffee intake and hyperactivity.
What is an experimental study?
Researchers actively control one or more of the variables of interest, typically by assigning treatments to the cases. Because the researchers manipulate the variable being tested, these studies can be used to establish cause and effect.
Ex. Question: Does coffee cause hyperactivity in college students?
A researcher randomly samples students and randomly assigns them to drink coffee with or without caffeine.
How can confounding variables be avoided?
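By using a randomized experiment. Random assignment tends to balance confounding variables across the treatment groups, so they cannot systematically explain a difference in the outcome.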
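Random assignment for the coffee example can be sketched by shuffling the participants and splitting the list in half. The 20 participant labels are hypothetical.

```python
import random

# Hypothetical participants who were already sampled for the study.
participants = [f"student_{i}" for i in range(1, 21)]

rng = random.Random(7)  # seeded only so the sketch is repeatable

# Shuffle a copy so that chance alone decides who gets which treatment.
shuffled = participants[:]
rng.shuffle(shuffled)

half = len(shuffled) // 2
caffeinated_group = shuffled[:half]  # drinks coffee with caffeine
decaf_group = shuffled[half:]        # drinks coffee without caffeine

print("caffeinated:", caffeinated_group)
print("decaf:      ", decaf_group)
```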