Sampling and Data Flashcards
Average
Also called mean; a number that describes the central tendency of the data
Blinding
not telling participants which treatment a subject is receiving
Categorical Variable
variables that take on values that are names or labels
Cluster Sampling
a method for selecting a random sample: divide the population into groups (clusters), then use simple random sampling to select a set of clusters. Every individual in the chosen clusters is included in the sample.
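The procedure on this card can be sketched in Python. This is a minimal illustration, not a standard routine: the function name `cluster_sample`, the `classrooms` data, and the `seed` parameter are all made up for the example.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick n_clusters whole clusters at random; keep every member of each."""
    rng = random.Random(seed)
    # Simple random sample of cluster labels (not of individuals).
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for label in chosen for member in clusters[label]]

# Hypothetical population: students grouped by classroom.
classrooms = {
    "A": ["Ann", "Abe", "Ali"],
    "B": ["Bea", "Bob"],
    "C": ["Cal", "Cy", "Cam", "Cleo"],
    "D": ["Dee", "Dan"],
}
sample = cluster_sample(classrooms, n_clusters=2, seed=1)
print(sample)
```

Because whole clusters are kept, the sample size depends on which clusters happen to be drawn.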
Continuous Random Variable
a type of random variable that can take on an infinite number of possible values within a given range. These values are typically real numbers and can include any value within an interval or across multiple intervals.
Control Group
a group in a randomized experiment that receives an inactive treatment but is otherwise managed exactly as the other groups
Convenience Sampling
a nonrandom method of selecting a sample; this method selects individuals that are easily accessible and may result in biased data.
Common Statistical Study Problems
- Problems with samples (a sample must be representative of the population)
- Self-selected samples
- Sample size issues
- Undue influence
- Non-response or participation refusal
- Causality (a relationship does not mean one causes the other to occur)
- Self-funded/Self-Interest studies
- Misleading use of data
- Confounding
Cumulative Relative Frequency
The term applies to an ordered set of observations from smallest to largest. The cumulative relative frequency is the sum of the relative frequencies for all values that are less than or equal to the given value.
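As a small worked example, the sketch below builds a cumulative relative frequency table in Python. The data values and the function name are invented for illustration.

```python
from collections import Counter
from itertools import accumulate

def cumulative_relative_frequency(data):
    """Map each distinct value to the share of observations <= that value."""
    counts = Counter(data)
    values = sorted(counts)            # order from smallest to largest
    n = len(data)
    # Accumulate raw counts first, then divide once by n.
    running = accumulate(counts[v] for v in values)
    return {v: c / n for v, c in zip(values, running)}

# Made-up data: hours of sleep reported by 10 students.
hours = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7]
for value, crf in cumulative_relative_frequency(hours).items():
    print(value, crf)
```

The last entry is always 1.0, because every observation is less than or equal to the largest value.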
Data
- a set of observations (a set of possible outcomes);
- Lowercase letters (e.g., x, y) are generally used to represent data values
Descriptive Statistics
organizing and summarizing data
Double-blind experiment
an experiment in which both the subjects of an experiment and the researchers who work with the subjects are blinded
Experimental Unit
any individual or object to be measured
Explanatory Variable
the independent variable in an experiment; the value controlled by researchers
Frequency
the number of times a value of the data occurs
Inferential Statistics
uses probability to determine how confident we can be that our conclusions are correct.
Informed Consent
Any human subject in a research study must be cognizant of any risks or costs associated with the study. The subject has the right to know the nature of the treatments included in the study, their potential risks, and their potential benefits. Consent must be given freely by an informed, fit participant.
Institutional Review Board
a committee tasked with oversight of research programs that involve human subjects
Level of measurement
The way a set of data is measured; data are classified into four levels:
- Nominal scale
- Ordinal scale
- Interval scale
- Ratio scale
Lurking Variable
a variable that has an effect on a study even though it is neither an explanatory variable nor a response variable
Nominal Scale
Qualitative (categorical) data that cannot be ordered
Nonsampling Error
an issue that affects the reliability of sampling data other than natural variation; it includes a variety of human errors including poor study design, biased sampling methods, inaccurate information provided by study participants, data entry errors, and poor analysis.
Numerical Variable
variables that take on values that are indicated by numbers
Ordinal Scale
Similar to nominal scale but can be ordered (e.g., list of top 5 banks in the U.S.)
Parameter
- a numerical characteristic of the whole population that can be estimated by a statistic.
- A key part of statistics is determining how accurately a statistic estimates a parameter
Placebo
an inactive treatment that has no real effect on the response variable
Population
A collection of persons, things, objects, or measurements whose properties are being studied
Probability
- A mathematical tool used to study randomness
- A number between zero and one, inclusive, that gives the likelihood that a specific event will occur
Proportion
the number of successes divided by the total number in the sample
Qualitative Data
a set of observations (a set of possible outcomes); qualitative data has an attribute whose value is indicated by a label
Quantitative Data
a set of numerical observations (a set of possible outcomes)
Quantitative Continuous Data
Data is continuous if it is the result of measuring (such as distance traveled or weight of luggage).
Quantitative Discrete Data
Data is discrete if it is the result of counting (such as the number of students of a given ethnic group in a class or the number of books on a shelf).
Random Assignment
the act of organizing experimental units into treatment groups using random methods
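One possible way to carry out random assignment is to shuffle the units and deal them into groups in turn. The sketch below assumes that approach; the function name, the participant labels, and the group names are hypothetical.

```python
import random

def randomly_assign(units, group_names, seed=None):
    """Shuffle the units, then deal them into groups round-robin."""
    rng = random.Random(seed)
    shuffled = list(units)             # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {name: [] for name in group_names}
    for i, unit in enumerate(shuffled):
        groups[group_names[i % len(group_names)]].append(unit)
    return groups

# Hypothetical experimental units.
units = ["P1", "P2", "P3", "P4", "P5", "P6", "P7"]
groups = randomly_assign(units, ["treatment", "control"], seed=3)
print(groups)
```

Dealing in turn keeps the group sizes within one unit of each other, while the shuffle decides who lands where.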
Random Sampling
a method of selecting a sample that gives every member of the population an equal chance of being selected.
Relative Frequency
the ratio of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes
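A short Python sketch of this ratio, using made-up survey answers; the function name is an assumption for the example.

```python
from collections import Counter

def relative_frequencies(data):
    """Map each value to (times it occurs) / (total number of outcomes)."""
    n = len(data)
    return {value: count / n for value, count in sorted(Counter(data).items())}

# Made-up data: number of siblings reported by 8 people.
siblings = [0, 1, 1, 2, 2, 2, 2, 3]
print(relative_frequencies(siblings))
```

The relative frequencies of all distinct values add up to 1.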
Representative Sample
a subset of the population that has the same characteristics as the population
Response Variable
the dependent variable in an experiment; the value that is measured for change at the end of an experiment
Sample
a subset of the population studied
Sampling Bias
occurs when not all members of the population are equally likely to be selected
Sampling Error
the natural variation that results from selecting a sample to represent a larger population; this variation decreases as the sample size increases, so selecting larger samples reduces sampling error.
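The claim that larger samples reduce sampling error can be checked with a small simulation. The sketch below is only an illustration under assumed settings: the population (the integers 1 to 1000), the number of trials, and the function name are all made up.

```python
import random
import statistics

def spread_of_sample_means(population, sample_size, trials=500, seed=0):
    """Draw many samples and return the standard deviation of their means."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)

# Hypothetical population.
population = list(range(1, 1001))
small = spread_of_sample_means(population, sample_size=10)
large = spread_of_sample_means(population, sample_size=100)
print(small, large)
```

The sample means for size 100 cluster more tightly around the population mean than those for size 10, which is the reduction in sampling error described on the card.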
Sample Types
- Random
- Stratified
- Cluster
- Systematic
Sampling with Replacement
Once a member of the population is selected for inclusion in a sample, that member is returned to the population for the selection of the next individual.
Sampling without Replacement
A member of the population may be chosen for inclusion in a sample only once. If chosen, the member is not returned to the population before the next selection.
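The difference between the two schemes shows up directly in Python's standard `random` module: `choices` draws with replacement, and `sample` draws without replacement. The population and seed below are arbitrary.

```python
import random

rng = random.Random(42)
population = ["A", "B", "C", "D", "E"]

# With replacement: a member can be drawn more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: each member appears at most once.
without_replacement = rng.sample(population, k=5)

print(with_replacement)
print(without_replacement)
```

Drawing all five members without replacement always returns the whole population in some order, while drawing with replacement may repeat some members and miss others.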
Simple Random Sampling
a straightforward method for selecting a random sample; give each member of the population a number. Use a random number generator to select a set of labels. These randomly selected labels identify the members of your sample.
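The steps on this card translate almost line for line into Python. This is a minimal sketch; the function name and the student roster are hypothetical, and labels start at 0 rather than 1 for convenience.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Number every member, draw n distinct labels, return those members."""
    rng = random.Random(seed)
    labels = range(len(population))    # give each member a number
    chosen = rng.sample(labels, n)     # random labels, no repeats
    return [population[i] for i in chosen]

# Hypothetical roster of 30 students.
roster = [f"student_{i}" for i in range(1, 31)]
print(simple_random_sample(roster, n=5, seed=7))
```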
Statistic
a numerical characteristic of the sample; a statistic estimates the corresponding population parameter.
Stratified Sampling
a method for selecting a random sample used to ensure that subgroups of the population are represented adequately; divide the population into groups (strata). Use simple random sampling to identify a proportionate number of individuals from each stratum.
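A proportionate stratified sample might be drawn as sketched below. The function name, the strata, and the 10% sampling fraction are assumptions for illustration, and rounding the per-stratum count is a simplification.

```python
import random

def stratified_sample(strata, fraction, seed=None):
    """Take a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)   # proportionate share (rounded)
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical strata: 40 first-year students and 20 seniors.
strata = {
    "first-year": [f"F{i}" for i in range(40)],
    "senior": [f"S{i}" for i in range(20)],
}
sample = stratified_sample(strata, fraction=0.1, seed=5)
print(sample)
```

Here the 2:1 ratio of first-year students to seniors in the population is preserved in the sample.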
Systematic Sampling
a method for selecting a random sample; list the members of the population. Use simple random sampling to select a starting point in the population. Let k = (number of individuals in the population)/(number of individuals needed in the sample). Choose every kth individual in the list starting with the one that was randomly selected. If necessary, return to the beginning of the population list to complete your sample.
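The procedure above can be sketched as follows. The function name and the example list are hypothetical, and the sketch assumes k is rounded down to a whole number when the population size is not an exact multiple of the sample size.

```python
import random

def systematic_sample(population, n, seed=None):
    """Random start, then every kth member, wrapping around the list."""
    rng = random.Random(seed)
    size = len(population)
    k = size // n                      # assumption: round k down to an integer
    start = rng.randrange(size)        # randomly selected starting point
    # The modulo returns to the beginning of the list when the end is passed.
    return [population[(start + i * k) % size] for i in range(n)]

# Hypothetical list of 20 members; a sample of 5 gives k = 4.
members = list(range(1, 21))
print(systematic_sample(members, n=5, seed=11))
```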
Treatments
different values or components of the explanatory variable applied in an experiment
Variable
a characteristic of interest for each person or object in a population. Usually notated as a capital letter (e.g., X, Y, etc.)
Variation
- Present in any set of data
- Two samples from the same population will generally yield different results