Quantitative exam Flashcards
non equivalent comparison group designs
can be used when we are unable to randomly assign participants to groups but can find an existing group that appears similar to the experimental group
what is a quasi experiment
a research design that has some, but not all, of the characteristics of a true experiment
the missing element is usually the random assignment of subjects to control/experimental groups
experimental group: O1 X O2
comparison group:   O1   O2
(O = observation/measurement, X = intervention)
what are ways to strengthen the internal validity of non equivalent comparison groups design?
the use of multiple pretests:
experimental group: O1 O2 X O3
comparison group:   O1 O2   O3
by using the same pretests at different time points before the intervention begins, we can detect whether one group is already engaged in a change process while the other is not
what is a simple time series design?
a design in which a single population group is studied over a period during which an intervention takes place
it may be a simple pretest/posttest design, or one in which several measurements are made both before and after the intervention
what is a cross sectional study?
a cross sectional study examines a phenomenon by taking a cross section of it at one point in time
what are cross sectional studies useful for ?
they may help rule out many of the alternative explanations through multivariate statistical procedures
what does the case control design do?
compares groups of cases that have had contrasting outcomes, then collects retrospective data about past differences that might explain the difference in outcomes
(also a type of quasi experimental design)
what is an example of a case control study?
a study trying to find out whether people who smoke (the factor) are more likely to get cancer (the outcome).
- the experimental group (the cases) were people with lung cancer.
- the control group were people without lung cancer, and some in each group were smokers
- if a larger proportion of the cases with cancer were smokers than of the controls, this supports the hypothesis that smoking causes cancer
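case-control results are often summarized as an odds ratio. a minimal sketch in python, using hypothetical counts (not real data) for the smoking example above:

```python
# hypothetical case-control counts (illustrative numbers, not real data)
cases_smokers, cases_nonsmokers = 80, 20        # people with lung cancer
controls_smokers, controls_nonsmokers = 30, 70  # people without lung cancer

# odds ratio: how much higher the odds of exposure (smoking) are among cases
odds_ratio = (cases_smokers * controls_nonsmokers) / (cases_nonsmokers * controls_smokers)
# an odds ratio well above 1 supports the hypothesized smoking-cancer link
```

here the odds ratio is about 9.3, i.e. the odds of having smoked are roughly nine times higher among the cases than among the controls.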
what are the pitfalls of carrying out experiments and quasi experiments in social work agencies?
- fidelity of the intervention (the degree of exactness with which something is copied or reproduced)
- contamination of the control condition (the control group and the experimental group interact)
- resistance to the case assignment protocol
- problems in client recruitment and retention
what are qualitative techniques for experimental or quasi experimental research?
-ethnographic shadowing
- participatory observation during training or group supervision (to identify differences between the stated intervention and what is actually done)
- informal conversations with agency staff (to identify problems with research protocol)
- video-taping practitioner-client sessions
- practitioner logs and event logs
what are the importance and limitations of experiments and quasi experimental designs
more likely to derive a robust estimate of the effect of an intervention than a non experimental design
we can test the intervention variables one by one (this is not a holistic approach):
x1 → y
x2 → y
…
xn → y
what is non probability sampling?
the chance of being selected is not equal for every person in the population (example: people might be selected because they happen to be in a certain place at a particular time or have certain conditions)
what is probability sampling?
every person in the population has an equal chance or known probability of being selected to be part of the sample (example: if the sample is 5% of the population, each person has a 1 in 20 chance of being selected)
what are the four major types of non probability sampling?
- availability (accidental or convenience sampling)
- purposive or judgemental sampling
- quota sampling
- snowball sampling
what is availability sampling?
a non probability method that selects those who are available or easy to find
eg: you want to survey homeless people in a community about drug use, but only 3 of the 8 homeless people give you permission to distribute the survey, so you survey only those three
lowest in reliability
what is purposive sampling?
a sampling method in which elements are chosen based on the purpose of the study
you purposively select those respondents who would be able to answer your research question based on your own knowledge of the population
example: you may choose clients who have been particularly successful or unsuccessful in a treatment program
can include an entire population of some limited group (students of the one year program at LU) or a subset of the population (one class of the one year program)
what is quota sampling?
the researcher sets quotas to ensure that the sample represents certain characteristics in proportion to their prevalence in the population
designed to avoid the flaw in availability sampling
you have to know the characteristics of the population ahead of time
example: if you want to have a sample proportional to the population in terms of gender, you have to know the gender ratio of the population, then collect samples until yours matches
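the quota-filling process can be sketched in python. everything here is hypothetical: a made-up population with a known 60/40 gender ratio and quotas for a sample of 50 matching that ratio:

```python
import random

random.seed(1)
# hypothetical population with a known 60/40 gender ratio
population = [{"id": i, "gender": "f" if i % 5 < 3 else "m"} for i in range(1000)]
quotas = {"f": 30, "m": 20}  # sample of 50 in the same 60/40 proportion

sample, filled = [], {"f": 0, "m": 0}
for person in random.sample(population, len(population)):  # shuffled contact order
    g = person["gender"]
    if filled[g] < quotas[g]:      # accept respondents only while their quota is open
        sample.append(person)
        filled[g] += 1
    if filled == quotas:           # stop once every quota is filled
        break
```

unlike stratified random sampling, whoever happens to be reached first fills the quota, so this remains a non probability method.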
what is snowball sampling?
a researcher first identifies one person of the population of interest, then asks that person to find others in the population of interest. each time, the person is asked to refer the researcher to another person and so on.
good for cases when members of the population are hard to find
what are the four major types of probability sampling ?
- simple random sampling
- systematic random sampling
- stratified random sampling (proportionate and disproportionate)
- cluster sampling
what is simple random sampling
the most widely known type of sampling: each member of the population has a statistically equal chance of being selected. if the sample size is big enough, the sample will represent the characteristics of the population
often used when little to nothing is known about the population
eg: random sampling from the phone book
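a minimal sketch in python, with a hypothetical frame of 1000 people and a sample of 100 (each person has an equal 1-in-10 chance):

```python
import random

random.seed(0)
population = list(range(1, 1001))        # a sampling frame of 1000 people
sample = random.sample(population, 100)  # draws 100 distinct members at random
```

`random.sample` draws without replacement, so no one is selected twice.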
what is systematic random sampling
involves the selection of every kth person from a sampling frame, where k, the sampling interval, is calculated as:
k = population size (N) / sample size (n)
if the population is 1000 and the sample is 100, then k = 10. then, starting at a random point on the sampling frame, every 10th person is selected
more precise than simple random sampling
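the interval formula above can be sketched in python with a hypothetical frame of 1000 people:

```python
import random

random.seed(0)
population = list(range(1, 1001))  # sampling frame, N = 1000
n = 100                            # desired sample size
k = len(population) // n           # sampling interval k = N / n = 10
start = random.randrange(k)        # random starting point within the first interval
sample = population[start::k]      # every kth person from the start onward
```

the random start keeps the selection probability equal for everyone in the frame.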
what is stratified random sampling?
the population is first divided into two or more mutually exclusive homogenous subsets based on some categories of variables of interest in the research. then, the appropriate number of respondents are drawn from each subset.
example: you are doing stratified sampling for a community and your interest is religion and your sample is 10% of the population:
             population   sample
christians      580         58
muslims         210         21
jews             30          3
others          180         18
total          1000        100
Statistics Canada often uses stratified sampling
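a proportionate 10% draw like the religion example can be sketched in python; the strata and counts mirror the example and are hypothetical:

```python
import random

random.seed(0)
# hypothetical frame split into mutually exclusive strata, matching the example counts
strata = {
    "christians": list(range(580)),
    "muslims": list(range(210)),
    "jews": list(range(30)),
    "others": list(range(180)),
}
fraction = 0.10  # proportionate 10% sample from every stratum
sample = {name: random.sample(group, int(len(group) * fraction))
          for name, group in strata.items()}
sizes = {name: len(s) for name, s in sample.items()}
```

because each stratum is sampled at the same rate, the sample automatically matches the population's proportions on the stratifying variable.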
what is cluster sampling?
used when ‘natural’ groups are evident in the population
a convenient method when there are no complete and/or updated lists of people
Eg: studying bar employees in thunder bay with a sample of n= 50
-there is no list of employees to draw from, but you have a list of all the bars in thunder bay. you would randomly sample ten bars and then randomly sample 5 employees within each bar.
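the two-stage bar example can be sketched in python; the bars and staff lists are entirely hypothetical:

```python
import random

random.seed(0)
# hypothetical frame: 40 bars, each with its own staff list (unknown until visited)
bars = {f"bar_{i}": [f"bar_{i}_emp_{j}" for j in range(random.randint(5, 20))]
        for i in range(40)}

chosen_bars = random.sample(list(bars), 10)   # stage 1: randomly sample 10 bars
sample = [emp for bar in chosen_bars          # stage 2: 5 employees within each bar
          for emp in random.sample(bars[bar], 5)]
```

only the list of clusters (bars) is needed up front; the employee lists are needed only for the clusters actually selected.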
what is multistage cluster sampling?
you develop an alternative sampling frame, e.g. communities in thunder bay. then, you use a probability method to sample the communities and then select:
- blocks within the community
- streets in the blocks
- houses in the streets
- people in the houses
what is sampling error?
a term describing how accurately a sample represents the population
the statistical error that occurs when the selected sample does not represent the entire population
what is external validity?
when the results derived from a sample can be generalized to the population, we reach external validity
what is survey research?
the administration of questionnaires to a sample of respondents selected from the population
what are the advantages of survey research?
- standardization (easy for quantitative analysis), economy, and the amount of data that can be collected
weaknesses of survey research
being artificial and potentially superficial
what is secondary data analysis?
a form of research in which the data collected and processed in one study are re-analyzed in a subsequent study
what are examples of qualitative sources?
biographies, newspapers, memoirs
what are examples of quantitative sources?
census, survey data
what are the advantages of secondary analysis?
- high quality
- the existing data sets for public use are usually large
- low or no cost, saving money and time
- using other sources can also facilitate a comparison with other data samples and allow multiple sets of data to be combined
what are the limitations of secondary analysis?
- does not permit the process from formulating a research question to designing methods to answer that question
- not feasible for secondary data analyst to engage in the habitual process of making observations and developing concepts
- problem of validity
- problem of reliability
what is content analysis?
a way of transforming qualitative materials into quantitative data
what is the survivorship bias?
the logical error of concentrating on the people or things that made it past some selection process and overlooking those that did not, typically because of their lack of visibility.
what is a nominal variable?
A nominal variable is a type of variable that is used to name, label or categorize particular attributes that are being measured.
eg: gender, marital status, religious affiliation, race, college major, birthplace
these attributes are not ranked in any way. none of them is better or worse, higher or lower, than the others; they are simply different in terms of measurement
what is an ordinal measurement?
refers to a scale whose attributes can be ranked from lowest to highest.
eg: education can be categorized as primary school, middle school, high school, undergraduate, graduate
if the categories are numbered (for example, 1 = primary school, 2 = middle school, and so on), the numbers indicate rank order only; the distances between them are not meaningful
what is an interval measurement?
an interval measurement refers to a scale whose attributes are not only ranked but also separated by meaningful, equal distances, and which has no true zero point
an example is the Fahrenheit scale for temperature. a temperature of 0 does not mean no heat.
what are ratio measures?
the same as interval measures except that this scale of measurement does have a meaningful zero
for example, number of children is a ratio level measurement because it is possible to have no children; the amount of money is a ratio measurement and zero money means no money
what is a univariate analysis?
the analysis of a single variable
what is descriptive statistics?
statistics that quantify or summarize data without implying or inferring relationships
what is frequency distribution?
demonstrates the number of cases for each attribute of a given variable
eg: test scores listed by percentile, age distribution of the population. these are often graphed as a histogram or pie chart
what is central tendency?
the tendency of a group of scores to cluster around a central representative score
the statistics most frequently used for measures of central tendency are the mean, median, and mode
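all three measures are available in python's standard library; a quick sketch with hypothetical test scores:

```python
import statistics

scores = [70, 75, 75, 80, 85, 90, 95]  # hypothetical test scores

mean = statistics.mean(scores)      # arithmetic average
median = statistics.median(scores)  # middle score when sorted
mode = statistics.mode(scores)      # most frequent score
```

for these scores the median is 80 and the mode is 75; the mean is pulled slightly upward by the higher scores.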
what is standard deviation?
In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the values are spread out over a wider range.
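the low-vs-high contrast can be shown with two hypothetical data sets that share the same mean (50) but differ in spread:

```python
import statistics

tight = [48, 49, 50, 51, 52]   # values clustered close to the mean of 50
spread = [10, 30, 50, 70, 90]  # same mean of 50, much wider dispersion

sd_tight = statistics.stdev(tight)    # small standard deviation
sd_spread = statistics.stdev(spread)  # much larger standard deviation
```

both sets have mean 50, yet `sd_tight` is about 1.6 while `sd_spread` is over 30.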
what does cross tabulation do?
displays the joint distribution of two or more variables
when you want to look at the relationship between two variables that have a small number of attributes (values), either nominal or categorical, you can use cross tabs.
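a joint distribution can be counted with only the standard library; a sketch with hypothetical (gender, response) pairs:

```python
from collections import Counter

# hypothetical paired observations: (gender, survey response)
data = [("f", "yes"), ("f", "no"), ("m", "yes"),
        ("f", "yes"), ("m", "no"), ("m", "no")]

crosstab = Counter(data)  # counts every (gender, response) combination
# crosstab[("f", "yes")] is the number of female "yes" responses
```

each cell of the cross tabulation is one key of the counter; libraries such as pandas offer a `crosstab` helper for the same idea with labeled rows and columns.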
what is the goal of multivariate analysis?
to identify statistical relationships between the variables
what is simpson’s paradox?
a phenomenon in probability and statistics in which a trend appears in several groups of data but disappears or reverses when the groups are combined; equivalently, a trend in the aggregated data can reverse when the data are separated into groups
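the reversal can be checked numerically. the counts below are the widely cited kidney-stone treatment example (success counts and totals for two treatments, split by stone size):

```python
# widely cited kidney-stone example: (successes, total) per treatment
small = {"A": (81, 87),  "B": (234, 270)}
large = {"A": (192, 263), "B": (55, 80)}

def rate(successes, total):
    return successes / total

# treatment A has the higher success rate within each stone-size group...
a_wins_small = rate(*small["A"]) > rate(*small["B"])
a_wins_large = rate(*large["A"]) > rate(*large["B"])

# ...but B has the higher rate once the groups are combined
total_a = (81 + 192, 87 + 263)
total_b = (234 + 55, 270 + 80)
b_wins_overall = rate(*total_a) < rate(*total_b)
```

the paradox arises because A was given mostly to the harder (large-stone) cases, so the group sizes confound the aggregated comparison.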
what is inferential statistics?
the process of drawing conclusions from data that are subject to random variation
helps us decide whether we can generalize about a larger population based on the characteristics of the observed sample
examples of inferential statistics include drawing inferences, making predictions, and testing significance
what is statistical significance testing?
statistical significance testing identifies the probability that our findings can be attributed to chance
when is a result statistically significant?
if it is unlikely that the result occurred by chance
what is a null hypothesis
Fan's definition: the hypothesis that there is no relationship between the tested variables
the null hypothesis, H0, is the commonly accepted fact; it is the opposite of the alternate hypothesis. researchers come up with an alternate hypothesis, one that they think explains a phenomenon, and then work to reject, nullify, or disprove the null hypothesis.
what is a non directional hypothesis?
a hypothesis that does not specify whether the predicted relationship will be positive or negative. with a non directional hypothesis we use a two tailed test (with a directional hypothesis, a one tailed test)
what is a type I error?
the error of rejecting a null hypothesis that is actually true; we risk it whenever we reject a null hypothesis
what is a type II error?
the error of failing to reject a null hypothesis that is actually false; we risk it whenever we fail to reject a null hypothesis
Why do we reject a hypothesis instead of prove it right?
think back to Popper's falsification theory: we can never prove any empirically formulated hypothesis or theory; however, we can reject (falsify) such a theory by finding an exception to it.
what is the p value?
the p-value is the probability that, if the null hypothesis is true, sampling variation would produce an estimate at least as far from the hypothesized value as the one observed
how likely it is to get a result like this if the null hypothesis is true
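a concrete illustration: the exact one-tailed p-value for getting 60 or more heads in 100 fair-coin flips (the coin-flip setup is hypothetical), computed from the binomial distribution:

```python
import math

# one-tailed exact p-value: chance of 60+ heads in 100 flips of a fair coin
n, observed = 100, 60
p_value = sum(math.comb(n, k) for k in range(observed, n + 1)) / 2 ** n
# a fair coin produces a result this extreme less than 5% of the time,
# so at the conventional 0.05 level we would call 60 heads significant
```

the value comes out near 0.028, below the conventional 0.05 threshold.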
what is a chi square?
a statistical significance test used when both the dependent and independent variables are nominal level. e.g: yes/no, m/f
it tests whether two (or more) variables are: 1) independent or 2) homogenous. In other words, this test examines whether knowing the value of one variable helps to estimate the value of another variable
in social work: intervention/no intervention (independent variable) vs effective/ineffective (dependent variable)
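the statistic itself can be computed by hand from observed and expected counts; a sketch with a hypothetical 2x2 intervention table (in practice `scipy.stats.chi2_contingency` does this and also returns the p-value):

```python
# hypothetical 2x2 table: rows = intervention yes/no, cols = effective yes/no
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# chi-square = sum of (observed - expected)^2 / expected over every cell,
# where expected = row total * column total / grand total
chi2 = sum((observed[i][j] - row_totals[i] * col_totals[j] / n) ** 2
           / (row_totals[i] * col_totals[j] / n)
           for i in range(2) for j in range(2))
```

a large chi-square (here about 16.7 with 1 degree of freedom) means the observed counts are far from what independence would predict.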
the major criteria for selecting a statistical test are…(4)
- the level of measurement of the variables
- the number of variables and the number of categories (or attributes) for the nominal variables
- the type of sampling methods used in data collection
- the way the variables are distributed in the population
when to use a one sample t-test
used to determine whether a sample mean is different from a specified population mean or criterion value
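the t statistic is the difference between the sample mean and the hypothesized mean, divided by the standard error. a sketch with hypothetical scores and a hypothesized population mean of 100:

```python
import math
import statistics

sample = [102, 98, 105, 110, 99, 104, 101, 97]  # hypothetical scores
mu = 100                                         # hypothesized population mean

# t = (sample mean - mu) / (sample sd / sqrt(n))
t = (statistics.mean(sample) - mu) / (statistics.stdev(sample) / math.sqrt(len(sample)))
```

the statistic is then compared against the t distribution with n - 1 degrees of freedom to obtain a p-value (e.g. via `scipy.stats.ttest_1samp`).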
what is ANOVA?
a statistical test for heterogeneity of means by analysis of group variances. in other words, one way ANOVA tests the null hypothesis that the means of two or more groups are not statistically different from one another.
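the F statistic compares variation between the group means to variation within the groups; a sketch with three small hypothetical groups:

```python
import statistics

groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]  # hypothetical scores in three groups
means = [statistics.mean(g) for g in groups]
grand = statistics.mean([x for g in groups for x in g])

ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
df_between = len(groups) - 1
df_within = sum(len(g) for g in groups) - len(groups)

# F = mean square between groups / mean square within groups
F = (ss_between / df_between) / (ss_within / df_within)
```

a large F (here 21, driven by the third group's much higher mean) is evidence against the null hypothesis of equal group means.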
what is correlation analysis?
correlation is a measure of the strength of the relationship between two (continuous) variables
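the pearson correlation coefficient r can be computed directly from its definition; a sketch with two short hypothetical variables:

```python
import math

x = [1, 2, 3, 4, 5]  # hypothetical hours of practice
y = [2, 4, 5, 4, 5]  # hypothetical scores

mx, my = sum(x) / len(x), sum(y) / len(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
# r = covariance / product of the two standard deviations (unnormalized form)
r = cov / math.sqrt(sum((a - mx) ** 2 for a in x) *
                    sum((b - my) ** 2 for b in y))
```

r always falls between -1 and +1; here it is about 0.77, a fairly strong positive relationship.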
what are cautions to using correlation analysis?
a correlation is not necessarily a causal relation
what are the conditions that must be justified when looking at correlation analysis (3)
- if x changes, y changes (correlation)
- y changes after x (time sequence)
- if x is removed, y disappears
what is regression analysis ?
a broad class of widely used statistical methods to summarize the trends: simple regression, multiple regression, time series, logistic regression
what is the purpose of regression?
1) explanation: estimating and describing data and variables, testing hypotheses, and finding "laws"
2) prediction of new Y values: once "laws" are identified, whenever the conditions (independent variables) of the "laws" are satisfied, we will get the same results (of the dependent variables)
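both purposes can be seen in a simple least-squares regression; a sketch with a hypothetical data set constructed to lie exactly on the line y = 2x + 1:

```python
x = [1, 2, 3, 4, 5]   # hypothetical independent variable
y = [3, 5, 7, 9, 11]  # exactly y = 2x + 1, for a clean illustration

n = len(x)
mx, my = sum(x) / n, sum(y) / n
# least-squares slope and intercept (explanation: the estimated "law")
slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
         / sum((a - mx) ** 2 for a in x))
intercept = my - slope * mx

def predict(new_x):  # prediction: new Y values from the fitted "law"
    return slope * new_x + intercept
```

the fitted coefficients recover the generating law (slope 2, intercept 1), and `predict` then gives new y values for unseen x.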
what should be in the report of a regression analysis? (3)
- discussion of the validity of each variable (relevance to the dependent variable, current knowledge about this variable…)
- fitness (R2)
- coefficients and their statistical significance