exam 2 Flashcards

1
Q

what is research

A

the systematic investigation into and study of materials and sources to establish facts and reach conclusions

2
Q

what is key to research

A

carefully formulating a research question/falsifiable hypothesis

3
Q

what are the types of research

A
  • observational
  • experimental
4
Q

observational research

A
  • measuring relationships between events or conditions
  • no manipulation or intervening
  • supports future experimental work
5
Q

what is the con of observational research

A

hard to make confident cause-and-effect inferences

6
Q

experimental research

A

investigator directly manipulates conditions to measure the response

7
Q

what are the different types of variables

A

- independent variable
- dependent variable
- intervening/confounding variable

8
Q

independent variable

A

controlled by the investigator, or the uncontrolled cause

9
Q

other ways to refer to IV

A

treatment, predictor

10
Q

dependent variable

A
  • not controlled
  • the effect
  • the variable that is measured in response to the IV
11
Q

other ways to refer to DV

A

response, criterion

12
Q

intervening/confounding variables

A

influence the DV as well but are not controlled by the investigator

13
Q

what can intervening/confounding variables lead to

A

erroneous conclusions

14
Q

what is the goal of sampling

A

to extract a sample (n) that is representative of the population (N)

15
Q

what does having a representative sample do

A

it allows the effect of an IV on a DV to be generalized to all other members of that population

16
Q

what are the types of sampling methods

A
  • random sample
  • stratified sample
  • convenience sample
  • systematic sample
  • cluster sample
17
Q

random sample

A

each member of the population has an equal chance of being selected

18
Q

stratified sample

A

ensure representation of subgroups within the population of interest

19
Q

convenience sample

A

members are selected based on “ease and proximity”

20
Q

systematic sample

A

members are selected at regular intervals from a randomly ordered list

21
Q

cluster sample

A

populations are divided into subgroups or “clusters” then members are randomly selected from a cluster
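
The sampling methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered members, the two strata "A" and "B", the sample sizes, and the function names are all hypothetical, and the systematic version uses a random starting point on an ordered list rather than a randomly ordered list.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Select members at a regular interval after a random start."""
    step = len(population) // n          # sampling interval
    start = rng.randrange(step)          # random start inside the first interval
    return [population[start + i * step] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from every subgroup (stratum)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(42)                  # fixed seed so the sketch is repeatable
population = list(range(1, 101))         # hypothetical population, N = 100
strata = {"A": population[:60], "B": population[60:]}  # hypothetical subgroups

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_sample(strata, 5, rng)
```

A convenience sample would simply take whichever members are easiest to reach (for example `population[:10]`), which is why it is the most exposed to bias.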

22
Q

what is the ideal sampling method and why isn't it always used

A

random sample, but it's difficult to achieve in practice

23
Q

what is the sampling method that is most susceptible to bias

A

convenience sampling

24
Q

bias

A

aspects of the sample make it unrepresentative of the population

25
how can bias be decreased
having a larger sample = greater proportion of the population = less bias
26
what are the factors that go into determining which sampling method is appropriate
the research objectives, resources/cost/time, and population characteristics
27
what are the types of experimental designs
- pre-experimental - quasi-experimental - true experimental
28
pre-experimental design
- exploratory - used when rigorous approaches are not feasible - weak evidence of causality - no control group or random assignment
29
quasi experimental
- moderate evidence of causality - no random assignment
30
true experimental
- random assignment of participants to treatment or control groups - strong evidence of causality
31
how many treatment groups can a participant be in
multiple
32
how many control groups can a participant be in
one
33
examples of pre-experimental
- case study - pretest-posttest - static group comparison - nonequivalent groups
34
examples of quasi-experimental
- interrupted time series - natural experiment
35
examples of true experimental
- independent groups - matched groups - randomized controlled trial - repeated measures - factorial - pretest-posttest - Solomon four-group
36
how would a pre-experimental case study group be set up
single group is exposed to an intervention/treatment and the outcome is measured
37
what are the cons of a case study pre-experimental design
there's no way of knowing whether other factors contributed to the outcome
38
how is a one group pretest-posttest pre-experimental study set up
a single group is measured before and after an intervention/treatment
39
how is a static group comparison pre-experimental study set up
- one of two groups receives the intervention/treatment - includes a control group - no randomization
40
what influences a static group comparison study
pre-existing differences between groups may influence the outcomes
41
how is an interrupted time series quasi-experiment set up
multiple measurements taken before and after intervention/treatment
42
how is a natural quasi-experiment run?
observation of the effects of natural occurrences/changes
43
how does a natural experiment influence external validity
a non-laboratory setting in the natural environment can be used to study how the real world operates and to generalize findings
44
how is an independent groups, between-subjects, true experiment run
- random assignment to study groups - can have more than two groups
45
what does random assignment to study groups for independent groups, between-subjects, true experimental studies do
minimizes the effect of individual differences
46
can individual differences still affect the results of an independent groups, between-subjects, true experimental study
yes
47
how is a matched groups, true experimental study run
- participants matched on key attributes then randomly assigned to groups - helps to minimize individual differences further
48
what is a matched groups true experimental design useful for
when specific attributes are expected to interact with the IV
49
how are randomized controlled trial true experimental studies run
- participants are randomly assigned to treatment or control group - uses blinding to reduce bias
50
what is the gold standard for clinical research
Randomized controlled trials
51
single blind
participants unaware of grouping
52
double blind
participants and researchers unaware of grouping
53
how are repeated measures, within subjects, true experimental studies run
- participants complete all conditions - the participants are their own controls - random/counterbalanced order to minimize carryover effects - smaller sample sizes needed compared to equivalent independent/matched groups designs
54
what are repeated measures, within subjects, true experimental studies sensitive to
the effects of the IV
55
how are factorial, true experimental studies run
- examines the effects of multiple IVs on a single DV - can be incorporated into between- or within-subjects designs
56
how are solomon four-group experiments designed
- separate treatment and control groups may or may not be "pretested" - controls for carryover effects from the pretest, improved internal validity - requires a larger sample size and randomization into each group
57
criteria of parameters
source: population - calculated: no - constants: yes - examples: mean, standard deviation, population size (N)
58
criteria of statistics
source: sample - calculated: yes - constants: no - examples: mean, SD, n
59
statistical inference
estimating population parameters from sample statistics
60
sampling error
amount of error in the estimate of a population parameter that is derived from a sample statistic
61
why are probability statements accompanied with statistics
because of the uncertainty in our parameter estimate
62
law of large numbers
as sample size increases, the sample mean approaches the population mean if 1. the samples are independent 2. the samples are identically distributed
63
what are easily swayed by extreme values according to the law of large numbers
means of small random samples = larger sampling error
64
what are resistant to extreme values according to the law of large numbers
means of large random samples = smaller sampling error
65
sampling distribution of the mean
theoretical frequency distribution of all possible sample means that can be calculated from a population
66
according to the sampling distribution of the mean what is the relationship between variability of the sampling distribution and each sample mean
the variability of the sampling distribution decreases as the size of each sample increases
67
according to the sampling distribution of the mean what is the relationship between the variability of sampling distribution and the variability of the population
the variability of sampling distribution is smaller than the variability of the population
68
standard error of the mean
how much the sample mean (statistic) is likely to differ from the true population mean (parameter)
69
what is the standard error of the mean also known as
the standard deviation of the sampling distribution of the mean
70
what is the equation for SEm
SEm = SD/sqrt(n)
71
when will samples have smaller SEm
- they are homogeneous - they have a larger sample size
72
what is the square root law
the accuracy of a parameter estimate is proportional to the square root of the sample size (the standard error is inversely proportional to it)
73
how will quadrupling the sample size affect the SEm if everything else is held constant
it will halve the SEm (half the variability) according to the square root law
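
A quick numeric check of the SEm equation and the square root law, as a minimal sketch. The SD of 30 and the sample sizes 25 and 100 are made-up values chosen so the arithmetic is easy to follow, and the function name is hypothetical.

```python
import math

def standard_error_of_mean(sd, n):
    """SEm = SD / sqrt(n)."""
    return sd / math.sqrt(n)

sd = 30.0                                      # hypothetical sample SD
sem_n25 = standard_error_of_mean(sd, 25)       # 30 / 5  = 6.0
sem_n100 = standard_error_of_mean(sd, 100)     # 30 / 10 = 3.0

# Quadrupling n (25 -> 100) halves the SEm
ratio = sem_n100 / sem_n25
```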
74
how do you interpret SEm
just like SD on a normal curve - e.g. +/- 1 SEm corresponds to a Z score of +/- 1.0
75
how do you state standard error of mean
there is a 68% chance the population mean is within 163.5 <= mu <= 182.5 lbs
76
what does the 68% chance the population mean is within a given interval mean
that this is also the confidence interval of 68%
77
what does stating a confidence interval of 68% mean
that there is also a 32% probability of error, or a chance that the mean is not within that range - p = 0.32
78
what is an acceptable level of uncertainty
79
what is Alpha (a)
the area under the curve that represents the probability of error, the likelihood of chance occurrence
80
what does an alpha value of 0.05 mean
that there is 5% chance of rejecting the null hypothesis incorrectly
81
what is the equation for finding the confidence interval
C.I. = sample mean +/- Z-score * SEm
82
how to report a C.I.
A 95% CI is the mean +/- 1.96(SEm) - +/- 1.96 is the Z-score interval that contains 95% of the area under the normal curve
83
how to interpret a confidence interval of 173 +/- 18.62 lbs or 154.38 lbs <= mu <= 191.62 lbs
with 95% confidence we conclude that the mean weight of all college-aged men is between 154.38 and 191.62 lbs. However, there is a 5% chance (p = 0.05) that the true mean falls outside of this range
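
The interval in this card can be reproduced with a short sketch. The SEm of 9.5 lbs is not stated on the card; it is inferred here from the 68% interval (163.5 to 182.5 lbs around a mean of 173), so treat it as an assumption. The function name is hypothetical.

```python
def confidence_interval(mean, sem, z=1.96):
    """C.I. = sample mean +/- Z-score * SEm (z = 1.96 for 95%)."""
    margin = z * sem
    return mean - margin, mean + margin

# mean = 173 lbs from the card; SEm = 9.5 lbs is inferred, not given
low_95, high_95 = confidence_interval(173.0, 9.5)         # roughly 154.38 to 191.62
low_68, high_68 = confidence_interval(173.0, 9.5, z=1.0)  # 163.5 to 182.5
```

Widening the interval (a larger z) makes it less likely to miss the population mean but also less precise.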
84
what does a larger confidence interval result in
- less likely to be wrong - less precise
85
how does statistical hypothesis testing begin
by forming two mutually exclusive, exhaustive mathematical statements about the relationship between variables/groups
86
what are the two hypotheses formed for statistical hypothesis testing
- null hypothesis (H0) (this is assumed to be true unless evidence is found to the contrary) - alternative hypothesis (HA or H1)
87
what does mutually exclusive mean for hypotheses
only one can be true
88
what does exhaustive mean in terms of hypothesis testing
that no other option exists
89
nondirectional hypothesis
H0: mean 1 = mean 2 H1: mean 1 does not equal mean 2
90
directional hypothesis
H0: mean 1 <= mean 2 H1: mean 1 > mean 2
91
what does a p value indicate in statistical hypothesis testing
indicates the probability of obtaining the data collected (or data more extreme) IF the null hypothesis H0 is true
92
what does a p value < 0.05 indicate
that the result is statistically significant and the H0 can be rejected and you accept the alternative hypothesis
93
what does rejecting a H0 indicate
depending on what the H0 is, it would indicate that there is a difference between the two variables or that the treatment had a significant effect
94
two tailed test hypotheses
H0: mean 1 = mean 2 HA: mean 1 does not equal mean 2
95
region of rejection for two tailed tests
- set by alpha value - split between tails of the distribution (each 2.5% of the AUC when alpha = 0.05)
96
when do you use a two tailed test
when prior research/logical reasoning does not suggest a direction for the difference, only that a difference should be expected
97
one tailed test hypotheses
H0: category 1 >= category 2 HA: category 1 < category 2
98
region of rejection for one tailed tests
- set by alpha value - concentrated at one tail of the distribution
99
when to use a one tailed test
when there is strong evidence to think a difference exists in a specific direction
100
Type I error
H0 is rejected when it is actually true (a false positive)
101
what is concluded from a Type I error
conclude that an effect/relationship exists when, in reality, it does not
102
how can type I error risk be reduced
by decreasing alpha
103
Type II error
H0 is accepted (not rejected) when it is actually false (a false negative)
104
what does a Type II error conclude
that no effect/relationship exists when it really does
105
how can Type II error risk be reduced
through decreasing beta
106
what is beta
the probability of committing a Type II error (typically strive for beta=0.2)
107
statistical power
probability of rejecting H0 when H0 is false
108
what is the equation for statistical power
power = 1 - beta - typically strive for 0.8
109
factors that may tie into Type I error
- measurement error - lack of random sample - alpha value too liberal (a=0.10) - investigator bias - improper use of one tailed test
110
factors that tie into Type II
- measurement error - lack of sufficient power (N too small) - alpha value too conservative ( a = 0.01) - treatment effect not properly applied
111
how to decrease a
- decrease a priori significance level a (e.g. a Bonferroni correction) - control confounding variables - increase sample size
112
what does decreasing the significance level alpha do
you will increase the chance of a Type II error
113
what is a Bonferroni correction
a correction that divides the alpha value by the number of tests (e.g. 0.05/# of tests)
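
As a small worked example of the Bonferroni correction, the sketch below divides the overall alpha by the number of tests. The choice of 5 tests is hypothetical, as is the function name.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level: alpha / number of tests."""
    return alpha / n_tests

# Hypothetical case: 5 tests at an overall alpha of 0.05
per_test_alpha = bonferroni_alpha(0.05, 5)   # about 0.01 per test
```

Each individual test then has to reach the stricter per-test level, which lowers Type I error risk at the cost of a higher Type II error risk.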
114
how do you decrease beta
increase a priori significance level alpha
115
what does increasing significance level alpha result in
may increase the chance of a Type I error
116
what is the best way to balance Type I and Type II error risk with available resources
conducting a power analysis
117
Correlation
the degree of association between two interval-level variables
118
what is correlation represented by
a coefficient between +1.00 and -1.00
119
what does a +1.00 correlation coefficient mean
- perfect positive correlation - the size of deviations from the mean in both variables are equal in the same direction
120
what does a -1.00 correlation coefficient mean
- perfect negative correlation - the size of deviations from the mean in both variables are equal in opposite directions
121
what does a 0.00 correlation coefficient mean
- no correlation - there is no pattern to the size and direction of deviations from the mean between variables
122
what does the sign of the coefficient indicate
direction
123
what does the magnitude of the correlation coefficient indicate
strength
124
what are scatter plots best used for
to visualize the correlation between variables
125
what is the line of best fit
best linear estimate of the relationship between variables given the data used to calculate it
126
what does the line of best fit minimize
residuals
127
what are residuals
the error between measured values and the values predicted by the line's equation
128
what is pearson correlation coefficient also called
Pearson's product-moment correlation coefficient
129
what is the equation for pearson correlation coefficient
r = sum of (ZxZy)/N - N being the number of score pairs - ZxZy being the product of the z-scores for each variable
130
what is the alternative "machine formula" that does not require z-scores
r = sum((x - mean x)(y - mean y))/sqrt(sum((x - mean x)^2) * sum((y - mean y)^2))
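
Both versions of Pearson's r can be checked against each other with a short sketch. The five score pairs are invented for illustration and the function names are hypothetical. One assumption to note: for sum(ZxZy)/N to match the deviation-score formula, the z-scores must use the population SD (dividing by N).

```python
import math

def pearson_r_zscores(x, y):
    """r = sum(Zx * Zy) / N, with z-scores based on the population SD."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sd_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    return sum(((a - mean_x) / sd_x) * ((b - mean_y) / sd_y)
               for a, b in zip(x, y)) / n

def pearson_r_deviations(x, y):
    """Formula that works on deviation scores instead of z-scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    denominator = math.sqrt(sum((a - mean_x) ** 2 for a in x)
                            * sum((b - mean_y) ** 2 for b in y))
    return numerator / denominator

# Hypothetical score pairs
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r_z = pearson_r_zscores(x, y)
r_dev = pearson_r_deviations(x, y)   # both come out near 0.77
```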
131
what are the assumptions of pearson correlation
- both variables must be on a continuous (interval or ratio) scale - each pair of variables must be independent - both variables should be approximately normally distributed - the relationship between variables (if one exists) must be linear - the dataset should not contain outliers
132
what do you do if the relationship is non linear for pearson correlation
use spearman's rank
133
how do outliers affect Pearson correlation
it is very sensitive to outliers, so they may create an overly strong or overly weak correlation
134
what is the equation for spearman's rank correlation coefficient
rho = 1 - (6 * sum of di^2)/(n(n^2 - 1)) - di: the difference between the two ranks of each observation - n: the number of observations
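
The rank-difference formula can be sketched as follows. This version assumes there are no tied values (ties need averaged ranks, which the simple formula does not handle), and the data and function names are hypothetical.

```python
def ranks(values):
    """Rank from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    """rho = 1 - (6 * sum(d_i^2)) / (n * (n^2 - 1))."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Non-linear but monotonic relationship: y = x^2
rho_monotonic = spearman_rho([10, 20, 30, 40, 50], [100, 400, 900, 1600, 2500])
# Two adjacent pairs swapped
rho_partial = spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

The first example gives a perfect rank correlation even though the relationship is curved, which is the case where Spearman's rank is preferred over Pearson's r.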
135
what is spearman's rank
- a nonparametric test - w/ fewer assumptions including about the data distribution
136
what are the parameters of spearman's rank
- variables do not need to be normally distributed - variables can be discrete - relationship between variables can be non-linear but must be monotonic - less sensitive to outliers
137
coefficient of determination
r^2 - quantifies the shared variance between variables - how well the independent variables explain the variation in the dependent variables
138
how to verbally express the coefficient of determination
_____% of the variance in the dependent variable can be explained by the variance in the independent variable.
139
degrees of freedom (df)
the number of scores that are free to vary when the sum of the scores is set
140
what is the equation for degrees of freedom
df = N - # of variables in the correlation
141
what does "correlation does not equal causation" mean
correlation does not necessarily mean that a change in one variable will result in a change in the other
142
bivariate regression
strong enough correlations allow for predictions of one variable based on the values of another variable
143
what is the equation for a bivariate regression model
y = beta0 + beta1*x + e
144
bivariate regression assumptions
- the relationship between variables must be linear - each pair of variables must be independent - for any value of a predictor (independent variable) the dependent variable must be approximately normally distributed - the variance of the residuals must be consistent across the range of predictor values
145
what is homoscedasticity
when the spread of residuals is relatively consistent within the regression model
146
how to calculate coefficients for the bivariate regression model
beta1 = r(SDy/SDx) - beta0 = mean y - r(SDy/SDx)(mean x)
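
A minimal sketch of the slope and intercept calculation from summary statistics. All the numbers (r = 0.5, means of 10 and 50, SDs of 2 and 8) are hypothetical, as is the function name.

```python
def bivariate_coefficients(r, mean_x, sd_x, mean_y, sd_y):
    """beta1 = r * SDy / SDx; beta0 = mean y - beta1 * mean x."""
    beta1 = r * sd_y / sd_x
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1

# Hypothetical summary statistics
beta0, beta1 = bivariate_coefficients(r=0.5, mean_x=10.0, sd_x=2.0,
                                      mean_y=50.0, sd_y=8.0)

# Predicted y for x = 12 (the error term e is left out of the prediction)
y_pred = beta0 + beta1 * 12.0
```

With these values the slope is 2.0 and the intercept is 30.0, so x = 12 predicts y = 54.0.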
147
will there always be residuals
yes, unless there is a perfect correlation between variables
148
how can the residuals in a regression model be represented
- using the standard error of the estimate - or the SD of the residuals
149
standard error of estimate equation
SEe = sqrt(sum((y actual - y pred)^2)/(n - 2))
150
what is the alternate equation for the standard error of estimate
SEe = SDy * sqrt(1 - r^2)
151
what does using the alternate equation for the standard error of estimate result in
it underestimates SEe when the sample size is small
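
The two SEe equations can be compared on a tiny made-up dataset. In this sketch the five score pairs are hypothetical, the line is fitted with the beta1 and beta0 formulas from the earlier card, and SDy is taken as the sample SD (dividing by n - 1), which is an assumption.

```python
import math
import statistics

def see_from_residuals(y_actual, y_pred):
    """SEe = sqrt(sum((y actual - y pred)^2) / (n - 2))."""
    n = len(y_actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(y_actual, y_pred))
    return math.sqrt(ss_residual / (n - 2))

def see_from_r(sd_y, r):
    """Alternate equation: SEe = SDy * sqrt(1 - r^2)."""
    return sd_y * math.sqrt(1 - r ** 2)

# Hypothetical data, small n on purpose
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
mean_x, mean_y = statistics.mean(x), statistics.mean(y)
sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)
r = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
     / ((len(x) - 1) * sd_x * sd_y))

beta1 = r * sd_y / sd_x
beta0 = mean_y - beta1 * mean_x
y_pred = [beta0 + beta1 * a for a in x]

see_exact = see_from_residuals(y, y_pred)
see_shortcut = see_from_r(sd_y, r)
```

With only five pairs the shortcut value comes out smaller than the residual-based value, in line with the small-sample underestimate described on the card.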
152
for a regression coefficient, what is H0
beta1 = 0
153
for a regression coefficient, what is HA
beta1 does not equal 0
154
what does the t-statistic tell you
whether beta1 is significantly different from 0
155
what is multiple correlation
quantifies the degree of relationship/association between a function of independent variables and one dependent variable
156
what is multiple correlation represented by
a coefficient R between 0 and 1
157
what does a R = 0.00 value indicate
no correlation, or there is no relationship/association between independent variables and the dependent variable
158
what does a R = 1.00 value indicate
perfect correlation, or the independent variables completely explain the dependent variable
159
what is the multivariable coefficient of determination
- R^2 - same interpretation as bivariate r^2
160
partial correlation
quantifies the relationship between an independent variable and dependent variable after removing the effect of another variable
161
covariate
an independent variable that can influence the outcome of a given statistical trial, but which is not of direct interest
162
partial coefficient of determination
the variance in Y explained by X1 after removing the effects of X2 on both
163
what is an example of partial correlation
- interested in association between children's age (X1) and muscle strength (Y) - children grow and get heavier with age, so weight (X2) may be a covariate - using partial correlation = partial out the effect of weight, leaving the variance in strength due solely to age
164
what is unexplained variance and what is it represented by
- (1-R^2) - the amount of variation in a dependent variable that a model cannot explain using the independent variables
165
Multiple and partial correlation assumptions
- both variables must be on a continuous (interval or ratio) scale - each pair of variables must be independent - both variables should be approx. normally distributed - the relationship between variables (if one exists) must be linear - the dataset should not contain outliers
166
multiple linear regression
prediction of one dependent variable from multiple predictor variables (independent variables)
167
what is the equation of a multiple linear regression
Y = a + b1X1 + b2X2 + ... + bkXk - b values are the slope coefficients - X values are the independent variables - a is the Y-intercept
168
hierarchical multiple regression
researcher has full control over the model equation and which predictors are included
169
when is hierarchical multiple regression used
when hypothesis testing is the goal rather than accurate, efficient dependent variable prediction
170
algorithmic multiple regression
computer software/algorithms construct the model equation
171
what are the types of algorithmic multiple regression
- forward selection - backward elimination - stepwise
172
forward selection algorithmic multiple regression
starts with the intercept only; predictors are added to the model one-by-one and assessed; if R^2 increases, the predictor explains unique variability
173
backward elimination algorithmic multiple regression
- starts with all predictors - eliminates predictors one-by-one and assesses the resulting model - the variable whose removal decreases the explained variance the least (no significant decrease) is eliminated
174
stepwise algorithmic multiple regression
same as forward selection but previously entered variables can be eliminated in later steps - if R^2 is not affected by the inclusion or exclusion
175
what is the drawback of a stepwise multiple regression
requires a larger sample size compared to other methods to return reliable results
176
what is the ideal ratio of subjects:variables for a stepwise multiple regression
20:1 to 40:1 ratio
177
in a table of correlation values for a given dataset how do you interpret the values presented
the values are the r values that indicate the strength of correlation between variables - values close to 1.00 indicate strong correlation - r is then squared to report how much variance in the dataset is explained by that variable
178
how can you tell if a variable has unique variance
- based on whether the variable is highly correlated with other variables - if R^2 increases significantly when the variable is added, this indicates unique variance
179
how can you visually tell if variables offer unique variance
- if the circle overlaps heavily with the dependent variable - if the overlap is present but more overlap is seen with another variable, it doesn't explain much of the variance and therefore isn't unique
180
what are the multiple regression assumptions
- the relationship between variables must be linear - each pair of variables must be independent - for any value of a predictor (independent variable), the dependent variables must be approx normally distributed - variance of the residuals must be consistent across the range of predictor values - independent variables (predictors) should not be correlated with each other
181
what does multicollinearity lead to
leads to inflated confidence intervals for slope coefficient estimates and unstable slope coefficient estimates when additional predictors are added
182
is there a threshold of acceptable multicollinearity
no
183
what should the variance inflation factor be
a VIF greater than 10 should be treated as suspicious
184
what is the equation of variance inflation factor
VIF = 1/(1 - R^2)
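
A short sketch of the VIF calculation. It assumes the usual convention that R^2 here comes from regressing one predictor on the other predictors; the R^2 values used are hypothetical, as is the function name.

```python
def variance_inflation_factor(r_squared):
    """VIF = 1 / (1 - R^2), where R^2 is assumed to come from
    regressing one predictor on the remaining predictors."""
    return 1.0 / (1.0 - r_squared)

vif_none = variance_inflation_factor(0.0)      # 1.0: predictor shares no variance
vif_moderate = variance_inflation_factor(0.5)  # 2.0
vif_high = variance_inflation_factor(0.9)      # about 10, the suspicious range
```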
185
what is singularity in multicollinearity
two IVs are perfectly related (r=1.00) usually because one was mathematically derived from the other
186
cross validation
the process of testing regression equations on a separate and equivalent sample from which they were built to ensure accuracy in their predictions
187
what is expected when applying models to different samples
higher prediction errors
188
what is the cross validation model good for and what will be a result
training data - the correlation coefficient will undergo shrinkage and would be smaller on different samples
189
who developed the t test
William Sealy Gosset
190
when are t tests useful
- we do not know the distribution of the population - we have a relatively small sample relative to the population
191
how does sample size relate to t distribution
as sample size increases, the t distribution approaches a normal distribution
192
what is a t statistic
the ratio between mean differences and variability
193
what is a critical statistic
the value that must be met to reach statistical significance at a given alpha level
194
what is the generic t test formula
t = mean difference/SE of mean difference
195
what can a t statistic be thought of as
a signal to noise ratio
196
SEM for t statistic
SEM = SD/sqrt(n)
197
what does the standard error of the difference look at
the variability of the difference between two groups
198
what are the assumptions of a t Test
- the data must be normally distributed - the data must be on the interval or ratio scales - the sample is randomly selected from the greater population - when two samples are taken, they should have homogeneity of variance
199
what is a single sample t Test
used to compare a single sample mean with a known population mean
200
what is the equation for single sample t Test
t = (sample mean - population mean)/SEM
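
The single sample t equation can be worked through with a small sketch. The sample values and the population mean of 2 are hypothetical; using the sample SD (n - 1 denominator) for the SEM and df = n - 1 are standard conventions assumed here rather than stated on the cards.

```python
import math
import statistics

def single_sample_t(sample, population_mean):
    """t = (sample mean - population mean) / SEM, with SEM = SD / sqrt(n)."""
    n = len(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)   # sample SD (n - 1)
    t = (statistics.mean(sample) - population_mean) / sem
    return t, n - 1                                 # t statistic and df

# Hypothetical scores compared with a known population mean of 2
t_stat, df = single_sample_t([1, 2, 3, 4, 5], 2.0)
```

The resulting |t| is then compared with the critical statistic for that df at the chosen alpha level.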
201
what is used to determine the critical statistic for significance
the degrees of freedom
202
when can H0 be rejected and HA accepted
if the |t statistic| > critical statistic
203
how to calculate a confidence interval for a single sample t Test
C.I. = sample mean +/- tcv(SEM)
204
when is the adjusted standard error of difference equation used
when unequal sample sizes are present
205
what is the formula for the adjusted standard error of the difference
SED = sqrt([((n1-1)(SD1^2) + (n2-1)(SD2^2))/(n1+n2-2)] * [(1/n1) + (1/n2)])
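
A minimal sketch of the adjusted (pooled) standard error of the difference. The SDs and group sizes are hypothetical, as is the function name.

```python
import math

def pooled_sed(sd1, n1, sd2, n2):
    """Adjusted SED: pooled variance weighted by df, times (1/n1 + 1/n2)."""
    pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

# Equal groups: SD = 2 and n = 8 in both groups
sed_equal = pooled_sed(2.0, 8, 2.0, 8)       # sqrt(4 * 0.25) = 1.0

# Unequal groups: the larger group gets more weight in the pooled variance
sed_unequal = pooled_sed(3.0, 10, 3.0, 5)
```

The t statistic for two independent groups is then the difference between the group means divided by this SED.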
206
paired sample t test
used to compare two means from the same or correlated samples
207
what is the equation for a paired sample t test
t = (sample mean pre - sample mean post)/SED
208
what is the corrected SED for the paired sample t test
SED = sqrt((SD1^2/n1) + (SD2^2/n2) - 2r(SD1/sqrt(n1))(SD2/sqrt(n2)))
209
what is the alternate approach to calculating the t statistic and SED for a paired sample t Test
t = d/SED - d = mean difference between individuals' scores - SED = SDd/sqrt(n) - SDd = standard deviation of the differences
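
The difference-score approach is easy to sketch. The pre and post scores are hypothetical, SDd is taken as the sample SD of the differences (an assumption), and the function name is made up for the example.

```python
import math
import statistics

def paired_t(pre, post):
    """t = d / SED, where d is the mean of the individual differences
    and SED = SDd / sqrt(n)."""
    differences = [a - b for a, b in zip(pre, post)]
    n = len(differences)
    mean_difference = statistics.mean(differences)
    sed = statistics.stdev(differences) / math.sqrt(n)
    return mean_difference / sed, sed

# Hypothetical pretest and posttest scores for four participants
pre = [10, 12, 14, 16]
post = [9, 10, 11, 12]
t_paired, sed_paired = paired_t(pre, post)
```

Working from the differences avoids needing r directly, because the correlation between pre and post scores is already reflected in SDd.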
210
what is the confidence interval for the alternate t statistic and SED for paired samples t test
CI = mean difference between individuals' scores +/- tcv(SED)
211
what is used if data violates the assumptions of a t Test
single samples: Wilcoxon signed rank test - independent samples: Mann-Whitney U test - paired samples: Wilcoxon signed rank test
212
effect size
the strength of the relationship between variables
213
omega squared
estimate of the variance explained by the influence of the independent variable
214
what is used as a measure of effect size for pretest-posttest
percent change