exam 2 Flashcards
Why do we need reliability and validity?
To ensure that the data collected ARE WHAT WE THINK THEY ARE, and that the tool we use IS IN FACT MEASURING what we intend.
Because INTERPRETATION OF DATA HAS CONSEQUENCES & all statistics ARE MEANINGLESS unless we are confident that we know what we are looking at
What is a measurement scale?
The assignment of values to outcomes following a set of rules. PARTICULAR LEVELS AT WHICH OUTCOMES ARE BEING MEASURED.
What are the 4 different types of measurement scale?
nominal, ordinal, interval and ratio; there is always a DEGREE OF ERROR, and the scales differ in PRECISION
Definition+example of a nominal measurement scale
“names” (nominal, from Latin). It is the LEAST PRECISE. DATA CAN ONLY BE CLASSIFIED.
MUTUALLY EXCLUSIVE CATEGORIES: each outcome FITS IN ONE AND ONLY ONE.
Ex: Gender, Ethnicity, political affiliation
Definition+example of an ordinal measurement scale
ORDER and the things being measured are ordered. DATA ARE RANKED Ex: your rank in class…. (We know that #1 is better than #2, but we don’t know by how much)
Definition+example of an interval measurement scale
MEANINGFUL DIFFERENCE BETWEEN VALUES. The tool is based on A CONTINUUM with equal intervals. Ex: temperature, dress size, metrics?
Definition+example of a ratio measurement scale
MEANINGFUL 0 POINT & meaningful RATIO between VALUES.
Ex: number of boys in a room (versus 0 degrees of temperature, which does not mean that no temperature is present)
0 boys is an ABSOLUTE ZERO, which makes this a RATIO scale: there are actually no boys present in the room
What are the 4 general rules of scales?
- Any outcome can be assigned to one of 4 scales of measurement.
- The scales are ORDERED FROM LEAST TO MOST PRECISE (nominal to ratio)
- HIGHER SCALES ARE MORE INFORMATIVE AND PRECISE
- THE MORE PRECISE scale contains ALL qualities of the scales BELOW it
What is reliability?
A test used as a measurement tool measures something CONSISTENTLY. We should be able to use the test TIME after TIME and get SIMILAR RESULTS.
What are the two elements of a test score?
* observed score (the score obtained, e.g. 94.5% on my first stats exam) versus
* true score (e.g. 97%)
Observed score = true score + error score (the error score is the reason an observed score is never 100% the true score)
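Worked check of the score equation, using the numbers on this card. A minimal sketch: the variable names are illustrative, and in practice the true score is never observed directly.

```python
# Observed score = true score + error score, so error = observed - true.
observed_score = 94.5  # score obtained on the exam (from the card)
true_score = 97.0      # true score used as the example on the card

error_score = observed_score - true_score
print(error_score)  # -2.5
```

A negative error score means the observed score underestimates the true score.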
what is the first form of reliability? example
1) TEST-RETEST: examines whether a test is reliable over TIME; helps examine changes over time.
If conditions are the same, results should be the same.
Ex: the same exam giving different results each time it is taken = not reliable
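Test-retest reliability is commonly summarized as the correlation between the two administrations. A minimal sketch, assuming made-up scores for five students; `pearson_r`, `time_1` and `time_2` are hypothetical names, not from the course.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Made-up scores: the same five students take the same test twice.
time_1 = [80, 65, 90, 72, 58]
time_2 = [82, 63, 91, 70, 60]

r = pearson_r(time_1, time_2)
print(round(r, 2))  # close to 1 here, so the test looks stable over time
```

A value near 1 suggests the test is reliable over time; a value near 0 suggests it is not.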
What is the third form of reliability? example
3) INTERNAL CONSISTENCY: the extent to which a test or procedure is CONSISTENT WITHIN ITSELF, i.e., questionnaire items or interview questions should all measure the same thing = REPRESENT ONE dimension or area of interest. Measured with CRONBACH’S ALPHA.
To remember on Reliability
- Ensure instructions are standardized and clear across all settings
- Increase the number of observations to increase the chance of the sample being reliable.
- Delete all unclear items.
- Moderate the easiness/difficulty of test
- Minimize external effects
What is validity?
The tool MEASURES WHAT IT SAYS IT DOES; what it’s supposed to
What is the first type of validity?
CONTENT VALIDITY: the test is a good SAMPLE of the specific “UNIVERSE” of content
ex: Does the content of a test cover everything in the area of interest?
What is the second type of validity?
CRITERION VALIDITY: scores are systematically RELATED TO ANOTHER CRITERION
1 Concurrent: scores on the new measure are CORRELATED with those FROM AN ESTABLISHED VALID test
Ex: a high positive correlation between scores on the new and old tests suggests the new test is valid
2 Predictive: scores are related to a FUTURE criterion
Ex:Can an intelligence test at age 3 predict academic performance at 21?
Not finding the validity evidence
means that your test is not doing what it should.
Not finding criterion validity
means you have to re-examine the nature of the items on the test and ask whether the responses relate to the criterion the way you expect.
Not finding construct validity
means that you have to take a closer look at the theoretical rationale that underlies the test you have developed.
Relationship between validity and reliability
A test can be reliable and not valid, but not the other way around. Because a test can do what it does over and over, but still not do what it is supposed to do.
HYPOTHESIS
Definition
It is an EDUCATED GUESS
The “QUESTION/PROBLEM STATEMENT” we want to answer/address with research
TRANSLATES A PROBLEM INTO A FORM that can be tested.
How to formulate a good hypothesis
- Should TRANSLATE a statement/research question into a form more amenable to testing.
- Use the RESEARCH question as a GUIDE. The hypothesis will then determine the techniques used to test it
Rules about samples and populations
Samples should ACCURATELY REPRESENT THE POPULATION, to allow a higher degree of GENERALIZATION of the study results.
Null Hypothesis def:
Statement that two or more things are EQUAL OR UNRELATED to each other
H0 : μ1 = μ2 (equal means) or H0 : ρ = 0 (no correlation)
Purpose of the NH
The NH acts as a BENCHMARK & STARTING POINT against which the actual OUTCOMES of a study can be MEASURED (the STATE OF AFFAIRS accepted as true because there is no other information)
Research hypothesis def
STATEMENT OF INEQUALITY that posits a RELATIONSHIP between variables.
RH two forms
• NON-DIRECTIONAL RH (2 TAILS): a difference between groups of UNSPECIFIED DIRECTION.
H1 : X1 ≠ X2
• DIRECTIONAL RH (1 TAIL): a difference between groups of SPECIFIED DIRECTION (more than/less than).
H1 : X1 > X2
Purpose of RH
DIRECTLY TESTED in RESEARCH. Results are compared with what you would expect by CHANCE ALONE, to see which is the more attractive explanation for any differences observed between groups.
Difference between the NH and the RH
• RELATIONSHIP (NH - NO & RH - YES)
• NH - POPULATION & RH - SAMPLE.
• NH INDIRECTLY tested & RH DIRECTLY tested
• NH in GREEK symbols & RH in ROMAN symbols.
• NH is IMPLIED & RH is EXPLICIT, which is why the NH is not stated in research reports.
What makes a good hypothesis?
• DECLARATIVE form & NOT A QUESTION
• POSITS A RELATIONSHIP between VARIABLES
• REFLECTS the THEORY on which it is based
• Should be BRIEF & TO THE POINT
• TESTABLE (unambiguous)
PROBABILITY
Why?
• BASIS for the NORMAL CURVE & the FOUNDATION for INFERENTIAL statistics.
• Determines the degree of confidence we have in stating that a finding is true and DIDN’T HAPPEN BY CHANCE
Probability: normal curve (bell-shaped curve)
def:
it is a VISUAL REPRESENTATION of a distribution of scores
What are the 3 characteristics of the normal curve?
- NOT SKEWED (mean=median=mode)
- SYMMETRICAL: both HALVES are IDENTICAL.
- ASYMPTOTIC: the tails come CLOSER and closer to the HORIZONTAL AXIS but NEVER TOUCH it.
Explain the reasons for the curve characteristics
With large sets of data & repeated samples from a population, the VALUES APPROXIMATE THE SHAPE of a normal curve. EVENTS in the EXTREMES tend to have a SMALLER PROBABILITY, and events in the MIDDLE have a HIGHER probability.
Probability
standard scores def:
Scores that are COMPARABLE because they are STANDARDIZED (expressed in SD units); they allow us to COMPARE scores from distributions with DIFFERENT MEANS
They help decide THE PROBABILITY OF SOME EVENT OCCURRING
LIKELY, MORE LIKELY, LESS LIKELY
Probability
What do z scores represent?
RAW SCORES converted to SD units; each one marks a particular LOCATION on the x-axis
- the LARGER the absolute Z score, the further away from the mean
Hypothesis testing and z scores
* NH - no difference between groups; any observed difference is attributed entirely (100%) to chance.
* RH - shows that the likelihood of the observed event occurring by chance is somewhat extreme;
the RH is then a better explanation than the NH.
* The Z score shows the LIKELIHOOD OF the EVENT HAPPENING
Inferential statistics: significant
Any DIFFERENCE between GROUPS is caused by an EXTERNAL FACTOR & not by CHANCE alone.
This allows leeway for the fact that the difference between groups may be caused by uncontrollable factors.
Inferential Stats: what is Significance level?
It is the DEGREE OF ERROR WE ALLOW: the risk we are willing to take of concluding that the RESULTS of an experiment are NOT DUE TO CHANCE ALONE when in fact they are.
What is the difference between significance and meaningfulness?
Significance (PROBABILITY) is not meaningful on its own; meaningfulness depends on CONTEXT. Significance is influenced by the group means, the variability within the groups and the size of the difference between them.
Define inferential statistics
Tools used to infer results from a sample to a population
How does inference work?
- sample
- test
- significance
- inference
Steps for significance
- State the NH
- Set the significance level (.01 or .05)
- Compute the T (test statistic)
- Compare the obtained T to the critical value
- Reject/accept the NH
What is confidence interval?
The BEST ESTIMATE OF THE RANGE of a population value, given the value from a sample
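Worked example of a 95% confidence interval for a mean. A minimal sketch under stated assumptions: the sample mean, population SD and sample size are hypothetical, and the population SD is assumed known so that a z value can be used.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 104.0  # hypothetical sample mean
pop_sd = 15.0        # hypothetical population SD, assumed known
n = 36               # hypothetical sample size

z_crit = NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence
margin = z_crit * pop_sd / sqrt(n)    # z times the standard error of the mean
ci_low, ci_high = sample_mean - margin, sample_mean + margin
print(round(ci_low, 1), round(ci_high, 1))  # 99.1 108.9
```

The population mean is estimated to lie between about 99.1 and 108.9.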
What do we need Z test for?
To compare a SAMPLE MEAN to a POPULATION MEAN
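Worked example of a one-sample Z test. A minimal sketch, assuming a known population SD; `one_sample_z` and all the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """z = (sample mean - population mean) / standard error of the mean."""
    sem = pop_sd / sqrt(n)
    return (sample_mean - pop_mean) / sem

z = one_sample_z(sample_mean=104, pop_mean=100, pop_sd=15, n=36)
# Two-tailed probability of a z at least this extreme if the NH is true.
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))

print(z)                       # 1.6
print(round(p_two_tailed, 2))  # 0.11
```

Here p > .05, so the sample mean is not significantly different from the population mean.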
T-test
Tool to look for MEAN DIFFERENCES on 1 or MORE VARIABLES between GROUPS that are INDEPENDENT of 1 ANOTHER (independent samples) or related (dependent samples)
When to use the T-test?
With independent and/or dependent samples
Assumptions of T
Observations are INDEPENDENT
The 2 populations must be NORMAL
The 2 groups must have EQUAL VARIANCE
Explain
t(58) = -.14, p > .05
t = test statistic
58 = degrees of freedom
-.14 = obtained value
p > .05 = probability (not significant)
What are degrees of freedom?
EVERYTHING you need to KNOW before you KNOW the TEST
Number of entities that are FREE TO VARY
Df = (n1-1)+(n2-1) in the independent-samples T-test
Df = n-1 in the dependent-samples T-test (n = number of pairs)
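Worked example of the two df formulas. A minimal sketch; the function names and group sizes are hypothetical. Two groups of 30 is one combination that would give the df of 58 seen in t(58).

```python
def df_independent(n1, n2):
    """Degrees of freedom for an independent-samples t-test."""
    return (n1 - 1) + (n2 - 1)

def df_dependent(n_pairs):
    """Degrees of freedom for a dependent-samples t-test."""
    return n_pairs - 1

print(df_independent(30, 30))  # 58
print(df_dependent(25))        # 24
```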
Explain tails
- 2 tails= NON-DIRECTIONAL RH, results could go either way
* 1 tail= DIRECTIONAL RH, results could only go one way
What is effect size?
HOW BIG IS BIG? How different is different (magnitude); how similar the groups are/how much they overlap
Small: 0.0-.20
Medium: .20-.50
Large: .50 and above
ES = (Xbar1-Xbar2)/SD from either group
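Worked example of the effect size formula on this card. A minimal sketch with made-up groups; `effect_size` and `label` are hypothetical names, and placing the cutoffs exactly at .20 and .50 is an assumption about how the boundaries are handled.

```python
from statistics import mean, stdev

def effect_size(group_1, group_2):
    """(mean of group 1 - mean of group 2) / SD of one group (here group 1)."""
    return (mean(group_1) - mean(group_2)) / stdev(group_1)

def label(es):
    """Rough size label; boundary handling at .20 and .50 is an assumption."""
    size = abs(es)
    if size < 0.20:
        return "small"
    if size < 0.50:
        return "medium"
    return "large"

group_1 = [10, 12, 14, 16, 18]  # made-up scores
group_2 = [9, 11, 13, 15, 17]   # made-up scores

es = effect_size(group_1, group_2)
print(round(es, 2), label(es))  # 0.32 medium
```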
SPSS T
Analyze > Compare Means > Independent-Samples T Test / Paired-Samples T Test
Z scores observation
Scores below the mean are negative
Positive scores are to the right of the mean
Z scores are comparable across distributions
Z scores formula
Z = (X - Xbar) / SD
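Worked example of the Z score formula. A minimal sketch with made-up raw scores; `z_score` is a hypothetical helper, and using the sample SD (n-1 in the denominator) is an assumption.

```python
from statistics import mean, stdev

scores = [60, 70, 80, 90, 100]  # made-up raw scores, mean = 80

def z_score(x, data):
    """Distance of x from the mean of data, in SD units."""
    return (x - mean(data)) / stdev(data)

print(z_score(80, scores))             # 0.0, a score at the mean
print(round(z_score(100, scores), 2))  # positive, above the mean
print(round(z_score(60, scores), 2))   # negative, below the mean
```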
What is the second form of reliability? example
2) PARALLEL FORMS: examines the EQUIVALENCE OR SIMILARITY of different FORMS of the same test.
Ex: two versions of the same test should yield equivalent results; if they do not = not reliable
what is the fourth form of reliability?
4) INTERRATER reliability: HOW MUCH 2 RATERS AGREE on their judgments and whether they follow the same procedures = use STANDARDISED CATEGORIES
There SHOULD BE A HIGH POSITIVE CORRELATION between the scores of different observers
Cronbach’s Alpha
A special measure of reliability (INTERNAL CONSISTENCY): how closely related a set of items are as a group. The more consistently individual items vary with the total score on the test, the higher the Cronbach’s Alpha value (it rises as the intercorrelations among test items increase).
People who do well should do well on harder questions
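Worked example of Cronbach's Alpha, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). A minimal sketch: the item scores are made up and `cronbach_alpha` is a hypothetical helper, not SPSS output.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one inner list per test item, holding that item's
    scores for the same respondents in the same order."""
    k = len(items)
    sum_item_variances = sum(variance(scores) for scores in items)
    totals = [sum(person) for person in zip(*items)]  # total score per respondent
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# Made-up scores: 3 items answered by 5 respondents.
items = [
    [3, 4, 3, 5, 2],
    [4, 4, 3, 5, 1],
    [3, 5, 2, 4, 2],
]

alpha = cronbach_alpha(items)
print(round(alpha, 2))  # 0.91
```

The items rise and fall together across respondents, so alpha is high.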
if you can establish validity
- lower the error
- use standardised instructions
- increase the number of observations
- delete unclear items
- moderate the test
- eliminate external events
What is the importance of validity?
Often times we CAN’T “SEE” the CONCEPT we are measuring, i.e. “intelligence,” or “depression” – therefore establishing validity is important
what is internal validity
The tool is measuring WHAT IT IS INTENDING to measure
what is external validity
The findings can be GENERALIZED BEYOND THE CONTEXT of the research situation
What is the third type of validity?
CONSTRUCT VALIDITY: Related to an UNDERLYING IDEA
Ex: social status
Z-scores and SD
When comparing scores across distributions, Z scores and SD units are EQUIVALENT
percentages in the curve
34.13%; 13.59%; 2.15%; 0.13% (area in each successive SD band on one side of the mean)
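The band percentages can be recomputed from the standard normal curve. A minimal sketch using Python's `statistics.NormalDist`; the variable names are illustrative, and the computed values can differ from printed tables in the last digit because of rounding.

```python
from statistics import NormalDist

curve = NormalDist()  # standard normal: mean 0, SD 1

# Area between consecutive SD marks on one side of the mean.
bands = [(0, 1), (1, 2), (2, 3)]
band_pcts = [(curve.cdf(upper) - curve.cdf(lower)) * 100 for lower, upper in bands]
beyond_3 = (1 - curve.cdf(3)) * 100  # area past 3 SD

for (lower, upper), pct in zip(bands, band_pcts):
    print(f"{lower} to {upper} SD: {pct:.2f}%")
print(f"beyond 3 SD: {beyond_3:.2f}%")
```

The four areas add up to 50%, one half of the symmetrical curve.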
What is hypothesis testing?
A procedure for DECIDING whether the OUTCOME of a STUDY supports a THEORY at a POPULATION level
error type 1
Rejecting a TRUE null hypothesis. Claiming a difference when there is NO DIFFERENCE
FALSE POSITIVE
greek alpha
error type 2
Accepting a FALSE null hypothesis. Claiming there is no difference while THERE IS A DIFFERENCE.
FALSE NEGATIVE
greek beta
p ≤ .05
On any one test of the NH there is a 5% chance you will reject it when the NH is actually true.
The probability of observing this outcome in the “normal population” is less than .05. (the “Outcome” is rejecting the NH when it is true)
There is at most a 5% probability of a score that extreme if the NH is true
balancing errors
If you set your significance level at .000001 to control for a Type I error, you risk being too stringent to detect a real effect – committing a Type II error.
Tradeoffs must be made
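The Type I error rate can be illustrated by simulation: when the NH is really true, a test at the .05 level should still reject it about 5% of the time. A minimal sketch, assuming a two-tailed one-sample Z test on made-up normal data; the function name, seed, sample size and trial count are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the run is repeatable
critical_z = NormalDist().inv_cdf(0.975)  # two-tailed cutoff for alpha = .05

def rejects_null(n=25, pop_mean=0.0, pop_sd=1.0):
    """Draw a sample from a population where the NH is TRUE and test it."""
    sample = [random.gauss(pop_mean, pop_sd) for _ in range(n)]
    z = (mean(sample) - pop_mean) / (pop_sd / sqrt(n))
    return abs(z) > critical_z

trials = 4000
type_1_rate = sum(rejects_null() for _ in range(trials)) / trials
print(type_1_rate)  # should land near 0.05
```

Lowering the significance level makes these false positives rarer, at the cost of missing more real effects (Type II errors).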
SIGNIFICANT
If p<.05 that means it is SIGNIFICANT.
If the obtained value is MORE extreme than the critical value, REJECT the NH
NOT SIGNIFICANT
p>.05 means a result is not significant
If the obtained value is NOT MORE extreme than the critical value, ACCEPT the NH
what is a Test Statistic (Obtained Value)
The RESULT of a specific STATISTICAL TEST done on a sample
independent samples
SEPARATE groups TESTED ONCE (males vs.females)
only 2 groups total
interested in DIFFERENCE BETWEEN GROUPS
NUMERATOR IN T STATISTIC
DIFFERENCE BETWEEN MEANS
Denominator in T Statistics
AMOUNT of VARIANCE WITHIN & BETWEEN groups
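Worked example of the T statistic for independent samples, showing the numerator and denominator separately. A minimal sketch, assuming the pooled-variance form (equal variances in both groups); the data and the name `independent_t` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_1, group_2):
    n1, n2 = len(group_1), len(group_2)
    # Numerator: difference between the two group means.
    numerator = mean(group_1) - mean(group_2)
    # Denominator: pooled variance scaled by the group sizes
    # (the standard error of the difference between means).
    pooled_var = ((n1 - 1) * variance(group_1)
                  + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    denominator = sqrt(pooled_var * (1 / n1 + 1 / n2))
    df = (n1 - 1) + (n2 - 1)
    return numerator / denominator, df

t, df = independent_t([5, 7, 9], [2, 4, 6])  # made-up scores
print(round(t, 2), df)  # 1.84 4
```

The obtained value would be reported as t(4) = 1.84 and then compared with the critical value.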