8- Statistics and hypothesis testing in ABA Flashcards

1
Q
Work from the top down, large to small:
       Theory
      Hypothesis
    Test hypothesis
     Specific answer

Requires statistics to interpret large amounts of data (quantitative/hard numbers)

The majority of social science researchers have a ____ orientation

A

Deductive Research Paradigm

AKA deductive approach

2
Q

Work from the bottom up, small to large:

 Generalize
Analysis (results: come to conclusions that you can generalize to other people)
   Data

Fluid, qualitative approach

Examples of qualitative research:
interviews
observation of cultures
focus groups

A

Inductive approach – research

3
Q

Research in ABA is typically ____ in that we do not test hypotheses

but we are also quantitative

Reversal designs are:
   - flexible (ABA vs. ABAC)
   - quantitative
   - without a predetermined outcome
Why the differences? Notwithstanding the differences, can we use the tools?
A

inductive

4
Q

• Goal:

  • To “describe” properties of the sample(s) you’re working with
  • Central tendency: what the most typical score in your sample or population looks like
  • Variability: how much the scores vary around that measure of central tendency (mean, median, or mode)
  • Effect size
A

“Descriptive” statistics

5
Q

- Complements visual analysis

We already use them to describe:
• level change
• IOA
Can be used in program evaluation by aggregating data across clients

May open doors for funding.
Ex. Effect size (can be compared to other effect sizes)

A

Descriptive Statistics in ABA- Reasons for using

6
Q

May hide Trends in behavior

A

Descriptive statistics in ABA: reasons for not using

7
Q

Goal:
• To use sample data as a basis for answering questions about the population. (We can’t access whole populations, so we collect samples instead.)
• Since we rely on samples, we must better understand how they relate to populations.
• Then we use HYPOTHESIS testing to make those inferences: t-tests, ANOVA, etc.

(The inferences drawn from a sample are about the population from which the sample was drawn.)

(And the inferences are about relationships or features of the population.)

A

Inferential statistics- Goal

8
Q

Appropriate for certain types of research
Ex. When ABA does not use a single-case design, such as group contingency management

May open doors to funding
• hypothesis testing

Perceived weakness of reliance on visual analysis in ABA.
Inconsistent?

A

Reasons for using Inferential statistics - ABA

9
Q

• Do not tell us how likely the results are to be replicated.

  • In ABA we use an ABA (reversal) design or multiple baseline design.
  • With INFERENTIAL statistics, we’re not operating under circumstances that allow us to REPLICATE the effect.

Do not tell us the probability that the results were due to chance.

The p-value is a CONDITIONAL probability: the probability of the obtained results given a true null hypothesis.

  • Very few situations in which there is only randomness in data.
  • Best way to increase your chances of significance is increasing the number of participants.
  • A large number of variables that have very small effects become important.
  • Limits the reasons for doing experiments.
  • Reduces scientific responsibility.
  • Emphasizes population parameters at the expense of behavior.

“Behavior is something an individual does, not what a group average does.”

• We should be attending to:

  • value/social significance
  • durability of changes
  • number and characteristics of participants who improve in a socially significant manner
A

Inferential - Some reasons for not using it in ABA

10
Q

Looked at behavioral treatment and normal educational and intellectual functioning in young autistic children (Journal of Consulting and Clinical Psychology, 1987)

Hypothesis: the construction of a special, intense, and comprehensive learning environment for very young children with autism would allow them to catch up with their normal peers by first grade.

Subjects were young children diagnosed with autism.

  • Group one: 19 subjects – 40 hours a week of ABA
  • Group two: 19 subjects – 10 hours a week of ABA
  • Group three: 21 subjects – other treatments

Groups one and two received two or more years of therapy

A

Lovaas

11
Q

Statistical analysis (MANOVA) was used to compare the DV (IQ) across groups and show that the intensive group demonstrated a large increase relative to the other conditions

He was a behavior analyst. Why hypothesis testing, statistics, and IQ as a dependent variable?

  • Intensive, long-term study that used measures and analyses that others NOT in our field would pay attention to.
  • Control groups allowed for strong conclusions
A

Inferential Statistics

Lovaas Study

12
Q
  1. Nominal (name): categories
    Ex. School districts and colors
  2. Ordinal (order): quantities that have an order
    Ex. Physical fitness and pain scale
    (Not a lot you can do with these two types of data)
  3. Interval: the difference between each value is even
    Ex. Degrees Fahrenheit
  4. Ratio: the difference between each value is even and there is a true zero
    Ex. Time, weight, temperature in kelvin

Practically, interval and ratio are the types of data we are interested in

A

Four types of data used in statistics

13
Q
  1. Mean
  2. Median
  3. Mode

More than one because many different types of Distributions are possible.

A

Three measures of central tendency

Descriptive statistics

14
Q

The sum of the scores divided by the number of scores

Advantage: every number in the distribution is used in its calculation

However, changing a single score or adding a new score will change it, except when the new score equals the mean

Most preferred measure

  • Every score is used in its calculation
  • Used to calculate other statistics

However, there are some situations in which the mean cannot be calculated or is not the most representative measure.

Remember, the goal is to find a single value that best represents the entire distribution (in those situations, the median or mode)

A

Mean

15
Q

The score that divides the distribution exactly in half

A ____ split gives researchers two groups of equal size:

  - Low scores
  - High scores
A

Median

16
Q
Odd number of scores:

  • List from lowest to highest
  • It’s the middle score

Ex.: 10, 11, 12, 13, 14 → median = 12

Even number of scores:

  • List from lowest to highest
  • Add the middle 2 scores and divide by two

Ex.: 2, 3, 5, 8, 10, 12 → median = (5 + 8)/2 = 6.5

A

Calculate Median
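
The two cases on this card can be sketched in Python. This is only an illustration; the helper name `median_of` is made up for this example and is not part of the course material.

```python
def median_of(scores):
    """Middle score for an odd count; mean of the two middle scores for an even count."""
    ordered = sorted(scores)              # list from lowest to highest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                        # odd number of scores
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even number of scores


odd_example = median_of([10, 11, 12, 13, 14])      # middle score
even_example = median_of([2, 3, 5, 8, 10, 12])     # (5 + 8) / 2
```

Python's built-in `statistics.median` gives the same results for these two lists.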

17
Q

Use when there are:

Extreme scores/skewed distributions

Undetermined values

Open-ended distributions

A

Median: When to use

18
Q

The score or category that has the greatest frequency (the peak)

A distribution can have more than one mode:
• bimodal
• multimodal

Easy to find in basic frequency distribution tables

NOT a frequency. It’s a score or category

A

Mode

19
Q

Two modes/peaks;

Can be equal or major/minor

A

BiModal

20
Q

More than two modes

A

Multimodal

21
Q

It can be used in place of or in conjunction with other measures of central tendency. Use when there are:

  1. NOMINAL scales (it is the only measure of central tendency for nominal scales)
    Ex. Are you male or female? 40 are male, 60 are female. You can’t calculate the mean or median, but you can say the most TYPICAL participant is female because that’s 60% of the sample.
  2. Discrete variables: “What is the most typical score?” Remember the goal of measures of central tendency.
    Ex. Number of golf clubs: a calculated mean may be a value no one actually has; the mode gives the most typical score.
  3. Describing shape: the mode is easy to figure out
A

Mode
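
A small Python sketch of the nominal-scale example, using the standard library's `statistics.multimode`. The sample is rebuilt from the 40/60 split on the card; the bimodal list of scores is invented for illustration.

```python
from statistics import multimode

# Nominal data from the card: 40 male and 60 female participants
sample = ["male"] * 40 + ["female"] * 60
modes = multimode(sample)                 # most frequent category or categories

# Invented scores where two values tie for the highest frequency (bimodal)
bimodal_modes = multimode([1, 2, 2, 3, 3, 4])
```

Note that the result is a category or score, not a count, which matches the card's point that the mode is not a frequency.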

22
Q

Describes the distribution in terms of distance:
How far is a score from the central tendency, whether mean, median, or mode?

Distance between one score and another, or

Distance between one score and the mean

Describes how well each score, or a group of scores, represents the entire distribution.

Provides a quantitative measure of the degree to which scores in a distribution are spread out or clustered together.

A

Variability

23
Q
  1. Range
  2. Interquartile range
  3. Standard deviation - most important
A

Three measures of variability

24
Q

The distance between the largest score and the smallest score, plus 1

A crude, unreliable measure of variability because:
- It does not consider ALL the scores in the distribution

Calculate:
Ex. 1:
1, 4, 5, 8, 9, 10

10 - 1 + 1 = 10

Ex. 2:
10, 15, 20, 25, 30, 35, 40

40 - 10 + 1 = 31

It takes only the highest and lowest scores and ignores all the others, so it is not a detailed measure of variability.

A

Range – variability Measure
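
The calculation on this card is short enough to check in Python. This is a sketch only; the function name `inclusive_range` is invented here, and the "+ 1" follows the card's definition of the range.

```python
def inclusive_range(scores):
    """Largest score minus smallest score, plus 1 (the definition used on this card)."""
    return max(scores) - min(scores) + 1


example_1 = inclusive_range([1, 4, 5, 8, 9, 10])
example_2 = inclusive_range([10, 15, 20, 25, 30, 35, 40])
```

Only `max` and `min` enter the calculation, which is why the card calls the range a crude measure.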

25
Most important measure of variability
Measures the “typical” DISTANCE from the MEAN and uses ALL of the scores in the distribution
How far is each score from the mean?
Uses in ABA:
- Can be used to identify variability in behavioral data (autocorrelation can be used for this too).
- Can be used to identify important variability in IOA data. The mean and range alone do not tell us which set of circumstances we have (scores clustered tightly or spread widely), which is why we should always report the standard deviation of IOA scores along with the mean.
Standard deviation – variability measure
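
To illustrate why the mean and range are not enough, here is a minimal sketch with two invented sets of IOA percentages (hypothetical data, not from the course). They share the same mean and range but differ in standard deviation.

```python
from statistics import mean, stdev

# Hypothetical IOA scores (percent agreement) for two observers or programs
ioa_a = [80, 90, 90, 90, 100]
ioa_b = [80, 80, 90, 100, 100]

mean_a, mean_b = mean(ioa_a), mean(ioa_b)                         # both 90
range_a = max(ioa_a) - min(ioa_a)                                 # 20
range_b = max(ioa_b) - min(ioa_b)                                 # 20
sd_a, sd_b = stdev(ioa_a), stdev(ioa_b)                           # sample standard deviations
```

Set B has the larger standard deviation because more of its scores sit at the extremes, something neither the mean nor the range reveals.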
26
The relationship between samples and populations
We cannot talk about the exact relationship between samples and populations, but we can talk about potential outcomes (i.e., probability)
Probability - inferential statistics
27
To make “inferences” about populations based on sample data
We are sampling the population with a certain probability
Two kinds:
1. Subjective
2. Objective
Inferential statistics – Role
28
Based on experience or intuition
- Chance of rain, likelihood of recession, chance of getting married in the next year, likelihood of the Miami Heat winning another championship
Subjective probability
29
Based on mathematical concepts and theory
Objective probability – inferential statistics
30
P(event) = (number of outcomes classified as the event) / (total number of possible outcomes)
The probability of event A, p(A), is the ratio of the number of outcomes that include event A to the total number of possible outcomes.
Example: What is the probability that a randomly selected person has a birthday in October? Assume 365 days in a year.
Step 1: How many chances are there to have a birthday in a year? 365
Step 2: How many chances are there to have a birthday in October? 31
Step 3: The probability that a randomly selected person has a birthday in October is p(October birthday) = 31/365 = 0.0849
Probability formula
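
The October example can be reproduced in a couple of lines of Python. The variable names are just illustrative.

```python
# Outcomes classified as the event, and total possible outcomes
days_in_october = 31
days_in_year = 365          # the card's assumption (no leap year)

p_october = days_in_october / days_in_year
p_october_rounded = round(p_october, 4)
```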
31
Contained in a limited range, 0 to 1.
If p = 0, the event will not occur. If p = 1, the event will always occur.
Can be expressed as fractions, decimals, or percentages. These values are never negative.
Ex.: p = 3/4, p = 0.75, p = 75% (all these values are equal)
In order to apply these rules to samples and populations, we must satisfy two requirements:
1. Each individual in the population must have an equal chance of being selected.
2. If more than one individual is to be selected for the sample, there must be constant probability for each and every selection (sampling with replacement). Example: you draw a number out of a hat and record it, then put the number back so it can be chosen again.
Remember, probability and proportion are equivalent. Thus, whenever a population is presented in a frequency distribution graph, it is possible to represent probabilities as proportions of the graph.
Ex.: What is the probability of drawing an exam with a B or better out of a pile of 31? Number of students getting a B or better = 24. 24/31 = 0.77, or 77%.
We can convert from a frequency distribution to a probability.
Probability values
32
Normal-shaped distributions are the most commonly occurring shape for population distributions.
Identify sections of a normal distribution using z-scores (e.g., 1 or 2 SD above the mean).
The normal shape can also be described by the proportions of area contained in each section of the distribution. Ex.:
1) The left and right sides of the distribution have the same proportions
2) The proportions apply to any normal distribution
Why is this important? We can now describe X values (raw scores) in terms of probability.
Ex.: What is the probability of randomly selecting a person who is taller than 80 inches? 2.28% (see slide)
Where do these percentages come from? Ex.: a raw score of 118 on an IQ test converts to z = 1.02; look for the corresponding proportion in the table.
Probability And frequency distribution
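
As a rough sketch of where the table percentages come from, the standard library's `statistics.NormalDist` can play the role of the unit normal table. The 2.28% figure on the card corresponds to the area above z = 2; the height example's mean and SD are on a slide that is not reproduced here, so only the z-score is used.

```python
from statistics import NormalDist

unit_normal = NormalDist()                      # mean 0, SD 1

# Proportion of the distribution above z = 2 (the right tail)
prop_above_2 = 1 - unit_normal.cdf(2.0)
pct_above_2 = round(prop_above_2 * 100, 2)

# Proportion of the distribution below z = 1
pct_below_1 = round(unit_normal.cdf(1.0) * 100)
```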
33
A tool that allows you to see how you’ve done relative to a normal distribution.
Identify sections of a normal distribution using z-scores (e.g., 1 or 2 SD above the mean).
Collecting enough data tends to yield a normal distribution.
If I get a particular score, I can convert it to a z-score (if I know the SD and mean).
A z-score of 1.0 means I did better than 84% of the population.
Needing the population standard deviation and mean is a large limitation.
Why use a z-score? Because the data are normally distributed, if I get a score and it translates into z = 1, I know exactly how I did compared to everyone else.
You can look up a z-score in a table.
```
Application of Z-Scores
Test 1 scores: 908 958 962 977 1000 1000 1045 1046 1047 1060   (mean 1000, SD 50)
Test 2 scores: 109 121 125 145 152 158 165 170 178 180         (SD 25)
What if I took both tests?
Test 1 score: 1100
Test 2 score: 200
```
```
Application of Z-Scores
Test 1 score: 1100   Z-score: 2
Test 2 score: 200    Z-score: 2
A z-score of 2 is higher than about 98% of scores; about 2% are higher.
What if I get a Z-score that is not a pretty number?
```
Z – score review
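
A minimal sketch of the conversion used on this card. The Test 1 mean and SD come from the card. The Test 2 mean of 150 is an assumption: it is not printed on the card, but the ten listed Test 2 scores average about 150, and it reproduces the card's z-score of 2.

```python
def z_score(raw_score, mean, sd):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw_score - mean) / sd


z_test_1 = z_score(1100, mean=1000, sd=50)
z_test_2 = z_score(200, mean=150, sd=25)    # mean of 150 is assumed, see note above
```

Because both scores convert to the same z-score, the two performances are equivalent relative to their own distributions even though the raw numbers differ.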
34
Normally distributed population: μ (mean) = 24 years old, σ (SD) = 2 years
Draw ten samples of n = 25 from the population and record each sample mean (M):
1. First sample: M = 22
2. Second sample: M = 22
3. Third sample: M = 20
4. Fourth sample: M = 18
5. Fifth sample: M = 26
6. Sixth sample: M = 22
7. Seventh sample: M = 24
8. Eighth sample: M = 24
9. Ninth sample: M = 26
10. Tenth sample: M = 24
We now have 10 means from samples of 25: 22, 22, 20, 18, 26, 22, 24, 24, 26, 24
We take those 10 means and create a frequency distribution of the means (the distribution of sample means).
Closer inspection of the distribution of sample means reveals a mean = 22.8 and SD = 2.52 (the SD of a distribution of sample means is called the standard error of the mean; it is the typical distance of a sample mean from the center of that distribution)
What did we just do?
• Used a sample to provide information about a population
• What do we already know about this process? Samples provide incomplete pictures of the population; the discrepancy is called sampling error
Sampling distribution of the means: Inferential statistics
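
The summary numbers in this illustration can be checked with a short Python sketch. The ten means are taken from the card. The sample standard deviation works out to roughly 2.53, close to the 2.52 reported on the card. For comparison, the code also computes the theoretical standard error σ/√n for this population, which is much smaller, so the ten illustrative means are more spread out than real samples of 25 would typically be.

```python
from statistics import mean, stdev

sample_means = [22, 22, 20, 18, 26, 22, 24, 24, 26, 24]   # from the card

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)        # spread of the ten sample means

# Theoretical standard error for sigma = 2 and n = 25
sigma, n = 2, 25
theoretical_se = sigma / n ** 0.5
```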
35
The difference between the mean of a sample and the mean of the population
Or: the discrepancy, or amount of error, between a sample statistic and its corresponding population parameter
From the illustration:
- Population mean = 24, SD = 2
- Mean of the sample means = 22.8, SD = 2.52
Samples will differ from the population because they contain different individuals, different scores, and therefore different sample means.
Sampling error
36
Sampling Error How can you tell which sample best describes the population? Can you predict how well a sample will describe its population? What is the probability of selecting a sample that has a certain sample mean? We answer this question by establishing a set of Rules that…
...Relate samples to populations
37
The collection of sample means for all possible random samples of a particular size (n) that can be obtained from a population.
E.g., 10 samples yielded a collection of sample means, and each sample size was 25 (random samples of a particular size, n).
Different samples taken from the same population will yield different statistics
In most cases, it is possible to obtain thousands of different samples from one population
The sample means tend to pile up around the population mean
The distribution of sample means is approximately NORMAL in shape
We can use the distribution of sample means to answer PROBABILITY questions about sample means
How can we predict characteristics of the sample? It’s not always possible to collect and compute ALL the possible sample means, so we need some general characteristics that describe a distribution of sample means. This leads to the Central Limit Theorem
Distribution of sample means
38
Summary: the larger your sample size (n), the more normal the distribution of sample means will be.
For any population, the distribution of sample means will approach a normal distribution as n approaches infinity.
The shape of the distribution of sample means will be almost perfectly NORMAL if either one of the following conditions is satisfied:
• The population from which the samples are selected is normal
• The number of scores (n) in each sample is relatively LARGE (n > 30)
A sample mean is expected to be near its population mean
Central Limit Theorem
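
A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not part of the course material: it draws samples of n = 30 from a deliberately skewed (exponential) population with mean 1 and SD 1, then checks that the sample means pile up around the population mean with a spread close to σ/√n. The seed and the number of samples are arbitrary choices.

```python
import random
import statistics

random.seed(0)                 # fixed seed so the run is repeatable
n = 30                         # scores per sample
num_samples = 2000             # number of sample means to collect

# Skewed population: exponential with mean 1 and SD 1
sample_means = [
    statistics.mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_se = 1.0 / n ** 0.5   # sigma / sqrt(n)
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the individual scores are strongly skewed.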
39
The larger the sample size, the more probable it is that the sample mean will be CLOSE to the population mean. The primary use of a distribution of sample means is to find the probability associated with any specific sample
The law of large numbers
40
A statistical method that uses sample data (statistics) to evaluate a hypothesis (question) about a population parameter.
A basic, common inferential procedure that uses z-scores, probability, and the distribution of sample means
Purpose: to help researchers differentiate between REAL patterns in data and RANDOM patterns in data.
Hypothesis testing
41
Begins with a population with known parameters.
Goal: to determine what happens to the population after the treatment is administered.
If the treatment has any effect, it is simply to add or subtract a constant amount to each individual score.
• Shape and standard deviation will remain the same
Hypothesis testing
42
Because researchers must have a standardized method for evaluating the results of their research studies. Not everyone will acknowledge visual analysis. We need to disseminate and speak the language of those outside of behavior analysis.
Formalized testing procedure: why it is important
43
* Random sampling
* Independent observations: each individual data point must be independent of the next data point
* Value of SD is unchanged by the treatment
* Normal sampling distribution
Assumptions for hypothesis tests with Z – score
44
1. State the hypothesis about a population
2. Set the criteria for a decision
- Use the hypothesis to predict characteristics that the sample should have.
3. Collect data and compute sample statistics
- Obtain a random sample, compute the mean
4. Make a decision
- Compare the obtained sample data with the prediction that was made from the hypothesis
Hypothesis testing: four main steps
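
The four steps can be walked through in code for a two-tailed z-test. All of the numbers below (population mean 100, SD 15, a sample of 25 with mean 106, alpha of .05) are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

# Hypothetical known population parameters and sample
mu_0, sigma = 100, 15
n, sample_mean = 25, 106
alpha = 0.05

# Step 1: state the hypotheses. H0: mu = 100, H1: mu != 100 (two-tailed)
# Step 2: set the criteria. Critical z cuts off alpha/2 in each tail
z_critical = NormalDist().inv_cdf(1 - alpha / 2)

# Step 3: collect data and compute the test statistic
standard_error = sigma / n ** 0.5
z = (sample_mean - mu_0) / standard_error

# Step 4: make a decision. Reject H0 if z falls in the critical region
reject_h0 = abs(z) > z_critical
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
```

With these made-up numbers the sample mean falls in the critical region, so the decision is to reject H0; a sample mean closer to 100 would lead to "fail to reject."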
45
Determine the effect of a certain treatment on the population mean:
• What is the effect of verbal stimulation on the language development of an infant?
• What is the effect of the use of alcohol on visual and auditory perception?
• What is the effect of family therapy on the relapse hospitalization rate of schizophrenia patients?
• What is the effect of the empty chair gestalt technique on the expression of anger and sadness?
Example hypothesis
46
Step 1: State the hypothesis
• Statements about the unknown population after treatment, in terms of population parameters
• Two opposing hypotheses (non-directional):
1. NULL hypothesis, H0: predicts the independent variable (treatment) will have NO effect on the dependent variable (symbolic statement: H0: μ = ?)
2. Alternative hypothesis, H1: predicts that the independent variable (treatment) WILL have an effect on the dependent variable (symbolic statement: H1: μ ≠ ?)
Keep in mind: these are non-directional hypotheses, so they refer to two-tailed hypothesis tests, which means they DO NOT predict the direction (increase or decrease) of change
Step One: Hypothesis Testing 
47
Think of being on trial for a crime when you are innocent. Your plea will be “not guilty.”
Null hypothesis: no relationship (not guilty)
Alternative hypothesis: guilty
You are presumed not guilty; the prosecutor must demonstrate that you are guilty.
If the prosecutor is successful (found guilty), the jury has rejected the null hypothesis in favor of the alternative.
If the prosecutor is unsuccessful (found not guilty), the jury has failed to reject the null hypothesis.
The defense attorney did not prove you are innocent; there is just not enough evidence for the alternative hypothesis.
Fail to Reject?
48
Example:
Null hypothesis = Treatment A is no better than Treatment B
Alternative = Treatment A is better than Treatment B
Possible researcher conclusions:
- If the data show Treatment A is better than Treatment B, reject the null hypothesis (supporting the alternative)
- If the data do NOT show Treatment A is better than Treatment B, fail to reject the null hypothesis
Null Hypothesis
49
Step 2: Set the criteria for a decision, using data from the sample to evaluate the credibility of the null hypothesis
Create the distribution of sample means that would be obtained if the null hypothesis is true
Divide the distribution of sample means into two regions:
- Sample means likely to be obtained if H0 is true
- Sample means that are very unlikely to be obtained if H0 is true
Need to separate high-probability samples from low-probability samples
Step Two: Hypothesis Testing
50
Step 3: Collect data and compute sample statistics
Data are collected after the hypothesis is stated and the criteria are set
• Ensures an honest, objective evaluation of the data
Sample mean computed
Sample mean compared with the mean stated in H0
Step Three: Hypothesis Testing
51
Step 4: Make a decision
• 2 possible decisions using z-scores:
1. Reject H0
2. Fail to reject H0 (this is not the same as proving H0 true)
Reject the null hypothesis H0 if:
- Sample mean falls in the critical region
- Big discrepancy between sample and H0 (unlikely to occur if H0 is true)
- Treatment effect demonstrated
Fail to reject the null hypothesis if:
- Sample mean does not fall in the critical region
- Data are reasonably close to H0
- Treatment effect not demonstrated
Note: we never talk about proving the alternative
Step Four: Hypothesis Testing
52
Purpose: to determine whether the result of the research study (the obtained difference) is more than would be expected by CHANCE alone
Example:
• The z-score statistic that is used in hypothesis testing
A test statistic is a statistic (a quantity derived from the sample) used in statistical hypothesis testing. A hypothesis test is typically specified in terms of a test statistic, considered as a numerical summary of a data set that reduces the data to one value that can be used to perform the hypothesis test.
In general, a test statistic is selected or defined in such a way as to quantify, within observed data, behaviours that would distinguish the null from the alternative hypothesis, where such an alternative is prescribed, or that would characterize the null hypothesis if there is no explicitly stated alternative hypothesis.
An important property of a test statistic is that its sampling distribution under the null hypothesis must be calculable, either exactly or approximately, which allows p-values to be calculated.
A test statistic shares some of the same qualities of a descriptive statistic, and many statistics can be used as both test statistics and descriptive statistics. However, a test statistic is specifically intended for use in statistical testing, whereas the main quality of a descriptive statistic is that it is easily interpretable. Some informative descriptive statistics, such as the sample range, do not make good test statistics since it is difficult to determine their sampling distribution.
Two widely used test statistics are the t-statistic and the F-test.
Test Statistic- Hypothesis Testing
53
Hypothesis testing can result in two types of errors: Type I and Type II
Type I: rejecting the null hypothesis when it is actually true. That is:
• A treatment effect is found when the effect does not exist
Consequence of making this error:
• False reports in the scientific literature
P(Type I error) = α (alpha), chosen by the experimenter
Type I error
54
Failing to reject the null hypothesis when it is actually false. That is:
- A treatment effect exists but it is not detected
P(Type II error) = β (beta), which cannot be simply determined.
Type II error can’t be controlled, and it is determined by many factors.
Type II error Hypothesis Testing
55
Shortcoming as an inferential statistic:
• The computation requires knowing the population standard deviation
So we use the t-statistic, rather than the z-score, for hypothesis testing
Z-scores vs. t-test
56
____: with a sufficient sample from a population, and independent observations, we can test a hypothesis.
Used to COMPARE two MEANS; tells you whether or not they are statistically different.
Requires independent observations:
• The occurrence of the first event has no effect on the probability of the second event
• Usually satisfied by using random samples
Use when you have TWO groups (e.g., treatment vs. control)
Example null hypothesis: teaching strategy A produces no difference in standardized test scores when compared to standard teaching strategy B
Use this rather than a z-score for hypothesis testing
What if our samples were not independent?
T-test Study hint (T = “Two Mean”)
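
As a hedged sketch of how two independent group means are compared, the function below computes a pooled-variance t statistic by hand with the standard library. The function name and the test scores for the two teaching strategies are invented for illustration.

```python
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = (pooled_var / n_a + pooled_var / n_b) ** 0.5
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical standardized test scores
strategy_a = [85, 90, 88, 92, 95]
strategy_b = [78, 82, 80, 85, 75]

t_stat, df = independent_t(strategy_a, strategy_b)
```

The obtained t is then compared with the critical value for the given degrees of freedom; for df = 8 and a two-tailed alpha of .05 that value is 2.306, so these invented data would lead to rejecting H0.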
57
They eliminate the problem of individual differences between subjects.
• So, they are also called WITHIN-SUBJECT designs.
This greatly reduces the sample variance, which can be inflated by differences between subjects that have nothing to do with treatment effects
Related-Samples Studies t-test Advantage
58
1. Carryover effects: the subject’s response in the second treatment is altered by lingering aftereffects from the first treatment
2. Progressive error: the subject’s performance changes consistently over time
Two ways to deal with potential problems:
1) Counterbalance the order of treatment presentation
2) If substantial contamination is expected, use a different experimental design (i.e., independent-measures)
Related-samples t-test: contaminating factors that can cause D to be statistically significant when there is actually no difference between the before and after conditions
59
1) The observations within each treatment condition must be independent 2) The population distribution of difference scores (D values) must be normal* (This can be ignored if n is greater than or equal to 30)
Assumptions of the Related-Samples t-Test
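
For a related-samples (repeated-measures) design, the t statistic is computed on the difference scores (D values). The before and after scores below are hypothetical, and with only five participants the normality assumption on the card would matter.

```python
from statistics import mean, stdev

# Hypothetical scores for the same five participants
before = [10, 12, 9, 14, 11]
after = [13, 14, 10, 17, 13]

# D values: one difference score per participant
d_values = [post - pre for pre, post in zip(before, after)]

n = len(d_values)
standard_error = stdev(d_values) / n ** 0.5
t_stat = mean(d_values) / standard_error     # tests H0: mean difference = 0
df = n - 1
```

Because each participant serves as their own control, differences between participants drop out of the D values, which is the advantage the related-samples card describes.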
60
In some cases, more appropriate for demonstrating skill acquisition than a traditional single-case design
Multiple baseline or repeated measures? In this case, a repeated measures design would probably have been more believable
- “Slavish devotion to design”
Resources would probably have been better spent on testing MORE participants, rather than testing the same few participants many times over
• Establishes generality
• Effective?
• Calculate effect size
Repeated Measure Related samples t-test and ABA
61
If there are three groups instead of two:
- Example: the effects on language acquisition of therapy A, therapy B, and therapy C.
We could compare A vs. B, B vs. C, and A vs. C by using three t-tests, but running multiple t-tests inflates the chance of a Type I error.
This type of analysis tells whether or not there is a significant difference among THREE or more groups (H0: A = B = C)
Follow up with multiple comparison procedures (MCPs) to find where the difference is located
- Example: therapy A is better than B and C
Analysis of Variance (ANOVA)
62
Depends on the research question
Can use ABA in a group design
Can use hypothesis testing for single-case data, with caution! Concerns:
- Non-independence of observations
- Trends may be masked if we just focus on means
- Single-case effect sizes may tell a better story, and be better received among an ABA audience, than hypothesis testing
Can use ABA in a group design if we test early ABA intervention on socially important outcomes in children with autism.
Other tests:
• Factorial analysis of variance (more than two groups and more than one factor), MANOVA, and repeated measures ANOVA
Are t-tests and ANOVAs appropriate for me? ABA and hypothesis testing
63
A statistical technique used to measure and describe a relationship between two variables
Tests relationships between quantitative variables or categorical variables.
- A measure of how things are related
- The study of how variables are correlated
Correlation
64
The type of data/question influences what you can do.
Ex.: If I ask the question “Does smoking shorten your life?”, can you do a t-test for this? I can look at how long you smoked and how many years you lived, and draw a “CORRELATION.”
Correlations are useful because if you can find out what relationship variables have, you can make predictions about future behavior. Knowing what the future holds is very important in fields like government and healthcare. Businesses also use these statistics for budgets and business plans.
Why Correlations?
65
Three characteristics of the relationship between X and Y:
1. Direction (positive or negative)
• Positive correlation: X and Y change together, moving in the same direction
  - When X increases, Y also increases
  - When X decreases, Y also decreases
• Negative correlation: X and Y change inversely
  - When X increases, Y decreases
  - When X decreases, Y increases
2. Form of the relationship, which is LINEAR
3. Degree (strength) of the relationship
• +/- 1.00 is a perfect correlation (straight line): a perfectly consistent, predictable relation
• 0.00 means no relationship between X and Y
Note: the correlation coefficient (Pearson correlation, r) will always be between -1.00 and +1.00
```
Examples
• Head size and memory
• Height and shoe size
• Anxiety and athletic performance
• Red cars and speeding tickets
• IQ and social skills
• GRE and college GPA
• Class attendance and grades
```
What Does Correlation Measure?
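
Direction and degree can both be read off the Pearson correlation. The sketch below computes r from its definition (sum of products over the square root of the product of the sums of squares); the function name and the small data sets are invented for illustration.

```python
def pearson_r(xs, ys):
    """Pearson correlation: SP / sqrt(SS_x * SS_y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))   # sum of products
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / (ss_x * ss_y) ** 0.5


x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 5, 4, 5])          # positive, but not perfect
r_perfect_negative = pearson_r(x, [10, 8, 6, 4, 2]) # straight line sloping down
```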
66
You CANNOT determine cause-and-effect relations from a correlation.
A correlation coefficient is a way to put a VALUE on the relationship. Correlation coefficients have a value between -1 and 1. A “0” means there is no linear relationship between the variables at all, while -1 or 1 means that there is a perfect negative or positive correlation (negative or positive here refers to the direction of the line the relationship produces on a graph).
Types: the most common correlation coefficient is the Pearson correlation coefficient. It’s used to test for LINEAR relationships between data.
Understanding and Interpreting Correlations
67
Describes the linear relationship between two or more variables. Linear _____ equation. Mostly a prediction formula. - Builds upon correlations to make predictions. Ex.: Two variables: a new test for high school seniors and first-year college GPA. Are these variables related? We can calculate a correlation easily. From the graph: correlation = .89, a very strong positive correlation, but now what? Effect size?
Regression- Correlation
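One way to picture the "prediction formula" is a least-squares line fitted to paired scores. This is only a sketch: the test scores and GPAs below are invented, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting Y from X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sp / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: new test score and first-year college GPA
test_scores = [10, 20, 30, 40, 50]
gpa = [2.0, 2.4, 3.1, 3.3, 3.9]

slope, intercept = fit_line(test_scores, gpa)
# Predict GPA for a student who scored 35 on the new test
predicted = intercept + slope * 35
print(round(slope, 3), round(intercept, 2), round(predicted, 3))
```

The correlation tells you the variables are related; the fitted line turns that relationship into a predicted value for a new score.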
68
A measure of the strength of a phenomenon. Isn’t that covered in significance testing? No, significance testing only informs us of: - “the probability of obtaining the results that were obtained in the study, given that the null hypothesis is true” (p.167). Examples: p < 0.05, p < 0.01. Think about an ABAB reversal design. Measures: r/R in correlations and regressions; Cohen’s d. There are others, but they are not important here.
Effect Size
69
Interpretation (in many fields using Cohen’s d): - Small: .2 to .3, only visible through careful study - Medium: .5 - Large: .8 and larger, easy to identify. But size should be interpreted based on your subject and question. Reassuring scenario: use the correct tool (t-test) with all of the parameters met, significance at .01, and an effect size of .89.
Effect Size
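A minimal sketch of Cohen's d for two groups, assuming the pooled-standard-deviation version of the formula. The scores are hypothetical, and the `label` cutoffs simply encode the conventional small/medium/large benchmarks listed on the card.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) +
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)

def label(d):
    """Conventional benchmarks; interpret relative to your own subject area."""
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "negligible"

# Hypothetical scores for a treatment group and a control group
treatment = [12, 14, 15, 17, 17]
control = [10, 12, 13, 15, 15]

d = cohens_d(treatment, control)
print(round(d, 2), label(d))
```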
70
Makes it relatively easy to calculate and interpret for group data. Can be used to summarize data from many studies with different dependent variables. Not dependent on sample size. Drawbacks: Cannot be applied to behavioral data because of: - Dependent observations - Autocorrelation. The studies can be too different.
Effect Size - Benefits
71
The best way to increase your chances of significance is to increase the number of participants. A large number of variables that have very small effects become important. Limits reasons for doing experiments; only applies if we are working in the hypothetico-deductive model. Reduces scientific responsibility. Emphasizes population parameters at the expense of behavior. The probability is a conditional probability of an event under a TRUE null hypothesis. - Very few situations in which there is only randomness in the data.
Inferential statistics - reason NOT to use in ABA
72
Behavior is something an individual does, not what a group average does. We should be attending to: • Value/social significance • Durability of changes • Number and characteristics of participants that improve in a socially significant manner
Inferential statistics in ABA - Reason not to use
73
Can be measured to identify variability in behavioral data (autocorrelation can be used for this too). Can be described to identify important variability in IOA data.
Standard deviation
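As a sketch of both ideas on this card, the snippet below computes the sample standard deviation of a session-by-session data series and a simple lag-1 autocorrelation. The response counts are invented, and `lag1_autocorrelation` is a hypothetical helper using one common textbook formula.

```python
from statistics import mean, stdev

def lag1_autocorrelation(series):
    """Correlation of each observation with the one that follows it."""
    m = mean(series)
    numerator = sum((series[i] - m) * (series[i + 1] - m)
                    for i in range(len(series) - 1))
    denominator = sum((x - m) ** 2 for x in series)
    return numerator / denominator

# Hypothetical responses per session for one client
responses = [4, 5, 7, 8, 10, 11, 13, 14]

sd = stdev(responses)                  # variability around the mean
r1 = lag1_autocorrelation(responses)   # dependence between adjacent sessions
print(round(sd, 2), round(r1, 2))
```

A clearly positive lag-1 value, as in this trending series, is a sign that adjacent observations are not independent.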
74
Goal - To find a single value that best Represents the Entire distribution: Median and mode
Mean
75
Example: How many cups of coffee per day? May find that the most common value (the mode) is 2 cups, even though the mean and median might be different.
Discrete variable- what is most typical score
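The coffee example can be sketched with Python's standard `statistics` module. The survey answers below are hypothetical and chosen so that the three measures of central tendency disagree.

```python
from statistics import mean, median, mode

# Hypothetical answers to "How many cups of coffee per day?"
cups = [1, 2, 2, 2, 3, 3, 4, 5, 8]

most_typical = mode(cups)   # the single most frequent value
middle = median(cups)       # the middle score when sorted
average = mean(cups)        # pulled upward by the 8-cup drinker

print(most_typical, middle, round(average, 2))
```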
76
* No info on effect size * Limits reasons for experiments * multiple subjects needed Minimize social significance Minimizes significance of the individual
Hypothesis
77
1. refers to categories | Ex. School districts
Nominal (name)
78
Quantities that have an order Ex. Physical fitness and pain scale (Not a lot you can do with these two types of data)
Ordinal (order)
79
Difference between each value is Even | Ex. Degrees Fahrenheit
3. Interval
80
When the difference between each value is even, has a true Zero Ex. Time, weight, temperature in kelvin
4. Ratio
81
The effect was based on the treatment versus random things in the environment.
REAL Patterns in data- hypothesis testing
82
Wasn’t affecting all of the kids at the same time
RANDOM Patterns in data – hypothesis testing
83
Variables that can only take on a finite number of values are called "discrete variables." All Qualitative variables are discrete. Some quantitative variables are discrete, such as performance rated as 1,2,3,4, or 5, or temperature rounded to the nearest degree.
Discrete Variable
84
A distribution of statistics obtained by selecting ALL possible samples of a specific size from a population
Sampling Distribution
85
We can use this to answer PROBABILITY questions about the sample mean
the Distribution of Sample mean
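To make "all possible samples" concrete, the sketch below enumerates every sample of size 2 from a tiny made-up population and answers one probability question about the sample mean. It assumes sampling with replacement; the population values are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]   # hypothetical population of four scores
n = 2                       # sample size

# Every possible sample of size n (sampling with replacement is assumed)
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

# The distribution of sample means: how often each mean occurs
distribution = Counter(sample_means)
mean_of_means = sum(sample_means) / len(sample_means)

# Probability question: how likely is a sample mean greater than 7?
p_above_7 = Fraction(sum(1 for m in sample_means if m > 7), len(sample_means))

print(sorted(distribution.items()))
print(mean_of_means, p_above_7)
```

The mean of all the sample means matches the population mean, and extreme sample means are rare, which is what lets us label some samples as "unlikely."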
86
We define “high” and “low” probability samples by selecting........ The probability value that is used to define the very UNLIKELY sample outcomes if the null hypothesis is TRUE. Commonly used ........ levels are .05, .01, .001. Defines the critical region: • Critical region: The extreme sample values that are very UNLIKELY to be obtained if the null hypothesis is true. Use the alpha level and the unit normal table to find the critical region. Example: α (alpha) = .05. When the null hypothesis is true, we have a 95% chance of not making a decision error by wrongly rejecting it. Before you run any statistical test, you must first determine your alpha level, which is also called the “significance level.” By definition, the alpha level (α) is the probability of rejecting the null hypothesis when the null hypothesis is true. For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference.
The alpha level (a)
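Instead of reading the unit normal table, the boundaries of the critical region can be sketched with `statistics.NormalDist`. This assumes a two-tailed test on a z-score; `two_tailed_critical_z` is a hypothetical helper name.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """z-score that cuts off alpha/2 in each tail of the unit normal."""
    return NormalDist().inv_cdf(1 - alpha / 2)

# The commonly used alpha levels from the card
for alpha in (0.05, 0.01, 0.001):
    z_crit = two_tailed_critical_z(alpha)
    print(f"alpha = {alpha}: reject the null if |z| > {z_crit:.2f}")
```

Smaller alpha levels push the boundaries farther out, so a sample has to be more extreme before the null hypothesis is rejected.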
87
AKA level of significance
Alpha Level
88
In inferential statistics, a general statement or default position that there is no relationship between two measured phenomena, or no association among groups. In a statistical test, the hypothesis that there is no significant difference between specified populations, any observed difference being due to sampling or experimental error.
Null hypothesis
89
With a sufficient SAMPLE from a population, and independent observations we can Test a
HYPOTHESIS
90
Can be distorted in two ways: 1. Restricted range 2. Outliers
Correlation
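A small sketch of the outlier problem: the same weakly related data produce a very different r once a single extreme point is added. All numbers are invented for illustration, and only the outlier case (not restricted range) is shown.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical scores with only a weak relationship
x = [1, 2, 3, 4, 5, 6]
y = [3, 1, 4, 2, 5, 3]

r_without = pearson_r(x, y)
# Add one extreme individual who is very high on both X and Y
r_with = pearson_r(x + [20], y + [20])

print(round(r_without, 2), round(r_with, 2))
```

This is one reason to look at the scatter plot before trusting the coefficient.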
91
In statistics, an effect size is a quantitative measure of the magnitude of a phenomenon. Examples of effect sizes: - The correlation between two variables - The regression coefficient in a regression - The mean difference - Even the risk with which something happens, such as how many people survive after a heart attack for every one person that does not survive. For most types of effect size, a larger absolute value always indicates a stronger effect, with the main exception being if the effect size is an odds ratio. Effect sizes complement statistical hypothesis testing, and play an important role in power analyses, sample size planning, and in meta-analyses. They are the first item (magnitude) in the MAGIC criteria for evaluating the strength of a statistical claim. Especially in meta-analysis, where the purpose is to combine multiple effect sizes, the standard error (S.E.) of the effect size is of critical importance. The S.E. of the effect size is used to weight effect sizes when combining studies, so that large studies are considered more important than small studies in the analysis. The S.E. of the effect size is calculated differently for each type of effect size, but generally only requires knowing the study's sample size (N), or the number of observations in each group (n's).
Effect Size
92
A type of inferential statistic used to determine if there is a significant difference between the MEANS of two groups, which may be RELATED in certain features. It is mostly used when the data sets, like the data set recorded as the outcome from flipping a coin 100 times, would follow a normal distribution and may have unknown variances. A t-test is used as a hypothesis testing tool, which allows testing of an assumption applicable to a population.
T-test
93
A t-test is a type of inferential statistic used to determine if there is a significant difference between the means of two groups, which may be related in certain features. The t-test is one of many tests used for the purpose of hypothesis testing in statistics. Calculating a t-test requires three key data values. 1. the difference between the mean values from each data set (called the mean difference), 2. the standard deviation of each group 3. the number of data values of each group. There are several different types of t-test that can be performed depending on the data and type of analysis required.
KEY TAKEAWAYS
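The three key data values can be combined as in the sketch below. It assumes an independent-samples t-test with pooled variance (one of the several types of t-test), and the recovery-time style scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """t statistic and degrees of freedom, pooled-variance version."""
    n_a, n_b = len(group_a), len(group_b)                 # 3. number of data values
    mean_diff = mean(group_a) - mean(group_b)             # 1. the mean difference
    pooled_var = ((n_a - 1) * variance(group_a) +         # 2. spread of each group
                  (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / standard_error, n_a + n_b - 2

# Hypothetical scores for two separate groups
group_a = [8, 9, 11, 12, 15]
group_b = [5, 7, 8, 9, 11]

t, df = independent_t(group_a, group_b)
print(round(t, 2), df)
```

The resulting t would then be compared with the critical value for the chosen alpha level and these degrees of freedom.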
94
Z-tests are appropriate for comparing means under stringent conditions regarding normality and a known standard deviation. A t-test is appropriate for comparing means under relaxed conditions (less is assumed).
Test statistics - two types of tests.
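A sketch of the z-test side of this comparison, for the case where the population standard deviation is known. The sample mean, population mean, and standard deviation are made-up numbers, and `z_test` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, pop_mean, pop_sd, n):
    """z statistic and two-tailed p-value when the population SD is known."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical: a sample of 36 with mean 104, population mean 100, known SD 15
z, p = z_test(sample_mean=104, pop_mean=100, pop_sd=15, n=36)
print(round(z, 2), round(p, 4))
```

When the population standard deviation is not known, it has to be estimated from the sample, which is where the t-test comes in.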
95
Why is probability relevant to inferential statistics? Statistics are, in one sense, all about probabilities. Inferential statistics deal with establishing whether differences or associations exist between sets of data. The data comes from the sample we use, and the sample is taken from a population. So we need to think about whether the sample represents the population from which it has been taken. The larger the sample we take, the greater the probability that it is representative of the population. If we took the whole population for our study, the probability would equal 1, since the sample would be the population. A sample smaller than the whole population means that we cannot guarantee that it is similar to the population. There is a probability that it is not. We want to keep this probability of sampling error as small as possible, so researchers often set a limit of probability (p) of a sampling error at no more than 0.05. Some studies might be more stringent and set the chance of a sampling error at 0.01. And in very important studies where you want to be reasonably certain there is little chance of error (say, testing new drugs), some researchers may even use a very small probability of error of 0.001, saying that the chance of an error is one in a thousand.
Probability and inferential statistics
96
Say we want to see if a group of patients who have been given a new drug have recovered more quickly than a group of patients who received the standard drug. We can use a statistical test to see if there is a difference. Whatever test we use, we need to remember that the data we are analysing comes from groups that originally started off as similar to one another. If this were not the case, we could not tell if the new drug had made the difference. So if we find a difference, it might be due to the trial, but there is a possibility that it is due to sampling error. Another way of thinking about sampling errors is that it is the error that gives rise to the difference between the sets of data. If the error were not present, then there would not be a difference. This type of sampling error (known as a type 1 error) says that a difference is found when no difference exists. It is one of the reasons why researchers publish the results of their research. This then enables other researchers to repeat the study to see if they find similar results. If the results were originally due to an error (which has a small chance of happening, i.e., less than 1 in 20, or 0.05), then repeating the study may not be able to reproduce the result.
Type one error – Probability Associated with inferential statistics
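The "1 in 20" idea can be illustrated with a small simulation: draw many samples from a population where the null hypothesis really is true and count how often a test still finds a "difference." This is a sketch under assumptions (a two-tailed z-test, a standard normal population, a fixed random seed), not a procedure from the course material.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def type1_rate(alpha=0.05, n=25, trials=2000, seed=0):
    """Share of true-null experiments that wrongly reject the null."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null is true here: every sample comes from a population
        # with mean 0 and standard deviation 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        z = mean(sample) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1   # a difference was "found" by sampling error alone
    return rejections / trials

rate = type1_rate()
print(rate)
```

The rejection rate should hover near the alpha level, which is why a single significant result is worth replicating.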
97
There is also the possibility of a second type of error, known as a type 2 error. Such possibilities have a probability of occurrence. They arise when it is reasonable to expect a difference and you find that the sampling has resulted in no difference being found. Think about a drug trial, and in this instance think about the possibility that people taking the new drug will each react differently to the drug. Not everybody will respond in exactly the same way to the drug. Some will show a big improvement and for some it will be very minor, if any, improvement. So there is a probability that the trial group is unrepresentative if the sample that forms this group includes folk who do not respond to the new drug. In the end, the type 2 error means we find no difference when one should be found. The probability of such an event can be determined. Researchers usually set the probability in this case at 0.2. That is, a one in five chance of a type 2 error.
Type II error - probability associated with inferential statistics
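As a sketch of how the probability of a type 2 error "can be determined," the snippet below uses the normal approximation for a one-tailed z-test. The effect size of 0.5 standard deviations and the sample size of 25 are assumptions picked for illustration, and `type2_probability` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

def type2_probability(effect_size, n, alpha=0.05):
    """Chance of missing a real effect (beta) in a one-tailed z-test.

    effect_size is the true mean difference in standard deviation units.
    """
    unit_normal = NormalDist()
    z_crit = unit_normal.inv_cdf(1 - alpha)
    # Under the real effect, the z statistic is centered on effect_size * sqrt(n)
    return unit_normal.cdf(z_crit - effect_size * sqrt(n))

beta = type2_probability(effect_size=0.5, n=25)
power = 1 - beta   # probability of detecting the effect
print(round(beta, 3), round(power, 3))
```

Under these assumed values beta comes out close to the conventional 0.2; a larger sample or a larger true effect would shrink it.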
98
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables (or 'predictors'). More specifically, regression analysis helps one understand how the typical value of the dependent variable (or 'criterion variable') changes when any one of the independent variables is varied, while the other independent variables are held fixed. Most commonly, regression analysis estimates the conditional expectation of the dependent variable given the independent variables – that is, the average value of the dependent variable when the independent variables are fixed. Less commonly, the focus is on a quantile, or other location parameter of the conditional distribution of the dependent variable given the independent variables. In all cases, a function of the independent variables called the regression function is to be estimated. In regression analysis, it is also of interest to characterize the variation of the dependent variable around the prediction of the regression function using a probability distribution. A related but distinct approach is Necessary Condition Analysis (NCA), which estimates the maximum (rather than average) value of the dependent variable for a given value of the independent variable (ceiling line rather than central line) in order to identify what value of the independent variable is necessary but not sufficient for a given value of the dependent variable. Regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning. Regression analysis is also used to understand which among the independent variables are related to the dependent variable, and to explore the forms of these relationships.
In restricted circumstances, regression analysis can be used to infer causal relationships between the independent and dependent variables. However this can lead to illusions or false relationships, so caution is advisable. Many techniques for carrying out regression analysis have been developed. Familiar methods such as linear regression and ordinary least squares regression are parametric, in that the regression function is defined in terms of a finite number of unknown parameters that are estimated from the data. Nonparametric regression refers to techniques that allow the regression function to lie in a specified set of functions, which may be infinite-dimensional.
Regression- statistics
99
Your caloric intake and your weight. Your eye color and your relatives’ eye colors. The amount of time you study and your GPA.
examples HIGH correlation
100
Your sexual preference and the type of cereal you eat. A dog’s name and the type of dog biscuit they prefer. The cost of a car wash and how long it takes to buy a soda inside the station.
examples LOW correlation (or none at all):
101
Characteristics: - Involves no manipulation or control - Requires two scores for each individual (X and Y) - Presented graphically in a scatter plot
Correlation
102
In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables (or 'predictors'). More specifically, regression analysis helps one understand how the typical value of the dependent variable (or 'criterion variable') changes when any one of the independent variables is varied, while the other independent variables are held fixed.
Regression