Exam 3 Flashcards

1
Q

Uses of statistics in research

A
  • Provide a description of the research sample (descriptive statistics)
  • Perform statistical tests of significance on research hypotheses (inferential statistics)
2
Q

Descriptive Statistics characterize

A
  • shape
  • central tendency (average)
  • variability
3
Q

When providing a description (picture) of the data set, what should you include?

A
  • frequency distributions
  • measures of central tendency
  • measures of variability
4
Q

What do illustrations of statistics allow?

A
  • comparison of the sample to other samples
5
Q

Frequency Distribution

A
  • table of rank-ordered scores
  • shows how many times each value occurred (frequency)

6
Q

Histogram

A
  • bar graph
  • composed of a series of columns
  • each representing a score or class interval
7
Q

Normal Distribution

A
  • bell-shaped
  • most scores fall in middle
  • fewer scores found at the extremes
  • symmetrical
  • mean, median, and mode represent the same value
  • important assumption for parametric statistics
8
Q

What is the predictable spread of scores in a normal distribution?

A
  • 68.26% of the population falls within 1 SD above and below the mean
  • 95.44% of the population falls within 2 SD above and below the mean

9
Q

Skewed data

A
  • asymmetrical
  • to right or left
  • the distribution of scores above and below the mean is not equivalent
  • there are specific stats appropriate to non-normal distributions
10
Q

Data that is skewed positively

A
  • skewed to the right (tail points to right)
  • most scores cluster at low end
  • few at high end
11
Q

Data that is skewed negatively

A
  • skewed to the left (tail points to left)
  • most scores at high end
  • few at low end
12
Q

Measures of Central Tendency

A
  • mode: the most frequent score
  • median: the value in the middle of the ranked scores
  • mean: the average; the sum of all scores divided by the number of scores (see the sketch below)
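A minimal Python sketch (not part of the original deck; NumPy is assumed available, and the scores are invented) computing the three measures of central tendency:

    import numpy as np
    from collections import Counter

    scores = np.array([2, 3, 3, 4, 5, 5, 5, 6, 7])

    mode = Counter(scores.tolist()).most_common(1)[0][0]  # most frequent score -> 5
    median = np.median(scores)                            # middle value of the ranked scores -> 5.0
    mean = scores.mean()                                  # sum divided by count -> ~4.44
    print(mode, median, mean)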
13
Q

Measures of Variability

A
  • dispersion/spread of scores
  • range
  • percentile
  • variance
  • standard deviation
  • coefficient of variation
14
Q

Range

A
  • difference between highest and lowest values in distribution
15
Q

Percentiles

A
  • percentage of a distribution that is below a specified value
16
Q

Variance

A
  • measure of variability in a distribution
  • equal to the square of the standard deviation

17
Q

Standard deviation

A
  • descriptive statistic reflecting the variability or dispersion of scores around the mean
  • square root of the variance
18
Q

Coefficient of Variation

A
  • measure of relative variation, expressed as a percentage
  • CV = (SD/mean) × 100 (see the sketch below)
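A minimal Python sketch (illustrative only; NumPy assumed, scores invented) computing the measures of variability from the last few cards, including the CV formula above:

    import numpy as np

    scores = np.array([60.0, 62.0, 65.0, 70.0, 71.0, 74.0, 80.0])

    score_range = scores.max() - scores.min()  # range: highest minus lowest value
    p50 = np.percentile(scores, 50)            # 50th percentile: half the scores fall below it
    variance = np.var(scores, ddof=1)          # sample variance (the square of the SD)
    sd = np.std(scores, ddof=1)                # standard deviation (square root of the variance)
    cv = (sd / scores.mean()) * 100            # coefficient of variation: (SD/mean)*100
    print(score_range, p50, variance, sd, cv)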

19
Q

Standardized scores

A
  • z-scores
  • express scores in terms of standard deviation units
  • z = (score − mean)/SD
  • e.g., if the mean is 30 and the SD is 2: a score of 32 gives z = +1, a score of 34 gives z = +2, and a score of 28 gives z = −1 (see the sketch below)
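A short Python sketch of the card's own worked example (mean = 30, SD = 2):

    def z_score(score, mean, sd):
        # z expresses a score in standard deviation units
        return (score - mean) / sd

    for s in (32, 34, 28):
        print(s, z_score(s, mean=30, sd=2))  # prints +1.0, +2.0, -1.0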
20
Q

Inferential statistics

A
  • decision-making process
  • estimate population characteristics from sample data

21
Q

Draw valid conclusions from research data

A
  • does the sample represent the population?
22
Q

Probability

A
  • likelihood that an event will occur, given all possible outcomes
  • p represents probability (e.g., p = 0.50 that a coin flip will be heads); the probability of falling within one standard deviation of the mean is 68.26%
  • p = 0.95 corresponds to z = 1.96 (within ±1.96 SD of the mean; see the sketch below)
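These probabilities can be checked against the standard normal CDF; a sketch with SciPy (assumed available):

    from scipy.stats import norm

    within_1_sd = norm.cdf(1) - norm.cdf(-1)      # ~0.6827, the 68.26% figure
    at_z_1_96 = norm.cdf(1.96) - norm.cdf(-1.96)  # ~0.9500, i.e. p = 0.95 at z = 1.96
    within_2_sd = norm.cdf(2) - norm.cdf(-2)      # ~0.9545, the 95.44% figure
    print(within_1_sd, at_z_1_96, within_2_sd)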
23
Q

Probability used in research

A
  • helps make decisions about how well sample data estimates characteristics of a population
  • did differences we see between treatment groups occur by chance or are we likely to see these in the larger population?
  • estimating what would happen to others based on what we observe in our sample
24
Q

Sampling Error

A
  • estimating population characteristics (parameters) from sample data
  • assumes that samples are random (i.e. individuals randomly drawn from the population), and that samples represent the population
  • e.g., 1,000,000 people over the age of 55 in the population, with a mean age of 67 years and an SD of 5.2 years
25
Q

Sampling error of the mean for a single sample

A
  • sample mean (X̄) minus population mean (μ)
  • if we drew many (infinite) samples, we would see varying degrees of sampling error

26
Q

Normal curve when plotting sample means

A
  • mean of all sample means will equal population mean
27
Q

Sampling distribution of means

A
  • the distribution of sample means, all plotted together

28
Q

Do you use the entire population when estimating sampling error?

A
  • no
  • in practice we select only a single sample and make inferences about the population from that sample

29
Q

Predictable properties of normal curve

A
  • the normal curve's predictable properties let us use the concept of the sampling distribution to draw inferences from sample data
30
Q

Standard error of the mean

A
  • the sampling distribution is a normal curve, so we can establish its variability
  • the standard deviation of the sampling distribution of means is called the standard error of the mean (SEM)
  • SEM = SD/√n, where n is the sample size (see the sketch below)
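A sketch tying the last few cards together (NumPy assumed; the population is synthetic, echoing the age example above): draw many samples and check that the SD of the sample means approaches SD/√n:

    import numpy as np

    rng = np.random.default_rng(0)
    population = rng.normal(loc=67.0, scale=5.2, size=1_000_000)  # invented ages

    n = 50
    sem_formula = population.std() / np.sqrt(n)  # SEM = SD / sqrt(n), ~0.74

    # sampling distribution of means: many samples, one mean each
    sample_means = [rng.choice(population, size=n).mean() for _ in range(2000)]
    print(sem_formula, np.std(sample_means))  # the two values should be close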
31
Q

Confidence Intervals

A
  • can use the sample mean as an estimate of the population mean
  • this is a point estimate
  • a single sample value will not be an exact estimate of the population mean
32
Q

Interval estimate of mean

A
  • an interval that contains the population mean
  • a range of values that contains the population mean (confidence interval, CI)

33
Q

Confidence Intervals

A
  • a CI is a range of scores
  • its boundaries (confidence limits) contain the population mean
  • the boundaries are based on the sample mean and SEM
  • commonly expressed as a 95% confidence interval
34
Q

for a 95% confidence interval, what is the z-score?

A
  • z = 1.96
  • we are 95% confident that the population mean falls within this range of values (see the sketch below)
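A sketch of a 95% CI from a single sample (invented values; the cards use z = 1.96, though with a sample this small a t critical value would be more usual):

    import numpy as np

    sample = np.array([64.0, 61.5, 70.2, 68.8, 66.1, 72.4, 63.3, 69.0, 65.7, 67.5])

    mean = sample.mean()
    sem = sample.std(ddof=1) / np.sqrt(len(sample))  # SEM = SD / sqrt(n)

    lower, upper = mean - 1.96 * sem, mean + 1.96 * sem  # confidence limits
    print(f"95% CI: ({lower:.2f}, {upper:.2f})")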

35
Q

Null Hypothesis

A
  • no differences
  • H₀: μA = μB

36
Q

Alternative hypothesis

A
  • difference
  • H₁: μA ≠ μB

37
Q

Research Hypothesis

A
  • statistical tests are based on the null hypothesis only: rejecting or failing to reject H₀
  • if p < 0.05, we reject the null hypothesis and accept the alternative hypothesis
  • if p > 0.05, we fail to reject the null hypothesis
38
Q

Non-directional hypothesis

A
  • does not specify which group will have the greater value
  • e.g., H₁: μA ≠ μB
39
Q

Directional hypothesis

A
  • specifies which group will be greater than the other
  • e.g., H₁: μA > μB

40
Q

Type 1 Error

A
  • alpha: 0.05
  • when you state there IS a statistical difference but there really ISN'T
  • we rejected the null hypothesis when we should have failed to reject it
41
Q

Type 2 Error

A
  • beta: 0.20
  • when you state there IS NOT a statistical difference but there ACTUALLY IS one
  • WORSE
  • we failed to reject the null hypothesis when we should have rejected it
42
Q

p-value amount

A
  • a less rigorous p-value (e.g., 0.05) INCREASES the chance of a type I error and REDUCES the chance of a type II error
  • a more rigorous p-value (e.g., 0.01) REDUCES the chance of a type I error and INCREASES the chance of a type II error
43
Q

Statistical Power

A
  • the power of a test is the probability that a statistical test will lead to rejection of the null hypothesis (the probability of attaining statistical significance, i.e., showing a difference between groups)
  • usually choose 80% power (at 80% power, the probability is 80% that the test will correctly show a statistical difference if an actual difference exists)
  • the statistical power of a test is the complement of β error: power = 1 − β
  • β is the probability of a type II error (usually 20%)
44
Q

Significance criterion (α = p-value)

A
  • if you choose p = 0.01, it is more difficult to show statistical differences between groups than if you choose p = 0.05
  • trade-off between type I and type II errors: the more you reduce the probability of a type I error, the greater your chance of a type II error
  • p = 0.05; power = 80%; β = 20% (probability of a type II error)
  • p = 0.01; power = 75%; β = 25% (probability of a type II error)
45
Q

Variance within a set of data

A
  • when variability within individual groups is large in their responses or performance on a test
  • the ability to detect differences between groups is reduced (i.e., power is reduced)
46
Q

Sample Size differences

A
  • larger the sample (the greater the power)
  • small samples are unlikely to accurately reflect population characteristics
  • therefore true differences between groups are unlikely to be manifested
47
Q

Effect Size (ES)

A
  • the magnitude of the observed difference between group means, or the magnitude of the relationship between 2 variables
  • e.g., if the difference between group means in MSL is 10 inches, the ES is 10
  • if we find a correlation of 0.67, the ES is 0.67
  • greater ES = greater power
48
Q

Why do power analysis?

A
  • to determine the sample size needed for a study
  • the researcher can specify, a priori, a level of significance and a desired power level
  • based on this, the researcher can estimate (using tables) how many subjects are needed to detect significance for an expected ES (see the sketch below)
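The deck describes using tables; the same a priori calculation can also be sketched with the statsmodels library (an assumption, not the course's tool), solving for per-group sample size from alpha, desired power, and an expected standardized effect size:

    from statsmodels.stats.power import TTestIndPower

    d = 0.5  # expected effect size (difference in means / pooled SD)
    n_per_group = TTestIndPower().solve_power(effect_size=d, alpha=0.05, power=0.80)
    print(round(n_per_group))  # ~64 subjects per group for an independent t-test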
49
Q

When results are not significant, you may want to determine the probability that a type II error was committed

A
  • this is a POST HOC analysis
  • knowing the observed ES, the level of significance used, and the sample size, the researcher can determine the degree of power achieved in the study
  • if power was low (high chance of a type II error): replicate the study with a larger sample to increase power
50
Q

Parametric vs. Non-parametric statistics

A
  • statistics used to estimate population characteristics (parameters) are called parametric statistics
  • use of parametric tests assumes that samples are randomly drawn from populations that are normally distributed
  • to test this assumption, test the data for normality
51
Q

Testing normality in SPSS by viewing the histogram

A
  • analyze; descriptive statistics; frequencies; charts; check histograms with normal curves; choose the appropriate variables
  • because the histogram is only a visual check, data that appear normally distributed should still be confirmed with the 1-sample K-S test
52
Q

Testing normality SPSS

A
  • view the Kolmogorov-Smirnov test (1-sample K-S); its significance or non-significance (2-tailed) is the p-value
  • go to analyze; non-parametric stats; legacy dialogs; 1-sample K-S; add the variables you want to test for normality
  • if the Asymp. Sig. (2-tailed) p-value is p > 0.05, the data are normal
  • if p < 0.05, the data are not normal (cannot use parametric stats); see the sketch below
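An analogous check in Python with SciPy (a sketch, not the SPSS output): like the legacy SPSS dialog, it compares the data with a normal curve whose mean and SD are estimated from the data itself, so the p-value is approximate (the Lilliefors correction addresses this):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    data = rng.normal(loc=67.0, scale=5.2, size=80)  # invented, normally distributed

    stat, p = stats.kstest(data, "norm", args=(data.mean(), data.std(ddof=1)))
    print(p > 0.05)  # True -> treat as normal; p < 0.05 -> not normal, use non-parametric stats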
53
Q

Problem for normality

A
  • small samples: one or a few outliers may skew the sample, i.e., it may not be normally distributed because of just 1 or a few outliers
  • a value with a very high or low z-score (e.g., above +3 or below −3) may be thrown out of the dataset as an outlier
54
Q

Other parametric assumptions

A
  • variances in the samples being compared are approximately equal (homogeneous)
  • data must be measured on interval or ratio scale
55
Q

Violating assumptions of parametric tests requires what?

A
  • non-parametric testing
56
Q

Nonparametric tests are not based on population assumptions

A
  • used when NORMALITY or HOMOGENEITY of VARIANCE assumptions are violated
  • used with VERY SMALL SAMPLES that are NOT NORMALLY DISTRIBUTED
  • used with DATA THAT ARE NOT CONTINUOUS (i.e. nominal or ordinal scales)
57
Q

examples of comparing 2 means only

A
  • males vs females
  • young vs old
  • fallers vs non-fallers
  • strength today versus in 7 days' time
  • sway with eyes open vs eyes closed
58
Q

Purpose of Parametric T-Test

A
  • to compare 2 means of independent samples or between 2 means obtained with repeated measures
59
Q

T-Test assumption

A
  • assumption of normality
60
Q

Are t-tests independent or paired samples?

A
  • T-tests can be either independent or paired samples
61
Q

Independent samples t-test

A
  • unpaired t-test
  • used when 2 independent groups are compared
  • each group is composed of independent sets of subjects (male vs female, fallers vs non-fallers, assistive device users vs. non-users)
62
Q

Independent T-test SPSS

A
  • analyze; compare means; independent-samples t-test
  • select your grouping variable (e.g., gender, grouped as male or female) and define the groups (1 = male, 2 = female)
  • the group coding is determined by how the data were coded in Variable View
  • the test variables are what you want to compare between the two groups; you can move over as many variables as you want
  • once you click OK, you are taken to the output
63
Q

Output of Independent T-tests

A
  • first look at Levene's test to determine whether there is equal or unequal variance for age between M and F
  • if Levene's test is significant, then equal variance is not assumed
  • if Levene's test is not significant, then equal variance is assumed
  • depending on this result, you use the top or bottom row of the independent samples test box in the output
64
Q

Independent t-test SPSS (age in males vs females)

A
  • first, check the equality of variances
  • in this case, equal variance for age in M vs F is not assumed, because Levene's test is significant (i.e., there is a significant difference in the variance of the variable age between males and females)
65
Q

Output 2nd

A
  • to determine whether there was an actual difference in mean age between the two groups
  • look at the Sig. (2-tailed) value in the appropriate row (see the sketch below)
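The same workflow sketched in Python with SciPy (an analogy to the SPSS steps, with invented ages): Levene's test first, then the t-test variant matching the variance result:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    age_m = rng.normal(66, 4, size=40)  # invented male ages
    age_f = rng.normal(68, 9, size=40)  # invented female ages, wider spread

    _, p_levene = stats.levene(age_m, age_f)
    equal_var = p_levene > 0.05  # not significant -> equal variance assumed

    # equal_var chooses the pooled ("top row") vs Welch ("bottom row") t-test
    t, p = stats.ttest_ind(age_m, age_f, equal_var=equal_var)
    print(f"Levene p={p_levene:.3f}, equal_var={equal_var}, t={t:.2f}, p={p:.3f}")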
66
Q

Paired samples t-tests

A
  • used when there is a definite relationship between each pair of data points
  • measurements are taken from the same subject (i.e., each subject's step length for trial one versus trial two)
67
Q

Paired T-test SPSS

A
  • analyze; compare means; paired samples t-test
  • select the two variables you want to compare
  • click OK
  • output: look in the table labeled "paired samples test"; do not use the p-value under the table labeled "paired samples correlations"; to determine whether there is a difference between the variables, look at the p-value under Sig. (2-tailed) (see the sketch below)
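A sketch of the paired comparison in Python with SciPy (invented step-length values, echoing the trial-one-vs-trial-two example):

    import numpy as np
    from scipy import stats

    trial1 = np.array([24.1, 25.3, 22.8, 26.0, 23.5, 24.9])  # step length, trial 1
    trial2 = np.array([25.0, 26.1, 23.2, 26.8, 24.1, 25.7])  # same subjects, trial 2

    t, p = stats.ttest_rel(trial1, trial2)  # paired-samples t-test
    print(f"t={t:.2f}, p={p:.4f}")  # compare p to 0.05, as with Sig. (2-tailed)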