Statistics Flashcards
Also called a categorical variable. Simple classification into mutually exclusive categories. We do not need to count to distinguish one item from another.
Nominal
The only scale of measurement that is discrete only.
Nominal
The only scale of measurement that is continuous only, or that has 0.5 as its smallest unit.
Ordinal
Cases are ranked or ordered. Represent position in a group where the order matters but not the difference between values.
Ordinal
Uses equal intervals of measurement, so the difference between two values is meaningful.
Interval
Similar to interval but includes a true zero point and relative proportions on the scale make sense.
Ratio
Which among the scales of measurement are parametric and non parametric?
Parametric: Interval and ratio
Nonparametric: Nominal and ordinal
What are the 4 scales of measurement?
Nominal
Ordinal
Interval
Ratio
Refers to the analysis of data of an entire population merely using numbers to describe a known data set.
Descriptive Statistics
The value in a group of values that is most typical for the group, or the score around which all the scores are evenly clustered. The average or midmost score.
Measures of Central Tendency
What are the measures of central tendency?
Mean
Median
Mode
The average or arithmetic mean. Sum of a set of measurements divided by the number of measurements in the set. Data is interval only.
Mean
Central value of a set of values such that half the observations fall above it and half below it. The middle score in the distribution. Uses ordinal and interval data.
Median
Modal value of a set. Most frequently occurring value. For grouped data, it is the midpoint of the class interval with the largest frequency. Uses nominal, ordinal, and interval data.
Mode
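Worked example: the three measures of central tendency can be checked with Python's standard `statistics` module. This is only a small sketch; the `scores` list is made-up data for illustration.

```python
import statistics

# Hypothetical set of scores (illustration only).
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)      # sum of scores / number of scores = 30 / 6
median = statistics.median(scores)  # even count, so the average of the two middle values (3 and 5)
mode = statistics.mode(scores)      # most frequently occurring value

print(mean, median, mode)
```

Here the mean (5), median (4), and mode (3) all differ, which is common when a distribution is skewed by a few high scores.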
Measures of how much or how little the rest of the values tend to vary around the central or typical value. Variation or error.
Measures of variability/Dispersion
What are the measures of variability/dispersion?
Standard deviation
Variance
Range
What level of data do all measures of variability/dispersion use?
Interval (some books include ratio)
Square root of variance. Shows the distribution of measurement.
Standard deviation
(Sd)²
Variance
Simplest measure of variation. Difference between the largest and smallest measurement.
Range
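Worked example: a short Python sketch of the three measures of variability. The data are made up, and the sketch assumes the population formulas (dividing by N); `statistics.variance` and `statistics.stdev` would give the sample versions (dividing by N - 1) instead.

```python
import statistics

# Hypothetical scores (illustration only); their mean is 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

variance = statistics.pvariance(scores)   # mean of squared deviations from the mean
sd = statistics.pstdev(scores)            # square root of the variance
value_range = max(scores) - min(scores)   # largest minus smallest measurement

print(variance, sd, value_range)
```

Squaring `sd` recovers `variance`, which is the relationship the Standard deviation and Variance cards describe.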
Used to describe the position of a particular observation in relation to the rest of the data set.
Measures of Location
In measures of location, the pth percentile of a data set is a value such that at least p percent of the observations take on this value or less and at least _ percent of the observations take on this value or more.
100-p
What are the measures of location?
Percentiles
Quartiles
Deciles
Frequency Distribution
Percentage of the total number of observations that are less than the given value. Identifies the point below which a specific percentage of the cases fall.
Percentiles
The data can be divided into 4 equal parts instead of two. This is the name for the cut points.
Quartiles
The data can be divided into 10 equal parts instead of two or four. This is the name for the cut points.
Deciles
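Worked example: quartiles, deciles, and percentiles are all cut points, differing only in how many parts the data are split into. The sketch below uses `statistics.quantiles` (Python 3.8+) on made-up data. The `method="inclusive"` setting is an assumption; textbooks and software use several slightly different conventions for interpolating cut points.

```python
import statistics

# Hypothetical data set: the integers 1 through 11 (illustration only).
data = list(range(1, 12))

# n parts are separated by n - 1 cut points.
quartiles = statistics.quantiles(data, n=4, method="inclusive")      # 3 cut points
deciles = statistics.quantiles(data, n=10, method="inclusive")       # 9 cut points
percentiles = statistics.quantiles(data, n=100, method="inclusive")  # 99 cut points

print(quartiles)        # [3.5, 6.0, 8.5]
print(deciles[4])       # 5th decile, the same point as the median
print(percentiles[49])  # 50th percentile, also the median
```

The second quartile, the fifth decile, and the 50th percentile all land on the median.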
A classification of data that may help in understanding important features of the data. It may be presented graphically in the form of a histogram, polygon, etc.
Frequency Distribution
This measure of location represents the same 2 elements:
Set of categories that make up the original measurement scale.
A record of the frequency, or number of individuals in each category.
Frequency Distribution
All measures of location use ordinal, interval, and ratio level of data except _ which uses all levels of data.
Frequency Distribution
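Worked example: a frequency distribution pairs each category with its count, so it works even for nominal data. A minimal sketch with `collections.Counter`; the `responses` list is hypothetical.

```python
from collections import Counter

# Hypothetical nominal-level responses (illustration only).
responses = ["agree", "neutral", "agree", "disagree", "agree", "neutral"]

# Element 1: the categories. Element 2: the frequency of each category.
frequency = Counter(responses)

# Relative frequency: each count as a proportion of all observations.
total = sum(frequency.values())
relative = {category: count / total for category, count in frequency.items()}

for category, count in frequency.most_common():
    print(category, count, relative[category])
```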
Measurement of the extent to which pairs of related values on 2 variables tend to change together; gives measure of the extent to which one variable can be predicted from values on the other variable.
Measures of Correlation
If one variable increases with the other, the correlation is positive (near _). If the relationship is inverse, it is a negative correlation (near _). A lack of correlation is signified by a value close to _.
+1
-1
0
What are the measures of correlation?
Pearson’s Product moment correlation
Spearman’s Rho Rank-order
Kendall’s Coefficient of Concordance W
Point-Biserial Coefficient rpb
Phi or Fourfold Coefficient
Lambda
A measure of correlation for 2 groups, using interval level of data. Data must be in the form of related pairs of scores. The higher the absolute value of r, the stronger the correlation.
Pearson’s Product Moment Correlation (r)
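Worked example: a hand-rolled sketch of Pearson's r as the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations. The function name `pearson_r` and the paired data are hypothetical, chosen only for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation for paired scores x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

# Hypothetical related pairs of scores.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r = pearson_r(hours, scores)               # positive, about 0.77
r_inverse = pearson_r(hours, [10, 8, 6, 4, 2])  # perfectly inverse relationship
print(r, r_inverse)
```

A perfectly inverse relationship gives r = -1, while a value near 0 would signal a lack of linear correlation.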
A measure of correlation for 2 groups, using the ordinal level of data. Data must be in the form of related pairs of scores and is used for ≤ 3. Easy to calculate but non parametric.
Spearman’s Rho Rank-order
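Worked example: when there are no tied ranks, Spearman's rho can be computed from the rank differences d as 1 - 6Σd² / (n(n² - 1)). The sketch below assumes the data are already ranks and contain no ties; `spearman_rho` and the two judges' rankings are hypothetical.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman's rho from two sets of ranks (assumes no tied ranks)."""
    n = len(rank_x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

# Hypothetical rankings of the same 5 contestants by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_a, judge_b)
print(rho)  # 0.8
```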
A measure of correlation for ≥ 3 groups, using the ordinal level of data. Data must be ≥ 3 sets of ranks. Easy to calculate but non parametric.
Kendall’s Coefficient Concordance W
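Worked example: a sketch of Kendall's W using the common no-ties formula W = 12S / (m²(n³ - n)), where m is the number of sets of ranks, n is the number of items ranked, and S is the sum of squared deviations of the rank sums from their mean. The function name and the rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance W (assumes no tied ranks).

    rankings: one list of ranks per rater, all ranking the same items.
    """
    m = len(rankings)       # number of raters (sets of ranks)
    n = len(rankings[0])    # number of items ranked
    rank_sums = [sum(rater[i] for rater in rankings) for i in range(n)]
    mean_sum = sum(rank_sums) / n
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical rankings of 4 items by 3 raters.
rankings = [
    [1, 2, 3, 4],
    [2, 1, 3, 4],
    [1, 3, 2, 4],
]

w = kendalls_w(rankings)
print(w)  # about 0.78; W = 1 would mean complete agreement
```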
A measure of correlation for 2 groups, using the continuous and dichotomous nominal level of data.
Point-Biserial Coefficient rpb
A measure of correlation for 2 groups, using 2 dichotomous nominal level of data.
Phi or Fourfold Coefficient
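Worked example: for a 2 x 2 (fourfold) table with cells a, b in the first row and c, d in the second, phi is commonly computed as (ad - bc) / sqrt((a + b)(c + d)(a + c)(b + d)). The cell counts below are made up, and `phi_coefficient` is a hypothetical helper name.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table laid out as [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Hypothetical counts: rows = passed / failed, columns = attended review / did not.
phi = phi_coefficient(a=20, b=10, c=5, d=15)
print(phi)  # about 0.41
```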
A measure of correlation for ≥ 2 groups, using nominal (dependent/independent) levels of data. It is also known as Guttman’s Coefficient of Predictability. Gives an indication of the reduction of errors made in a prediction scheme.
Lambda
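Worked example: lambda can be read as a proportional reduction in error, (E1 - E2) / E1, where E1 is the number of prediction errors made when always guessing the most common category of the dependent variable, and E2 is the number of errors made when guessing the most common category within each category of the independent variable. The sketch assumes rows are categories of the independent variable and columns are categories of the dependent variable; the table and the function name are hypothetical.

```python
def lambda_coefficient(table):
    """Lambda for predicting the column variable from the row variable."""
    total = sum(sum(row) for row in table)
    column_totals = [sum(column) for column in zip(*table)]
    # E1: errors when always predicting the overall modal category.
    errors_without = total - max(column_totals)
    # E2: errors when predicting the modal category within each row.
    errors_with = total - sum(max(row) for row in table)
    return (errors_without - errors_with) / errors_without

# Hypothetical cross-tabulation of two nominal variables.
table = [
    [30, 10],
    [10, 50],
]

lam = lambda_coefficient(table)
print(lam)  # 0.5, i.e. knowing the row category halves the prediction errors
```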
A non parametric measure of the agreement between two rankings.
Tau Coefficient
Tests for statistical dependence.
Kendall’s Tau Coefficient
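Worked example: Kendall's tau compares every pair of cases and counts how many pairs are ordered the same way in both rankings (concordant) versus opposite ways (discordant). The sketch below computes the simple tau-a variant, (C - D) / (n(n - 1) / 2), which assumes no ties; the function name and rankings are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(rank_x, rank_y):
    """Kendall's tau-a for two rankings of the same cases (assumes no ties)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(rank_x, rank_y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # pair ordered the same way in both rankings
        elif direction < 0:
            discordant += 1   # pair ordered in opposite ways
    n = len(rank_x)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical rankings of 5 cases by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

tau = kendall_tau_a(judge_a, judge_b)
print(tau)  # 0.6 (8 concordant pairs, 2 discordant pairs, 10 pairs in all)
```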
An index of interrater reliability of ordinal data.
Coefficient of Concordance (W)
Uses data from a sample to draw conclusions about the population from which the sample was taken, typically by testing hypotheses about population parameters.
Inferential statistics
What are the inferential statistics tests?
Z-test of one sample mean
T-test
Variations of the t-test:
Independent samples
Dependent samples
Proportions/Percentages
Variances
2 correlation coefficients
What level of data do all tests for inferential statistics use?
Interval
A measure for inferential statistics for 1 group.
Used when n ≥ 30 to test whether a population parameter is significantly different from some hypothesized value.
Z-test of one sample mean
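Worked example: a sketch of the one-sample z statistic, z = (sample mean - hypothesized mean) / (σ / √n), with a two-tailed p-value from the standard normal distribution. It assumes the population standard deviation σ is known; all numbers and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def one_sample_z(sample_mean, hypothesized_mean, population_sd, n):
    """Return the z statistic and its two-tailed p-value."""
    standard_error = population_sd / math.sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# Hypothetical study: n = 36, sample mean 52, hypothesized mean 50, sigma 6.
z, p = one_sample_z(sample_mean=52, hypothesized_mean=50, population_sd=6, n=36)
print(z, p)  # z = 2.0; p is a little under 0.05
```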
A measure for inferential statistics used when n < 30.
T-test
This kind of t-test is for 2 groups. It assesses whether the means of 2 groups are statistically different from each other.
Independent samples
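Worked example: a sketch of the independent-samples t statistic using a pooled variance, which assumes the two groups have equal population variances. The groups and the function name are hypothetical; a p-value would still have to be looked up in a t table with the returned degrees of freedom.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df

# Hypothetical scores for two separate groups of subjects.
t, df = independent_t([5, 7, 9], [1, 3, 5])
print(t, df)  # t is about 2.45 with 4 degrees of freedom
```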
This kind of t-test is for 1 group. It is used when the subjects making up the 2 samples are matched on some variable before being put in the 2 groups or the situation where the 2 groups are the same subjects administered a pretest and post test.
Dependent samples
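Worked example: for dependent (paired) samples, the t statistic is computed on the difference scores: t = mean(d) / (sd(d) / √n), with n - 1 degrees of freedom. The pretest and posttest scores and the function name below are hypothetical.

```python
import math
import statistics

def dependent_t(before, after):
    """Paired-samples t statistic and degrees of freedom."""
    differences = [post - pre for pre, post in zip(before, after)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    t = statistics.mean(differences) / standard_error
    return t, n - 1

# Hypothetical pretest and posttest scores for the same 3 subjects.
pretest = [10, 12, 14]
posttest = [11, 14, 17]

t, df = dependent_t(pretest, posttest)
print(t, df)  # t is about 3.46 with 2 degrees of freedom
```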
This kind of t-test is for 1 group. It is used to test the hypothesis that an observed proportion is equal to a pre-specified proportion.
Proportions/Percentages
This kind of t-test uses the F test for equal and unequal variances.
Variances
This kind of t-test is for 2 groups. It is used to assess the significance of the difference between two correlation coefficients found in 2 independent samples.
2 correlation coefficients
It is used for problems of predicting one variable from knowledge of another or possibly several other variables. It is always the regression of the predicted variable on the known variable.
Regression Equation
What are the regression equations?
Linear regression of y on x
Linear regression of x on y
Standard error of estimate (SEE)
Standard deviation of errors of prediction. An indication of the variability about the regression line in the population wherein predictions are being made.
Standard error of estimate (SEE)
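Worked example: a least-squares sketch of the linear regression of y on x, followed by the standard error of estimate. The SEE here uses the n - 2 denominator, which is one common convention and an assumption on my part; some texts divide by n instead. The data and function name are hypothetical.

```python
import math

def regress_y_on_x(x, y):
    """Return slope, intercept, and standard error of estimate for y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    intercept = mean_y - slope * mean_x
    # Errors of prediction: observed y minus predicted y.
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    see = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    return slope, intercept, see

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept, see = regress_y_on_x(x, y)
print(slope, intercept, see)  # predicted y = 2.2 + 0.6x, SEE about 0.89
```

Swapping the arguments gives the regression of x on y, which is generally a different line.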
Between ANOVA and repeated t-tests, which organizes and directs the analysis and makes the results easier to interpret?
ANOVA
Performing repeated t-tests increases the probability of _?
Type I error
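Worked example: a rough sketch of why repeated t-tests inflate the Type I error rate. If k tests are each run at level α and are treated as independent (a simplifying assumption, since pairwise tests on the same groups are not truly independent), the chance of at least one false rejection is 1 - (1 - α)^k. The function name is hypothetical.

```python
def familywise_error(alpha, number_of_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** number_of_tests

groups = 4
# Every pair of groups needs its own t-test: k(k - 1) / 2 comparisons.
comparisons = groups * (groups - 1) // 2

rate = familywise_error(alpha=0.05, number_of_tests=comparisons)
print(comparisons, rate)  # 6 tests push the rate to roughly 0.26
```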
ANOVA needs to be followed by what test?
Post hoc test
What does the post hoc test determine?
Which groups differ from each other.
We should not conduct a post hoc test unless the null hypothesis is _?
Rejected.