Unit 12: Analyzing Quantitative Data Flashcards
level of measurement
a way to classify quantitative measures
levels of measurement
1. nominal measurement involves using numbers simply to categorize attributes
> provides information only about categorical equivalence and nonequivalence
2. ordinal measurement ranks objects on their relative standing on an attribute
3. interval measurement occurs when the researcher can specify the ranking of objects on an attribute and the distance between those objects
4. ratio measurement is the highest level of measurement; ratio scales have a meaningful zero
> provides information about the absolute magnitude of the attribute
descriptive statistics
statistics used to describe and summarize data
parameter
a descriptive index calculated on data from a population
statistic
a descriptive index calculated on data from a sample
frequency distribution
a systematic arrangement of numeric values from lowest to highest, together with a count or percentage of the number of times each value was obtained
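A frequency distribution can be sketched in a few lines of Python. The scores below are made up for illustration, and `frequency_table` is a hypothetical name.

```python
from collections import Counter

# Hypothetical scores, invented for this example.
scores = [3, 1, 2, 3, 3, 2, 4, 1, 3, 2]

counts = Counter(scores)
n = len(scores)

# Arrange the values from lowest to highest, with a count and a percentage.
frequency_table = [
    (value, counts[value], 100 * counts[value] / n)
    for value in sorted(counts)
]

for value, count, percent in frequency_table:
    print(f"{value}: n={count} ({percent:.0f}%)")
```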
symmetric distribution
occurs when, if folded over, the two halves of the frequency polygon would be superimposed
skewed distribution
an asymmetric distribution in which the peak is off center and one tail is longer than the other
positive skew
the longer tail points to the right
negative skew
the longer tail points to the left
normal distribution
a bell-shaped distribution that is symmetric, unimodal, and not very peaked
central tendency
a statistical index of the typicality of a set of scores, derived from the center of the score distribution; indexes of central tendency include the mode, median, and mean
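The three indexes of central tendency can be computed with Python's standard `statistics` module. The scores are hypothetical. The single high score pulls the mean above the median and the mode, which is the pattern seen in a positively skewed distribution.

```python
import statistics

# Hypothetical scores with one high value in the right tail.
scores = [2, 3, 3, 4, 5, 7, 11]

mode = statistics.mode(scores)      # most frequent value
median = statistics.median(scores)  # middle value of the sorted scores
mean = statistics.mean(scores)      # sum of the scores divided by their number

print(mode, median, mean)
```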
variability
the degree to which values on a set of scores are dispersed
range
the highest score minus the lowest score in a distribution
standard deviation
the most widely used index of variability. it is calculated based on every value in a distribution and summarizes the average amount of deviation of values from the mean
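As a rough sketch, the range and the standard deviation can be computed by hand as follows. The scores are hypothetical, and the sample formula (dividing by n - 1) is an assumption, since the card does not say whether a sample or a population formula is meant.

```python
import math
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Standard deviation: based on every score's deviation from the mean.
mean = sum(scores) / len(scores)
sum_of_squared_deviations = sum((score - mean) ** 2 for score in scores)
sample_sd = math.sqrt(sum_of_squared_deviations / (len(scores) - 1))

# The standard library gives the same sample standard deviation.
library_sd = statistics.stdev(scores)

print(score_range, round(sample_sd, 3))
```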
bivariate descriptive statistics
statistics that describe the relationship between two variables
> contingency table: a two-dimensional frequency distribution in which the frequencies of two variables are cross-tabulated
> correlation: a relationship between two variables, which can be positive or negative
> correlation coefficients range from -1.00 through 0 for negative relationships and from 0 through +1.00 for positive relationships
cross tabulation
a determination of the number of cases occurring when two variables are considered simultaneously. the results are typically presented in a table with rows and columns divided according to the values of the variables
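A cross-tabulation can be sketched by counting how often each pair of values occurs. The cases and variable values below are invented for illustration.

```python
from collections import Counter

# Hypothetical cases: (sex, smoking status) for six people.
cases = [
    ("female", "nonsmoker"), ("female", "nonsmoker"), ("female", "smoker"),
    ("male", "nonsmoker"), ("male", "smoker"), ("male", "smoker"),
]

# Each cell of the contingency table is the count of one pair of values.
cell_counts = Counter(cases)
row_values = sorted({row for row, _ in cases})
column_values = sorted({column for _, column in cases})

# Print the table: rows are values of the first variable, columns of the second.
print("".ljust(10) + "".join(column.ljust(12) for column in column_values))
for row in row_values:
    cells = "".join(str(cell_counts[(row, column)]).ljust(12) for column in column_values)
    print(row.ljust(10) + cells)
```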
product moment correlation coefficient
Pearson's r, which is computed with interval or ratio measures
Spearman's rank-order correlation
r(s), or Spearman's rho, a correlation index for ordinal measures
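The two correlation indexes can be sketched as below. The data are hypothetical and the function names are illustrative. Spearman's rho is computed here as Pearson's r applied to ranks, with tied values sharing their average rank.

```python
import math

def pearson_r(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross_products = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross_products / math.sqrt(ss_x * ss_y)

def average_ranks(values):
    """Rank values from 1 (lowest); tied values share the average rank."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]

def spearman_rho(x, y):
    """Rank-order correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y always rises with x, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(round(r, 3), round(rho, 3))
```

Because the ranks of x and y agree perfectly, rho reaches its maximum, while r falls slightly short of 1.00 because the relationship is not linear.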
correlation matrix
a table in which variables are displayed in both rows and columns, with each cell showing the correlation between a pair of variables
descriptive statistics are useful for
summarizing data, although researchers usually do more than simply describe
inferential statistics
are based on the laws of probability and provide a means for drawing conclusions about a population, given data from a sample
sampling distribution of the mean
a theoretical rather than an actual distribution, because in practice no one draws consecutive samples from a population and plots their means
statisticians have demonstrated
sampling distributions of means follow a normal distribution
the mean of a sampling distribution for an infinite number of sample means equals the population mean
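Although no one does this in practice, a computer can draw many samples and collect their means. The simulation below is only a sketch: the population mean, standard deviation, sample size, and number of samples are all assumed values chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

POPULATION_MEAN = 50      # assumed for illustration
POPULATION_SD = 10        # assumed for illustration
SAMPLE_SIZE = 25
NUMBER_OF_SAMPLES = 2000  # a finite stand-in for "infinite"

sample_means = []
for _ in range(NUMBER_OF_SAMPLES):
    sample = [random.gauss(POPULATION_MEAN, POPULATION_SD) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.mean(sample))

# The mean of the sample means should land close to the population mean.
mean_of_means = statistics.mean(sample_means)
# The spread of the sample means is much narrower than the population SD.
sd_of_means = statistics.stdev(sample_means)

print(round(mean_of_means, 2), round(sd_of_means, 2))
```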
standard error of the mean (SEM)
the standard deviation of the sampling distribution of the mean
increasing the sample size increases the accuracy of the estimate
the more homogeneous the population, the more accurate the estimate
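Both points follow from the usual formula for estimating the SEM, which is the standard deviation divided by the square root of the sample size. The formula is not stated on the card, and the numbers below are hypothetical.

```python
import math
import statistics

def sem_from_sd(sd, n):
    """SEM estimated as the standard deviation divided by the square root of n."""
    return sd / math.sqrt(n)

def standard_error_of_mean(sample):
    """Same estimate, starting from the raw scores of one sample."""
    return sem_from_sd(statistics.stdev(sample), len(sample))

baseline = sem_from_sd(10, 25)          # SD = 10, n = 25
larger_sample = sem_from_sd(10, 100)    # same SD, four times the sample size
more_homogeneous = sem_from_sd(5, 25)   # same n, half the SD

print(baseline, larger_sample, more_homogeneous)  # prints 2.0 1.0 1.0
```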
statistical inference
consists of two major techniques: estimation of parameters and hypothesis testing
estimation procedures are used to estimate a single population characteristic, such as a mean value
statistical hypothesis testing provides objective criteria for deciding whether research hypotheses should be accepted as true or rejected as false
Type I and Type II errors
a Type I error is rejecting the null hypothesis when it is actually true (a false positive conclusion)
a Type II error is accepting a false null hypothesis (a false negative conclusion)
level of significance
the probability of making a Type I error (alpha)
beta
the probability of making a Type II error, which can be estimated through power analysis
test of statistical significance
for every test statistic there is a theoretical sampling distribution, analogous to the sampling distribution of means
hypothesis testing
uses the theoretical distribution to establish probable and improbable values for the test statistic
parametric and nonparametric tests
parametric tests have three attributes:
1. they focus on population parameters
2. they require measurements on at least an interval scale
3. they involve other assumptions, such as the assumption that the variables in the analysis are normally distributed in the population
> more powerful than nonparametric tests and usually preferred
nonparametric tests
do not estimate parameters and involve less restrictive assumptions about the shape of the distribution of the critical variables
hypothesis testing procedure
selecting an appropriate test statistic
selecting the level of significance
computing a test statistic
comparing the test statistic to a tabled value
degrees of freedom
refers to the number of observations free to vary about a parameter
bivariate statistical tests: t-tests
the t-test is the parametric procedure used to test the statistical significance of a difference between the means of two groups
the value of the t statistic is calculated based on group means, variability, and sample size
the tabled value establishes an upper limit to what is probable if the null hypothesis is true
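The following sketch shows how group means, variability, and sample size combine into a t statistic. It assumes two independent groups and a pooled-variance formula, neither of which the card specifies, and the scores are hypothetical. The tabled value of 2.306 is the standard two-tailed critical value for 8 degrees of freedom at the .05 level.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    mean_difference = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return mean_difference / standard_error, n_a + n_b - 2

# Hypothetical scores for two groups of five people each.
group_a = [5, 6, 7, 8, 9]
group_b = [2, 3, 4, 5, 6]

t_value, df = independent_t(group_a, group_b)

TABLED_T = 2.306  # two-tailed critical value for df = 8, alpha = .05
reject_null = abs(t_value) > TABLED_T

print(round(t_value, 3), df, reject_null)
```

Here the computed t exceeds the tabled value, so a difference this large would be improbable if the null hypothesis were true.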
ANOVA
analysis of variance is a parametric procedure used to test mean group differences of three or more groups
variation between groups is contrasted with variation within groups to yield a statistic called the F ratio
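A one-way F ratio can be sketched as the between-groups mean square divided by the within-groups mean square. The three groups below are hypothetical, and `f_ratio` is an illustrative name.

```python
import statistics

def f_ratio(groups):
    """One-way ANOVA F ratio: between-groups variation over within-groups variation."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of the scores around their own group mean.
    ss_within = sum(
        (score - statistics.mean(group)) ** 2 for group in groups for score in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value = f_ratio(groups)
print(round(f_value, 2))
```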
multiple comparison procedures
are needed to isolate the differences between group means that are responsible for rejecting the general ANOVA null hypothesis
repeated measures ANOVA
is used when the means being compared are means at different points in time
chi-squared test
is a nonparametric procedure used to test hypotheses about the proportion of cases that fall into various categories, as in a contingency table
the statistic is computed by summing, over all cells, the squared difference between the observed frequency and the expected frequency, divided by the expected frequency. the expected frequency is the frequency that would be expected if there were no relationship between the two variables
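As a minimal sketch, the chi-squared statistic for a contingency table can be computed as follows. The expected frequency for each cell is taken as row total times column total divided by the number of cases, and the observed counts are hypothetical.

```python
def chi_square(observed):
    """Chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, observed_count in enumerate(row):
            # Frequency expected if the two variables were unrelated.
            expected = row_totals[i] * column_totals[j] / n
            statistic += (observed_count - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

# Hypothetical 2 x 2 table of observed frequencies.
observed = [
    [30, 10],
    [20, 40],
]
chi_value, df = chi_square(observed)
print(round(chi_value, 2), df)
```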
multivariate statistics
refers to analyses dealing with at least three, but usually more, variables simultaneously
multiple regression
allows researchers to use more than one independent variable to explain or predict a dependent variable
independent variable
in multiple regression, are either interval-level or ratio-level variables or dichotomous nominal-level variables such as male/female
multiple correlation coefficient
R, the correlation index obtained when multiple independent variables are used to predict a dependent variable; the resulting statistic varies from .00 to 1.00
> one way of evaluating R is to test whether the overall relationship between the independent variables and the dependent variable is likely to be real or the result of chance fluctuations
> a second way of evaluating R is to determine whether the addition of new independent variables adds further predictive power
> the magnitude of R is also informative
when squared, it can be interpreted as the proportion of the variability in the dependent variable accounted for or explained by the independent variables
other multivariate techniques
discriminant function analysis is used to make predictions about membership in groups, that is, about a dependent variable measured on the nominal scale
logistic regression
analyzes the relationship between multiple independent variables and a nominal level dependent variable
odds ratio
OR, which is the factor by which the odds change for a unit change in the predictor
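For a single dichotomous predictor, the odds ratio can be sketched directly from a 2 x 2 table. The counts are hypothetical. With a predictor coded 0/1, the logistic regression coefficient for that predictor would equal the natural log of this odds ratio, so exponentiating the coefficient recovers the OR.

```python
import math

# Hypothetical counts: outcome present or absent, by predictor value.
exposed_with_outcome, exposed_without_outcome = 20, 80
unexposed_with_outcome, unexposed_without_outcome = 10, 90

# Odds of the outcome at each value of the predictor.
odds_exposed = exposed_with_outcome / exposed_without_outcome
odds_unexposed = unexposed_with_outcome / unexposed_without_outcome

# Factor by which the odds change when the predictor goes from 0 to 1.
odds_ratio = odds_exposed / odds_unexposed

# The matching logistic regression coefficient is the log of the odds ratio.
coefficient = math.log(odds_ratio)

print(round(odds_ratio, 2), round(coefficient, 3))
```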
factor analysis
is widely used by researchers who develop, refine, or validate complex instruments
multivariate analysis of variance
MANOVA is the extension of ANOVA to more than one dependent variable
causal modeling
involves the development and statistical testing of a hypothesized explanation of the causes of a phenomenon, usually with nonexperimental data
path analysis
is based on multiple regression and is a widely used approach to causal modeling
LISREL
linear structural relations analysis is a highly complex statistical technique whose utility relies on a sound underlying causal theory