Definitions Flashcards

1
Q

measurement

A
  • Assigning numbers or codes to aspects of objects or events according to rules
  • positioning observations along a numerical continuum
  • classifying observations into categories
2
Q

Observation

A

Unit upon which measurement is made

3
Q

Variable

A

Measurable characteristic that varies among persons, places, or objects

4
Q

Nominal measurements

A

Observation variable that has two or more categories with no intrinsic ordering to the categories. Nonparametric.

Examples: sex, blood type

aka. Categorical variable, attribute variable, qualitative variables

5
Q

Ordinal measurements

A

Observation variable that has categories that can be put into rank order. Differs from interval, b/c space b/w values is not equal. Non-parametric.

Examples: stage of cancer (on a point scale); economic status (low, medium, high)

6
Q

Quantitative measurements

A

Observation variables measured along a meaningful numeric scale.

  • Interval = equal spacing between values, but no absolute zero (e.g., Fahrenheit, Celsius)
  • Ratio = has an absolute zero, so values can be added and compared as ratios (e.g., age, body weight, Kelvin)

aka ratio/interval measurement, numeric variable, scale variable, continuous variable.

7
Q

Surveys

A

Type of study used to quantify population characteristics. Relies on the “sampling” rule of statistics b/c data for the entire population are rarely available.

8
Q

Simple Random Sample (SRS)

A

Randomly sample the population to collect data so that:

1) each population member has the same probability of being selected into the sample
2) the selection of any individual into the sample does not bias the selection of any other individual
aka sampling independence
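A minimal sketch of drawing an SRS with Python's standard random module; the population of ID numbers and the sample size below are invented for illustration.

```python
import random

# Hypothetical population: ID numbers 1..1000
population = list(range(1, 1001))

# random.sample draws without replacement, giving every member the same
# chance of selection and keeping selections independent of one another
# (sampling independence).
srs = random.sample(population, k=25)
print(sorted(srs))
```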

9
Q

Cautions

A

Samples that tend to over- or under-represent certain segments of the population, which can bias survey results.

10
Q

Undercoverage

A

Type of sample caution. Occurs when some groups in the source population are left out or underrepresented, undermining equal selection probabilities.

11
Q

Volunteer Bias

A

Type of sample caution. Occurs b/c self-selected participants of a survey are atypical of the population. Ex. web survey volunteers may hold a particular viewpoint that causes them to participate.

12
Q

Nonresponse Bias

A

Type of sample caution. Occurs when a large % of individuals refuse to participate in the survey; nonresponders differ from responders, which skews the results.

13
Q

Probability Sample

A

Each member of the population has a known probability of being selected. Includes SRS, stratified random samples, cluster samples, and multistage sampling.

14
Q

Stratified random sample

A

Draws independent SRSs from homogeneous “groups” or “strata.” Ex. divide the population into age groups.

15
Q

Cluster samples

A

Randomly selects large units (clusters) consisting of smaller subunits. Ex. select a list of household addresses, then study all individuals in each selected cluster.

16
Q

Comparative study

A

Learn the relationship b/w an explanatory variable and a response variable. Compare a group exposed vs. not exposed to the explanatory factor.

  • two types: Experimental and Non-Experimental (observational)
17
Q

Experimental studies

A

Investigator assigns exposure to one group and not the other

18
Q

Nonexperimental Studies

A

Investigator classifies groups as exposed or nonexposed w/o intervention. aka Observational studies

19
Q

Explanatory Variable (IV)

A

Treatment or exposure that explains or predicts change in the response variable.

aka. (IV) Independent variable

20
Q

Response Variable (DV)

A

Outcome or response being investigated.

aka. (DV)Dependent variable.

21
Q

Lurking variables

A

Extraneous factors not accounted for in the study that may influence the response variable

22
Q

Confounding Variables

A

Distortion of the association b/w the explanatory variable and the response variable caused by the influence of extraneous factors.

23
Q

Factors

A

Explanatory variables in experiments

24
Q

Treatment

A

Specific set of factors applied to subject

25
Q

Interaction

A

Factors in combination produce effects that could not be predicted by looking at the effect of the factors separately.

26
Q

Trials

A

Experiments involving human subjects. Two types: Controlled and Randomized Controlled

27
Q

Randomized control trial

A

Assigned treatment is based on chance. Helps sort out effect of treatment from those of lurking variables.

28
Q

Equipoise

A

Balanced doubt about benefits and risks

29
Q

Discrete variable

A

Finite number of values b/w any 2 points

30
Q

Continuous variable

A

infinite number of values b/w 2 points

31
Q

Shape (graph)

A

Configuration of data points as they appear on a graph. Described in terms of:

  • skewness: degree of asymmetry (departure from a mirror-image shape)
  • modality: number of peaks
  • kurtosis: “peakedness” of the distribution
32
Q

Location (graph)

A

Distribution summarized by its center (central tendency)

  • Mean: center of distribution; the “arithmetic avg.” is the distribution’s balancing point
  • Median
  • Mode
33
Q

Depth of data Point

A

Corresponds to its rank from either the top or the bottom of the ordered list of values.

34
Q

Spread (graph)

A

Refers to distribution/variability of data points.

Measures of Spread

  • Range
  • Quartiles
  • Stnd. Dev.
  • variance
35
Q

Class intervals

A

Group data into intervals with equal or unequal spacing before tallying frequencies.

Endpoint convention: ensures each observation falls in exactly one interval

    • include the left boundary and exclude the right, or
    • include the right boundary and exclude the left
36
Q

Relative Frequency

A

Proportion equation: frequency count / total count.

Expressed as a %

37
Q

Cumulative Frequency

A

Proportion that falls in or below a certain level.

Equation: running sum of the relative frequencies up to and including that level (previous cumulative freq. + current rel. freq.).

Expressed as a %
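A small sketch, using an invented table of category counts, showing how relative and cumulative frequencies follow from the two definitions above.

```python
# Hypothetical frequency counts for ordered categories (e.g., cancer stage I-IV)
freq = {"I": 10, "II": 25, "III": 40, "IV": 25}
total = sum(freq.values())

cumulative = 0.0
for level, count in freq.items():
    rel = count / total   # relative frequency = count / total
    cumulative += rel     # cumulative frequency = running sum of relative frequencies
    print(f"{level}: rel = {rel:.0%}, cum = {cumulative:.0%}")
```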

38
Q

Bar Chart

A

Displays frequencies with bars whose heights correspond to the frequencies.

Best for categorical variables

39
Q

Histogram

A

Bar chart of frequencies over class intervals (adjacent bars touch; frequencies are sometimes connected with a line).

Best for Quantitative variables

40
Q

Descriptive Statistics

A

Set of observations that describe the characteristics of a sample.

ex: Central tendency (mean, median, mode), Variability (St. Dev., variance, range, quartiles)

41
Q

Inferential Statistics

A

Set of statistical techniques that provide predictions about the population based on info in a sample from that population.

42
Q

Univariate Statistics

A

Involve one variable at a time (i.e. age, height, weight)

43
Q

Bivariate statistics

A

Involve two variables of the sample examined simultaneously (pre/post test)

44
Q

Multivariate Statistics

A

Involve 2 or more variables in the same analysis

45
Q

Stemplot

A

graphical technique that organizes data in a histogram-like display

46
Q

mean

A

Arithmetic average of data VALUES. Balancing point in a set. Highly susceptible to outliers and skew.

Formula:

  • Sample: x̄ = Σx / n; Population: µ = Σx / N

Functions: 1) predict individ. value drawn at random from sample, 2) predict value drawn at random from pop

* Best to pair with Stn. Dev for symmetrical distributions

47
Q

Median

A

Midpoint of a distribution in CASES. More ROBUST (resilient to outliers and skew).

Formula: put in order, calculate (n+1)/2, count places to midpoint.

* Best to pair with IQR for asymmetrical distributions. always Q2, 50th percentile
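A quick sketch with invented values contrasting the mean (sensitive to outliers) with the median (robust), using Python's statistics module.

```python
import statistics

data = [2, 3, 3, 4, 5, 6, 7]      # hypothetical values
with_outlier = data + [100]        # add one extreme value

print(statistics.mean(data), statistics.median(data))
print(statistics.mean(with_outlier), statistics.median(with_outlier))
# The mean jumps when the outlier is added; the median barely moves.
```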

48
Q

Mode

A

Most frequently occurring value in data set.

Useful only in large data sets with repeating values.

49
Q

Variability

A

Measure of spread. Of fundamental interest to behavioral scientists.

50
Q

Range

A

Measures the spread of a distribution; the simplest measure of variability.

Range = Maximum - Minimum

Limitations: known to be biased and highly unstable; increases w/ sample size. *Should always be supplemented with another measure of spread.

51
Q

Quartile

A

Intuitive way to describe variability by dividing the data set into 4 segments:

  • Q0 (min) = 0%
  • Q1 (lower hinge) = 25%
  • Q2 (median) = 50%
  • Q3 (upper hinge) = 75%
  • Q4 (max) = 100%

Find MEDIANs to identify the quartiles

52
Q

Hinges

A

Points at which the ordered array “folds” upon itself (lower hinge ≈ Q1, upper hinge ≈ Q3).

53
Q

Interquartile Range

A

Summary measure of spread that captures the middle 50% of data points in the set.

  • 5-point summary (Q0 - Q4)

IQR = Q3 - Q1 (where Q3 is the median b/w Q2 and Q4; Q1 is the median b/w Q0 and Q2; Q2 is the overall median)

Not sensitive to extreme values.

54
Q

Box-and-Whiskers plot

A

Displays five-point summaries and “potential outliers” in graphical form.

aka. box plot.
box: spans IQR

55
Q

Fences

A

Lower fence = Q1 - 1.5(IQR); Upper fence = Q3 + 1.5(IQR)

  • Values below the lower fence are “lower outside values”
  • Values above the upper fence are “upper outside values”
  • The smallest value inside the lower fence is the “lower inside value”
  • The largest value inside the upper fence is the “upper inside value”
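A sketch of the quartile, IQR, and fence arithmetic on an invented data set; statistics.quantiles uses one common quartile convention, so the exact cut points can differ slightly from hand-folded hinges.

```python
import statistics

data = [5, 7, 8, 9, 10, 11, 12, 13, 40]   # hypothetical values; 40 is a suspect outlier

q1, q2, q3 = statistics.quantiles(data, n=4)   # quartile estimates
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points beyond the fences are the "outside values" (potential outliers)
outside = [x for x in data if x < lower_fence or x > upper_fence]
print(f"Q1={q1}, median={q2}, Q3={q3}, IQR={iqr}")
print("outside values:", outside)
```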

56
Q

Variance

A

Common measure of spread.

Population: σ² = SS/N; Sample: s² = SS/(n-1)*

SS = Sum of Squared deviations

*Subtract 1 from n to force a larger variance and SD (makes it an unbiased estimate)
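A minimal sketch with invented numbers showing SS, the sample variance with the n-1 divisor, and the population variance with N; the statistics module's variance/pvariance match the two formulas.

```python
import statistics

data = [4, 8, 6, 5, 3, 7]                      # hypothetical values
n = len(data)
xbar = sum(data) / n

ss = sum((x - xbar) ** 2 for x in data)        # SS = sum of squared deviations
sample_var = ss / (n - 1)                      # s^2 = SS / (n - 1), the unbiased estimate
pop_var = ss / n                               # sigma^2 = SS / N

print(sample_var, statistics.variance(data))   # same value
print(pop_var, statistics.pvariance(data))     # same value
print(statistics.stdev(data))                  # SD = square root of the sample variance
```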

57
Q

Variability

A
  • Always present an average with its variability so as not to misrepresent the data.
  • 2 data sets can have the same average but different variability.
58
Q

Standard Deviation

A

Common measure of spread. Unbiased estimate for samples (good scientists are CONSERVATIVE!)

Formula: square root of the variance

  • Sensitive to outliers and skew
  • Useful for making comparisons
  • the smaller the SD, the more HOMOGENEOUS the set
59
Q

Chebychev’s Rule

A

For data sets: At least 3/4 of the data points lie within two std. devs. of the mean.

60
Q

Normal Rule

A

For data sets: applies only to distributions with a particular NORMAL shape.

  • 68.3% of data points lie within mean ± 1 std. dev.
  • 95.4% of data points lie within mean ± 2 std. devs.
  • 99.7% of data points lie within mean ± 3 std. devs.

aka. 68-95-99.7 rule

Properties of the Normal curve:

  • symmetrical
  • unimodal
  • bell-shaped
  • mean, median, and mode are equal
61
Q

Symmetrical vs. Asymmetrical Distribution

A

Symmetrical: Mean = Median

Asymmetrical: Mean ≠ Median

  • Positive skew: Mean > Median
  • Negative skew: Mean < Median
62
Q

Sum of Squares

A

Each data point’s deviation from the data set mean, squared, then all summed.

SS = Σ(Xi - X̄)²

Calculating formula: SS = ΣX² - (ΣX)²/n

1) Sum the data points, square that sum, then divide by n
2) Square each data point, then sum
3) SS = value of (2) - (1)

*Mathematically the same as the definitional formula; needed for SPSS.
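A sketch verifying on made-up numbers that the definitional and computing formulas for SS agree.

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical values
n = len(data)
xbar = sum(data) / n

# Definitional formula: SS = sum((Xi - Xbar)^2)
ss_definitional = sum((x - xbar) ** 2 for x in data)
# Computing formula: SS = sum(X^2) - (sum(X))^2 / n
ss_computing = sum(x ** 2 for x in data) - (sum(data) ** 2) / n

print(ss_definitional, ss_computing)   # identical (up to float rounding)
```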

63
Q

Probability

A

proportion of times an event is expected to occur.

Between 0 (never) and 1 (always)

Founded on relative frequencies.

64
Q

Probability: random variable

A

Numerical quantity that takes on different values depending on chance

65
Q

Probability: population

A

set of all possible outcomes for a random variable

66
Q

Probability: Event

A

An outcome or set of outcomes for a random variable

67
Q

Probability: Discrete random variables

A

Countable set of possible outcomes; fractional units are not possible. Ex. the # of leukemia cases in the US in 1995; the # of successes in n independent treatments.

68
Q

Probability: Continuous Random variable

A

outcome quantities with unbroken continuum of possible values. Ex. variable amount of time it takes to complete a task; average weight or height of a newborn.

69
Q

4 Properites of probability functions

A

1) Range of probabilities - individual probabilities are never less than 0 and never more than 1: 0 ≤ Pr(A) ≤ 1
2) Total probability - probabilities in the sample space must sum to 1: Pr(S) = 1
3) Complements - the probability of a complement equals 1 minus the probability of the event: Pr(Ā) = 1 - Pr(A)
4) Disjoint events - events are disjoint if they cannot occur concurrently: Pr(A or B) = Pr(A) + Pr(B)

70
Q

Z score

A

States the number of std. devs by which the original score lies above or below the mean of a normal curve.

Formula: z = (xi - x̄) / s

  • z distribution aka standard Normal curve: mean = 0, s = 1
  • Method to interpret a raw score; takes into account the mean value and variability of the set of raw scores.
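A minimal z-score sketch with an invented raw score, mean, and SD, following z = (x - x̄)/s.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical example: exam score of 82 in a class with mean 70, SD 8
print(z_score(82, 70, 8))   # 1.5 -> 1.5 SDs above the mean
```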

71
Q

Types of scores

A
  • Raw score (x): individual observed score on the measured variable
  • Deviation score (x - x̄)
  • Standard score (z)
72
Q

Normal Curve

A
  • Bell-shaped, symmetrical, unimodal
  • Same mean, median, and mode
  • Precise relationship b/w area under the curve and Std. Dev.
73
Q

Law of Probability

A

Statistical framework that allows researchers to determine how likely it is that research findings based on sample data are VALID. Proportion of times an event is expected to occur in the population. Prob. ranges from 0 to 1.

74
Q

Inference

A

Act of using data in a sample to make generalizations about its population.

Goals:

  • hypothesis testing
  • estimate value of population parameters
75
Q

Statistical Population

A

Entire collection of values that conclusions are drawn on.

76
Q

Hypothetical Population

A

Infinitely large population of potential values that could ensue following the study.

77
Q

Parameters vs. statistics

A

Parameter: numerical characteristic of a statistical population (population level)

Statistic: value calculated in a sample (sample level)

  • they use different symbols (e.g., µ, σ vs. x̄, s for the mean and SD)

Statistic –> statistical inference –> Parameter –> random selection –> Statistic

78
Q

Sampling distribution of a mean

A

The hypothetical distribution of means from all possible samples of size n taken from the same population.

Characteristics:

  • follows the central limit theorem
  • unbiased estimator of the population mean
  • sample means are less variable than individual observations (square root law)
79
Q

Central Limit Theorem

A

Sampling distribution of x̅ tends toward Normality even when the underlying population is not Normal.

Also: the distribution gets narrower as sample size increases.

80
Q

Standard error of the mean (SE)

A

Standard Deviation of x̅

Formula: SEx̄ = σ / √n

Law of large numbers: As an SRS gets larger and larger, its sample mean x̅ gets closer and closer to the true value of pop. mean.
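A short sketch of SE = σ/√n with invented numbers, illustrating the square root law: quadrupling n halves the SE.

```python
import math

def standard_error(sigma, n):
    # SE of the sample mean = sigma / sqrt(n)
    return sigma / math.sqrt(n)

print(standard_error(10, 25))    # 2.0
print(standard_error(10, 100))   # 1.0 -> 4x the sample size, half the SE
```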

81
Q

Null hypothesis

A

Statement of NO difference. H0: µ = “some number”

Reject H0 when H0 is true → Type I error (α); when H0 is false → correct decision

Fail to reject H0 when H0 is true → correct decision; when H0 is false → Type II error (β)

Alpha:

  • Probability of a Type I error
  • Chance you are willing to take of mistakenly rejecting a true null hypothesis

Beta:

  • Probability of a Type II error
  • Chance you are willing to take of mistakenly accepting a false null hypothesis
82
Q

Alternative hypothesis

A

Statement that claims a difference from the null hypothesis.

Ha: µ < or µ > some value –> one-sided test

Ha: µ ≠ some value –> two-sided test

83
Q

Zstat

A

Statistical distance of the sample mean x̄ from the hypothesized value of µ; this provides the weight of evidence for or against H0.

zstat = (x̄ - µ0) / SEx̄
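A sketch of a one-sample z statistic and its two-sided P-value with made-up inputs; the standard Normal CDF comes from statistics.NormalDist.

```python
from statistics import NormalDist

# Hypothetical inputs: sample mean, hypothesized mean, known sigma, sample size
xbar, mu0, sigma, n = 104.0, 100.0, 15.0, 36

se = sigma / n ** 0.5            # standard error of the mean
zstat = (xbar - mu0) / se        # z = (xbar - mu0) / SE
p_two_sided = 2 * (1 - NormalDist().cdf(abs(zstat)))

print(f"z = {zstat:.2f}, two-sided P = {p_two_sided:.4f}")
```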

84
Q

Point Estimation

A
  • Provides a single estimate of the parameter
  • No info regarding the probability of accuracy; best “guesstimate”
85
Q

Central Limit Theorem

A

If the population is not Normal, the distribution of sample means approaches a Normal distribution as the sample size gets larger.

86
Q

Hypothesis Testing Steps

A
  1. Define hypotheses: H0 and Ha.
  2. Test statistic: calculate SE and the z/t statistic.
  3. Determine P-value: convert the z/t statistic to a P-value.
  4. Decide: compare the P-value to the significance level (α). Statistically significant or not?
  5. State conclusion.
87
Q

Interval Estimation

A

Provides a range of values (CI) that seeks to capture the parameter

  • Confidence interval between two limit values.
88
Q

t-Test

A

Testing statistical hypotheses about µ when

1) σ is unknown
2) sample size is small (n < 30)

89
Q

Degrees of Freedom (df)

A

Value indicating the # of independent pieces of info a sample can provide for purposes of statistical inference.

90
Q

Determining CI for µ

A

x̅ ± t(1-α/2, df) × SE

The mean (or mean difference) should fall between the lower and upper bounds.

Ex. 90% CI –> α = .10 –> α/2 = .05 –> 1 - .05 = .95

Look up in the t table: df = n - 1 and cumulative probability .95
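A sketch of the CI calculation with an invented sample summary; it assumes SciPy is available, since stats.t.ppf supplies the t critical value instead of a table lookup.

```python
from math import sqrt
from scipy import stats   # assumes SciPy is installed

# Hypothetical sample summary: mean 12.3, SD 4.1, n = 20, 90% confidence
xbar, s, n, conf = 12.3, 4.1, 20, 0.90

alpha = 1 - conf                                 # 0.10
se = s / sqrt(n)
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)    # t for cumulative prob .95, df = 19

lower, upper = xbar - t_crit * se, xbar + t_crit * se
print(f"{conf:.0%} CI: ({lower:.2f}, {upper:.2f})")
```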

91
Q

Single Sample

A

Reflects the experience of a single group. NO control group, but results are compared to norms or expected values.

92
Q

Paired Sample

A

Uses data from two samples in which each data point in the first sample is matched to a data point in the 2nd sample.

Ex. Pre- and Post-sample from same subject

93
Q

Independent Samples t-Test

A

Use when comparing two samples in order to draw inferences about group differences in the population.

  • Two levels of a nominal-level variable; the dependent variable approximates interval-scale characteristics. e.g., DV = # of TV hrs; IV = male vs. female
  • assumption of equal variances
  • the St. Dev. of the sampling distribution of the difference is the standard error of the difference
94
Q

Independent Samples

A

Uses two samples from separate populations. Data points are unrelated.

Ex. Experimental study with treatment and control groups

95
Q

ANOVA

A

One-way analysis of variance

  • compares 3 or more groups defined by one factor.
  • variation in the response is analyzed to understand group differences; used in place of the independent t-Test.
  • Ho: µ1 = µ2 = … = µk

EX: patients assigned to three treatment groups and measured on a stress score (DV) in reaction to treatment (IV)

96
Q

Mean Square Between (MSB)

(ANOVA)

A

Quantifies the variance of the group means around the grand mean.

MSB = SSB / dfB

SSB = n1(x̄1 - grand x̄)² + n2(x̄2 - grand x̄)² + … –> (group mean - grand mean)² × group n, summed over groups

  • measures variability between the groups compared to the grand mean.
97
Q

Mean Square Within (MSW)

ANOVA

A

Quantifies the variability of the data points in each group around that group’s mean.

MSW = SSW / dfW

SSW = Σ(x - group x̄)², computed within each group, then summed across all groups

  • Measures variability within each data group.
98
Q

F-statistic

(ANOVA)

A
  • Ratio of MSB to MSW: Fstat = MSB / MSW (see the sketch below)
  • A large F-stat suggests the observed mean differences are NOT merely due to random noise.
  • When converting the F-stat to a P-value, use numerator df = dfB and denominator df = dfW
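A by-hand sketch of MSB, MSW, and F for three invented groups, following the SSB and SSW formulas on the two previous cards; the group scores are made up.

```python
# Three hypothetical treatment groups (e.g., stress scores)
groups = [
    [23, 25, 21, 24],
    [30, 29, 31, 28],
    [26, 27, 25, 26],
]

k = len(groups)
all_points = [x for g in groups for x in g]
N = len(all_points)
grand_mean = sum(all_points) / N

# SSB: n * (group mean - grand mean)^2, summed over groups
ssb = sum(len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups)
# SSW: (point - group mean)^2 within each group, then summed across groups
ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

msb = ssb / (k - 1)   # df between = k - 1
msw = ssw / (N - k)   # df within  = N - k
f_stat = msb / msw

print(f"MSB = {msb:.2f}, MSW = {msw:.2f}, F = {f_stat:.2f}")
```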
99
Q

Levene Test

A

Tests whether variances can be assumed equal. Use when comparing two or more groups (samples).

Ho: σ1² = σ2² = σ3²

Fail to reject the null when the p-value is greater than α.

100
Q

Correlation Coefficient (r)

A

Strength of a linear relationship.

-1 ≤ r ≤ 1

Strength:

  • Close to 1: all points fall near a line with an upward slope (close to -1: downward slope)
  • Close to 0: lack of linear correlation

Direction:

  • Upward slope = positive number
  • Downward slope = negative number

3 r’s:

  • metric…
101
Q

Coefficient of determination (r2)

A

Statistic that quantifies the proportion of variance in Y explained by X.

Expressed by converting r² to a %: x% of the variance of Y is explained by X

102
Q

Single Regression Line

A

Expresses the functional relationship b/w X and Y by fitting a line to the observed data.

  • Observed y = predicted y + residual
  • Residual = observed y - predicted y

Least Squares regression Line: drawn to minimize sum of squares

Formula: ŷ = a + bx; ŷ = predicted y, a = intercept of the regression line at the Y axis, b = slope coefficient (a worked sketch follows the notes below)

b = r(sy/sx)

a = ȳ - b(x̄)

Notes:

  • Not robust
  • b shows the relationship b/w X and Y in the same units as measured; r is a unit-free measure of strength
  • X must be IV; Y must be DV
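A sketch computing the least-squares slope and intercept from invented (x, y) data using b = r(sy/sx) and a = ȳ - b(x̄); r is computed from its definition with the statistics module.

```python
import statistics as st

# Hypothetical paired data (X = explanatory, Y = response)
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 2.9, 3.8, 5.2, 5.9, 7.1]

xbar, ybar = st.mean(x), st.mean(y)
sx, sy = st.stdev(x), st.stdev(y)

# Pearson r from the definition: sum of cross-deviations / ((n-1) * sx * sy)
r = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / ((len(x) - 1) * sx * sy)

b = r * (sy / sx)      # slope
a = ybar - b * xbar    # intercept: a = ybar - b * xbar

print(f"r = {r:.3f}, y-hat = {a:.3f} + {b:.3f}x")
```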
103
Q

Confidence Interval for Population Slope

A

Hypothesis:

  • Ho: β = 0
  • Ha: β ≠ 0

t-stat = b/ SEb

CI formula: b ± t(n-2, 1-α/2) × SEb

  • If “0” is captured in the CI for the population slope, the slope is NOT statistically significant.
104
Q

Multiple Regression

A

Addresses multiple explanatory variables (IVs) in relation to a response variable (DV).

IMPROVES prediction by using two or more variables to predict a dependent variable.

Formula: Y′ = a + b1X1 + b2X2 + …

105
Q

Kurtosis

A

Refers to the “peakedness” of a distribution.

  • Leptokurtic: narrow peak
  • Platykurtic: flat peak (plateau)
106
Q

Chi-Squared Test

A
  • Measure of association b/w 2 nominal variables
  • the magnitude of the Pearson chi-square reflects the amount of discrepancy between observed frequencies and expected frequencies.
  • makes no assumptions about the shape of the distribution or the homogeneity of variances.

Formula: χ² = Σ (Observed - Expected)² / Expected
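A by-hand sketch of the Pearson chi-square statistic for an invented 2×2 table; expected counts come from (row total × column total) / grand total.

```python
# Hypothetical 2x2 observed counts (rows = exposure, columns = outcome)
observed = [[20, 30],
            [40, 10]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (obs - expected) ** 2 / expected   # sum of (O - E)^2 / E

print(f"chi-square = {chi_sq:.2f}")
```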

107
Q

PARAMETRIC VERSUS NONPARAMETRIC STATISTICS

A
  • Use nonparametric stats when:
    • the parametric assumptions cannot be justified: normal distribution, equal variances, etc.
    • the data are measured on a nominal or ordinal scale
108
Q

Properties of Sampling distribution

A
  • mean of a sampling distribution of means will be the same as the mean of scores in the population (µ).
  • Central Limit Theorem
  • Allows us to determine the probability that the particular sample obtained will be unrepresentative.
109
Q

One -Sample Z test

A
  • Used to compare a sample mean to a (hypothesized) population mean and determine how likely (chance) it is that the sample came from that population.
  • Compare the probability associated with statistical results (i.e. probability of chance) with a predetermined alpha level.