Lecture 2: describing data with numbers Flashcards
1
Q
discrete
A
- levels of the variable can only be described as different types of categories
- measuring something in separate categories
2
Q
continuous
A
- levels of the variable can take on a range of values that are not restricted to a list of simple categories
- take on any value along a range
3
Q
nominal scale
A
- levels represent different categories or groups
- most basic
- discrete categories and count how many people fit into that category
- assign numbers to objects where different numbers indicate different objects
4
Q
ordinal scale
A
- minimal quantitative distinctions
- rank order in terms of some quantity
- there is now an ordering element of the different categories
- assign numbers to objects but now that number has meaning (ex 1st place and 2nd place in a race)
5
Q
interval scale
A
- quantitative properties
- intervals between levels are equal in size
- can be summarized using means
- no absolute zero
- tells you the size of the difference between 1st and 2nd place, whether it was 20 minutes or 4 seconds
- numbers have order, but there are also equal intervals between adjacent values (ex a one-inch difference in height is the same size anywhere on the scale)
6
Q
ratio scale
A
- detailed quantitative properties
- equal intervals
- absolute zero
- can be summarized using mean
- differences are meaningful plus ratios are meaningful and there is a true zero point (ex weight in pounds: 10lbs is twice 5lbs and zero pounds would mean no weight)
7
Q
frequency distributions
A
- the most basic form of data analysis
- can be in tabular or graphical format
- indicates how often different values are present in a data set
- think of the table of counts behind a histogram
8
Q
derive a frequency distribution (group frequency distribution)
A
- raw scores are transformed into a tally by counting the number of cases for each value
- instead of drawing a graph, think of tallying how many cases got each result rather than reporting percentages
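The tallying step can be sketched in a few lines of Python. The scores below are made up for illustration, and `Counter` is just one convenient way to count cases per value.

```python
from collections import Counter

# Hypothetical raw scores (invented for this example)
scores = [3, 5, 4, 3, 5, 5, 2, 4, 5, 3]

# Tally the number of cases for each value
frequency = Counter(scores)

# Print a simple frequency table, lowest value first
for value in sorted(frequency):
    print(value, frequency[value])
```

Each printed row pairs a value with the number of cases that had it, which is the tabular form of a frequency distribution.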
9
Q
central tendency of a distribution
A
- most representative score or value
- where the average is on a graph
- a location along the horizontal (score) axis of a graph, often near the peak
10
Q
dispersion
A
- extent of deviation from central tendency
- how spread out are the scores
- the width of the distribution along the horizontal axis of a graph
11
Q
Skewness
A
- asymmetry in distribution
- positive skew: long tail in positive direction
- negative skew: long tail in negative direction
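One common way to put a number on skew is the moment coefficient of skewness, which is positive for a long right tail and negative for a long left tail. This measure is not named in these notes, and the data sets below are invented, so treat this as a rough sketch.

```python
def skewness(values):
    """Moment coefficient of skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    m = sum(values) / n
    second_moment = sum((v - m) ** 2 for v in values) / n
    third_moment = sum((v - m) ** 3 for v in values) / n
    return third_moment / second_moment ** 1.5

# Hypothetical data: one large value creates a long tail in the positive direction
right_tailed = [1, 2, 2, 3, 3, 3, 4, 12]
# Mirror image: long tail in the negative direction
left_tailed = [-v for v in right_tailed]

print(skewness(right_tailed))  # positive
print(skewness(left_tailed))   # negative
```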
12
Q
Σ
A
- sigma
- sum of all values
13
Q
disadvantages of mean
A
- subject to distorting effects of outliers
14
Q
median
A
- point that divides the distribution into two equal parts
15
Q
mode
A
- the most frequently occurring score
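A small sketch comparing the mean, median, and mode on data with one outlier, using Python's standard `statistics` module. The income figures are hypothetical.

```python
import statistics

# Hypothetical incomes in thousands; the last value is an outlier
incomes = [30, 32, 35, 35, 38, 40, 400]

mean = statistics.mean(incomes)      # pulled upward by the outlier
median = statistics.median(incomes)  # middle score, not affected by the outlier
mode = statistics.mode(incomes)      # most frequently occurring score

print(mean, median, mode)
```

Here the median and mode both stay at 35 while the mean rises above every score except the outlier, which is the distorting effect described on the mean card.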
16
Q
what is variability
A
- how close to or far from the center of the distribution the scores are
17
Q
variance
A
- s^2 = Σ(X - M)^2 / (N - 1)
- N = number of values
- X = any given score in the distribution
- M = mean
- the sample variance formula uses N-1 in the denominator because sample information is being used to estimate a population characteristic
- the deviations (X - M) must be squared so that negative and positive deviations do not cancel out
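The sample variance formula can be worked through step by step as below. The scores are invented, and the variable names simply mirror the symbols on this card.

```python
import statistics

# Hypothetical scores
scores = [2, 4, 4, 6, 9]

N = len(scores)            # number of values
M = sum(scores) / N        # mean

deviations = [X - M for X in scores]
squared_deviations = [d ** 2 for d in deviations]

# Unsquared deviations always sum to zero, which is why they are squared first
print(sum(deviations))

# Sample variance: sum of squared deviations divided by N - 1
variance = sum(squared_deviations) / (N - 1)
print(variance)

# Cross-check against the standard library's sample variance
print(statistics.variance(scores))
```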
18
Q
standard deviation
A
- removes the squared components of variance by taking the square root of the variance
- indicates how far scores typically fall from the mean, in the original units of measurement
- uses every value, unlike the range, which uses only the two extremes
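A minimal sketch of the square-root step, assuming the sample (N - 1) version of the variance and a made-up set of scores.

```python
import math
import statistics

# Hypothetical scores
scores = [2, 4, 4, 6, 9]

variance = statistics.variance(scores)  # sample variance, in squared units
sd = math.sqrt(variance)                # back in the original units

print(sd)
print(statistics.stdev(scores))         # the standard library gives the same result
```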
19
Q
range
A
- the difference between the smallest (minimum) and the largest (maximum) values
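Because the range looks only at the two extreme scores, two data sets can share a range and still differ in spread. The hypothetical scores below illustrate this by comparing the range with the sample standard deviation.

```python
import statistics

# Two hypothetical data sets with the same minimum and maximum
clustered = [1, 5, 5, 5, 9]   # most scores sit at the center
spread_out = [1, 1, 5, 9, 9]  # most scores sit at the extremes

range_clustered = max(clustered) - min(clustered)
range_spread_out = max(spread_out) - min(spread_out)

sd_clustered = statistics.stdev(clustered)
sd_spread_out = statistics.stdev(spread_out)

print(range_clustered, range_spread_out)  # identical ranges
print(sd_clustered, sd_spread_out)        # different standard deviations
```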
20
Q
correlation
A
- correlation coefficient: a statistic that indicates the strength and direction of the relationship between two variables
- Pearson product-moment correlation coefficient (PPMCC) = “r”
- 0 = no linear correlation
- -1 = perfect negative correlation
- +1 = perfect positive correlation
- r indicates the degree of linear relationship between two variables; the data may have a strong nonlinear relationship that r does not reveal
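The sketch below computes r from its definition and shows the limitation noted on this card: a perfectly predictable but U-shaped relationship can still give r = 0. The helper name `pearson_r` and the data are illustrative choices, not part of the notes.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y scaled by their individual variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)

x = [-2, -1, 0, 1, 2]
linear = [2 * v + 1 for v in x]  # perfectly linear in x
curved = [v ** 2 for v in x]     # fully determined by x, but U-shaped

r_linear = pearson_r(x, linear)
r_curved = pearson_r(x, curved)

print(r_linear)  # perfect positive correlation
print(r_curved)  # no linear correlation despite a strong relationship
```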
21
Q
effect size
A
- a general term for the strength of relationship between variables; r is one indicator of effect size; when used this way r is usually squared to produce r squared
- r squared can be interpreted as the proportion of variability in one variable accounted for by variation in another variable
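As a quick illustration of the squaring step (the r values and the function name are hypothetical):

```python
def proportion_explained(r):
    """Square r to get the proportion of variability accounted for."""
    return r ** 2

# The sign of r drops out: the direction of the relationship does not change r squared
for r in (0.5, -0.5, 1.0):
    print(r, proportion_explained(r))
```

An r of 0.5 or -0.5 corresponds to 25% of the variability in one variable being accounted for by the other.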
22
Q
type I error
A
- a false positive
- the results are treated as real when they are just due to chance
- when the threshold is set to 5%, which is the convention, the researcher has a 5% chance or less of making a type I error if there is really no effect
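A rough simulation of the 5% figure, under assumptions that are not in the notes: scores are drawn from a normal population with mean 0 and standard deviation 1 (so there is truly no effect), and each simulated study uses a two-tailed z test with the standard deviation treated as known.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the run is repeatable

ALPHA = 0.05        # conventional threshold
N = 25              # hypothetical sample size per study
EXPERIMENTS = 4000  # number of simulated studies

# Two-tailed critical value for the chosen threshold
critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)

false_positives = 0
for _ in range(EXPERIMENTS):
    # No real effect: the population mean is 0
    sample = [random.gauss(0, 1) for _ in range(N)]
    z = (sum(sample) / N) / (1 / N ** 0.5)
    if abs(z) > critical_z:
        false_positives += 1  # "significant" result that is just chance

rate = false_positives / EXPERIMENTS
print(rate)  # should land close to ALPHA
```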
23
Q
type II error
A
- a missed opportunity
- more likely to occur when the significance threshold is set too low and/or when the sample is too small
- when the effect is real but the researcher dismisses the findings because they seem to be due to chance
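The effect of sample size on type II errors can be sketched with a simulation. The assumptions here are illustrative only: a true effect of 0.5 standard deviations, normally distributed scores, and a two-tailed z test at the 5% threshold.

```python
import random
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable

ALPHA = 0.05
TRUE_EFFECT = 0.5   # hypothetical real effect, in standard deviation units
EXPERIMENTS = 2000  # simulated studies per sample size

critical_z = NormalDist().inv_cdf(1 - ALPHA / 2)

def miss_rate(n):
    """Share of simulated studies that fail to detect the real effect."""
    misses = 0
    for _ in range(EXPERIMENTS):
        sample = [random.gauss(TRUE_EFFECT, 1) for _ in range(n)]
        z = (sum(sample) / n) / (1 / n ** 0.5)
        if abs(z) <= critical_z:
            misses += 1  # real effect dismissed as chance
    return misses / EXPERIMENTS

miss_small = miss_rate(10)
miss_large = miss_rate(50)

print(miss_small, miss_large)  # the small sample misses the effect far more often
```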