BioStats Module 1 Flashcards
What is the difference between descriptive and inferential statistics?
Descriptive - organizes, summarizes, and displays data so that patterns and trends can be seen
Inferential - tests a sample to see whether generalizations can be made about the population. You infer something about a population based on a sample
Independent Variable
predictor variable - Which variable are you saying will make a change? the variable that is manipulated by the investigator.
educational tool used on two groups of people to see if skills increase.
educational tool = independent variable
Dependent variable
Outcome variable - the variable that depends on something. What is going to change? It is affected by the independent variable and shows the outcome of the experiment.
2 groups, educational tool used to increase skill level
skill level is the dependent variable
Discrete Variables and discrete dichotomous
Can be counted like kids or heart rate
cannot be 0.5
Dichotomous - only two choices like sex
continuous variables
Can be measured, like height and weight; the measurement can always be made more precise
based on precision of measurement tool
What are the 4 levels of measurement
Nominal - qualitative
ordinal - qualitative
interval - quantitative
ratio - quantitative
Nominal variable data
Qualitative - used to label variables without providing any quantitative value. Cannot be ordered or measured in a meaningful way
named categories, yes or no, race
preferred mode of transportation (car, bus, train, bicycle, tram, etc.)
Ordinal data
Qualitative - categorical data with an innate or natural order, but the distances between categories are not known.
good, fair, poor
neither agree nor disagree, agree, strongly agree
Interval data
quantitative - data measured along a scale in which adjacent points are an equal distance apart; there is no true zero
temperature, pH, SAT score, credit score
Ratio data
quantitative - has a meaningful zero, so ratios (equal proportions) make sense. e.g. INCOME: you can make zero dollars, and someone making 100k makes twice as much as someone making 50k
uses same statistical tests as interval
Variables are not innately nominal, ordinal, interval or ratio. What does this mean?
it depends on how the data is measured in the study
Weight - could be interval or ratio level if the precise measurement is recorded
could also be ordinal if respondents check a box for a range of weights (e.g. 80-120 kg)
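A minimal sketch of how the same weight can end up at two levels of measurement. The cut points and category labels below are hypothetical; only the 80-120 kg range comes from the card.

```python
def weight_category(kg):
    """Collapse a precise weight into an ordered check-box category.

    The category boundaries are assumptions made for illustration.
    """
    if kg < 80:
        return "under 80 kg"
    if kg <= 120:
        return "80-120 kg"
    return "over 120 kg"

precise_weight = 95.3                         # recorded precisely: interval/ratio level
checked_box = weight_category(precise_weight)  # recorded as a range: ordinal level
print(precise_weight, "->", checked_box)
```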
Bar Chart
Most appropriate for variables measured at nominal and ordinal level that are discrete NOT continuous
bars are proportional to the number of cases
easy to read; there is SPACE between the bars
Histogram
similar to a bar chart, but the bars touch each other; used to show continuous interval or ratio data
organizes data into several intervals instead of giving each data point its own bar. Allows you to understand the general data distribution better
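A rough sketch of the interval idea behind a histogram: each value is assigned to an interval and the intervals are counted. The heights and the 10 cm interval width are hypothetical values chosen for illustration.

```python
from collections import Counter

# Hypothetical heights in cm
heights_cm = [152, 158, 161, 163, 165, 167, 168, 171, 174, 183]
bin_width = 10  # assumed interval width

# Map each value to the lower edge of its interval, then count per interval
bins = Counter((h // bin_width) * bin_width for h in heights_cm)

for start in sorted(bins):
    # Intervals are adjacent with no gaps, which is why histogram bars touch
    print(f"[{start}, {start + bin_width}): {'#' * bins[start]}")
```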
Stem and leaf plots
looks like a bar chart on its side; the column on the left is the stem, on the right is the leaf
1 | 9
2 | 2 5 6 7 8 9
3 | 0 4 6 7
4 | 2 3 4 6 8 8 9
5 | 2 3 4 4 5
6 | 2
shows individual data points and shows data distribution
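A small sketch of how such a plot could be built, assuming two-digit whole numbers where the tens digit is the stem and the ones digit is the leaf. The data list is read off the plot on this card; the function name is made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by tens digit (stem) and ones digit (leaf)."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(leaf)
    return dict(groups)

# Values read off the stem and leaf plot on this card
data = [19, 22, 25, 26, 27, 28, 29, 30, 34, 36, 37,
        42, 43, 44, 46, 48, 48, 49, 52, 53, 54, 54, 55, 62]

plot = stem_and_leaf(data)
for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
```

Every individual value can still be read back from the plot, which a histogram does not allow.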
Frequency Table
Graphing Frequency table
displays values, frequency, valid percent, and cumulative percent
Valid percent is important if you have missing data
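A minimal sketch of why valid percent differs from percent when data are missing. The responses are hypothetical, and basing cumulative percent on valid percent is an assumption about how the table is laid out.

```python
from collections import Counter

# Hypothetical survey responses; None marks a missing answer
responses = ["good", "fair", "good", None, "poor",
             "good", "fair", None, "good", "poor"]

valid = [r for r in responses if r is not None]
counts = Counter(valid)

total_n = len(responses)  # all cases, including missing
valid_n = len(valid)      # only cases with an answer

rows = []
cumulative = 0.0
for value in ["poor", "fair", "good"]:
    freq = counts[value]
    percent = 100 * freq / total_n        # denominator includes missing cases
    valid_percent = 100 * freq / valid_n  # denominator excludes missing cases
    cumulative += valid_percent           # assumed: running total of valid percent
    rows.append((value, freq, percent, valid_percent, cumulative))

for row in rows:
    print(row)
```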
Graphing Frequency distribution
terms for curves
Leptokurtic (thin)
Mesokurtic (normal)
Platykurtic (flat)
positive skew
Tail slopes to the right
negative skew
Tail slopes to the left
Boxplot (Box and whisker plot)
shows the median, the 25th and 75th percentiles, and any outliers (some software also marks the mean)
What is central tendency
describes where the typical value in a data set lies - mean, median or mode
Mean
average
cannot be done for nominal data MUST be interval or ratio level data
CANNOT be categorical data like some college
Median
ordinal, interval or ratio
middle score where half is smaller and half is larger
could be used for not good, good, very good because it is ORDERED
Mode
any type of data can be used
mode is the most frequently occurring data
1, 3,4,6,7,7
mode is 7 as it occurs more frequently than others
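A quick check of the three measures of central tendency on the numbers from this card, sketched with Python's standard `statistics` module.

```python
import statistics

scores = [1, 3, 4, 6, 7, 7]  # the data set from this card

mean = statistics.mean(scores)      # sum of values / number of values = 28 / 6
median = statistics.median(scores)  # middle of the sorted data; average of 4 and 6
mode = statistics.mode(scores)      # most frequently occurring value

print(f"mean={mean:.2f} median={median} mode={mode}")
```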
Dispersion
standard deviation and variance
range, interquartile range
Standard deviation small
values are clustered close to the mean (closer together)
variance
the square of the standard deviation (the standard deviation is the square root of the variance)
used with data sets that are not equal
40 RN3s and 125 RN1s
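A small sketch of how variance and standard deviation relate, using hypothetical values and the population formulas from Python's `statistics` module (`statistics.variance` and `statistics.stdev` would give the sample versions instead).

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

variance = statistics.pvariance(values)  # average squared distance from the mean
sd = statistics.pstdev(values)           # square root of the variance

print(f"variance={variance} sd={sd}")
print(math.isclose(sd ** 2, variance))   # SD squared gives back the variance
```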
Range
max - min = range
smaller = closer together larger = further spread apart
Interquartile range
75th percentile - 25th percentile; gives the spread of the middle 50% of the data
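A minimal sketch of range and interquartile range on hypothetical values. Quartiles can be computed by several slightly different methods, so the exact IQR may differ between software packages; `statistics.quantiles` with its default method is just one choice.

```python
import statistics

values = [1, 3, 4, 6, 7, 7, 9, 12]  # hypothetical data

data_range = max(values) - min(values)  # max - min

# n=4 splits the data into quarters and returns the three cut points
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1  # 75th percentile minus 25th percentile

print(f"range={data_range} Q1={q1} Q3={q3} IQR={iqr}")
```

Unlike the range, the IQR ignores the extreme values at either end, so a single outlier does not change it much.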