Exam 2 - Data Analysis & Statistics Flashcards
Define descriptive statistics
- Collection and presentation of data used to explain characteristics within a sample
- Describe, summarize, and synthesize collected data
Define inferential statistics
- Known as analytical statistics, as it is analyzing the data
- Make inferences or draw conclusions about a population based on a sample
- Used for population based conclusions
- Hypothesis tests evaluate the null hypothesis directly, not the research hypothesis
What are the four levels of measuring/categorizing variables?
1) Ratio (continuous)
- Continuum of numeric values with equal intervals between them and a meaningful zero point
- Ex: weight, height, number of credits completed
2) Interval (continuous)
- Continuum of numeric values with equal intervals between them, but without a meaningful zero point
- Ex: temperature in degrees Celsius or Fahrenheit
3) Ordinal
- More commonly seen
- Using numbers to rank order an attribute; intervals are not equal
- Ex: student's level of standing (freshman, sophomore, junior, senior), Likert scale (“1. Strongly Agree, 2. Agree, 3. Neither Agree nor Disagree, 4. Disagree, 5. Strongly Disagree”)
4) Nominal (categorical)
- Using numbers to categorize or label attributes into groups or categories
- Ex: gender; Male is coded as “1” and Female as “2”
What are measures of central tendency?
- Use one number to represent all the data you have
- Should be considered together with the distribution – arrange the scores of one variable from lowest to highest
- Researchers want to know the characteristics of the distribution: is it highly spread out, clustered around the center, what is the most common score, etc.
- We want to be able to describe the tendency of data to cluster around the middle of a data set
Mode
most frequently occurring number found in a data set
Median
represents the middle point of the data set; half the data is above, half is below
Mean
arithmetic average of the data set (sum of all values divided by the number of values)
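The three measures above can be computed with Python's standard `statistics` module. This is a minimal sketch; the scores are made up for illustration and do not come from any study.

```python
import statistics

# Hypothetical exam scores, made up for illustration only
scores = [70, 85, 85, 90, 100]

mode = statistics.mode(scores)      # most frequently occurring value
median = statistics.median(scores)  # middle value of the sorted data
mean = statistics.mean(scores)      # sum of values divided by the count

print(mode, median, mean)  # 85 85 86
```

Here the mean (86) sits slightly above the median (85) because the high score of 100 pulls the average up.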
Define homogeneous and heterogeneous data
- Measures of variability
- Homogeneous: When values are similar
- Heterogeneous: When the values vary widely
Standard deviation
- Most commonly reported measure of variability
- Standard deviation: average amount by which values in the distribution differ from the mean
Standard error
- Standard deviation of the sampling distribution
o Low SE – a repeated study would likely produce a similar estimate, so the study result should be close to the true value
o High SE – the estimate from the sample may not be close to the true value for the population of interest
- It is impractical to repeat studies over and over again, so instead a statistic is used to estimate the standard error, which indicates the reliability of the single study conducted
- Commonly reported as +/- values or shown as error bars on graphs
o If the error bars don’t overlap, the means may be different, but you can’t be sure without statistical testing
o If the error bars overlap, the difference between the 2 means is likely not significant
What is the difference between standard deviation and standard error?
- Standard deviation of the sample is the degree to which individuals within the sample differ from the sample mean
- Standard error of a sample is an estimate of how far the sample mean is likely to be from the population mean
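A short sketch of the difference, assuming the usual formula for the standard error of the mean, SE = SD / sqrt(n). The weights are made-up values for illustration only.

```python
import math
import statistics

# Hypothetical body weights in kg, made up for illustration only
weights = [60, 62, 64, 66, 68]

n = len(weights)
sample_mean = statistics.mean(weights)
sd = statistics.stdev(weights)  # sample SD: spread of individuals around the sample mean
se = sd / math.sqrt(n)          # SE: expected spread of the sample mean itself

print(f"mean = {sample_mean}, SD = {sd:.2f}, SE = {se:.2f}")
# mean = 64, SD = 3.16, SE = 1.41
```

The SD describes the individuals in the sample, while the SE shrinks as the sample size n grows because larger samples estimate the population mean more precisely.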
Null hypothesis
default position that there is no relationship between the 2 variables or groups being compared (no effect, no difference, no association)
* that any differences between the 2 values are due to random chance
* if the null hypothesis is rejected by statistical tests = researchers state that there is a significant difference between the 2
P-value
asks the question “how likely are we to observe a difference as large as this in the absence of any intervention effect?”
* the answer is the probability = p-value
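One way to make this question concrete is an exact permutation test: if there were no intervention effect, the group labels would be arbitrary, so we can count how often a random relabelling gives a difference at least as large as the one observed. This is only a sketch of that idea, not the test used in any particular study, and the scores are made up for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical outcome scores for two small groups, made up for illustration
treatment = [12, 14, 15, 17]
control = [9, 10, 11, 13]

observed = abs(mean(treatment) - mean(control))
pooled = treatment + control
n = len(treatment)

# Enumerate every way of labelling n of the pooled values as "treatment".
# Under the null hypothesis, each labelling is equally likely.
count = 0  # labellings with a difference at least as large as observed
total = 0  # all possible labellings
for idx in combinations(range(len(pooled)), n):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    if abs(mean(group_a) - mean(group_b)) >= observed:
        count += 1
    total += 1

p_value = count / total  # two-sided p-value
print(f"{count} of {total} labellings, p = {p_value:.3f}")
```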
Hypothesis testing and p-values should address what 3 central questions in nutrition research?
Whether or not there are:
* An effect of one variable on another
* A difference between two intervention groups
* An association between 2 variables
How do you interpret the meaning of a p-value?
P < 0.05 means that, if the null hypothesis were true, there would be less than a 5% chance of observing a difference at least this large = significant, reject the null hypothesis
Ex: research hypothesis – supplementation with omega-3 fatty acids will improve cancer treatment side effects among cancer patients
* Null hypothesis – there would be no difference in mean side effects between patients consuming omega-3 fatty acids and patients not consuming them
* Level of significance set at 0.05
* P-value of the t-test was 0.03
* Conclusion: 0.03 < 0.05, so the result is significant; reject the null hypothesis – there is a significant difference
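The decision rule in this example can be sketched as a small function. The name `decide` is a hypothetical helper introduced here for illustration, not part of any statistics library.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the level of significance (alpha)."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# Values from the omega-3 example: alpha = 0.05, p = 0.03
decision = decide(0.03, alpha=0.05)
print(decision)  # reject the null hypothesis
```

Note that a p-value at or above alpha means we fail to reject the null hypothesis, which is not the same as proving that it is true.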