Exam 2 - Data Analysis & Statistics Flashcards

Question 1

Q

Define descriptive statistics

Answer

A

Collection and presentation of data used to explain characteristics within a sample
Describe, summarize, and synthesize collected data

Question 2

Q

Define inferential statistics

Answer

A

Also known as analytical statistics, because it analyzes the data
Make inferences or draw conclusions about a population based on a sample
Used for population based conclusions
Statistical tests evaluate only the null hypothesis, not the research hypothesis directly

Question 3

Q

What are the four levels of measuring/categorizing variables?

Answer

A

1) Ratio (continuous)
- Continuum of numeric values with equal intervals between them and a meaningful zero point
- Ex: weight, height, number of credits completed

2) Interval (continuous)
- Continuum of numeric values with equal intervals between them but no meaningful zero point
- Ex: temperature in degrees Celsius or Fahrenheit

3) Ordinal
- More commonly seen
- Using numbers to rank-order an attribute; intervals between ranks are not equal
- Ex: student's level of standing (freshman, sophomore, junior, senior), Likert scale (“1. Strongly Agree, 2. Agree, 3. Neither Agree nor Disagree, 4. Disagree, 5. Strongly Disagree”)

4) Nominal (categorical)
- Using numbers to categorize or label attributes into groups or categories; the numbers have no order or magnitude
- Ex: gender; Male is coded as “1” and Female as “2”

Question 4

Q

What are measures of central tendency?

Answer

A

Use one number to represent all the data you have
Should be considered together with the distribution: arrange the scores of one variable from lowest to highest
Researchers want to know the characteristics of the distribution: is it highly spread out, clustered in the center, what is the most common score, etc.
We want to be able to describe the tendency of data to cluster around the middle of a data set

Question 5

Q

Mode

Answer

A

most frequently occurring number found in a data set

Question 6

Q

Median

Answer

A

represents the middle point of the data set; half the data is above, half is below

Question 7

Q

Mean

Answer

A

arithmetic average of the data set: the sum of all values divided by the number of values
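
A minimal Python sketch of the three measures of central tendency (mode, median, mean), using the standard library's statistics module. The scores are hypothetical illustration data, not values from the course material.

```python
import statistics

# Hypothetical scores for one variable, arranged from lowest to highest
scores = [2, 3, 3, 5, 7, 10]

mode = statistics.mode(scores)      # most frequently occurring value
median = statistics.median(scores)  # middle point; averages the two middle values when n is even
mean = statistics.mean(scores)      # sum of all values divided by the number of values

print("mode:", mode)
print("median:", median)
print("mean:", mean)
```

Note that the three measures differ here because the data are skewed toward higher scores; in a symmetric distribution they would sit close together.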

Question 8

Q

Define homogeneous and heterogeneous data

Answer

A

Measures of variability
Homogeneous: When values are similar
Heterogeneous: When the values vary widely

Question 9

Q

Standard deviation

Answer

A

most commonly reported measure of variability
Standard deviation: roughly the average distance of individual values from the mean of the distribution
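
As a rough sketch, the sample standard deviation can be computed step by step in Python. The two data sets are hypothetical: one homogeneous (similar values) and one heterogeneous (widely varying values). The helper name sample_sd is an illustrative assumption.

```python
import math
import statistics

def sample_sd(values):
    """Sample standard deviation: square root of the summed squared
    deviations from the mean, divided by n - 1."""
    mean = statistics.mean(values)
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (len(values) - 1))

# Hypothetical data sets with the same mean (50) but different spread
homogeneous = [49, 50, 50, 51, 50]
heterogeneous = [10, 30, 50, 70, 90]

sd_homogeneous = sample_sd(homogeneous)
sd_heterogeneous = sample_sd(heterogeneous)

print("homogeneous SD:", round(sd_homogeneous, 2))
print("heterogeneous SD:", round(sd_heterogeneous, 2))
```

Both data sets share the same mean, so the standard deviation is what distinguishes them.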

Question 10

Q

Standard error

Answer

A

Standard deviation of the sampling distribution
o Low SE – any repeated study would produce a similar estimate, so study results should be close to the true value
o High SE – the sample used for calculations may not be that close to the population of interest
It is impractical to repeat a study over and over, so the standard error is estimated from the single study conducted and indicates how reliable that study's estimate is
Commonly reported as +/- values or as error bars on graphs
o If the error bars don't overlap, the means may be different, but you can't be sure without statistical testing
o If the error bars overlap, the difference between the 2 means is probably not significant
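
A minimal sketch of how the standard error of the mean is usually estimated from a single sample: the sample standard deviation divided by the square root of the sample size. The sample values and the helper name standard_error are hypothetical.

```python
import math
import statistics

def standard_error(sample):
    """Estimated standard error of the mean: sample SD / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Hypothetical measurements from one study
sample = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(sample)
se = standard_error(sample)

# Ends of a "mean +/- 1 SE" error bar, as often drawn on graphs
lower, upper = mean - se, mean + se

print("mean:", mean)
print("SE:", round(se, 3))
print("error bar:", round(lower, 3), "to", round(upper, 3))
```

Because n is in the denominator, a larger sample with the same spread gives a smaller standard error.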

Question 11

Q

What is the difference between standard deviation and standard error?

Answer

A

Standard deviation of the sample is the degree to which individuals within the sample differ from the sample mean
Standard error of a sample estimates how far the sample mean is likely to be from the population mean

Question 12

Q

Null hypothesis

Answer

A

default position that there is no relationship between the 2 variables or groups being compared (no effect, no difference, no association)
* that any difference between the 2 values is due to random chance
* if the null hypothesis is rejected by a statistical test, researchers state that there is a significant difference between the 2

Question 13

Q

P-value

Answer

A

asks the question “how likely are we to observe a difference as large as this in the absence of any intervention effect?”
* the answer is the probability = p-value
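
One way to make this question concrete is an exact permutation test, sketched below. This is an illustration of the idea behind a p-value, not the specific test used in the course. The two groups are hypothetical data. If the intervention had no effect, the group labels would be arbitrary, so the sketch counts how often a relabeling produces a difference in means at least as large as the one observed.

```python
from itertools import combinations
from statistics import mean

# Hypothetical outcome scores for two small groups
control = [4, 5, 6, 5]
treated = [7, 8, 6, 9]

observed = abs(mean(treated) - mean(control))

pooled = control + treated
group_size = len(treated)
as_extreme = 0
total = 0

# Enumerate every way to split the pooled data into two groups of this size
for indices in combinations(range(len(pooled)), group_size):
    group_a = [pooled[i] for i in indices]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in indices]
    difference = abs(mean(group_a) - mean(group_b))
    if difference >= observed:
        as_extreme += 1
    total += 1

# Proportion of relabelings with a difference at least as large as observed
p_value = as_extreme / total
print("p-value:", round(p_value, 4))
```

Exhaustive enumeration is only practical for very small samples; larger studies rely on a test statistic and its known distribution instead.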

Question 14

Q

Hypothesis testing and p-values should address what 3 central questions in nutrition research?

Answer

A

Whether or not there are:
* An effect of one variable on another
* A difference between two intervention groups
* An association between 2 variables

Question 15

Q

How do you interpret the meaning of a p-value?

Answer

A

P < 0.05 means that, if the null hypothesis were true, a difference at least this large would be expected less than 5% of the time by random chance alone = significant, reject the null hypothesis

Ex: research hypothesis – supplementation with omega-3 fatty acids will improve cancer treatment side effects among cancer patients
* Null hypothesis – there would be no difference in mean side effects between patients consuming omega-3 fatty acids and those not consuming them
* Level of significance set at 0.05
* P-value of the t-test was 0.03
* Conclusion: significant, reject the null hypothesis – there is a significant difference
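
The decision rule in this example can be sketched in a few lines of Python. The function name decide is a hypothetical helper; the p-value of 0.03 and the significance level of 0.05 come from the example above.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the chosen level of significance (alpha)."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# Values from the omega-3 example: alpha = 0.05, t-test p-value = 0.03
decision = decide(0.03, alpha=0.05)
print(decision)
```

A p-value at or above the significance level leads to "fail to reject" rather than "accept": the test does not prove the null hypothesis true.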

Question 16

Q

Define parametric tests and what they are used for

Answer

A

Used when specific conditions have been met:
* Use of probability sampling
* Normal distribution of data
o If the researcher doesn't know this, other statistical tests can be run to check whether parametric tests can be used
* Measurement of variables at the interval or ratio level
* Reduction of error

Question 17

Q

Define non-parametric tests and what they are used for

Answer

A

Utilized when the 4 conditions are not met for the parametric tests
Considered less powerful than parametric tests
Used for interval data that do not have a normal distribution or for data that are nominal/ordinal in nature (Use chi-square statistic test)
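
As an illustration of the chi-square test mentioned above, the sketch below computes the chi-square statistic for a table of counts on two nominal variables. The counts and the function name chi_square_statistic are hypothetical. Turning the statistic into a p-value requires the chi-square distribution, which is not in Python's standard library, so that step is left out.

```python
def chi_square_statistic(table):
    """Chi-square statistic for a contingency table of observed counts:
    sum over all cells of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated
            expected = row_totals[i] * column_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected
    return chi_square

# Hypothetical 2 x 2 table: rows = group, columns = outcome (yes / no)
observed_counts = [[10, 20],
                   [20, 10]]

chi_square = chi_square_statistic(observed_counts)
print("chi-square:", round(chi_square, 3))
```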

Question 18

Q

What is ANOVA and when is it used?

Answer

A

Analysis of variance, used to evaluate the mean differences between 2+ groups
Data must be interval or ratio
Tests whether differences exist between means
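
A rough sketch of the calculation behind a one-way ANOVA: the F statistic is the ratio of between-group variability to within-group variability. The three groups and the function name one_way_anova_f are hypothetical. Converting F to a p-value needs the F distribution, which Python's standard library does not provide, so only the statistic is computed.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA: mean square between groups
    divided by mean square within groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical interval/ratio data for three groups
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

f_statistic = one_way_anova_f(groups)
print("F:", round(f_statistic, 3))
```

A large F means the group means differ more than the spread within the groups would suggest; it does not say which groups differ from each other.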

Question 19

Q

What is a Student's t-test and when is it used?

Answer

A

One of the most basic statistical tests; most often used to compare the means of 2 groups
Typically the significance level for most tests is set at 0.05, so results with p < 0.05 are considered significant (stricter values can also be chosen)
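
A minimal sketch of the t statistic for two independent groups, assuming the equal-variance (pooled) form of Student's t-test. The data and the function name pooled_t_statistic are hypothetical. The p-value would come from the t distribution with the degrees of freedom shown, which requires a statistics table or a library outside Python's standard library.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(group_a, group_b):
    """t statistic for two independent samples, assuming equal variances:
    difference in means divided by the standard error of that difference."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    se_difference = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / se_difference

# Hypothetical outcome scores for two groups
treated = [7, 8, 6, 9]
control = [4, 5, 6, 5]

t_statistic = pooled_t_statistic(treated, control)
degrees_of_freedom = len(treated) + len(control) - 2

print("t:", round(t_statistic, 3))
print("df:", degrees_of_freedom)
```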