Module 4: Statistical Methods for Nutrition Research Flashcards

1
Q

In data collection, what are the 4 levels of measurement?

A
  1. Nominal data
  2. Ordinal data
  3. Interval scale
  4. Ratio data

Think pinot NOIR

2
Q

What is Nominal data?

A

AKA: CATEGORICAL data

It is data that comprises categories that cannot be rank ordered; each category is simply different from the others

The least complex of the 4 levels of measurement

3
Q

What level of measurement:

-cannot be placed in any order
-no judgement can be made about the relative size or distance from one category to another (aka no math operations can be done)
-consists of absolute codes OR names that are mainly used for tallying at the end
-categories are mutually exclusive (each subject falls into exactly one category)

A

NOMINAL DATA

4
Q

What is Ordinal data?

A

Has a natural order; operates off rankings or ratings, but the distances between ranks are NOT necessarily equal.

AKA: It is data that comprises categories that can be rank ordered, but the distance between each category cannot be measured

5
Q

What level of measurement:

-only judgements of order (greater than or less than) and limited statistical analysis can be performed
-the interval between adjacent values is not consistent

A

ORDINAL DATA

6
Q

(EXAMPLE) Which level of measurement is this:

A participant is directed to rate their satisfaction level on a scale from 1 to 10.

A

ORDINAL DATA because although a rating of 7 is one step above a rating of 6, the difference between 9 and 10 is not necessarily the same as the difference between 6 and 7.

7
Q

(EXAMPLE) Which level of measurement is this:

Running a race and finishing 1st and 2nd.

A

ORDINAL DATA

The difference in finishing time between the 1st and 2nd place runners is not necessarily the same as the difference between the 2nd and 3rd place runners.

8
Q

What is Interval Scale?

A

Has a constant interval, but lacks a true zero point. As a result, one can add and subtract values on an interval scale, but one cannot multiply or divide them. It’s similar to ordinal, but the differences/intervals between values ARE EQUAL.

9
Q

(EXAMPLE) Which level of measurement is this:

A 12-hour analog clock.

A

Interval Scale because the clock has equal intervals.

10
Q

What is Ratio data?

A

The most complex and the preferred scale

It has all the properties of interval data, and it also possesses a true zero point. This allows multiplication and division of values on a ratio scale.

11
Q

(EXAMPLE) Which level of measurement is this:

Height, weight, duration.

A

Ratio data. All of these have a true zero value that represents nothing being there.

12
Q

(EXAMPLE) Which level of measurement is this:

Income or money-earned in a time period.

A

Ratio data, as it could be zero or any other amount.

13
Q

What is Descriptive Statistics?

A

Statistical methods used to DESCRIBE your population or sample, without trying to extrapolate that data to another population.

Organizing and summarizing collected data using graphs and numbers

Showing the shape of the data to determine if it is skewed or normally distributed

14
Q

What are the 2 categories of Descriptive Statistics?

A
  1. Measures of Central Tendency
  2. Measures of Variability
15
Q

What are measures of Central Tendency? Definition.

A

A single value that attempts to describe a set of data, by identifying the central position within that set of data

16
Q

What are some examples of measurements that are looking at Central Tendency?

A
  1. Mean
  2. Median
  3. Mode
17
Q

What are measures of Variability? Definition.

A

Statistics that describe the amount of difference and spread within a set of data.

18
Q

What are some examples of measurements that are looking at Variability?

A
  1. Standard deviation (includes distribution of data)
  2. Variance
  3. Range
19
Q

What is the sum of all the values in a data set divided by the number of values in the data set?

A

MEAN

Considered a measurement of Central Tendency

20
Q

What is the middle score of the data, when the data has been arranged in order of magnitude?

A

MEDIAN

Considered a measurement of Central Tendency

21
Q

What is the value of the data that appears the most often?

A

MODE
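
Worked example: a minimal sketch of the three measures of central tendency, assuming hypothetical fiber intake values and Python's standard statistics module.

```python
import statistics

# Hypothetical daily fiber intakes (g) for seven participants.
fiber_g = [18, 22, 22, 25, 27, 30, 45]

mean_value = statistics.mean(fiber_g)      # sum of the values / number of values
median_value = statistics.median(fiber_g)  # middle value once the data are sorted
mode_value = statistics.mode(fiber_g)      # value that appears most often

print(mean_value, median_value, mode_value)
```

The single high value (45) pulls the mean above the median, which is one reason all three measures are usually reported together.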

22
Q

What types of data can use the mode?

A

Nominal, Ordinal, Interval, and Ratio data (both categorical and numerical data); it is the only measure of central tendency that can be used with Nominal data

23
Q

What does negatively skewed mean?

A

It means that the data is “skewed to the left”

The tail of the data is to the left, meaning that more of your data points are below the mode.

24
Q

If the tail of the data is to the left, what does this mean?

A

It means that more of your data points are BELOW the mode. The data is negatively skewed.

25
Q

What does normal (no skew) mean?

A

Symmetrically distributed data: the mean, median, and mode fall at the same central point

Rare in real-world data

26
Q

What does positively skewed mean?

A

It means that the data is “skewed to the right”

The tail of data is to the right, meaning that more of your data points are ABOVE the mode.

27
Q

If the tail of the data is to the right, what does that mean?

A

It means that more of your data points are ABOVE the mode. The data is positively skewed.
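
Worked example: one way to check the direction of skew numerically is a skewness coefficient. This is a minimal sketch assuming the moment-based (Fisher-Pearson) formula and made-up data; the function name is hypothetical.

```python
import statistics

def sample_skewness(values):
    """Moment-based skewness: third central moment / (second central moment ** 1.5)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 10]            # hypothetical data with a long right tail
left_tailed = [-x for x in right_tailed]   # mirror image, so the tail is on the left
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(right_tailed))  # positive value: positively skewed
print(sample_skewness(left_tailed))   # negative value: negatively skewed
print(sample_skewness(symmetric))     # zero: no skew
```

In the right-tailed data the mean (3.6) sits above the median (2), because the extreme value in the tail pulls the mean toward it.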

28
Q

Define Standard Deviation.

A

Roughly, the average deviation of the scores from the mean (formally, the square root of the variance)

Measures the variability of the data

It gives an idea of how close the entire data set is to the average value, or conversely how widely dispersed the data is around the mean
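
Worked example: a minimal sketch of the calculation, assuming made-up scores. Strictly, the standard deviation is the square root of the average squared deviation from the mean; the sample version divides by n - 1 instead of n.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

mean = statistics.fmean(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]

# Population SD divides by n; sample SD divides by n - 1.
population_sd = math.sqrt(sum(squared_deviations) / len(scores))
sample_sd = math.sqrt(sum(squared_deviations) / (len(scores) - 1))

print(population_sd, sample_sd)
```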

29
Q

When is it best to use the standard deviation?

A

When the data is normally distributed. Otherwise, it is a less reliable measure of variability and should be used in combination with other measures, like the range or the interquartile range.

30
Q

What does SEM mean?

A

Standard Error of the Mean

The measure of how precise an estimate is; it reflects the variability of an estimator from sample to sample, not the variability of data within the one sample

31
Q

(TRUE/FALSE)

Standard deviation = SEM

A

FALSE

Standard Deviation does NOT equal the Standard Error of the Mean (SEM)
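
Worked example: a minimal sketch of why the two differ, assuming the usual formula SEM = SD / sqrt(n) and made-up data. The function name is hypothetical.

```python
import math
import statistics

def standard_error_of_mean(values):
    """Sample standard deviation divided by the square root of the sample size."""
    return statistics.stdev(values) / math.sqrt(len(values))

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

sd = statistics.stdev(sample)          # spread of the data within this sample
sem = standard_error_of_mean(sample)   # precision of the sample mean as an estimate

print(sd, sem)
```

Because the SD is divided by sqrt(n), the SEM shrinks as the sample gets larger, while the SD does not systematically shrink.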

32
Q

Define Variance.

A

It measures how far a set of numbers is spread out and is calculated by averaging the squared deviations of the scores from the mean.

It is always zero or a positive number

33
Q

(TRUE/FALSE)

The variance can be a negative value.

A

FALSE

Variance can never be a negative number because it is an average of squared deviations, and a squared value can never be smaller than zero.

34
Q

What would a variance of zero mean? High variance?

A

Zero = all the values are identical

High variance = data points are VERY spread out around the mean
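
Worked example: a minimal sketch using made-up data sets and the population variance from Python's standard statistics module.

```python
import statistics

# Hypothetical data sets with the same mean (5) but different spread.
data_sets = {
    "identical": [5, 5, 5, 5],
    "tight": [4, 5, 5, 6],
    "spread_out": [1, 3, 7, 9],
}

# Population variance: the average squared deviation from the mean.
variances = {name: statistics.pvariance(values) for name, values in data_sets.items()}

print(variances)
```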

35
Q

Define Range.

A

The spread or the distance between the lowest and the highest value of a variable.

36
Q

Define Quartiles.

A

Segment any distribution that’s ordered from low to high into four equal parts

37
Q

Define Interquartile Range.

A

AKA midspread, middle 50%, or H-spread

It is the distance or range between the 25th percentile and the 75th percentile

It is another way to assess variability in your data
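
Worked example: a minimal sketch of quartiles and the interquartile range, assuming made-up values and the default method of statistics.quantiles. Other software may use slightly different quartile conventions and give slightly different cut points.

```python
import statistics

values = [10, 12, 14, 16, 18, 20, 22]  # hypothetical measurements

# n=4 splits the ordered data into four equal parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(values, n=4)

iqr = q3 - q1                            # distance between the 25th and 75th percentiles
data_range = max(values) - min(values)   # distance between lowest and highest value

print(q1, q2, q3, iqr, data_range)
```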

38
Q

Define Inferential Statistics.

A

Using sample data to make a conclusion about a larger or broader population

It uses probability to determine how confident we can be that the conclusions we have made are correct

39
Q

Define P-value.

A

It conveys the PROBABILITY that a difference or association at least as large as the one observed would occur by chance alone (that is, if the null hypothesis were true)

40
Q

If the P-value is less than 0.05, what does that tell us about the hypothesis?

A

Statistical Significance
We can reject the null hypothesis and accept the alternative hypothesis
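
Worked example: a minimal sketch of the decision rule, assuming the conventional significance level (alpha) of 0.05 used on this card. The function name is hypothetical.

```python
def is_statistically_significant(p_value, alpha=0.05):
    """Return True when the p-value falls below the chosen significance level."""
    return p_value < alpha

# Hypothetical p-values from three different analyses.
for p in (0.03, 0.05, 0.20):
    decision = "reject the null" if is_statistically_significant(p) else "do not reject the null"
    print(p, decision)
```

Note that the comparison is strict, so a p-value of exactly 0.05 is not treated as significant in this sketch.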

41
Q

What is the alternative hypothesis?

A

States your research prediction of an effect or relationship

42
Q

What is the null hypothesis?

A

NULL = NONE

Predicts no effect or no relationship between variables

43
Q

If the P-value is greater than 0.05, what does that tell us about the hypothesis?

A

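Not Statistically Significant
We fail to reject the null hypothesis; the data do not provide enough evidence to support the alternative hypothesis. (This does not prove that the null hypothesis is true.)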

44
Q

What is Trending?

A

A controversial term, meaning the data was “close to significant but not significant”

45
Q

What is Statistical Power?

A

The probability that you will find a significant difference or relationship if a difference or relationship truly exists in the population. Also, the probability that a test of significance will pick up on an effect that is present.

AKA- the probability of CORRECTLY rejecting the null hypothesis when appropriate.

46
Q

What are the various measures of Inferential Statistics?

A
  1. P-value
  2. Statistical Power
47
Q

What are the 4 Factors affecting Power?

A
  1. Significance Level (or alpha)
  2. Sample Size
  3. Variance, in the measured response variable
  4. Magnitude of the effect of the variable
48
Q

Define magnitude of the effect of the variable.

A

The difference between the hypothesized value of the parameter and its true value.

For example: The difference between means of the variable of interest between 2 treatment groups. The larger the magnitude of the effect, the more powerful the test is.
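
Worked example: a rough sketch of how effect size, sample size, and alpha change power. It assumes a two-sided comparison of two equal-sized groups and a normal (z) approximation; the function name is hypothetical, and variance enters through the standardized effect size (mean difference divided by the standard deviation).

```python
from statistics import NormalDist

def approx_power_two_groups(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group z-test.

    effect_size is the standardized difference: (mean1 - mean2) / SD.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing beyond either critical value when the effect is real.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

print(approx_power_two_groups(0.5, 64))              # roughly 0.8
print(approx_power_two_groups(0.2, 64))              # smaller effect, lower power
print(approx_power_two_groups(0.5, 20))              # smaller sample, lower power
print(approx_power_two_groups(0.5, 64, alpha=0.01))  # stricter alpha, lower power
```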

49
Q

Explain the inherent variability in the measured response variable.

A

As the variability increases (often seen as a larger standard deviation in the samples), the power of the test of significance DECREASES.

50
Q

What is an A Priori Power Analysis?

A

It is performed as part of the research planning process BEFORE research begins. It allows you to determine the sample size you need in order to reach the desired level of power.

Often calculated using the results of a researcher’s pilot study, or another published similar study.
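
Worked example: a minimal sketch of a sample-size calculation, assuming a two-sided comparison of two equal-sized groups and a normal approximation. The function name is hypothetical, and the effect size would come from a pilot study or a published study. Dedicated power software based on the t distribution typically gives a slightly larger answer.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate participants needed per group (normal approximation).

    effect_size is the standardized difference: (mean1 - mean2) / SD.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)

print(n_per_group(0.5))  # medium standardized effect
print(n_per_group(0.2))  # smaller effect needs many more participants
```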

51
Q

What is a Post-hoc Power Analysis?

A

It is performed AFTER your study has been conducted, using your own data and observed effect sizes.

This analysis can be used to assist in explaining any potential non-significant results, by determining if your sample size was large enough to see those results.

It can be controversial, in that some researchers do not see the value in exploring the possible reasons why they did not see significant results.

52
Q

What are T-tests?

A

A statistical test that is used to compare the means of two groups and identify if there is a statistically significant difference between them

53
Q

1-sample (single mean) t-test

A

The test variable is compared against a test value, which is a known or hypothesized value of the mean in a population.

AKA: The test value is not something you are collecting; it is something you expect based on the literature (for example, a mean reported in another study), and you compare it to the mean that you have collected.
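
Worked example: a minimal sketch of the 1-sample t statistic, assuming made-up data and a made-up test value. The function name is hypothetical, and turning t into a p-value requires the t distribution (with n - 1 degrees of freedom) from a statistics package.

```python
import math
import statistics

def one_sample_t(sample, test_value):
    """t = (sample mean - test value) / (sample SD / sqrt(n))."""
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.fmean(sample) - test_value) / standard_error

collected = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical collected values (mean = 5)
literature_mean = 4                   # hypothetical value taken from the literature

t_stat = one_sample_t(collected, literature_mean)
print(t_stat)
```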

54
Q

2-sample (two independent means) t-test

A

A method used to test whether the unknown population means of two groups are equal or not.
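
Worked example: a minimal sketch of a 2-sample t statistic with made-up data. The card does not say which version is used, so this assumes Welch's form, which does not require equal variances; the function name is hypothetical.

```python
import math
import statistics

def welch_t(group_a, group_b):
    """Difference in means divided by the standard error of that difference."""
    se = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / se

treatment = [5, 6, 7, 8, 9]  # hypothetical outcomes, mean 7
control = [1, 2, 3, 4, 5]    # hypothetical outcomes, mean 3

t_stat = welch_t(treatment, control)
print(t_stat)
```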

55
Q

Paired t-test

A

Used to test whether the mean difference between paired observations differs from zero

Example: a before and after test
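
Worked example: a minimal sketch of the paired t statistic for a before and after design, assuming made-up body weights. The function name is hypothetical.

```python
import math
import statistics

def paired_t(before, after):
    """t statistic for the mean of the paired differences (after - before)."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.fmean(diffs) / standard_error

# Hypothetical body weights (kg) for four participants.
before_kg = [80, 75, 90, 85]
after_kg = [78, 72, 89, 80]

t_stat = paired_t(before_kg, after_kg)
print(t_stat)  # negative, because weights fell on average
```

Each participant serves as their own control, so the test works on the within-person differences rather than on the two groups separately.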

56
Q

Two-tailed test

A

This test allots half of your alpha to testing the statistical significance in one direction, and the other half of your alpha to testing the statistical significance in the other direction

Regardless of the direction of the relationship you hypothesize, you are testing for the possibility of the relationship in BOTH directions

With an alpha of 0.05, this means that 0.025 is in the left tail and 0.025 is in the right tail

57
Q

One-tailed test

A

Allots all your alpha to testing the statistical significance in ONE direction of interest. This means that 0.05 or 5% is in one tail of the distribution of your statistics.

Meaning you are testing for the possibility of a relationship in one direction and completely disregarding the possibility of a relationship in the other direction.
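
Worked example: a minimal sketch of how splitting alpha changes the cutoff, assuming a standard normal (z) test statistic. The function name is hypothetical.

```python
from statistics import NormalDist

def critical_z(alpha=0.05, two_tailed=True):
    """Cutoff a z statistic must exceed (in absolute value if two-tailed)."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

print(critical_z(two_tailed=True))   # about 1.96, with 0.025 in each tail
print(critical_z(two_tailed=False))  # about 1.645, with all 0.05 in one tail
```

The one-tailed cutoff is lower, which is why switching to a one-tailed test after the fact makes significance easier to reach.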

58
Q

When is a one-tailed test appropriate?

A

If you consider the consequences of missing an effect in the untested direction and conclude that they are negligible and in no way irresponsible or unethical, then you can go ahead and use the one-tailed test

59
Q

When would a one-tailed test not be appropriate to use?

A

Choosing a one-tailed test for the sole purpose of attaining significance is NOT appropriate

AKA - Running a two-tailed test, finding nothing, and then going back to run a one-tailed test just to find something significant.

60
Q

What does ANOVA stand for?

A

ANalysis Of VAriance

61
Q

One-way ANOVA

A

Statistical test used to determine whether there are any statistically significant differences between the means of independent groups on ONE dependent variable
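
Worked example: a minimal sketch of the one-way ANOVA F statistic, assuming three made-up groups. The function name is hypothetical, and a p-value would come from the F distribution in a statistics package.

```python
import statistics

def one_way_anova_f(*groups):
    """F = (between-group mean square) / (within-group mean square)."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.fmean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(
        len(group) * (statistics.fmean(group) - grand_mean) ** 2 for group in groups
    )
    ss_within = sum(
        (x - statistics.fmean(group)) ** 2 for group in groups for x in group
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical outcome scores for three independent diet groups.
diet_a = [1, 2, 3]
diet_b = [2, 3, 4]
diet_c = [5, 6, 7]

f_stat = one_way_anova_f(diet_a, diet_b, diet_c)
print(f_stat)
```

A large F means the group means differ by more than the spread within the groups would suggest.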

62
Q

Two-way ANOVA

A

Use this when you want to know how TWO independent variables in combination affect a dependent variable.

Used to estimate how the mean of a quantitative variable changes according to the levels of TWO categorical variables

63
Q

Repeated Measures ANOVA

A

Used to test hypotheses regarding BOTH the equality of the group means AND changes in a dependent variable OVER TIME

64
Q

MANOVA

A

Multivariate ANalysis Of VAriance

Can assess differences between the means of independent groups (like ANOVA), but can also assess MULTIPLE DEPENDENT VARIABLES simultaneously

You can have one-way or two-way MANOVAs

65
Q

Chi-Square Test

A

Commonly used for testing relationships between categorical variables

66
Q

Chi-Square goodness of fit

A

Determines if sample data matches a population
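
Worked example: a minimal sketch of the goodness of fit statistic, assuming made-up observed counts and equal expected counts. The function name is hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

expected_counts = [20, 20, 20]  # hypothetical counts expected from the population
close_fit = [18, 22, 20]        # hypothetical sample that resembles the population
poor_fit = [5, 35, 20]          # hypothetical sample that does not

print(chi_square_statistic(close_fit, expected_counts))  # small statistic
print(chi_square_statistic(poor_fit, expected_counts))   # large statistic
```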

67
Q

Chi-Square test for independence

A

Compares 2 variables in a contingency table to see if they are related
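
Worked example: a minimal sketch of the test for independence on a 2 x 2 contingency table, assuming made-up counts. The function name is hypothetical; the expected count for each cell is (row total x column total) / grand total.

```python
def chi_square_independence(table):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows are two diet groups, columns are outcome yes / no.
related = [[10, 20], [20, 10]]
unrelated = [[10, 20], [20, 40]]  # same yes/no proportions in both rows

print(chi_square_independence(related))    # about 6.67
print(chi_square_independence(unrelated))  # 0.0
```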

68
Q

What does a small chi-square test result indicate?

A

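The observed data fit the expected data extremely WELL. There IS agreement between the observed and expected values (in a test for independence, this means little evidence that the two variables are related)

Vice versa for a large chi-square test result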

69
Q

Define Correlation.

A

Measures the degree of a relationship between 2 variables.
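
Worked example: a minimal sketch of one common correlation measure, Pearson's r, assuming made-up paired values. The card does not name a specific coefficient, so Pearson's r is an assumption, and the function name is hypothetical.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled to lie between -1 and 1."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]          # hypothetical predictor values
y_perfect = [2, 4, 6, 8, 10] # rises in lockstep with x
y_noisy = [2, 1, 4, 3, 5]    # rises with x, but not perfectly

print(pearson_r(x, y_perfect))  # 1.0
print(pearson_r(x, y_noisy))    # 0.8
```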

70
Q

Define Regression Analysis.

A

Evaluates the relationship between 1 or more independent variables (predictors) and a dependent variable to determine HOW the independent variables affect the dependent variable, and to PREDICT the changes in the dependent variable that changes in an independent variable can trigger

71
Q

Explain the difference between CORRELATION AND REGRESSION.

A

Correlation measures the DEGREE of the relationship between 2 variables

Regression examines HOW one variable affects the other

72
Q

Simple Regression Analysis

A

Concerned with specifying the relationship between a single numeric dependent variable (the value you are trying to predict) and one numeric independent variable (predictor)

73
Q

Multiple Regression Analysis

A

Allows researchers to assess the strength of the relationship between the dependent variable and SEVERAL independent variables, as well as allowing them to assess the IMPORTANCE of each of the predictors to the relationship, often while accounting for the effect of the other predictors.

74
Q

R-Square

A

In a regression model, indicates the percentage of the total variation in the dependent variable that is explained by your regression model.
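
Worked example: a minimal sketch of a simple (one predictor) least-squares regression and its R-Square, assuming made-up data. The function names are hypothetical.

```python
import statistics

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y, slope, intercept):
    """1 - (unexplained variation / total variation) in the dependent variable."""
    mean_y = statistics.fmean(y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total

x = [1, 2, 3, 4, 5]  # hypothetical predictor (independent variable)
y = [2, 1, 4, 3, 5]  # hypothetical outcome (dependent variable)

slope, intercept = fit_line(x, y)
r2 = r_squared(x, y, slope, intercept)
print(slope, intercept, r2)  # about 0.8, 0.6, 0.64
```

Here about 64% of the variation in y is explained by the fitted line; with a single predictor, R-Square equals the squared Pearson correlation.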

75
Q

What is Factor Analysis?

A

A technique used to reduce a large number of variables into a fewer number of factors.

This technique extracts maximum common variance from all variables and puts them into a common score to then use for further analysis to explore relationships among the data.

The test explores which variables in a data set are most related to each other.

76
Q

What is Structural Equation Modeling (SEM)?

A

A multivariate statistical analysis technique combining factor analysis and multiple regression analysis to analyze structural relationships between measured variables and latent constructs, and then to present them as a VISIBLE IMAGE (a diagram).