Statistics I Flashcards

1
Q

2 Types of Statistics

A

Descriptive statistics

Inferential statistics

2
Q

What is descriptive statistics?

A

Describes how many observations were recorded and how frequently each score or category of observation occurred in the data.

3
Q

What is inferential statistics?

A

Uses sample data to draw conclusions about a population: tests scientific hypotheses or theories and assesses evidence for cause-and-effect relationships.

4
Q

4 Types of Data/Variables

A

Nominal Data
Ordinal Data
Interval Data
Ratio Data

5
Q

What is nominal data?

A

Data that have names or arbitrary numeric assignments, with no inherent order.
For example: a participant's state of residence, gender, or yes/no responses (yes/no responses are considered binomial variables because they have only two possible responses).

6
Q

What is ordinal data?

A

Data whose categories can be arranged in ascending or descending order, although the distance between ranks is not necessarily equal.

For example: highest level of education, Likert-style survey questions (disagree/neutral/agree).

7
Q

What is interval data?

A

Data on a scale with equal intervals between values but no true zero. Not a frequently used scale.
For example: evaluation of change in participants' total cholesterol over time, with a score assigned based on the difference. The difference between 200 and 250 mg/dL is the same as that between 150 and 200 mg/dL.

8
Q

What is ratio data?

A

Numbers on a scale with a meaningful (true) zero, so the proportion of the difference between measured values can be expressed. This type of data is used widely in nutrition research.
For example: blood pressure, weight, height, total cholesterol. Continuing with the total cholesterol example, a 50 mg/dL increase from 200 to 250 mg/dL would be a 25% increase. If the total cholesterol started at 150 mg/dL, the increase to 200 mg/dL would be a 33% increase.
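
The percentages in the cholesterol example can be checked with a short sketch. The helper name percent_change is hypothetical and used only for illustration; it assumes a ratio-scale value with a non-zero starting point.

```python
def percent_change(start: float, end: float) -> float:
    """Return the change from start to end as a percentage of start."""
    if start == 0:
        raise ValueError("percent change is undefined when the starting value is 0")
    return (end - start) / start * 100


# Total cholesterol values in mg/dL, taken from the example on this card.
print(round(percent_change(200, 250), 1))  # 25.0
print(round(percent_change(150, 200), 1))  # 33.3
```

The same 50 mg/dL difference gives different percentages because the starting values differ, which is the comparison a ratio scale makes possible.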

9
Q

What are the measures of central tendency?

A

Mean, median, mode

10
Q

Mean

A

Calculation of the mean is one of the most commonly used statistics in nutrition research.
The mean (the average) is determined by summing the values for all observations and dividing by the total number of observations in the sample.

(5 + 5 + 8) / 3 = 6
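
As a quick check of the arithmetic on this card, here is a minimal Python sketch. It uses the three values from the example; the variable names are illustrative only.

```python
from statistics import mean

values = [5, 5, 8]  # the three observations from the example on this card

# Sum all observations, then divide by the number of observations.
by_hand = sum(values) / len(values)
print(by_hand)  # 6.0

# The standard-library helper gives the same result.
assert mean(values) == by_hand
```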

11
Q

Median

A

The median is the middle value when all data are placed in ascending/descending order. This means that there are the same number of values greater than the median as there are less than the median.

The median, unlike the mean, is not affected by extremely large or small values.

When there is an even number of observations, we average the two middle values to get the median.
2 4 6 9 10: median = 6
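
A small sketch of the median rules on this card, using Python's statistics module. The first list is the card's example; the even-sized list and the list with an extreme value are made-up data added for illustration.

```python
from statistics import mean, median

odd_count = [2, 4, 6, 9, 10]       # example from this card
even_count = [2, 4, 6, 9]          # hypothetical: even number of observations
with_outlier = [2, 4, 6, 9, 1000]  # hypothetical: one extremely large value

print(median(odd_count))     # 6
print(median(even_count))    # 5.0, the average of the two middle values 4 and 6
print(median(with_outlier))  # 6, unchanged by the extreme value
print(mean(with_outlier))    # 204.2, pulled upward by the extreme value
```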

12
Q

Mode

A

The mode is the most frequently occurring value in a set of observations.
Similar to the median, the mode is not affected by extremely large or small values.
Sometimes there are two (or more) modes. When there are two modes, the data are said to be bimodal.
2 3 6 6 8 9 10: mode = 6
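
The mode can be found the same way in Python. This sketch uses the card's example plus one made-up bimodal data set; multimode requires Python 3.8 or later.

```python
from statistics import mode, multimode

single_mode = [2, 3, 6, 6, 8, 9, 10]  # example from this card
two_modes = [2, 3, 3, 6, 6, 8]        # hypothetical bimodal data

print(mode(single_mode))     # 6
print(multimode(two_modes))  # [3, 6], both values occur twice
```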

13
Q

Percentile

A

A percentile provides information about how data are spread over an interval from the smallest value to the largest value. It indicates what percentage of a sample was measured below or above a given value.

Admission test scores for colleges and universities are frequently reported in terms of percentiles (e.g., you will all score at or above the 95th percentile on the RD exam). BMI and growth charts for children are also reported in terms of percentiles.
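
One way to read a percentile is as the percentage of observations at or below a given value. The sketch below assumes that definition (other conventions exist, for example counting only values strictly below); the function name percentile_rank and the test scores are hypothetical.

```python
def percentile_rank(values, score):
    """Percentage of observations that are at or below the given score."""
    if not values:
        raise ValueError("values must not be empty")
    at_or_below = sum(1 for v in values if v <= score)
    return 100 * at_or_below / len(values)


# Hypothetical test scores, for illustration only.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
print(percentile_rank(scores, 90))  # 80.0, so 8 of the 10 scores are at or below 90
```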

14
Q

Quartiles

A

Values that split the data set into 4 equally sized parts
First (lower) quartile = 0 to 25th percentile
Second (lower-middle) quartile = 26th to 50th percentile
Third (upper-middle) quartile = 51st to 75th percentile
Fourth (upper) quartile = 76th to 100th percentile
Splitting a data set into quartiles for meaningful results requires larger sets of data. Quartiles are then often used to compare the first quartile with the other three, or the fourth quartile with the other three.

15
Q

Measures of Variability (6)

A
Range
Interquartile range
Variance
Standard deviation
Standard error
Coefficient of variation
16
Q

Range

A

The range of a data set is the difference between the largest and smallest data values (i.e., its span). Thus, by subtracting the lowest value in a set of observations from the highest value, you derive the range.
Range is the simplest measure of variability.
Range is very sensitive to the smallest and largest data values.
Age range is a very common statistic in human research. You might find something like this in the literature:
Subjects' ages ranged from 50 to 78 years. This gives us an idea of the sample characteristics: middle-aged and older adults. However, just stating that the age range was 28 years isn't very helpful.
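
A minimal sketch of the range calculation. The individual ages are made up for illustration; only the minimum of 50 and the maximum of 78 come from the example on this card.

```python
# Hypothetical subject ages in years; only the endpoints come from the card.
ages = [50, 54, 61, 67, 72, 78]

youngest, oldest = min(ages), max(ages)
age_range = oldest - youngest  # highest value minus lowest value

print(f"Ages ranged from {youngest} to {oldest} years")  # more informative
print(f"Age range: {age_range} years")                   # 28 years
```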

17
Q

Interquartile Range

A

The interquartile range (IQR) of a data set is the difference between the 75th percentile (the third quartile value, Q3) and the 25th percentile (the first quartile value, Q1).

The IQR is the range for the middle 50% of the data.

The IQR overcomes the sensitivity to extreme data values that affects the range when examining a complete data set.
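
The sketch below compares the range and the IQR on made-up data, taking the IQR as the distance between the 25th and 75th percentile cut points that bound the middle 50% of the data. It uses statistics.quantiles (Python 3.8+) with its default method; other software may use slightly different quartile conventions and give slightly different cut points.

```python
from statistics import quantiles


def iqr(values):
    """Distance between the 25th and 75th percentile cut points."""
    q1, _q2, q3 = quantiles(values, n=4)  # three cut points split data into 4 parts
    return q3 - q1


# Hypothetical data; the second list replaces the largest value with an extreme one.
typical = [2, 4, 4, 5, 6, 7, 8, 9, 10, 12, 40]
extreme = [2, 4, 4, 5, 6, 7, 8, 9, 10, 12, 400]

print(max(typical) - min(typical), iqr(typical))  # 38 6.0
print(max(extreme) - min(extreme), iqr(extreme))  # 398 6.0
```

The range jumps from 38 to 398 when one value becomes extreme, while the IQR stays the same.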

18
Q

Variance

A

Variance is a measure of variability that uses all of the data to describe the dispersion around the expected value (the mean).
Variance is based on the difference between the value of each observation (x_i) and the mean (x̄ for a sample, μ for a population).

19
Q

How to calculate Variance?

A

Subtract the mean from each value and square each difference.
Sum these squared differences.
Divide by the total number of measures minus 1 (n - 1).
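
The three steps above can be sketched in Python. The data are the three values from the mean example (5, 5, 8), reused here purely for illustration; dividing by n - 1 gives the sample variance.

```python
from math import isclose
from statistics import variance

values = [5, 5, 8]  # illustrative data, reused from the mean example
n = len(values)
mean_value = sum(values) / n  # 6.0

# Step 1: subtract the mean from each value and square the difference.
squared_diffs = [(x - mean_value) ** 2 for x in values]  # [1.0, 1.0, 4.0]

# Step 2: sum the squared differences.
sum_of_squares = sum(squared_diffs)  # 6.0

# Step 3: divide by the number of measures minus 1.
sample_variance = sum_of_squares / (n - 1)
print(sample_variance)  # 3.0

# The standard-library sample variance agrees with the hand calculation.
assert isclose(sample_variance, variance(values))
```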

20
Q

Standard Deviation

A

The SD of a data set is the positive square root of the variance.
SD is measured in the same units as the data, making it easier to interpret than the variance.
If the data set is a sample, the SD is denoted as s.
If the data set is a population, the SD is denoted as σ (sigma).
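
A short sketch of the two versions of the SD, using made-up data. In Python's statistics module, stdev treats the data as a sample (s, dividing by n - 1) and pstdev treats the data as a whole population (σ, dividing by n).

```python
from math import isclose, sqrt
from statistics import pstdev, stdev, variance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, for illustration only

s = stdev(values)       # sample SD
sigma = pstdev(values)  # population SD

print(round(s, 3))  # 2.138
print(sigma)        # 2.0

# The SD is the positive square root of the variance.
assert isclose(s, sqrt(variance(values)))
```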

21
Q

Standard Error (SE)

A

SE is used to describe the estimated standard deviation of a sampling distribution. It is the value most often presented in research articles and is often referred to as the Standard Error of the Mean (SEM).

SEM is calculated as the square root of (the variance divided by the sample size), which is the same as the SD divided by the square root of the sample size.
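
A minimal sketch of the SEM calculation on made-up data, assuming the sample variance and sample SD (n - 1 in the denominator). It computes the SEM both from the variance and from the SD to show that the two forms agree.

```python
from math import isclose, sqrt
from statistics import stdev, variance

values = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample, for illustration only
n = len(values)

sem_from_variance = sqrt(variance(values) / n)  # square root of (variance / n)
sem_from_sd = stdev(values) / sqrt(n)           # SD / square root of n

print(round(sem_from_variance, 3))  # 0.756
assert isclose(sem_from_variance, sem_from_sd)
```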