Stats - Week 2 Flashcards

1
Q

When describing distributions, what do we look at?

A

Measures of shape (skewness and kurtosis), central tendency (mean, median, and mode), and “spread” or variation (range, variance, and standard deviation).

2
Q

Distribution shapes!

A

Normal distribution, Positive skew (scores bunched on the left, tail to the right), Negative skew (scores bunched on the right, tail to the left), Leptokurtic (positive kurtosis), Platykurtic (negative kurtosis)

3
Q

Measures of central tendency =

A

= estimate the “center” of our data. Mode, Median, and Mean

4
Q

MODE:

A

most frequent score in a distribution. A distribution can be bimodal or multimodal.
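
A minimal sketch of finding the mode(s), assuming Python's standard `statistics` module; the score lists are made up for illustration.

```python
import statistics

# multimode returns every value tied for most frequent.
unimodal = statistics.multimode([2, 3, 3, 4, 5])        # one mode
bimodal = statistics.multimode([1, 2, 2, 3, 4, 4, 5])   # two modes

print(unimodal)  # [3]
print(bimodal)   # [2, 4]
```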

5
Q

MEDIAN:

A

Middle score or 50th percentile
Arrange the scores in ascending order
Median = middle score if # of scores is odd
(average of middle 2 scores if # of scores is even)
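
The steps above can be sketched in a few lines of Python. The function name `median` and the example scores are illustrative assumptions, not part of the course material.

```python
def median(scores):
    """Return the middle score (odd count) or the average of the middle two (even count)."""
    ordered = sorted(scores)          # arrange the scores in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd number of scores: take the middle one
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2   # even: average the middle two

odd_median = median([7, 3, 6])        # sorted: 3, 6, 7
even_median = median([3, 4, 6, 7])    # middle two: 4 and 6
print(odd_median, even_median)        # 6 5.0
```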

6
Q

MEAN:

A

Arithmetic average of the scores in a distribution.
Computed as the sum (Σ) of the scores divided by the number of scores.
The symbol for the mean of a population is μ (mu);
the symbol for the mean of a sample is x̄ (x-bar).

7
Q

Which is most influenced by the skew: mean, median, or mode?

A

Mean
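
A small illustration of why, using made-up scores and Python's standard `statistics` module: adding one extreme score drags the mean a long way, barely moves the median, and leaves the mode alone.

```python
import statistics

scores = [2, 3, 3, 4, 5]
with_outlier = scores + [40]   # one extreme score creates a long right tail

mean_before = statistics.mean(scores)            # 3.4
mean_after = statistics.mean(with_outlier)       # 9.5
median_before = statistics.median(scores)        # 3
median_after = statistics.median(with_outlier)   # 3.5
mode_before = statistics.mode(scores)            # 3
mode_after = statistics.mode(with_outlier)       # 3
```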

8
Q

Measures of Variation Defined

A

The more variation in your data, the less precisely you can estimate the population’s location (e.g., mean) from the sample information.

9
Q

Measures of variation are?

A

Range: highest score minus lowest score (e.g., hours per day spent on phone = 3, 4, 6, 7, so 7 - 3 = a range of 4). Sum of squared errors (“sum of squares”): the total squared deviation from the mean. Subtract the mean from every data point, square each difference to get rid of the negatives, and add them all together. The squaring matters because the raw deviations always sum to zero, so without it we could not see any variation. Variance and standard deviation are built from the sum of squares.

10
Q

Range

A

Highest score minus lowest score (e.g., hours per day spent on phone = 3, 4, 6, 7, so 7 - 3 = a range of 4).

11
Q

sum of squared errors “sum of squares”

A

Gives the total squared deviation from the mean.
(I.e., subtract the mean from every data point, square each difference to get rid of the negatives, and then add them all together. The raw deviations always sum to zero, so we square them to make every number positive and actually see the variation.)
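
A worked sketch in Python, reusing the phone-hours data (3, 4, 6, 7) from the range card; the variable names are illustrative.

```python
hours = [3, 4, 6, 7]

mean = sum(hours) / len(hours)                      # 5.0
deviations = [x - mean for x in hours]              # [-2.0, -1.0, 1.0, 2.0]
sum_of_deviations = sum(deviations)                 # 0.0: negatives cancel positives
sum_of_squares = sum(d ** 2 for d in deviations)    # 4 + 1 + 1 + 4 = 10.0

print(sum_of_deviations, sum_of_squares)            # 0.0 10.0
```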

12
Q

Variance

A

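The average of the squared deviations from the mean (the sum of squares divided by N for a population, or by N - 1 for a sample).
◦Is never negative (it is zero only when every score is the same)
◦Accentuates the extreme differences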
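
A short sketch using Python's standard `statistics` module and the phone-hours data (3, 4, 6, 7), whose sum of squares is 10. It assumes the usual convention that a population variance divides by N and a sample variance divides by N - 1.

```python
import statistics

hours = [3, 4, 6, 7]   # sum of squared deviations from the mean (5) is 10

population_variance = statistics.pvariance(hours)   # 10 / 4 = 2.5
sample_variance = statistics.variance(hours)        # 10 / 3, about 3.33
```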

13
Q

Standard Deviation

A

The standard deviation of a random variable, sample, statistical population, data set, or probability distribution is the square root of its variance.

Why? It is a measure of variation expressed in the same unit of measurement as the original data (variance is in squared units). Used more!
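
A brief sketch of the units point, again with the illustrative phone-hours data and Python's standard library: the variance is in hours squared, and taking its square root returns to hours.

```python
import math
import statistics

hours = [3, 4, 6, 7]

variance = statistics.pvariance(hours)   # 2.5, in hours squared
sd = math.sqrt(variance)                 # about 1.58, in hours

# The library's own population SD gives the same number.
library_sd = statistics.pstdev(hours)
```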

14
Q

Z-scores (or “standard scores”)

A

Tell how many standard deviations a raw score is from the mean.
e.g., z = 1.96 means 1.96 SDs above the sample mean.

I.e., deviations from the mean in SD units.
Any standardized variable has a mean = 0 and SD (& variance) = 1
(Does NOT mean variable is normally distributed)

Permits a standard way to compare across scores / measures

Z-scores allow us to determine probabilities (when the scores are approximately normally distributed). E.g., the probability of a randomly selected student passing a class.
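
A minimal sketch of standardizing scores in Python, using made-up data. The probability step at the end additionally assumes a normal distribution, which standardizing does not guarantee.

```python
import statistics
from statistics import NormalDist

scores = [3, 4, 6, 7]                     # illustrative raw scores
mean = statistics.mean(scores)
sd = statistics.pstdev(scores)

# z = (raw score - mean) / SD
z_scores = [(x - mean) / sd for x in scores]

z_mean = statistics.mean(z_scores)        # 0 (up to rounding)
z_sd = statistics.pstdev(z_scores)        # 1 (up to rounding)

# Under a normal distribution, the chance of scoring above z = 1.96:
p_above = 1 - NormalDist().cdf(1.96)      # roughly 0.025
```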

15
Q

Null Hypothesis Significance Testing (NHST) Steps

A

Step 1: State the hypothesis.
Step 2: Set the criterion for rejecting the null hypothesis.
Step 3: Compute the test statistic.
Step 4: Decide whether to reject the null hypothesis.
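
The four steps can be sketched as a one-sample z-test in Python. Every number below (population mean, SD, sample mean, sample size) is hypothetical, and the z-test itself is just one convenient choice of test statistic for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values, for illustration only.
population_mean = 100   # the value claimed by the null hypothesis
population_sd = 15      # assumed known
sample_mean = 106
n = 25

# Step 1: state the hypotheses. H0: mu = 100; H1: mu != 100 (non-directional).
# Step 2: set the criterion: alpha = .05, two-tailed.
alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)        # about 1.96

# Step 3: compute the test statistic.
standard_error = population_sd / sqrt(n)                # 3.0
z = (sample_mean - population_mean) / standard_error    # 2.0

# Step 4: decide whether to reject the null hypothesis.
reject_null = abs(z) > critical_z
print(z, reject_null)                                   # 2.0 True
```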

16
Q

Null hypothesis Defined

A

No relationship between variables (in the population!)

Or, no difference between groups

17
Q

Alternative hypothesis (H1)

A
The null (no relationship) is not true. 
(non-directional hypothesis).

Or, there is a relationship, or difference between groups in one direction (a directional hypothesis).
E.g., “I/O is better than Clinical”

18
Q

Alpha (α) level (p-value)

A

Alpha (α) is the probability of a Type I error that we are willing to accept (e.g., 5%). It is set before the data are analyzed.

A p-value, or probability value, is the probability of obtaining data at least as extreme as what was observed if the null hypothesis is true. It is not the probability that the null hypothesis is true.

The p-value is a number between 0 and 1. The smaller the p-value, the stronger the evidence against the null hypothesis; we reject the null when the p-value falls below alpha.
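
A hedged sketch of turning a test statistic into a p-value, assuming a z statistic and a standard normal sampling distribution under the null. The value z = 2.0 is hypothetical.

```python
from statistics import NormalDist

z = 2.0          # hypothetical observed test statistic
alpha = 0.05     # criterion chosen in advance

standard_normal = NormalDist()   # mean 0, SD 1

# Probability of a result at least this extreme if the null is true.
p_one_tailed = 1 - standard_normal.cdf(z)               # about 0.023
p_two_tailed = 2 * (1 - standard_normal.cdf(abs(z)))    # about 0.046

significant = p_two_tailed < alpha
```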

19
Q

Type I error

A

Wrongly rejecting the null hypothesis.
(Concluding that “there is a relationship” or “two groups differ” in the population, when it just ain’t so.)
E.g., telling a man he is pregnant.

20
Q

Set Criterion for Rejecting H0

Statistical Significance

A

We decide how much risk to tolerate.

Traditional cutoff for statistical significance = .05 / .01 / .001

21
Q

Type II error:

A

Wrongly failing to reject the null hypothesis.
(Saying “no difference”/“no relationship” when there is one in the population.)
β (beta)= probability of making a Type II error
Statistical power = (1 – β).
Probability of rejecting the null hypothesis (H0) when it is false

We want an “80% chance of finding an effect, assuming there is one in the population” (power of .80)

E.g., telling a pregnant woman she isn’t pregnant.
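
A rough sketch of how β and power relate, assuming a one-tailed, one-sample z-test. The null mean, true mean, SD, and sample size are all made-up numbers; with these values power comes out below the .80 target.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values, chosen only for illustration.
null_mean = 100        # mean claimed by H0
true_mean = 106        # mean if there really is an effect
population_sd = 15
n = 25
alpha = 0.05           # one-tailed criterion

standard_error = population_sd / sqrt(n)             # 3.0
critical_z = NormalDist().inv_cdf(1 - alpha)          # about 1.645
# Smallest sample mean that would lead us to reject H0.
critical_mean = null_mean + critical_z * standard_error

# Sampling distribution of the mean if the true mean really is 106.
sampling_dist = NormalDist(mu=true_mean, sigma=standard_error)
beta = sampling_dist.cdf(critical_mean)   # P(fail to reject H0 | H0 is false)
power = 1 - beta                          # P(reject H0 | H0 is false)
```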

22
Q

Critical value

A

Indicates the region of rejection:
Values (of a statistic) in the sampling distribution that are improbable if the null is true.
E.g., z > 1.65 (one-tailed), or z > 1.96 or z < -1.96 (two-tailed), for an alpha level of .05.
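
A quick check of where cutoffs like 1.65 and 1.96 come from, assuming a standard normal sampling distribution and an alpha of .05 (the alpha value is an assumption made for this sketch).

```python
from statistics import NormalDist

alpha = 0.05   # assumed criterion
standard_normal = NormalDist()

# One-tailed: all of alpha sits in one tail.
one_tailed_critical = standard_normal.inv_cdf(1 - alpha)        # about 1.645

# Two-tailed: alpha is split evenly between both tails.
two_tailed_critical = standard_normal.inv_cdf(1 - alpha / 2)    # about 1.96
```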

23
Q

Is a directional H1 a one- or two-tailed test?

A

one-tailed test

24
Q

Is a nondirectional H1 a one- or two-tailed test?

A

Two-tailed test

25
Q

Compute test statistic

A

-Some ratio of MODEL / ERROR
(variance explained by our model / unexplained variance), e.g., z, t-ratio, F-ratio, chi-square (we’ll compute these later).

-Compare the test statistic value (e.g., z) to the critical value.
e.g., is the test statistic > 1.65?

26
Q

Correlation/regression

A

Is the relationship significantly different from zero?

27
Q

t-test/anova

A

Is the difference between groups significantly different from zero?

28
Q

Correlation defined

A

How similar or how related your variables are. Do these variables walk together, that is, vary in similar ways? As one increases, does the other increase? An index of the linear relatedness of two variables.

Ranges between -1 and +1.

The sign of the correlation coefficient indicates?

  • Whether the relationship is positive or negative.
  • Positive = as x increases, y increases. Negative = as x increases, y decreases.

Absolute value of the coefficient indicates?

  • How strong the relationship is. Pearson’s r is what we use to show that relationship.
  • The closer r is to -1 or +1, the stronger the relationship; an r near 0 means little or no linear relationship.
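
A minimal sketch of computing Pearson's r by hand in Python. The function name `pearson_r` and the data are illustrative assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: shared variation divided by total variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of the deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)   # sum of squares for x
    ss_y = sum((y - mean_y) ** 2 for y in ys)   # sum of squares for y
    return cross / sqrt(ss_x * ss_y)

hours = [3, 4, 6, 7]
rising = [1, 2, 3, 4]      # goes up as hours go up
falling = [4, 3, 2, 1]     # goes down as hours go up

r_positive = pearson_r(hours, rising)    # close to +1
r_negative = pearson_r(hours, falling)   # close to -1
```

The sign differs between the two results while the absolute value is the same, matching the sign and strength distinction on this card.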
29
Q

Positive Skewness

A

when the tail on the right side of the distribution is longer or fatter. The mean and median will be greater than the mode.

30
Q

Negative Skewness

A

when the tail of the left side of the distribution is longer or fatter than the tail on the right side. The mean and median will be less than the mode.
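
A small sketch of both skew directions, using made-up scores. The `skewness` function below is one common definition (the average cubed z-score) and is an assumption of this example, not something the course specifies.

```python
import statistics

def skewness(scores):
    """Average cubed z-score: positive for a right tail, negative for a left tail."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return sum(((x - mean) / sd) ** 3 for x in scores) / len(scores)

right_tailed = [1, 1, 1, 2, 2, 3, 4, 10]         # long tail on the right
left_tailed = [11 - x for x in right_tailed]     # mirror image: long tail on the left

for data in (right_tailed, left_tailed):
    print(statistics.mode(data), statistics.median(data),
          statistics.mean(data), round(skewness(data), 2))
```

For the right-tailed scores the mode, median, and mean are 1, 2, and 3; for the mirrored scores they are 10, 9, and 8.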

31
Q

Kurtosis

A

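A measure of how heavy the tails of a distribution are, that is, how prone it is to outliers, compared with a normal distribution. Leptokurtic (positive kurtosis) = heavier tails; platykurtic (negative kurtosis) = lighter tails.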
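
A rough sketch of measuring kurtosis, assuming the common "excess kurtosis" definition (average fourth-power z-score minus 3, so a normal distribution scores 0). The function name and both data sets are made up for illustration.

```python
import statistics

def excess_kurtosis(scores):
    """Average z-score to the fourth power, minus 3 (the value for a normal curve)."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    return sum(((x - mean) / sd) ** 4 for x in scores) / len(scores) - 3

heavy_tails = [-10, 0, 0, 0, 0, 0, 0, 0, 10]   # most scores at the center, two outliers
flat = list(range(1, 10))                      # scores spread evenly, no outliers

leptokurtic = excess_kurtosis(heavy_tails)     # positive
platykurtic = excess_kurtosis(flat)            # negative
```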