Final (stats) Flashcards
Table
used to present many numerical values
Figure
used to show patterns, trends, or relationships
Qualities of a good table
Should be understandable on its own
Includes appropriate title in proper location
Logical format
Justified numbers → decimal points line up
Good / consistent spacing
Legend
Qualities of a good figure
Understandable on its own
Axis labels (with units)
Appropriate scaling of axes
Symbols
Customized (not the Excel default)
No need for box borders around graph
Trendline should be thicker and clear
Figure legends (caption)
The key to understanding a figure
A good figure legend includes:
Title
Materials and methods (description of techniques used)
Results (further explanation of the data)
Definitions (of symbols, patterns, lines, abbreviations, etc).
Monty Hall Problem
It involves a scenario where you have a 1/3 chance of initially choosing the door with a prize behind it. When the host reveals one of the other doors with no prize, the probabilities shift. By switching doors, you essentially capitalize on the new information and increase your chances of winning to 2/3.
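A quick simulation makes the 1/3 vs. 2/3 split easy to verify. This is a minimal sketch using only Python's standard library; the function name and trial count are arbitrary choices.

```python
import random

def monty_hall(switch, trials=100_000):
    """Play the game `trials` times and return the win rate."""
    wins = 0
    for _ in range(trials):
        prize = random.randrange(3)    # door hiding the prize
        choice = random.randrange(3)   # contestant's first pick
        # Host opens a door that is neither the pick nor the prize.
        opened = next(d for d in range(3) if d != choice and d != prize)
        if switch:
            # Switch to the single remaining unopened door.
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += (choice == prize)
    return wins / trials

print(f"stay:   {monty_hall(switch=False):.3f}")  # ~0.333
print(f"switch: {monty_hall(switch=True):.3f}")   # ~0.667
```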
Probability
The degree of certainty or chance that something will happen.
Statistics
Help us…
Reduce and describe data
Quantify relationships among data
Determine if sets of data are similar / different
Goals of a data analysis
Data reduction (and description)
Reduce measures to make more meaningful
Averages, spread, bar chart / plots / histograms (descriptive)
Easier and more meaningful to read than all the individual data.
Establish relationships
Descriptive – describe relationship between two observations
Relationship between height and weight
Causal – did one thing cause the other
Intervention → caused some response
Inference
Infer outcome from sample to population
Is what we see in sample true in population
Purpose of sampling
to approximate a larger population on characteristics relevant to the research question.
Histograms
Graphical representations
Mainly represent frequency (# of subjects that fall into a range).
Measures of central tendency
Mean
average
x̄ = ΣX / N
Median
middle of distribution
Mode
most frequently occurring value
Range
difference between high and low values in a data set
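All four measures can be computed with Python's standard library; a minimal sketch with a made-up data set:

```python
from statistics import mean, median, mode

data = [2, 3, 3, 5, 7, 8, 9]  # made-up sample

print("mean:  ", mean(data))            # sum of values / n
print("median:", median(data))          # middle of the sorted values
print("mode:  ", mode(data))            # most frequently occurring value
print("range: ", max(data) - min(data)) # high value minus low value
```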
Confidence interval
interval estimate of the population mean (using SEM)
Standard Error of the Mean (equation)
SEM = s / √n (standard deviation divided by the square root of the sample size)
Normal distribution
probability distribution that is symmetric about the mean
Kurtosis
measure of outliers in a distribution
High kurtosis → heavy tails or outliers (leptokurtic)
Low kurtosis → light tails or few outliers (platykurtic)
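A sketch of how kurtosis can be checked numerically, assuming NumPy and SciPy are available. Note that scipy.stats.kurtosis reports excess kurtosis, so a normal distribution scores near 0:

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)
normal = rng.normal(size=100_000)           # baseline bell curve
heavy = rng.standard_t(df=3, size=100_000)  # heavy-tailed distribution
light = rng.uniform(-1, 1, size=100_000)    # light-tailed distribution

print(kurtosis(normal))  # ~0
print(kurtosis(heavy))   # large positive -> leptokurtic
print(kurtosis(light))   # negative -> platykurtic
```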
Standard deviation
Measure of data around the mean
Amount by which every value varies from the mean
How tightly values in dataset are bunched around the mean
Variability of individual observations around a single sample mean
Central limit theorem
when many samples are drawn from a population, the means of these samples tend to be normally distributed.
Empirical rule
for a normal distribution, nearly all data fall within three standard deviations of the mean.
Standard error of the mean (SEM)
how close the sample mean is likely to be to the true population mean
also shows how accurately the sample mean reflects the sample data
essentially compares the experimental (sample) mean to the true population mean
SEM will always be lower than the standard deviation
the larger the sample, the lower the SEM (which is good)
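A minimal sketch of the calculation with NumPy and a made-up sample (scipy.stats.sem gives the same result):

```python
import numpy as np

sample = np.array([4.1, 5.0, 4.6, 5.3, 4.8, 5.1])    # made-up data
sem = np.std(sample, ddof=1) / np.sqrt(len(sample))  # s / sqrt(n)
print(sem)
```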
Confidence intervals
give an estimate of how well the sample mean represents the population mean.
Range of likely values for population parameter
Uses a reliability coefficient (such as the z-value) and the SEM
Statistical hypothesis testing
Applies the scientific method to data with random fluctuation
The null hypothesis (H0)
the observed effect in the data does not represent a real effect but is merely the result of random fluctuation.
Hypothesis that there will be no difference or relationship between variables.
Alternative hypothesis (Ha)
hypothesis formulated based on existing knowledge, theories, or observations.
Difference between variables is specified (one group is greater or less than the other)
One Tailed vs. Two Tailed Test
One-tailed → clear directional prediction based on prior knowledge
Two-tailed → no specific direction expected / both directions equally plausible
The decision to use a one-tailed vs. a two-tailed test must be made prior to conducting the analysis.
P-Value
The p-value is the probability of obtaining a result at least as extreme as the one observed when the null hypothesis is true.
Computed under the assumption that the null hypothesis is true
Region of high probability = high p-value
Region of low probability = low p-value
A p-value of 0.05 tells us there is a 5% chance of seeing data this extreme if the null hypothesis were true; the corresponding confidence level tells us we can be 95% confident the data support the alternative hypothesis.
Parameters of likelihood for observations
called alpha levels
predetermined (usually 0.05)
Alpha value is a probability value
The chosen threshold is low enough that results below it lead us to decide the null hypothesis is unlikely to be true.
If p-value is less than alpha, it is unlikely the null is true.
Significant difference
If the p-value is low enough, we reject the null hypothesis and conclude a significant difference (when p < α).
Parametric tests
Assumes the sample represents the population
Follows a normal population distribution (regular bell shape)
Non-parametric tests
No assumptions about the shape of the population distribution
The area of study is better represented by the median (not the normal distribution)
Very small sample size
Used for ordinal or ranked data, or when outliers cannot be removed.
T-test (overview)
compares the means between two groups
Based on t distribution
T-value measures the size of the difference relative to variation in sample data.
Independent (unpaired) t-test
Grouping categories are independent and unrelated
Ex. different people, animals, or things where values of one group do not affect the other.
Dependent (paired) t-test
Grouping categories are related
Ex. the same person at two points in time.
If the t test statistic is greater than the critical value, the null can be rejected.
A smaller sample means fatter tails in the t distribution (greater likelihood of outlying values, which is undesirable). A larger sample means values will fall closer to the mean.
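A sketch of both variants using scipy.stats; all data here are made up for illustration:

```python
import numpy as np
from scipy import stats

# Independent (unpaired): different subjects in each group.
group_a = np.array([5.1, 4.8, 6.0, 5.5, 4.9, 5.7])
group_b = np.array([6.2, 5.9, 6.8, 6.4, 6.1, 7.0])
t_ind, p_ind = stats.ttest_ind(group_a, group_b)

# Dependent (paired): the same subjects measured twice (e.g., pre/post).
pre  = np.array([5.1, 4.8, 6.0, 5.5, 4.9, 5.7])
post = np.array([5.6, 5.0, 6.3, 5.9, 5.4, 6.1])
t_dep, p_dep = stats.ttest_rel(pre, post)

print(f"unpaired: t = {t_ind:.2f}, p = {p_ind:.4f}")
print(f"paired:   t = {t_dep:.2f}, p = {p_dep:.4f}")
# Reject the null when p < alpha (e.g., 0.05).
```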
ANOVA
compares the means among three or more groups
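A sketch using scipy.stats.f_oneway with made-up data for three groups:

```python
from scipy import stats

# Made-up measurements for three independent groups.
g1 = [5.1, 4.8, 6.0, 5.5]
g2 = [6.2, 5.9, 6.8, 6.4]
g3 = [4.0, 4.4, 3.9, 4.6]

f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p says *some* means differ, but not which ones --
# that is what the post-hoc tests described below are for.
```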
Degrees of freedom
Is the freedom to vary
degrees of freedom = n − 1, where n is the sample size
Indicates the number of independent pieces of information
Critical value
AKA the z-value (when a z-test is used)
a specific value or threshold used to determine the acceptance or rejection of a statistical test or hypothesis
Types of error
Type I error (false positive) → rejection of a null hypothesis that is actually true in the population (there's a significance when there actually isn't)
Type II error (false negative) → failure to reject a null hypothesis that is actually false in the population (there's no significance when there actually is)
Type I error is especially a risk when doing multiple t-tests instead of an ANOVA
Bonferroni technique
It reduces the chance of incorrectly rejecting the null hypothesis (Type I error) in any of the individual tests, but it also increases the possibility of a Type II error (false negative), meaning you might fail to detect a true effect.
Looks at the alpha level and divides that by the number of comparisons being made.
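A worked example of the adjustment (the number of comparisons here is arbitrary):

```python
alpha = 0.05
n_comparisons = 3  # e.g., three pairwise t-tests
adjusted_alpha = alpha / n_comparisons
print(adjusted_alpha)  # ~0.0167: each test must beat this stricter cutoff
```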
Downsides of just an ANOVA test
Doesn’t tell us which means are different, it only determines that there is a difference.
Hence, we can follow that with a post-hoc test.
They help identify where the significant differences lie, providing more specific and detailed information about the relationships between the groups or conditions being compared
Correlation
Measures the association of two variables
Uses correlation when we want to quantify the strength and direction of a relationship.
Depictions of correlations
r < 0.30 → weak to no correlation
r ≈ 0.30–0.70 → moderate to strong relationship
r > 0.70 → substantial to very strong relationship
r
correlation coefficient
Refers to a measure of the strength and direction of the linear relationship between two variables. It ranges between -1 and 1, where -1 indicates a perfect negative linear relationship, 1 indicates a perfect positive linear relationship, and 0 indicates no linear relationship. Essentially it tells you strength (magnitude) and direction (positive or negative sign) for the relationship.
The Pearson correlation coefficient is the specific calculated value.
r^2
coefficient of determination
The amount of variance in one variable that is explained or predicted by variance in another variable.
It ranges between 0 and 1, where 0 indicates that the independent variables explain none of the variability in the dependent variable, and 1 indicates that the independent variables explain all of the variability.
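A sketch computing r and r² with scipy.stats.pearsonr, using made-up height/weight data (echoing the example above):

```python
import numpy as np
from scipy import stats

# Made-up height (cm) and weight (kg) pairs.
height = np.array([160, 165, 170, 175, 180, 185])
weight = np.array([55, 61, 64, 70, 74, 80])

r, p = stats.pearsonr(height, weight)
print(f"r   = {r:.3f}")     # strength and direction of the relationship
print(f"r^2 = {r**2:.3f}")  # proportion of variance shared
```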
b
slope (of the regression line)
a
y-intercept (of the regression line)
Best straight line fit
Minimize the sum of the squared difference between data and curve fit line.
y = a + bx (b = slope, a = y-intercept)
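numpy.polyfit with degree 1 performs exactly this least-squares minimization; a minimal sketch with made-up data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])  # made-up data, roughly y = 2x

# polyfit minimizes the sum of squared differences
# between the data and the fitted line.
b, a = np.polyfit(x, y, deg=1)  # returns slope, then intercept
print(f"y = {a:.2f} + {b:.2f}x")
```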
Single vs. multiple correlations
Single correlation refers to the relationship between two variables, typically measured using the Pearson correlation coefficient (r-value)
Multiple correlation refers to the relationship between a dependent variable and multiple independent variables.
Type I Error
The probability of committing a type I error (rejecting the null when it's actually true) is whatever our alpha value is set to.
Ex. for an alpha value of 0.05, we have a 5% chance of committing type I error.
Type II Error
The probability of committing a type II error (failure to reject the null when it is false) is denoted by beta. Beta relates to power: power = 1 − β, and a typical target is β = 0.2, i.e., an 80% chance of detecting a real effect.
Beta is the probability that the experiment will yield a non-significant result when a real effect is present
High power = high chance that your experiment will find a statistically significant result when one is present.
Power
the probability of rejecting the null hypothesis when it is false (good thing).
Is the probability (ranging from 0 to 1)…
Of making a correct decision
That a significance test will pick up an effect that is present
Of avoiding a type II error
Factors affecting power:
- effect size
- sample size
- sample variance
- alpha level
Effect size
A large effect size (widely separated, distinct group distributions) means we won't need much power to be sure the groups are different.
Hence, more power is needed to detect smaller differences.
Sample size
Smaller sample sizes yield less power; larger samples are needed to detect small differences
Sample variance
Higher variance yields less power.
Alpha value
Alpha level is proportional to power:
lowering the alpha value increases the strictness of the test, making it harder to reject the null hypothesis (less power)
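A sketch of how these factors interact, assuming the statsmodels package is available; the effect sizes below are arbitrary choices:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Sample size per group to detect a medium effect (d = 0.5)
# at alpha = 0.05 with 80% power.
n_medium = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"d = 0.5 -> n per group ~ {n_medium:.0f}")  # ~64

# A larger effect needs far fewer subjects for the same power.
n_large = analysis.solve_power(effect_size=1.0, alpha=0.05, power=0.8)
print(f"d = 1.0 -> n per group ~ {n_large:.0f}")   # ~17
```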
Diagnostic tests
used to determine the presence or absence of a particular condition in an individual.
Validity is evaluated by a test’s ability to assess the presence (sensitivity) or absence (specificity) of a medical condition.
Tries to answer a yes or no question, often from a non-binary variable. Thus, there must be a cutoff point to convert the measurement into a yes or no answer.
True positive
test predicts condition and they have condition
False positive
test predicts condition but they do not have condition
False negative
test predicts no condition but they have the condition
True negative
test predicts no condition and they do not have condition
Prevalence
proportion of the population that has a condition
(true positive + false negative) / everyone (total population)
Sensitivity
proportion of people with the condition who test positive (relative to all individuals with the condition).
True positives / total people with condition (TP + FN)
Specificity
proportion of people without condition that test negative (relative to all individuals without the condition)
True negatives / total people without condition (TN + FP)
Positive predictive value
proportion of people with the condition who tested positive (relative to all positive tests).
True positives / total positives (TP + FP)
Negative predictive value
proportion of people without condition who test negative for the condition (relative to all negative tests)
True negatives / total negative tests (TN + FN)
Accuracy
ability to identify true results
(True positives + true negatives) / total number of tests (TP + FP + TN + FN)
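All of these metrics follow directly from the four cell counts of the 2×2 table; a minimal sketch with made-up counts:

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Compute standard diagnostic-test metrics from a 2x2 table."""
    total = tp + fp + fn + tn
    return {
        "prevalence":  (tp + fn) / total,
        "sensitivity": tp / (tp + fn),  # positives caught among the sick
        "specificity": tn / (tn + fp),  # negatives caught among the healthy
        "ppv":         tp / (tp + fp),  # trustworthiness of a positive test
        "npv":         tn / (tn + fn),  # trustworthiness of a negative test
        "accuracy":    (tp + tn) / total,
    }

print(diagnostic_metrics(tp=90, fp=25, fn=10, tn=875))  # made-up counts
```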
z-value
related to confidence intervals
tells you how many standard deviations you are away from the mean. If a z-score is equal to 0, it is on the mean
how to calculate confidence interval
CI = x̄ ± z(s/√n)
x̄ = sample mean
z = z-value corresponding to the desired confidence level (taken from a table)
s = standard deviation
n = sample size
the output is a range
z-value at each confidence interval
99% –> 2.576
95% –> 1.96
90% –> 1.645
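Putting the formula and the z-values together, a minimal sketch with a made-up sample:

```python
import numpy as np

sample = np.array([98.2, 99.1, 97.8, 98.6, 98.9, 98.4, 99.3, 98.0])
x_bar = sample.mean()
s = sample.std(ddof=1)
n = len(sample)

z = 1.96                         # z-value for a 95% confidence interval
half_width = z * s / np.sqrt(n)  # z * SEM
print(f"95% CI: {x_bar - half_width:.2f} to {x_bar + half_width:.2f}")
```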
percentage of common association equation
r² × 100 (the coefficient of determination expressed as a percentage of shared variance)
how do the t statistic and critical value affect the acceptance or rejection of the null hypothesis
If the t test statistic is greater than the critical value, the null hypothesis can be rejected.