Stats Vocab List Flashcards

Question 1

Q

Sensitivity

Answer

A

A test’s ability to correctly identify someone with the disease as positive (the true positive rate): P(Positive given Disease Present).

Question 2

Q

Specificity

Answer

A

A test’s ability to correctly identify someone without the disease as negative, without being misled by alternative cases (the true negative rate): P(Negative given Disease Absent).

Question 3

Q

False Positive

Answer

A

A test’s chance of overcorrecting: detecting disease that is not present. (Positive Given Disease Absent)

Question 4

Q

False Negative

Answer

A

A test’s chance of missing disease, of undershooting. (Negative Given Disease Present.)
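The four rates on these cards can be worked out from a single 2x2 table of test results. The sketch below is only an illustration: the counts are made up, and the variable names are not from any particular textbook.

```python
# Hypothetical counts from a 2x2 diagnostic table (made-up numbers).
true_pos = 90    # test positive, disease present
false_neg = 10   # test negative, disease present
false_pos = 30   # test positive, disease absent
true_neg = 870   # test negative, disease absent

disease_present = true_pos + false_neg
disease_absent = false_pos + true_neg

sensitivity = true_pos / disease_present           # P(positive | disease present)
specificity = true_neg / disease_absent            # P(negative | disease absent)
false_positive_rate = false_pos / disease_absent   # P(positive | disease absent)
false_negative_rate = false_neg / disease_present  # P(negative | disease present)

print(sensitivity, specificity, false_positive_rate, false_negative_rate)
```

Sensitivity and the false negative rate share a denominator (disease present), so they add to 1; the same holds for specificity and the false positive rate.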

Question 5

Q

Disjoint

Answer

A

Mutually exclusive outcomes: they cannot both happen at once. Ex: Getting a Head and a Tail on the same coin flip.

Question 6

Q

OR

Answer

A

One or the other, or both (inclusive or): at least one of the events happens. Contrast: exclusive or, meaning one or the other but not both.

Question 7

Q

Addition Rule

Answer

A

Chance of either event happening (non-disjoint): P(A or B) = P(A) + P(B) - P(A & B).

Disjoint: P(A or B) = P(A) + P(B).
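A quick way to check both forms of the addition rule is to count outcomes on one fair six-sided die. This is a minimal sketch; the die, the chosen events, and the helper `prob` are assumptions made for illustration.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes) when outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

even = {2, 4, 6}
greater_than_four = {5, 6}   # overlaps with `even` at 6
rolled_one = {1}             # disjoint from `even`

# General rule: subtract the overlap so it is not counted twice.
general_lhs = prob(even | greater_than_four)
general_rhs = prob(even) + prob(greater_than_four) - prob(even & greater_than_four)

# Disjoint rule: no overlap, so the probabilities simply add.
disjoint_lhs = prob(even | rolled_one)
disjoint_rhs = prob(even) + prob(rolled_one)

print(general_lhs, general_rhs, disjoint_lhs, disjoint_rhs)
```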

Question 8

Q

Dependent

Answer

A

Used in conditional probability to describe events that influence each other. Ex: Someone who plays basketball is more likely to be taller than the average American. P(A & B) = P(A) * P(B given A): the chance that A and B both happen equals the chance that A happens times the chance that B happens given that A has already happened.

Question 9

Q

Complete

Answer

A

Every possible outcome is in the sample space - you’ve represented everything.

Question 10

Q

Two-Way Table

Answer

A

A type of table where the frequencies of two categorical variables are shown against each other, e.g. whether someone is a woman (yes/no) against whether they prefer Pokémon or Digimon.

Question 11

Q

Stem & Leaf Plot

Answer

A

Type of quantitative plot where each stem represents the leading digit(s), such as the tens place (possibly split into half-stems), and the remaining digit of each value is listed beside it as a leaf. Like an enumerated dot plot. Best for a single quantitative variable with a small number of values, where seeing individual values matters; also allows comparing groups and shapes.

Question 12

Q

Histogram

Answer

A

Type of graph for one quantitative variable, with values grouped into bins on one axis and frequency (or relative frequency) on the other. Big bars. Best for large amounts of data where we care most about the shape of the distribution.

Question 13

Q

Independent

Answer

A

Events that do not impact each other. P(A given B) = P(A), so P(A & B) = P(A) * P(B).
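Independence can be checked by comparing P(A & B) with P(A) * P(B), or equivalently P(A given B) with P(A). The sketch below uses one fair die; the events and the helper names `prob` and `conditional` are assumptions chosen so that one pair is independent and the other is dependent.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    return Fraction(len(event & sample_space), len(sample_space))

def conditional(a, b):
    """P(a given b) = P(a & b) / P(b)."""
    return prob(a & b) / prob(b)

even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
at_least_four = {4, 5, 6}

# Independent pair: knowing the roll is at most 4 does not change P(even).
independent = prob(even & at_most_four) == prob(even) * prob(at_most_four)

# Dependent pair: knowing the roll is at least 4 raises P(even) from 1/2 to 2/3.
dependent = prob(even & at_least_four) != prob(even) * prob(at_least_four)

# Multiplication rule for dependent events: P(A & B) = P(B) * P(A given B).
product = prob(at_least_four) * conditional(even, at_least_four)

print(independent, dependent, conditional(even, at_least_four), product)
```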

Question 14

Q

Cases

Answer

A

Experimental units, what data is collected from.

Question 15

Q

Variable

Answer

A

The characteristic (quantity or quality) that varies between cases and is measured through statistics.

Question 16

Q

Simulation

Answer

A

Running a theoretical model of chance many times and comparing the results to the real data, to estimate how likely the observed result is to occur.
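As a rough illustration of the idea, the sketch below simulates 10 fair coin flips many times and asks how often chance alone gives 8 or more heads. The scenario, the observed value of 8, and the number of trials are all assumptions picked for the example.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def count_heads(flips=10):
    """Simulate `flips` fair coin flips and count the heads."""
    return sum(rng.random() < 0.5 for _ in range(flips))

trials = 10_000
observed_heads = 8  # hypothetical real-world result we want to judge

# Count how many simulated runs are at least as extreme as the observed one.
extreme_runs = sum(count_heads() >= observed_heads for _ in range(trials))
estimated_chance = extreme_runs / trials

print(estimated_chance)
```

The exact probability here is 56/1024, roughly 0.055, so the estimate should land near that value; more trials give a tighter estimate.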

Question 17

Q

Skew

Answer

A

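Data trailing off significantly to one side of the median, forming a tail. The skew is named for the side of the tail (left or right). Often occurs when the data sit against a wall (a hard minimum or maximum).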

Question 18

Q

Normal Distribution

Answer

A

Describes a common distribution pattern with a single peak where the mean, median and mode coincide (the mean would be pulled away by skew), and a symmetrical shape. Empirical Rule: about 68%, 95% and 99.7% of values fall within 1, 2 and 3 SDs of the mean.
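The 68 / 95 / 99.7 figures can be checked with the normal CDF from Python's standard library. This is a small sketch; the helper name `share_within` is an assumption.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 2))
```

The exact shares are slightly off the round numbers (for example, a little over 95% within 2 SDs), which is why the rule is stated as approximate.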

Question 19

Q

Uniform

Answer

A

Describes a flat distribution pattern, with no real mode and no outcomes more likely than others.

Question 20

Q

Standard Normal Distribution

Answer

A

Idealized normal curve with mean, median and mode all at 0, an SD of 1, and perfect symmetry.

Question 21

Q

Standard Deviation

Answer

A

Describes the typical distance of values from the center (mean), as a measure of spread. For a normal curve, this is the distance to the inflection point from the center.

Calculated as the square root of: the sum of the squared differences from the mean for each value (squared first, then added), all over n (population) or n-1 (sample).
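The calculation can be followed step by step on a small data set. The values below are made up for illustration; the results are compared against `statistics.pstdev` and `statistics.stdev` from the standard library.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

mean = sum(values) / len(values)

# Square each difference from the mean first, then add them up.
sum_of_squares = sum((x - mean) ** 2 for x in values)

# Divide by n for a population, n - 1 for a sample, then take the square root.
population_sd = math.sqrt(sum_of_squares / len(values))
sample_sd = math.sqrt(sum_of_squares / (len(values) - 1))

print(population_sd, statistics.pstdev(values))
print(sample_sd, statistics.stdev(values))
```

The sample version is always a little larger, because dividing by n-1 compensates for measuring spread around the sample mean instead of the unknown population mean.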

Question 22

Q

Uni/Bimodal Distribution

Answer

A

Pattern of distribution with one peak or mode (unimodal) or two (bimodal). Several peaks are a common sign of several distinct groups smushed into one sample.

Question 23

Q

Quartiles

Answer

A

Values that separate any ordered distribution into quarters. The lower quartile (Q1) sits at 25%, the median at 50%, and the upper quartile (Q3) at 75%: Q1-Med-Q3.
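Python's `statistics.quantiles` returns the three cut points Q1, median and Q3 when asked for four groups. The data below are made up, and the function's default "exclusive" method is only one of several conventions, so hand calculations from a textbook may differ slightly on other data sets.

```python
import statistics

values = list(range(1, 12))  # made-up data: 1, 2, ..., 11

# n=4 asks for the cut points that split the data into four equal parts.
q1, median, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1  # interquartile range: the spread of the middle half

print(q1, median, q3, iqr)
```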

Question 24

Q

Quantitative

Answer

A

Distinctions in variables that are quantifiable. Ex: Running speed of predators.

Question 25

Q

Categorical

Answer

A

Distinctions in variables by type or category. Ex: Species of predator.

Question 26

Q

Cumulative Frequency

Answer

A

Type of graph where cumulative or total frequency forms the y-axis. Easier to find quartiles, median and such.

Question 27

Q

Bar Graph

Answer

A

Histogram look-a-like using categorical data.

Question 28

Q

Rescaling

Answer

A

Multiplying every value in a data set by the same non-zero number d. The center becomes old center * d, and the spread is multiplied by |d| (shrink or stretch). Shape constant.

Question 29

Q

Recentering

Answer

A

Adding a constant c to all values. Median and mean grow by c. Shape and spread constant.
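The effect of rescaling and recentering on center and spread can be seen on a small data set. The values and the constants `d` and `c` below are assumptions chosen for illustration.

```python
import statistics

values = [1, 2, 3, 4, 10]  # made-up data
d = 3  # rescaling factor
c = 5  # recentering constant

rescaled = [x * d for x in values]
recentered = [x + c for x in values]

# Rescaling multiplies both the center and the spread by d.
print(statistics.mean(rescaled), statistics.pstdev(rescaled))

# Recentering shifts the center by c and leaves the spread alone.
print(statistics.mean(recentered), statistics.pstdev(recentered))
```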

Question 30

Q

Sensitivity to Outliers

Answer

A

Vulnerable summary statistics are highly affected, e.g. mean and standard deviation. Median, quartiles and IQR are resilient.
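Adding a single extreme value to a small data set shows the difference between vulnerable and resilient statistics. The numbers below are made up for illustration.

```python
import statistics

values = [10, 12, 13, 15, 16]   # made-up data
with_outlier = values + [200]   # same data plus one extreme value

mean_shift = statistics.mean(with_outlier) - statistics.mean(values)
median_shift = statistics.median(with_outlier) - statistics.median(values)
sd_shift = statistics.pstdev(with_outlier) - statistics.pstdev(values)

print(mean_shift, median_shift, sd_shift)
```

The mean and standard deviation jump by tens of units, while the median moves by only one.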

Question 31

Q

Trend, Strength, Linearity

Answer

A

Used for two-variable quantitative plots like scattergrams. Trend describes the relationship as positive or negative, linearity describes what shape the relationship takes (line, curve), and strength describes how closely the data fit the relationship. Strength may vary across the plot (heteroscedasticity).

Question 32

Q

Lurking Variable

Answer

A

A hidden third variable muddling the relationship between the dependent and independent variables.

Question 33

Q

Midrange

Answer

A

Midpoint between minimum and maximum in a data set.

Question 34

Q

Range

Answer

A

Numerical distance between minimum and maximum

Question 35

Q

Binomial Distribution

Answer

A

X is the number of successes in n independent trials, each with probability p of success. P(X = k successes) = nCk * p^k * (1-p)^(n-k).
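The binomial formula translates directly into code using `math.comb` for nCk. This is a minimal sketch; the function name `binomial_pmf` and the coin-flip numbers are assumptions for illustration.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): chance of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumed example: exactly 5 heads in 10 fair coin flips.
exactly_five = binomial_pmf(5, 10, 0.5)

# The probabilities over every possible k (0 through n) should add to 1.
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

print(exactly_five, total)
```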

Question 36

Q

Binomial Mean (Expected Value)

Answer

A

= np

Question 37

Q

Binomial Standard Deviation

Answer

A

= (np(1-p))^0.5
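The shortcut formulas for the binomial mean (np) and standard deviation can be checked against the long way: weighting every possible outcome by its probability. The values of `n` and `p` below are assumptions picked for the example.

```python
import math

n, p = 20, 0.3  # assumed example values

def binomial_pmf(k):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Long way: expected value and variance straight from the distribution.
long_mean = sum(k * binomial_pmf(k) for k in range(n + 1))
long_variance = sum((k - long_mean) ** 2 * binomial_pmf(k) for k in range(n + 1))
long_sd = math.sqrt(long_variance)

# Shortcut formulas.
shortcut_mean = n * p
shortcut_sd = (n * p * (1 - p)) ** 0.5

print(long_mean, shortcut_mean)
print(long_sd, shortcut_sd)
```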

Question 38

Q

Geometric Distribution

Answer

A

X is the number of trials until the first success. P(X = k) = (1-p)^(k-1) * p.

Question 39

Q

Geometric Mean (Expected Value)

Answer

A

= 1/p. The expected number of trials until the nth success is mean * n.

Question 40

Q

Geometric Standard Deviation

Answer

A

= (1-p)^0.5 / p
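The geometric formulas can be checked numerically by summing over a long run of possible trial counts. In this sketch, `p` and the cutoff of 2000 trials are assumptions; the cutoff only needs to be large enough that the leftover probability is negligible.

```python
import math

p = 0.25  # assumed chance of success on each trial

def geometric_pmf(k):
    """P(X = k): first success happens on trial k (k - 1 failures, then a success)."""
    return (1 - p) ** (k - 1) * p

trial_counts = range(1, 2001)  # truncated sum; the tail beyond this is negligible

long_mean = sum(k * geometric_pmf(k) for k in trial_counts)
long_variance = sum((k - long_mean) ** 2 * geometric_pmf(k) for k in trial_counts)
long_sd = math.sqrt(long_variance)

shortcut_mean = 1 / p
shortcut_sd = (1 - p) ** 0.5 / p

print(long_mean, shortcut_mean)
print(long_sd, shortcut_sd)
```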

Question 41

Q

Confidence Interval

Answer

A

A P% confidence interval consists of those population proportions p for which the sample proportion p hat is reasonably likely. = p hat + or - margin of error.

Question 42

Q

Rate of Capture

Answer

A

For 95%, about 95 out of 100 intervals calculated this way should contain the true proportion of the population.

Question 43

Q

Margin of Error

Answer

A

= Z score of the desired confidence level * (pq/n)^0.5, where q = 1 - p.
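Putting the pieces together, a confidence interval for a proportion can be sketched as follows. The sample proportion and sample size are made-up numbers, and the sample proportion p hat stands in for p in the formula, as is usual when the true p is unknown.

```python
import math
from statistics import NormalDist

p_hat = 0.56       # hypothetical sample proportion
n = 400            # hypothetical sample size
confidence = 0.95  # desired confidence level

# Z score that leaves (1 - confidence) / 2 in each tail of the normal curve.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

q_hat = 1 - p_hat
margin_of_error = z * math.sqrt(p_hat * q_hat / n)

lower = p_hat - margin_of_error
upper = p_hat + margin_of_error

print(round(z, 3), round(margin_of_error, 4), round(lower, 4), round(upper, 4))
```

A larger sample shrinks the margin of error, while a higher confidence level widens it.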

Question 44

Q