Stats Vocab List Flashcards

1
Q

Sensitivity

A

A test’s ability to correctly identify those who have the disease as positive, i.e. its true positive rate. (Positive Given Disease Present)

2
Q

Specificity

A

A test’s ability to react exclusively to the agent, not being misled by alternative cases, so that those without the disease test negative: its true negative rate. (Negative Given Disease Absent)

3
Q

False Positive

A

A test’s chance of overshooting, detecting disease that is not present. (Positive Given Disease Absent)

4
Q

False Negative

A

A test’s chance of undershooting, missing disease that is present. (Negative Given Disease Present)
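The four test rates on the cards so far can be read off one table of counts. A minimal sketch, assuming made-up counts from a hypothetical screening study (none of these numbers come from the cards):

```python
# Hypothetical counts from a made-up screening study (not from the flashcards).
true_pos = 90    # disease present, test positive
false_neg = 10   # disease present, test negative
false_pos = 30   # disease absent, test positive
true_neg = 870   # disease absent, test negative

diseased = true_pos + false_neg   # everyone who has the disease
healthy = false_pos + true_neg    # everyone who does not

sensitivity = true_pos / diseased           # P(Positive given Disease Present)
specificity = true_neg / healthy            # P(Negative given Disease Absent)
false_positive_rate = false_pos / healthy   # P(Positive given Disease Absent)
false_negative_rate = false_neg / diseased  # P(Negative given Disease Present)

print(f"Sensitivity:         {sensitivity:.3f}")
print(f"Specificity:         {specificity:.3f}")
print(f"False positive rate: {false_positive_rate:.3f}")
print(f"False negative rate: {false_negative_rate:.3f}")
```

Note that sensitivity and the false negative rate share a denominator (those with the disease), so they add to 1; the same holds for specificity and the false positive rate.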

5
Q

Disjoint

A

Mutually exclusive outcomes: they cannot both happen at once. Ex: Getting a Head and a Tail on the same coin flip.

6
Q

OR

A

In probability, one event or the other, or both (the same as "at least one"). Contrast: exclusive or, which means one or the other but not both.

6
Q

Addition Rule

A

Chance of either event happening (non-disjoint): P(A or B) = P(A) + P(B) - P(A & B).

Disjoint: P(A or B) = P(A) + P(B)
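To check the addition rule on a concrete case, here is a small sketch. The fair six-sided die and the event names are assumptions chosen for illustration, not part of the card.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (an illustrative assumption).
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))

even = {2, 4, 6}
greater_than_three = {4, 5, 6}
odd = {1, 3, 5}

# Non-disjoint events: subtract the overlap so it is not counted twice.
general = prob(even) + prob(greater_than_three) - prob(even & greater_than_three)

# Disjoint events: the overlap is empty, so the probabilities simply add.
disjoint = prob(even) + prob(odd)

# Compare each rule against a direct count of the combined event.
print(general, prob(even | greater_than_three))
print(disjoint, prob(even | odd))
```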

6
Q

Dependent

A

Used in conditional probability to describe when events influence each other. Ex: Someone who plays basketball is more likely to be taller than the average American. P(A & B) = P(A) * P(B given A), or the chance that A and B both happen equals the chance that A happens times the chance that B happens given that A has already happened.
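A quick way to see the multiplication rule for dependent events is drawing two cards without replacement. The sketch below assumes a standard 52-card deck with 4 aces (an example chosen here, not taken from the card) and checks the rule against a brute-force count.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical deck: 4 aces ("A") and 48 other cards ("x").
deck = ["A"] * 4 + ["x"] * 48

# Multiplication rule: P(A & B) = P(A) * P(B given A).
p_first_ace = Fraction(4, 52)               # P(A)
p_second_ace_given_first = Fraction(3, 51)  # P(B given A): one ace is already gone
p_both_rule = p_first_ace * p_second_ace_given_first

# Brute-force check: list every ordered pair of two different cards.
pairs = list(permutations(range(len(deck)), 2))
both_aces = sum(1 for i, j in pairs if deck[i] == "A" and deck[j] == "A")
p_both_counted = Fraction(both_aces, len(pairs))

print(p_both_rule, p_both_counted)
```

The second draw depends on the first because the deck has changed, which is why P(B given A) is 3/51 rather than 4/52.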

7
Q

Complete

A

Every possible outcome is in the sample space - you’ve represented everything.

7
Q

Two-Way Table

A

A type of table where two categorical variables are crossed and each cell holds a frequency, e.g. gender (women/men) against a yes/no question such as liking Pokémon over Digimon.

8
Q

Stem & Leaf Plot

A

Type of quantitative plot where each stem represents the leading digit(s), such as the tens place (possibly split into half-stems), and the remaining digit of each value is listed beside it as a leaf. Like an enumerated dot plot. Best for a single quantitative variable with a small number of values, where seeing the individual values matters, with options for comparing groups and shapes.

8
Q

Histogram

A

Type of graph for a single quantitative variable, with values grouped into bins on one axis and frequency on the other. Big bars. Best for large amounts of data where we care most about the shape of the distribution.

9
Q

Independent

A

Events that do not impact each other. P(A given B) = P(A), so P(A & B) = P(A) * P(B).
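Independence can be checked by counting outcomes. A minimal sketch, assuming two flips of a fair coin (an example chosen here for illustration):

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

def prob(event):
    """Share of outcomes for which event(outcome) is True."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def first_is_heads(outcome):
    return outcome[0] == "H"

def second_is_heads(outcome):
    return outcome[1] == "H"

p_a = prob(first_is_heads)
p_b = prob(second_is_heads)
p_a_and_b = prob(lambda o: first_is_heads(o) and second_is_heads(o))
p_a_given_b = p_a_and_b / p_b  # definition of conditional probability

print(p_a, p_a_given_b)   # equal when A and B are independent
print(p_a_and_b, p_a * p_b)
```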

9
Q

Cases

A

Experimental units: the individuals or objects that data is collected from.

9
Q

Variable

A

A characteristic (quantity or quality) that changes from case to case and is measured or recorded for statistics.

10
Q

Simulation

A

Running a theoretical (chance) model many times and comparing the results to the real data, to estimate how likely the real outcome is.
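As a sketch of what a simulation looks like in practice: suppose the real data were 16 heads in 20 coin flips (a hypothetical result, not from the card). The code below runs a fair-coin model many times and estimates how often chance alone gives a result at least that extreme. The seed and run count are arbitrary choices.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

observed_heads = 16   # hypothetical real result: 16 heads in 20 flips
flips_per_run = 20
runs = 10_000

def heads_in_one_run():
    """Simulate one set of fair coin flips and count the heads."""
    return sum(rng.random() < 0.5 for _ in range(flips_per_run))

# Count how often the chance model produces a result at least as extreme.
as_extreme = sum(heads_in_one_run() >= observed_heads for _ in range(runs))
estimated_probability = as_extreme / runs

print(f"Estimated chance of {observed_heads}+ heads from a fair coin: "
      f"{estimated_probability:.4f}")
```

A small estimated probability suggests the real result would be unusual if the coin were fair.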

11
Q

Skew

A

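Data significantly trailing off to one side of the median. The skew is named for the side of the trail (tail), and it often appears when data pile up against a wall, such as a minimum or maximum possible value.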

12
Q

Normal Distribution

A

Describes a common symmetrical distribution pattern with a single peak where the median, mode, and mean coincide (the mean would be pulled away by skew). Empirical Rule: about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 SDs of the mean.
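The Empirical Rule percentages can be checked with Python's standard library. A minimal sketch, assuming a normal distribution with mean 100 and SD 15 (values picked only for illustration):

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 100, SD 15.
scores = NormalDist(mu=100, sigma=15)

shares = {}
for k in (1, 2, 3):
    lower = scores.mean - k * scores.stdev
    upper = scores.mean + k * scores.stdev
    # Area under the curve between lower and upper.
    shares[k] = scores.cdf(upper) - scores.cdf(lower)
    print(f"Within {k} SD: {shares[k]:.1%}")
```

The same shares come out for any mean and SD, which is why the rule applies to every normal distribution.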

13
Q

Uniform

A

Describes a flat distribution pattern, with no real mode: no value is more likely than any other.

14
Q

Standard Normal Distribution

A

Idealized normal curve with mean, median, and mode all at 0, an SD of 1, and perfect symmetry.

15
Q

Standard Deviation

A

Describes the typical distance of a point from the center (mean), as a measure of spread. For a normal curve, this is the distance to the inflection point from the center.

Calculated as the square root of: the sum of squared differences from the mean for each value (squared first, then added), all over n for a population or n-1 for a sample.
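A worked sketch of the calculation, using a small made-up data set. The standard recipe takes the square root of the averaged squared deviations, dividing by n for a population and n-1 for a sample; the result is compared against Python's `statistics` module.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values for illustration
n = len(data)
mean = sum(data) / n

# Square each difference from the mean first, then add them up.
sum_of_squares = sum((x - mean) ** 2 for x in data)

population_sd = math.sqrt(sum_of_squares / n)        # divide by n
sample_sd = math.sqrt(sum_of_squares / (n - 1))      # divide by n - 1

print(population_sd, statistics.pstdev(data))
print(sample_sd, statistics.stdev(data))
```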

16
Q

Uni/Bimodal Distribution

A

Pattern of distribution described by its number of peaks or modes: unimodal has one, bimodal has two. Several peaks are a common sign of several distinct groups smushed into a sample space.

17
Q

Quartiles

A

Values that separate ordered data into four equal parts. The lower quartile (Q1) is the 25% mark, the median is the 50% mark, and the upper quartile (Q3) is the 75% mark, or Q1-Med-Q3.
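A short sketch of finding quartiles with Python's standard library, on made-up data. Textbooks and software use slightly different quartile conventions, so the choice of `method="inclusive"` here is an assumption, and hand-calculated values may differ a little.

```python
import statistics

data = sorted([7, 1, 5, 3, 9, 2, 8, 4, 6])  # made-up values

# n=4 asks for the three cut points that split the data into four parts.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: spread of the middle half

print("Q1:", q1, "Median:", median, "Q3:", q3, "IQR:", iqr)
```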

18
Q

Quantitative

A

Variables whose values are quantifiable (numerical measurements). Ex: Running speed of predators.

19
Q

Categorical

A

Variables whose values are types or categories. Ex: Species of predator.

20
Q

Cumulative Frequency

A

Type of graph where cumulative (running total) frequency forms the y-axis. Makes it easier to read off quartiles, the median, and other percentiles.

21
Q

Bar Graph

A

Histogram look-alike using categorical data.

22
Q

Rescaling

A

Multiplying every value in a data set by the same non-zero number d - center is old center * d, spread is old spread * |d| (shrink or stretch). Shape stays constant.

23
Q

Recentering

A

Adding a constant c to all values. Median and mean shift by c. Shape and spread stay constant.
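The effect of rescaling and recentering on center and spread can be seen on a tiny made-up data set. The constants c = 10 and d = 3 below are arbitrary choices for illustration.

```python
import statistics

data = [1, 2, 3, 4, 5]  # made-up values
c = 10  # recentering constant (added to every value)
d = 3   # rescaling constant (multiplies every value)

recentered = [x + c for x in data]
rescaled = [x * d for x in data]

def center_and_spread(values):
    """Return (mean, median, population SD) for a list of numbers."""
    return (statistics.mean(values),
            statistics.median(values),
            statistics.pstdev(values))

print("original:  ", center_and_spread(data))
print("recentered:", center_and_spread(recentered))  # center + c, same SD
print("rescaled:  ", center_and_spread(rescaled))    # center * d, SD * |d|
```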

24
Q

Sensitivity to Outliers

A

Vulnerable summary statistics are highly affected by outliers - i.e. mean, standard deviation. Median, quartiles and IQR are resilient.
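To see the difference, the sketch below swaps one value in a made-up data set for an extreme outlier and recomputes the summaries. The data and the `method="inclusive"` quartile convention are assumptions for illustration.

```python
import statistics

clean = [10, 12, 13, 14, 15, 16, 18]   # made-up values
with_outlier = clean[:-1] + [180]      # largest value replaced by an outlier

def summary(values):
    """Mean, sample SD, median, and IQR for a list of numbers."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "median": med,
        "iqr": q3 - q1,
    }

before = summary(clean)
after = summary(with_outlier)

for name in before:
    print(f"{name:>6}: {before[name]:.2f} -> {after[name]:.2f}")
```

The mean and SD jump, while the median and IQR do not move in this example.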

25
Q

Trend, Strength, Linearity

A

Used for two-variable quantitative plots like scatterplots. Trend describes the relationship as positive or negative, linearity describes what shape the relationship takes (line, curve), and strength describes how closely the data follow the relationship. Strength may vary across the plot - heteroscedasticity.

26
Q

Lurking Variable

A

A hidden third variable that muddles the relationship between the dependent and independent variables.

27
Q

Midrange

A

Midpoint between minimum and maximum in a data set.

28
Q

Range

A

Numerical distance between minimum and maximum

29
Q

Binomial Distribution

A

X counts the number of successes in n independent trials, each with probability p of success. P(X = k successes) = nCk * p^k * (1-p)^(n-k).

30
Q

Binomial Mean (Expected Value)

A

=np.

31
Q

Binomial Standard Deviation

A

= (np(1-p))^0.5
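The binomial cards can be checked against each other numerically. This sketch assumes n = 10 and p = 0.3 (arbitrary example values), builds the probabilities from the standard formula nCk * p^k * (1-p)^(n-k), and compares the mean and SD computed from them with np and (np(1-p))^0.5.

```python
import math

n, p = 10, 0.3  # example values chosen for illustration

def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and SD computed directly from the distribution.
mean_from_pmf = sum(k * pk for k, pk in enumerate(pmf))
variance_from_pmf = sum((k - mean_from_pmf) ** 2 * pk for k, pk in enumerate(pmf))
sd_from_pmf = math.sqrt(variance_from_pmf)

# Shortcut formulas from the cards.
mean_formula = n * p
sd_formula = math.sqrt(n * p * (1 - p))

print(sum(pmf))                    # probabilities should total 1
print(mean_from_pmf, mean_formula)
print(sd_from_pmf, sd_formula)
```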

32
Q

Geometric Distribution

A

X counts the number of trials up to and including the first success. P(X = k) = (1-p)^(k-1) * p, where k is the trial on which the first success occurs.

33
Q

Geometric Mean (Expected Value)

A

= 1/p. Expected trials until the nth success is mean * n = n/p.

34
Q

Geometric Standard Deviation

A

= (1-p)^0.5 / p
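The geometric cards can be checked the same way. This sketch assumes p = 0.25 (an arbitrary example value), uses the standard formula (1-p)^(k-1) * p for the first success landing on trial k, and cuts the infinite sum off at 2000 trials, by which point the leftover probability is negligible.

```python
import math

p = 0.25          # example success probability chosen for illustration
max_trials = 2000  # cutoff for the infinite sum; the remaining tail is negligible

def geometric_pmf(k, p):
    """P(X = k): first success happens on trial k (k = 1, 2, 3, ...)."""
    return (1 - p) ** (k - 1) * p

ks = range(1, max_trials + 1)
total = sum(geometric_pmf(k, p) for k in ks)
mean_from_pmf = sum(k * geometric_pmf(k, p) for k in ks)
variance_from_pmf = sum((k - mean_from_pmf) ** 2 * geometric_pmf(k, p) for k in ks)
sd_from_pmf = math.sqrt(variance_from_pmf)

# Shortcut formulas from the cards.
mean_formula = 1 / p
sd_formula = math.sqrt(1 - p) / p

print(total)
print(mean_from_pmf, mean_formula)
print(sd_from_pmf, sd_formula)
```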

35
Q

Confidence Interval

A

A P% confidence interval consists of those population proportions p for which the observed sample proportion p hat is reasonably likely. = p hat + or - margin of error.

36
Q

Rate of Capture

A

For 95% confidence, about 95 out of 100 intervals calculated this way from repeated samples should contain the true proportion of the population.
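The capture rate can be estimated by simulation. A rough sketch, assuming a true proportion of 0.5, samples of size 200, and 2000 repeated samples (all arbitrary choices), using the usual p hat plus or minus z * (p hat * q hat / n)^0.5 interval:

```python
import math
import random
from statistics import NormalDist

rng = random.Random(1)  # fixed seed so the run is repeatable

true_p = 0.5     # assumed true population proportion
n = 200          # sample size
repeats = 2000   # number of simulated samples
z = NormalDist().inv_cdf(0.975)  # z-score for 95% confidence (about 1.96)

def interval_captures_truth():
    """Draw one sample, build its 95% interval, and report whether it contains true_p."""
    successes = sum(rng.random() < true_p for _ in range(n))
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin <= true_p <= p_hat + margin

capture_rate = sum(interval_captures_truth() for _ in range(repeats)) / repeats
print(f"Share of intervals containing the true proportion: {capture_rate:.3f}")
```

The observed rate should land near 0.95, though it will not match exactly because the interval formula is an approximation and the simulation is random.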

37
Q

Margin of Error

A

= z-score for the desired confidence level * (pq/n)^0.5, where q = 1 - p (using p hat in place of p when working from a sample).
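A worked sketch of the margin of error and the resulting confidence interval. The sample (240 successes out of 400) is made up for illustration, and the z-score comes from the standard library rather than a table.

```python
import math
from statistics import NormalDist

# Hypothetical sample: 240 "yes" answers out of 400 people.
successes = 240
n = 400
confidence = 0.95

p_hat = successes / n
q_hat = 1 - p_hat

# z-score leaving (1 - confidence) / 2 in each tail; about 1.96 for 95%.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin_of_error = z * math.sqrt(p_hat * q_hat / n)
interval = (p_hat - margin_of_error, p_hat + margin_of_error)

print(f"p hat = {p_hat:.3f}")
print(f"margin of error = {margin_of_error:.3f}")
print(f"{confidence:.0%} interval: {interval[0]:.3f} to {interval[1]:.3f}")
```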
