Statistics Flashcards

1
Q

Sample of convenience

A

A collection of individuals that happen to be available at the time

2
Q

Sampling error

A

The chance difference between an estimate and the population parameter being estimated

3
Q

Bias

A

A systematic discrepancy (tending in a certain direction) between an estimate and the true population characteristic

4
Q

Error

A

A random difference (not tending in any direction) between an estimate and the true population characteristic

5
Q

(Larger, normal, small) samples on average will have smaller sampling error

A

Larger

6
Q

Increase the number of individuals in your sample

A

Decrease sampling error

7
Q

Ensure random sampling

A

Reduce sampling bias

8
Q

Variance

A

Average squared deviation from the mean
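
As a quick check of this definition, here is a minimal Python sketch using made-up numbers. It assumes the usual distinction, not stated on the card, between dividing by n for a whole population and by n - 1 when estimating from a sample.

```python
import statistics

# Made-up measurements, for illustration only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = sum(data) / len(data)
squared_deviations = [(x - mean) ** 2 for x in data]

# Population variance: the average squared deviation (divide by n).
population_variance = sum(squared_deviations) / len(data)

# Sample variance: divide by n - 1 instead, the usual estimator from a sample.
sample_variance = sum(squared_deviations) / (len(data) - 1)

# The standard library should agree with the hand calculation.
print(population_variance, statistics.pvariance(data))
print(sample_variance, statistics.variance(data))
```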

9
Q

Coefficient of variation

A

Expresses how big the standard deviation is in relation to the mean
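
One way to see why this ratio is useful is that it has no units. The sketch below is an illustration with made-up heights; the helper name `coefficient_of_variation` is hypothetical, and it assumes the sample standard deviation is the one used.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (multiply by 100 for a percentage)."""
    return statistics.stdev(values) / statistics.mean(values)

# Made-up heights, recorded in two different units.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

# The standard deviations differ by a factor of 100, but the ratios agree.
print(round(cv_cm, 4), round(cv_m, 4))
```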

10
Q

Variation in sample means decreases with ____

A

Increased sample size

11
Q

Standard error

A

The standard deviation of a sampling distribution (predicts the sampling error of the estimate)

The standard error of an estimate of a mean is the standard deviation of the distribution of sample means.
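
The idea can be checked by simulation. The sketch below assumes a hypothetical normal population (mean 50, SD 10) and compares the standard deviation of many simulated sample means with the standard result SD / sqrt(n), a formula the card does not state.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 50.0, 10.0  # hypothetical population, chosen for illustration

def sd_of_sample_means(n, replicates=2000):
    """Draw many random samples of size n and return the SD of their means."""
    means = []
    for _ in range(replicates):
        sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
        means.append(sum(sample) / n)
    return statistics.stdev(means)

# Simulated standard error next to the predicted value for several sample sizes.
results = {}
for n in (4, 25, 100):
    results[n] = sd_of_sample_means(n)
    print(n, round(results[n], 2), round(POP_SD / math.sqrt(n), 2))
```

The simulated values should shrink as n grows, which is the same point as the card on variation in sample means.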

12
Q

95% Confidence Interval

A

Provides a plausible range for the parameter (95% of all 95% confidence intervals calculated from random samples will include the true population parameter, e.g. the population mean)
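
The "95% of all intervals" reading can be illustrated with a small simulation. This is a sketch under simplifying assumptions: a hypothetical normal population whose standard deviation is treated as known, so the interval is mean ± z × SD / sqrt(n) rather than the t-based interval used in practice.

```python
import math
import random
from statistics import NormalDist

random.seed(2)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD = 10.0, 3.0   # hypothetical population; SD assumed known
N, REPLICATES = 20, 2000
z = NormalDist().inv_cdf(0.975)  # about 1.96

covered = 0
for _ in range(REPLICATES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_mean = sum(sample) / N
    half_width = z * POP_SD / math.sqrt(N)
    # Count the intervals that contain the true population mean.
    if sample_mean - half_width <= POP_MEAN <= sample_mean + half_width:
        covered += 1

coverage = covered / REPLICATES
print(coverage)  # expected to land close to 0.95
```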

13
Q

Pseudoreplication

A

Error that occurs when individual measurements are not independent but are treated as though they are

14
Q

Test statistic

A

A number calculated to represent the match between a set of data and the null hypothesis

15
Q

P-value

A

The probability of getting the data or something more unusual if the null hypothesis were true
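
A worked example may help. The sketch below computes an exact two-sided P-value for a made-up result (14 successes in 18 trials) under the null hypothesis that the success probability is 0.5. The function name is hypothetical, and doubling one tail is valid here only because this null distribution is symmetric.

```python
from math import comb

def binomial_p_value_two_sided(successes, n):
    """Two-sided P-value for H0: p = 0.5, found by doubling the more extreme tail."""
    k = max(successes, n - successes)
    # Probability of k or more successes out of n when p = 0.5.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

p_value = binomial_p_value_two_sided(14, 18)
print(round(p_value, 4))  # 0.0309
```

Here "the data or something more unusual" means 14 or more successes, or 4 or fewer.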

16
Q

Type I Error

A
  • Rejecting a true null hypothesis
  • Pr[Type I Error] = α
  • Does not depend on sample size
17
Q

Type II Error

A
  • Not rejecting a false null hypothesis
  • Pr[Type II Error] = β
  • β decreases with larger sample sizes
  • The smaller the β, the more power a test has
18
Q

Power

A

The ability of a test to reject a false null hypothesis (Power = 1 - β)
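
To make the link between sample size, Type II error and power concrete, here is a minimal sketch for one simple case: a one-sided z-test of H0: mean = 0 with a known standard deviation. The test, the effect size and the function name are assumptions chosen for illustration, not part of the cards.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_one_sided_z(effect, sd, n, alpha=0.05):
    """Power of a one-sided z-test of H0: mean = 0 when the true mean is `effect`."""
    critical_z = STD_NORMAL.inv_cdf(1 - alpha)
    shift = effect * sqrt(n) / sd  # true mean measured in standard-error units
    return 1 - STD_NORMAL.cdf(critical_z - shift)

# Power rises (and the Type II error rate falls) as the sample size grows.
for n in (10, 40, 160):
    power = power_one_sided_z(effect=0.5, sd=2.0, n=n)
    print(n, round(power, 3), round(1 - power, 3))  # n, power, Type II error rate

# With no true effect, the rejection rate stays at alpha whatever the sample size.
print(round(power_one_sided_z(effect=0.0, sd=2.0, n=10), 3))
print(round(power_one_sided_z(effect=0.0, sd=2.0, n=160), 3))
```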

19
Q

Poisson distribution

A

Describes the probability that a certain number of events occur in a block of time or space, when those events happen independently of each other and occur with equal probability at every point in space/time
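
For reference, the standard Poisson probability of observing k events when the mean count per block is mu is e^(-mu) × mu^k / k!. The card does not give this formula; the sketch below simply evaluates it for an arbitrary mean of 2 events per block.

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """Probability of exactly k events in a block when the mean count is mu."""
    return exp(-mu) * mu ** k / factorial(k)

mu = 2.0  # arbitrary mean number of events per block, for illustration
for k in range(5):
    print(k, round(poisson_pmf(k, mu), 4))
```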

20
Q

Central limit theorem

A

The sum or mean of a large number of measurements randomly sampled from any population is approximately normally distributed

21
Q

Null distribution for a test statistic

A

The probability distribution of alternative outcomes when a random sample is taken from a hypothetical population in which the null hypothesis is true

22
Q

Paired design

A
  • Data from two groups are paired
  • Each member of a pair shares much in common with the other, except for the tested categorical variable
  • Accounts for extraneous variation
  • Mean of the differences
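
A minimal sketch of the "mean of the differences" idea, using made-up before/after measurements on the same five individuals. It reduces the paired data to one difference per pair and computes the usual paired t statistic, which is an assumption about how the card's last bullet would be applied.

```python
import math
import statistics

# Made-up paired measurements: the same five individuals before and after treatment.
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [14.0, 18.0, 12.0, 17.0, 15.0]

# A paired analysis works on one difference per pair, not on the two groups.
differences = [a - b for a, b in zip(after, before)]

mean_difference = statistics.mean(differences)
standard_error = statistics.stdev(differences) / math.sqrt(len(differences))

# t statistic for H0: the mean difference is zero (n - 1 degrees of freedom).
t_statistic = mean_difference / standard_error
print(round(mean_difference, 2), round(t_statistic, 2))
```
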
23
Q

Transformations require:

A

1) Same transformation applied to each individual/group
2) One-to-one correspondence with original value (no ambiguity)
3) Monotonic (order stays the same)

24
Q

Goals of experiments

A

1) Eliminate bias
2) Reduce sampling error

25
Q

Features that reduce bias

A

1) Controls
2) Random assignment of treatments (averages out the effects of confounding variables)
3) Blinding/anonymizing
26
Q

How to reduce sampling error

A

Increase the signal-to-noise ratio: lower the "noise" by increasing sample size and reducing variation within groups (keeping all other factors as equal as possible)
27
Q

Design features to reduce sampling error

A

1) Replication: carry out the study on multiple independent objects
2) Balance: nearly equal sample sizes in each treatment
3) Blocking: group experimental units and apply different treatments within each group (accounts for extraneous variables)
4) Extreme treatments: use stronger treatments, which make an effect easier to detect
28
Q

Matching

A

Pair each individual in the treatment group with a control individual that has similar values for confounding variables (reduces bias by limiting confounding, and reduces sampling error in a way analogous to blocking)
29
Q

r^2

A

Describes the proportion of variation in one variable that can be predicted from the other variable (the proportion of variance in Y that can be predicted by the regression line)
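
The two descriptions on this card can be checked against each other. The sketch below uses made-up data, squares the correlation coefficient, and separately computes 1 - (residual sum of squares) / (total sum of squares) for the least-squares line. For simple linear regression the two should match.

```python
import math

# Made-up data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)  # total variation in Y
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

# Correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Least-squares regression line and the variation it leaves unexplained.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
explained = 1 - ss_residual / syy

print(round(r ** 2, 4), round(explained, 4))
```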
30
Q

Attenuation

A

The estimated correlation will tend to be closer to zero if X or Y is measured with error
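
Attenuation is easy to see in a simulation. In the sketch below all numbers are made up: Y depends on a true X, and a second copy of X has independent measurement error added. With these particular variances the expected correlations are about 0.71 for the true X and about 0.5 for the noisy one.

```python
import math
import random

random.seed(3)  # fixed seed so the sketch is repeatable

def pearson_r(x, y):
    """Pearson correlation coefficient, computed from sums of squares."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

n = 5000
true_x = [random.gauss(0, 1) for _ in range(n)]
y = [xi + random.gauss(0, 1) for xi in true_x]

# Same X values, but recorded with independent measurement error.
measured_x = [xi + random.gauss(0, 1) for xi in true_x]

r_true = pearson_r(true_x, y)
r_measured = pearson_r(measured_x, y)
print(round(r_true, 2), round(r_measured, 2))
```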