STATS Flashcards

1
Q

Population

A

A population is the entire collection of objects or outcomes about which information is sought

2
Q

Sample

A

A sample is a subset of a population, containing the objects or outcomes that are actually observed

3
Q

Simple random sample

A

A simple random sample (SRS) of size n is a sample chosen by a method in which each collection of n population items is equally likely to comprise the sample
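A minimal sketch of drawing a simple random sample, assuming Python's standard `random` module. The population of 27 student IDs and the helper name `simple_random_sample` are hypothetical and only illustrate the definition above.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n items without replacement so every size-n subset is equally likely."""
    rng = random.Random(seed)  # a seed only makes the illustration repeatable
    return rng.sample(list(population), n)

# Hypothetical population: ID numbers for a class of 27 students.
students = range(1, 28)
sample = simple_random_sample(students, 5, seed=1)
print(sample)
```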

4
Q

n

A

n is the number of values in your sample. If you measure the heights of students in a class of 27, then n = 27

5
Q

Median

A

Median: the middle number of the ordered set of values
– If n is odd, the median is the middle number
* 1,2,4,5,6 median = 4
– If n is even, the median is the average of the two middle values
* 2,3,4,7,9,10 median = (4+7)/2 = 5.5

6
Q

Mode

A

Mode – the most frequently occurring value in a sample
– 2,2,2,3,3,3,3,5,5,6,6 mode = 3

7
Q

Range

A

Range – the difference between the largest and smallest values in a sample
– 23, 33, 35, 55, 70 range is 70-23 = 47

8
Q

Mean

A

The sum of the values divided by the number of values, written x̄.
– 1,3,4,5,7 sum = 20, x̄ = 20/5 = 4
Means are not always meaningful by themselves!
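The worked examples on the median, mode, range, and mean cards can be checked with a short sketch using Python's standard `statistics` module. The helper `value_range` is a name introduced here for illustration.

```python
import statistics

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

median_odd = statistics.median([1, 2, 4, 5, 6])          # middle value
median_even = statistics.median([2, 3, 4, 7, 9, 10])     # average of 4 and 7
mode = statistics.mode([2, 2, 2, 3, 3, 3, 3, 5, 5, 6, 6])
rng = value_range([23, 33, 35, 55, 70])
mean = statistics.mean([1, 3, 4, 5, 7])                  # 20 / 5

print(median_odd, median_even, mode, rng, mean)
```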

9
Q

Variance

A

A measure of how far a set of numbers is spread out from its mean: the average of the squared deviations from the mean

10
Q

Standard deviation

A

the square root of the variance

11
Q

discrete data

A

A count of whole events, objects, or persons. For example, the number of people with a certain illness is a discrete quantity

12
Q

Continuous data

A

The measure of a quantity such as length, volume, or time, which can occur at any value. For example, the concentration of glucose in the blood is a continuous quantity. Even if the instrument you are using rounds off values to whole numbers, these quantities are still continuous.

13
Q

Standard deviation

A

Expresses the degree to which individual results tend to vary about the mean value
– SD is the square root of the variance of the sample
– SD measures precision
– SD is used to set the confidence limits that determine whether a control result is acceptable

14
Q

Standard deviation formula

A
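
A common form of the sample standard deviation is sketched below. It assumes the n - 1 divisor, which is consistent with the glucose variance figures worked later in this deck. Here x_i are the sample values, x̄ is the sample mean, and n is the sample size.

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```
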
16
Q

Range of standard deviation
Example:
Average: 90 mg/dL
Normal range?

A

Normal range = population average +/- 2 SD
Example: glucose
Average: 90 mg/dL
1 SD = 10 mg/dL
90 + 2 SD = 110 mg/dL
90 - 2 SD = 70 mg/dL

Normal range = 70-110 mg/dL
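
The glucose arithmetic above can be written as a small sketch. The function name `reference_range` and its parameter `k` (the number of SDs on each side) are assumptions made for illustration.

```python
def reference_range(mean, sd, k=2):
    """Return (low, high) limits as mean -/+ k standard deviations."""
    return mean - k * sd, mean + k * sd

# Values from the card: average 90 mg/dL, 1 SD = 10 mg/dL.
low, high = reference_range(90, 10)
print(f"normal range = {low}-{high} mg/dL")
```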

17
Q

Exceptions to Range

A

When any detectable value is abnormal (there is no normal range)
- drug screens, disease markers, morphology
When the test is qualitative
- HIV, throat culture, genetic testing
When monitoring response to medication
- refer to the therapeutic range

18
Q

Levey-Jennings chart

A

– Graphically displays the assay (QC) values of replicated controls vs. time or consecutive runs
– Confidence limits are calculated from the mean and SD. It is customary to use +/- 2 SD as the confidence limits (approximately the 95% confidence limits).
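
As a rough sketch of how +/- 2 SD control limits might be applied, the code below derives limits from baseline control runs and flags later runs that fall outside them. All control values and function names are hypothetical, and real QC programs typically apply additional rules beyond this single check.

```python
import statistics

def control_limits(baseline, k=2):
    """Limits at mean -/+ k SD, computed from baseline control results."""
    mean = statistics.mean(baseline)
    sd = statistics.stdev(baseline)  # sample SD (n - 1 divisor)
    return mean - k * sd, mean + k * sd

def out_of_control(results, limits):
    """Return (run number, value) pairs that fall outside the limits."""
    low, high = limits
    return [(run, v) for run, v in enumerate(results, start=1)
            if not low <= v <= high]

# Hypothetical control values, for illustration only.
baseline = [98, 101, 100, 99, 102, 100, 97, 103, 100, 100]
limits = control_limits(baseline)
new_runs = [99, 101, 106, 100, 95]
flags = out_of_control(new_runs, limits)
print(limits, flags)
```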

19
Q

Levey-Jennings chart

What are quality control charts used for?

A

Quality control charts are used to record the results of measurements on control samples, to determine if there are systematic or random errors in the method being used.

20
Q

Accuracy

A

How close a value comes to the true value; established by calibration

21
Q

Precision

A

The reproducibility of a value
- evaluated with QC materials, which show the degree of fluctuation in the measurements

22
Q

To be reliable

A

A method must be both accurate and precise

23
Q

Measures of precision

A

– Variance
– Standard Deviation
– Coefficient of Variation
– F-Test

24
Q

Measures of accuracy

A

– T-Test
– Linear Regression Analysis

25
Sensitivity
The probability that a test correctly identifies an actual positive = # true positives / (# true positives + # false negatives)
26
Specificity
The probability that a test correctly identifies an actual negative = # true negatives / (# true negatives + # false positives)
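The sensitivity and specificity ratios can be sketched as two small functions. The counts below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that the test calls positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of actual negatives that the test calls negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts: 100 diseased and 200 healthy patients.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=160, false_pos=40)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```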
27
Sampling errors
One of the major difficulties in obtaining reliable results involves the sample collection procedure
– Time of day the sample is obtained
– The patient's position and state of physical activity
– Storage condition and aging of sample
28
Procedural errors
– Aging of chemicals/reagents
– Personal bias (limited experience)
– Laboratory bias (because of variations in standards, reagents, environment, methods, and equipment)
– Experimental error (resulting from change in method, instruments, or personnel)
30
Outliers
– Outliers are data points that are much larger or much smaller than the rest of the sample points
* Outliers should not be deleted without considerable thought and documentation.
31
Evaluating QC data - Coefficient of Variation (CV)
Allows comparison of different test methods, and of data from one laboratory with data from another, by expressing the SD of each set as a percentage of the mean
33
CV formula
* CV is expressed as a percentage (%).
* CV = (SD/x̄)*100
– SD = SD of a procedure
– x̄ = mean
Acceptable CV: 5% or less (most labs use 3%)
34
Using CV- Index of variability
* Procedures with increasing CV values demonstrate decreasing precision, since a higher CV reflects greater variability among the replicate samples
35
Example of CV index of variability
– Procedure A has an SD of 3 mg/dL and a mean value of 100.
– Procedure B has an SD of 5 mg/dL and a mean of 250.
Which procedure would you recommend, and why?
– CV Procedure A = (3/100)*100 = 3%
– CV Procedure B = (5/250)*100 = 2%
– Procedure B would be recommended, based on the lower CV value, indicating greater precision of the test with less variability.
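The comparison of Procedure A and Procedure B can be reproduced with a minimal sketch. The function name `cv_percent` is an assumption made for illustration.

```python
def cv_percent(sd, mean):
    """Coefficient of variation: SD expressed as a percentage of the mean."""
    return 100 * sd / mean

cv_a = cv_percent(sd=3, mean=100)   # Procedure A
cv_b = cv_percent(sd=5, mean=250)   # Procedure B

# The lower CV indicates the more precise procedure.
recommended = "A" if cv_a < cv_b else "B"
print(f"CV A = {cv_a}%, CV B = {cv_b}%, recommend Procedure {recommended}")
```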
36
Evaluation of Peer QC data - SDI
Standard Deviation Index: useful for evaluating a lab's performance against that of other labs (its peer group)
37
Standard Deviation Index formula
SDI = (lab mean - peer group mean) / peer group standard deviation
Acceptable results = -1.0 to +1.0 SDI
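A short sketch of the SDI calculation and the -1.0 to +1.0 acceptance check follows. The lab and peer-group figures are hypothetical, and the function names are introduced here for illustration.

```python
def sdi(lab_mean, peer_mean, peer_sd):
    """Standard Deviation Index: lab's distance from the peer mean, in peer SDs."""
    return (lab_mean - peer_mean) / peer_sd

def sdi_acceptable(value, limit=1.0):
    """Acceptable when the SDI lies between -limit and +limit."""
    return -limit <= value <= limit

# Hypothetical figures: lab mean 102, peer group mean 100, peer group SD 4.
value = sdi(102, 100, 4)
print(value, sdi_acceptable(value))
```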
38
methods comparison stats
* Compare the new method against the old
* Are differences significant?
* Two types:
– Graphs
– T-Test
39
Scatterplots
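A graph of paired values plotted as points, for example results from a new method against results from the old method, showing how closely the two sets agree. (A graph that gives a rough impression of the shape of a single sample, indicating where the sample values are concentrated and where the gaps are, is a dotplot.)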
40
Histograms
a graphical display that gives an idea of sample “shape”, indicating regions where sample points are concentrated and regions where they are sparse
41
Symmetry and skewness
* A histogram is symmetric if its right half mirrors its left half; the mean and median are then approximately equal.
* A histogram with only one peak is termed “unimodal”; one with two peaks is “bimodal”
42
Symmetry and skewness: histograms that are not symmetric are what?
* Histograms that are not symmetric are skewed
43
Right or positively skewed
* A histogram with a long right-hand tail is said to be skewed to the right or positively skewed
– The mean is greater than the median
44
Left or negatively skewed
* A histogram with a long left-hand tail is said to be skewed to the left or negatively skewed
– The mean is less than the median
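The mean-versus-median comparison on the skewness cards can be turned into a rough check, sketched below. It is only a rule of thumb rather than a formal skewness statistic, and both data sets are hypothetical.

```python
import math
import statistics

def skew_direction(values):
    """Guess the direction of skew by comparing the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if math.isclose(mean, median):
        return "roughly symmetric"
    return "right (positive)" if mean > median else "left (negative)"

# Hypothetical samples, for illustration only.
long_right_tail = [1, 2, 2, 3, 3, 3, 4, 20]    # mean 4.75, median 3
long_left_tail = [1, 17, 18, 18, 19, 19, 20]   # mean 16, median 18

print(skew_direction(long_right_tail))
print(skew_direction(long_left_tail))
```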
45
Student's t-test
* Comparison of data
* Are they “significantly” different?
* Is the difference “statistically significant”?
* What is the possibility that the differences are due to chance?
46
Student's t-test: null hypothesis
(H0) No significant difference between the numbers (differences are due to chance)
47
Alternate Hypothesis
(Ha) There IS a significant difference between the methods (differences are not due to chance alone)
48
* We ran a serum cortisol low control (20 μg/mL) for nine consecutive days on two different instruments:
17, 21, 23, 18, 19, 20, 18, 22, 23
16, 19, 24, 23, 17, 19, 23, 21, 24
Are these two sets of numbers close enough?
* If they are, we ACCEPT the null hypothesis (the differences are due to chance and are not significant): p > 0.05
* If they are not, we REJECT the null hypothesis (the differences are NOT due to chance, which means that these two methods are not equal, the alternative hypothesis): p < 0.05
50
Probability
* The probability level used to judge significance can be different for different analyses
51
Probability: what level is usually used?
* Usually use 95% probability (a cutoff of p = 0.05):
– If p is 5% or greater, the differences can be attributed to pure chance (random variance)
– If p falls below 5%, the differences are no longer considered just random variance
– In other words, if you calculate a t-test and your p < 0.05 (or p < 5%), then you REJECT the null hypothesis (differences ARE statistically significant)
52
Old method for glucose: 100, 112, 125, 111, 77, 89. Mean and variance?
Mean = 102, Variance = 302
53
New method for glucose: 101, 113, 120, 88, 105, 93. Mean and variance?
Mean = 103, Variance = 144
54
Variance
A measure of how far a set of numbers are from their mean.
55
Standard deviation: old method for glucose: 100, 112, 125, 111, 77, 89
Mean = 102, Variance = 302, SD = 17
57
CV: old method for glucose: 100, 112, 125, 111, 77, 89
Mean = 102, Variance = 302, SD = 17, %CV = 16.7%
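The glucose figures on these cards can be recomputed with Python's standard `statistics` module, as sketched below. This assumes the sample (n - 1) variance, which matches the card values. The unrounded results differ slightly from the cards because the cards round the mean and SD before the next step.

```python
import statistics

old_method = [100, 112, 125, 111, 77, 89]
new_method = [101, 113, 120, 88, 105, 93]

def summarize(values):
    """Return mean, sample variance, sample SD, and %CV for a list of results."""
    mean = statistics.mean(values)
    variance = statistics.variance(values)   # n - 1 divisor
    sd = statistics.stdev(values)            # square root of the variance
    return mean, variance, sd, 100 * sd / mean

old_mean, old_var, old_sd, old_cv = summarize(old_method)
new_mean, new_var, new_sd, new_cv = summarize(new_method)
print(f"old: mean={old_mean:.1f} var={old_var:.1f} sd={old_sd:.1f} cv={old_cv:.1f}%")
print(f"new: mean={new_mean:.1f} var={new_var:.1f} sd={new_sd:.1f} cv={new_cv:.1f}%")
```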
58
T-tests
* Old method for glucose: 100, 112, 125, 111, 77, 89
* New method: 101, 113, 120, 88, 105, 93
p = 0.887; p > 0.05, so there is no significant difference
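The card does not say which form of t-test was used. The sketch below assumes a paired, two-sided test on the six old/new results, which appears to reproduce the stated p of about 0.887. To stay within the standard library, the p-value is obtained by numerically integrating the t density with Simpson's rule; the function names are introduced here for illustration.

```python
import math
import statistics

def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for a paired t-test."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_err, n - 1

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 minus the area between -|t| and +|t|."""
    t = abs(t)
    if t == 0:
        return 1.0
    h = t / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

old_method = [100, 112, 125, 111, 77, 89]
new_method = [101, 113, 120, 88, 105, 93]

t_stat, df = paired_t(new_method, old_method)
p_value = two_sided_p(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}")
print("reject H0" if p_value < 0.05 else "no significant difference")
```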
59
Advanced analysis
* Linear regression analysis – accuracy
* More than one sample – analysis of variance (ANOVA)
* Goodness of fit – chi-squared test
* Multivariate analysis