STATS Flashcards

1
Q

Population

A

A population is the entire collection of objects or outcomes about which information is sought

2
Q

Sample

A

A sample is a subset of a population, containing the objects or outcomes that are actually observed

3
Q

Simple random sample

A

A simple random sample (SRS) of size n is a sample chosen by a method in which each collection of n population items is equally likely to comprise the sample
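
A minimal sketch of drawing a simple random sample, assuming a hypothetical population of 27 student ID numbers. The helper name `simple_random_sample` and the seed are illustrative only; `random.sample` draws without replacement, so every collection of n items is equally likely.

```python
import random

# Hypothetical population: ID numbers for a class of 27 students.
population = list(range(1, 28))

def simple_random_sample(items, n, seed=None):
    """Draw n distinct items; every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(items, n)

sample = simple_random_sample(population, n=5, seed=1)
print(sample)
```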

4
Q

n

A

n is the number of values in your sample. If you measure the heights of students in a class of 27, then n = 27

5
Q

Median

A

Median: the middle number of the ordered set of values
– If n is odd, the median is the middle number
* 1, 2, 4, 5, 6: median = 4
– If n is even, the median is the average of the two middle values
* 2, 3, 4, 7, 9, 10: median = (4 + 7)/2 = 5.5
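
The odd/even rule can be sketched in a few lines of Python. The function below is an illustrative helper, not part of the original cards; it reuses the two example data sets.

```python
def median(values):
    """Middle value of the ordered data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:          # odd n: a single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the middle pair

odd_result = median([1, 2, 4, 5, 6])
even_result = median([2, 3, 4, 7, 9, 10])
print(odd_result)   # 4
print(even_result)  # 5.5
```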

6
Q

Mode

A

Mode – the most frequently occurring value in a sample
– 2,2,2,3,3,3,3,5,5,6,6 mode = 3

7
Q

Range

A

Range – the difference between the largest
and smallest values in a sample
– 23, 33, 35, 55, 70 range is 70-23=47
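
As a quick check of the mode and range examples, a short sketch using Python's standard `statistics` module (the variable names are illustrative):

```python
import statistics

mode_values = [2, 2, 2, 3, 3, 3, 3, 5, 5, 6, 6]
range_values = [23, 33, 35, 55, 70]

mode = statistics.mode(mode_values)                  # most frequent value
value_range = max(range_values) - min(range_values)  # largest minus smallest

print(mode)         # 3
print(value_range)  # 47
```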

8
Q

Mean

A

The sum of the values divided by the number of values, written X̄ (X-bar).
– 1, 3, 4, 5, 7: sum = 20, X̄ = 20/5 = 4
Means are not always meaningful by themselves!
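
One way to see why a mean may not be meaningful by itself is to compare it with the median when one value is extreme. In this sketch the second data set is hypothetical (the card's values with the last one replaced by 107).

```python
import statistics

values = [1, 3, 4, 5, 7]
mean_plain = statistics.mean(values)          # 20 / 5 = 4

# Hypothetical data: same values, but the last one is an extreme result.
with_outlier = [1, 3, 4, 5, 107]
mean_outlier = statistics.mean(with_outlier)      # 120 / 5 = 24
median_outlier = statistics.median(with_outlier)  # still 4

print(mean_plain, mean_outlier, median_outlier)
```

A single extreme value pulls the mean from 4 to 24 while the median stays at 4, so the mean alone gives a misleading picture of the typical value.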

9
Q

Variance

A

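A measure of how far a set of numbers is spread out from its mean: the average squared deviation from the mean (for a sample, the sum of squared deviations divided by n − 1)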

10
Q

Standard deviation

A

the square root of the variance
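
A minimal sketch of how variance and standard deviation relate, assuming the sample form (dividing by n − 1). The data are the values from the Mean card; the function names are illustrative.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_sd(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(sample_variance(values))

values = [1, 3, 4, 5, 7]
variance = sample_variance(values)
sd = sample_sd(values)
print(variance)      # 5.0
print(round(sd, 3))  # 2.236
```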

11
Q

discrete data

A

A count of whole events, objects, or persons. For example, the number of people with a certain illness is a discrete quantity

12
Q

Continuous data

A

The measure of a quantity such as length, volume, or time, which can occur at any value. For example, the concentration of glucose in the blood is a continuous quantity. Even if the instrument you are using rounds off values to whole numbers, these quantities are still continuous.

13
Q

Standard deviation

A

Expresses the degree to which individual data results tend to vary about the mean value
– SD is the square root of the variance of the sample
– SD measures precision
– SD is used to set the confidence limits that determine whether a control result is acceptable

14
Q

Standard deviation formula

A
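
A reconstruction, assuming the card shows the usual sample standard deviation (dividing by n − 1, which agrees with the worked glucose variance later in this deck). Here x_i are the individual values, x̄ is the sample mean, and n is the number of values:

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```
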
16
Q

Range of standard deviation
Example:
Average: 90 mg/dL
Normal range: ?

A

Average for the population ± 2 SD
Example: glucose
Average: 90 mg/dL
1 SD = 10 mg/dL
90 + 2 SD = 110 mg/dL
90 − 2 SD = 70 mg/dL

Normal range = 70-110 mg/dL
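
The same arithmetic as a short sketch; `reference_range` is an illustrative helper name, and the multiplier k = 2 follows the ± 2 SD convention on this card.

```python
def reference_range(mean, sd, k=2):
    """Return (low, high) limits as mean - k*SD and mean + k*SD."""
    return mean - k * sd, mean + k * sd

# Glucose example from the card: average 90 mg/dL, 1 SD = 10 mg/dL.
low, high = reference_range(mean=90, sd=10)
print(f"normal range = {low}-{high} mg/dL")  # normal range = 70-110 mg/dL
```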

17
Q

Exceptions to Range

A

When any detectable value is abnormal (there is no normal range)
- drug screens, disease markers, morphology
When the test is qualitative
- HIV, throat culture, genetic testing
When monitoring response to medication
- refer to therapeutic range

18
Q

Levey-Jennings chart

A

– Graphically displays the assay (QC) values of replicated controls vs. time or consecutive runs
– Confidence limits are calculated from the mean and SD. It is customary to use ± 2 SD as the confidence limits (95% confidence limits).
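
A rough sketch of the ± 2 SD check behind such a chart. The control results are hypothetical, the mean and SD are borrowed from the glucose example (90 and 10 mg/dL), and the function name is illustrative. It only flags points outside the limits and is not a full QC rule set.

```python
def flag_out_of_control(results, mean, sd, k=2):
    """Return (run number, value) pairs that fall outside mean +/- k*SD."""
    lower, upper = mean - k * sd, mean + k * sd
    return [
        (run, value)
        for run, value in enumerate(results, start=1)
        if not lower <= value <= upper
    ]

# Hypothetical control results from seven consecutive runs (mg/dL).
control_results = [92, 85, 101, 112, 88, 67, 95]
flagged = flag_out_of_control(control_results, mean=90, sd=10)
print(flagged)  # [(4, 112), (6, 67)]
```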

19
Q

Levey-Jennings chart

Quality control charts are

A

Quality control charts are used to record the results of measurements on control samples, to determine if there are systematic or random errors in the method being used.

20
Q

Accuracy

A

How close a measured value comes to the true value; established by calibration

21
Q

Precision

A

The reproducibility of a value
– Evaluated by use of QC materials
– Reflects the degree of fluctuation in the measurements

22
Q

To be reliable

A

A method must be both accurate and precise

23
Q

Measures of precision

A

– Variance
– Standard Deviation
– Coefficient of Variation
– F-Test

24
Q

Measures of accuracy

A

– T-Test
– Linear Regression Analysis

25
Q

Sensitivity

A

The probability that the test is positive when the condition is actually present (the true positive rate)
= # true positives / [# true positives + # false negatives]

26
Q

Specificity

A

The probability that the test is negative when the condition is actually absent (the true negative rate)
= # true negatives / [# true negatives + # false positives]
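
The sensitivity and specificity ratios can be sketched as two small functions. The counts below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def sensitivity(true_pos, false_neg):
    """True positives divided by all who actually have the condition."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """True negatives divided by all who actually lack the condition."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts: 100 people with the condition, 100 without.
sens = sensitivity(true_pos=90, false_neg=10)
spec = specificity(true_neg=80, false_pos=20)
print(sens, spec)  # 0.9 0.8
```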

27
Q

Sampling errors

A

One of the major difficulties in obtaining reliable results involves the sample collection procedure
– Time of day the sample is obtained
– The patient’s position, state of physical activity
– Storage condition and aging of sample

28
Q

Procedural errors

A

– Aging of chemicals/reagents
– Personal bias (limited experience)
– Laboratory bias (because of variations in standards, reagents, environment, methods, and equipment)
– Experimental error (resulting from change in method, instruments, or personnel)

30
Q

Outliers

A

– Outliers are data points that are much larger or smaller than the rest of the sample points
* Outliers should not be deleted without considerable thought and documentation.

31
Q

Evaluating QC data - Coefficient of Variation (CV)

A

Allows comparison of different test methods, and of data from one laboratory with data from another, by expressing the SD of each set as a percentage of the mean

33
Q

CV formula

A
  • CV is expressed as a percentage (%).
  • CV = (SD / X̄) × 100
    – SD = SD of a procedure
    – X̄ = mean
  • Acceptable CV: 5% or less (most labs use 3%)
34
Q

Using CV- Index of variability

A
  • Procedures with increasing CV values demonstrate decreased precision, since a higher CV reflects greater variability among the replicate samples
35
Q

Example of CV index of variability

– Procedure A has an SD of 3 mg/dL and a mean value of 100.
– Procedure B has an SD of 5 mg/dL and a mean of 250.

A

– Procedure A has an SD of 3 mg/dL and a mean value of 100.
– Procedure B has an SD of 5 mg/dL and a mean of 250.
Which procedure would you recommend, and why?
– CV Procedure A = (3/100) × 100 = 3%
– CV Procedure B = (5/250) × 100 = 2%
– Procedure B would be recommended, based on the lower CV value, indicating greater precision of the test with less variability.
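
The comparison can be reproduced with a short sketch; `cv_percent` is an illustrative helper name.

```python
def cv_percent(sd, mean):
    """Coefficient of variation: SD expressed as a percentage of the mean."""
    return 100 * sd / mean

cv_a = cv_percent(sd=3, mean=100)  # Procedure A
cv_b = cv_percent(sd=5, mean=250)  # Procedure B

# The lower CV indicates the more precise procedure.
recommended = "A" if cv_a < cv_b else "B"
print(cv_a, cv_b, recommended)  # 3.0 2.0 B
```

Procedure B has the larger SD in absolute terms, but relative to its mean it varies less, which is why the CV is the fairer basis for comparison.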

36
Q

Evaluation of Peer QC data - SDI

A

Standard Deviation Index—Useful to evaluate performance when comparing to another lab’s performance

37
Q

Standard Deviation Index formula

A

SDI = (lab mean − peer group mean) / peer group standard deviation

Acceptable results= -1.0 to +1.0 SDI
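
A minimal sketch of the SDI calculation and the −1.0 to +1.0 acceptability check. The lab and peer-group figures are hypothetical, as are the function names.

```python
def sdi(lab_mean, peer_mean, peer_sd):
    """Standard Deviation Index: distance from the peer mean in peer SDs."""
    return (lab_mean - peer_mean) / peer_sd

def sdi_acceptable(value, limit=1.0):
    """Acceptable when the SDI lies between -limit and +limit."""
    return -limit <= value <= limit

# Hypothetical figures: lab mean 102, peer group mean 100, peer group SD 4.
lab_sdi = sdi(lab_mean=102, peer_mean=100, peer_sd=4)
print(lab_sdi, sdi_acceptable(lab_sdi))  # 0.5 True
```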

38
Q

methods comparison stats

A
  • Compare the new method against old
  • Are differences significant?
  • Two types:
    – Graphs
    – T-Test
39
Q

Dotplots

A

A graph that can be used to give a rough impression of the shape of a sample, giving good indication of where the sample values are concentrated and where gaps are.

40
Q

Histograms

A

a graphical display that gives an idea of sample “shape”, indicating regions where sample points are concentrated and regions where they are sparse

41
Q

Symmetry and skewness

A
  • A histogram is symmetric if its right half mirrors its left half; the mean and median are then approximately equal.
  • A histogram with only one peak is termed “unimodal”; one with two peaks is “bimodal”
42
Q

Symmetry and skewness

Histograms that are not what

A
  • Histograms that are not symmetric are skewed
43
Q

Right or positively skewed

A
  • A histogram with a long right-hand tail is said to be skewed to the right or positively skewed
    – The mean is greater than the median
44
Q

Left or negatively skewed

A

A histogram with a long left-hand tail is said to be skewed to the left or negatively skewed
– The mean is less than the median
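
The mean-versus-median comparison from the skewness cards can be sketched as a rule of thumb. This is not a formal skewness statistic, and the three data sets are hypothetical.

```python
import statistics

def skew_direction(values):
    """Rule of thumb: compare the mean with the median."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "right (positive)"
    if mean < median:
        return "left (negative)"
    return "roughly symmetric"

# Hypothetical samples.
right_tail = [1, 2, 2, 3, 3, 4, 20]       # mean 5, median 3
left_tail = [1, 17, 18, 18, 19, 19, 20]   # mean 16, median 18
balanced = [1, 2, 3, 4, 5]                # mean 3, median 3

print(skew_direction(right_tail))  # right (positive)
print(skew_direction(left_tail))   # left (negative)
print(skew_direction(balanced))    # roughly symmetric
```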

45
Q

Student's t-Test

A
  • Comparison of data
  • Are they “significantly” different?
  • Is the difference “statistically significant”?
  • Possibility that differences are due to chance?
46
Q

Student's t-Test

Null hypothesis

A

(H0) No significant difference between the numbers (differences are due to chance)

47
Q

Alternate Hypothesis

A

(Ha) There IS a significant difference between the methods (differences are not due to chance alone)

48
Q
  • We ran a serum cortisol low control (20 μg/mL) for nine consecutive days on two different instruments
    17, 21, 23, 18, 19, 20, 18, 22, 23
    16, 19, 24, 23, 17, 19, 23, 21, 24
    Are these two sets of numbers close enough?
A
  • If they are, we ACCEPT (fail to reject) the Null Hypothesis: the differences are due to chance and are not significant (p > 0.05)
  • If they are not, we REJECT the Null Hypothesis: the differences are NOT due to chance, which means that these two methods are not equal (alternative hypothesis) (p < 0.05)
50
Q

Probability

A
  • Can be different for different analyses
51
Q

probability

Usually use a what

A
  • Usually use 95% probability (a 5% significance level):
    – If p is 5% or greater, the differences are attributed to pure chance (random variance)
    – If p falls below 5%, the differences are no longer attributed to random variance alone
    – In other words, if you calculate a t-Test value and your p < 0.05 (or p < 5%), then you REJECT the Null Hypothesis (differences ARE statistically significant)
52
Q
  • Old method for glucose:
    100, 112, 125, 111, 77, 89

Mean

variance

A

Mean = 102
Variance = 302

53
Q

New method for glucose

101, 113, 120, 88, 105, 93

A

Mean = 103
Variance = 144

54
Q

Variance

A

A measure of how far a set of numbers is spread out from its mean.

55
Q

Standard deviation
Old method for glucose: 100, 112, 125, 111, 77, 89

A

Mean = 102 Variance = 302 SD = 17

57
Q

CV
Old method for glucose:
100, 112, 125, 111, 77, 89

A

Mean = 102
Variance = 302
SD = 17
%CV = 16.7%
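
A sketch that recomputes these summary statistics with Python's `statistics` module, which uses the sample (n − 1) variance. The unrounded results differ slightly from the card, which appears to round at each step (for example, using a mean of 102 and an SD of 17 gives 16.7%).

```python
import statistics

old_method = [100, 112, 125, 111, 77, 89]
new_method = [101, 113, 120, 88, 105, 93]

def summarize(values):
    """Return mean, sample variance, sample SD, and %CV for a data set."""
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # divides by n - 1
    sd = statistics.stdev(values)
    cv = 100 * sd / mean
    return mean, variance, sd, cv

old_mean, old_var, old_sd, old_cv = summarize(old_method)
new_mean, new_var, new_sd, new_cv = summarize(new_method)

# Old method: mean about 102.3, variance about 301.5, SD about 17.4, CV about 17.0%
print(round(old_mean, 1), round(old_var, 1), round(old_sd, 1), round(old_cv, 1))
# New method: mean about 103.3, variance about 144.3
print(round(new_mean, 1), round(new_var, 1))
```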

58
Q

T-tests
Old method for glucose 100, 112, 125, 111, 77, 89

New method 101, 113, 120, 88, 105, 93

A

p=0.887
p>0.05, so there is no significant difference
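
The card does not say which t-test produced p = 0.887. The sketch below assumes a paired test (each old result matched with the new result in the same position), which appears consistent with that p-value. It computes only the t statistic with the standard library and compares it with 2.571, the two-sided 5% critical value for 5 degrees of freedom from a standard t table.

```python
import math
import statistics

old_method = [100, 112, 125, 111, 77, 89]
new_method = [101, 113, 120, 88, 105, 93]

def paired_t_statistic(a, b):
    """t = mean difference / (SD of differences / sqrt(n)); returns (t, df)."""
    diffs = [y - x for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, divides by n - 1
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1

t_stat, df = paired_t_statistic(old_method, new_method)

T_CRITICAL_DF5 = 2.571  # two-sided, alpha = 0.05, 5 degrees of freedom
significant = abs(t_stat) > T_CRITICAL_DF5

print(round(t_stat, 3), df, significant)  # 0.149 5 False
```

Because |t| is far below the critical value, the null hypothesis is not rejected, in line with the card's conclusion of no significant difference.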

59
Q

Advanced analysis

A
  • Linear Regression analysis - Accuracy
  • More than two samples – Analysis of variance (ANOVA)
  • Goodness of Fit – Chi squared test
  • Multivariate analysis