Mod. 4 Health Data Analytics Flashcards
Organizations must protect records against _______, _______, ________ and _______.
Loss
Defacement
Tampering
Unauthorized use
Effective confidentiality policies should specifically address
- release of information
- removal of medical records
- protection of PHI
What does HIPAA stand for?
Health Insurance Portability and Accountability Act
What are the four goals of electronic health records?
- Guide clinical practice
- Interconnect clinicians
- Personalize care
- Improve population health
What are some groups that may access PHI without written authorization?
- governing body
- senior leadership
- healthcare personnel involved with the patient’s care at the time
- Quality improvement, risk management and utilization management
- Health information management
What are some information security methods?
- Separate portion of records (such as psych)
- Restricted access to computer files
- Adequate back up plan and firewalls
- Requirement of signed forms for release of information
How is data defined?
Abstract representations of things, facts, concepts and instructions that are stored in defined format
How is information defined?
Obtained when data are translated into results and statements that are useful for decision making
What is RA(S)CI?
Responsible, Accountable/Approve, Supportive, Consulted, Informed
When do you use RA(S)CI?
Identify roles and responsibilities during an organizational change process
What role is “responsible” in RA(S)CI and what’s expected of them?
The Doer
Actively participate and contribute to best of ability; person/people working on the activity
What role is “Accountable/Approve” in RA(S)CI and what’s expected of them?
The Buck stops here
Person ultimately accountable for results; position with yes/no authority
What role is “supportive” in RA(S)CI and what’s expected of them?
The Helper
People to support responsible person, not always used; helping out at direction of responsible people
What role is “consulted” in RA(S)CI and what’s expected of them?
In the loop
People with particular expertise to contribute to specific questions, involved prior to decision or action
What role is “informed” in RA(S)CI and what’s expected of them?
Tell me after
People affected by activity/decision and need to be kept informed but do not participate; need to know
What are the benefits of RA(S)CI?
- Determines ownership
- Promotes teamwork by clarifying roles
- Increases efficiency
- Improves communication
- reduces misunderstanding
What does a clinical information system do?
Support direct care processes (lab/radiology, results, etc.)
What does an administrative (non-clinical) information system do?
Aid day-to-day operations (billing, financial, HR)
What does a decision information system do?
Deal with strategic planning functions (cause/effect)
What is a health information system?
A health information system (HIS) refers to a system designed to manage healthcare data
What are some examples of administrative financial information systems?
- Payroll
- Accounts payable
- Patient accounting
What are some examples of administrative human resource information systems?
- employee records
- labor analysis
- turnover
What are some examples of administrative office information systems?
- Word processing
- scheduling
- spreadsheets
What are some uses for decision information systems?
- strategic planning and marketing
- performance evaluation
- clinical pathways
- identifying positive and negative outcomes
- includes risk (count) and severity-adjusted (measurement) data
What is a chart-based (EHR-based) analysis system?
Nursing or medical record analysts review records/EHRs
What is the disadvantage to a chart-based analysis system?
higher cost/smaller sample size
What is a code based analysis system?
Based on retrospective administrative data and uses clinical data spanning entire stay
What are the advantages of code based analysis?
lower cost and larger sample size
What questions should quality professionals ask when evaluating information management systems?
- Does it capture/store and retrieve clinical and financial information from a variety of sources?
- Does it interface with other information systems?
- Does it allow for establishment of trigger/threshold measures?
- Does it allow for critical alerts?
- Does it support accreditation and regulatory requirements?
What are the two types of data?
- Categorical (count)
- Continuous (measured)
What are the two types of categorical data?
nominal and ordinal
What are the two types of continuous data?
interval and ratio
What is nominal categorical data?
- (count, discrete, qualitative) considered attributes data with no quantitative value
- Binary data (two possible values)
What are examples of nominal categorical data?
- Surgical patients (pre-operative/post-operative)
- Patient education (attended video session/did not attend video session)
What is ordinal categorical data?
Characteristics (nominal data) are put into categories and rank ordered. Categories are not arbitrary
What are examples of ordinal categorical data?
Nursing staff rank (nurse level 1, 2, 3)
Education (AA, BS, MS, PhD)
Attitude toward research (Likert scale) (1. Strongly agree…5. Strongly disagree)
What is continuous data?
Continuous or “measured” data are assigned scales that theoretically have no gaps (i.e., variables data)
What is interval continuous data?
The distance between each point is equal and there is no true zero
What is an example of interval continuous data?
Values on a thermometer
What is ratio continuous data?
The distance between each point is equal and there IS a true zero
What is an example of ratio continuous data?
height and weight
Which of the two types of data is least powerful statistically and why does it matter in practical terms?
Categorical
When comparing patient outcomes after process change, fewer data points (and subjects) are needed if data in continuous form are collected
What is an example of why categorical data is less powerful statistically?
Blood pressure: you can either categorize subjects as hypertensive or non-hypertensive, or record the measured levels of systolic and diastolic pressure. The latter is more powerful and allows more flexibility in analysis
Continuous data is usually reported as?
Mean, median, min., max., percentiles
What is Functional Independence Measure (FIM)?
Measures the level of a patient’s disability (a lower score, e.g., 1, means the patient needs more assistance for daily living)
What does it mean to compare data sources?
Examine processes and results against a reference point either internally or externally with competitors and other organizations providing similar services
What does it mean to benchmark data sources?
Examine processes and results that represent best practices for similar activities inside or outside the healthcare industry
What is the goal of benchmarking?
To identify how to improve the outcomes (not to identify the difference between organizations). Enables organizations to set a target or goal for process improvement
Benchmarking involves asking the right questions. What are some questions you should ask?
- What is the best practice?
- What are we doing? How are we doing it?
- How well are we doing it?
- What are the measurement results?
- Why are we looking for improvement?
What are the first two steps in interpreting data and using information?
Plan and Organize
- Anticipate barriers, identify responsibilities and lay groundwork for multidisciplinary collaboration
- develop data dictionary
After planning and organization, what are the next steps in interpreting data and using information?
Pilot data collection; verify and correct
- Begin limited data collection as pilot
- identify data limitations and errors
- modify data collection plan as needed
- collect data
After piloting and collecting the data, what is the next step in interpreting and using data?
Analyze and Present findings
- look at trends
- how is the data likely to be interpreted?
- who should receive the data?
- for what purpose?
After analyzing and presenting findings, what are the next steps in interpreting and using data?
Study and Develop Recommendations
- Perform variation analysis
- review additional data
- conduct retrospective medical reviews
- perform process analysis
After studying and developing recommendations, what is the next step in interpreting and using data?
Take action
- empower teams to make decisions and implement changes
- educate and train staff
- report findings
After taking action, what is the next step in interpreting and using data?
Monitor performance
- have proposed changes been implemented?
- how could compliance be enhanced?
- what effect are changes having on patient outcomes?
After monitoring performance, what is the next step in interpreting and using data?
Communicate results
Population (N) is ________
total aggregate or group
Sample (n) is ___________
a portion of the population
What does sampling accomplish?
Provides a logical way of making statements about a larger group and allows quality professionals to make statements from the sample to the population depending on the type of sampling used
What is probability sampling and an example?
Can generalize findings. E.g., sending a survey to every 5th patient on the list
What is nonprobability sampling and an example?
Cannot generalize findings. E.g., surveying the first 5 patients who show up every week
What are the three probability sampling types?
- Simple random sampling
- systematic sampling
- stratified sampling
What is simple random sampling?
Each individual in the population has an equal chance to be chosen (e.g., drawing names out of a hat)
What is systematic sampling?
After random selection of 1st case, draw every nth case (e.g., every 5th patient)
What is stratified sampling?
The population is divided into groups (strata), and each member within a group has an equal probability of being selected (e.g., patients with a specific disease)
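The three probability sampling types can be sketched in a few lines of Python. This is only an illustration: the list of 100 patient IDs, the two strata and their sizes, and the 10% sampling fraction are all made-up assumptions, not part of the course material.

```python
import random

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: 100 patient IDs (assumed for illustration)
population = list(range(1, 101))

# Simple random sampling: every patient has an equal chance of selection
simple_sample = random.sample(population, 10)

# Systematic sampling: random first case, then every nth case (here n = 5)
step = 5
start = random.randrange(step)            # random position among the first 5
systematic_sample = population[start::step]

# Stratified sampling: divide into groups, then sample randomly within each
strata = {
    "medical": population[:60],           # assumed stratum sizes
    "surgical": population[60:],
}
stratified_sample = {
    name: random.sample(members, len(members) // 10)
    for name, members in strata.items()
}

print(simple_sample)
print(systematic_sample)
print(stratified_sample)
```

Sampling 10% within each stratum keeps the sample proportional to the strata, which is one common design choice among several.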
What are the five types of nonprobability sampling?
- Snowball sampling
- Convenience sampling
- Expert sampling
- Quota sampling
- Purposive sampling
What is snowball sampling?
Subjects suggest other subjects (sub-type of convenience sampling)
E.g., cancer patients in a clinic who identify other cancer patients they know
What is convenience sampling?
Any available group of subjects is used (lack of randomization). E.g., information about participants who took one instructor's CPHQ class but not all classes taught
What is purposive sampling?
A particular group is subjectively selected based on criteria
E.g., nursing group represents cross section of women
What is expert sampling?
Experts in a given area are selected due to their access to information
E.g., survey department managers about staff satisfaction
What is quota sampling?
A judgement is made about the most representative sample
E.g., 15 charts per month/5% or 30, whichever is greater
With the exception of case studies, _______ samples yield a more valid and accurate study and are representative of the population.
larger
As the actual difference between groups gets smaller, the size of the sample required gets _______.
bigger
Continuous data requires a ______ sample size than categorical data.
smaller
Regardless of the shape of the original population distribution, as sample size increases, the shape of the sampling distribution becomes a normal ____ ____
bell curve
_____ analysis determines appropriate sample size
Power
Reliability is the extent to which an instrument yields the same results on repeated trials. T or F?
True
How does Test-retest technique assess reliability?
Determines stability by administering a test to a sample of people on two occasions and comparing the results
How does Split-half technique assess reliability?
Assesses internal consistency by correlating scores on half the test with scores on the other half
What is the reliability coefficient?
The numerical index of the test’s reliability
The closer the reliability coefficient is to [#], the more reliable the tool.
1
Reliability coefficients of greater than or equal to [#] are considered acceptable, although greater than or equal to [#] is desired.
0.7, 0.8
What is interrater reliability?
Degree to which two raters, operating independently, assign the same ratings in the context of observational research or in coding qualitative materials
What is validity?
The degree to which an instrument measures what it is intended to measure
Is validity more or less difficult to establish than reliability?
More
A thermometer is a ________ instrument, but it is not valid for measuring height.
reliable
What is content (face) validity?
Degree to which instrument adequately represents the universe of content.
Content validity, although necessary, is not a sufficient indication that the instrument measures what it is intended to measure. True or False
True
E.g. Patient Satisfaction survey would be inadequate in terms of content validity if it did not cover major dimensions of care (waiting, access, provider-patient interaction, etc.)
What is construct validity?
Degree to which an instrument measures the theoretical construct or trait that it was designed to measure
What is an example of how construct validity can be used?
If a previously used satisfaction survey tool demonstrated validity, a new, abbreviated tool could be compared to the “old” tool to determine whether the new tool has construct validity
What is criterion-related validity?
Extent that the score on an instrument is related to a criterion (the behavior it’s supposed to predict)
What is an example of criterion-related validity?
Job description and performance evaluation. CPHQ exam and training
What is an example of construct validity?
Patient Satisfaction, personality types
What are measures of central tendency?
Statistical indexes that describe where a set of scores or values of a distribution cluster. Central refers to the middle value, and tendency refers to the general trend of numbers
The three most common measures of central tendency are ________, ________ and _________
mean, median and mode
What is an example of when a healthcare quality manager or researcher might ask questions relating to central tendency?
What is the Apgar score (quick test to assess baby’s health, given twice: 1 minute after birth and 5 minutes after) of infants born in new birthing suites?
What is the mean (M)?
Sum of all scores or values divided by total number of scores. Also known as the average
The mean is most commonly used of all measures of central tendency but is also most sensitive to extreme scores. True or False?
True
With Mean, zero is a numerical value and must be included in the division of the sum. True or False?
True
It is appropriate to use the mean for interval or ratio data when variables can be added and the values show a bell-shaped or normal distribution. True or False?
True
What is the median?
“The middle” Measure of central tendency that corresponds to the middle score.
How do you determine the median?
Arrange values in rank order:
- If the total number is odd, count up (or down) to the middle value.
- If there are several identical values clustered in the middle, the median is that value.
- If the total number of values is even, compute the mean of the middle two values
What is an example of when the median should be used rather than the mean and why?
If the values are not representative of the entire group due to extreme outliers. Median is not sensitive to extreme scores or statistical outliers
It is appropriate to compute the median for ordinal, interval or ratio data but not nominal data. True or False?
True
What is the mode?
Score or value that occurs most frequently in a distribution of scores
Modes are viewed as a quick and easy method to determine an “average,” but they tend to fluctuate widely from sample to sample. Thus, the mode is reported infrequently, except when used as a descriptor for “typical” values. True or False?
True
What’s an example of reporting mode as a descriptor for a “typical” value?
Describing demographic data from the sample population as “The typical response was a Hispanic man who was married.”
Is there a mode when all values have the same frequency?
No
When there are two values that have the same frequency, there can be two modes. True or False?
True
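A short Python sketch of the three measures of central tendency, using the standard `statistics` module. The length-of-stay values are hypothetical and chosen so that one extreme outlier shows why the median can be preferable to the mean.

```python
from statistics import mean, median, multimode

# Hypothetical lengths of stay in days; 30 is an extreme outlier
stays = [2, 3, 3, 4, 5, 30]

average = mean(stays)       # sum of values divided by the number of values
middle = median(stays)      # even count: mean of the two middle values
modes = multimode(stays)    # list of the most frequent value(s)

print(average)  # pulled upward by the outlier
print(middle)   # not affected by the outlier
print(modes)

# Two values tied for the highest frequency give two modes
print(multimode([1, 1, 2, 2]))
```

Note that `multimode` returns every tied value, so when all values occur equally often it returns all of them; that corresponds to the "no mode" case on the card above.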
What is range (dispersion)?
Difference between highest and lowest values
In contrast to central tendency, variability looks at the _________, or how the measures are spread out.
dispersion
What are the advantages of range?
It is a quick estimate of variability and it provides information about the two endpoints of a distribution
What are the disadvantages of range?
Its instability because it is based on only two scores, its tendency to increase with sample size and its sensitivity to extreme values
Range is best reported as a maximum and minimum (e.g., test scores range from 98 to 60). True or False?
True
What is standard deviation (SD)?
An average of the deviations from the mean
When is standard deviation most frequently used?
For measuring the degree of variability in a set of scores
The larger the spread of a distribution, the greater the dispersion/variability from the mean; consequently, the SD will be a _______ value and is said to be heterogeneous.
larger
What chart can be used to display data distribution and to determine if data are normally distributed?
A histogram
What is the interquartile range?
The most common interpercentile measure. A stable measure of variability based on excluding extreme scores and using only middle cases
How do you determine the interquartile range?
Line up the measures in order of size and then divide the array into quarters. The range of scores that includes the middle 50% of the scores is the interquartile range (the range between the top of the lowest quartile, or quarter, and the bottom of the highest quartile).
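The contrast between the range and the interquartile range can be sketched with Python's `statistics.quantiles`. The test scores are hypothetical, and quartile conventions differ slightly between tools, so other software may report slightly different cut points for the same data.

```python
from statistics import quantiles

# Hypothetical test scores (assumed for illustration)
scores = sorted([72, 55, 98, 65, 80, 60, 90, 62, 85, 70, 75])

value_range = scores[-1] - scores[0]     # highest minus lowest
q1, q2, q3 = quantiles(scores, n=4)      # cut points between the quarters
iqr = q3 - q1                            # spread of the middle 50%

# Replace the top score with an extreme value to compare stability
with_outlier = scores[:-1] + [300]
outlier_range = with_outlier[-1] - with_outlier[0]
o1, _, o3 = quantiles(with_outlier, n=4)
outlier_iqr = o3 - o1

print(value_range, iqr)
print(outlier_range, outlier_iqr)
```

In this sketch the range jumps when the extreme value is introduced while the interquartile range stays the same, which is the stability the card describes.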
What is a Cause-and-effect Diagram?
Used to display, explore and analyze all the potential causes related to a problem or condition and to discover the root causes of variation.
What is a Cause-and-effect Diagram also known as?
Ishikawa or Fishbone diagram
When should you use a cause-and-effect/fishbone/Ishikawa diagram?
- During root cause analysis
- when brainstorming about potential causes of a problem or source of variation
- looking for topics generating the most ideas
What are the common categories of a cause-and-effect/fishbone/Ishikawa diagram?
The 5Ps
- Patron (user)
- People (workers)
- Provision (supply)
- Places to work (environment)
- Procedures (methods/roles)
What is a flow chart?
Graphical display of a process outlining the sequence and relationship of the pieces of the process
When should you use a flow chart?
- Part of root cause analysis (RCA) or Failure Mode and Effects analysis (FMEA)
- When designing a new process or redesigning a current one
How do you create a flowchart?
- Decide on starting and end points (ovals)
- Brainstorm steps and place in rectangles
- Decision points are diamonds with a positive and negative response for each
- Connect all with lines and arrows for flow of process
What does a flow chart tell you?
- How to identify inefficiencies, omissions, gaps and redundancies
- How to determine steps in a process
- How to determine risk factors
What is a run chart?
A line graph plotted over time (kept in time order)
What questions does a run chart help answer?
- How much variation do we have
- Is the process changing significantly over time
- Has our change resulted in improvement
- Did I hold improvement
How do you use the X and Y axes in a run chart?
- Y (vertical) is what you care about (e.g., pt. sat. score). X (horizontal) is the ordering of the data (e.g., time). If the data are in time order, connect the dots with lines (if not, don’t)
How many data points do you need to start a run chart?
One if that’s all you have
How many data points do you need to determine a median?
10-12
When you have ___ to ___ data points, you should revise the median. Then, continue revising if the median is no longer useful
10-30
What is a Pareto Chart?
Tool used to prioritize a series of problems or possible causes of problems.
What does a Pareto Chart display?
A series of bars in which the varying height of the bars clearly displays the priority for problem solving
What does a Pareto chart ask?
- Which variables out of many are occurring most?
- Which variables of causes should we focus on?
How many data points make a Pareto chart useful?
30 or more
The Pareto principle states ____% of the problems or effects come from ___% of the causes (doesn’t have to be exact but should be close). So you should tackle the most frequent causes to achieve the most improvement.
80, 20
In a Pareto chart, the bars are always in order of ________
occurrence (left=most, right=least)
When do you use a Pareto chart?
- When data can be arranged in categories
- When rank of each category is important
- When you need to focus on most important problems or causes
- Can stratify by creating another Pareto chart from the largest bar
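A minimal Python sketch of the data preparation behind a Pareto chart: sort the categories from most to least frequent and track the cumulative percentage. The cause names and counts are hypothetical.

```python
# Hypothetical counts of medication-error causes (illustrative data only)
causes = {
    "Illegible order": 12,
    "Wrong dose": 45,
    "Missed dose": 30,
    "Wrong patient": 5,
    "Wrong time": 8,
}

total = sum(causes.values())

# Bars are ordered from most frequent to least frequent
ordered = sorted(causes.items(), key=lambda item: item[1], reverse=True)

cumulative = 0
rows = []
for cause, count in ordered:
    cumulative += count
    rows.append((cause, count, round(100 * cumulative / total, 1)))

for cause, count, cum_pct in rows:
    print(f"{cause:<16} {count:>3} {cum_pct:>6}%")
```

With these made-up numbers the first two of five causes account for 75% of the errors, which is the kind of concentration the Pareto principle points to.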
What is the difference between a bar and histogram chart (aka frequency plot)?
- Bar charts should be used when data is sparse. As data increases, histograms should be used to organize and summarize data.
Why is a histogram useful?
It plots the frequency of each interval to reveal patterns of the data and show their spread (including outliers) and whether there is symmetry or skew. This may reveal problems in the data and may influence the choice of measure of central tendency and spread.
How do you construct a histogram chart?
- Accumulate at least 25 data points (to give at least 5 bars)
- Rank the data from smallest to largest
- Calculate the range by subtracting smallest value from largest
- Estimate the number of bars (equals the square root of the number of data points)
- Determine width of bars by dividing range by number of bars (rounding if needed)
- Plot data, label
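The construction steps above can be sketched in Python. The 25 wait times are hypothetical, and rounding the bar width up (so that every value fits) is an assumption about how to handle the "rounding if needed" step.

```python
import math

# Hypothetical wait times in minutes (25 points, illustrative data only)
data = [8, 11, 13, 14, 15, 16, 17, 17, 18, 18, 19, 19, 20,
        20, 21, 21, 22, 22, 23, 24, 25, 26, 28, 30, 32]

ranked = sorted(data)                        # rank smallest to largest
data_range = ranked[-1] - ranked[0]          # largest minus smallest
n_bars = round(math.sqrt(len(ranked)))       # square root of the point count
width = math.ceil(data_range / n_bars)       # assumed: round the width up

counts = [0] * n_bars
for value in ranked:
    # Guard so the largest value always lands in the last bar
    index = min((value - ranked[0]) // width, n_bars - 1)
    counts[index] += 1

# Text version of the plot: one row per bar
for i, count in enumerate(counts):
    low = ranked[0] + i * width
    print(f"{low}-{low + width - 1}: {'#' * count}")
```

The printed rows give a rough picture of the shape; here the counts peak in the middle bar and tail off on both sides.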
When do you use a histogram?
- Show data distribution or spread
- Show whether the data are symmetric in shape (approx. same on both sides) or skewed (right or left)
- Show whether there are extreme data values (outliers)
What doesn’t a histogram tell you?
Whether a process is stable
What is a pie chart useful for?
Understanding all the responses on a measure. Usually expressed as percentages.
It answers: Where do most of the incidents occur? Which location experiences the fewest incidents?
When do you use a pie chart?
- Graphical display to show distribution or spread of comparative data
- During any stage of PDSA to communicate data distribution (visual representation but not an analysis of the data)
What is a scatter plot or diagram used for?
To determine the extent to which two variables (quality effects or process causes) relate to one another
Scatter plots are often used in combination with _______ or ______ diagrams/charts?
fishbone or Pareto
How do you construct a scatter plot?
- Collect at least 25 points of data for two variables
- Draw and label the axes of the graph
- Spread the data over equal distances on the graph
- Plot the paired sets of data by marking the intersection of their values
What is a positive correlation on a scatter plot?
Line trending up from the left to the right (one value increases, the other increases)
What is no correlation on a scatter plot?
No line, “scatter” of data
What is a negative correlation on a scatter plot?
Line trending down from left to right (one variable increases, the other decreases)
What is a peak or trough pattern on a scatter plot?
A “wave” or a “valley”
In terms of the strength of a relationship shown in a scatter plot, what does a tight relationship mean?
The factors appear to be responsible for most of the variation
In terms of the strength of a relationship shown in a scatter plot, what does a loose relationship mean?
Other factors probably affect the data
In terms of the strength of a relationship shown in a scatter plot, what do outliers mean?
Special causes are probably present
What question does a control (AKA Shewhart) chart answer?
Is this process stable (statistically stable)?
What is the difference between a run chart and a control chart?
Control chart can be used to declare process stable, run chart cannot. Run charts are control charts without the limits
How is a control chart displayed?
Data over time showing
- mean
- statistically calculated upper and lower control limits at 3 standard deviations from the mean
When would you use a control chart?
When you need to determine:
- how much variation is present/causes of variation
- Is my change an improvement/sustained
What is Statistical Process Control (SPC)?
An approach to monitoring quality by looking at whether a process or outcome is within the bounds of what is expected (control charts can be used for this).
What does the center line in a control chart depict?
The mean
How do you calculate the upper control limits (UCL) and lower control limits (LCL) for a control chart?
Adding and subtracting three standard deviations to or from the mean
How do you calculate standard deviation (SD)?
- Take the square of the difference between each data point and mean (finding the sum of those values)
- Divide the sum by the sample size minus 1 (=variance)
- Take the square root of the variance for the SD
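The three SD steps, followed by the mean plus or minus three SD limits, can be sketched as below. The monthly counts are hypothetical, and flooring the lower limit at zero is an assumption for count data. Specific control chart types may estimate the limits differently; this follows the description on the cards.

```python
import math

# Hypothetical monthly fall counts (illustrative data only)
values = [4, 6, 5, 7, 3, 5, 6, 4, 5, 5, 6, 4]

n = len(values)
mean = sum(values) / n

squared_diffs = [(x - mean) ** 2 for x in values]   # step 1: squared differences
variance = sum(squared_diffs) / (n - 1)             # step 2: divide by n - 1
sd = math.sqrt(variance)                            # step 3: square root

ucl = mean + 3 * sd             # upper control limit
lcl = max(mean - 3 * sd, 0)     # assumption: counts cannot go below zero

print(f"mean={mean:.2f} sd={sd:.2f} UCL={ucl:.2f} LCL={lcl:.2f}")
```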
When do you use an XmR chart and what is it?
Used when data are obtained on a periodic basis, such as once a day/week. It shows individual values (X) and calculates limits based on the moving range (mR)
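A minimal sketch of XmR limits in Python. The weekly values are hypothetical. The constant 2.66 is the conventional scaling factor applied to the average moving range for individuals charts; it does not appear in the flashcards and is stated here as an added detail.

```python
# Hypothetical weekly average wait times in minutes (illustrative data only)
values = [32, 35, 31, 38, 34, 36, 33, 37, 35, 34]

# Moving range: absolute difference between each pair of consecutive values
moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]

x_bar = sum(values) / len(values)                   # center line for X
mr_bar = sum(moving_ranges) / len(moving_ranges)    # average moving range

SCALE = 2.66  # conventional XmR constant (not from the flashcards)
ucl = x_bar + SCALE * mr_bar
lcl = x_bar - SCALE * mr_bar

print(f"center={x_bar:.2f} UCL={ucl:.2f} LCL={lcl:.2f}")
```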
What is common cause variation and how does it appear on a control chart?
This type of variation is exhibited as points between the control limits in no particular pattern (this is variation that normally would be expected from a process)
What is special-cause variation?
Variation that is not inherent in the process (unpredictable). This type of variation is exhibited as points that fall outside the control limits, or within the limits but in a pattern
What are the five rules to identifying a special cause?
- Rule 1: Is it outside the limit (on the line doesn’t count)
- Rule 2: Is it a shift (run of 8 or more points in a row above or below the center line; a point on the center line doesn’t make or break a shift)
- Rule 3: Is it a trend (6 or more consecutive points increasing or decreasing; ties between two or more points don’t make or break a trend)
- Rule 4: 2 out of 3 consecutive points near a control limit (in the outer third, can be upper or lower); if there isn’t an upper or lower limit, this rule doesn’t apply
- Rule 5: 15 or more points in a row close to the center line
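Rules 1 to 3 lend themselves to a short Python sketch. The function names and sample data are hypothetical, and Rules 4 and 5 are left out to keep the example small. Points on the center line and tied points are skipped, following the "doesn't make or break" wording above.

```python
def outside_limits(points, lcl, ucl):
    """Rule 1: indexes of points strictly outside the limits."""
    return [i for i, p in enumerate(points) if p > ucl or p < lcl]


def has_shift(points, center, run_length=8):
    """Rule 2: a run of 8 or more points on one side of the center line."""
    run, side = 0, 0
    for p in points:
        if p == center:
            continue                      # on the line: ignore the point
        current = 1 if p > center else -1
        run = run + 1 if current == side else 1
        side = current
        if run >= run_length:
            return True
    return False


def has_trend(points, length=6):
    """Rule 3: 6 or more consecutive points all rising or all falling."""
    run, direction = 1, 0
    for prev, cur in zip(points, points[1:]):
        if cur == prev:
            continue                      # tie: ignore the point
        step = 1 if cur > prev else -1
        run = run + 1 if step == direction else 2
        direction = step
        if run >= length:
            return True
    return False


print(outside_limits([5, 9, 1, 8.4], lcl=1.6, ucl=8.4))
print(has_shift([6, 7, 6, 8, 5, 6, 7, 6, 9], center=5))
print(has_trend([3, 1, 2, 3, 4, 5, 6]))
```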
What’s the question to ask if it’s a special cause?
What’s different?
What’s the question to ask if it’s a common cause?
What’s common to the process that’s impacting the data?
How many data points should you have to create a control chart?
12 or more
How many data points should you have to update your control limits in a control chart?
20-30
If you have a special cause in a control chart, you create new limits. True or False?
True
What is a force field analysis?
A method to systematically identify the various forces that facilitate or increase the likelihood of success and the opposing factors that decrease or restrain the likelihood of success or improvement (like pro/con chart)
What is a t-Test?
t-Test assesses whether the means of two groups are statistically different from each other
What is an example of a t-Test?
Evaluating the effects of two different treatments. Can be two groups that are independent (control group and experimental group) or dependent (pre- and post-treatment scores)
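A rough Python sketch of the t statistic for both cases. The scores are hypothetical, and the independent-groups version uses Welch's form (unequal variances), which is one common variant and an assumption here. Turning t into a p value needs a t-distribution table or a statistics library, so only the statistic is computed.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical pain scores (illustrative data only)
control = [7, 8, 6, 9, 7, 8]
treatment = [5, 6, 5, 7, 4, 6]


def welch_t(a, b):
    """t statistic for two independent groups (Welch's variant)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def paired_t(before, after):
    """t statistic for dependent (pre/post) scores on the same subjects."""
    diffs = [x - y for x, y in zip(before, after)]
    standard_error = sqrt(variance(diffs) / len(diffs))
    return mean(diffs) / standard_error


print(welch_t(control, treatment))   # treating the lists as separate groups
print(paired_t(control, treatment))  # treating them as pre/post pairs
```

The larger the absolute value of t, the less likely the difference in means is due to chance alone.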
What is regression analysis?
A correlation between two variables (e.g., height and weight) is used to evaluate the usefulness of a prediction. A higher correlation means a more accurate prediction. Values range from r=-1 to r=1; 0 means no correlation
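The correlation coefficient r and a simple least-squares prediction line can be sketched in plain Python. The height and weight values and the function names are hypothetical.

```python
from math import sqrt

# Hypothetical paired measurements (illustrative data only)
heights_cm = [150, 160, 165, 170, 180, 185]
weights_kg = [52, 58, 63, 68, 77, 82]


def pearson_r(x, y):
    """Correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


r = pearson_r(heights_cm, weights_kg)
slope, intercept = fit_line(heights_cm, weights_kg)

print(f"r = {r:.3f}")
print(f"predicted weight at 175 cm: {slope * 175 + intercept:.1f} kg")
```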
What are parametric tests and what are two examples of test types?
Based on assumptions about the distribution of the underlying population from which the sample was taken. t-Test and regression analysis
What is multiple regression analysis?
Estimates the effects of two or more independent variables on a dependent measure
What are nonparametric tests?
A class of statistical procedures that do not rely on assumptions about the shape or form of the probability distribution from which the data were drawn
What is an example of a nonparametric test?
Chi-square
What is the Chi-Square (χ²) test?
Measures the statistical significance of a difference in proportions and is the most commonly reported statistical test in medical literature
What is an example of the Chi-Square test?
15 of 30 men (50%) with an appointment failed to keep it, while only 10 of 40 women (25%) failed to appear. The relative rate of missed appointments in men would be 0.5/0.25=2 (men are twice as likely not to show up)
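The missed-appointment example can be worked through in Python as a 2x2 chi-square calculation. This sketch computes the statistic without a continuity correction and compares it with 3.841, the standard critical value for p = 0.05 with 1 degree of freedom; both choices are additions and are not stated on the cards.

```python
# Counts from the example: missed vs. kept appointments
men_missed, men_kept = 15, 15
women_missed, women_kept = 10, 30

rate_men = men_missed / (men_missed + men_kept)
rate_women = women_missed / (women_missed + women_kept)
relative_rate = rate_men / rate_women

observed = [[men_missed, men_kept], [women_missed, women_kept]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum of (observed - expected)^2 / expected over the four cells
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

CRITICAL_VALUE = 3.841  # cutoff for p = 0.05 with 1 degree of freedom
significant = chi_square > CRITICAL_VALUE

print(relative_rate, round(chi_square, 3), significant)
```

Because the statistic exceeds the cutoff, the difference between the two proportions would be called statistically significant at the 0.05 level under these assumptions.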
P value (probability) in a Chi-Square test is statistically significant when it is less than ____________
0.05
What are tests of significance used for?
To determine probability that a relationship between two variables is just a chance occurrence
What is a confidence interval (CI)?
Provides a range of values that describe how much a sample statistic deviates from the population statistic (usually 90-95%)
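A rough Python sketch of a 95% confidence interval for a mean. The satisfaction scores are hypothetical, and the multiplier 1.96 is the normal-approximation value for 95%. With a sample this small, a t-distribution multiplier would usually be more appropriate, so treat the result as illustrative only.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical patient satisfaction scores (illustrative data only)
scores = [82, 90, 77, 85, 88, 79, 91, 84, 86, 80]

m = mean(scores)
standard_error = stdev(scores) / sqrt(len(scores))

Z_95 = 1.96  # normal-approximation multiplier for a 95% interval (assumed)
ci_low = m - Z_95 * standard_error
ci_high = m + Z_95 * standard_error

print(f"mean = {m:.1f}, 95% CI = ({ci_low:.1f}, {ci_high:.1f})")
```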