Unit 14 Flashcards
How does the average annual CLABSI score (across all hospitals) change over the 5 years of data?
Time series analysis
What is the order from best to worst CLABSI score of the hospitals for 2011?
Ranking analysis
Is there a statistical relationship between CLABSI and CAUTI?
Correlation analysis
For 2011, do any of the hospitals show similar trends across all 6 metrics?
Multivariate analysis
What is the distribution of the CLABSI scores?
Distribution analysis
How far is each hospital over or under the target score of 1 for CAUTI in 2011?
Deviation analysis
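A minimal Python sketch of the deviation question above: subtract the target from each hospital's score and sort by the result. The hospital names and CAUTI values are made up for illustration; only the target of 1 comes from the card.

```python
# Hypothetical CAUTI scores for 2011; names and values are invented for this sketch.
cauti_2011 = {"Hospital A": 1.4, "Hospital B": 0.7, "Hospital C": 1.0}
TARGET = 1.0  # target score named in the question


def deviation_from_target(scores, target):
    """Return each item's signed distance from the target (positive = over target)."""
    return {name: round(score - target, 2) for name, score in scores.items()}


deviations = deviation_from_target(cauti_2011, TARGET)

# List hospitals from furthest over the target to furthest under it.
for name, dev in sorted(deviations.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {dev:+.2f}")
```

Plotting these signed values as bars around a zero baseline is one common way to display a deviation analysis.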
You would like to create a parallel coordinate plot of 4 variables for several different hospitals. All of the variables are decimal numbers, but two of the variables range from 0 to 1000, and the other two range from 0 to 1. What is the best approach for handling these different ranges?
NOT
Scale the two variables with the lower range by multiplying by 100
Create a parallel coordinate plot for the variables with the greater range and then color code each point with the sum of the other two variables
Create two parallel coordinate plots, one for the variables with the greater range and a second for the variables with the smaller range.
The faces in a Chernoff display are what kind of graphical object? (Check all that apply)
Glyph
The following parallel coordinate plot shows the AHRQ PSIs for the 6 HAI indicators for all hospitals in the nation. Suppose you would like to see if hospitals within a state have similar profiles. How could you examine this? (Select all that apply)
Keep separate lines for each hospital on the parallel coordinate plot, but color the lines based on state. Highlight hospitals from a single state to look for any trend for that state.
AND
Create one parallel coordinate plot for each state. Each plot contains the hospitals for a single state.
What is the best multivariate display for finding similar profiles? For example, identifying hospitals with similar scores on all 6 AHRQ HAI PSIs for 2010?
Parallel coordinate plot
What is the best method for identifying hospitals with similar multidimensional profiles? (Select all that apply)
Use a tool that supports automated clustering of items on a parallel coordinates plot
AND
Use brushing to identify hospitals that meet certain criteria and look for patterns among the matching hospitals
Select all question types answerable by multivariate analysis
“Which items are most alike?”
“Which items are exceptional?”
“Which items are similar or how can we group items?”
Each row represents a hospital, each variable in the row is a string containing one of “low”, “normal”, or “high”
*Blue, White, Red, with Blue for the lowest and Red for the highest value
Each row represents a hospital, each variable in the row is a string containing one of “normal”, “moderate”, or “high”
*White, Light Red, Dark Red, with white for the lowest value and dark red for the highest value
Each row represents a hospital, each variable in the row contains a decimal value from 0 to 10, where 5 is a target that each hospital is supposed to meet or exceed
*Diverging palette from red for the lowest values to blue for the highest, centered at 5
Each row represents a hospital, each variable in the row contains a decimal value from 0 to 10 with 0 being good and 10 being the worst
*Sequential palette from very light red to very dark red
What are the major types of displays for multivariate analysis?
Glyphs
Multivariate Heatmaps
Parallel Coordinate Plots
You have 5 metrics for each of many hospitals. Four of these metrics range from 0 to 100 with 0 being the worst performance and 100 being the best. The fifth metric ranges from 0 to 10 with 0 being best and 10 being worst. What is the best approach for including all of these metrics in a parallel coordinate plot?
Scale the 0-10 metric so that it ranges from 0 to 100, with 0 being the worst performance and 100 the best.
-This rescaling means that this metric will visually match the other metrics where lower is worse performance and higher is better performance. This makes it easier to see and interpret trends.
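The rescaling in the answer above can be sketched in Python as a linear map that also flips the direction, so that 0 on the old scale (best) becomes 100 and 10 (worst) becomes 0. The function name and default arguments are illustrative choices, not from the course material.

```python
def rescale_inverted(value, old_min=0.0, old_max=10.0, new_min=0.0, new_max=100.0):
    """Map value from [old_min, old_max] onto [new_min, new_max], reversing direction.

    With the defaults, a 0-10 "lower is better" metric becomes a
    0-100 "higher is better" metric.
    """
    fraction = (value - old_min) / (old_max - old_min)  # position within the old range
    return new_max - fraction * (new_max - new_min)     # flip so old_min maps to new_max


# Hypothetical values of the fifth metric for three hospitals.
fifth_metric = [0.0, 2.5, 10.0]
rescaled = [rescale_inverted(v) for v in fifth_metric]
print(rescaled)
```

After this step all five axes read the same way, so a line that sits high across the plot means good performance on every metric.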
Indicate whether each dataset is an example of time series data or temporal event data
A dataset containing one blood pressure and one weight measure every day for a month for 100 patients
Time series
A dataset containing patient logs of approximate start and end times at which they exercised, approximate meal time and calories consumed, and blood pressure taken approximately once in the morning and once in the evening. There are 14 to 60 days of data for each patient.
Temporal event data
A dataset containing the time of ED Admission, time of ordering and administering antibiotics, time of ordering and obtaining a blood culture, time of reading and result of mean arterial pressure taken at varying intervals, and time of admission to the hospital.
Temporal event data
Which month had the most variation in the graph below?
February
To create an interactive visualization that allows a user to change the lag of one time series in order to compare it to a second time series, we create a calculated field such as “Lagged SOI” with the formula: Lookup(SUM([SOI]), [Lag (Months)])
What is “Lag (Months)”?
A parameter
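As a rough Python analogy for the Lookup formula above (an assumption about its behavior, not Tableau code): for each row, take the value found a fixed number of rows away, and return nothing when that row does not exist. The `lookup` function and sample values here are hypothetical.

```python
def lookup(values, offset):
    """Return a list where element i is values[i + offset], or None if out of range."""
    n = len(values)
    return [values[i + offset] if 0 <= i + offset < n else None for i in range(n)]


# Hypothetical monthly SOI values.
soi = [0.4, -0.1, 0.3, 0.8]

lag_months = -1  # plays the role of the "Lag (Months)" parameter
lagged_soi = lookup(soi, lag_months)
print(lagged_soi)
```

Changing `lag_months` shifts the whole series earlier or later, which is what the parameter control lets the user do interactively.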
To create an interactive visualization that allows a user to change the lag of one time series in order to compare it to a second time series, we create a calculated field such as “Lagged SOI” with the formula: Lookup(SUM([SOI]), [Lag (Months)]) and a parameter called “Lag (Months)”.
How do we add the control for Lag (Months)?
Right-click Lag (Months) and select Show Parameter Control
Select the most accurate statement(s) regarding time series
For time series data collected at regular intervals with no missing values, line graphs are one of the most accurate and effective means of display
Of the options below, what is the most effective visualization strategy for detecting possible lagged covariation in the two time series shown on this graph?
Create a dashboard showing the graph above, plus a scatterplot of SOI vs. Recruitment. Add a linear regression line to the scatterplot and modify the above graph to allow the user to vary the lag of one of the variables (SOI or Recruitment) to observe the correlation at different lags.
-Yes, this technique will allow the user to “shift” one of the time series lines left or right and observe whether this increases or decreases the correlation as shown in the scatterplot.
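The same idea can be checked numerically. The sketch below, in Python with made-up data, shifts one series by each candidate lag and computes the Pearson correlation of the overlapping points. The function names and sample series are assumptions for illustration; the dashboard described above does this visually instead.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(leader, follower, lag):
    """Correlate leader[t] with follower[t + lag] for a non-negative lag."""
    if lag < 0 or lag >= len(leader) - 1:
        raise ValueError("lag must be between 0 and len(leader) - 2")
    if lag == 0:
        return pearson(leader, follower)
    return pearson(leader[:-lag], follower[lag:])


def best_lag(leader, follower, max_lag):
    """Return the lag in 0..max_lag with the largest absolute correlation."""
    return max(range(max_lag + 1),
               key=lambda k: abs(lagged_correlation(leader, follower, k)))


# Made-up data: the follower repeats the leader two steps later.
leader = [1, 3, 2, 5, 4, 6, 3, 7]
follower = [0, 0, 1, 3, 2, 5, 4, 6]
print(best_lag(leader, follower, max_lag=3))
```

A strong correlation at a non-zero lag suggests that one series tends to lead the other by that many periods.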
Which of the graphs below show evidence of seasonality (a cycle that repeats every year)?
A - Cycle Plot Lung Cancer Deaths
AND
C - Segmented Time Series by Year Superstore Sales
The percent difference of each point in a time series from the previous point.
Rate of Change
Consecutive points that fall more than 3 standard deviations above the mean.
Exceptions
Repetitive patterns in a time series.
Cycles
The overall direction of movement of the time series over the entire data set: rising, falling, or staying the same.
Trend
The extent to which two time series are correlated, possibly at some non-zero lag.
Co-Variation
The amount of random changes above and below the main trend throughout the time series.
Variability
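Two of the definitions above, rate of change and exceptions, reduce to short calculations. The Python sketch below uses invented data and hypothetical function names. It flags individual points rather than runs of consecutive points, and it uses the population standard deviation, which is one possible choice.

```python
from statistics import mean, pstdev


def rate_of_change(series):
    """Percent difference of each point from the previous point.

    The result has one fewer element than the input because the first
    point has no predecessor. A previous value of 0 yields None.
    """
    changes = []
    for prev, curr in zip(series, series[1:]):
        if prev == 0:
            changes.append(None)  # percent change is undefined from zero
        else:
            changes.append((curr - prev) / prev * 100)
    return changes


def exceptions(series, n_sd=3.0):
    """Indices of points more than n_sd standard deviations above the mean."""
    mu = mean(series)
    sigma = pstdev(series)  # population standard deviation (an assumption)
    return [i for i, v in enumerate(series) if v > mu + n_sd * sigma]


# Invented monthly values.
print(rate_of_change([100, 110, 99]))

# Twenty invented points with a single spike at index 10.
spiky = [10] * 10 + [100] + [10] * 9
print(exceptions(spiky))
```

To match the card's wording more closely, the indices returned by `exceptions` could be filtered to keep only runs of adjacent positions.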
Which of the following graphs represent multivariate time series analyses?
*Lung Cancer Deaths Male, Female, and Total
*Profit and Sales
*Lung Cancer
Which of the following graphs represent univariate time series analyses?
Total Children in Mother-Only Households