Visualising Correlation Flashcards
What is optimizing the aspect ratio?
Adjusting the x and y-axis scales to proportionally represent the data, ensuring accurate visualization of relationships between variables.
Why must we remove the fill color?
Using only outlines for data points to reduce over-plotting and make it easier to see overlapping points and identify patterns or trends.
What are reference regions?
Visual areas on a scatterplot used to compare data to a reference set of values, making it easier to interpret the relationship between variables.
How can we visually distinguish data sets when they are divided into groups?
Using different colors, symbols, or trend lines to differentiate between subsets of data within a scatterplot.
What are trend lines?
Lines that trace the basic shape of data from left to right, indicating the overall direction and relationship between variables in a scatterplot.
What is the line of best fit?
The line that minimizes the sum of the squared residuals (the vertical distances between the data points and the line). It represents the overall trend in the data and can be used to predict values not in the dataset.
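As a rough illustration of the idea, the sketch below fits a straight line by ordinary least squares and uses it to predict a value outside the dataset. The function name `fit_line` and the sample numbers are made up for this example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and how x and y vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample data, invented for illustration only.
hours_studied = [1, 2, 3, 4, 5]
test_scores = [52, 58, 61, 70, 74]

slope, intercept = fit_line(hours_studied, test_scores)

# Predict a value that is not in the dataset (6 hours).
predicted = slope * 6 + intercept
print(f"score = {slope:.2f} * hours + {intercept:.2f}")
print(f"predicted score for 6 hours: {predicted:.1f}")
```

Any other straight line drawn through the same points would give a larger sum of squared residuals than the one returned here.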
What are multiple trend lines?
Using different trend lines within a scatterplot to show different trends within subsets of the data.
What is a crosstab display?
A method to separate data into individual scatterplots by categories or groups, reducing complexity and over-plotting and making it easier to compare correlation patterns.
What are grid lines?
Lines that help enhance comparisons between scatterplots by providing a visual reference for values on both the x and y axes.
What is the coefficient of determination (r²)?
A statistical measure that describes the strength of a linear correlation but not its direction. It can be expressed as a percentage, indicating how much of the variation in one variable is explained by the other variable.
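A minimal sketch of why r² carries strength but not direction: the helper `pearson_r` below (a name chosen for this example) computes the linear correlation coefficient r by hand, and squaring it discards the sign. The two tiny datasets are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Return the linear correlation coefficient r for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4]
r_rising = pearson_r(xs, [2, 4, 6, 8])   # y increases as x increases
r_falling = pearson_r(xs, [8, 6, 4, 2])  # y decreases as x increases

# r has opposite signs for the two datasets, but r squared is the same.
print(r_rising, r_rising ** 2)
print(r_falling, r_falling ** 2)
```

Both datasets are perfectly linear, so they share the same r² even though one relationship is positive and the other negative.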
What is the difference between a crosstab display and a scatterplot matrix?
A crosstab display separates data into individual scatterplots based on categories or groups, allowing for easier comparison of correlation patterns within different categories.
A scatterplot matrix arranges scatterplots in a matrix format to compare multiple pairs of quantitative variables simultaneously, identifying relationships and correlations between these variables.
What are the correlation analysis techniques and best practices?
▰ Optimizing aspect ratio and quantitative scales
▰ Removing fill color to reduce over-plotting
▰ Comparing data to reference regions
▰ Visually distinguishing data sets when divided into groups
▰ Using trend lines to enhance perception of the correlation’s shape, strength, and outliers
▰ Using multiple trend lines to see categorical differences
▰ Using trellis and crosstab displays to reduce complexity and over-plotting
▰ Using grid lines to enhance comparisons between scatterplots
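What does it mean when a correlation is curvilinear?
When a correlation is curvilinear, the relationship between the values does not change by a consistent amount: a given change in one variable is not matched by a constant change in the other, so the pattern follows a curve rather than a straight line.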
What is one example of an upward-curving correlation pattern?
The growth of Netflix can be represented by an S-curve: it started with a slow adoption rate, followed by a period of rapid acceleration as more users embraced the streaming service, and finally leveled off as the market became saturated and competition increased.
What is an example of exponential growth?
Compound interest paid by banks grows exponentially over time.
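To make the compound interest example concrete, here is a small sketch assuming interest is compounded once per year, so that balance = principal × (1 + rate)^years. The function name `compound_balance`, the 1000 starting balance, and the 5% rate are all illustrative assumptions.

```python
def compound_balance(principal, rate, years):
    """Balance after `years` of annual compounding at interest rate `rate`."""
    return principal * (1 + rate) ** years


# Hypothetical account: 1000 deposited at 5% per year.
previous = compound_balance(1000, 0.05, 0)
for year in range(1, 6):
    balance = compound_balance(1000, 0.05, year)
    # The yearly gain itself grows, which is what bends the curve upward.
    print(f"year {year}: balance {balance:.2f}, gain {balance - previous:.2f}")
    previous = balance
```

Because each year's interest is earned on the previous year's interest as well, the yearly gain keeps increasing instead of staying fixed, which is the curvilinear (exponential) pattern the card describes.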