Fairness in Data Analytics: Common Biases Flashcards
Understand the what, why and how of common biases when conducting fair data analytics.
What is sampling bias?
When a sample isn’t representative of the population being studied.
Why is sampling bias problematic?
It leads to misleading conclusions because the sample doesn’t reflect the true population.
How to avoid sampling bias?
Draw your sample at random and check that each relevant subgroup appears in roughly the same proportion as in the population, for example by using stratified sampling.
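One way to combine random selection with subgroup coverage is stratified sampling, sketched below under stated assumptions. The function name `stratified_sample`, the `region` field, and the 80/20 population are hypothetical and only illustrate the idea.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction at random from every subgroup."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Group the records by the subgroup they belong to.
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)

    sample = []
    for members in groups.values():
        # Take at least one record so small subgroups are not dropped.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 80 urban and 20 rural respondents.
population = [
    {"id": i, "region": "urban" if i < 80 else "rural"} for i in range(100)
]
sample = stratified_sample(population, key="region", fraction=0.1)
urban = sum(1 for r in sample if r["region"] == "urban")
rural = sum(1 for r in sample if r["region"] == "rural")
```

With a 10% fraction, the sample keeps the population's 80/20 split (8 urban, 2 rural), whereas a simple random draw of 10 could miss the rural group entirely.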
What is confirmation bias?
The tendency to search for or interpret data in a way that confirms pre-existing beliefs.
Why is confirmation bias harmful in analysis?
It skews the analysis toward data that supports existing assumptions, while contradictory evidence is ignored.
How to avoid confirmation bias?
Look at all the data objectively and challenge your own assumptions throughout the analysis.
What is selection bias?
Bias that occurs when the individuals included in an analysis are not selected at random, which undermines the validity of the results.
Why is selection bias dangerous?
It distorts findings, as the chosen sample may have characteristics that are not representative of the target group.
How to avoid selection bias?
Use randomized sampling methods and ensure all relevant groups are included.
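A quick check on whether all relevant groups are actually included is to compare each group's share of the sample with its known share of the target population. The sketch below assumes those population shares are available. The helper `share_gaps`, the age bands, and all figures are hypothetical.

```python
from collections import Counter


def share_gaps(sample_groups, population_shares):
    """Return sample share minus population share for each group."""
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


# Hypothetical target population shares by age band.
population_shares = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}

# Hypothetical sample in which the oldest group was never selected.
sample_groups = ["18-34"] * 60 + ["35-54"] * 40

gaps = share_gaps(sample_groups, population_shares)

# Groups that exist in the population but do not appear in the sample.
present = set(sample_groups)
missing = [group for group in population_shares if group not in present]
```

A positive gap means a group is over-represented, and a negative gap means it is under-represented. Here the 55+ group is missing altogether, which suggests the selection process needs to be revisited before drawing conclusions.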
What is survivorship bias?
Focusing only on the cases that succeeded or remained, while ignoring those that failed or were excluded.
Why does survivorship bias mislead?
It creates a false perception by only analyzing surviving cases, leading to overly optimistic conclusions.
How to avoid survivorship bias?
Include data from all cases, both successes and failures, in your analysis.
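The effect of leaving out failures can be seen by computing the same statistic twice: once on the surviving cases only, and once on all cases. The fund records below are made-up illustrative values, not real data.

```python
from statistics import mean

# Hypothetical investment funds; "closed" marks funds that no longer exist.
funds = [
    {"name": "A", "annual_return": 0.08, "closed": False},
    {"name": "B", "annual_return": 0.05, "closed": False},
    {"name": "C", "annual_return": -0.12, "closed": True},
    {"name": "D", "annual_return": -0.09, "closed": True},
]

# Biased view: average over the funds that are still open.
survivors_only = mean(f["annual_return"] for f in funds if not f["closed"])

# Fuller view: average over every fund, including the closed ones.
all_cases = mean(f["annual_return"] for f in funds)
```

In this made-up example the survivors average about 6.5% per year, while the full set averages about -2%, so looking only at surviving funds would give an overly optimistic picture.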
What is response bias?
When survey respondents answer untruthfully or in a way they think is expected.
Why does response bias affect data?
It leads to inaccurate or skewed data, as respondents may not provide honest answers.
How to minimize response bias?
Design surveys to be neutral and ensure respondents feel safe providing honest answers.