Internet of Things, Bias Flashcards
Also Mis v. Dis & Big Data
True or False: It’s impossible to create a perfect machine learning model that doesn’t contain any bias or any variance.
True
Define Implicit Biases
- occur when we make assumptions based on our personal experience
- manifest as attitudes and stereotypes we unconsciously hold about others
- can lead us to look for information that supports our beliefs and hypotheses and to disregard information that doesn’t
Define Group Attribution Bias
occurs when people assume that an individual’s characteristics are always determined by what their group believes, or that a group’s decisions reflect the feelings of all its members
How it Manifests
1. In-Group Bias = When you give preference to your own group
2. Out-Group Bias = When you stereotype members of groups you don’t belong to
* ex. Engineers are likely to view you as better qualified if you attended the same school (in-group bias)
Define Over-Generalization Bias
occurs when a person applies something from one event to all future events (ex. assuming that what you saw in one data set holds everywhere, so you tend to find what you expect to see)
Define Reporting Bias
occurs when you include only a subset of results in an analysis, which typically only covers a small fraction of evidence.
Forms
* Citation Bias = Analyzing data based on studies found in citations of other studies
* Language Bias = Excluding reports not written in the scientist’s native language
* Publication Bias = Choosing studies with positive findings rather than negative findings
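A minimal sketch of how publication bias can distort a summary. The effect sizes below are made-up numbers used only for illustration; they do not come from any real studies.

```python
from statistics import mean

# Hypothetical effect sizes from seven studies (invented for illustration).
all_effects = [0.4, -0.1, 0.3, -0.3, 0.0, 0.5, -0.2]

# Publication bias: keep only the studies with positive findings.
published_effects = [e for e in all_effects if e > 0]

mean_all = mean(all_effects)
mean_published = mean(published_effects)

print(f"mean effect, all studies:            {mean_all:.2f}")
print(f"mean effect, positive findings only: {mean_published:.2f}")
```

Averaging only the positive findings makes the effect look larger than the full set of evidence supports.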
Define Selection Bias
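occurs when the data you analyze is chosen in a way that does not represent the whole population, which can happen even when you filter data in an attempt to mitigate bias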
- ex. Choosing random people in a survey, and then only selectively using those people based on variables like first names, age, etc
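The survey example above can be sketched as follows. The respondents, their ages, and the "hours online" values are hypothetical, chosen only to show how filtering a sample on a variable such as age shifts the result.

```python
from statistics import mean

# Hypothetical survey respondents: (first name, age, hours online per day).
respondents = [
    ("Ana", 22, 6.0), ("Ben", 25, 5.5), ("Cal", 31, 4.0), ("Dee", 38, 3.5),
    ("Eli", 45, 2.5), ("Fay", 52, 2.0), ("Gus", 60, 1.5), ("Hal", 67, 1.0),
]

# Using every respondent in the sample.
mean_everyone = mean(hours for _, _, hours in respondents)

# Selection bias: only keeping respondents under 30.
under_30 = [r for r in respondents if r[1] < 30]
mean_under_30 = mean(hours for _, _, hours in under_30)

print(f"everyone: {mean_everyone:.2f} hours")
print(f"under 30: {mean_under_30:.2f} hours")
```

The filtered subset gives a noticeably different average from the full sample, even though the full sample may have been drawn at random.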
Define Automation Bias
occurs when you take an AI-based recommendation and use it before verifying that the information is right
- ex. When typing something into a search engine, suggestions may come up that are not what you truly wanted, but you choose one because it was the first thing that showed up
Define Systemic Bias
occurs when certain social groups are favored and others are devalued (usually unintentional)
True or False: It is possible to create a perfect machine learning model that does not contain any bias or variance.
False
Internet of Things
collective network of connected devices and the technology (sensors, software, etc.) that facilitates communication between devices and the cloud, as well as between the devices themselves
Synthetic Data
- gives potential control over the output, which makes it possible to produce a more balanced/clean/useful synthetic set
- mitigates bias by complementing the data with what is unseen
- balances data
- provides insight on what kind of data is in the model
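One way to picture "balance data" is the sketch below. It is an assumption, not a method named in these cards: it creates synthetic minority-class points by interpolating between random pairs of real minority points until both classes have the same size. The function name `balance_with_synthetic` and the sample points are hypothetical.

```python
import random

def balance_with_synthetic(majority, minority, seed=0):
    """Pad the minority class with interpolated synthetic points.

    Assumes the minority class has at least two points.
    """
    rng = random.Random(seed)
    synthetic = []
    while len(minority) + len(synthetic) < len(majority):
        a, b = rng.sample(minority, 2)   # pick two distinct real points
        t = rng.random()                 # position along the segment a -> b
        synthetic.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return minority + synthetic

# Hypothetical 2-D feature vectors for each class.
majority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3), (0.9, 0.7), (1.3, 1.0)]
minority = [(3.0, 3.0), (3.4, 2.8)]

balanced_minority = balance_with_synthetic(majority, minority)
print(len(majority), len(balanced_minority))
```

The original minority points are kept and only the gap is filled with synthetic ones, so the real data is never discarded.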
Variance
the variability in the model prediction
- High variance models pay too much attention to training data and don’t generalize on data they haven’t seen before, leading to good results on training data but high error rates on test data
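A minimal sketch of variance as "variability in the model prediction": train the same kind of model on several training sets and see how much its prediction at one point moves. The three training sets are hypothetical noisy samples of roughly y = 2x, and the two toy models (nearest neighbor and a constant mean) are illustrative choices, not taken from these cards.

```python
from statistics import mean, pvariance

# Three hypothetical training sets from the same trend (about y = 2x) plus noise.
training_sets = [
    [(0, 0.5), (1, 1.5), (2, 4.6), (3, 5.8)],
    [(0, -0.4), (1, 2.6), (2, 3.5), (3, 6.3)],
    [(0, 0.1), (1, 1.8), (2, 4.2), (3, 6.1)],
]

def nearest_neighbor_predict(train, x):
    """Flexible model: return the y of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def mean_predict(train, x):
    """Very simple model: always predict the average training y."""
    return mean(y for _, y in train)

query = 2.1
nn_preds = [nearest_neighbor_predict(t, query) for t in training_sets]
mean_preds = [mean_predict(t, query) for t in training_sets]

# Variance of the prediction across training sets.
nn_variance = pvariance(nn_preds)
mean_variance = pvariance(mean_preds)
print(f"nearest neighbor: {nn_variance:.4f}")
print(f"constant mean:    {mean_variance:.4f}")
```

The nearest-neighbor model copies whatever noise is in each training set, so its prediction swings more from one training set to the next than the constant-mean model's does.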
Underfitting
when a machine learning model fails to capture the underlying trend of the data (because it is too simple to explain variance)
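A small sketch of underfitting, using hypothetical data that follows roughly y = 2x. A model that always predicts the average is too simple to capture the trend, so its error is high on both the training and the test data, while a straight-line fit does well on both.

```python
from statistics import mean

def mse(model, data):
    """Mean squared error of a model over (x, y) pairs."""
    return mean((model(x) - y) ** 2 for x, y in data)

def fit_constant(train):
    """Too-simple model: always predict the mean of the training y values."""
    avg = mean(y for _, y in train)
    return lambda x: avg

def fit_line(train):
    """Least-squares straight line through the training points."""
    x_bar = mean(x for x, _ in train)
    y_bar = mean(y for _, y in train)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in train)
             / sum((x - x_bar) ** 2 for x, _ in train))
    return lambda x: y_bar + slope * (x - x_bar)

# Hypothetical data with a clear upward trend.
train = [(0, 0.1), (1, 2.0), (2, 3.9), (3, 6.1), (4, 8.0)]
test = [(0.5, 1.1), (1.5, 2.9), (2.5, 5.0), (3.5, 7.1)]

constant_model = fit_constant(train)
line_model = fit_line(train)

constant_train_mse = mse(constant_model, train)
constant_test_mse = mse(constant_model, test)
line_train_mse = mse(line_model, train)
line_test_mse = mse(line_model, test)
print(constant_train_mse, constant_test_mse, line_train_mse, line_test_mse)
```

The sign of underfitting here is that the constant model is bad everywhere, not just on unseen data.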
Overfitting
when a model fits its training data so closely that it begins to learn from the noise and inaccurate data entries in the data set (forcefitting, too good to be true)
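A small sketch of overfitting under assumed data: the training labels are noisy samples of y = 2x, and the test labels follow y = 2x exactly. A model that memorizes the training points (nearest neighbor, an illustrative choice) gets zero training error but does worse on the test data than a plain straight-line fit.

```python
from statistics import mean

def mse(model, data):
    """Mean squared error of a model over (x, y) pairs."""
    return mean((model(x) - y) ** 2 for x, y in data)

def fit_memorizer(train):
    """Memorizes the training set: predicts the y of the closest training x."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

def fit_line(train):
    """Least-squares straight line through the training points."""
    x_bar = mean(x for x, _ in train)
    y_bar = mean(y for _, y in train)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in train)
             / sum((x - x_bar) ** 2 for x, _ in train))
    return lambda x: y_bar + slope * (x - x_bar)

# Hypothetical noisy training data and clean test data (true trend: y = 2x).
train = [(0, 1.0), (1, 1.0), (2, 5.5), (3, 4.8), (4, 9.0)]
test = [(0.4, 0.8), (1.4, 2.8), (2.4, 4.8), (3.4, 6.8)]

memorizer = fit_memorizer(train)
line_model = fit_line(train)

memorizer_train_mse = mse(memorizer, train)
memorizer_test_mse = mse(memorizer, test)
line_train_mse = mse(line_model, train)
line_test_mse = mse(line_model, test)
print(memorizer_train_mse, memorizer_test_mse, line_train_mse, line_test_mse)
```

The "too good to be true" training score comes from reproducing the noise, which is exactly what hurts the memorizer on data it has not seen.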
Overrepresented
saying something is more than what it is
* ex. Saying everyone did phenomenally on a test when only 5/18 did