WEEK 2 Flashcards

1
Q

Bias

A

Has evolved to become a preference in favor of or against a person, group of people, or thing.

2
Q

Data bias

A

Is a type of error that systematically skews results in a certain direction.

3
Q

What do you have to think about when you collect data?

A

As a data analyst, you have to think about bias and fairness from the moment you start collecting data to the time you present your conclusions.

4
Q

Sampling bias

A

Is when a sample isn’t representative of the population as a whole.

5
Q

Unbiased sampling

A

Results in a sample that’s representative of the population being measured.
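
The contrast between a biased sample and an unbiased one can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population, the group labels, and the `share` helper are all made up, and simple random sampling is just one way to draw a representative sample.

```python
import random

# Hypothetical population: 500 customers under 40 and 500 customers aged 40 or over.
population = ["under_40"] * 500 + ["40_plus"] * 500

def share(sample, group):
    """Fraction of the sample that belongs to the given group."""
    return sample.count(group) / len(sample)

# Biased: taking the first 100 rows only ever reaches one group.
biased_sample = population[:100]

# Unbiased: simple random sampling gives every member the same chance of selection.
rng = random.Random(0)  # fixed seed so the run is repeatable
random_sample = rng.sample(population, 100)

print(share(biased_sample, "under_40"))  # 1.0, although the true share is 0.5
print(share(random_sample, "under_40"))  # should land near the true share of 0.5
```

The biased sample suggests every customer is under 40, which is the kind of systematic skew that sampling bias produces.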

6
Q

Three types of data bias

A

Observer bias, interpretation bias, and confirmation bias

7
Q

Observer bias

A

It’s the tendency for different people to observe things differently.

8
Q

Interpretation bias

A

The tendency to always interpret ambiguous situations in a positive or negative way.

9
Q

Confirmation bias

A

Is the tendency to search for or interpret information in a way that confirms preexisting beliefs.

10
Q

Observer bias is sometimes referred to as

A

Experimenter bias
Research bias

11
Q

ROCCC process

A

Reliable
Original
Current
Comprehensive
Cited

12
Q

Reliable

A

With this data, you can trust that you’re getting accurate, complete, and unbiased information that’s been vetted and proven fit for use.

13
Q

Original

A

There’s a good chance you’ll discover data through a second- or third-party source. To make sure you’re dealing with good data, be sure to validate it with the original source.

14
Q

Comprehensive

A

The best data sources contain all critical information needed to answer the question or find the solution.

15
Q

Current

A

The usefulness of data decreases as time passes. If you wanted to invite all current clients to a business event, you wouldn’t use a 10-year-old client list. The same goes for data.

16
Q

Cited

A

If you’ve ever told a friend where you heard that a new movie sequel was in the works, you’ve cited a source. Citing makes the information you’re providing more credible.

17
Q

There are lots of places known for having good data.

A

Your best bet is to go with vetted public data sets, academic papers, financial data, and governmental agency data.

18
Q

Bad data

A

Not Reliable
Not Original
Not Current
Not Comprehensive
Not Cited

19
Q

Not Reliable

A

Bad data can’t be trusted because it’s inaccurate, incomplete, or biased.
This could be data that has sample selection bias because it doesn’t reflect the overall population.
Or it could be data visualizations and graphs that are just misleading.

20
Q

Not Original

A

If you can’t locate the original data source and you’re relying only on second- or third-party information, that can signal you need to be extra careful in understanding your data.

21
Q

Not Comprehensive

A

Bad data sources are missing important information needed to answer the question or find the solution. What’s worse, they may contain human error, too.

22
Q

Not Current

A

Bad data sources are out of date and irrelevant. Many respected sources refresh their data regularly, giving you confidence that it’s the most current info available.

23
Q

Not Cited

A

If your source hasn’t been cited or vetted, it’s a no-go.

24
Q

Data ethics

A

Refers to well-founded standards of right and wrong that dictate how data is collected, shared, and used.

25
Q

Six different aspects of data ethics:

A

Ownership, transaction transparency, consent, currency, privacy, and openness.

26
Q

Ownership

A

Who owns data?
It isn’t the organization that invested time and money collecting, storing, processing, and analyzing it.

It’s individuals who own the raw data they provide, and they have primary control over its usage, how it’s processed and how it’s shared.

27
Q

Transaction Transparency

A

The idea that all data processing activities and algorithms should be completely explainable and understood by the individual who provides their data.

28
Q

Consent

A

This is an individual’s right to know explicit details about how and why their data will be used before agreeing to provide it.

They should know the answers to questions like: Why is the data being collected? How will it be used? How long will it be stored?

29
Q

Currency

A

Individuals should be aware of financial transactions resulting from the use of their personal data and the scale of these transactions.

30
Q

Privacy (also called information privacy or data protection)

A

Privacy means preserving a data subject’s information and activity any time a data transaction occurs.

31
Q

Person’s legal right to their data

A

This means someone like you or me should have protection from unauthorized access to our private data, freedom from inappropriate use of our data, the right to inspect, update, or correct our data, ability to give consent to use our data, and legal right to access our data.

32
Q

Openness

A

Free access, usage, and sharing of data.

33
Q

Data anonymization

A

Is the process of protecting people’s private or sensitive data by eliminating that kind of information.

34
Q

Here is a list of data that is often anonymized

A

Telephone numbers

Names

License plates and license numbers

Social security numbers

IP addresses

Medical records

Email addresses

Photographs

Account numbers
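
One simple way to eliminate this kind of information is to drop the sensitive fields before the data is shared. The sketch below is only an illustration: the field names, the `anonymize` helper, and the sample record are hypothetical, and real anonymization often needs more than removing obvious identifiers.

```python
# Hypothetical field names; a real dataset will have its own schema.
SENSITIVE_FIELDS = {"name", "phone", "email", "ssn", "ip_address", "account_number"}

def anonymize(record):
    """Return a copy of the record without the sensitive fields."""
    return {key: value for key, value in record.items() if key not in SENSITIVE_FIELDS}

# Made-up record for illustration only.
record = {
    "name": "Ada Example",
    "email": "ada@example.com",
    "phone": "555-0100",
    "purchase_total": 42.5,
}

print(anonymize(record))  # {'purchase_total': 42.5}
```

The original record is left untouched, so the anonymized copy is the only version that needs to leave the analyst’s hands.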

35
Q

Open Data standards

A

Open data must be:
Available
Provided under terms that allow reuse and redistribution
Open to universal participation

36
Q

Data Interoperability

A

Is the ability of data systems and services to openly connect and share data.

For example, data interoperability is important for health care information systems where multiple organizations such as hospitals, clinics, pharmacies, and laboratories need to access and share data to ensure patients get the care that they need.

This is why your doctor is able to send your prescription directly to your pharmacy to fill. They have compatible databases that allow them to share information.

37
Q

Steps for ethical data use

A

Self-reflect and understand what you’re doing and the impact that it has.

Think about the various harms and risks associated with the work that you’re doing.

What’s the risk of holding onto this dataset? What potential harm could arise if you continue to look at, store, and retrieve this data?

Are you informing those that you’re collecting data from how it’s going to be used?

What’s the communication channel like?
