WEEK 1: The importance of integrity Flashcards

1
Q

Good alignment

A

Means that the data is relevant and can help you solve a business problem or determine a course of action to achieve a given business objective.

2
Q

Limitations of data that you might come across

A

Data from just one source
A data set that keeps updating
Not enough data to know whether a number is too low or too high
Outdated data
Geographically limited data

3
Q

How you can handle different types of insufficient data.

A

You can identify trends with the available data or
wait for more data if time allows;
you can talk with stakeholders and adjust your objective;
or you can look for a new data set.

4
Q

Things to remember when determining the size of your sample

A

Don’t use a sample size less than 30.

The confidence level most commonly used is 95%, but 90% can work in some cases.

Increase the sample size to meet specific needs of your project:
For a higher confidence level, use a larger sample size
To decrease the margin of error, use a larger sample size
For greater statistical significance, use a larger sample size
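The relationship between confidence level, margin of error, and sample size can be sketched with the standard formula for estimating a proportion. This is a minimal illustration, not the method of any particular sample size calculator: the function name `sample_size`, its parameters, and the conservative default proportion of 0.5 are assumptions made for this example.

```python
import math
from statistics import NormalDist


def sample_size(confidence_level, margin_of_error,
                population_size=None, proportion=0.5):
    """Estimate the sample size needed to estimate a proportion.

    Hypothetical helper for illustration. proportion=0.5 is the most
    conservative assumption because it yields the largest sample size.
    """
    # Two-sided z-score for the confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Required size when the population is effectively unlimited.
    n = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population_size is not None:
        # Finite population correction: small populations need fewer samples.
        n = n / (1 + (n - 1) / population_size)
    return math.ceil(n)


# 95% confidence with a 5% margin of error.
baseline = sample_size(0.95, 0.05)        # 385
# A higher confidence level needs a larger sample.
higher_confidence = sample_size(0.99, 0.05)
# A smaller margin of error also needs a larger sample.
tighter_margin = sample_size(0.95, 0.03)
```

Under these assumptions, raising the confidence level or shrinking the margin of error both push the required sample size up, which matches the rules of thumb in the list above.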

5
Q

Data integrity

A

Is the accuracy, completeness, consistency, and trustworthiness of data throughout its lifecycle.

6
Q

Data replication

A

Is the process of storing data in multiple locations.

7
Q

Why data can lack integrity

A

Because different people might not be using the same data for their findings, which can cause inconsistencies.

8
Q

Data transfer

A

Is the process of copying data from a storage device to memory, or from one computer to another.
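One common way to check that a transfer did not compromise the data is to compare checksums of the source and the copy. The sketch below assumes a simple file-to-file copy. The helper names `sha256_of` and `copy_and_verify` are hypothetical and chosen only for this example.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy a file, then confirm the copy is byte-for-byte identical."""
    shutil.copyfile(source, destination)
    # Matching digests mean the transferred data matches the original.
    return sha256_of(source) == sha256_of(destination)
```

If `copy_and_verify` returns `False`, the copy differs from the source and should not be used for analysis until the transfer is repeated.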

9
Q

Data manipulation

A

Is the process of changing data to make it more organized and easier to read.

10
Q

The data warehouse or data engineering team

A

Takes care of ensuring data integrity.

11
Q

Statistical power

A

Can be calculated and reported for a completed experiment to comment on the confidence one might have in the conclusions drawn from the study's results. It can also be used to estimate the number of observations, or sample size, required to detect an effect in an experiment.

A statistical power of at least 0.8 (80%) is the usual minimum for a test to be considered able to detect a real effect reliably. Power does not by itself make a result statistically significant.
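Both uses of statistical power described above can be sketched with a normal approximation for a two-sided comparison of two group means. This is a rough illustration under stated assumptions: equal group sizes, an effect size expressed as a standardized difference (Cohen's d), and a significance level of 0.05. The function names `power_two_sample` and `n_per_group_for_power` are hypothetical. An exact calculation based on the t-distribution gives slightly larger sample sizes.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test."""
    # Critical value for the chosen significance level.
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected test statistic when the effect is real.
    shift = effect_size * math.sqrt(n_per_group / 2)
    # Probability of landing in either rejection region.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_for_power(effect_size, power=0.8, alpha=0.05):
    """Approximate observations per group needed to reach a target power."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_crit + z_power) / effect_size) ** 2)


# Observations per group to detect a medium effect (d = 0.5) with 80% power.
needed = n_per_group_for_power(0.5)
# Power actually achieved with that many observations per group.
achieved = power_two_sample(0.5, needed)
```

With these assumptions, a smaller effect size or a higher target power both increase the number of observations required.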

12
Q

Statistical power

A

Is the probability of getting meaningful results from a test, that is, the probability that the test detects an effect when one really exists.
