Course 4: Module 1 Flashcards

1
Q

Data integrity

A

The accuracy, completeness, consistency, and trustworthiness of data throughout its lifecycle.

2
Q

Data replication

A

The process of storing data in multiple locations.

3
Q

Data transfer

A

The process of copying data from a storage device to memory, or from one computer to another.

4
Q

Data manipulation

A

The process of changing data to make it more organized and easier to read.

5
Q

Other threats to data integrity

A
  • Human error
  • Viruses
  • Malware
  • Hacking
  • System failures
6
Q

Types of insufficient data

A
  • Data from only one source
  • Data that keeps updating
  • Outdated data
  • Geographically limited data
7
Q

Ways to address insufficient data

A

- Identify trends with the available data
- Wait for more data if time allows
- Talk with stakeholders and adjust your objective
- Look for a new dataset

8
Q

Population

A

All possible data values in a certain dataset.

9
Q

Sample size

A

The number of observations in a sample, where the sample is a part of a population chosen to be representative of that population.

10
Q

Sampling bias

A

Occurs when a sample isn’t representative of the population as a whole, so some members of the population are overrepresented or underrepresented.

11
Q

Random sampling

A

A way of selecting a sample from a population so that every possible sample of the same size has an equal chance of being chosen.
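
As a rough illustration of random sampling, the sketch below draws a sample without replacement using Python's standard `random` module. The function name `simple_random_sample`, the ID range, and the seed are hypothetical choices made for this example; they are not part of the course material.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size distinct members; each subset of that size is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, sample_size)

# Hypothetical population: customer IDs 1 through 1000.
population = list(range(1, 1001))
sample = simple_random_sample(population, 50, seed=42)
print(len(sample), "members drawn")
```

Fixing the seed is only for repeatability when studying; in a real survey you would usually leave it unset.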

12
Q

Margin of error

A

Since a sample is used to represent a population, the sample’s results are expected to differ from what the results would have been if you had surveyed the entire population. The margin of error is the maximum amount of that expected difference. The smaller the margin of error, the closer the sample’s results are to what you would have found by surveying the entire population.
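
One common way to estimate a margin of error for a survey percentage is the normal-approximation formula z * sqrt(p * (1 - p) / n). The card does not give a formula, so treat the sketch below as an assumption: the function name `margin_of_error`, the default proportion of 0.5 (the most conservative choice), and the sample sizes are all illustrative.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sample_size, confidence_level=0.95, proportion=0.5):
    """Normal-approximation margin of error for a sample proportion."""
    # z-score that leaves (1 - confidence_level) / 2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return z * sqrt(proportion * (1 - proportion) / sample_size)

# Hypothetical survey sizes: a larger sample gives a smaller margin of error.
for n in (100, 1000, 10000):
    print(n, round(margin_of_error(n), 4))
```

Under these assumptions, 1,000 responses give a margin of roughly 3 percentage points at a 95% confidence level, and raising the confidence level widens the margin.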

13
Q

Confidence level

A

How confident you are in the survey results. For example, a 95% confidence level means that if you were to run the same survey 100 times, you would expect similar results in about 95 of those 100 runs. The confidence level is set before you start your study because it affects how large your margin of error is at the end of the study.

14
Q

Confidence interval

A

The range of values expected to contain the population’s result at the confidence level of the study. This range is the sample result ± the margin of error.
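
The "sample result plus or minus the margin of error" rule can be sketched in a few lines. The function name `confidence_interval` and the survey numbers are hypothetical, chosen only to show the arithmetic.

```python
def confidence_interval(sample_result, margin_of_error):
    """Return (lower, upper) bounds: sample result minus and plus the margin."""
    return sample_result - margin_of_error, sample_result + margin_of_error

# Hypothetical survey: 60% of respondents agree, margin of error of 3 points.
low, high = confidence_interval(0.60, 0.03)
print(f"{low:.2f} to {high:.2f}")  # prints "0.57 to 0.63"
```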

15
Q

Statistical significance

A

The determination of whether your result could be due to random chance. The greater the significance, the less likely it is that the result is due to chance.
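
To make "due to random chance" concrete, the sketch below computes an exact two-sided p-value for a coin-flip experiment: the probability that a fair coin gives a result at least as lopsided as the one observed. The coin example, the function name `two_sided_p_value`, and the commonly used 0.05 cutoff mentioned afterwards are assumptions for illustration, not part of the card.

```python
from math import comb

def two_sided_p_value(heads, flips):
    """Chance that a fair coin lands at least this far from an even split."""
    expected = flips / 2
    observed_gap = abs(heads - expected)
    # Count every outcome at least as far from the expected count as observed.
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - expected) >= observed_gap
    )
    return extreme / 2 ** flips

# Hypothetical experiments: 60 and then 70 heads out of 100 flips.
for heads in (60, 70):
    print(heads, two_sided_p_value(heads, 100))
```

A smaller p-value means the result is harder to explain by chance alone. Many analysts call a result significant when the p-value falls below 0.05, though that threshold is a convention rather than a rule.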

16
Q

Statistical power

A

The probability that a test detects an effect when one really exists, that is, the probability of getting meaningful results from a test.
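
As one possible illustration of statistical power, the sketch below uses the textbook formula for a two-sided one-sample z-test with a known standard deviation. The choice of test, the function name `z_test_power`, and the example numbers are assumptions made here; the card itself does not specify a test.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sigma, sample_size, alpha=0.05):
    """Power of a two-sided one-sample z-test with known sigma."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # rejection cutoff
    shift = effect * sqrt(sample_size) / sigma  # true effect in standard errors
    # Probability that the test statistic lands beyond either cutoff.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# Hypothetical effect of 0.5 units with sigma = 1: power rises with sample size.
for n in (8, 32, 128):
    print(n, round(z_test_power(0.5, 1.0, n), 3))
```

With these assumed numbers, a sample of 32 gives power of about 0.8, a level often used as a planning target.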