Course 4: Module 1 Flashcards

Question 1

Q

Data integrity

Answer

A

The accuracy, completeness, consistency, and trustworthiness of data throughout its lifecycle.

Question 2

Q

Data replication

Answer

A

The process of storing data in multiple locations

Question 3

Q

Data transfer

Answer

A

The process of copying data from a storage device to memory, or from one computer to another

Question 4

Q

Data manipulation

Answer

A

The process of changing data to make it more organized and easier to read

Question 5

Q

Other threats to data integrity

Answer

A

Human error
Viruses
Malware
Hacking
System failures

Question 6

Q

Types of insufficient data

Answer

A

- Data from only one source
- Data that keeps updating
- Outdated data
- Geographically limited

Question 7

Q

Ways to address insufficient data

Answer

A

- Identify trends with the available data
- Wait for more data if time allows
- Talk with stakeholders and adjust your objective
- Look for a new dataset

Question 8

Q

Population

Answer

A

All possible data values in a certain dataset

Question 9

Q

Sample size

Answer

A

The number of items in a sample, where a sample is a part of a population that is representative of the population

Question 10

Q

Sampling bias

Answer

A

When a sample isn’t representative of the population as a whole, so some members of the population are overrepresented or underrepresented

Question 11

Q

Random sampling

Answer

A

A way of selecting a sample from a population so that every member of the population has an equal chance of being chosen
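
As a rough illustration, simple random sampling can be sketched in Python with the standard library. The function name `draw_random_sample`, the population of 1,000 ID numbers, and the seed are all made up for this example.

```python
import random


def draw_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)  # a seed only makes the example repeatable
    return rng.sample(population, sample_size)


# Hypothetical population: ID numbers 1 through 1,000.
population = list(range(1, 1001))
sample = draw_random_sample(population, 50, seed=42)
print(len(sample), "members selected")
```

Because `random.sample` draws without replacement, no member can appear in the sample twice.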

Question 12

Q

Margin of error

Answer

A

Because a sample is used to represent a population, the sample’s results are expected to differ from what you would have gotten by surveying the entire population. The margin of error is the maximum amount of that expected difference. The smaller the margin of error, the closer the sample’s results are to the results for the entire population.
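
One common way to estimate the margin of error for a survey percentage is the normal-approximation formula z * sqrt(p * (1 - p) / n). The sketch below assumes that formula applies (a simple random sample and a reasonably large sample size). The function name and the survey numbers are made up for this example.

```python
import math
from statistics import NormalDist


def margin_of_error(sample_proportion, sample_size, confidence_level=0.95):
    """Approximate margin of error for a proportion (normal approximation)."""
    # z-score that leaves (1 - confidence_level) / 2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    standard_error = math.sqrt(
        sample_proportion * (1 - sample_proportion) / sample_size
    )
    return z * standard_error


# Hypothetical survey: 50% answered "yes" out of 1,000 respondents.
moe = margin_of_error(0.5, 1000)
print(f"Margin of error: {moe:.3f}")  # about 0.031, or 3.1 percentage points
```

With the same proportion, a larger sample size gives a smaller margin of error.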

Question 13

Q

Confidence level

Answer

A

How confident you are in the survey results. For example, a 95% confidence level means that if you were to run the same survey 100 times, you would get similar results 95 of those 100 times. The confidence level is set before you start your study because it affects how big your margin of error is at the end of your study.

Question 14

Q

Confidence interval

Answer

A

The range of values expected to contain the population’s result at the confidence level of the study. This range is the sample result +/- the margin of error.
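
The "sample result +/- the margin of error" arithmetic can be sketched as follows. The function name and the numbers (a 60% sample result with a 3-point margin of error) are made up for this example.

```python
def confidence_interval(sample_result, margin_of_error):
    """Return the (low, high) range: sample result +/- margin of error."""
    return (sample_result - margin_of_error, sample_result + margin_of_error)


# Hypothetical survey: 60% sample result, margin of error of 3 points.
low, high = confidence_interval(0.60, 0.03)
print(f"Confidence interval: {low:.2f} to {high:.2f}")
```

Here the population’s result would be expected to fall between roughly 57% and 63% at the study’s confidence level.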

Question 15

Q

Statistical significance

Answer

A

The determination of whether your result could be due to random chance or not. The greater the significance, the less likely the result is due to chance.

Question 16

Q

Statistical power

Answer

A

The probability of getting meaningful results from a test, that is, the probability that the test detects an effect when there really is one

Question 17

Q