Stages of the data life cycle Flashcards

1
Q

What are the stages in the data life cycle?

A
  1. Plan
  2. Capture
  3. Manage
  4. Analyse
  5. Archive
  6. Destroy
2
Q

Define the Planning stage of the data life cycle

A

Deciding what kind of data is needed, how it will be managed throughout its life cycle, who will be responsible for it, and what the optimal outcomes are.

3
Q

Define the Capture phase of the data life cycle

A

Collecting data from a variety of sources and bringing it into the organisation. The data could be publicly available or come from the company’s own database.

4
Q

Define the Manage stage of the data life cycle

A

How we care for our data, how and where it’s stored, the tools used to keep it safe and secure, and the actions taken to make sure that it’s properly maintained.

5
Q

Define the Analyse stage of the data life cycle

A

Data is used to solve problems, make great decisions, and support business goals.

6
Q

Define the Archive stage of the data life cycle

A

Keeping relevant data stored for the long term and for future reference.

7
Q

Define the Destroy stage of the data life cycle

A

Safely and securely disposing of data, using secure data erasure software and shredding physical documents, to protect the private information of the company and its customers.

8
Q

What is a database?

A

A collection of data stored in a computer system.

9
Q

What is a stakeholder?

A

People who have invested time and resources into a project and are interested in the outcome.

10
Q

How do you determine stakeholder expectations?

A

By working out who the stakeholders are, what they want, when they want it, why they want it, and how best to communicate with them.

11
Q

What does it mean to define a problem?

A

Looking at the current state and identifying how it’s different from the ideal state.

12
Q

What is a spreadsheet formula?

A

A set of instructions that performs a specific calculation using the data in a spreadsheet.

13
Q

What is a spreadsheet function?

A

A preset command that automatically performs a specific process or task using the data in a spreadsheet.

14
Q

What is the difference between a formula and a function?

A

A formula is a set of instructions that performs a specific calculation, whereas a function is a preset command that automatically performs a process or task. Because a function is already built in, using one is usually more efficient than writing out the equivalent formula.
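
As a rough analogy only (Python is not a spreadsheet), the sketch below contrasts the two styles. The variables `a1`, `a2` and `a3` are hypothetical stand-ins for spreadsheet cells, and the spreadsheet expressions appear only in comments.

```python
# Hypothetical cell values standing in for cells A1, A2 and A3.
a1, a2, a3 = 10, 20, 30

# Formula style: spell out the calculation yourself, like =A1+A2+A3.
formula_total = a1 + a2 + a3

# Function style: call a preset command, like =SUM(A1:A3).
function_total = sum([a1, a2, a3])

print(formula_total, function_total)
```

Both give the same total; the function version stays the same length no matter how many cells are added.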

15
Q

What is an attribute in reference to a spreadsheet?

A

A characteristic or quality of data used to label a column in a table.

16
Q

What is an observation in relation to a spreadsheet?

A

All of the attributes for something contained in a row of a data table.
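
A minimal sketch of the attribute and observation cards, assuming a small made-up table held as a list of Python dictionaries. The names and values are hypothetical.

```python
# Hypothetical data table: each dictionary is one row (an observation),
# and each key is a column label (an attribute).
table = [
    {"name": "Ana", "city": "Leeds", "age": 34},
    {"name": "Ben", "city": "York", "age": 29},
]

# The attributes are the column labels shared by every row.
attributes = list(table[0].keys())

# An observation is everything recorded about one thing: a single row.
first_observation = table[0]

print(attributes)
print(first_observation)
```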

17
Q

What is a query?

A

A request for data or information from a database.

18
Q

What basic syntax forms every SQL query?

A

SELECT: to choose the columns you want to return.
FROM: to choose the table where those columns are located.
WHERE: to filter for certain information (optional; a query without it returns every row).
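
The three clauses can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, assuming a hypothetical `customers` table created in memory purely for illustration.

```python
import sqlite3

# Hypothetical in-memory table, created only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Leeds"), ("Ben", "York"), ("Cara", "Leeds")],
)

query = """
SELECT name            -- the column to return
FROM customers         -- the table that holds it
WHERE city = 'Leeds'   -- the filter
"""
leeds_names = [row[0] for row in conn.execute(query)]
conn.close()

print(leeds_names)
```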

19
Q

What is used to separate fields/variables in a SELECT command?

A

A comma.

20
Q

What is used to connect conditions in a WHERE command?

A

The word ‘AND’ when every condition must be true (or ‘OR’ when only one of them needs to be).
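
A short sketch of a query that returns two comma-separated columns and joins two conditions with AND, run through Python's built-in `sqlite3` module. The `orders` table and its rows are hypothetical.

```python
import sqlite3

# Hypothetical in-memory table, created only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, city TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("Ana", "Leeds", 120.0), ("Ben", "York", 80.0), ("Cara", "Leeds", 40.0)],
)

query = """
SELECT customer, total                  -- two fields, separated by a comma
FROM orders
WHERE city = 'Leeds' AND total > 100    -- both conditions must hold
"""
big_leeds_orders = conn.execute(query).fetchall()
conn.close()

print(big_leeds_orders)
```

Only rows that satisfy both conditions come back, so the Leeds order with the smaller total is filtered out.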

21
Q

What is fairness in data analytics?

A

Ensuring that your analysis doesn’t create or reinforce bias.

22
Q

What is self-reporting?

A

A data collection technique where participants provide information about themselves.

23
Q

What is oversampling?

A

The process of increasing the sample size of non-dominant groups in a population.
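
One way to sketch this in code is to resample the smaller group with replacement until it matches the larger one. This is an assumption about technique, not the only form of oversampling: it duplicates existing records rather than collecting new ones, and the survey data below is made up.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical survey responses: group "B" is under-represented.
responses = [{"group": "A"}] * 90 + [{"group": "B"}] * 10

majority = [r for r in responses if r["group"] == "A"]
minority = [r for r in responses if r["group"] == "B"]

# Draw extra minority records with replacement to close the gap.
extra = random.choices(minority, k=len(majority) - len(minority))
balanced = majority + minority + extra

counts = Counter(r["group"] for r in balanced)
print(counts)
```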

24
Q

List five best practices to support fair analysis.

A
  1. Consider all of the available data.
  2. Identify surrounding factors.
  3. Include self-reported data.
  4. Use oversampling effectively.
  5. Think about fairness from beginning to end.