DATA ANALYSIS PROCESS Flashcards

1
Q

When framing a problem, what should you remember?

A

1) Identify the main issue behind the business problem.
2) Get an idea of what the timelines are supposed to be.
3) Understand where the data sits.

2
Q

What is data collection?

A

It is gathering and analysing data from various sources to answer research questions, identify trends and estimate probabilities.

3
Q

Are these the two types of data collection methods we have in data analysis?

  • Primary data collection
  • Secondary data collection
A

Yes

4
Q

What is the difference between the two methods of data collection?

A

Primary data collection gathers data straight from the source, using a combination of methods such as interviews, focus groups, questionnaires and surveys. Because it is collected first-hand for the question being asked, it is generally considered the most accurate form of data.

Secondary data collection uses readily available data that has already been collected, usually by someone else and for a different purpose. E.g. sales reports, financial statements, business journals etc.

5
Q

Within the data collection process, we have to consolidate the data. What does that mean exactly?

A

It means that we gather data from its various sources and combine it into one coherent and presentable dataset.
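
As a rough illustration of consolidation, the sketch below merges records from two sources into one dataset. The source names (`sales_report`, `survey_results`), the field names and the `consolidate` helper are all hypothetical, and matching on a shared `customer_id` is an assumption; real sources may need a different key or extra reconciliation.

```python
# Hypothetical records from two sources; names and values are illustrative only.
sales_report = [
    {"customer_id": 1, "total_sales": 250.0},
    {"customer_id": 2, "total_sales": 90.5},
]
survey_results = [
    {"customer_id": 1, "satisfaction": 4},
    {"customer_id": 3, "satisfaction": 5},
]


def consolidate(*sources, key="customer_id"):
    """Merge records that share the same key into one row per key."""
    combined = {}
    for source in sources:
        for record in source:
            # Create the row on first sight of the key, then add the new fields.
            row = combined.setdefault(record[key], {})
            row.update(record)
    # Sort by key so the result is stable and easy to read.
    return [combined[k] for k in sorted(combined)]


dataset = consolidate(sales_report, survey_results)
for row in dataset:
    print(row)
```

Customers that appear in only one source keep only the fields that source provides, so the consolidated dataset can still contain gaps that need attention during cleaning.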

6
Q

How do we define data cleaning?

A

It is the process of preparing data for analysis by removing or modifying incorrect, incomplete, irrelevant, duplicated, or improperly formatted data.

7
Q

Why is it important to clean your data?

A

It’s important because the quality of the insights, and of the decisions derived from them, depends on high-quality data.

8
Q

Can you give examples of dirty data?

A

Duplicate data
Incomplete data
Inconsistent/inaccurate data

Duplicate data is data that appears multiple times.
Incomplete data is data, such as a spreadsheet, with missing values that are relevant to the analysis.
Inconsistent/inaccurate data is data that is outdated or contains structural errors.
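
The three kinds of dirty data above can be spotted with a few simple checks. This is a minimal sketch on made-up rows: the helper names are hypothetical, and treating "trimmed, title-cased text" as the consistent format is an assumption chosen only for the example.

```python
# Made-up rows; each problem row is labelled with the issue it demonstrates.
rows = [
    {"name": "Alice", "city": "London", "age": 34},
    {"name": "Alice", "city": "London", "age": 34},   # duplicate of row 0
    {"name": "Bob", "city": None, "age": 41},          # incomplete: missing city
    {"name": "Cara", "city": " london ", "age": 29},   # inconsistent formatting
]


def find_duplicates(rows):
    """Return indexes of rows that exactly repeat an earlier row."""
    seen, duplicates = set(), []
    for index, row in enumerate(rows):
        fingerprint = tuple(sorted(row.items()))
        if fingerprint in seen:
            duplicates.append(index)
        seen.add(fingerprint)
    return duplicates


def find_incomplete(rows):
    """Return indexes of rows with a missing (None or empty) value."""
    return [i for i, row in enumerate(rows)
            if any(value is None or value == "" for value in row.values())]


def find_inconsistent(rows):
    """Return indexes of rows whose text is not trimmed and title-cased."""
    return [i for i, row in enumerate(rows)
            if any(isinstance(value, str) and value != value.strip().title()
                   for value in row.values())]


print(find_duplicates(rows))    # [1]
print(find_incomplete(rows))    # [2]
print(find_inconsistent(rows))  # [3]
```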

9
Q

How can we clean our data?

A

We can clean our data by:
- Deleting unnecessary data columns
- Identifying and removing duplicates
- Removing blank cells
- Fixing inconsistencies
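
The four steps above can be sketched as one small cleaning function. The `clean` helper, the sample rows and the column names are hypothetical. Two choices here are assumptions, not rules: "removing blank cells" is read as dropping any row that has a blank value in a kept column, and "fixing inconsistencies" as trimming whitespace and title-casing text.

```python
def clean(rows, keep_columns):
    """Apply the four cleaning steps to a list of row dictionaries."""
    cleaned, seen = [], set()
    for row in rows:
        # Delete unnecessary columns by keeping only the ones we need.
        row = {col: row.get(col) for col in keep_columns}
        # Fix inconsistencies: trim whitespace and normalise casing of text.
        row = {col: value.strip().title() if isinstance(value, str) else value
               for col, value in row.items()}
        # Remove blanks: skip rows with a missing value in a kept column.
        if any(value is None or value == "" for value in row.values()):
            continue
        # Remove duplicates: skip rows already seen after normalisation.
        fingerprint = tuple(row[col] for col in keep_columns)
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(row)
    return cleaned


# Made-up raw data for illustration.
raw_rows = [
    {"name": "Alice", "city": "London", "notes": "vip"},
    {"name": "alice ", "city": " london", "notes": ""},
    {"name": "Bob", "city": "", "notes": "new"},
    {"name": "Cara", "city": "Leeds", "notes": None},
]

cleaned_rows = clean(raw_rows, keep_columns=["name", "city"])
for row in cleaned_rows:
    print(row)
```

Inconsistencies are fixed before duplicates are checked so that near-duplicates such as "alice " and "Alice" are recognised as the same record.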
