Data Data Data Flashcards

1
Q

Data Data Data, I cannot make bricks without clay

A

Sir Arthur Conan Doyle. Data is the clay you use to make building blocks.

2
Q

What is data analysis?

A

The collection, transformation, and organization of data in order to draw conclusions and inform decisions

3
Q

What are the 6 steps of data analysis?

A

The six steps of the data analysis process that you have been learning in this program are: ask, prepare, process, analyze, share, and act

4
Q

What is 1. Ask?

A

Assessing and gathering information in order to design the structure of the project or analysis
In the ask phase, you’ll work to understand the challenge to be solved or the question to be answered. It will likely be assigned to you by stakeholders. As this is the ask phase, you’ll ask many questions to help you along the way.

5
Q

What is 2. Prepare?

A

Plan: what metrics are you using, and how will you gather data and get those systems in place? When, or how frequently?
Next, in the prepare phase, you’ll find and collect the data you’ll need to answer your questions. You’ll identify data sources, gather data, and verify that it is accurate and useful for answering your questions.

6
Q

What is 3. Process?

A

The data analysts also made sure employees understood how their data would be collected, stored, managed, and protected, as planned in step 2, Prepare.

The process phase is when you will clean and organize your data. Tasks you perform here include removing any inconsistencies; filling in missing values; and, in many cases, changing the data to a format that’s easier to work with. Essentially, you’re ensuring the data is ready before you begin analysis.
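The cleaning tasks named above can be sketched in a few lines of Python. This is only an illustration: the survey rows, the field names, and the choice to fill a missing score with the rounded average of the known scores are all assumptions, not part of the course material.

```python
# Hypothetical raw survey rows; field names and values are invented for illustration.
raw_rows = [
    {"department": " Sales ", "score": "4"},
    {"department": "sales", "score": ""},
    {"department": "Support", "score": "5"},
    {"department": "SUPPORT", "score": "3"},
]


def clean_rows(rows):
    """Normalize department labels, convert scores to integers, fill missing scores."""
    # Remove inconsistencies: trim whitespace and use a single capitalization.
    normalized = [
        {"department": r["department"].strip().title(), "score": r["score"].strip()}
        for r in rows
    ]
    # Change the format: scores arrive as text, so convert the known ones to integers.
    known = [int(r["score"]) for r in normalized if r["score"]]
    # Fill missing values with the rounded mean of the known scores
    # (an assumed strategy; other choices are equally possible).
    fill = round(sum(known) / len(known))
    return [
        {
            "department": r["department"],
            "score": int(r["score"]) if r["score"] else fill,
        }
        for r in normalized
    ]


cleaned = clean_rows(raw_rows)
for row in cleaned:
    print(row)
```

After this step every row uses the same department spelling and has a numeric score, so the data is ready for the analyze phase.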

7
Q

What is 4. Analyze?

A

Discover and document

The analyze phase is when you do the necessary data analysis to uncover answers and solutions. Depending on the situation and the data, this could involve tasks such as calculating averages or counting items in categories so you can examine trends and patterns.
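As a rough illustration of "calculating averages or counting items in categories", the sketch below uses only the Python standard library. The responses and department names are made up for the example.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical (department, satisfaction score) pairs, invented for illustration.
responses = [
    ("Sales", 4),
    ("Sales", 2),
    ("Support", 5),
    ("Support", 3),
    ("Support", 4),
]

# Count items in each category.
counts = Counter(dept for dept, _ in responses)

# Group scores by category, then calculate the average for each group.
scores_by_dept = defaultdict(list)
for dept, score in responses:
    scores_by_dept[dept].append(score)
averages = {dept: mean(scores) for dept, scores in scores_by_dept.items()}

print(counts)
print(averages)
```

Comparing the counts and averages across categories is one simple way to start looking for trends and patterns.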

8
Q

What is 5. Share?

A

This is the thing you don’t do in rec therapy if you don’t write papers. Present information in easy-to-understand formats and explain conclusions simply.
Next comes the share phase, when you present your findings to decision-makers through a report, presentation, or data visualizations. As part of the share phase, you decide which medium you want to use to share your findings and select the data to include. Tools for presenting data visually include charts made in Google Sheets, Tableau, and R.

9
Q

What is 6. Act?

A

The last stage of the process for the team of analysts was to work with leaders within their company and decide how best to implement changes and take actions based on the findings.
Last is the act phase, in which you and others in the company put the data insights into action. This could mean implementing a new business strategy, making changes to a website, or any other action that solves the initial problem.

10
Q

What is the origin of statistics?

A

Data analysis is rooted in statistics, which has a pretty long history itself. Archaeologists mark the start of statistics in ancient Egypt with the building of the pyramids. The ancient Egyptians were masters of organizing data. They documented their calculations and theories on papyri (paper-like materials), which are now viewed as the earliest examples of spreadsheets and checklists. Today’s data analysts owe a lot to those brilliant scribes, who helped create a more technical and efficient process.

11
Q

What are the variations and iterations of the data analysis process?

A

EMC
SAS
Project based
Big data

12
Q

What is EMC’s data analysis process?

A

EMC Corporation’s data analytics process is cyclical with six steps:

Discovery

Pre-processing data

Model planning

Model building

Communicate results

Operationalize

EMC Corporation is now Dell EMC. This model, created by David Dietrich, reflects the cyclical nature of typical business projects. The phases aren’t static milestones; each step connects and leads to the next, and eventually repeats. Key questions help analysts test whether they have accomplished enough to move forward and ensure that teams have spent enough time on each of the phases and don’t start modeling before the data is ready. It is a little different from the data analysis process on which this program is based, but it has some core ideas in common: the first phase is interested in discovering and asking questions; data has to be prepared before it can be analyzed and used; and then findings should be shared and acted on.

13
Q

What is SAS’s iterative process?

A

An iterative data analysis process was created by a company called SAS, a leading data analytics solutions provider. It can be used to produce repeatable, reliable, and predictive results:

Ask

Prepare

Explore

Model

Implement

Act

Evaluate

The SAS model emphasizes its cyclical nature by visualizing the process as an infinity symbol. The process has seven steps, many of which mirror the other models, like ask, prepare, model, and act. But this process is also a little different; it includes a step after the act phase designed to help analysts evaluate their solutions and potentially return to the ask phase again.

14
Q

What is the project-based data analytics process?

A

A project-based data analytics process has five simple steps:

Identifying the problem

Designing data requirements

Pre-processing data

Performing data analysis

Visualizing data

This data analytics project process was developed by Vignesh Prajapati. It doesn’t include the sixth phase, the act phase. However, it still covers a lot of the same steps described. It begins with identifying the problem, continues with preparing and processing data before analysis, and ends with data visualization.

15
Q

What is the big data analytics process?

A

Authors Thomas Erl, Wajid Khattak, and Paul Buhler proposed a big data analytics process in their book, Big Data Fundamentals: Concepts, Drivers & Techniques. Their process suggests phases divided into nine steps:

Business case evaluation

Data identification

Data acquisition and filtering

Data extraction

Data validation and cleaning

Data aggregation and representation

Data analysis

Data visualization

Utilization of analysis results

This process appears to have three or four more steps than the previous models. But in reality, the authors have just broken down what has been referred to as prepare and process into smaller steps. The process emphasizes the individual tasks required for gathering, preparing, and cleaning data before the analysis phase.

16
Q

What are the 5 essential skills of a data analyst?

A
  1. Curiosity
  2. Context
  3. Technical mindset
  4. Data design
  5. Data strategy
17
Q

5 whys

A

Ask why 5 times. You’re going 5 whys deep to uncover the root cause.

18
Q

Gap analysis

A

What are the gaps in the process? Where am I now? Where do I want to be?

19
Q

Predict the future

A

Just far enough in advance to be right on time

20
Q

Predict the end

A

You can’t when there are so many starting points

21
Q

Issue

A

Topic or subject to investigate

22
Q

Question

A

Designed to discover information

23
Q

Problem

A

Obstacle or complication that needs a solution

24
Q

Oversampling

A

Increasing the sample size of nondominant groups
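
In practice, oversampling usually means collecting more data from the nondominant groups. When only an existing dataset is available, one common stand-in is random oversampling, which resamples the smaller groups with replacement until every group matches the largest one. The sketch below shows that variant only; the `oversample` helper, the `group` field, and the sample records are hypothetical names invented for this example.

```python
import random
from collections import Counter


def oversample(records, group_key, seed=0):
    """Randomly resample smaller groups (with replacement) up to the largest group's size."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    groups = {}
    for rec in records:
        groups.setdefault(rec[group_key], []).append(rec)
    target = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Draw the shortfall from the same group; k is 0 for the largest group.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Hypothetical records: group "A" is dominant, group "B" is nondominant.
records = [{"group": "A", "id": i} for i in range(6)] + [
    {"group": "B", "id": i} for i in range(2)
]
balanced = oversample(records, "group")
group_sizes = Counter(rec["group"] for rec in balanced)
print(group_sizes)
```

Note that the duplicated records add no new information; they only rebalance how much weight each group carries in later analysis.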