Course 1, Module 1 Flashcards

Introducing data analytics and analytical thinking

1
Q

What is data?

A

A collection of facts.

2
Q

What is data analysis?

A

It is the collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making.

3
Q

What is a data analyst?

A

It is someone who collects, transforms, and organizes data in order to help make informed decisions.

4
Q

What is data analytics?

A

It is the science of data.

5
Q

Why do businesses need to control their data?

A

So they can use it to:
1. improve processes,
2. identify opportunities and trends,
3. launch new products,
4. serve customers,
5. make thoughtful decisions.

In other words, for businesses to be on top of the competition, they need to be on top of their data.

6
Q

What is people analytics (aka human resources analytics or workforce analytics)?

A

It is the practice of collecting and analyzing data on the people who make up a company’s workforce in order to gain insights to improve how the company operates.

7
Q

What are the 6 steps of the data analysis process?

A

Ask, prepare, process, analyze, share, and act.

(See case study explaining each of these in “new data perspectives”).

8
Q

What is the ask phase?

A

We define the problem to be solved and make sure that we fully understand the expectations of stakeholders (people who have invested time and resources into a project and are interested in the outcome).

9
Q

What is the prepare phase?

A

Finding and collecting the data needed to answer questions. Data and results must be fact-based and unbiased.

10
Q

What is the process phase?

A

Cleaning and organizing the data before it is ready for analysis. Data analysts find and eliminate any errors and inaccuracies that can get in the way of results.

11
Q

What is the analyze phase?

A

Using tools to transform and organize the data so that you can draw useful conclusions, make predictions, and drive informed decision-making.

12
Q

What is the share phase?

A

Presenting the findings to decision-makers through a report, presentation, or data visualizations.

13
Q

What is the act phase?

A

The business puts the data analyst’s insights into action to solve the original problem (the one defined in the ask phase).

14
Q

What is Data Intelligence?

A

It is a combination of applied data science and the social and managerial sciences.

15
Q

Data science is an umbrella term for 3 disciplines. Which ones?

A

Machine learning, statistics, and analytics.

16
Q

What differentiates the 3 disciplines of data science?

A

They are distinguished by how many decisions you know you want to make before you begin.

  1. If you want to make a few important decisions under uncertainty, that is statistics.
  2. If you want to automate (i.e. make many, many, many decisions under uncertainty), that is machine learning and AI.
  3. If you don’t know how many decisions you want to make before you begin, and you want to encounter your unknown unknowns and understand your world, that is analytics.
17
Q

What is the root of data analysis and when did it start?

A

It is statistics and it started in ancient Egypt with the building of the pyramids. The ancient Egyptians were masters of organizing data. They documented their calculations and theories on papyri (paper-like materials), which are now viewed as the earliest examples of spreadsheets and checklists.

18
Q

There are several variations of the data analysis process (going from data to decision). Describe the one presented by Google.

A

Ask: Define the problem and confirm stakeholder expectations

Prepare: Collect and store data for analysis

Process: Clean and transform data to ensure integrity

Analyze: Use data analysis tools to draw conclusions

Share: Interpret and communicate results to others to make data-driven decisions

Act: Put your insights to work in order to solve the original problem

19
Q

Where can data be found?

A

In data ecosystems and in the cloud.

20
Q

What is a data ecosystem?

A

The various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data. Includes hardware, software tools, and the people who use them.

21
Q

What is the cloud?

A

A place to keep data online, rather than on a computer hard drive.

22
Q

What is the difference between data science and data analysis?

A

Data science is defined as creating new ways of modeling and understanding the unknown by using raw data. Data scientists create new questions using data, while analysts find answers to existing questions by creating insights from data sources.

23
Q

What is the difference between data analysis and data analytics?

A

Data analysis is the collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making.
Data analytics in the simplest terms is the science of data. It’s a very broad concept that encompasses everything from the job of managing and using data to the tools and methods that data workers use each and every day.
So when you think about data, data analysis and the data ecosystem, it’s important to understand that all of these things fit under the data analytics umbrella.

UPDATE: Data analysis is a process, data analytics is a science.

24
Q

Who are subject matter experts?

A

Those who are familiar with the business problem and can review analysis results and help identify inconsistencies. Plus, their experience and human intuition are valuable to data-driven decision-making.

25
Q

What is data-driven decision-making?

A

Using facts to guide business strategy.

26
Q

What are analytical skills?

A

Qualities and characteristics associated with solving problems using facts.

27
Q

What are the 5 essential analytical skills?

A
  1. Curiosity - The analytical skill that involves wanting to learn something
  2. Understanding context - The analytical skill that has to do with how you group things into categories. Context describes the condition in which something exists or happens
  3. Having a technical mindset - The analytical skill that involves breaking processes down into smaller steps and working with them in an orderly and logical way
  4. Data design - The analytical skill that involves how someone organizes information
  5. Data strategy - The analytical skill that involves managing the processes and tools used in data analysis. It gives you a high-level view of the path you need to take to achieve your goals
28
Q

What is analytical thinking?

A

Identifying and defining a problem and then solving it by using data in an organized, step-by-step manner.

29
Q

What are the 5 key aspects to analytical thinking?

A
  1. visualization - the graphical representation of information
  2. strategy - helps data analysts see what they want to achieve with the data and how they can get there. Strategy also helps improve the quality and usefulness of the data we collect.
  3. problem-orientation - It’s all about keeping the problem top of mind throughout the entire project.
  4. correlation - Identifying a relationship between 2 or more pieces of data. (Correlation does not equal causation. In other words, just because two pieces of data are both trending in the same direction, that doesn’t necessarily mean one causes the other.)
  5. big-picture and detail-oriented thinking - Big-picture thinking is being able to zoom out and see possibilities and opportunities. Detail-oriented thinking is all about figuring out the specifics that will help you execute a plan.
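
The correlation item in the list above can be made concrete with a small sketch. It computes the Pearson correlation coefficient by hand for two made-up series. The function name `pearson` and both data series are hypothetical, chosen only for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the mean (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data, invented for illustration only.
ice_cream_sales = [10, 20, 30, 40, 50]
sunburn_cases = [2, 4, 6, 8, 10]

r = pearson(ice_cream_sales, sunburn_cases)
print(r)  # close to 1.0: the two series rise together
```

A coefficient near 1 says only that the two series move together. It does not show that ice cream causes sunburn; a third factor such as hot weather could plausibly drive both.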
30
Q

What are some questions data analysts ask?

A
  1. What is the root cause of a problem (i.e. the reason why it occurs)? The 5 whys can help answer that (asking “Why?” repeatedly until the answer reveals itself - see case study in “Use the five whys for root cause analysis”).
  2. Where are the gaps in our process? Gap analysis (which lets you examine and evaluate how a process works currently in order to get where you want to be in the future) can answer that.
  3. What did we not consider before? This is a great way to think about what information or procedure might be missing from a process, so you can identify ways to make better decisions and strategies moving forward.
31
Q

What is a quartile?

A

It divides data points into four equal parts or quarters.
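
As a minimal sketch of this idea, assuming a small made-up list of values, Python's standard `statistics.quantiles` returns the three cut points (Q1, the median, and Q3) that split the data into four groups.

```python
import statistics

# Hypothetical sample values, invented for illustration only.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 requests quartiles: three cut points that divide the data into quarters.
q1, q2, q3 = statistics.quantiles(values, n=4)
print(q1, q2, q3)  # 3.0 6.0 9.0
```

Different tools use slightly different interpolation rules, so quartile values for the same data can vary a little between a spreadsheet and a programming library.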

32
Q

What are the different stages of the life cycle of data?

A
  1. Plan - A business decides what kind of data it needs, how it will be managed throughout its life cycle, who will be responsible for it, and the optimal outcomes. This actually happens well before starting an analysis project.
  2. Capture - Data is collected from a variety of different sources (e.g. a database, which is a collection of data stored in a computer system) and brought into the organization.
  3. Manage - This is about how we care for our data, how and where it’s stored, the tools used to keep it safe and secure, and the actions taken to make sure that it’s maintained properly.
  4. Analyze - The data is used to solve problems, make great decisions, and support business goals.
  5. Archive - Storing data in a place where it’s still available but may not be used again.
  6. Destroy - Remove data from storage and delete any shared copies of the data.
33
Q

What are the main tools used by data analysts?

A
  1. Spreadsheets (Google Sheets, Excel) - Digital worksheets that store, organize, and sort data.
  2. Query languages (SQL) - Computer programming languages that allow you to retrieve and manipulate data from a database.
  3. Visualization tools (Tableau, Looker) - Tools that create graphical representations of information to help stakeholders come up with conclusions that lead to informed decisions and effective business strategies.
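
To illustrate the query-language item above, here is a minimal sketch that runs SQL through Python's built-in `sqlite3` module against an in-memory database. The `sales` table, its columns, and its rows are hypothetical and exist only for this example.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented for illustration only.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 50.0)],
)

# Retrieve data: total sales per region, in alphabetical order.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(rows)  # [('North', 170.0), ('South', 80.0)]
```

The `SELECT` statement retrieves data, while `CREATE TABLE` and `INSERT` manipulate it, which covers both halves of the definition.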
34
Q

What are the differences between spreadsheets and databases?

A

Spreadsheets: Accessed through a software application.
Databases: Accessed using a query language.

Spreadsheets: Structured data in a row and column format.
Databases: Structured data using rules and relationships.

Spreadsheets: Organize information in cells.
Databases: Organize information in complex collections.

Spreadsheets: Provide access to a limited amount of data.
Databases: Provide access to huge amounts of data.

Spreadsheets: Manual data entry.
Databases: Strict and consistent data entry.

Spreadsheets: Generally, one user at a time.
Databases: Multiple users.

Spreadsheets: Controlled by the user.
Databases: Controlled by a database management system.

In short, spreadsheets are suitable for organizing, cleaning, and analyzing small to medium datasets. Databases are ideal for storing, managing, and analyzing large and complex datasets.

35
Q

What is the difference between a formula and a function in a spreadsheet?

A

Formula: A set of instructions used to perform a calculation using the data in a spreadsheet. Formulas begin with an = sign, e.g. =A2+B2.

Function: A preset command that automatically performs a specified process or task using the data in a spreadsheet, e.g. SUM or AVERAGE.

36
Q

What are the 3 main features of a spreadsheet?

A

Cells (identified by a letter followed by a number), rows (ordered by number), and columns (ordered by letter).

37
Q

What is an attribute in a spreadsheet?

A

An attribute is a characteristic or quality of data used to label a column in a table. More commonly, attributes are referred to as column names, column labels, headers, or the header row.

38
Q

What is an observation in a spreadsheet?

A

An observation includes all of the attributes for something contained in a row of a data table.
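
As a rough sketch of how attributes and observations relate, a small table can be modeled in Python as a list of dictionaries: the keys play the role of attributes (column headers) and each dictionary is one observation (row). The table contents and the variable names below are hypothetical.

```python
# Hypothetical workforce table, invented for illustration only.
table = [
    {"name": "Ana", "department": "Sales", "tenure_years": 3},
    {"name": "Ben", "department": "Support", "tenure_years": 5},
]

# Attributes: the column labels shared by every row.
attributes = list(table[0].keys())

# Observation: all of the attribute values for one row.
first_observation = table[0]

print(attributes)         # ['name', 'department', 'tenure_years']
print(first_observation)  # {'name': 'Ana', 'department': 'Sales', 'tenure_years': 3}
```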