17 Practice Exam Two Flashcards

1
Q

What does EDA stand for?

A

Exploratory data analysis

2
Q

If the result was a type II error, what was your conclusion?

A

Gerbils and hamsters can lift the same amount

3
Q

What type of error does the following dataset represent?

A

Duplicate data

4
Q

Which of the following represents the percent of observations in each category as compared to the whole?

A

Percentage

5
Q

What is the interpretation of a p-value of 0.04 assuming an alpha of 0.05?

A

Accept the alternative hypothesis and reject the null hypothesis
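
The decision rule behind this card can be sketched in a few lines. This is a minimal illustration, assuming the common convention of rejecting the null hypothesis when the p-value is at or below alpha; the function name `decide` is made up for this example.

```python
def decide(p_value: float, alpha: float) -> str:
    """Return the hypothesis-test decision for a p-value and an alpha level."""
    # Assumed convention: reject the null when p is at or below alpha.
    if p_value <= alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


decision = decide(0.04, 0.05)
print(decision)  # reject the null hypothesis
```

With the values on this card, 0.04 is below 0.05, so the null hypothesis is rejected.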

6
Q

The idea that there will be no difference between the performance of two groups is what kind of hypothesis?

A

Null hypothesis

7
Q

Which visualization would be most appropriate for the relationship between the weight of a ferret and milk production?

A

A scatter plot

8
Q

A flat file delimited by commas is what file type?

A

CSV

9
Q

Which element should never be on the cover page of a report?

A

The appendix

10
Q

Data type validation is a process specifically used to avoid what type of error?

A

Invalid data

11
Q

What is an appropriate title for the following chart?

A

The Population of India Averaged for the Years 2015, 2016, and 2017 as Sub-Divided by Geographical Regions Determined by the 2018 Land Survey

12
Q

What does it mean for a dashboard to be real-time?

A

It shows the most up-to-date rates and figures available

13
Q

What is the act of automatically moving and analyzing online transactions called?

A

OLTP

14
Q

What does the following code snippet represent?
Data = 'This book makes me happy.'
Data = ['This', 'book', 'makes', 'me', 'happy', '.']

A

Parsing
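
One way to reproduce the transformation in this card is a small regular-expression split. This is only a sketch of one possible approach, not necessarily how the exam's snippet was produced; the variable names are illustrative.

```python
import re

data = "This book makes me happy."

# \w+ matches a run of word characters; [^\w\s] matches one punctuation mark.
tokens = re.findall(r"\w+|[^\w\s]", data)
print(tokens)  # ['This', 'book', 'makes', 'me', 'happy', '.']
```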

15
Q

Which of the following is a valid data storage solution for audio files?

A

A data lake

16
Q

What type of analysis is most appropriate for checking the efficiency of each phase of a production process?

A

Performance analysis

17
Q

Who is the most appropriate audience for a detailed report on grain-to-egg efficiency ratios?

A

Technical experts

18
Q

What would be the result of an outer join on the provided tables?

A

Joined Table with NULLs for unmatched records
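
The tables for this card are not reproduced here, so the sketch below uses two small hypothetical tables to show what a full outer join does: every key from either side is kept, and `None` stands in for NULL where the other table has no match.

```python
# Hypothetical tables keyed by an ID column (not the tables from the exam).
employees = {1: "Ana", 2: "Ben", 3: "Cho"}
salaries = {2: 52000, 3: 61000, 4: 48000}

# A full outer join keeps every key that appears in either table.
all_keys = sorted(employees.keys() | salaries.keys())

# dict.get returns None when a key is missing, which plays the role of NULL.
joined = [(key, employees.get(key), salaries.get(key)) for key in all_keys]

for row in joined:
    print(row)
```

Rows 1 and 4 each have one `None` field because they appear in only one of the two tables.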

19
Q

What type of report is most appropriate for a project manager at the end of every sprint?

A

A recurring report

20
Q

Find the mode of the following dataset: 5, 3, 8, 5, 3, 9, 3, 8, 2

A

3
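
The mode can be checked by counting how often each value occurs. A minimal sketch using the dataset from this card:

```python
from collections import Counter

data = [5, 3, 8, 5, 3, 9, 3, 8, 2]

# Count occurrences of each value and take the most frequent one.
counts = Counter(data)
mode_value, frequency = counts.most_common(1)[0]
print(mode_value, frequency)  # 3 3
```

The value 3 appears three times, more than any other value in the list.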

21
Q

What type of analysis is most appropriate for examining the connection between hours worked and mistakes made?

A

Link analysis

22
Q

What means of updating a table is represented by adding new values to the bottom?

A

Active record

23
Q

What conclusion can you draw from the following visualization?

A

Around 350 students achieved a grade of C or higher

24
Q

What type of schema consists only of normalized tables?

A

A snowflake schema

25
What is the following dataset an example of?
Recoding a category into a number
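
The dataset for this card is not reproduced, so here is a small sketch of recoding with hypothetical category labels. The labels and the numeric codes are assumptions chosen only for illustration.

```python
# Hypothetical category labels; the exam's dataset is not shown here.
sizes = ["small", "large", "medium", "small"]

# Assumed mapping from each category to a numeric code.
codes = {"small": 1, "medium": 2, "large": 3}

recoded = [codes[size] for size in sizes]
print(recoded)  # [1, 3, 2, 1]
```
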
26
What is a detailed program that explains how software performs a specific query called?
An execution plan
27
What conclusion can you draw from the following visualization regarding data access?
Half of everyone who can access data is either a marketing analyst or a business analyst
28
A small, highly specialized data storage solution following a star schema would most likely be what?
A data mart
29
What type of report is most appropriate for a detailed report on a potential merger?
A research report
30
Which type of schema has two levels of dimension tables?
A snowflake schema
31
Which variable indicates when a record stopped being active?
Active End
32
What is a key process of MDM?
Data consolidation
33
What type of visualization would be most appropriate for displaying the population of Europe by country?
A geographic map
34
What type of error does the following dataset represent?
Invalid data
35
What type of survey question does the following screenshot represent?
Single choice
36
Which of the following is a conditional operator?
OR
37
What is something to consider when checking for data quality?
Data integrity
38
What data-validating approach should you take if you believe the results of an analysis to be in error?
Data audits
39
Find the standard deviation of the following dataset: 62, 92, 43, 66, 37
21.7
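
The answer of 21.7 corresponds to the sample standard deviation (dividing by n - 1). A short check with Python's standard library, with the population version shown for contrast:

```python
import statistics

data = [62, 92, 43, 66, 37]

# Sample standard deviation divides the squared deviations by n - 1.
sample_sd = statistics.stdev(data)
# Population standard deviation divides by n instead.
population_sd = statistics.pstdev(data)

print(round(sample_sd, 1))      # 21.7
print(round(population_sd, 1))  # 19.4
```
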
40
What is the most appropriate visualization for expressing ideas held within a text file using natural language processing?
A word cloud
41
What is the single most important thing to do if you suspect that private data might have been breached?
Notify the impacted parties
42
What type of data is represented by the following dataset?
Structured
43
What should you do immediately after planning out your data story when creating a dashboard?
Get approval
44
In an A/B study, which p-value would cause you to fail to reject the null hypothesis, assuming an alpha of 0.1?
0.3
45
Which analysis is most appropriate for comparing the age of a customer persona against the normally distributed ages of actual customers?
Z-score
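
A z-score expresses how many standard deviations a value lies from the mean. The numbers below are hypothetical (a persona aged 35 against customers with a mean age of 40 and a standard deviation of 10) and serve only to illustrate the formula.

```python
def z_score(value: float, mean: float, sd: float) -> float:
    """Number of standard deviations between value and mean."""
    return (value - mean) / sd


# Hypothetical figures, not taken from the exam.
z = z_score(35, 40, 10)
print(z)  # -0.5
```

A negative result means the persona is younger than the average customer.
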
46
What sort of cardinality do the provided employee tables have?
One-to-one
47
What variable type would a variable called BirdPassed that tracks whether a bird passed by your window be?
Binary
48
What is the difference between your average clicks per minute and your competitor's?
7%
49
What is the average number of clicks per minute for the year for your website?
12.8
50
What is the average number of clicks per minute for the year for a major competitor?
13.7
51
What is the difference in clicks per minute between your website and your competitor's website?
0.9 clicks per minute (about 7%)
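
The click figures on the preceding cards (12.8 and 13.7) can be compared in absolute and relative terms. This sketch shows both the percent difference (relative to the average of the two values) and the percent change (relative to your own figure); which of the two the exam intends is an assumption.

```python
yours = 12.8
competitor = 13.7

# Absolute gap in clicks per minute.
absolute_difference = competitor - yours

# Percent difference: gap relative to the average of the two values.
percent_difference = abs(yours - competitor) / ((yours + competitor) / 2) * 100

# Percent change: gap relative to your own figure.
percent_change = (competitor - yours) / yours * 100

print(round(absolute_difference, 1))  # 0.9
print(round(percent_difference))      # 7
print(round(percent_change))          # 7
```
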
52
Given a t-value of 1.86, what is the confidence interval for the dataset 8, 7, 8, 8, 10, 6, 8, 8, 9, 8?
7.3 to 8.7
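
A confidence interval of this kind is usually computed as the mean plus or minus t times the standard error. The sketch below assumes the sample standard deviation and a standard error of s / sqrt(n); exact endpoints depend on that convention and on rounding, so a printed answer may differ by about a tenth. Either way, the interval is centered on the sample mean of 8.

```python
import math
import statistics

data = [8, 7, 8, 8, 10, 6, 8, 8, 9, 8]
t_value = 1.86  # given in the question

mean = statistics.mean(data)
# Assumed convention: sample standard deviation over sqrt(n).
standard_error = statistics.stdev(data) / math.sqrt(len(data))
margin = t_value * standard_error

lower, upper = mean - margin, mean + margin
print(round(lower, 2), round(upper, 2))  # 7.38 8.62
```
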
53
What type of join is represented in the following example?
Outer join
54
What analysis specifically tells you whether or not two categorical variables are related?
Chi-square
55
A social media ID is considered what type of protected data?
PII
56
What is duplicate data?
The same information recorded in multiple rows
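
As an illustration, duplicate rows can be found and dropped while keeping the first occurrence of each. The rows below are hypothetical.

```python
# Hypothetical rows; the second "A100" entry repeats the first.
rows = [("A100", "Ana"), ("A101", "Ben"), ("A100", "Ana")]

# dict.fromkeys keeps the first occurrence of each row and preserves order.
unique_rows = list(dict.fromkeys(rows))
duplicate_count = len(rows) - len(unique_rows)

print(unique_rows)      # [('A100', 'Ana'), ('A101', 'Ben')]
print(duplicate_count)  # 1
```
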
57
What type of error is represented by a specification mismatch?
Specification mismatch
58
Which of the following is considered a public source of data?
Web scraping
59
Average, sum, and count are all examples of what?
Reduction
60
Under which circumstances should you check the quality of your data?
Data acquisition
61
What data-validating approach should you take if you need a formal process to apply to an entire database?
Data audits
62
What analysis is most appropriate to predict how wide a tree must be to hold a 400 lb sumo wrestler?
Simple linear regression
63
Making sure your data is not full of gaps and missing data is considered what data quality dimension?
Completeness
64
The following chart represents what type of distribution?
Normal
65
A person’s medical record is considered what type of protected data?
PHI
66
In general, dashboards are considered what type of report?
A self-service report
67
What type of analysis is most appropriate for predicting future values based on historical data?
Trend analysis
68
Which analytical tool is specialized for visualizations?
AWS QuickSight
69
What is the most appropriate data range for a report on machine efficiency at the end of the week?
Weeks
70
Deleting only the missing values and only as they are needed is what type of deletion?
Pairwise deletion
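
The contrast with listwise deletion makes the idea clearer. In this sketch with hypothetical records, pairwise deletion skips a missing value only for the calculation that needs it, while listwise deletion drops every row that has any missing value. The function names are made up for this example.

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"height": 170, "weight": 65},
    {"height": None, "weight": 80},
    {"height": 180, "weight": None},
    {"height": 160, "weight": 55},
]


def mean_pairwise(records, column):
    """Mean of one column, skipping only rows where that column is missing."""
    values = [r[column] for r in records if r[column] is not None]
    return sum(values) / len(values)


def mean_listwise(records, column):
    """Mean of one column after dropping rows with any missing value."""
    complete = [r for r in records if all(v is not None for v in r.values())]
    return sum(r[column] for r in complete) / len(complete)


print(mean_pairwise(rows, "height"))  # 170.0
print(mean_listwise(rows, "height"))  # 165.0
```

Pairwise deletion uses three height values here, whereas listwise deletion keeps only the two fully complete rows.
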
71
What type of report is most appropriate when requested for a one-time business question?
A static report
72
What part of the dashboard should you update to save time if you receive repeated questions?
The FAQs
73
Unstructured databases include which of the following data types?
Undefined fields and machine data
74
Which file type can be used to structure a website or pass data through a website?
XML
75
Find the middle quartile (Q2) of the dataset 70, 21, 34, 48, 27.
34
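
Q2 is the median, so sorting the values and taking the middle one gives the answer. A quick check with the standard library:

```python
import statistics

data = [70, 21, 34, 48, 27]

# Sorted, the values are 21, 27, 34, 48, 70; the middle one is Q2.
q2 = statistics.median(data)
print(sorted(data))  # [21, 27, 34, 48, 70]
print(q2)            # 34
```
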
76
Watching things and taking notes as a form of data collection is called what?
Observation
77
What is the name of the action performed on a dataset when sorting?
Sorting
78
What do you call the process of filling gaps in the data by calculating the most likely value?
Imputation
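
One simple way to fill gaps is to replace each missing value with the mean of the observed values. This is only one of several possible methods, and the readings below are hypothetical.

```python
# Hypothetical readings; None marks a gap in the data.
readings = [10.0, None, 14.0, 12.0]

# Use the mean of the observed values as the estimate for each gap.
observed = [value for value in readings if value is not None]
fill_value = sum(observed) / len(observed)

filled = [fill_value if value is None else value for value in readings]
print(filled)  # [10.0, 12.0, 14.0, 12.0]
```
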
79
How do nonparametric distributions relate to normal distributions?
Nonparametric distributions are never normal
80
What type of analysis would be most appropriate to analyze the relationship between an employee’s job title and hair color?
Chi-square test for independence
81
What happens during a delta load?
Only load information that is new or has changed
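
A delta load can be sketched as a comparison between the source and what the target already holds. The tables below are hypothetical dictionaries keyed by ID; real pipelines usually rely on timestamps or change logs rather than a full comparison.

```python
# Hypothetical target (already loaded) and source (current) tables.
target = {1: "Ana", 2: "Ben"}
source = {1: "Ana", 2: "Benjamin", 3: "Cho"}

# Keep only rows that are new or whose value has changed.
delta = {
    key: value
    for key, value in source.items()
    if key not in target or target[key] != value
}

target.update(delta)
print(delta)   # {2: 'Benjamin', 3: 'Cho'}
print(target)  # {1: 'Ana', 2: 'Benjamin', 3: 'Cho'}
```

Row 1 is unchanged, so it is not loaded again.
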
82
What type of database schema is represented by a snowflake schema?
A snowflake schema
83
What type of visualization is most appropriate for showing the distribution of shirt sizes sold?
A histogram
84
Which of the following would you find in a structured database?
Key-value pairs
85
What analysis compares quantitative variables to see whether there is a relationship between them?
Correlation
86
What security process is described by translating data from plaintext to ciphertext?
Data encryption
87
What type of analysis compares two groups of quantitative variables to determine significant differences?
T-test
88
What is a major benefit of MDM?
Streamlining data access
89
What is the most suitable approach for creating a dashboard that automatically refreshes weekly?
Scheduled delivery
90
Which section of the data use agreement includes information on data destruction?
Data deletion
91
What is an execution plan?
An execution plan
Footnote: Review Chapter 3, Collecting Data – Optimizing Query Structure
92
Who comprises half of everyone who can access data?
Marketing analyst or business analyst
Footnote: Review Chapter 13, Common Visualizations – Charting Lines, Circles, and Dots
93
What is a data mart?
A data mart
Footnote: Review Chapter 2, Data Structures, Types, and Formats – Understanding the Concept of Warehouses and Lakes
94
What is a research report?
A research report
Footnote: Review Chapter 11, Types of Reports – Understanding Ad hoc and Research Reports
95
What schema type is described as a snowflake schema?
A snowflake schema
Footnote: Review Chapter 2, Data Structures, Types, and Formats – Going Through the Data Schema and its Types
96
What is the term for the end of an active data process?
Active End
Footnote: Review Chapter 2, Data Structures, Types, and Formats – Updating Stored Data
97
What is data consolidation?
Data consolidation
Footnote: Review Chapter 15, Data Quality and Management – Understanding Master Data Management (MDM)
98
What type of visualization is a geographic map?
A geographic map
Footnote: Review Chapter 13, Common Visualizations – Understanding Heat Maps, Tree Maps, and Geographic Maps
99
What does specification mismatch refer to?
Specification mismatch
Footnote: Review Chapter 4, Cleaning and Processing Data – Understanding Invalid Data, Specification Mismatch, and Data Type Validation
100
What type of question is a single choice question?
Single choice
Footnote: Review Chapter 3, Collecting Data – Collecting Your Own Data
101
What is the logical operator represented by 'OR'?
OR
Footnote: Review Chapter 5, Data Wrangling and Manipulation – Shaping Data with Common Functions
102
What is data integrity?
Data integrity
Footnote: Review Chapter 15, Data Quality and Management – Understanding Quality Control
103
What are reasonable expectations in data quality?
Reasonable expectations
Footnote: Review Chapter 15, Data Quality and Management – Validating Quality
104
What is the standard deviation value mentioned?
21.7
Footnote: Review Chapter 7, Measures of Central Tendency and Dispersion – Finding Variance and Standard Deviation
105
What type of visualization is a word cloud?
A word cloud
Footnote: Review Chapter 13, Common Visualizations – Understanding Infographics and Word Clouds
106
What should be done when data issues arise?
Notify the impacted parties
Footnote: Review Chapter 14, Data Governance – Knowing Use Requirements
107
What data type is structured?
Structured
Footnote: Review Chapter 2, Data Structures, Types, and Formats – Understanding Structured and Unstructured Data
108
What step immediately follows planning your data story in the report development process?
Get approval
Footnote: Review Chapter 12, Reporting Process – Understanding the Report Development Process
109
What is the p-value mentioned?
0.3
Footnote: Review Chapter 9, Hypothesis Testing – Learning p-Value and Alpha
110
What is a Z-score?
Z-score
Footnote: Review Chapter 8, Common Techniques in Descriptive Statistics – Understanding Z-Scores
111
What type of relationship is one-to-one?
One-to-one
Footnote: Review Chapter 14, Data Governance – Handling Entity Relationship Requirements
112
What data type is binary?
Binary
Footnote: Review Chapter 2, Data Structures, Types, and Formats – Going Through Data Types and File Types
113
What is the percentage mentioned?
7%
Footnote: Review Chapter 8, Common Techniques in Descriptive Statistics – Calculating Percent Change and Percent Difference
114
What is the confidence interval range provided?
7.3 to 8.7
Footnote: Review Chapter 8, Common Techniques in Descriptive Statistics – Discovering Confidence Intervals
115
What type of join is a left join?
Left join
Footnote: Review Chapter 5, Data Wrangling and Manipulation – Merging Data
116
What statistical test is known as Chi-square?
Chi-square
Footnote: Review Chapter 10, Introduction to Inferential Statistics – Knowing Chi-Square
117
What does PII stand for?
Personally identifiable information (PII)
Footnote: Review Chapter 14, Data Governance – Understanding Data Classifications
118
What does duplicate data refer to?
The same information recorded in multiple rows
Footnote: Review Chapter 4, Cleaning and Processing Data – Managing Duplicate and Redundant Data
119
What is an outlier?
An outlier
Footnote: Review Chapter 4, Cleaning and Processing Data – Finding Outliers
120
What are web services in data collection?
Web services
Footnote: Review Chapter 3, Collecting Data – Utilizing Public Sources of Data
121
What does reduction refer to in data manipulation?
Reduction
Footnote: Review Chapter 5, Data Wrangling and Manipulation – Calculating Derived and Reduced Variables
122
What is data acquisition?
Data acquisition
Footnote: Review Chapter 15, Data Quality and Management – Understanding Quality Control
123
What is data profiling?
Data profiling
Footnote: Review Chapter 15, Data Quality and Management – Validating Quality
124
What is simple linear regression?
Simple linear regression
Footnote: Review Chapter 10, Introduction to Inferential Statistics – Simple Linear Regression
125
What is transposition in data manipulation?
Transposition
Footnote: Review Chapter 5, Data Wrangling and Manipulation – Shaping Data with Common Functions
126
What does completeness refer to in data quality?
Completeness
Footnote: Review Chapter 15, Data Quality and Management – Understanding Quality Control
127
What is the distribution type mentioned?
Uniform
Footnote: Review Chapter 7, Measures of Central Tendency and Dispersion – Discovering Distributions
128
What does PHI stand for?
Protected health information (PHI)
Footnote: Review Chapter 14, Data Governance – Understanding Data Classifications
129
What is a self-service report?
A self-service report
Footnote: Review Chapter 11, Types of Reports – Knowing about Self-Service Reports
130
What type of analysis is trend analysis?
Trend analysis
Footnote: Review Chapter 6, Types of Analytics – Discovering Trends
131
What analytical tool is AWS QuickSight?
AWS QuickSight
Footnote: Review Chapter 11, Types of Reports – Knowing Important Analytical Tools
132
What time unit is mentioned for making a report?
Weeks
Footnote: Review Chapter 12, Reporting Process – Knowing What to Consider When Making a Report
133
What method is known as pairwise deletion?
Pairwise deletion
Footnote: Review Chapter 4, Cleaning and Processing Data – Dealing with Missing Data
134
What is a static report?
A static report
Footnote: Review Chapter 11, Types of Reports – Distinguishing Static and Dynamic Reports
135
What are the FAQs in reporting?
The FAQs
Footnote: Review Chapter 12, Reporting Process – Understanding Report Elements
136
What are undefined fields and machine data?
Undefined fields and machine data
Footnote: Review Chapter 2, Data Structures, Types, and Formats – Understanding Structured and Unstructured Data
137
What data format is XML?
XML
Footnote: Review Chapter 2, Data Structures, Types, and Formats – Going Through Data Types and File Types
138
What is the middle quartile (Q2) value mentioned?
34
Footnote: Review Chapter 7, Measures of Central Tendency and Dispersion – Calculating Range and Quartiles
139
What is an observation in data collection?
Observation
Footnote: Review Chapter 3, Collecting Data – Collecting Your Own Data
140
What does filtering refer to in data collection?
Filtering
Footnote: Review Chapter 3, Collecting Data – Optimizing Query Structure
141
What is interpolation in data processing?
Interpolation
Footnote: Review Chapter 4, Cleaning and Processing Data – Dealing with Missing Data
142
Are nonparametric distributions ever normal?
Nonparametric distributions are never normal
Footnote: Review Chapter 4, Cleaning and Processing Data – Understanding Non-Parametric Data
143
What is the Chi-square test for independence?
Chi-square test for independence
Footnote: Review Chapter 10, Introduction to Inferential Statistics – Knowing Chi-Square
144
What does it mean to only load new or changed information?
Only load information that is new or has changed
Footnote: Review Chapter 3, Collecting Data – Differentiating ETL and ELT
145
What is a snowflake schema?
A snowflake schema
Footnote: Review Chapter 2, Data Structures, Types, and Formats – Going Through the Data Schema and its Types
146
What is a histogram?
A histogram
Footnote: Review Chapter 13, Common Visualizations – Comprehending Charts with Bars
147
What are key-value pairs?
Key-value pairs
Footnote: Review Chapter 2, Data Structures, Types, and Formats – Understanding Structured and Unstructured Data
148
What is correlation in statistics?
Correlation
Footnote: Review Chapter 10, Introduction to Inferential Statistics – Calculating Correlations
149
What is data encryption?
Data encryption
Footnote: Review Chapter 14, Data Governance – Understanding Data Security
150
What is a T-test?
T-test
Footnote: Review Chapter 10, Introduction to Inferential Statistics – Understanding T-Tests
151
What does streamlining data access refer to?
Streamlining data access
Footnote: Review Chapter 15, Data Quality and Management – Understanding Master Data Management (MDM)
152
What is the term for subscription in reporting?
Subscription
Footnote: Review Chapter 12, Reporting Process – Understanding Report Delivery
153
What does data deletion refer to?
Data deletion
Footnote: Review Chapter 14, Data Governance – Knowing Use Requirements