Glossary Flashcards

1
Q

A/B testing

A

The process of testing two variations of the same web page to determine which page is more successful at attracting user traffic and generating revenue.

2
Q

Absolute reference

A

A reference within a function that is locked so that rows and columns won’t change if the function is copied.

3
Q

Access control

A

Features such as password protection, user permissions, and encryption that are used to protect a spreadsheet.

4
Q

Accuracy

A

The degree to which data conforms to the actual entity being measured or described.

5
Q

Action-oriented question

A

A question whose answers lead to change.

6
Q

Administrative metadata

A

Metadata that indicates the technical source of a digital asset.

7
Q

Aesthetic (R)

A

A visual property of an object in a plot.

8
Q

Agenda

A

A list of scheduled appointments.

9
Q

Aggregation

A

The process of collecting or gathering many separate pieces into a whole.

10
Q

Algorithm

A

A process or set of rules followed for a specific task.

11
Q

Aliasing

A

Temporarily naming a table or column in a query to make it easier to read and write.
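
A minimal sketch of aliasing, assuming Python's built-in sqlite3 module as a stand-in SQL engine. The table and column names are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_orders (customer_name TEXT, order_total REAL)")
conn.execute("INSERT INTO customer_orders VALUES ('Ada', 25.0)")

# `co` aliases the table and `name`/`total` alias the columns, for this query only.
cursor = conn.execute(
    "SELECT co.customer_name AS name, co.order_total AS total "
    "FROM customer_orders AS co"
)
alias_columns = [description[0] for description in cursor.description]
alias_rows = cursor.fetchall()
print(alias_columns, alias_rows)
```

The aliases change only how the result is labeled; the underlying table keeps its original names.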

12
Q

Alternative text

A

Text that provides an alternative to non-text content such as images and videos.

13
Q

Analytical skills

A

Qualities and characteristics associated with using facts to solve problems.

14
Q

Analytical thinking

A

The process of identifying and defining a problem, then solving it by using data in an organized step-by-step manner.

15
Q

Annotation

A

Text that briefly explains data or helps focus the audience on a particular aspect of the data in a visualization.

16
Q

Anscombe’s quartet

A

Four datasets that have nearly identical summary statistics but contain different plotted values.
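
As a rough illustration, the sketch below compares the first two of the four datasets using only Python's statistics module. The values are the commonly published figures for Anscombe's quartet, typed in by hand here, so treat them as an assumption to check against a trusted copy.

```python
from statistics import mean, variance

# First two of Anscombe's four datasets; they share the same x values.
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

# Print the mean and sample variance of y for each dataset.
for label, ys in (("I", y1), ("II", y2)):
    print(label, round(mean(ys), 2), round(variance(ys), 2))
```

The summaries come out almost the same even though the individual points differ, which is why plotting the data matters.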

17
Q

Area chart

A

A data visualization that uses individual data points for a changing variable connected by a continuous line with a filled-in area underneath.

18
Q

Argument (R)

A

Information needed by a function in R in order to run.

19
Q

Arithmetic operator

A

An operator used to perform basic math operations such as addition, subtraction, multiplication, and division.

20
Q

Array

A

A collection of values in spreadsheet cells.

21
Q

Assignment operator

A

An operator used to assign values to variables and vectors.

22
Q

Attribute

A

A characteristic or quality of data used to label a column in a table.

23
Q

Audio file

A

Digitized audio storage usually in an MP3, AAC, or other compressed format.

24
Q

AVERAGE

A

A spreadsheet function that returns an average of the values from a selected range.

25
Q

AVERAGEIF

A

A spreadsheet function that returns the average of all cell values from a given range that meet a specified condition.

26
Q

Bad data source

A

A data source that is not reliable, original, comprehensive, current, and cited (ROCCC).

27
Q

Balance

A

The design principle of creating aesthetic appeal and clarity in a data visualization by evenly distributing visual elements.

28
Q

Bar graph

A

A data visualization that uses size to contrast and compare two or more values.

29
Q

Bias

A

A conscious or subconscious preference in favor of or against a person, group of people, or thing.

30
Q

Big data

A

Large, complex datasets, typically involving long periods of time, that enable data analysts to address far-reaching business problems.

31
Q

Boolean data

A

A data type with only two possible values, usually true or false.

32
Q

Borders

A

Lines that can be added around two or more cells on a spreadsheet.

33
Q

Box plot

A

A data visualization that displays the distribution of values along an x-axis.

34
Q

Bubble chart

A

A data visualization that displays individual data points as bubbles, comparing numeric values by their relative size.

35
Q

Bullet graph

A

A data visualization that displays data as a horizontal bar chart moving toward a desired value.

36
Q

Business metric

A

A standard of measurement used to solve a business task.

37
Q

Business task

A

The question or problem data analysis resolves for a business.

38
Q

C#

A

An object-oriented programming language used to create games and mobile apps in the .NET open source developer platform.

39
Q

C++

A

An extension of the C programming language that is used to create console games such as those for Xbox.

40
Q

Calculated field

A

A new field within a pivot table that carries out certain calculations based on the values of other fields.

41
Q

Calculus

A

A branch of mathematics that involves the study of rates of change and the changes between values that are related by a function.

42
Q

CASE

A

A SQL statement that returns records that meet conditions by including an if/then statement in a query.
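
A minimal sketch of a CASE expression, assuming Python's built-in sqlite3 module as the SQL engine. The `scores` table and the 60-point pass mark are made up for illustration.

```python
import sqlite3

# Hypothetical table; the 60-point threshold is an assumption for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", [("Ana", 91), ("Ben", 58)])

# CASE checks its WHEN conditions in order and returns the first matching result.
case_rows = conn.execute(
    """
    SELECT student,
           CASE WHEN score >= 60 THEN 'pass' ELSE 'fail' END AS result
    FROM scores
    ORDER BY student
    """
).fetchall()
print(case_rows)
```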

43
Q

Case study

A

A common way for employers to assess job skills and gain insight into how a candidate approaches common data-related challenges.

44
Q

CAST

A

A SQL function that converts data from one datatype to another.
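
A short sketch of CAST, assuming Python's built-in sqlite3 module as the SQL engine. Type names and conversion rules vary between SQL dialects, so this shows SQLite's behavior only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The text '42' is converted to an integer, and the integer 7 to text.
cast_row = conn.execute(
    "SELECT CAST('42' AS INTEGER), CAST(7 AS TEXT)"
).fetchone()
print(cast_row)
```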

45
Q

Causation

A

When an action directly leads to an outcome, such as a cause-effect relationship.

46
Q

Cell reference

A

A cell or a range of cells in a worksheet typically used in formulas and functions.

47
Q

Changelog

A

A file containing a chronologically ordered list of modifications made to a project.

48
Q

Channel

A

A visual aspect or variable that represents characteristics of the data in a visualization.

49
Q

Chart

A

A graphical representation of data from a worksheet.

50
Q

Circle view

A

A data visualization that shows comparative strength in data.

51
Q

Clean data

A

Data that is complete, correct, and relevant to the problem being solved.

52
Q

Cloud

A

A place to keep data online rather than a computer hard drive.

53
Q

Cluster

A

A collection of data points on a data visualization with similar values.

54
Q

COALESCE

A

A SQL function that returns the first non-null value in a list.
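
A minimal sketch of COALESCE, which returns the first of its arguments that is not null, assuming Python's built-in sqlite3 module as the SQL engine. The `contacts` table and the phone numbers are hypothetical.

```python
import sqlite3

# Hypothetical contact list; None is stored as SQL NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, mobile TEXT, landline TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [("Ana", None, "555-0100"), ("Ben", "555-0199", "555-0101"), ("Cy", None, None)],
)

# For each row, take the mobile number, else the landline, else a fallback label.
coalesce_rows = conn.execute(
    "SELECT name, COALESCE(mobile, landline, 'no phone') "
    "FROM contacts ORDER BY name"
).fetchall()
print(coalesce_rows)
```

Placing a literal last in the argument list is a common way to guarantee a non-null result.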

55
Q

Code chunk

A

A piece of code added in an R Markdown file that is used to process, visualize, or analyze data.

56
Q

Coding

A

The process of writing instructions to a computer in the syntax of a specific programming language.

57
Q

Column chart

A

A data visualization that uses individual data points for a changing variable represented as vertical columns.

58
Q

Combo chart

A

A data visualization that combines more than one visualization type.

59
Q

Compatibility

A

How well two or more datasets are able to work together.

60
Q

Completeness

A

The degree to which data contains all desired components or measures.

61
Q

Computer programming

A

The process of giving instructions to a computer in order to perform an action or set of actions.

62
Q

CONCAT

A

A SQL function that adds strings together to create new text strings that can be used as unique keys.

63
Q

CONCATENATE

A

A spreadsheet function that joins together two or more text strings.

64
Q

Conditional formatting

A

A spreadsheet tool that changes how cells appear when values meet specific conditions.

65
Q

Conditional statement

A

A declaration that if a certain condition holds, then a certain event must take place.

66
Q

Confidence interval

A

A range of values that conveys how likely a statistical estimate is to reflect the population.
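
One common way to build such a range for a sample mean is the normal approximation, sketched below with Python's statistics module. The sample values are made up, the multiplier 1.96 corresponds to a 95% confidence level, and for a sample this small a t-distribution multiplier would usually be the better choice.

```python
from math import sqrt
from statistics import mean, stdev


def mean_confidence_interval(sample, z=1.96):
    """Return (low, high) around the sample mean using a normal approximation."""
    # Margin of error: z times the standard error of the mean.
    margin = z * stdev(sample) / sqrt(len(sample))
    center = mean(sample)
    return center - margin, center + margin


sample = [12, 15, 11, 14, 13, 16, 12, 15]  # made-up measurements
low, high = mean_confidence_interval(sample)
print(round(low, 2), round(high, 2))
```

A larger sample or a smaller spread narrows the interval; a higher confidence level widens it.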

67
Q

Confidence level

A

The probability that a sample accurately reflects the greater population.

68
Q

Confirmation bias

A

The tendency to search for or interpret information in a way that confirms pre-existing beliefs.

69
Q

Consent

A

The aspect of data ethics that presumes an individual’s right to know how and why their personal data will be used before agreeing to provide it.

70
Q

Consistency

A

The degree to which data is repeatable from different points of entry or collection.

71
Q

Context

A

The condition in which something exists or happens.

72
Q

Continuous data

A

Data that is measured and can have almost any numeric value.

73
Q

CONVERT

A

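A function that changes the unit of measurement of a value in data. In spreadsheets, CONVERT performs unit conversion; in some SQL dialects, CONVERT instead changes a value's data type.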

74
Q

Cookie

A

A small file stored on a computer that contains information about its users.

75
Q

Correlation

A

The measure of the degree to which two variables change in relationship to each other.
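
One widely used measure is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand in Python; the function name and sample lists are illustrative assumptions, and the card itself does not name a specific coefficient.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)


hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # moves in step with hours
falling = [10, 8, 6, 4, 2]  # moves opposite to hours
print(pearson_r(hours, rising), pearson_r(hours, falling))
```

A value near 1 or -1 indicates a strong linear relationship; it does not by itself show that one variable causes the other.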

76
Q

COUNT

A

A spreadsheet function that counts the number of cells within a range that contain numeric values.

77
Q

COUNTA

A

A spreadsheet function that counts the total number of values within a specified range, that is, the cells that are not empty.

78
Q

COUNTIF

A

A spreadsheet function that returns the number of cells within a range that match a specified value.

79
Q

COUNT DISTINCT

A

A SQL function that returns the number of distinct values in a specified column.
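
The sketch below contrasts `SELECT DISTINCT`, which lists the unique values, with `COUNT(DISTINCT ...)`, which counts them. It assumes Python's built-in sqlite3 module as the SQL engine and a hypothetical `visits` table.

```python
import sqlite3

# Hypothetical table with repeated city names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (city TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?)",
    [("Lima",), ("Oslo",), ("Lima",), ("Oslo",), ("Kyiv",)],
)

# DISTINCT removes duplicate rows from the result.
distinct_values = [
    row[0] for row in conn.execute("SELECT DISTINCT city FROM visits ORDER BY city")
]
# COUNT(DISTINCT ...) returns how many different values there are.
distinct_count = conn.execute("SELECT COUNT(DISTINCT city) FROM visits").fetchone()[0]
print(distinct_values, distinct_count)
```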

80
Q

CRAN (R)

A

An online archive with R packages, source code, manuals, and documentation.

81
Q

CREATE TABLE

A

A SQL statement that adds a new table to a database, where it can be used by multiple people.

82
Q

Cross-field validation

A

A process that ensures certain conditions for multiple data fields are satisfied.

83
Q

CSS

A

A style sheet language used for web page design that controls graphic elements and page presentation.

84
Q

CSV

A

A delimited text file that uses a comma to separate values.
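
A small sketch of reading comma-separated text with Python's built-in csv module. The sample text is made up; the quoted field shows how a value that itself contains a comma is kept in one piece.

```python
import csv
import io

# Made-up CSV text; the last field is quoted because it contains a comma.
csv_text = 'name,city\nAna,Lima\nBen,"Oslo, Norway"\n'

# csv.reader splits each line on the delimiter while respecting quotes.
csv_rows = list(csv.reader(io.StringIO(csv_text)))
print(csv_rows)
```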

85
Q

Currency

A

The aspect of data ethics that presumes individuals should be aware of financial transactions resulting from the use of their personal data and the scale of those transactions.

86
Q

Dashboard

A

A tool that monitors live incoming data.

87
Q

Data

A

A collection of facts.

88
Q

Data aggregation

A

The process of gathering data from multiple sources and combining it into a single summarized collection.

89
Q

Data analysis

A

The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making.

90
Q

Data analysis process

A

The six phases of ask, prepare, process, analyze, share, and act, whose purpose is to gain insights that drive informed decision-making.

91
Q

Data analyst

A

Someone who collects, transforms, and organizes data in order to draw conclusions, make predictions, and drive informed decision-making.

92
Q

Data analytics

A

The science of data.

93
Q

Data anonymization

A

The process of protecting people’s private or sensitive data by eliminating identifying information.

94
Q

Data bias

A

When a preference in favor of or against a person, group of people, or thing systematically skews data analysis results in a certain direction.

95
Q

Data blending

A

A Tableau method that combines data from multiple data sources.

96
Q

Data composition

A

The process of combining the individual parts in a visualization and displaying them together as a whole.

97
Q

Data constraints

A

The criteria that determine whether a piece of data is clean and valid.

98
Q

Data design

A

How information is organized.

99
Q

Data-driven decision-making

A

Using facts to guide business strategy.

100
Q

Data ecosystem

A

The various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data.

101
Q

Data element

A

A piece of information in a dataset.

102
Q

Data engineer

A

A professional who transforms data into a useful format for analysis and gives it a reliable infrastructure.

103
Q

Data ethics

A

Well-founded standards of right and wrong that dictate how data is collected, shared, and used.

104
Q

Data frame

A

A collection of columns containing data similar to a spreadsheet or SQL table.

105
Q

Data governance

A

A process for ensuring the formal management of a company’s data assets.

106
Q

Data-inspired decision-making

A

Exploring different data sources to find out what they have in common.

107
Q

Data integrity

A

The accuracy, completeness, consistency, and trustworthiness of data throughout its life cycle.

108
Q

Data interoperability

A

The ability to integrate data from multiple sources and a key factor leading to the successful use of open data among companies and governments.

109
Q

Data life cycle

A

The sequence of stages that data experiences, which include plan, capture, manage, analyze, archive, and destroy.

110
Q

Data manipulation

A

The process of changing data to make it more organized and easier to read.

111
Q

Data mapping

A

The process of matching fields from one data source to another.

112
Q

Data merging

A

The process of combining two or more datasets into a single dataset.

113
Q

Data model

A

A tool for organizing data elements and how they relate to one another.

114
Q

Data privacy

A

Preserving a data subject’s information any time a data transaction occurs.

115
Q

Data range

A

Numerical values that fall between predefined maximum and minimum values.

116
Q

Data replication

A

The process of storing data in multiple locations.

117
Q

Data science

A

A field of study that uses raw data to create new ways of modeling and understanding the unknown.

118
Q

Data security

A

Protecting data from unauthorized access or corruption by adopting safety measures.

119
Q

Data storytelling

A

Communicating the meaning of a dataset with visuals and a narrative that are customized for an audience.

120
Q

Data strategy

A

The management of the people, processes, and tools used in data analysis.

121
Q

Data structure

A

A format for organizing and storing data.

122
Q

Data transfer

A

The process of copying data from a storage device to computer memory or from one computer to another.

123
Q

Data type

A

An attribute that describes a piece of data based on its values, its programming language, or the operations it can perform.

124
Q

Data validation

A

A tool for checking the accuracy and quality of data.

125
Q

Data validation process

A

The process of checking and rechecking the quality of data so that it is complete, accurate, secure, and consistent.

126
Q

Data visualization

A

The graphical representation of data.

127
Q

Data warehousing specialist

A

A professional who develops processes and procedures to effectively store and organize data.

128
Q

Database

A

A collection of data stored in a computer system.

129
Q

Dataset

A

A collection of data that can be manipulated or analyzed as one unit.

130
Q

DATEDIF

A

A spreadsheet function that calculates the number of days, months, or years between two dates.

131
Q

Decision tree

A

A tool that helps analysts make decisions about critical features of a visualization.

132
Q

Delimiter

A

A character that indicates the beginning or end of a data item.

133
Q

Density map

A

A data visualization that represents concentrations, with color indicating the number or frequency of data points in a given area on a map.

134
Q

Descriptive metadata

A

Metadata that describes a piece of data and can be used to identify it at a later point in time.

135
Q

Design thinking

A

A process used to solve complex problems in a user-centric way.

136
Q

Digital photo

A

An electronic or computer-based image, usually in BMP or JPG format.

137
Q

Dirty data

A

Data that is incomplete, incorrect, or irrelevant to the problem to be solved.

138
Q

Discrete data

A

Data that is counted and has a limited number of values.

139
Q

DISTINCT

A

A keyword that is added to a SQL SELECT statement to retrieve only non-duplicate entries.

140
Q

Distribution graph

A

A data visualization that displays the frequency of various outcomes in a sample.

141
Q

Diverging color palette

A

A color theme that displays two ranges of data values using two different hues with color intensity representing the magnitude of the values.

142
Q

Donut chart

A

A data visualization where segments of a ring represent data values adding up to a whole.

143
Q

dplyr (R)

A

An R package in Tidyverse that offers a consistent set of functions to complete common data-manipulation tasks.

144
Q

DROP TABLE

A

A SQL statement that removes a table from a database.

145
Q

Duplicate data

A

Any record that inadvertently shares data with another record.

146
Q

Dynamic visualizations

A

Data visualizations that are interactive or change over time.

147
Q

Elevator pitch

A

A short statement describing an idea or concept.

148
Q

Emphasis

A

The design principle of arranging visual elements to focus the audience’s attention on important information in a data visualization.

149
Q

Engagement

A

Capturing and holding someone’s interest and attention during a data presentation.

150
Q

Equation

A

A calculation that involves addition, subtraction, multiplication, or division (also called a math expression).

151
Q

Estimated response rate

A

The average number of people who typically complete a survey.

152
Q

Ethics

A

Well-founded standards of right and wrong that prescribe what humans ought to do, usually in terms of rights, obligations, benefits to society, fairness, or specific virtues.

153
Q

External data

A

Data that lives and is generated outside of an organization.

154
Q

Facets (R)

A

A series of functions that splits data into subsets in a matrix of panels.

155
Q

Factor (R)

A

An object that stores categorical data where the data values are limited and usually based on a finite group such as country or year.

156
Q

Fairness

A

A quality of data analysis that does not create or reinforce bias.

157
Q

Field

A

A single piece of information from a row or column of a spreadsheet; in a data table, typically a column in the table.

158
Q

Field length

A

A tool for determining how many characters can be keyed into a spreadsheet field.

159
Q

Fill handle

A

A box in the lower-right-hand corner of a selected spreadsheet cell that can be dragged through neighboring cells in order to continue an instruction.

160
Q

Filled map

A

A data visualization that colors areas in a map based on measurements or dimensions.

161
Q

Filtering

A

The process of showing only the data that meets specified criteria while hiding the rest.

162
Q

Find and replace

A

A tool that finds a specified search term and replaces it with something else.

163
Q

First-party data

A

Data collected by an individual or group using their own resources.

164
Q

Float

A

A number that contains a decimal.

165
Q

Foreign key

A

A field within a database table that is a primary key in another table (Refer to primary key).

166
Q

Formula

A

A set of instructions used to perform a calculation using the data in a spreadsheet.

167
Q

Framework

A

The context a presentation needs to create logical connections that tie back to the business task and metrics.

168
Q

FROM

A

The section of a query that indicates from which table(s) to extract the data.

169
Q

Function

A

A preset command that automatically performs a specific process or task using the data in a spreadsheet.

170
Q

Function (R)

A

A body of reusable code for performing specific tasks in R.

171
Q

FWF

A

A fixed-width file: a text file in which each field occupies a set number of characters, which enables the saving of textual data in an organized fashion.

172
Q

GAM

A

Generalized additive model smoothing: a process for smoothing plots with a large number of points.

173
Q

Gantt chart

A

A data visualization that displays the duration of events or activities on a timeline.

174
Q

Gap analysis

A

A method for examining and evaluating the current state of a process in order to identify opportunities for improvement in the future.

175
Q

Gauge chart

A

A data visualization that shows a single result within a progressive range of values.

176
Q

GDPR

A

The General Data Protection Regulation: a European Union law created to help protect people and their data.

177
Q

Geolocation

A

The geographical location of a person or device, determined by means of digital information.

178
Q

Geom (R)

A

The geometric object used to represent data.

179
Q

ggplot2 (R)

A

An R package in Tidyverse that creates a variety of data visualizations by applying different visual properties to the data variables in R.

180
Q

Good data source

A

A data source that is reliable, original, comprehensive, current, and cited (ROCCC).

181
Q

GROUP BY

A

A SQL clause that groups rows that have the same values from a table into summary rows.

182
Q

HAVING

A

A SQL clause that filters the grouped results of a query rather than the rows of the underlying table, typically using conditions on aggregate functions.
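
GROUP BY collapses rows into one summary row per group, and HAVING then filters those summary rows. A minimal sketch, assuming Python's built-in sqlite3 module as the SQL engine and a hypothetical `sales` table:

```python
import sqlite3

# Hypothetical sales figures per region.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100), ("east", 250), ("west", 80), ("west", 40)],
)

# Totals are computed per region first; HAVING keeps only totals above 200.
having_rows = conn.execute(
    """
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    HAVING SUM(amount) > 200
    ORDER BY region
    """
).fetchall()
print(having_rows)
```

A WHERE clause, by contrast, would filter individual rows before they are grouped.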

183
Q

head() (R)

A

An R function that returns a preview of the column names and the first few rows of a dataset.

184
Q

Header

A

The first row in a spreadsheet that labels the type of data in each column.

185
Q

Headline

A

Text at the top of a visualization that communicates the data being presented.

186
Q

Heat map

A

A data visualization that uses color contrast to compare categories in a dataset.

187
Q

Highlight table

A

A data visualization that uses conditional formatting and color on a table.

189
Q

Histogram

A

A data visualization that shows how often data values fall into certain ranges.

190
Q

HTML (Hypertext Markup Language)

A

The set of markup symbols or codes used to create a webpage.

191
Q

HTML5

A

A markup language that provides structure for web pages and connects to hosting platforms.

192
Q

Hypothesis

A

A theory that one might try to prove or disprove with data.

193
Q

Hypothesis testing

A

A process to determine if a survey or experiment has meaningful results.

194
Q

IDE (Integrated Development Environment)

A

A software application that brings together all the tools a data analyst may want to use in a single place.

195
Q

Incomplete data

A

Data that is missing important fields.

196
Q

Inconsistent data

A

Data that uses different formats to represent the same thing.

197
Q

Incorrect/inaccurate data

A

Data that is complete but inaccurate.

198
Q

Inline code

A

Code that can be inserted directly into the text of an R Markdown file.

199
Q

INNER JOIN

A

A SQL clause that returns records with matching values in both tables.

200
Q

Inner query

A

A SQL subquery that is inside of another SQL statement.

201
Q

Internal data

A

Data that lives within a company’s own systems.

202
Q

Interpretation bias

A

The tendency to interpret ambiguous situations in a positive or negative way.

203
Q

Java

A

A programming language widely used to create enterprise web applications that can run on multiple clients.

204
Q

JOIN

A

A SQL clause that is used to combine rows from two or more tables based on a related column.

205
Q

Jupyter Notebook

A

An open-source web application used to create and share documents that contain live code, equations, visualizations and narrative text

206
Q

Label

A

Text in a visualization that identifies a value or describes a scale

207
Q

Labels and annotations (R)

A

A group of R functions used for customizing a plot

208
Q

Leading question

A

A question that steers people toward a certain response

209
Q

LEFT

A

A function that returns a set number of characters from the left side of a text string

210
Q

LEFT JOIN

A

A SQL function that will return all the records from the left table and only the matching records from the right table
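
A small sketch comparing INNER JOIN and LEFT JOIN, run through Python's built-in `sqlite3` module against an in-memory database. The `customers` and `orders` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# INNER JOIN: only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.total
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

# LEFT JOIN: every customer, with NULL (None in Python) where no order matches.
left_rows = conn.execute("""
    SELECT c.name, o.total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id, o.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

The customer with no orders is missing from the INNER JOIN result but appears in the LEFT JOIN result with an empty total.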

211
Q

Legend

A

A tool that identifies the meaning of various elements in a data visualization

212
Q

LEN

A

A function that returns the length of a text string by counting the number of characters it contains

213
Q

Length

A

The number of characters in a text string

214
Q

Library

A

A directory containing all of a data analyst’s installed packages

215
Q

LIMIT

A

A SQL clause that specifies the maximum number of records returned in a query

216
Q

Line graph

A

A data visualization that uses one or more lines to display shifts or changes in data over time

217
Q

List

A

A vector whose elements can be of any type

218
Q

Live data

A

Data that is automatically updated

219
Q

Loess smoothing (R)

A

A process used for smoothing plots with fewer than 1,000 points

220
Q

Log file

A

A computer-generated file that records events from operating systems and other software programs

221
Q

Logical operator

A

An operator that returns a logical data type

222
Q

Long data

A

A dataset in which each row is one time point per subject, so each subject has data in multiple rows

223
Q

Mandatory

A

A data value that cannot be left blank or empty

224
Q

Map

A

A data visualization that organizes data geographically

225
Q

Mapping (R)

A

The process of matching up a specific variable in a dataset with a specific aesthetic

226
Q

Margin of error

A

The maximum amount that sample results are expected to differ from those of the actual population
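
One common way to estimate a margin of error, sketched in Python. It assumes a simple random sample, a survey proportion, and a 95% confidence level (z-score of about 1.96); the formula and the function name `margin_of_error` are assumptions for illustration, not taken from this glossary.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z_score=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample; z_score=1.96 corresponds to roughly
    95% confidence, and proportion=0.5 gives the most conservative estimate.
    """
    return z_score * math.sqrt(proportion * (1 - proportion) / sample_size)


moe = margin_of_error(1000)
print(f"{moe:.1%}")
```

With 1,000 respondents the result is about 3.1%, so a survey result of 60% would be read as roughly 57% to 63% under these assumptions.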

227
Q

Markdown (R)

A

A syntax for formatting plain text files

228
Q

Mark

A

A visual object in a data visualization such as a point, line, or shape

229
Q

MATCH

A

A spreadsheet function used to locate the position of a specific lookup value

230
Q

Math expression

A

A calculation that involves addition, subtraction, multiplication, or division (also called an equation)

231
Q

Math function

A

A function that is used as part of a mathematical formula

232
Q

Matrix

A

A two-dimensional collection of data elements with rows and columns

233
Q

MAX

A

A spreadsheet function that returns the largest numeric value from a range of cells

234
Q

MAXIFS

A

A spreadsheet function that returns the maximum value from a given range that meets a specified condition

235
Q

McCandless Method

A

A method for presenting data visualizations that moves from general to specific information

236
Q

Measurable question

A

A question whose answers can be quantified and assessed

237
Q

Mental model

A

A data analyst’s thought process and approach to a problem

238
Q

Mentor

A

Someone who shares knowledge, skills, and experience to help another grow both professionally and personally

239
Q

Merger

A

An agreement that unites two organizations into a single new one

240
Q

Metadata

A

Data about data

241
Q

Metadata repository

A

A database created to store metadata

242
Q

Metric

A

A single, quantifiable type of data that is used for measurement

243
Q

Metric goal

A

A measurable goal set by a company and evaluated using metrics

244
Q

MID

A

A function that returns a segment from the middle of a text string
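
A rough Python equivalent of the spreadsheet text functions LEFT, LEN, and MID, using string slicing. The helper names and the sample string are hypothetical; `mid` assumes a 1-based start position, as spreadsheets use.

```python
def left(text, count):
    """Return the first `count` characters, like LEFT."""
    return text[:count]


def mid(text, start, count):
    """Return `count` characters beginning at 1-based position `start`, like MID."""
    begin = start - 1  # convert to Python's 0-based indexing
    return text[begin:begin + count]


def length(text):
    """Return the number of characters, like LEN."""
    return len(text)


code = "CA-90210"  # made-up value
print(left(code, 2), mid(code, 4, 5), length(code))
```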

245
Q

MIN

A

A spreadsheet function that returns the smallest numeric value from a range of cells

246
Q

MINIFS

A

A spreadsheet function that returns the minimum value from a given range that meets a specified condition

247
Q

Modulo

A

An operator (%) that returns the remainder when one number is divided by another
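
A short Python illustration of the modulo operator. The helper names are hypothetical, and the handling of negative numbers differs between languages, so the last comment applies to Python only.

```python
def remainder(dividend, divisor):
    """Return what is left over after dividing dividend by divisor."""
    return dividend % divisor


def is_even(number):
    """A number is even when dividing it by 2 leaves no remainder."""
    return number % 2 == 0


print(remainder(17, 5))   # 17 = 3 * 5 + 2
print(is_even(10), is_even(7))
# In Python the result takes the sign of the divisor, so -7 % 3 is 2.
print(remainder(-7, 3))
```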

248
Q

Movement

A

The design principle of arranging visual elements to guide the audience’s eyes from one part of a data visualization to another

249
Q

Naming conventions

A

Consistent guidelines that describe the content, creation date, and version of a file in its name

250
Q

Narrative

A

(Refer to Story)

251
Q

Nested

A

Code that performs a particular function and is contained within code that performs a broader function

252
Q

Nested function

A

A function that is completely contained within another function

253
Q

Networking

A

Building relationships by meeting people both in person and online

254
Q

Nominal data

A

A type of qualitative data that is categorized without a set order

255
Q

Normalized database

A

A database in which only related data is stored in each table

256
Q

Notebook

A

An interactive, editable programming environment for creating data reports and showcasing data skills

257
Q

Null

A

An indication that a value does not exist in a dataset

258
Q

Observation

A

The attributes that describe a piece of data contained in a row of a table

259
Q

Observer bias

A

The tendency for different people to observe things differently (also called experimenter bias)

260
Q

Open data

A

Data that is available to the public

261
Q

Open-source

A

Code that is freely available and may be modified and shared by the people who use it

262
Q

Openness

A

The aspect of data ethics that promotes the free access, usage, and sharing of data

263
Q

Operator

A

A symbol that names the operation or calculation to be performed

264
Q

ORDER BY

A

A SQL clause that sorts results returned in a query

265
Q

Order of operations

A

Using parentheses to group together spreadsheet values in order to clarify the order in which operations should be performed

266
Q

Ordinal data

A

Qualitative data with a set order or scale

267
Q

Outdated data

A

Any data that has been superseded by newer and more accurate information

268
Q

OUTER JOIN

A

A SQL function that combines RIGHT JOIN and LEFT JOIN to return all records from both tables, matched where possible

269
Q

Outer query

A

A SQL statement containing a subquery

270
Q

Ownership

A

The aspect of data ethics that presumes individuals own the raw data they provide and have primary control over its usage, processing, and sharing

271
Q

Package (R)

A

A unit of reproducible R code

272
Q

Packed bubble chart

A

A data visualization that displays data in clustered circles

273
Q

Pattern

A

The design principle of using similar visual elements to demonstrate trends and relationships in a data visualization

274
Q

PHP (Hypertext Preprocessor)

A

A programming language for web application development

275
Q

Pie chart

A

A data visualization that uses segments of a circle to represent the proportions of each data category compared to the whole

276
Q

Pipe (R)

A

A tool in R for expressing a sequence of multiple operations, represented with “%>%”

277
Q

Pivot chart

A

A chart created from the fields in a pivot table

278
Q

Pivot table

A

A data summarization tool used to sort, reorganize, group, count, total, or average data

279
Q

Pixel

A

In digital imaging, a small area of illumination on a display screen that, when combined with other adjacent areas, forms a digital image

280
Q

Population

A

In data analytics, all possible data values in a dataset

281
Q

Portfolio

A

A collection of materials that can be shared with potential employers

282
Q

Pre-attentive attributes

A

The elements of a data visualization that an audience recognizes automatically without conscious effort

283
Q

Primary key

A

An identifier in a database that references a column in which each value is unique (Refer to foreign key)

284
Q

Problem domain

A

The area of analysis that encompasses every activity affecting or affected by a problem

285
Q

Problem types

A

The various problems that data analysts encounter, including categorizing things, discovering connections, finding patterns, identifying themes, making predictions, and spotting something unusual

286
Q

Profit margin

A

A percentage that indicates how many cents of profit have been generated for each dollar of sales
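
A worked example in Python, assuming profit is revenue minus cost; the function name `profit_margin` and the figures are hypothetical.

```python
def profit_margin(revenue, cost):
    """Return profit as a percentage of revenue."""
    profit = revenue - cost
    return profit / revenue * 100


# $200 in sales with $150 in costs leaves $50 of profit.
print(profit_margin(revenue=200.0, cost=150.0))
```

A margin of 25% means 25 cents of profit for each dollar of sales.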

287
Q

Programming language

A

A system of words and symbols used to write instructions that computers follow

288
Q

Proportion

A

The design principle of using the relative size and arrangement of visual elements to demonstrate information in a data visualization

289
Q

Python

A

A general-purpose programming language

290
Q

Qualitative data

A

A subjective and explanatory measure of a quality or characteristic

291
Q

Quantitative data

A

A specific and objective measure, such as a number, quantity, or range

292
Q

Query

A

A request for data or information from a database

293
Q

Query language

A

A computer programming language used to communicate with a database

294
Q

R

A

A programming language used for statistical analysis, visualization, and other data analysis

295
Q

R Markdown

A

A file format for making dynamic documents with R

296
Q

R Notebook

A

A document for running code and displaying the graphs and charts that visualize the code

297
Q

Random sampling

A

A way of selecting a sample from a population so that every possible sample of a given size has an equal chance of being chosen

298
Q

Range

A

A collection of two or more cells in a spreadsheet

299
Q

Ranking

A

A system to position values of a dataset within a scale of achievement or status

300
Q

Record

A

A collection of related data in a data table, usually synonymous with row

301
Q

Redundancy

A

When the same piece of data is stored in two or more places

302
Q

Reframing

A

The process of restating a problem or challenge, then redirecting it toward a potential resolution

303
Q

Regular expression (RegEx)

A

A rule that says the values in a table must match a prescribed pattern
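
A minimal sketch using Python's built-in `re` module to check values against a pattern. The five-digit pattern, the sample values, and the helper name `invalid_values` are assumptions for illustration.

```python
import re

# Hypothetical rule: a value must be exactly five digits.
ZIP_PATTERN = re.compile(r"\d{5}")


def invalid_values(values, pattern=ZIP_PATTERN):
    """Return the values that do not match the whole pattern."""
    return [value for value in values if not pattern.fullmatch(value)]


column = ["90210", "1234", "ABCDE", "10001"]  # made-up column values
print(invalid_values(column))
```

`fullmatch` requires the entire value to fit the pattern, so a value that is too short or contains letters is reported.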

304
Q

Relational database

A

A database that contains a series of tables that can be connected to form relationships

305
Q

Relational operator

A

An operator used to compare values, also known as a comparator

306
Q

Relativity

A

The process of considering observations in relation or proportion to something else

307
Q

Relevant question

A

A question that has significance to the problem to be solved

308
Q

Remove duplicates

A

A spreadsheet tool that automatically searches for and eliminates duplicate entries from a spreadsheet

309
Q

Repetition

A

The design principle of repeating visual elements to demonstrate meaning in a data visualization

310
Q

Report

A

A static collection of data periodically given to stakeholders

311
Q

Return on investment (ROI)

A

A formula that uses the metrics of investment and profit to evaluate the success of an investment

312
Q

Revenue

A

The total amount of income generated by the sale of goods or services

313
Q

Rhythm

A

The design principle of creating movement and flow in a data visualization to engage an audience

314
Q

RIGHT

A

A function that returns a set number of characters from the right side of a text string

315
Q

RIGHT JOIN

A

A SQL function that will return all records from the right table and only the matching records from the left

316
Q

Root cause

A

The reason why a problem occurs

317
Q

ROUND

A

A SQL function that returns a number rounded to a certain number of decimal places

318
Q

Ruby

A

An object-oriented programming language for web application development

319
Q

Sample

A

In data analytics, a segment of a population that is representative of the entire population

320
Q

Sampling bias

A

Overrepresenting or underrepresenting certain members of a population as a result of working with a sample that is not representative of the population as a whole

321
Q

Scatterplot

A

A data visualization that represents relationships between different variables with individual data points without a connecting line

322
Q

Schema

A

A way of describing how something, such as data, is organized

323
Q

Scope of work (SOW)

A

An agreed-upon outline of the tasks to be performed during a project

324
Q

Second-party data

A

Data collected by a group directly from its audience and then sold

325
Q

SELECT

A

The section of a query that indicates from which column(s) to extract the data

326
Q

SELECT INTO

A

A SQL clause that copies data from one table into a temporary table without adding the new table to the database

327
Q

Shiny (R)

A

An R package used to build interactive web apps with R code

328
Q

Small data

A

Small, specific data points typically involving a short period of time, which are useful for making day-to-day decisions

329
Q

SMART methodology

A

A tool for determining a question’s effectiveness based on whether it is specific, measurable, action-oriented, relevant, and time-bound

330
Q

Smoothing (R)

A

A process used to make data visualizations in R clearer and more readable

331
Q

Smoothing line (R)

A

A line on a data visualization that uses smoothing to represent a trend

332
Q

Social media

A

Websites and applications through which users create and share content or participate in social networking

333
Q

Soft skills

A

Nontechnical traits and behaviors that relate to how people work

334
Q

Sort range

A

A spreadsheet menu function that sorts a specified range and preserves the cells outside the range

335
Q

Sort sheet

A

A spreadsheet menu function that sorts all data by the ranking of a specific sorted column and keeps data together across rows

336
Q

Sorting

A

The process of arranging data into a meaningful order to make it easier to understand, analyze, and visualize

337
Q

Specific question

A

A question that is simple, significant, and focused on a single topic or a few closely related ideas

338
Q

SPLIT

A

A spreadsheet function that divides text around a specified character and puts each fragment into a new, separate cell

339
Q

Sponsor

A

A professional advocate who is committed to moving forward the career of another

340
Q

Spotlighting

A

Scanning through data to quickly identify the most important insights

341
Q

Spreadsheet

A

A digital worksheet

342
Q

SQL

A

(Refer to Structured Query Language)

343
Q

Stakeholders

A

People who invest time and resources into a project and are interested in its outcome

344
Q

Static data

A

Data that doesn’t change once it has been recorded

345
Q

Static visualization

A

A data visualization that does not change over time unless it is edited

346
Q

Statistical power

A

The probability that a test of significance will recognize an effect that is present

347
Q

Statistical significance

A

The probability that sample results are not due to random chance

348
Q

Statistics

A

The study of how to collect, analyze, summarize, and present data

349
Q

Story

A

The narrative of a data presentation that makes it meaningful and interesting

350
Q

String data type

A

A sequence of characters and punctuation that contains textual information (also called text data type)

351
Q

Structural metadata

A

Metadata that indicates how a piece of data is organized and whether it is part of one or more than one data collection

352
Q

Structured data

A

Data organized in a certain format such as rows and columns

353
Q

Structured Query Language

A

A computer programming language used to communicate with a database

354
Q

Structured thinking

A

The process of recognizing the current problem or situation, organizing available information, revealing gaps and opportunities, and identifying options

355
Q

Subquery

A

A SQL query that is nested inside a larger query

356
Q

SUBSTR

A

A SQL function that extracts a substring from a string variable

357
Q

Substring

A

A subset of a text string

358
Q

Subtitle

A

Text that supports a headline by adding context and description

359
Q

SUM

A

A spreadsheet function that adds the values of a selected range of cells

360
Q

SUMIF

A

A spreadsheet function that adds numeric data based on one condition

361
Q

Summary table

A

A table used to summarize statistical information about data

362
Q

SUMPRODUCT

A

A function that multiplies arrays and returns the sum of those products
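
A rough Python equivalent for two arrays, assuming they have the same length; the function name `sumproduct` and the sample figures are hypothetical.

```python
def sumproduct(first, second):
    """Multiply the arrays element by element and add up the products."""
    if len(first) != len(second):
        raise ValueError("arrays must have the same length")
    return sum(a * b for a, b in zip(first, second))


quantities = [2, 3, 4]    # made-up units sold
prices = [10, 20, 30]     # made-up unit prices
print(sumproduct(quantities, prices))  # 2*10 + 3*20 + 4*30
```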

363
Q

Swift

A

A programming language for macOS, iOS, watchOS, and tvOS

364
Q

Symbol map

A

A data visualization that displays a mark over a given longitude and latitude

365
Q

Syntax

A

The predetermined structure of a language that includes all required words, symbols, and punctuation, as well as their proper placement

366
Q

Tableau

A

A business intelligence and analytics platform that helps people visualize, understand, and make decisions with data

367
Q

Technical mindset

A

The ability to break things down into smaller steps or pieces and work with them in an orderly and logical way

368
Q

Temporary table

A

A database table that is created and exists temporarily on a database server

369
Q

Text data type

A

A sequence of characters and punctuation that contains textual information (also called string data type)

370
Q

Text string

A

A group of characters within a cell, most often composed of letters

371
Q

Third-party data

A

Data provided from outside sources that did not collect it directly

372
Q

Tibble (R)

A

A streamlined variation of data frames

373
Q

Tidy data (R)

A

A way of standardizing the organization of data within R

374
Q

Tidyverse (R)

A

A system of packages in R with a common design philosophy for data manipulation, exploration, and visualization

375
Q

Time-bound question

A

A question that specifies a timeframe to be studied

376
Q

Transaction transparency

A

The aspect of data ethics that presumes all data-processing activities and algorithms should be explainable and understood by the individual who provides the data

377
Q

Transferable skills

A

Skills and qualities that can transfer from one job or industry to another

378
Q

TRIM

A

A function that removes leading, trailing, and repeated spaces in data
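
A small Python sketch of the same cleanup, assuming only ordinary space characters need handling; the function name `trim` is hypothetical.

```python
def trim(text):
    """Drop leading and trailing spaces and collapse repeated spaces into one."""
    # Splitting on a single space leaves empty strings wherever spaces repeat;
    # filtering those out and rejoining keeps exactly one space between words.
    return " ".join(part for part in text.split(" ") if part)


print(repr(trim("  New   York  ")))
```

Python's own `str.strip()` removes only leading and trailing whitespace, which is why the repeated internal spaces are handled separately here.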

379
Q

TSV (Tab-separated values file)

A

A text file that stores a data table by separating columns of data with tabs
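
A minimal sketch of reading tab-separated data with Python's built-in `csv` module. The sample text is made up, and `io.StringIO` stands in for a real file.

```python
import csv
import io

# Each line is a row; a tab character (\t) separates the columns.
tsv_text = "name\tcity\nAna\tLima\nBen\tOslo\n"

reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
rows = list(reader)

header, records = rows[0], rows[1:]
print(header)
print(records)
```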

380
Q

Turnover rate

A

The rate at which employees voluntarily leave a company

381
Q

Typecasting

A

Converting data from one type to another

382
Q

Unbiased sampling

A

When the sample of the population being measured is representative of the population as a whole

383
Q

Underscores

A

Lines used to underline words and connect text characters

384
Q

Unfair question

A

A question that makes assumptions or is difficult to answer honestly

385
Q

Unique

A

A value that can’t have a duplicate

386
Q

United States Census Bureau

A

An agency in the U.S. Department of Commerce that serves as the nation’s leading provider of quality data about its people and economy

387
Q

Unity

A

The design principle of using visual elements that complement each other to create aesthetic appeal and clarity in a data visualization

388
Q

Unstructured data

A

Data that is not organized in any easily identifiable manner

389
Q

Validity

A

The degree to which data conforms to constraints when it is input, collected, or created

390
Q

VALUE

A

A spreadsheet function that converts a text string that represents a number to a numeric value

391
Q

Variable (R)

A

A representation of a value in R that can be stored for later use

392
Q

Variety

A

The design principle of using different kinds of visual elements in a data visualization to engage an audience

393
Q

Vector (R)

A

A group of data elements of the same type stored in a one-dimensional sequence in R

394
Q

Verification

A

A process to confirm that a data-cleaning effort was well executed and the resulting data is accurate and reliable

395
Q

Video file

A

A collection of images, audio files, and other data usually encoded in a compressed format such as MP4, M4V, MOV, AVI, or FLV

396
Q

Vignette (R)

A

Documentation for an R package that describes the problem the package is designed to solve, explains how its functions can be used, and lists any dependencies on other packages

397
Q

Visual form

A

The appearance of a data visualization that gives it structure and aesthetic appeal

398
Q

Visualization

A

(Refer to Data visualization)

399
Q

VLOOKUP

A

A spreadsheet function that vertically searches for a certain value in a column to return a corresponding piece of information
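
A simplified Python sketch of the lookup, assuming an exact match on the first column and a 1-based column index; the function name `vlookup` and the sample table are hypothetical, and the approximate-match mode of the spreadsheet function is not covered.

```python
def vlookup(search_key, table, column_index):
    """Search the first column for search_key and return the value from
    column_index (1-based) in the first matching row, or None if not found."""
    for row in table:
        if row[0] == search_key:
            return row[column_index - 1]
    return None


products = [
    ["A101", "Widget", 4.5],
    ["B202", "Gadget", 12.0],
]  # made-up lookup range
print(vlookup("B202", products, 3))
```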

400
Q

WHERE

A

The section of a query that specifies criteria that the requested data must meet

401
Q

Wide data

A

A dataset in which every data subject has a single row with multiple columns to hold the values of various attributes of the subject
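
A short Python sketch that reshapes wide data into long data, where each row holds one time point per subject. The field names, the yearly columns, and the function name `wide_to_long` are hypothetical.

```python
def wide_to_long(rows, id_field, time_name="year", value_name="value"):
    """Turn one row per subject into one row per subject and time point."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue  # the identifier is repeated on every long row
            long_rows.append(
                {id_field: row[id_field], time_name: column, value_name: value}
            )
    return long_rows


wide = [
    {"subject": "A", "2021": 10, "2022": 12},
    {"subject": "B", "2021": 7, "2022": 9},
]  # made-up wide data: one row per subject
for long_row in wide_to_long(wide, "subject"):
    print(long_row)
```

Two wide rows with two yearly columns each become four long rows.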

402
Q

WITH

A

A SQL clause that creates a temporary table that can be queried multiple times
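
A small sketch of a WITH clause, run through Python's built-in `sqlite3` module against an in-memory database. The `sales` table, its rows, and the name `region_totals` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('east', 100), ('east', 50),
        ('west', 30), ('west', 20),
        ('north', 200);
""")

# region_totals is defined once and then queried twice:
# once in the main SELECT and once in the subquery that computes the average.
above_average = conn.execute("""
    WITH region_totals AS (
        SELECT region, SUM(amount) AS total
        FROM sales
        GROUP BY region
    )
    SELECT region, total
    FROM region_totals
    WHERE total > (SELECT AVG(total) FROM region_totals)
    ORDER BY region
""").fetchall()

print(above_average)
```

The temporary result exists only for the duration of this one statement; nothing new is stored in the database.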

403
Q

World Health Organization

A

An organization whose primary role is to direct and coordinate international health within the United Nations system

404
Q

X-axis

A

The horizontal line of a graph usually placed at the bottom, which is often used to represent time scales and discrete categories

405
Q

Y-axis

A

The vertical line of a graph usually placed to the left, which is often used to represent frequencies and other numerical variables

406
Q

YAML

A

A human-readable data-serialization language, often used for configuration and document metadata