Glossary of terms Flashcards
A/B Testing
The process of testing two variations of the same web page to determine which page is more successful at attracting user traffic and generating revenue
Access Control
Features such as password protection, user permissions, and encryption that are used to protect a spreadsheet
Accuracy
The degree to which the data conforms to the actual entity being measured or described
Action-oriented questions
Questions whose answers lead to change
Administrative metadata
Metadata that indicates the technical source of a digital asset
Agenda
A list of scheduled appointments
Algorithm
A process or set of rules followed for a specific task
Analytical skills
Qualities and characteristics associated with using facts to solve problems
Attribute
A characteristic or quality of data used to label a column in a table
Audio file
Digitized audio storage usually in an MP3, AAC, or other compressed format
AVERAGE
A spreadsheet function that returns an average of the values from a selected range
Bad data source
A data source that is not reliable, original, comprehensive, current, and cited (ROCCC)
Bias
A conscious or subconscious preference in favor of or against a person, group of people, or a thing
Big data
Large, complex datasets typically involving long periods of time, which enable data analysts to address far-reaching business problems
Boolean data
A data type with only two possible values, usually true or false
Borders
Lines that can be added around two or more cells on a spreadsheet
Business task
The question or problem data analysis resolves for a business
CASE
A SQL expression that returns a value for each record based on conditions, working like an if/then statement inside a query
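As a minimal sketch of how CASE behaves, the Python snippet below runs a query against an in-memory SQLite database. The `orders` table, its columns, and the size thresholds are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 20.0), (2, 150.0), (3, 75.0)],
)

# CASE checks its WHEN conditions in order and returns the first matching THEN value.
rows = conn.execute(
    """
    SELECT id,
           CASE
               WHEN amount >= 100 THEN 'large'
               WHEN amount >= 50 THEN 'medium'
               ELSE 'small'
           END AS size
    FROM orders
    ORDER BY id
    """
).fetchall()
conn.close()

print(rows)
```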
CAST
A SQL function that converts data from one datatype to another
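A short sketch of CAST, again using Python's built-in SQLite driver. The literal `'42'` is an assumed example value; other SQL dialects may name their target types differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# '42' starts out as text. CAST converts it to an integer so it can be used in arithmetic.
value, type_name = conn.execute(
    "SELECT CAST('42' AS INTEGER) + 1, typeof(CAST('42' AS INTEGER))"
).fetchone()
conn.close()

print(value, type_name)
```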
Cell reference
A cell or a range of cells in a worksheet typically used in formulas and functions
Changelog
A file containing a chronologically ordered list of modifications made to a project
Clean data
Data that is complete, correct, and relevant to the problem being solved
Cloud
A place to keep data online, rather than on a computer hard drive
COALESCE
A SQL function that returns the first non-null value in a list
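The sketch below shows COALESCE picking the first non-null argument for each row. The `contacts` table and its phone columns are hypothetical and used only to make the behavior visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, mobile TEXT, landline TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?)",
    [
        ("Ana", "555-0101", "555-0199"),  # both numbers present
        ("Ben", None, "555-0142"),        # mobile is NULL
        ("Cy", None, None),               # both are NULL
    ],
)

# COALESCE scans its arguments left to right and returns the first one that is not NULL.
rows = conn.execute(
    "SELECT name, COALESCE(mobile, landline, 'no phone') FROM contacts ORDER BY name"
).fetchall()
conn.close()

print(rows)
```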
Compatibility
How well two or more datasets are able to work together
Completeness
The degree to which the data contains all desired components or measures
CONCAT
A SQL function that adds strings together to create new text strings that can be used as unique keys
CONCATENATE
A spreadsheet function that joins together two or more text strings
Conditional formatting
A spreadsheet tool that changes how cells appear when values meet specific conditions
Confidence interval
A range of values that conveys how likely a statistical estimate reflects the population
Confidence level
The probability that a sample size accurately reflects the greater population
Confirmation bias
The tendency to search for or interpret information in a way that confirms pre-existing beliefs
Consent
The aspect of data ethics that presumes an individual’s right to know how and why their personal data will be used before agreeing to provide it
Consistency
The degree to which data is repeatable from a different point of entry or collection
Context
The condition in which something exists or happens
Continuous data
Data that is measured and can have almost any numeric value
Cookie
A small file stored on a computer that contains information about its users
COUNT
A spreadsheet function that counts the number of cells in a range that contain numeric values
COUNTIF
A spreadsheet function that returns the number of cells that match a specified value
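For readers more familiar with code than spreadsheets, here is a rough Python analogue of the two counting functions above. The `cells` list is made-up sample data, and real spreadsheet functions apply their own type-coercion rules that this sketch does not reproduce.

```python
# Made-up stand-in for a spreadsheet range.
cells = [12, "n/a", 7, None, 12, 3.5]

# COUNT-style: how many cells hold a number (booleans are excluded here by choice).
numeric_count = sum(
    isinstance(c, (int, float)) and not isinstance(c, bool) for c in cells
)

# COUNTIF-style: how many cells match the specified value 12.
match_count = sum(c == 12 for c in cells)

print(numeric_count, match_count)
```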
Cross-field validation
A process that ensures certain conditions for multiple data fields are satisfied
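One way cross-field validation might look in practice is sketched below. The booking record, its field names, and the two rules are assumptions chosen for illustration, not rules taken from any particular tool.

```python
from datetime import date


def validate_booking(record):
    """Return a list of cross-field problems found in one record."""
    problems = []
    # Rule 1: two date fields must be consistent with each other.
    if record["check_out"] <= record["check_in"]:
        problems.append("check_out must be after check_in")
    # Rule 2: component counts must add up to the stated total.
    if record["adults"] + record["children"] != record["total_guests"]:
        problems.append("adults + children must equal total_guests")
    return problems


good = {
    "check_in": date(2024, 5, 1),
    "check_out": date(2024, 5, 4),
    "adults": 2,
    "children": 1,
    "total_guests": 3,
}
bad = {
    "check_in": date(2024, 5, 4),
    "check_out": date(2024, 5, 1),
    "adults": 2,
    "children": 1,
    "total_guests": 5,
}

print(validate_booking(good))
print(validate_booking(bad))
```

Each rule inspects more than one field at a time, which is what separates this from checking a single field's type or range.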
CSV (comma-separated values) file
A delimited text file that uses a comma to separate values
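A small sketch of reading CSV text with Python's standard `csv` module. The sample data is invented. It includes a quoted field to show that a comma inside quotes is treated as part of the value rather than as a separator.

```python
import csv
import io

# Invented sample; the second field of the first data row contains a comma inside quotes.
text = 'name,city,score\nAna,"Lisbon, PT",91\nBen,Oslo,78\n'

# DictReader uses the first line as column names and yields one dict per row.
rows = list(csv.DictReader(io.StringIO(text)))

for row in rows:
    print(row["name"], "|", row["city"], "|", row["score"])
```

Values come back as strings, so a field like `score` would still need converting before any arithmetic.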
Currency
The aspect of data ethics that presumes individuals should be aware of financial transactions resulting from the use of their personal data and the scale of those transactions
Dashboard
A tool that monitors live, incoming data
Data
A collection of facts
Data analysis
The collection, transformation, and organization of data in order to draw conclusions, make predictions, and drive informed decision-making
Data analysis process
The six phases of ask, prepare, process, analyze, share, and act whose purpose is to gain insights that drive informed decision-making
Data analyst
Someone who collects, transforms, and organizes data in order to drive informed decision-making
Data analytics
The science of data
Data anonymization
The process of protecting people’s private or sensitive data by eliminating identifying information
Data bias
When a preference in favor of or against a person, group of people, or thing systematically skews data analysis results in a certain direction
Data constraints
The criteria that determine whether a piece of data is clean and valid
Data design
How information is organized
Data-driven decision-making
Using facts to guide business strategy
Data ecosystem
The various elements that interact with one another in order to produce, manage, store, organize, analyze, and share data
Data element
A piece of information in a dataset
Data engineer
A professional who transforms data into a useful format for analysis and gives it a reliable infrastructure
Data ethics
Well-founded standards of right and wrong that dictate how data is collected, shared, and used
Data governance
A process for ensuring the formal management of a company’s data assets
Data-inspired decision-making
The process of exploring different data sources to find out what they have in common
Data integrity
The accuracy, completeness, consistency, and trustworthiness of data throughout its life cycle
Data interoperability
The ability of different systems and datasets to exchange and use data together; a key factor leading to the successful use of open data among companies and governments
Data life cycle
The sequence of stages that data experiences, which include plan, capture, manage, analyze, archive, and destroy
Data manipulation
The process of changing data to make it more organized and easier to read
Data mapping
The process of matching fields from one data source to another
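A minimal sketch of data mapping, assuming a hypothetical source system whose field names differ from the target's. The `FIELD_MAP` dictionary and the `map_record` helper are illustrative names, not part of any library.

```python
# Hypothetical mapping from source field names to target field names.
FIELD_MAP = {
    "cust_name": "customer_name",
    "dob": "birth_date",
    "zip": "postal_code",
}


def map_record(record, field_map):
    """Rename each field according to field_map; unmapped fields keep their name."""
    return {field_map.get(key, key): value for key, value in record.items()}


source = {"cust_name": "Ana", "dob": "1990-04-12", "zip": "10115", "email": "ana@example.com"}
target = map_record(source, FIELD_MAP)

print(target)
```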
Data merging
The process of combining two or more datasets into a single dataset
Data model
A tool for organizing data elements and how they relate to one another
Data privacy
Preserving a data subject’s information any time a data transaction occurs
Data range
Numerical values that fall between predefined maximum and minimum values
Data replication
The process of storing data in multiple locations
Data science
A field of study that uses raw data to create new ways of modeling and understanding the unknown
Data security
Protecting data from unauthorized access or corruption by adopting safety measures
Data strategy
The management of the people, processes, and tools used in data analysis
Data transfer
The process of copying data from a storage device to computer memory or from one computer to another
Data type
An attribute that describes a piece of data based on its values, its programming language, or the operations it can perform
Data validation
A tool for checking the accuracy and quality of data
Data visualization
The graphical representation of data
Data warehousing specialist
A professional who develops processes and procedures to effectively store and organize data
Database
A collection of data stored in a computer system
Dataset
A collection of data that can be manipulated or analyzed as one unit
DATEDIF
A spreadsheet function that calculates the number of days, months, or years between two dates
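The same kind of date-difference calculation can be approximated in Python. This is a sketch under assumed example dates, and it counts only completed months and years; a spreadsheet's own function may treat edge cases differently.

```python
from datetime import date

# Assumed example dates.
start = date(2023, 1, 15)
end = date(2024, 3, 10)

# Days between the two dates.
days = (end - start).days

# Completed months: subtract one if the end day has not yet reached the start day.
months = (end.year - start.year) * 12 + (end.month - start.month)
if end.day < start.day:
    months -= 1

# Completed years follow from completed months.
years = months // 12

print(days, months, years)
```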
Delimiter
A character that indicates the beginning or end of a data item
Descriptive metadata
Metadata that describes a piece of data and can be used to identify it at a later point in time
Digital photo
An electronic or computer-based image usually in BMP or JPG format
Dirty data
Data that is incomplete, incorrect, or irrelevant to the problem to be solved
Discrete data
Data that is counted and has a limited number of values
DISTINCT
A keyword that is added to a SQL SELECT statement to retrieve only non-duplicate entries
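To see the effect of DISTINCT, the sketch below runs the same query with and without the keyword against an in-memory SQLite table. The `visits` table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (customer TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?)",
    [("Ana",), ("Ben",), ("Ana",), ("Cy",), ("Ben",)],
)

# Without DISTINCT, every stored row is returned, duplicates included.
all_rows = conn.execute("SELECT customer FROM visits ORDER BY customer").fetchall()

# With DISTINCT, each value appears only once in the result.
unique_rows = conn.execute(
    "SELECT DISTINCT customer FROM visits ORDER BY customer"
).fetchall()
conn.close()

print(len(all_rows), unique_rows)
```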
Duplicate data
Any record that inadvertently shares data with another record
Equation
A calculation of data that involves addition, subtraction, multiplication, or division (also called math expression)
Estimated response rate
The average number of people who typically complete a survey
Ethics
Well-founded standards of right and wrong that prescribe what humans ought to do, usually in terms of rights, obligations, benefits to society, fairness, or specific virtues
Experimenter bias
The tendency for different people to observe things differently (also called observer bias)