Data Management Tools Flashcards
Well-thought-out collections of computer files, the most important of which are called tables.
Databases
Consist of records (rows) divided into fields (columns) that can be queried (questioned) to produce subsets of information.
Tables
Software systems used to create, maintain, and query databases.
Database Management Systems (DBMS)
- Volume
- Variety
- Veracity
- Velocity
4 Attributes That Define Big Data (4 Vs)
Akin to the “trustworthiness” of the data.
Veracity
The rate at which data arrive and must be stored or processed over a given time period.
Velocity
Sometimes called data discovery, the examination of huge sets of data to find patterns and connections and to identify outliers and hidden relationships.
Data mining
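One small piece of data mining, identifying outliers, can be illustrated with a short sketch. This is only a minimal example under stated assumptions: the `daily_sales` figures are made up, `find_outliers` is a hypothetical helper, and a z-score cutoff of 2.0 is one simple choice among many techniques.

```python
from statistics import mean, stdev

def find_outliers(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds the threshold.

    The threshold of 2.0 is an illustrative assumption, not a standard.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical daily sales figures with one unusual day.
daily_sales = [102, 98, 105, 97, 101, 99, 103, 100, 450]
outliers = find_outliers(daily_sales)
print(outliers)
```

Real data mining tools work on far larger data sets and also look for patterns and hidden relationships, not only extreme values.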
Tools used to standardize data across systems so that the data can be queried.
ETL (an acronym for extract, transform, and load)
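The three ETL steps can be sketched in a few lines. This is a hedged illustration, not how any particular ETL product works: the two source systems (`crm_rows`, `billing_rows`), their formats, and the `sales` table are all hypothetical, and Python's built-in `sqlite3` module stands in for the target database.

```python
import sqlite3

# Hypothetical records from two systems that store the same facts differently.
crm_rows = [
    {"Customer": " Alice ", "Amount": "120.50"},  # amounts as dollar strings
    {"Customer": "BOB", "Amount": "80"},
]
billing_rows = [("carol", 4500), ("alice", 1000)]  # amounts in cents

def extract():
    """Extract: pull raw rows from each source system."""
    return crm_rows, billing_rows

def transform(crm, billing):
    """Transform: standardize names (trimmed, lowercase) and amounts (dollars)."""
    standardized = []
    for row in crm:
        standardized.append((row["Customer"].strip().lower(), float(row["Amount"])))
    for name, cents in billing:
        standardized.append((name.strip().lower(), cents / 100))
    return standardized

def load(rows, conn):
    """Load: write the standardized rows into one queryable table."""
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(*extract()), conn)

# Once standardized and loaded, the combined data can be queried.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

Because both sources are converted to the same name and currency format before loading, Alice's purchases from the two systems are totaled together.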
Evolved as an infrastructure for storing and processing large sets of data across multiple servers. Instead of centralized files in one place like a data warehouse or data mart, it uses a distributed file system that allows files to be stored on multiple servers.
Hadoop
The standard and most widely used computer language for relational databases; it allows a programmer to manipulate and query data.
Structured Query Language (SQL)
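A brief sketch of SQL manipulating and querying data, run here through Python's built-in `sqlite3` module. The `employees` table, its columns, and the sample rows are hypothetical and exist only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define a table: each record is a row, each field is a column.
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)

# Manipulate data: insert hypothetical records.
conn.executemany(
    "INSERT INTO employees (name, dept, salary) VALUES (?, ?, ?)",
    [("Ana", "Audit", 58000), ("Ben", "Tax", 72000), ("Cy", "Audit", 50000)],
)

# Manipulate data: give everyone in the Audit department a 5,000 raise.
conn.execute("UPDATE employees SET salary = salary + 5000 WHERE dept = ?", ("Audit",))

# Query data: produce the subset of employees earning more than 60,000.
high_earners = conn.execute(
    "SELECT name, salary FROM employees WHERE salary > ? ORDER BY name", (60000,)
).fetchall()
print(high_earners)
```

The `SELECT ... WHERE` statement is the kind of query that produces a subset of a table's records.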
Produces interactive data visualization products focused on business intelligence.
Tableau