B6-M4 Data Management and Analytics Flashcards
1
Q
what is data mining?
A
data mining allows users to obtain data themselves from databases or a data warehouse. It also allows users to perform diagnostic analytics, drilling down into the underlying data to answer questions or better understand the data
2
Q
what are the categories (the four Vs) of big data?
A
- Veracity: the trustworthiness of data
- Volume: size of the data set
- Velocity: speed or flow of data
- Variety: type or source of data
3
Q
what is the publication phase of the data life cycle?
A
publication is when data is circulated to users for various purposes
4
Q
what is archival?
A
the storage phase of the data life cycle
5
Q
what are keys in a table?
A
- Primary key: a unique identifier that allows a user to identify a specific record in a database table
- Secondary key: a non-identifying column used to find a row in a table
- Foreign key: a column in a relational database table that links data between two tables. It references the primary key of another table
- Schema (a related term, not a key): the organization of data that represents the construction of the database within the database management system (DBMS)
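The key types above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the customers and orders tables, their columns, and the two helper functions are hypothetical and do not come from the flashcards. The index on city stands in for a secondary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# customer_id is the primary key: it uniquely identifies each customer record.
conn.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        city        TEXT
    )
""")
# orders.customer_id is a foreign key: it references the primary key of customers.
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount      REAL NOT NULL
    )
""")
# Secondary key (assumed here to be a non-unique lookup column): index on city.
conn.execute("CREATE INDEX idx_customers_city ON customers(city)")

conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'Austin')")
conn.execute("INSERT INTO orders VALUES (100, 1, 250.0)")


def insert_orphan_order(connection):
    """Try to add an order for a customer that does not exist."""
    try:
        connection.execute("INSERT INTO orders VALUES (101, 999, 10.0)")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


def insert_duplicate_customer(connection):
    """Try to reuse an existing primary key value."""
    try:
        connection.execute("INSERT INTO customers VALUES (1, 'Bob', 'Boston')")
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"
```

Both helpers should report a rejection: the first violates the foreign key, the second violates the uniqueness of the primary key.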
6
Q
what is SQL?
A
- Structured Query Language
- a query language that uses commands such as SELECT, FROM, and WHERE to query a database. It is the most common language used to pull data
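A minimal sketch of a SELECT / FROM / WHERE query, run through Python's built-in sqlite3 module. The invoices table, its rows, and the invoices_over helper are hypothetical examples, not part of the flashcards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [(1, "Acme", 120.0), (2, "Beta", 80.0), (3, "Acme", 300.0)],
)


def invoices_over(connection, minimum):
    """Return invoices whose amount exceeds the given minimum."""
    cursor = connection.execute(
        "SELECT invoice_id, customer, amount "  # SELECT: which columns to return
        "FROM invoices "                        # FROM: which table to read
        "WHERE amount > ? "                     # WHERE: which rows to keep
        "ORDER BY invoice_id",
        (minimum,),
    )
    return cursor.fetchall()


large = invoices_over(conn, 100)
```

The ? placeholder passes the threshold as a parameter instead of pasting it into the query text, which is the usual way to avoid malformed or unsafe SQL.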
7
Q
what is data analytics?
A
involves the extract, transform, and load (ETL) process, which pulls data and gets it into a usable format
- extraction: manual extraction is needed if the source data is in a format or location that is not easily attainable, which could require data mining software capable of complex manipulation and of executing multiple queries
- transformation: performed after the data has been obtained; it cleans and converts the data into a usable format
- loading: the last phase in the ETL process, in which the transformed data is loaded into the target system for analysis
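The three ETL phases above can be sketched as three small functions. This is an illustration only: the CSV text, the column names, the cleaning rules, and the sales table are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw source data with inconsistent casing, stray spaces,
# and one row that is missing its sales figure.
RAW_CSV = """date,region,sales
2024-01-05, east ,1200
2024-01-06,West,
2024-01-07,EAST,950
"""


def extract(text):
    """Extract: read the raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop incomplete rows and normalize the values."""
    cleaned = []
    for row in rows:
        sales = row["sales"].strip()
        if not sales:
            continue  # skip rows with no sales figure
        cleaned.append(
            (row["date"].strip(), row["region"].strip().lower(), float(sales))
        )
    return cleaned


def load(rows, connection):
    """Load: write the cleaned rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales (date TEXT, region TEXT, sales REAL)"
    )
    connection.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    connection.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
total_east = conn.execute(
    "SELECT SUM(sales) FROM sales WHERE region = 'east'"
).fetchone()[0]
```

Once loaded, the data can be queried consistently: both "east" rows are counted together even though the source spelled the region two different ways.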
8
Q
what are the 4 types of data analytics?
A
- Predictive: provides expected or predicted outcomes based on historical data
- Diagnostic: explains why something happened by identifying the drivers or underlying causes of an outcome
- Prescriptive: prescribes or recommends actions to take, based on advanced analytics, to reach a desired goal
- Descriptive: describes what happened within the data. It provides only simple descriptive output and does not explain the drivers or underlying causes of the value of the output
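The four types above can be contrasted with a toy example. Everything here is hypothetical: the sales figures, the function names, and the decision rule are assumptions chosen to keep the sketch short, and real analytics would use far richer models.

```python
from statistics import mean

# Hypothetical monthly sales figures.
monthly_sales = [100.0, 110.0, 120.0, 130.0]


def describe(values):
    """Descriptive: summarize what happened."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def largest_decline(previous, current):
    """Diagnostic: drill down by segment to see which one fell the most."""
    changes = {segment: current[segment] - previous[segment] for segment in previous}
    return min(changes, key=changes.get)


def predict_next(values):
    """Predictive: extend a least-squares trend line by one period."""
    n = len(values)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


def prescribe(forecast, target):
    """Prescriptive: recommend an action that moves toward the goal."""
    return "increase promotion" if forecast < target else "hold course"


summary = describe(monthly_sales)
driver = largest_decline({"east": 60.0, "west": 50.0}, {"east": 62.0, "west": 41.0})
forecast = predict_next(monthly_sales)
action = prescribe(forecast, target=150.0)
```

Each function answers a different question: what happened, why it happened, what is likely to happen next, and what to do about it.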