Data Management and Analytics Flashcards

1
Q

What is Big Data and the 5 V’s of Big Data?

A
  • the corporate accumulation of massive amounts of data that can be used for analysis, commonly referred to as data analytics
  • the 5 V’s: volume, velocity, variety, veracity and value
2
Q

Define volume

A
  • the quantity or amount of data points
3
Q

Define velocity

A
  • the speed of data accumulation or data processing
4
Q

Define variety

A
  • the different types of data that are involved in the analysis
  • unstructured, structured and semi-structured
5
Q

Define veracity

A
  • represents the reliability, quality or integrity of the data (trustworthiness)
6
Q

Define value

A
  • the insights Big Data can yield
7
Q

What is a primary key?

A
  • a unique identifier for a specific row within a table, made up of one or more attributes
  • each row must have a unique primary key
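
A minimal sketch of the uniqueness rule, using Python's built-in sqlite3 module. The table and column names (customer, order_line, customer_id, line_no) are hypothetical and chosen only for illustration. The second table shows a primary key made up of more than one attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single-attribute primary key: customer_id identifies each row.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Composite primary key: order_id and line_no together identify each row.
conn.execute(
    "CREATE TABLE order_line ("
    " order_id INTEGER NOT NULL,"
    " line_no INTEGER NOT NULL,"
    " item TEXT NOT NULL,"
    " PRIMARY KEY (order_id, line_no))"
)


def try_insert(sql, params):
    """Return True if the row was accepted, False if a key constraint rejected it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("INSERT INTO customer VALUES (?, ?)", (1, "Avery")),         # new key
    try_insert("INSERT INTO customer VALUES (?, ?)", (1, "Blake")),         # repeated key
    try_insert("INSERT INTO order_line VALUES (?, ?, ?)", (10, 1, "pen")),  # new pair
    try_insert("INSERT INTO order_line VALUES (?, ?, ?)", (10, 2, "ink")),  # same order, new line
    try_insert("INSERT INTO order_line VALUES (?, ?, ?)", (10, 1, "pad")),  # repeated pair
]
print(results)
```

Rows that repeat an existing key value (or an existing combination of values, for the composite key) are rejected by the database.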
8
Q

What is a foreign key?

A
  • an attribute (or set of attributes) in one table that is the primary key of another table
  • the link between a primary key in one table and a foreign key in another table is what creates a relationship between tables
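
A minimal sketch of how a foreign key links two tables, again using Python's built-in sqlite3 module. The customer and invoice tables and their columns are hypothetical. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE invoice ("
    " invoice_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id),"  # foreign key
    " amount_cents INTEGER NOT NULL)"
)

conn.execute("INSERT INTO customer VALUES (1, 'Avery')")
conn.execute("INSERT INTO invoice VALUES (100, 1, 2500)")

# An invoice pointing at a customer that does not exist is rejected.
try:
    conn.execute("INSERT INTO invoice VALUES (101, 99, 4000)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The relationship lets the two tables be joined on the shared key.
joined = conn.execute(
    "SELECT c.name, i.invoice_id, i.amount_cents"
    " FROM invoice AS i JOIN customer AS c ON c.customer_id = i.customer_id"
).fetchall()
print(orphan_rejected, joined)
```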
9
Q

What is the extract, transform and load process (ETL)?

A
  • the process in which data is captured from its source and transferred to an organization’s custody so that it can then be further analyzed
10
Q

What is data extraction?

A
  • can take the form of an automated process, semiautomated process or manual extraction
  • the native source and the means of accessing the data must be determined in the initial ETL phase, because they dictate the tools needed to design the overall extraction process
11
Q

What is manual extraction?

A
  • a person may have to use specialized data mining software or write customized queries to obtain the data
  • tools used must ensure the data is coming from the correct location and is complete and accurate
12
Q

What is data transformation?

A
  • one of the most time-consuming steps in the ETL process, because it entails taking the often-unstructured raw data, cleaning it, manipulating it and validating it to ensure it is accurate and ready for analysis
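
A minimal sketch of a transformation step in plain Python, assuming raw rows arrive as dictionaries of strings. The field names (id, amount, date), the sample rows and the cleaning rules are all hypothetical. Real transformations depend on the source data.

```python
from datetime import date
from decimal import Decimal, InvalidOperation

# Hypothetical raw extract: stray whitespace, currency formatting,
# a repeated record and an unparseable amount.
raw_rows = [
    {"id": " 001", "amount": "$1,200.50", "date": "2024-01-05"},
    {"id": "002", "amount": "300", "date": "2024-01-06"},
    {"id": "002", "amount": "300", "date": "2024-01-06"},
    {"id": "003", "amount": "n/a", "date": "2024-01-07"},
]


def transform(rows):
    """Clean raw rows; return (clean_rows, rejected_rows)."""
    clean, rejected, seen_ids = [], [], set()
    for row in rows:
        record_id = row["id"].strip()
        if record_id in seen_ids:
            continue  # skip a row whose id was already accepted
        try:
            # Remove currency symbol and thousands separators, then parse.
            amount = Decimal(row["amount"].replace("$", "").replace(",", ""))
            posted = date.fromisoformat(row["date"].strip())
        except (InvalidOperation, ValueError):
            rejected.append(row)  # keep bad rows for review instead of dropping silently
            continue
        seen_ids.add(record_id)
        clean.append({"id": record_id, "amount": amount, "date": posted})
    return clean, rejected


clean_rows, rejected_rows = transform(raw_rows)
print(len(clean_rows), len(rejected_rows))
```

Keeping the rejected rows, rather than discarding them, makes the later validation step easier because every source row can be accounted for.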
13
Q

What is data validation?

A
  • needed after transformation to ensure data is not lost or inappropriately modified in the cleaning process
  • may be a visual review for simple data sets
  • if the data set is large, basic statistical tests may be required to ensure the data has maintained its integrity
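
A minimal sketch of basic reconciliation checks for a larger data set, assuming amounts are held as integer cents. The function name validate, its three checks and the sample figures are hypothetical. They illustrate the idea of comparing the data before and after cleaning, not a standard procedure.

```python
def validate(before, after, removed):
    """Reconcile cleaned amounts against the original amounts.

    `removed` holds the amounts that the cleaning step deliberately dropped.
    """
    return {
        # Every source row is either kept or knowingly removed.
        "row_count_ok": len(before) - len(removed) == len(after),
        # The control total changes only by the removed amounts.
        "control_total_ok": sum(before) - sum(removed) == sum(after),
        # Cleaning should not create values outside the original range.
        "range_ok": min(after) >= min(before) and max(after) <= max(before),
    }


before = [120050, 30000, 30000, 4500]  # amounts in cents, one duplicate
removed = [30000]                      # the duplicate dropped during cleaning

good = validate(before, [120050, 30000, 4500], removed)
bad = validate(before, [120050, 3000, 4500], removed)  # a digit lost during cleaning
print(good)
print(bad)
```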
14
Q

What is data load?

A
  • loading the data into a software program for analysis or into a data storage location
  • main concern is that the data has been extracted and transformed into a format that is compatible with the software program or storage destination
  • may be stored in an Operational Data Store (ODS), data warehouse, data mart or data lake
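
A minimal sketch of a load step, with an in-memory SQLite database standing in for the storage destination. The ledger table, its columns and the sample rows are hypothetical. The point illustrated is the compatibility concern: values are converted to types the destination accepts (integer cents and ISO date text) before loading.

```python
import sqlite3
from datetime import date
from decimal import Decimal

# Hypothetical output of the transformation step.
clean_rows = [
    {"id": "001", "amount": Decimal("1200.50"), "date": date(2024, 1, 5)},
    {"id": "002", "amount": Decimal("300"), "date": date(2024, 1, 6)},
]


def load(rows, conn):
    """Load cleaned rows into the destination; return (row_count, total_cents)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ledger ("
        " id TEXT PRIMARY KEY,"
        " amount_cents INTEGER NOT NULL,"
        " posted TEXT NOT NULL)"
    )
    # Convert each value to a format the destination can store.
    prepared = [
        (r["id"], int(r["amount"] * 100), r["date"].isoformat()) for r in rows
    ]
    with conn:  # commit on success, roll back if any insert fails
        conn.executemany("INSERT INTO ledger VALUES (?, ?, ?)", prepared)
    return conn.execute("SELECT COUNT(*), SUM(amount_cents) FROM ledger").fetchone()


conn = sqlite3.connect(":memory:")
loaded = load(clean_rows, conn)
print(loaded)
```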
15
Q

What are the 4 key applications in data analytics?

A

Descriptive analytics: describes or explains WHAT HAS occurred (summarizes the activity)

Diagnostic analytics: diagnoses or explains WHY it occurred (uncovers correlations, patterns and relationships)

Predictive analytics: predicts WHAT WILL occur (forecasts and projections)

Prescriptive analytics: prescribes WHAT COULD or SHOULD occur (recommendations and next steps)

The two D’s (descriptive and diagnostic) are backward-looking.
The two P’s (predictive and prescriptive) are forward-looking.

16
Q

What is data mining?

A
  • allows users to obtain data themselves from databases or data warehouses
  • allows users to perform diagnostic analytics, drilling down into the underlying data to answer questions and better understand the data
  • cannot be performed without a computer
17
Q

What are examples of the uses of data analytics?

A

Customer and Marketing analytics: supports digital marketing and allows a company to deliver timely, relevant and anticipated offers to customers

Managerial and Operational analytics: uses data mining and data collection tools to plan for more effective business operations

Risk and Compliance analytics: used to monitor transactions through continuous monitoring, continuous auditing and fraud detection

Financial analytics: monitors financial performance through data mining and ratio analysis on a continuous basis

Audit analytics: assesses risk, provides assurance around certain operations, establishes thresholds and expectations, and improves the quality of the audit by testing full populations

Tax analytics: organizes tax information and guidelines, improves tax planning and monitors tax performance indicators
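
A minimal sketch of the threshold and full-population ideas mentioned under audit analytics: instead of sampling, every transaction is tested against an expectation. The transactions, the approval threshold and the function name flag_exceptions are hypothetical.

```python
# Hypothetical population of transactions (amounts in whole currency units).
transactions = [
    {"id": "T1", "amount": 4200},
    {"id": "T2", "amount": 15800},
    {"id": "T3", "amount": 9999},
    {"id": "T4", "amount": 10001},
]

APPROVAL_THRESHOLD = 10000  # hypothetical limit above which extra approval is expected


def flag_exceptions(txns, threshold):
    """Test the full population and return the ids that exceed the threshold."""
    return [t["id"] for t in txns if t["amount"] > threshold]


exceptions = flag_exceptions(transactions, APPROVAL_THRESHOLD)
print(exceptions)
```

The same pattern, run continuously, is one simple way to picture the continuous monitoring described under risk and compliance analytics.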