Data Management And Analytics Flashcards

1
Q

A fact, an occurrence, an instance, or other measurable observation, including numerical digits, text, images, videos, and recordings:

A

Data

2
Q

The corporate accumulation of massive amounts of data that can be used for data analytics

A

Big data

3
Q

What are the five dimensions of big data (the five V's)?

A
Volume
Velocity
Variety
Veracity
Value
4
Q

This V of big data represents the quantity or amount of data points or the size of the data

A

Volume

5
Q

This V of big data refers to the speed of data accumulation or data processing

A

Velocity

6
Q

This V of big data represents the range of data types being processed or analyzed

A

Variety

Structured data - defined organizational format that has specific parameters (telephone numbers)
Semi-structured data - hybrid of structured and unstructured data (comma-separated values file)
Unstructured data - a format that does not have predefined parameters and lacks organization

7
Q

This V of big data represents the reliability, quality, or integrity of the data. Processes should be implemented so that duplicate fields, missing fields, and incorrect formats or characters are removed.

A

Veracity

8
Q

This V of big data refers to the insights big data can yield. Not all data will translate to actionable insights, so it is important to understand the question or business problem that needs to be solved before blindly looking at data.

A

Value

9
Q

Data can be stored in a variety of ways, but one of the most efficient and effective methods for many use cases is to store it in what? This structure stores data in separate tables that are linked through relationships using key fields.

A

Relational database

10
Q

In a table, a column is what? It describes a property desired to be known about each entry.

A

Attributes

11
Q

In a table, a row is what? It contains information about one entry within that table.

A

Records

12
Q

The intersection of a column and row (attribute and record)

A

Field

13
Q

Two main types of database keys are:

A

Primary key

Foreign key

14
Q

A unique identifier for one specific row within a table; it can be made up of one or more attributes

A

Primary keys

15
Q

Attributes in one table that are also the primary key in another table. For example, the customer ID may be the primary key in the customer table; however, it is a foreign key in the sales table.

A

Foreign key
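
The customer/sales example on this card can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names, and rows are assumptions made up for the sketch, not part of the card.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- primary key: unique per row
        name        TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE sales (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),  -- foreign key
        amount      REAL NOT NULL
    )
""")

conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Acme"), (2, "Birch")])
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                 [(10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0)])

# Link the two tables through the key relationship and total sales per customer.
totals = conn.execute("""
    SELECT c.name, SUM(s.amount)
    FROM customer AS c
    JOIN sales AS s ON s.customer_id = c.customer_id
    GROUP BY c.customer_id
    ORDER BY c.customer_id
""").fetchall()
```

Here customer_id is the primary key of customer and a foreign key in sales, which is what lets the join match each sale to its customer.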

16
Q

Two types of database views:

A

Logical database view

Physical database view

17
Q

Represents the type of data that is stored in a database and is intended to explain the contents as well as the structure of a database to users.

A

Logical database view

18
Q

Represents how the data is actually stored, processed, and/or accessed within a database:

A

Physical database view

19
Q
Involves steps such as:
Determine the desired output 
Remove inaccurate data
Address missing fields
Remove sensitive information not needed
Ensure proper formatting, etc.
A

Cleaning data
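
The cleaning steps listed on this card might look like the following minimal sketch. The records, the field names, and the rules (skip records with a missing amount, treat negative amounts as inaccurate) are assumptions chosen for illustration; real cleaning rules depend on the desired output.

```python
import re

# Hypothetical raw records; every value here is made up for illustration.
raw = [
    {"id": "1", "name": " Ann ", "phone": "555-0101", "ssn": "000-00-0001", "amount": "120.50"},
    {"id": "2", "name": "Bob", "phone": "5550102", "ssn": "000-00-0002", "amount": ""},
    {"id": "3", "name": "Cy", "phone": "555-0103", "ssn": "000-00-0003", "amount": "-40"},
    {"id": "1", "name": " Ann ", "phone": "555-0101", "ssn": "000-00-0001", "amount": "120.50"},
]

def clean(rows):
    seen, out = set(), []
    for row in rows:
        if row["id"] in seen:              # drop duplicate records
            continue
        seen.add(row["id"])
        if row["amount"] == "":            # address missing fields (here: skip the record)
            continue
        amount = float(row["amount"])
        if amount < 0:                     # remove inaccurate data (assumed rule: no negatives)
            continue
        digits = re.sub(r"\D", "", row["phone"])
        out.append({
            "id": int(row["id"]),
            "name": row["name"].strip(),            # ensure proper formatting
            "phone": f"{digits[:3]}-{digits[3:]}",  # one consistent phone format
            "amount": amount,
            # "ssn" is deliberately left out: sensitive and not needed in the output
        })
    return out

cleaned = clean(raw)
```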

20
Q

This process ensures data is not lost or inappropriately modified during the cleaning process. It may require only a visual review, or it may require a statistical test.

A

Validating data
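
One simple way to validate, sketched below, is to compare the data before and after cleaning using record keys and a control total. The invoice numbers, amounts, and the validate helper are hypothetical; this is one possible check, not a prescribed procedure.

```python
# Hypothetical data: cleaning removed one exact duplicate of invoice 102.
before = [("101", 250.0), ("102", 80.0), ("102", 80.0), ("103", 40.0)]
after = [("101", 250.0), ("102", 80.0), ("103", 40.0)]

def validate(before, after):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    expected = dict(before)                 # unique records expected to survive cleaning
    actual = dict(after)
    if len(after) != len(actual):
        problems.append("duplicates remain after cleaning")
    missing = expected.keys() - actual.keys()
    if missing:
        problems.append(f"records lost: {sorted(missing)}")
    changed = [k for k in actual if k in expected and actual[k] != expected[k]]
    if changed:
        problems.append(f"amounts modified: {sorted(changed)}")
    # Control total: the sum over unique records should be unchanged.
    if abs(sum(expected.values()) - sum(actual.values())) > 1e-9:
        problems.append("control total differs")
    return problems

issues = validate(before, after)
```

An empty issues list suggests that nothing was lost or altered beyond the intended removal of the duplicate.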

21
Q

The process that can supplement or enhance data in a way that adds value to the existing data points

A

Manipulating data

22
Q

A repository of transactional data from multiple sources that is often a source for data warehouses

A

Operational data store (ODS)

23
Q

Very large data repositories that are centralized and utilized for reporting and analysis rather than for transaction purposes

A

Data warehouse

24
Q

Like a data warehouse but more focused on a specific purpose such as marketing, logistics, etc.

A

Data mart

25
Q

Similar to a data warehouse, but it contains both structured and unstructured data with data mostly in its raw or natural data format.

A

Data lake

26
Q

A storage requirement that means each table must have a unique primary key as a record identifier

A

Entity integrity
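
As a rough illustration of entity integrity, a database can be asked to reject a second record that reuses an existing primary key. The sketch uses Python's sqlite3 module with a hypothetical customer table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; customer_id is the unique record identifier.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")

try:
    conn.execute("INSERT INTO customer VALUES (1, 'Birch')")   # reuses key 1
    rejected = False
except sqlite3.IntegrityError:
    rejected = True   # the database refused the duplicate primary key

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```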

27
Q

A storage requirement stating that a change to a primary key in one table must also cause a change to any related foreign key in any table to which it is linked.

A

Referential integrity
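
One way a database can maintain referential integrity is a cascading update, sketched here with sqlite3. The tables and values are hypothetical, and ON UPDATE CASCADE is only one of several possible rules (a database could instead block the change).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when this is on

conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""
    CREATE TABLE sales (
        sale_id     INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id) ON UPDATE CASCADE,
        amount      REAL
    )
""")
conn.execute("INSERT INTO customer VALUES (1, 'Acme')")
conn.execute("INSERT INTO sales VALUES (10, 1, 250.0)")

# Changing the primary key propagates to the related foreign key in sales.
conn.execute("UPDATE customer SET customer_id = 7 WHERE customer_id = 1")
linked_id = conn.execute("SELECT customer_id FROM sales WHERE sale_id = 10").fetchone()[0]
```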

28
Q

Four categories of data analytics:

A

Descriptive analytics - describing or explaining what has occurred
Diagnostic analytics - diagnosing or explaining why it occurred
Predictive analytics - predicting what will occur
Prescriptive analytics - prescribing what could or should occur

29
Q

Best to show quantitative trends over time

A

Line charts

30
Q

Best at showing comparisons

A

Column chart

31
Q

Best at showing additional details of a column chart

A

Stacked column chart

32
Q

Best at showing relationships between two variables

A

Scatter plots

33
Q

Best at showing lower and upper extremes, quartiles, and median data points

A

Boxplots
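
The values a boxplot displays can be computed directly. The sketch below uses Python's statistics module on a made-up sample; note that several quartile conventions exist, and the "inclusive" method used here is just one of them.

```python
import statistics

# Hypothetical sample of daily order counts.
data = [2, 4, 4, 5, 7, 8, 9, 11, 15]

# quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Lower extreme, first quartile, median, third quartile, upper extreme.
five_number = (min(data), q1, median, q3, max(data))
```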

34
Q

Best at showing frequency

A

Dot plots

35
Q

Best at showing proportions of a whole value as a percentage

A

Pie charts

36
Q

Best at showing underlying foundations or building blocks that go into achieving a task or plan

A

Pyramid

37
Q

Best at showing a process that has beginning and ending steps and a series of steps in between

A

Flowcharts

38
Q

Best at showing the cumulative effect of a series of data points that make up a whole

A

Waterfall chart
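
The "cumulative effect" behind a waterfall chart is a running total. As a small sketch with made-up figures, each bar starts where the previous running total ended:

```python
from itertools import accumulate

# Hypothetical bridge from an opening balance to a closing balance.
steps = [("Opening balance", 1000), ("Sales", 400), ("Expenses", -250), ("Tax", -50)]

# Running total after each step.
running = list(accumulate(amount for _, amount in steps))

# Each bar is (label, start, end); start is the running total before the step.
bars = [(label, total - amount, total) for (label, amount), total in zip(steps, running)]
closing_balance = running[-1]
```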

39
Q

Best at showing key events or milestones

A

Directional charts