Chapter 1 Flashcards

1
Q

Define Data Analytics

A

the process of evaluating data with the purpose of drawing conclusions to address business questions.

2
Q

What is used to analyze data to give organizations the information they need to make sound and timely decisions?

A

technologies, systems, practices, methodologies, databases, statistics, and applications.

3
Q

Patterns are discovered from…

A

past archives

4
Q

What is an analytics mindset?

A

recognizing when and how data analytics can address accounting questions

5
Q

what is data scrubbing and data preparation

A

comprehend the process needed to extract (query), clean, and prepare the data before analysis

6
Q

Define data quality

A

recognize what is meant by data quality, be it completeness, reliability, or validity

7
Q

descriptive data analysis

A

perform basic analysis to understand the quality of the underlying data and their ability to address the business question

8
Q

data analysis through data manipulation

A

demonstrate ability to sort, rearrange, merge, and reconfigure data in a manner that allows enhanced analysis
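
As a rough illustration of sorting, merging, and reconfiguring data, here is a minimal Python sketch. The customer and invoice records, field names, and helper functions are all hypothetical, invented for this example rather than taken from the chapter.

```python
# Hypothetical master data and transactions, invented for illustration.
customers = {"C1": "Acme", "C2": "Bolt"}
invoices = [
    {"invoice_id": 1, "customer_id": "C2", "amount": 300.0},
    {"invoice_id": 2, "customer_id": "C1", "amount": 125.5},
    {"invoice_id": 3, "customer_id": "C2", "amount": 50.0},
]

def merge_customer_names(invoices, customers):
    """Merge: attach the customer name to each invoice via customer_id."""
    return [
        {**inv, "customer_name": customers.get(inv["customer_id"], "UNKNOWN")}
        for inv in invoices
    ]

def total_by_customer(rows):
    """Reconfigure: collapse invoice rows into one total per customer."""
    totals = {}
    for row in rows:
        name = row["customer_name"]
        totals[name] = totals.get(name, 0.0) + row["amount"]
    return totals

merged = merge_customer_names(invoices, customers)
# Sort: largest invoices first.
largest_first = sorted(merged, key=lambda row: row["amount"], reverse=True)
totals = total_by_customer(merged)
print([row["invoice_id"] for row in largest_first])
print(totals)
```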

9
Q

problem solving through statistical data analysis

A

identify and implement an approach that will use statistical data analysis to draw conclusions and make recommendations on a timely basis

10
Q

data visualization and data reporting

A

report results of analysis in an accessible way to each varied decision maker and his or her specific needs

11
Q

what is the objective of data extraction

A

to identify and obtain the data from the appropriate source

12
Q

what is the objective of transforming data

A

to validate the data for completeness and integrity

13
Q

what is the objective of loading data

A

to load the data into the appropriate tool for analysis

14
Q

what are the five steps of the ETL process

A

determine the purpose and scope of the data request, obtain the data, validate the data for completeness and integrity, clean the data, load the data for data analysis.
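
The five steps can be sketched in Python as follows. This is only an illustrative outline: the CSV text, field names, expected record count, and function names are assumptions made up for the example, and a real extract would come from a database query or an exported file.

```python
import csv
import io

# Hypothetical extract, invented for illustration.
RAW_CSV = """invoice_id,customer,amount
1001, Acme ,250.00
1002,Bolt,"1,200.50"
1003,Cog,75.25
"""

def obtain(raw_text):
    """Step 2: obtain the data (here, parse CSV text into rows)."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def validate(rows, expected_count, required_fields):
    """Step 3: check completeness (row count) and integrity (no blank fields)."""
    if len(rows) != expected_count:
        raise ValueError(f"expected {expected_count} rows, got {len(rows)}")
    for row in rows:
        for field in required_fields:
            if not (row.get(field) or "").strip():
                raise ValueError(f"missing {field} in {row}")

def clean(rows):
    """Step 4: trim whitespace and convert text to numbers."""
    return [
        {
            "invoice_id": int(row["invoice_id"]),
            "customer": row["customer"].strip(),
            "amount": float(row["amount"].replace(",", "")),
        }
        for row in rows
    ]

def load(rows):
    """Step 5: load into the analysis tool (here, a dict keyed by invoice)."""
    return {row["invoice_id"]: row for row in rows}

# Step 1: purpose and scope, expressed as the fields and record count requested.
REQUIRED_FIELDS = ["invoice_id", "customer", "amount"]
rows = obtain(RAW_CSV)
validate(rows, expected_count=3, required_fields=REQUIRED_FIELDS)
table = load(clean(rows))
total = sum(row["amount"] for row in table.values())
print(total)
```

Comparing the loaded total against a control total supplied by the data owner would be one more integrity check worth adding in practice.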

15
Q

Define classification

A

an attempt to assign each unit in a population into a few categories

16
Q

define regression

A

a data approach that attempts to estimate or predict, for each unit, the numerical value of some variable using some type of statistical model.
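
One common statistical model for this is a straight line fitted by ordinary least squares. The sketch below assumes a single predictor and uses made-up numbers (hypothetical advertising spend and sales); the card itself does not name a specific model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend (x) and sales (y).
spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(spend, sales)
predicted = slope * 6 + intercept  # estimate sales for a new spend level
print(slope, intercept, predicted)
```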

17
Q

define similarity matching

A

a data approach that attempts to identify similar individuals based on data known about them

18
Q

define clustering

A

an attempt to divide individuals into groups in a useful or meaningful way

19
Q

define co-occurrence grouping

A

a data approach that attempts to discover associations between individuals based on transactions involving them (e.g., when Amazon says customers who bought this also bought…)
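
A simple way to see the idea is to count how often pairs of items appear in the same transaction. The baskets and function names below are hypothetical, and real recommendation systems are far more elaborate than this sketch.

```python
from collections import Counter
from itertools import combinations

def pair_counts(baskets):
    """Count how many baskets contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts

def also_bought(baskets, item):
    """Items that co-occur with `item`, most frequent first."""
    counts = Counter()
    for (a, b), n in pair_counts(baskets).items():
        if a == item:
            counts[b] += n
        elif b == item:
            counts[a] += n
    return [other for other, _ in counts.most_common()]

# Hypothetical transactions, invented for illustration.
baskets = [
    ["printer", "ink", "paper"],
    ["printer", "ink"],
    ["paper", "stapler"],
    ["printer", "ink", "stapler"],
]
print(also_bought(baskets, "printer"))
```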

20
Q

define profiling

A

a data approach that attempts to characterize the “typical” behavior of an individual, group, or population by generating summary statistics about the data (mean, median, standard deviation)
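
The summary statistics named above can be produced with Python's standard `statistics` module. The `profile` helper and the sample values are assumptions for illustration only.

```python
import statistics

def profile(values):
    """Summary statistics describing 'typical' behavior in the data."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),  # sample standard deviation
    }

# Hypothetical values, e.g. days taken to pay an invoice.
summary = profile([2, 4, 4, 4, 5, 5, 7, 9])
print(summary)
```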

21
Q

define link prediction

A

a data approach that attempts to predict a relationship between two data items (e.g., Facebook sees you have 20 mutual friends with someone and suggests them as a friend)
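
The mutual-friends idea can be sketched as a count of shared neighbors in a friendship graph. The graph, the names, and the `min_mutual` threshold are hypothetical; this is not a description of how any real social network makes suggestions.

```python
def suggest_friends(graph, person, min_mutual=2):
    """Suggest non-friends who share at least `min_mutual` friends with `person`."""
    suggestions = []
    for other in graph:
        if other == person or other in graph[person]:
            continue  # skip the person and their existing friends
        shared = len(graph[person] & graph[other])
        if shared >= min_mutual:
            suggestions.append((other, shared))
    # Most mutual friends first; ties broken alphabetically.
    return sorted(suggestions, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical friendship graph: each person maps to a set of friends.
graph = {
    "ana": {"ben", "cy", "dee"},
    "ben": {"ana", "cy", "eve"},
    "cy": {"ana", "ben", "eve"},
    "dee": {"ana"},
    "eve": {"ben", "cy"},
}
print(suggest_friends(graph, "ana"))
```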

22
Q

define structured data

A

data that are stored in a database or spreadsheet and are readily searchable

23
Q

define training data

A

existing data that have been manually evaluated and assigned a class

24
Q

define test data

A

a set of data used to assess the degree and strength of a predicted relationship established by the analysis of training data
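
To show how training data and test data work together, the sketch below holds back part of a labeled dataset, learns a very simple classifier from the rest, and then measures accuracy on the held-back records. The records, labels, split ratio, and the nearest-average classifier are all assumptions chosen for brevity.

```python
import random

def split(records, test_fraction=0.25, seed=0):
    """Shuffle labeled records and hold back a fraction as test data."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def train(training):
    """Learn the average feature value for each class."""
    sums, counts = {}, {}
    for value, label in training:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(averages, value):
    """Assign the class whose average is closest to the value."""
    return min(averages, key=lambda label: abs(averages[label] - value))

def accuracy(averages, test):
    """Share of test records whose predicted class matches the known class."""
    correct = sum(predict(averages, value) == label for value, label in test)
    return correct / len(test)

# Hypothetical records: (debt ratio, manually assigned class).
records = [
    (0.10, "approve"), (0.15, "approve"), (0.20, "approve"), (0.25, "approve"),
    (0.70, "reject"), (0.75, "reject"), (0.80, "reject"), (0.90, "reject"),
]
training, test = split(records)
model = train(training)
print(accuracy(model, test))
```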

25
Q

what is Benford’s law

A

it states that when you have a large set of naturally occurring numbers, the leading digit is likely to be small.
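
Benford's law is usually stated as an expected share of log10(1 + 1/d) for each leading digit d, so a leading 1 is expected about 30% of the time. The sketch below compares that expectation with the observed leading digits of a dataset. The powers of 2 are only a convenient stand-in known to follow the pattern closely, not real accounting data, and the function names are hypothetical.

```python
import math
from collections import Counter

def benford_expected():
    """Expected share of each leading digit d: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(value):
    """First nonzero digit of a nonzero number."""
    # Scientific notation puts the leading digit first, e.g. '4.2e+01'.
    return int(f"{abs(value):.15e}"[0])

def observed_shares(values):
    """Observed share of each leading digit, ignoring zeros."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}

# Stand-in dataset (an assumption): the first 100 powers of 2.
sample = [2 ** n for n in range(1, 101)]
expected = benford_expected()
observed = observed_shares(sample)
for d in range(1, 10):
    print(d, round(expected[d], 3), round(observed[d], 3))
```

A large gap between observed and expected shares is a reason to look more closely at the data, not proof of a problem on its own.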

26
Q

Define diagnostic analytics

A

procedures that explore the current data to determine why something has happened the way it has, typically comparing the data to a benchmark

27
Q

what are the examples of diagnostic analytics

A

profiling, clustering, similarity matching, co-occurrence grouping

28
Q

define predictive analytics and give its examples

A

procedures that can generate a model that can be used to determine what can happen in the future. Examples are regression, classification, and link prediction.

29
Q

what is nominal data

A

qualitative data that cannot be ranked (e.g., hair color)

30
Q

what is ordinal data

A

qualitative data that can be ranked (e.g., gold, silver, bronze, or an A, B, C grade)

31
Q

what is ratio data

A

quantitative data where 0 means the “absence” of something (e.g., cash)

32
Q

what is interval data

A

quantitative data where 0 is just another number (e.g., temperature)

33
Q

What is discrete data

A

quantitative data that takes only whole-number values (e.g., points in a basketball game)

34
Q

What is continuous data

A

quantitative data that can take any value in a range, including decimals (e.g., height)

35
Q

what are declarative visualizations

A

visualizations that present findings to an audience (e.g., financial results)

36
Q

what are exploratory visualizations

A

visualizations used to gain insight while you are interacting with the data (e.g., identifying good customers)

37
Q

what is the base standard

A

defines the formats for files & fields as well as master data requirements for users, business units, and tax tables.

38
Q

what is the general ledger standard

A

Defines the chart of accounts, source listings, trial balance, and general ledger or journal entry detail

39
Q

what is the order to cash subledger standard

A

defines sales orders and line items, shipments, invoices, open accounts receivable and adjustments, cash receipts, and customer master data.

40
Q

what is the procure-to-pay subledger standard

A

defines purchases and line items, goods received, invoices received, open accounts payable and adjustments, payments, and supplier master data

41
Q

what is the inventory subledger standard

A

defines inventory location master data, product master data, inventory on hand data, inventory movement transactions, physical inventory, and material cost

42
Q

what is the fixed asset subledger standard

A

defines fixed asset master data, additions, removals, and depreciation calculations.

43
Q

what is a homogeneous system

A

one single uniform installation or instance of a system

44
Q

what is a heterogeneous system

A

multiple installations or systems

45
Q

what does systems translator software do

A

attempts to map the various tables and fields from the varied enterprise systems in a heterogeneous system into a data warehouse

46
Q

what is a data warehouse

A

a repository of data accumulated from internal and external data sources, where all of the data can be analyzed centrally

47
Q

what is a flat file

A

a means of storing the data in one place; it is a single table of data with user-defined attributes that is stored separately from any application

48
Q

what is a correlation coefficient

A

a measure of how closely two datasets are correlated, or how predictive one is of the other

49
Q

what numbers do correlation coefficients range between

A

-1 to 1
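
A minimal sketch of one widely used correlation coefficient, Pearson's r, shows why the value stays between -1 and 1: the covariance is divided by the product of the two standard deviations. The cards do not say which coefficient is meant, so Pearson's r is an assumption here, as are the sample values.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    # Undefined (division by zero) if either list is constant.
    return covariance / math.sqrt(spread_x * spread_y)

# Hypothetical values.
units = [1, 2, 3, 4]
print(pearson_r(units, [2, 4, 6, 8]))   # moves together: close to 1
print(pearson_r(units, [8, 6, 4, 2]))   # moves opposite: close to -1
```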

50
Q

what is the hot hand fallacy

A

assuming events depend on one another when they are actually independent

51
Q

what is selection bias

A

drawing the wrong conclusion because the group you are deriving data from is not representative

52
Q

what is publication bias

A

studies with significant findings get published, while studies that find nothing tend not to be

53
Q

recall bias

A

participants do not remember previous events or experiences accurately, or once they are told something happened a certain way, they believe they remember it that way.

54
Q

what is survivorship bias

A

focusing on the people or things that made it past some sort of selection (we analyze data from existing companies, not the ones that failed)

55
Q

what charts would you use for conceptual (qualitative) data

A

bar charts, pie chart, heat map, tree map (for comparison)
symbol map (for geographic data)
word cloud (text data)

56
Q

what charts would you use for data-driven (quantitative) data

A

box and whisker plot (for outlier detection)
scatter plot (relationship between two variables)
line chart (trend over time)
filled map (geographic data)
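
The outlier detection behind a box and whisker plot is commonly done with the 1.5 × IQR rule: values more than 1.5 interquartile ranges below the first quartile or above the third quartile are flagged. The sketch below assumes that rule and uses made-up values; charting tools may compute quartiles slightly differently.

```python
import statistics

def iqr_outliers(values):
    """Flag values outside the whiskers using the 1.5 * IQR rule."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical values, e.g. daily transaction counts.
daily_counts = [10, 12, 12, 13, 14, 15, 16, 18, 95]
print(iqr_outliers(daily_counts))
```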