EXAM 2 Flashcards

1
Q

4.1 How do Accountants Design Data Analysis Projects?

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Create a Data Analysis Project Plan

What are the 5 steps?

Step 1: F___ on the o____e
Step 2: Select a d___ str___y
Step 3: Select an an___ str___y
Step 4: Consider r___
Step 5: Em___ co___s

A

Step 1: Focus on the objective
Step 2: Select a data strategy
Step 3: Select an analysis strategy
Step 4: Consider risks
Step 5: Embed controls

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Create a Data Analysis Project Plan
Step 1: Focus on the objective

Keep the project’s 1) o___ve and 2) sp___ q____s in mind to 3) select the 4) b___ data and 4) an___ strategies to 5) ful__ the ob___tive and answer those 6) q___

Simply remembering to ask how the plan’s 7) pro__ d__and analysis st___y decisions 8) re___ to the 9) objective helps us make better 10) ch__

A

1) objective
2) specific questions
3) select
4) best data
4) analysis strategies
5) fulfill the objective
6) questions
7) proposed data and analysis strategy decisions
8) relate
9) objective
10) choices

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Create a Data Analysis Project Plan
Step 2: Select a data strategy

Use 1) cr___ t____g to 2) d__p and r___ a few data 3) al___s.

This 4) en___ that we choose the data option most 5) ap___e for the objective

A

1) critical thinking
2) develop and rank
3) alternatives
4) ensures
5) appropriate

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Create a Data Analysis Project Plan
Step 3: Select an analysis strategy

Use what was 1) le___ from 2) s___g the data 3) str__ and apply that same 4) dev___t and r____g process to analyses 5) alt___

Following this step 6) inc___ the likelihood of selecting the 7) b___ analysis 8) o___n

A

1) learned
2) selecting
3) strategy
4) development and ranking
5) alternatives
6) increases
7) best
8) option

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Create a Data Analysis Project Plan
Step 4: Consider risks

1) Co___ and pr___g cr___l risks to both the data and the analysis strategies 2) re__s how these risks can create 3) mis___ and in___ results

A

1) Considering and prioritizing critical
2) reveals
3) misleading and invalid

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Create a Data Analysis Project Plan
Step 5: Embed controls

Designing and implementing 1) pre____ and de___ co___s into the analysis process leads to results that are 2) ac___, v____, and re___

A

1) preventative and detective controls
2) accurate, valid, and reliable

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

Step 2: Select the Data Strategy

Develop several data alternatives that could help answer the objective question.

Then, to select the most useful data alternative for the project plan, identify the factors you want to use to rank these alternatives and assign values to each alternative’s factors.

The best data strategy alternative is the one with the 1) hi___t ov__l factor r___gs

A

1) highest overall factor

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Step 3: Select the Analysis Strategy

For any project plan, choosing the best analysis strategy involves considering and evaluating several possible alternative analyses given the objective questions and the already selected strategy for the data

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Steps 4 and 5: Data Strategy Risks and Controls

These data risks can be controlled by comparing the extracted data to the source invoice and collection documents.

Possible risks involving the management’s estimates of bad debt percentages include human bias embedded in the authorized bad debt percentages, changes in customer payment behavior, and business process changes to credit customer approvals and collection practices.

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Steps 4 and 5: Data Strategy Risks and Controls

One way to control for these risks is to ask the finance, sales, and accounts receivable managers if there were changes to the customer base, market conditions, or business policies and procedures for credit approval, write-offs, or collection policies during the year. This knowledge could confirm existing bad debt percentages or motivate their adjustment.

Another control for evaluating the risks in management estimates and assumptions is to compare 1) c___t data to p___ y__ data. Evaluating increases in the median number of days outstanding and the number of new to returning customers in each age group can offer insights into the reasonableness of bad debt estimation percentages.

A

1) current data to prior year data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

4.2 What Should We Consider When Selecting Data for Analysis?

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

It’s easier to make better decisions when we have the 1) ne___ in___

A

1) necessary information

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

A successful data analysis project hinges on selecting data that are 1) rel___t and ap___ for the objective of the project, respecting the data’s 2) cha___s and me___ scales, and controlling for in__ data ri__

A

1) relevant and appropriate
2) characteristics and measurement scales, and controlling for inherent data risks

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Identify Appropriate Data

Data can be considered appropriate for analysis when they are 1) r___t, av__e, and the ch___ ma__ the analysis method 2) req___.

Appropriate data can be 3) in___, e___al, or a co____ of both

A

1) relevant, available, and the characteristics match
2) requirements
3) internal, external, or a combination of both

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Identify Appropriate Data

Internal data are generated within the organization, such as 1) s__ data, pur___ data, in__y data, c___er data, and ve___r data. Internal data can typically be more 2) ea___ co___d and v___d by an organization.

External data are obtained from sources outside of an organization. This data can include 3) we___ data, ge___ic data, and pub__ available co___or data. External data are somewhat 4) ri___r to use since we often 5) ca__t know if the data are 6) ac___ or co___. External data can, however, provide 7) in___ that internal data alone 8) c___t p___e

After identifying the available and relevant data alternatives, the 9) cha__ of the possible data sets need to be 10) ve___ as 11) s___e for the planned analysis.

A

1) sales data, purchase data, inventory data, customer data, and vendor data
2) easily controlled and verified
3) weather data, geographic data, and publicly available competitor data
4) riskier
5) cannot
6) accurate or complete
7) insights
8) cannot provide
9) characteristics
10) verified
11) suitable

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Define Data Set

A 1) co____ of data 2) col___ and r___ available for analysis

A

1) collection
2) columns and rows

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

Understanding the 1) cha____ of a data set is important because, for example, 1.1) st___al me___ and t__s often require certain data characteristics or a minimum of data points.

2) Vi___ the data requirements for these measures and tests can threaten the 3) ac___, re___y, and sig____e of the analysis results.

A

1) characteristics
1.1) statistical measures and tests
2) Violating
3) accuracy, reliability, and significance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

Define Fields

1) In____l columns representing the 2) cha___s about each 3) r___ stored in the columns of a data set

A

1) Individual columns
2) characteristics
3) record

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Define Attributes

The data fields that describe aspects of a 1) re___, e___t, or a___t of the object of 2) in___.

When the data source is a database, they are the 3) co__ in a data set.

A

1) resource, event, or agent 2) interest
3) columns

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Define Records

1) R___ in a data set from a database are records, which represent the collection of 2) co____ that hold the 3) descriptions of a 4) s___e oc_____ of the data set’s 5) pu____

A

1) Rows
2) columns
3) descriptions
4) single occurrence
5) purpose

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

In addition to understanding the content of data fields in accounting databases, considering the 1) so__ of the data is important because the 2) q____of the data in the 3) f___s impact the 3.1) q___ of the 4) a__s

A

1) source
2) quality
3) fields
3.1) quality
4) analysis

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

What are the 2 types of raw data fields?

-____d raw data
-N____-___ed raw data

A

-Measured raw data
-Non-measured raw data

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Define Measured Raw Data

Data 1) cr____ or ca___d by a 2) con____process capturing the 3) v__ of the data.

Examples include 4) p___e, c___t, n___r on hand, w___t, d___, h____s worked, temp___, sensor r___s, h____observation, or ec____c value.

Their format can be 5) di___ or co___ data.

A

1) created or captured
2) controlled
3) value
4) price, cost, number on hand, weight, date, hours worked, temperature, sensor readings, human observation, or economic value
5) discrete or continuous

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Q

Measured raw numeric data

can be used in ma___l cal__

A

mathematical calculations

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
26
Q

Define Non-measured raw data

Data often created 1) au___ by the 2) co___ or com___y policy for 3) c___l.

Examples include 4) ide____ codes, chart of ac___t nu___, standard d____s, product cat___ c__s, or loc___ codes such as city.

These fields are typically formatted as 5) di___ data

A

1) automatically
2) computer or company
3) control
4) identification codes, chart of account numbers, standard descriptions, product category codes, or location codes such as city
5) discrete

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
27
Q

Non-measured raw data

Non-measured raw data that have been formatted as 1) nu___ and di___e, meaning that they are 2) ___n-con____and will 3) n__r have 4) pa___ u___t v__.

These fields 5) ca___ be used in analysis for 6) mat____ cal___s because they refer to 7) u___e ide___ers for 8) in____ items and their 9) pro____category group.

A

1) numeric and discrete
2) non-continuous
3) never
4) partial unit values
5) cannot
6) mathematical calculations 7) unique identifiers
8) inventory items
9) product category

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
28
Q

Define Calculated Data

Data created when 1) o__ or m___ fields in a particular 2) r____ (r___) have any number of 3) ma___l operators (such as +, −, ×, %) applied, and often are 4) d___d from using the data in another 5) fi___ or fi__ within the 6) sa__
record (7) n___across 8) di____ rows).

These fields can be formatted as 9) d___e or co___ data.

A

1) one or more
2) record (row)
3) mathematical operators
4) derived
5) field or fields
6) same
7) never a
8) different
9) discrete or continuous

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
29
Q

Calculated data that is also 1) nu__and likely to be used in a variety of 2) dif___ mat___ ca___s in accounting.

A

1) numeric
2) different mathematical calculation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
30
Q

Define measurement scale

refers to the type of 1) in___ pr___ by the data.

Data measurement scales should be considered when 2) de___ the data strategy, as they impact which analyses can be 3) pe___ on the data.

A

1) information provided
2) designing
3) performed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
31
Q

What are the 4 types of data measurement scales?

-ca____
-o___l
-i____l
-r___

A

-categorical
-ordinal
-interval
-ratio

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
32
Q

Define Categorical (nominal) Data

1) La____or n____d data that can be sorted into 2) gr___s according to 3) sp___ ch___s. The data 4) d__ n___ have a 5) qu___ value

A

1) Labeled or named
2) groups
3) specific characteristics
4) do not
5) quantitative

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
33
Q

Define Ordinal Data

1) Or___or r__d ca___al data. 2) Di___ between the categories 3) d___ n__ need to be 4) k___ or e___

A

1) Ordered or ranked categorical
2) Distance
3) does not
4) known or equal

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
34
Q

Define Interval Data

1) O___ data that have 2) e__l and con___ different between 3) o___ns and an ar___ z___ po___t

A

1) Ordinal
2) equal and constant
3) observations and an arbitrary zero point

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
35
Q

Define Ratio Data

1) Int__ data with a 2) n___l z__ p__t. A natural zero point means that it’s 3) n__ ar___

A

1) Interval
2) natural zero point
3) not arbitrary.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
36
Q

Consider Data Strategy Risks and Implement Controls

Another benefit of documenting data strategy in a project plan is that examining each choice can help identify three common data 1) ri___ that might affect an analysis’s 2) v___, a____, and r___ty

A

1) risks
2) validity, accuracy, and reliability

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
37
Q

What are the 3 common data risks ? (ig this would be more summarized)

  1. N___-re____e sa___ sel____
  2. O___ data points
  3. D__Data
A
  1. Non-representative sample selections
  2. Outlier data points
  3. Dirty Data
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
38
Q

What are the recommended controls for the 3 data risks?

  1. Non-representative sample selections
    -1) v___ rep____s of sample
  2. Outlier data points
    -perform a 2) h___m or quartile analysis to identify 3) ou___, then 4) ex__ or justify the 5) r___ used for outlier 6) ad___t or re__l
  3. Dirty Data
    -7) v___ in___y of data set and cl___ u__ dirty data is___
A

1) verify representativeness of sample
2) histogram
3) outliers
4) explain or justify
5) rule
6) adjustment or removal
7) verify integrity of data set and clean up dirty data issues

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
39
Q

3 common data risks (more detailed)

The first risk is that a sample 1) ex____from a 2) la___r po___ of data are a 3) po___ representation of the 4) un____ population.

That is why performing tests on the sample’s 5) repr____ can either 6) ve___ its 7) val___ or d___t its significant 8) we___

A

1) extracted
2) larger population
3) poor
4) underlying
5) representativeness
6) verify
7) validity or detect
8) weakness

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
40
Q

3 common data risks (more detailed)

The second risk is potentially including unusual data points in an analysis.

Outlier data points are 1) un__ data points compared to the rest of the data in either an activity point (x-axis or the independent variable), such as number of units produced, or an unusual level of the economic value (y-axis or the dependent variable), such as total costs or total sales.

The best control for identifying outliers is to 2) vi___ the data in a graph to check if the measure is very different than the rest of the data.

Identified outliers can either be 3) re___ (if possible) or 4) re___ with a logical and documented rule.

A

1) unusual
2) visualized
3) remeasured
4) removed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
41
Q

Define Dirty Data

variety of data ___

provides 1) in___or inc___e descriptions of the economic 2) ac___ of a business

A

errors
1) inaccurate or incomplete
2) activities

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
42
Q

3 common data risks (more detailed)

Dirty Data

Dirty data include 1) m__g, in___ du__e, and inc___ formatted data.

For example, the purchase orders, invoices, or checks written by the company should be sequentially accounted for, with no missing numbers or two with the same identifying number.

Controls should 2) a__s t__t for dirty data 3) be__ the analysis is performed.

Data can be 4) co___ to its source 5) doc____, or if that is not possible, tested for 6) rea____. Testing for reasonableness can include checking for 7) m___g seq___ num__, 8) du___ numbers, or 9) ve___ the data have the ex___ fo___t or ac___ble characters.

A

1) missing, invalid, duplicate, and incorrectly
2) always test
3) before
4) compared
5) documents
6) reasonableness
7) missing sequence numbers
8) duplicate numbers
9) verifying the data have the expected format or acceptable characters.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
43
Q

Once data have been selected, evaluated, and any risks identified and controlled, the next step is to develop an analysis 1) st___ ap___e for the 2) o___ of the project

A

1) strategy appropriate
2) objective

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
44
Q

4.3 What Should We Consider When Selecting an Analysis?

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
45
Q

What are the two questions when designing a data analysis strategy?

  1. Can the ch___ a___is strategy an__r the sp__c ob___ qu___s?
  2. Is the me____ sc___ of the data ap____ for the selected analysis strategy?
A
  1. Can the chosen analysis strategy answer the specific objective questions?
  2. Is the measurement scale of the data appropriate for the selected analysis strategy?
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
46
Q

The specific type of descriptive and diagnostic analysis used depends on the 1) me____ s__ of the data

A

1) measurement scale

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
47
Q

1) Des____ analyses are the most common analysis strategies using accounting data.

It is useful for 2) ev____g str____ pe____e because it provides more 3) me_____ and b____ss in____

A

1) Descriptive
2) evaluating strategy performance
3) meaning and business intelligence

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
48
Q

Diagnostic analysis strategies can be compared to detective work. Accountants can use these analysis strategies to identify and discover the most likely 1) ca__s of accounting phenomena

Diagnostic strategies always require applying 2) cr___l thi___ skills.

A

1) causes
2) critical thinking

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
49
Q

Define Variable

a data 1) it__ that will be 2) us___ in the analysis

A

1) item
2) used

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
50
Q

Descriptive and diagnostic analyses help to better understand 1) w___ happened and 2) w__ it happened.

What if the objective is to predict a 3) fu___ outcome or determine what strategies 4) mi___t achieve a specific outcome? In those cases, we need to use 5) pre____ or pr____e analyses

A

1) what
2) why
3) future
4) might
5) predictive or prescriptive

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
51
Q

Predictive analysis strategies use 1) hi___l data to create models that estimate a 2) fu___ va___e or o___e.

Prescriptive analysis strategies, on the other hand, help determine which 3) op___ is most likely to produce the 4) b___ ou___e given the objective.

Predictive and prescriptive analysis strategies can determine the best use of 5) res___, im___e pe___ce, anticipate market re__s and mo___s, and av___ process breakdowns:

A

1) historical
2) future value or outcome
3) option is most
4) best outcome
5) resources, improve performance, anticipate market reactions and movements, and avoid process breakdowns

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
52
Q

The last two steps of a data analysis plan are to consider 1) ri___ in__t to the data and analysis 2) str___ and to put in place 3) co___s to 4) re___ those risk

A

1) risks inherent
2) strategies
3) controls
4) reduce

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
53
Q

It is critical to consider these risks during analysis strategy planning and to add controls to help 1) m___ them.

Without those controls we risk 2) pre___, inter___g, and rep____ inc___t analyses res___ possibly causing ourselves or our stakeholders to make 3) har__ dec___s

A

1) mitigate
2) preparing, interpreting, and reporting incorrect analyses results
3) harmful decisions

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
54
Q

4.4 How Do Data and Analysis Strategies Differ Across Practice Areas?

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
55
Q

As the last step of the data analysis planning stage, accounting professionals across practice areas 1) d___ data and analysis 2) str___ based on their project 3) ob___s

A

1) design
2) strategies
3) objectives

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
56
Q

Accounting Information Systems

Because a company’s accounting information system is involved in 1) pla___, ex___, con___, and rep___ the business’ operations, data analysis projects in this area often involve 2) in____ary data and 3) kn___

A

1) planning, executing, controlling, and reporting
2) interdisciplinary
3) knowledge

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
57
Q

Accounting Information Systems

These project objectives can range in scope from impacting just one or two employees to involving most of the organization.

All four types of analytics objectives (1) de___e, diag___, pr___e, and pre____) are commonly used.

A

1) descriptive, diagnostic, predictive, and prescriptive

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
58
Q

Accounting Information Systems

The data analyzed in these projects can include a variety of IT operational data with different levels of data 1) inte___ and co__l doc___n.

More traditional accounting data are often analyzed with 2) qua___ cat____l and or____ data:

-Counts of h___ tickets and inc____
-Er___ counts and d___y times
-L___n issues
-System s____n and _ service satisfaction

A

1) integrity and control documentation
2) qualitative categorical and ordinal

-Counts of help tickets and incidents
-Error counts and delay times
-Login issues
-System satisfaction and IT service satisfaction

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
59
Q

Accounting Information Systems

AIS data analyses projects may also choose data strategies which use 1) qua____ int___l and ra___ data:

Costs such as equipment and software installations and updates, and IT staff labor.

Budget variances for any of these costs

A

1) quantitative interval and ratio

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
60
Q

Accounting Information Systems

It is important to select data that captures the AIS system’s 1) per__, vuln___, and er___

A

1) performance, vulnerabilities, and errors

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
61
Q

Accounting Information Systems

Accountants designing analysis strategies for AIS projects are focused on increasing 1) co___ve adv____ and improving the 2) or___’s op__s.

They often use 3) st___s to understand which 4) ele__of the accounting system they should 5) f___ on. The mathematical formality of statistical tools can help to 5) pe___e management to make the necessary investments.

A

1) competitive advantage
2) organization’s operations
3) statistics
4) element
5) persuade

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
62
Q

Accounting Information Systems

Risks

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
63
Q

Auditing

Auditors use statistics to determine 1) r__s to account balances and 2) un__ transactions, especially in 3) jo___l en__s that involve estimates and assumptions.

They commonly work with different kinds of 4) d___ so___, from process documentation, journal entries, general and subsidiary ledgers, trial balances, to financial ratios

A

1) risks
2) unusual
3) journal entries
4) data sources

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
64
Q

Auditing

Auditors use analysis strategies for:

-Continuous auditing modules to test large data populations rather than testing samples with inference risks.

-Automated identification of dirty data, unusual transactions, and pattern anomalies to better reduce audit risk and fraud risks.

-Testing entire transaction cycles’ internal controls with process mining and tracing purchasing to payment flows, P-cards (purchasing credit cards) usage and payroll documentation. Testing the revenue cycle from order to collection

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
65
Q

Auditing

Auditors use analysis strategies for:

-Using robotic process automation (RPA) to remove the human (and often inconsistent) element of repetitive audit tasks, freeing auditors to focus on areas that require thoughtful critical thinking and judgments. (RPA applications and benefits to accountants are explained in the chapter that covers data and analysis developments in accounting.)

-Testing hypotheses about inventory and fixed assets with sampling, statistical tests, and inferences for the population of these assets.

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
66
Q

Auditing

Risks

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
67
Q

Financial Accounting

Financial accountants are primarily responsible for 1) cap____, re__g, pro___g, sto___, and re__g accounting information. 2) Ac__y, thoroughness, and documentation are essential.

The accounting data they use for analyses are both guided and restrained by accounting 3) re___ns and 4) go___ agency compliance

A

1) capturing, recording, processing, storing, and reporting
2) Accuracy
3) regulations
4) government

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
68
Q

Financial Accounting

Due to the measurement scales of the variables they need for analysis, financial accountants may have to 1) tran___ data depending on their 2) ob___ q___ns.

These data most often include 3) cat___al, in__al, and ra___ data

A

1) transform
2) objective questions
3) categorical, interval, and ratio

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
69
Q

Financial Accounting

The purpose of a data analysis strategy might be 1) des___ and di____ng based on corresponding financial accounting rules and regulations:

-2) Na___, ti___g, and aut____ of transactions (and diagnose 3) is___) charged to each account.

-4) In___l control effectiveness and weaknesses.

-The 5) com___ and rea___ of adjusting entries at the end of the period.

A

1) describing and diagnosing
2) Nature, timing, and authorization
3) issues
4) Internal
5) completeness and reasonableness

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
70
Q

Financial Accounting

Financial accountants also use analysis strategies designed for their 1) pre___ and pr____e objectives regarding financial outcomes for managers and board members. One example is 2) es___g additional information to be presented along with their financial statements, such as:

-Ranking alternative capital sources and costs of capital in terms of favorability.

-Future net income and cash flows from operations, investing, and financing activities.

-Impacts of new strategies on the financial statement.

-Expected future costs associated with contingent liabilities, pension costs, new business units or the discontinuation of business units

A

1) predictive and prescriptive
2) estimating

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
71
Q

Financial Accounting

Financial accountants use statistics to identify 1) opp___s and pr___s with 2) pr___, li___y, and business va___

A

1) opportunities and problems
2) profitability, liquidity, and business valuation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
72
Q

Financial Accounting

Risks

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
73
Q

Managerial Accounting

The purpose of most managerial accounting data analyses is to improve 1) pla___, op___al control, and de___-m___g to support an organization’s mission and strategies.

This type of data analysis is valuable for improving the 2) se____, ex___n, and ev____n of organizational strategy.

These kinds of analyses can also lead to decisions that give employees more access to information, which improves performance and, eventually, organizational culture and operation

A

1) planning, operational control, and decision-making
2) selection, execution, and evaluation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
74
Q

Managerial Accounting

Managerial accountants prepare analysis strategies, both for routine and ad hoc (one-time) objectives for each functional area of their organization. These strategies use data across measurement scales to describe, diagnose, predict, and prescribe strategy impacts:

-Identifying areas where innovation in business organization, policy and process will increase efficiencies, for example, to reduce non-value-added steps and delays.

-Identifying areas where innovation in business partnerships, operations, and internal controls will increase effectiveness.

-Increasing competitive advantage through many new business intelligence opportunities.

-Improving compliance with all legal and regulatory regulations

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
75
Q

Managerial Accounting

Identifying risks to the data and analyses used in managerial accounting is important to 1) e__e analysis results are 2) ac___ and re__e

A

1) ensure
2) accurate and reliable

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
76
Q

Managerial Accounting

Risks

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
77
Q

Tax Accounting

tax accountants have more information for their tax 1) pla___ and co___ce services, which improves their 2) jud____ and how they document defend their positions to both clients and regulators.

A

1) planning and compliance
2) judgments

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
78
Q

Tax Accounting

Perhaps the biggest effect of data analysis strategies on tax practice is the movement away from a dependency on 1) h__c data and toward a forward-looking, 2) v___-ad___ se__ perspective by using 3) pr___ and pre___g analytics strategies.

Examples of these predictive and prescriptive analytics include using complex data modeling to evaluate decision and position alternatives, which improves the value and accuracy of advice

A

1) historic
2) value-adding service
3) predicting and prescribing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
79
Q

Tax Accounting

Risks

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
80
Q

5.1 What is Data Profiling?

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
81
Q

Dirty Data

-can result in issues from 1) in__ pricing to an inability to detect 2) f__d to sending wrong bills to customers

-data preparation helps 3) a___d these 4) is__ and more

A

1) inaccurate
2) fraud
3) avoid
4) issues

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
82
Q

Define Data Preparation

The process of 1) tran____ data into an analytical database by 2) pr___g, cle___, re__ng and in____g data prior to processing and analysis

It helps ensure 3) hi___-qu___ data and, therefore, improved 4) d___n-m__g.

A

1) transforming
2) profiling, cleaning, restructuring and integrating
3) high-quality
4) decision-making

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
83
Q

Define Data Profiling

the process of 1) in____ data 2) q___y and st____re

A

1) investigating
2) quality and structure

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
84
Q

What are the 3 parts of Data Profiling

  1. Inv___ the data q___ty
  2. Inv____ data str___e
  3. De___and inf___g
A
  1. Investigating the data quality
  2. Investigating data structure
  3. Deciding and informing
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
85
Q

3 parts of Data Profiling

  1. Investigating data quality: Search for 1) an___s in the data. That is, are the data 2) di__?
  2. Investigating data structure: Find the 3) b__t way to 4) org___ the data and 5) im___ an__s.
  3. Deciding and informing: Make 6) dec___ about whether it is possible to address the identified 7) is___, what the 8) c__t of doing so would be, and consider the possible 9) cons___ if the issues are not 10) ad___.
A

1) anomalies
2) dirty
3) best
4) organize
5) improve analytics
6) decisions
7) issues
8) ocst
9) consequences
10 addressed

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
86
Q

Decisions made in the final phase will guide the 1) ex__, tran___, l__ (__) process by determining what must be 2) c___e

A

1) extract, transform, load (ETL)
2) change

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
87
Q

For now, keep in mind that data profiling detects data 1) iss__ and 2) __ corrects them.

A

1) issures
2) ETL (extract, transform, load)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
88
Q

Investigate Data Quality

When discussing the “quality” of the data we work with, we are referring to the 1) sui__ of using the data for 2) de___n-m___.

Assessing data quality identifies 3) fla___ values in the data set, which reveals if there are data that must be 4) cle___.

There are different methods for doing this, which is called

-5) ru__-dr__ method
-6) exp___n and in___e

A

1) suitability
2) decision-making
3) flawed
4) cleaned
5) rule-driven method
6) exploration and inference

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
89
Q

Investigate Data Quality
Rule-Driven Method

The rule-driven method is a 1) t__p-d___ approach.

A 2) log___ relationship, or 3) r___, is defined 4) a___g data and 5) te__ to determine if the data 6) con__ to it.

The number of rules that can be specified is nearly 7) unl___

A

1) top-down
2) logical
3) rule
4) among
5) tested
6) conform
7) unlimited

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
90
Q

Investigate Data Quality
Exploration and Inference Methods

The exploration and inference methods are 1) bot___-__ approaches. The goal is to find 2) an___s by 3) exa___ the data from many different 4) per____.

Sorting, 5) fre___ distributions, and 6) o___r analysis are examples of powerful techniques for exploration purposes.

The second bottom-up approach, inference, is a method that relies on 7) co___ algorithms to identify 8) an___s
EX. sorting and frequency distributions

A

1) bottom-up
2) anomalies
3) examining
4) perspectives
5) frequency
6) outlier
7) computer
8) anomalies

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
91
Q

Rule-driven, exploration, and inference methods identify data 1) ano__, which occur when the data 2) d_ n_t meet expectations of 3) cor___, val__, co___y, and com___

A

1) anomalies
2) do not
3) correctness, validity, consistency, and completeness

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
92
Q

Define Validation Rules

An 1) in___l part of data profiling, these rules define what 2) va__ are and are 3) n__ acceptable for analysis when investigating data 4) q___

A

1) integral
2) values
3) not
4) quality

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
93
Q

What are Data quality characteristics?

-cor__
-va____
-inc___t value, inv___value

A

-correctness
-validity
-incorrect value, invalid value

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
94
Q

Correctness

Data describe facts about entities, such as the name of a customer, the price of a product, or the date of a transaction.

Data are incorrect when the value assigned to an entity’s characteristic is wrong

A
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
95
Q

Validity

An incorrect value means the 1) wr__ value is assigned, and an invalid value means an 2) un____ value is assigned

A

1) wrong
2) unacceptable

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
96
Q

Consistency

In addition to being correct and valid, data should be 1) con___

Data inconsistency occurs when the 2) s___e cha__c is represented 3) m__ways

A

1) consistent
2) same characteristic
3) multiple

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
97
Q

Using mgr, mngr, and manager to describe the same job position would create data in___

A

inconsistency

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
98
Q

Consistency

How can we identify inconsistencies like these?
Here are two profiling techniques:

-Create a 1) li__ with all distinct 2) val___, then 3) s___ and review. The inconsistent values of mgr, mngr, and manager will likely be noticeable immediately.

-Build a 4) fre__ table, or a table that 5) co___ how many times a 6) v__e occurs. Values with a 7) l___ frequency might indicate 8) inc___ data.

Which is the best option ?

A

1) list
2) values
3) sort
4) frequency
5) counts
6) value
7) low
8) inconsistent

the second one is the best option

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
99
Q

Completeness

Data that is correct, valid, and consistent are only 1) ac___ if they are also 2) co___e, as noncomplete data might result in 3) dis___ insights

A

1) accurate
2) complete
3) distorted

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
100
Q

Completeness
Data can be in complete in 2 ways.

  1. A missing instance is when a 1) co__t occurred but is not 2) rec___, (nothing there) such as when goods were sold, but the sales transaction was not recorded. A missing instance such as a missing sales transaction can be identified with gap analysis. If there is a sequential number for each sales invoice, then a missing number might indicate a sale that occurred but was not recorded.
  2. A missing value occurs when the 3) tra___ is recorded, but we 4) __ n___ have information for all 5) cha__. (we have some info, but not complete) The result is empty cells. This might happen if a customer is recorded, but the record does not include the customer’s email address. The term null typically indicates a missing or unknown value. It is important to assess how the missing information affects decision-making
A

1) concept
2) recorded
3) transaction
4) do not
5) characteristics

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
101
Q

Investigate Data Structure

Along with their quality, data are investigated to assess whether they are 1) str___ in a way that makes data analytics 2) e__ and ef___t

A

1) structured
2) easy and efficient

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
102
Q

Investigate Data structure

Unambiguous descriptions
-poorly named columns make it complicated to develop information models and to conduct analysis
-column names should be correct, intuitive, and clear because they’re part of the analytical database, which might have many different users
-unambiguous descriptions should be a priority when developing an analytical database
-analytical database: integrated data set used for analyses purposes

A
103
Q

Define Analytical Database

A 1) c__, w___l-st___d, in___ data set that can be used for analysis

A

1) clean, well-structured, integrated

104
Q

Define Slicing

Analysis includes 1) ag___, or gr__, data and then examining them from different 2) vie___s in many ways

A

1) aggregating, or grouping
2) viewpoints

105
Q

What 2 table structures are common best practices?

Si__-Va__d Co__s

F___ Tables

A

Single-Valued Columns

Flat Tables

106
Q

Define Single-Valued Columns

A 1) co___ in which each cell contains 2) o__ value describing a 3) si__e ch__

4) T__ or m___e values in the same cell makes analysis more 5) ch___g

A

1) column
2) one value
3) single characteristic
4) Two or more
5) challenging

107
Q

Define Composite Column

A column with cells that 1) co___ values for 2) t__o or more cha__

A

1) combines
2) two or more characteristics

108
Q

Define Multi-Valued Column

A column with cells that contain 1) mu___ values of the 2) s__ ch___c

A

1) multiple
2) same characteristic

109
Q

Define Flat Tables

column 0) h___ 1) ___ n__t contain data values 2) u__l for analysis purposes.

For data analysis, flat table structures are 3) pre___ over 4) cros___n tables

it’s the 5) __ practice for structuring data

A

0) headers
1) do not
2) useful
3) preferred
4) crosstabulation
5) best

110
Q

Crosstabulation tables have li___ for analysis

A

limitations

111
Q

Define Data Model

defines how different 1) ta__ re__ te to each other

A

1) tables relate

112
Q

Define Star Schema

The 1) reco___ data structure for analytical databases. They consist of 2) f___ t___s and di___n tables

A

1) recommended
2) fact tables and dimension

113
Q

Define Facts

In an accounting context, the 1) st__ data that correspond to business 2) tran___ such as 3) or__, sa___, pu__s, and pa___s

A

1) stored
2) transactions
3) orders, sales, purchases, and payments

114
Q

Fact Tables

The columns in fact tables are also mostly 1) qua___, and more specifically, 2) ad___

The values in these columns can be easily 3) ag__d, or gro___ together, as me__s.

These measures can then be 5) s__ using the columns in the next type of tables

A

1) quantitative,
2) additive
3) aggregated, or grouped together, as measures
4) sliced

115
Q

Define Dimensions

In an accounting context, these are the data that 1) pro___ co__t to analysis and give 2) me___ to f__s.

They are the 2) va___es or fi__s that can be used to 4) dr__ do___ or dis__te (sli__) analysis measures

A

1) provide context
2) meaning to facts
3) variables or fields
4) drill down or disaggregate (slice)

116
Q

Relationships

Relationships, the final element of data models, 1) l___k tables and are represented by 2) l___s in 3) s___r sc___a model.

All relationships in a star schema have a 4) o___to-m___y (___-___) 5) car____ pattern

A

1) link
2) lines
3) star schema
4) one-to-many (1-N)
5) cardinality

117
Q

Define cardinality

A constraint that defines how many times an instance of an entity may participate in a relationship

A
118
Q

Cardinality

It can take two possible values, N or 1:

N: An instance can 1) pa___e 2) m___ times in the 3) re____. There is 4) ___ restriction.

1: An instance can participate only 5) o___ time in the relationship.

A

1) participate
2) many
3) relationship
4) no
5) one

119
Q

A 1-N cardinality pattern for a relationship between a dimension and a fact table can be interpreted as follows:

1: For every fact, there is just 1) o__e co____ng va__ in each dimension table. For each instance of the Sale table (an invoice line), there is one date, product, customer, and salesperson.

N: There can be 2) m__y fa__ for each dimension. There can be many sales on the same day, and the same product type can be sold many times. Both customers and salespeople can be involved in many sales transactions.

A

1) one corresponding value
2) many facts

120
Q

Decide and Inform

After investigating the quality and structure of the data, it is time to take the next step:

-D__ n__ m__e fo___d

-Re___ the s____ce system

-Co___m the data’s fit___ for the analysis and move f__

If the decision is made to move forward, the next step in the process would be informed by any 1) is___s that must be 2) ad___.

A

-Do not move forward

-Redesign the source system

-Confirm the data’s fitness for the analysis and move forward

1) issues
2) addressed

121
Q

Do not move forward

If the data are 1) u__t, such as the data having 2) p___r qu___, then using the data for 3) d__n-ma___ would be too 4) ris___.

Similarly, if the data structure is poor, then 6) res___g and pr__g the data might be too 6) com___ and eco___y 7) unf___

A

1) unfit
2) poor quality
3) decision-making
4) risky
5) restructuring and processing
6) complicated and economically
7) unfeasible

122
Q

Redesign the source system

This decision is usually made because 1) er__s generated by the source system need to be 2) cor___. Once 3) f__d, the data source should be 4) pr___ again

A

1) errors
2) corrected
3) fixed
4) profiled

123
Q

Confirm the data’s fitness for the analysis and move forward

Moving forward with the data requires being 1) ri___-aw___

That is, we can move forward with the data with an awareness of potential 2) rel___y iss__ and their 3) im___ on 4) dec___

A

1) risk-aware
2) reliability issues
3) impact
4) decisions

124
Q

5.2 What Does it Mean to Extract-Transform-Load (ETL) Data?

A
125
Q

Define Extract-Transform-Load (ETL)

co__s is___s

A

corrects issues

126
Q

What’s the Extract-Transform-Load (ETL)

  1. R__ Data
  2. Ex___
  3. Tr___m
    -cle___
    -re____g
    -in___
  4. Lo___
  5. An___l Da___se
A
  1. Raw Data
  2. Extract
  3. Transform
    -cleaning
    -restructuring
    -integrating
  4. Load
  5. Analytical Database
127
Q

Define Data Extraction

The 1) ___ step in the ETL process, which involves 2) tra___ data from where they are 3) st__d to the platform where they will be 4) tra___

A

1) first
2) transferring
3) stored
4) transformed

128
Q

Extract Data

-Source data are moved to the platform where they will be transformed. This platform is normally a data warehouse, which is software that 1) st___ and an___s large data sets. Power BI and Tableau are examples of data warehouses.

-The process also includes data 2) va___, or confirming that the data were transferred 3) com__ and cor___.

A

1) stores and analyzes
2) validation
3) completely and correctly

129
Q

Transform Data

1) R__ data are 2) ra___ ready for analysis after they are 3) e___d

Data transformation 4) im___ the raw data for analysis through 5) cl___, res___g, and int___n

A

1) Raw
2) rarely
3) extracted
4) improves
5) cleaning, restructuring, and integration

130
Q

Cleaning Data

Data can be 1) inc____, inv___, inco____ or inco___.

Cleaning data, one of the most 2) im___ and t___e-con___aspects of data transformation, is also known as data 3) cl___or data 4) sc___ng. It involves 5) ad__, mo___g, and de___data

A

1) incorrect, invalid, inconsistent or incomplete
2) important and time-consuming
3) cleansing
4) scrubbing
5) adding, modifying, and deleting data

131
Q

Cleaning Data

-In the case of incomplete data, such as a missing sales transaction, data might need to be 1) ad__. A 1.1) sp__c st___y for dealing with incomplete data is 2) imp__, which is when estimated values are 3) sub___ for m__g data.

-Modifying data is necessary when a 4) cu__ value must be 5) re__ with a new 6) v__ if data are 7) inc___ in__, in__, or inc___te. Replacing the value NY with NJ in a column that records a customer’s state is an example of this.

-8) De__e data that are not 9) re___ for analysis. 10) Re___t data, such as a duplicate sales transaction, should be removed.

A

1) added
1.1) specific strategy
2) imputation
3) substituted for missing
4) current
5) replaced
6) value
7) incorrect, invalid, inconsistent, or incomplete
8) Delete
9) relevant
10) Redundant

132
Q

Define Data Restructuring

Also known as data 1) wra___ or data 2) mu__ this process 3) ch__ how the data in a data set are 4) or__d, such as 5) a___g, d___g, and ren___ columns or splitting and combining tables

does 6) n__ ch__ data values, but it does change how the data are 7) org__

A

1) wrangling
2) munging
3) changes
4) organized
5) adding, deleting, and renaming
6) not
7) organized

133
Q

Define Data Integration

The process of 1) co___ related data by 2) d___g rela__s with 3) p__y and f__n keys or combining 4) t___ or m___ tables with information from the 5) s___ entity.

A

1) connecting
2) defining
3) primary and foreign
4) two or more
5) same

134
Q

Integrating Data
There are two distinctive forms of integration:

-1) Li__g two tables by defining a 2) re___p between them. Relationships are created using 3) p___ry and for___ keys. Other aspects of relationships that must be specified are 4) ca___s.

-Combining two or more tables unites 5) inf___about the 6) s___ entity. Tables can be combined two ways.
-A union 7) co___es dif___ tables with the same data 8) str___. The result is a table with 9) m__ r___s.
-Recall from the foundational data analysis chapter that a join, or merge, combines data 10) el___ or co__s from 11) dif___ t__s. The result is a table with more 12) co__s.

A

Integrating Data
There are two distinctive forms of integration:

1) Linking
2) relationship
3) primary and foreign
4) cardinalities.
5) information
6) same entity
7) combines
8) structure
9) more rows.
10) elements or columns 11) different
12) columns.

135
Q

Define Data Matching

A process that 1) co____s data and determines whether they 2) d___e the 3) s___ entity

Data Matching is a challenge specific to 4) in__

A

1) compares
2) describe
3) same
4) integration

136
Q

Define Data Loading

The process of making the 1) ana___ d__e available for 3) u__e in software that 4) p__s analysis

Once the data are 4) cle___ and tra___, they are 5) lo__ into the 6) sof__ for analysis

A

1) analytical database available
2) use
3) performs
4) cleaned and transformed
5) loaded
6) software

137
Q

5.3 Which Patterns Extract Data?

A
138
Q

Each data analytics project has unique data preparation challenges.

While there is not a single, common approach for all projects, a structured set of data preparation patterns can address most challenges. Thesepatterns signal potential problems and provide guidance for finding them within the data set and correcting them.

Each pattern identifies a data issue, discusses how to detect it with a profiling method, and explains one or more ETL methods to correct it. Think of the patterns as a menu; you can select those which are most appropriate for your needs.

A
139
Q

The first step is identifying the 1) a__ data and 2) c___ng a data 3) dic___

A

1) available
2) creating
3) dictionary

140
Q

Define Data Dictionary

A 1) c__of the data in a data set that 2) in___ what data are 3) a___ and where they can be 4) f__.

A

1) chart
2) indicates
3) available
4) found

141
Q

A data dictionary 1) re___ds different pieces of 2) inf___ for each field, including a 3) na__, a brief 4) des___ of content, the data 5) t__e, whether the field is a 6) pr__y or for___n key, and whether the field is 7) man___ or op___al

A

1) records
2) information
3) name
4) description
5) type,
6) primary or foreign key
7) mandatory or optional

142
Q

A data dictionary is often referred to as 1) m___d__a–it is data about 2) d__a.

It is built gradually throughout the data 3) pre___ process, but creating 4) col__ names and descriptions is a good starting point

A

1) metadata
2) data
3) preparation
4) column

143
Q

What are the 2 extraction patterns ?

-I____e Data Transfer
-In___ Data Transfer

these determine if all data have been e__d and if they were tra___ cor___

A

-Incomplete Data Transfer
-Incorrect Data Transfer

these determine if all data have been extracted and if they were transferred correctly

144
Q

Data Preparation Pattern 1: Incomplete Data Transfer

Extraction transfers data from the source files to the 1) __ tool for further processing. An 2) in___transfer of data and 3) mi__ data leads to 4) unr___ results

A

1) ETL
2) incomplete
3) missing
4) unreliable

145
Q

Data Preparation Pattern 1: Incomplete Data Transfer
Compare Row Counts

Comparing row counts is one way to check com___ss

A

completeness

146
Q

Data Preparation Pattern 1: Incomplete Data Transfer
Add Missing rows

If the row counts 1) d__ ma__, add the 2) m__g r__s to the source data

A

1) don’t match
2) missing rows

147
Q

Data Preparation Pattern 1: Incomplete Data Transfer
Summary

Problem: All the data __ __t tra___.

Detect (Data Profiling): Com__ r__ counts.

Correct (ETL): a__ the m__g rows.

A

Problem: All the data did not transfer.

Detect (Data Profiling): Compare row counts.

Correct (ETL): Add the missing rows.

148
Q

Data Preparation Pattern 2: Incorrect Data Transfer

Even when all the rows have been transferred, the data might 1) n___ have been transferred 2) cor___. This is often caused by 3) di____ in data types.

A

1) not
2) correctly
3) differences

149
Q

Data Preparation Pattern 2: Incorrect Data Transfer

Problem: Data was 1) n__ cor___ transferred.

Detect (Data Profiling): 2) Com___ co___l amounts.

Correct (ETL): 3) mo___ i__ct values.

A

1) not correctly
2) Compare control
3) Modify incorrect

150
Q

5.4 Which Patterns Transform Columns?

A
151
Q

Once all data are transferred to the ETL tool, it is time to transform them.

Transformation has two purposes–1) cle___ the data by 2) cor___ values and 3) re___ and int__the data for analytics

A

1) cleaning
2) correcting
3) restructuring and integrating

152
Q

Data Preparation Pattern 3: Irrelevant and Unreliable Data

Data that are irrelevant for decisions 1) bl___ the data model. It is also important to avoid integrating 2) unr___ data into the data model (recall the old adage, “garbage in, garbage out”).

Keep in mind that 3) ex___g data from an analytical database is 4) n__ the same as 5) de__the data. The raw data still 6) e___t and can be 7) int___if necessary.

A

1) bloat
2) unreliable
3) excluding
4) not
5) deleting
6) exist
7) integrated

153
Q

Data Preparation Pattern 3: Irrelevant and Unreliable Data
-Scan Columns for Irrelevant and Unreliable Data

Irrelevant columns can be identified primarily by 1) sc__ the data 2) vi__y. The data 3) dic__ can also be a helpful tool

Scanning data can also determine whether a 4) co___ contains 5) u___e data.

Most ETL tools provide statistics about 6) er___, null values, and more, that can help determine a column’s 7) rel__

A

1) scanning
2) visually
3) dictionary
4) column
5) unreliable
6) errors
7) reliability

154
Q

Data Preparation Pattern 3: Irrelevant and Unreliable Data
-Remove Columns With Irrelevant or Unreliable Data

To correct the issue, 1) d___e the columns with 2) ir___nt and 3) unr___ data from the analytical database

A

1) delete
2) irrelevant
3) unreliable

155
Q

Data Preparation Pattern 3: Irrelevant and Unreliable Data
Summary

Problem:
1) Irr__ data 2) b__t a data set. 3) Un__ data 4) incr__ the 5) r___ of making 6) b___ decisions.

Detect (Data Profiling):
7) Vi__ s__n columns for 8) irrelevant and unreliable data.

Correct (ETL):
9) Re__ columns with irrelevant or unreliable data.

A

1) Irrelevant
2) bloat
3) Unreliable
4) increases
5) risk
6) bad
7) Visually scan
8) irrelevant and unreliable
9) Remove

156
Q

Data Preparation Pattern 4: Incorrect and Ambiguous Column Names

Column names become variables during data 1) exp__ and int___on. Their names are important because other people 2) m___ use the analytical database. Essentially, column names become part of the database’s 3) voc__.

Column names that are 4) co__, int___, and una__us make it 5) e___ for other people to 6) u___ a database and 7) ex___ data.

A

1) exploration and interpretation
2) might
3) vocabulary
4) correct, intuitive, and unambiguous
5) easier
6) use
7) explore

157
Q

Data Preparation Pattern 4: Incorrect and Ambiguous Column Names
-Scan Columns for Incorrect or Ambiguous Names

Visually scanning a column’s content and its data dictionary definition can reveal whether the column name accurately reflects its content. Here are four rules for naming columns:

  1. 1) Na__ should 2) ac___y describe the column’s 3) con___
  2. They should be 4) int__e to 5) bu__ people.
  3. Use only common 6) ab____ that are 7) un____ by everyone, such as YTD (year to date)
  4. Eliminate 8) s___s, un___, or other symbols. For example, use CustomerName instead of Customer_Name.
A

1) Names
2) accurately
3) content
4) intuitive
5) business
6) abbreviations
7) understood
8) spaces, underscores

158
Q

Data Preparation Pattern 4: Incorrect and Ambiguous Summary

Problem:
1) In___ or am__s column 3) na__ make it 3) ha__to 4) u__d and work with a data set.

Detect (Data Profiling):
5) Sc__ columns for 6) inc_ or amb__ names.

Correct (ETL):
7) Re__ columns.

A

1) Incorrect or ambiguous
2) names
3) harder
4) understand and work
5) Scan
6) incorrect or ambiguous
7) Rename

159
Q

Data Preparation Pattern 5: Incorrect Data Types

1) Ver__ data 2) t__s is an essential part of data 3) pre___

Data types are an integral part of column definitions because they 4) det__ what we 5) c___ and ca__ do with the data in a column.

A

1) Verifying
2) types
3) preparation
4) determine
5) can and cannot

160
Q

Data Preparation Pattern 5: Incorrect Data Types
Summary

Problem:
An 1) in__ data type 2) li__s what 3) c__ be done with the data in a column.

Detect (Data Profiling):
4) I___t the data type.

Correct (ETL):
5) C___e the data type.

A

1) incorrect
2) limits
3) can
4) inspect
5) change

161
Q

Data Preparation Pattern 6: Composite and Multi-Valued Columns

Each cell should contain 1) o__ va__ describing one 2) cha___c because 3) t__ or m__ values in the4) sa___ cell makes analysis more 5) cha__

A

1) one value
2) characteristic
3) two or more
4) same
5) challenging

162
Q

Data Preparation Pattern 6: Composite and Multi-Valued Columns

What are two specific scenarios that violate the single-valued rule and make analysis more complex?

A

composite columns and multi-valued columns.

163
Q

Data Preparation Pattern 6: Composite and Multi-Valued Columns
-Scan for Composite and Multi-Valued Columns

The best method to detect composite or multi-valued columns is 1) vi__ sc___g

A

1) visual scanning

164
Q

Data Preparation Pattern 6: Composite and Multi-Valued Columns
Summary

Problem:
Columns that are 1) n___ sin___v__d make analytics 2) di__

Detect (Data Profiling):
3) Ex__the 4) v__ in the columns.

Correct (ETL):
5) Sp___ combined columns. Create a 6) sep__ ta__ for 7) m__-v__d columns.

A

1) not single-valued
2) difficult.
3) Examine
4) values
5) Split combined columns
6) separate table
7) multi-valued

165
Q

Data Preparation Pattern 7: Incorrect Values
Summary

Problem:
1) Inc__ data might result in 2) po___ dec___-m__g.

Detect (Data Profiling):
3) De___ inc__ values with 4) ou___

Correct (ETL):
5) Mo__ incorrect values.

A

1) Incorrect
2) poor decision-making.
3) Detect incorrect
4) outliers
5) Modify

166
Q

Data Preparation Pattern 8: Inconsistent Values

The next pattern addresses data inconsistency, which occurs when 1) t___ or m___ 2) di___ re___s of the 3) sa__ value are 4) m__d in the same 5) c___n.

A

1) two or more
2) different representations
3) same
4) mixed
5) column

167
Q

Data Preparation Pattern 8: Inconsistent Values
-Identify Inconsistent Values

Two profiling techniques are useful for detecting inconsistent values:

  1. 1) Di__ v__: 2) Vi__ sc___ the 3) di___values of a column is an 4) ef__e way to 5) id__ inco___ data.
  2. 6) Fr__: Values with a 7) l__ frequency could indicate 8) in__t data.
A

1) Distinct values
2) Visually scanning
3) distinct
4) effective
5) identify inconsistent
6) Frequencies
7) low
8) inconsistent

168
Q

Data Preparation Pattern 8: Inconsistent Values
Summary

Problem:
Inconsistent data might result in 1) po__ d___n-ma__g.

Detect (Data Profiling):
2) I____ inconsistent values.

Correct (ETL):
3) Mo__ inconsistent values

A

1) poor decision-making.
2) identify
3) modifty

169
Q

Data Preparation Pattern 9: Incomplete Values

This pattern addresses 1) inco__ that might make data 2) unu____ and un__.

For example, without customers’ addresses, a business cannot send them marketing materials.

The different meanings given to 3) n__ values might also result in unreliable data.

A

1) incompleteness
2) unusable and unreliable
3) null

170
Q

Data Preparation Pattern 9: Incomplete Values

There are a few dimensions of incompleteness worth exploring:

  1. Should null values be allowed, and if not, are there any?
  2. If null values are allowed, what percentage of the values are null? If the percentage is high, should the column be loaded?
  3. How are incomplete values represented: nulls, or a specific code? Are the representations consistent?
A
171
Q

Data Preparation Pattern 9: Incomplete Values
-Remove the Column or Replace the Null Values

The correction scenario depends on the situation. If null values are not allowed but they exist,they should be 1) re__

If the number of null values is too high to be useful, then 2) re___ the column from the analytical database.

If there is inconsistency in representing missing values, design a 3) co___t schema and 4) co__ the values in terms of that schema

A

1) replaced
2) remove
3) consistent
4) correct

172
Q

Data Preparation Pattern 9: Incomplete Values
Summary

Problem:
Incompleteness might make data 1) un__ and un__e and result in 2) p___r decision-making.

Detect (Data Profiling):
3) In___ n___l values.

Correct (ETL):
4) Re__ the column or 5) replace the 6) n__ values.

A

1) unusable and unreliable
2) poor
3) Investigate null
4) Remove
5) replace
6) null

173
Q

Data Preparation Pattern 10: Invalid Values

Domain-specific rules that determine whether data are acceptable can be created for most columns. Data that do not meet these expectations are considered invalid.

A
174
Q

Data Preparation Pattern 10: Invalid Values
Create and Apply Validation Rules

A
175
Q

Data Preparation Pattern 10: Invalid Values
Summary

Problem:
Invalid data might result in 1) p___ decision-making.

Detect (Data Profiling):
2) Cr__and ap__ v___on rules.

Correct (ETL):
3) Mo__ invalid values.

A

1) poor
2) Create and apply validation
3) modify

176
Q

5.5 Which Patterns Transform Tables?

A
177
Q

Data Preparation Pattern 11: Non-Intuitive and Ambiguous TableNames
-Scan Tables for Incorrect or Ambiguous Names

Examining a table’s 1) c__t and its data 2) di___ry definition can help determine whether the name 3) acc__ reflects its 4) co__. The rules for naming 5) ta___ are the same as those for naming 6) c__s. They should be 7) in___, avoid sp__, un__, and s__l co__g:

For example, use CashReceipt instead of Cash Receipt.
Avoid special coding like DCustomer (the D refers to a dimension table, but not everyone will understand that).

A

1) content
2) dictionary
3) accurately
4) content
5) tables
6) columns
7) intuitive, avoid spaces, underscores, and special coding:

178
Q

Data Preparation Pattern 11: Non-Intuitive and Ambiguous TableNames
Summary

Problem:
Incorrect or ambiguous table names make it 1) h___to 2) und__and wo___ with a data set.

Detect (Data Profiling):
3) Vi__ sc__ tables for incorrect or ambiguous names.

Correct (ETL:
4) Re__ tables.

A

1) harder
2) understand and work
3) visually scan
4) rename

179
Q

Data Preparation Pattern 12: Missing Primary Keys

This pattern focuses on primary keys. Tables are descriptions of entities, and each instance of an entity should be 1) un__ id__d.

To be a primary key, a column must have a 2) un___ value for each instance and 3) n__ n__ values. Primary keys are normally already in place when data are 4) e__d from a 5) rel__ database.

However, a primary key will 6) n__ be in place when data are extracted from a spreadsheet, such as in the Beans case.

To establish a primary key, the field must be 7) se__ and both rules must be 8) v__

A

1) uniquely identified
2) unique
3) no null
4) extracted
5) relational
6) not
7) selected
8) validated

180
Q

Data Preparation Pattern 12: Missing Primary Keys

Problem:
Some tables 1) __ n__ have a 2) p__ key.

Detect (Data Profiling):
3) Ide__ tables 4) mi__ a 5) pr__ key.

Correct (ETL):
6) Cr__ a primary key

A

1) do not
2) primary key.
3) Identify
4) missing
5) primary
6) Create

181
Q

Data Preparation Pattern 13: Redundant Content AcrossColumns

This pattern looks for 1) red___s that create 2) inc___

Data inconsistencies occur when the 3) s__ data are recorded 4) __e than on__ and 5) ch__ in one place but 6) n__ the other, such as a customer’s email address

A

1) redundancies
2) inconsistencies.
3) same
4) more than once
5) changed
6) not

182
Q

Data Preparation Pattern 13: Redundant Content AcrossColumns

Here are two scenarios where two or more columns in a table might have the same content:

When there is 1) o__ such as an address that contains state information and a separate state field.

When there is 2) de___cy, which exists when one column’s values are 3) de__ on the values of another column in the 4) s__ table. Assume both age and date of birth are recorded.

However, the values of age change when time passes and the data will become inconsistent. Therefore, instead of transferring age from the data source, it should be calculated as part of the analytical database.

A

1) overlap
2) dependency
3) dependent
4) same

183
Q

Data Preparation Pattern 13: Redundant Content AcrossColumns
-Perform Column-By-Column Comparisons

Performing column-by-column comparisons for 1) ov__ or de___s would 2) d__ this issue.
Doing so within the Beans data set would show there is currently no redundant content.

A

1) overlaps or dependencies
2) detect

184
Q

Data Preparation Pattern 13: Redundant Content AcrossColumns
-Delete Redundant and Dependent Columns

Columns that contain redundant information can be 1) de__

When there is 2) d___cy, delete the column that contains the 3) de__nt value. Instead, use a 4) fo___ to recreate the column in the analytical database.

A

1) deleted
dependency
2) dependency
3) dependent
4) formula

185
Q

Data Preparation Pattern 13: Redundant Content AcrossColumns
Summary

Problem:
1) Re____content among columns in a table might result in 2) in___es.

Detect (Data Profiling:
Perform column-by-column 3) com__

Correct (ETL):
4) D___ red____ and de___nt data

A

1) redundant
2) inconsistencies
3) comparisons
4) Delete redundant and dependent

186
Q

Data Preparation Pattern 14: Find Invalid Values with Intra-Table Rules

Pattern 14 is similar to Pattern 10 in that it also defines 1) a___ values for a column.

However, Pattern 14 2) determines the 3) v___ of a column’s values based on the values in 4) o__ or m___ other columns in the 5) s__ table

A

1) acceptable
2) determines
3) validity
4) one or more
5) same

187
Q

Data Preparation Pattern 14: Find Invalid Values with Intra-Table Rules
-Create and Apply Intra-Table Validation Rules

The goal of the validation rule is to 1) id___ invalid data.

Creating 2) va___n rules requires 3) i__-de__ knowledge of the business, and they are implemented using a 4) scr___ language.

A

1) identify
2) validation
3) in-depth
4) scripting

188
Q

Data Preparation Pattern 14: Find Invalid Values with Intra-Table Rules
Summary

Problem:
1) Inv___ data might result in 2) p__ decision-making.

Detect (Data Profiling):
3) Cr___ and ap___ in___-table validation rules.

Correct (ETL):
4) M__y the invalid values.

A

1) invalid
2) poor
3) create and apply intra-table
4) modify

189
Q

5.6 Which Patterns Transform Models?

A
190
Q

Data Preparation Pattern 15: Data Spread Across Tables

Analysis becomes more challenging when data that describe the same entity are spread across 1) m___ tables

A

1) mulitple

191
Q

Data Preparation Pattern 15: Data Spread Across Tables

To identify similarly structured tables, look for 1) t___ or m___ tables with the 2) sa___ st___. These tables would have the 3) s__ co__s and 4) sim___ d___.

Another option is to 5) se___ fortables that describe different 6) cha___ of the 7) s__ ent__.

In the Beans case, Employee and EmployeeDemographics are such tables. The goal is to create a single table with all the employee information

A

1) two or more
2) same structure
3) same columns
4) similar data
5) search
6) characteristics
7) same entity

192
Q

Data Preparation Pattern 15: Data Spread Across Tables
Summary

Problem:
An entity’s data are 1) sp__across 2) mu__ tables, which 3) co__s analysis.

Detect (Data Profiling):
4) Id___ tables with a 5) si___r structure or tables that describe 6) dif___ characteristics of the 7) s___e entity.

Correct (ETL):
8) U__ or me___ tables.

A

1) spread
2) multiple
3) complicates
4) Identify
5) similar structure or tables
6) different
7) same entity
8) union or merge

193
Q

Data Preparation Pattern 16: Data Models Do Not Comply with Principles of Dimensional Modeling

Dimensional modeling is the technique of 1) cr___ data 2) mo___with 3) f___t tables surrounded by 4) d____n tables.

These data models, such as 5) st___ s____s, are 6) e__ to understand and result in 7) ef___t data processing.

A

1) creating
2) models
3) fact
4) dimension
5) star schemas
6) easy
7) efficient

194
Q

Data Preparation Pattern 16: Data Models Do Not Comply with Principles of Dimensional Modeling
-Analyze a Data Model’s Compliance with Dimensional ModelingPrinciples

Use these dimensional modeling principles by determining the 1) f___ and di___n tables and ensuring all 2) fi__s belong to the 3) cor__ table.

In an accounting context, fact tables correspond to 4) b___s transactions.

Dimension tables, on the other hand, 5) de___ who par__ in the transactions, 6) w___ the transactions 7) oc____, and 8) w__t was g___ u__ or a___d.

A

1) fact and dimension
2) fields
3) correct
4) business
5) describe who participates in the transactions
6) when
7) occurred
8) what was given up or acquired.

195
Q

Data Preparation Pattern 16: Data Models Do Not Comply with Principles of Dimensional Modeling
Summary

Problem:
Data models that are 1) __t structured following the principles of 2) d__l modeling are typically more 3) d__lt to analyze.

Detect (Data Profiling):
4) A___ a data model’s compliance with 5) di___l modeling principles.

Correct (ETL):
6) Rec___ the data model as a 7) s__/s__ke schema.

A

1) not structured
2) dimensional
3) difficult
4) Analyze
5) dimensional
6) Reconfigure
7) star/snowflake

196
Q

Define Star Schema

A data 1) st___ in which the information for a 2) dimension is 3) sp__ a___s mu___ tables.

A

1) structure
2) dimension
3) spread across multiple

197
Q

Data Exploration Pattern 17: Find Invalid Values with Inter-Table Rules

Patterns 10, 14, and 17 are similar because they define the acceptable values for a column.

However, Pattern 17 1) de___ the v__y of a column’s values based on the values in 2) o__ or m___ other tables.

A

1) determines the validity
2) one or more

198
Q

Define Referential Integrity

A rule that states the values of a 1) f___ key must be a 2) su___ of the values of its corresponding 3) p___ key.

which refers to the fact that all values in a foreign key should also 4) e__t as values in the corresponding primary key

A

1) foreign
2) subset
3) primary
4) exist

199
Q

Data Exploration Pattern 17: Find Invalid Values with Inter-Table Rules
-Create and Apply Inter-Table Validation Rules

1) In__-t__val__ rules identify 2) in__ data.
Their creation 3) re___ in-d___ knowledge of the 4) b__s.

A

1) Inter-table validation
2) invalid
3) requires in-depth
4) business.

200
Q

Data Exploration Pattern 17: Find Invalid Values with Inter-Table Rules
Summary

Problem:
1) In___data might result in poor decision-making.

Detect (Data Profiling):
Create and apply 2) i__-t__v_____ r__s.

Correct (ETL):
3) Mo__the 4) in___ rules.

A

1) invalid
2) inter-table validation rules
3) modify
4) invalid

201
Q

5.7 Which Patterns Apply to Data Loading?

A
202
Q

Once the data are cleaned and transformed, it is time to load them into the software for analysis. Data loading is the process of making the analytical database available for use. Since both extraction and loading are transfer processes, they have similar issues when it comes to the completeness and correctness of transferred data. It is also important that the data model of the analytical database is validated–that is, that all relationships have been defined.

A
203
Q

Data Preparation Pattern 18: Incomplete Data Loading
Summary

Problem:
All the data 1) d___ n__t tra__r during 2) l___g.

Detect (Data Profiling):
3) Co__ row counts.

Correct (ETL):
Add the 4) mis___ data.

A

1) did not transfer
2) loading
3) compare
4) missing

204
Q

Data Preparation Pattern 19: Incorrect Data Loading

Even when all the rows have been transferred, the data might 1) n__ have been transferred 2) co___. Pattern 19 addresses this potential issue.

A

1) not
2) correctly

205
Q

Data Preparation Pattern 19: Incorrect Data Loading

Problem:
The 1) co___ data 2) d__ n__tra___ during 3) lo___g.

Detect (Data Profiling):
4) Co___e co___l amounts.

Correct (ETL):
5) Mo__ inc__values.

A

1) correct
2) did not transfer
3) loading
4) Compare control
5) Modify incorrect

206
Q

Data Preparation Pattern 20: Missing or Incorrect Data Relationships

Analytics rely heavily on the underlying data 1) mo__.

2) Mis___or in___y defined data 3) re___ps make analytics 4) c___ng or even impossible, so the 5) comp___ and ac___y of the data model must be 6) va___ after 7) l___g.

A

1) model
2) Missing or incorrectly
3) relationships
4) challenging
5) completeness and accuracy 6) validated
7) loading

207
Q

Data Preparation Pattern 20: Missing or Incorrect Data Relationships

Problem:
1) Mis__ or in___y defined data 2) re___s makes analysis 3) c___ng or even 4) im___.

Detect (Data Profiling):
5) Inv___ the 6) completeness and 7) ac___ of the data model.

Correct (ETL):
8) M___ the data 9) m___l.

A

1) Missing or incorrectly
2) relationships
3) challenging
4) impossible
5) Investigate
6) completeness
7) accuracy
8) Modify
9) model

208
Q

6.1 What is Information Modeling?

A
209
Q

Define Information Modeling

The process of generating 1) ad___l k___ge from 2) d___ that is 3) rel__t for analysis 4) p___es

A

1) additional
2) data
3) relevant
4) purposes

210
Q

The Information Modeling Process

In information modeling, 1) d___are the 2) in__.
They are the raw figures and facts.

3) A___s are sets of 4) ins___ that 5) tra___ the data into 6) inf___, which is the 7) o___ of additional 8) k___dge gained from the data

A

1) data
2) input
3) algorithms
4) instructions
5) transform
6) information
7) output
8) knowledge

211
Q

The Information Modeling Process

Algorithms are the 1) l___ between the 2) i___t of 3) f___s and the 4) ou__of useful 5) inf__. Keep in mind that data (6) i__t) for one application can also be 7) in___n (8) ou__) for another application.

One example is how financial statements are information (output) for financial statement preparers, but they are data (input) for financial analysts.

A

1) link
2) input
3) facts
4) output
5) information
6) input
7) information
8) output

212
Q

Algorithms

are used to calculate 1) d___n, c___, fin____l r___s, and more

-they range from 2) sim__ to co__ and can be 3) c___d with a host of languages

A

1) depreciation, cost, financial ratios
2) simple to complex
3) coded

213
Q

The Information Modeling Process

2 types of information fields
-cal___ co__n
-m___e

A

-calculated column
-measure

214
Q

Define Calculated Column

A 1) n__data set column with 2) va___ that are calculated based on the 3) va___s in other 4) c___ns

They are the 5) int__ p__s of the table

A

1) new
2) values
3) values
4) columns
5) integral parts

215
Q

Define Measure

An 1) ag___d (to__l) data information field that can be used in 3) re___s for 4) an___ purposes

Measures are created by 5) al____, but they are 6) n__ in__ pa__ of a table.

A

1) aggregated (total)
2) reports
3) analytical
5) algorithms
6) not integral parts

216
Q

1) M___s are at the center of data analytics.

They can be calculated and then sliced in many ways during data 2) ex___n

A

1) Measures
2) exploration

217
Q

A Structured Approach

Performing data analysis requires determining the 1) nu___ to analyze (these are the 2) m___es) and 3) h__ to analyze them (these are the 4) dim__, or the 5) f___s that can 6) sl__ them)

A

1) numbers
2) measures
3) how
4) dimensions
5) fields
6) slice

218
Q

A Structured Approach

to developing accounting 1) inf__ mo__ will be used as an example to define 2) dim___ of w___, w__, and w___ in transactions to help make accounting 3) s__e of data

A

1) information models
2) dimensions of who, what, and when
3) sense

219
Q

Define Dimensions

In an accounting context, these are the data that provide 1) co__ to analysis and give 2) m___g to 3) f__s.

They are the 3) var__ or f___s that can be used to drill down or disaggregate (slice) analysis measures.

there are typically configured in a 4) __r s__ma

A

1) contect
2) meaning
3) facts
4) star schema

220
Q

Define Data Model

shows the 1) rel___ among the 2) dif__ e___es in a data set

It shows the concepts being 3) des___, the tab__, and the fi__ used to 4) de___e the concepts

A

1) relationships
2) different entities
3) described, the tables, and fields
4) describe

221
Q

Define Information Model

An 1) ext__ of the data model that includes 2) c___d columns and 3) me__

It has additional 4) inf__ that is calculated from the 5) d__a s
_t, which can be used for 6) an__s pu___

A

1) extension
2) calculated columns
3) measure
4) information
5) data set
6) analysis purposes.

222
Q

Create Measures and Dimensions

What are the 2 goals for building an information model for a star schema?

  1. The first is to 1) c__e a 2) ri__set of 3) me___s for the 4) f__ table
  2. The second goal for building an information model is a rich set of 5) di__ns that can 6) br__ do___, or sl__e, measures in many ways
A

1) create
2) rich
3) measures
4) fact
5) dimensions
6) break down or slice

223
Q

Dimension tables can describe specific characteristics of transactions: w__ par___d, wh___ was in___d, and w___n they oc___d

A

who participated, what was involved, and when they occurred

224
Q

Define Who Table

A 1) di___ table that describes 2) ag__ involved in the 3) bu___ transaction data

Internal agents are the employees who 4) p__e in the transaction

A

1) dimension
2) agents
3) business
4) participate

225
Q

External agents, such as customers and vendors, are parties external to the organization involved in the accounting transactions.

A participates relationship 1) li__ 2) a__s to 3) s___c transactions.

A

1) links
2) agents
3) specific

226
Q

Define What Table

A 1) dim___ table that describes the 2) re___sinvolved in a 3) b___s transaction

A

1) dimension
2) resources
3) business

227
Q

Define Flows Relationship

The data 1) rela__between 2) re__ and spe__ tran__

A

1) relationship
2) resources and specific transactions.

228
Q

Define When Table

A 1) di___ table that 2) de__s when the data from the 3) bus___ tran___ oc___.

A

1) dimension
2) describes
3) business transaction occurred

229
Q

Define Occurs Relationship

The data 1) re__ that 2) l__s the 3) cal__ to 4) sp__ bu__ss tran__

A

1) relationship
2) links
3) calendar
4) specific business transactions

230
Q

In__ mo__ is the heart of data analytics

A

Information

231
Q

6.2 Which Patterns Implement Information Modeling Algorithms?

A
232
Q

Information Modeling Pattern 1: Within-Table Numeric Calculation

The within-table numeric calculation pattern creates a new 1) fi__ (a 2) cal__ column) from 3) o__ or __ nu__columns, or fields in the 4) s__ table.

Basic 5) ari___ operations of addition, subtraction, multiplication, and division can do this

A typical EX is determining the dollar total for an order line or invoice line by multiplying price and quantity

A

1) field
2) calculated
3) one or more numeric
4) same
5) arithmetic

233
Q

Information Modeling Pattern 2: Within-Table Text Calculation

The second implementation pattern also involves creating a 0) n___ cal___ col__ from o__ or more fi___s in the sa__ table.
However, the information in the new column is created from 1) t__ f__

For example, different pieces of location information could be linked, or concatenated, into an address so it can be read by a map service such as Bing Maps or Google Maps

A

0) new calculated column from one or more fields in the same table.
1) text fields

234
Q

Information Modeling Pattern 3: Within-Table Classification

The number of 0) cl__n algorithms that can be applied to a data set is, in essence, 1) un__

Used when classifications are 1.1) d___d based on data that are 2) p__ of the 3) s__e ta__

most classification algorithms rely on 4) Bo__ lo__, so its necessary to understand Boolean operations like AND, OR, NOT, and IF functions and how to create them

A

0) classification
1) unlimited
1.1) determined
2) part
3) same table
4) Boolean logic

235
Q

Information Modeling Pattern 4: Across-Table Calculation

Regardless of the software, there are multiple questions to consider when linking tables:

  1. Are there 1) p___y defined 2) rel___ between the 3) ta__?
  2. What is the 4) na___ of the 5) rel__ and what 6) t___of 7) jo___ should be considered?
  3. What 8) car___s apply to the relationship: 9) __-__, ___-__, __-__, or __-__? For example, calculations become more complex when navigating a 1-N relationship.
A

1) properly
2) relationships
3) tables
4) nature
5) relationship
6) type
7) join
8) cardinalities
9) 1-1, 1-N, N-1, or N-N?

236
Q

Information Modeling Pattern 4: Across-Table Calculation

The calculations for this pattern are like the first three–a new column is created from existing data using 1) ari___, te__ cal___, or cla___

But this pattern is different from the earlier patterns because data from 2) dif__nt tables are used to create a 3) n__ co__n. This means the tables 3.1) m__ be 4) li__. How this is accomplished depends on the 5) sof__ being used

A

1) arithmetic, text calculation, or classification.
2) different
3) new column
3.1) must
4) linked
5) software

237
Q

Information Modeling Pattern 5: Single-Column Aggregation

The 1) si___-co__ aggregation pattern is the 2) fi___ to fo__ on 3) me__s, which, as you have learned, are the 3) h__ of data analytics.

Measures calculate 4) a___s that can be 5) sli___ or b__en down, in many ways

A

1) single-column
2) first to focus
3) measures
3) heart
4) aggregates
5) sliced, or broken down,

238
Q

In its simplest form, a measure applies a mathematical operation to all the values of a single column. Operations that can be applied include aggregate functions such as SUM, AVERAGE, COUNT, MIN, and MAX. Each of these functions generate a single value

A
239
Q

Information Modeling Pattern 5: Single-Column Aggregation

For most applications, single-column measures make up a significant part of the information model. A few more things about aggregation functions:

Except for aggregation functions that count, most require a numeric data type.

Be sure to understand how the aggregate functions deal with specific values, such as null, as this can differ depending on the implementation platform.

A
240
Q

Pattern 6: Filtered Aggregation

it’s often necessary to analyze aggregates based on 1) fi___ d__a sets

A

1) filtered data

241
Q

Define Measure Hierarchy

used when a 1) ne__, m__ co___ measure is created using 2) ex__ m___es

A

1) new, more complex
2) existing measures

242
Q

Information Modeling Pattern 7: Measure Hierarchies

In fact, 0) co__ problems can sometimes be solved by 1) di___ the problem into 2) s__ problems, then 3) combining the 4) d___t solutions.

Measure hierarchies are especially useful when calculating 5) ra___ and ben____

A

0) complex
1) dividing
2) smaller
3) combining
4) different
5) ratios and benchmarks.

243
Q

6.3 Which Patterns Help Develop and Implement Accounting Information Models?

A
244
Q

Information Modeling Pattern 8: How Many

Tables usually describe concepts like 1) c___s, ve___, sa___, and p___ts.

The next pattern creates a measure that 2) c___s the number of 3) r___s or instances, of a concept within a table.

This pattern can count 4) re___, tr___s, and a__s, which means it can be used for both 5) fa__ (tra___s) and di___n (re___s and a__s) tables.

A

1) customers, vendors, sales, and products
2) count
3) rows
4) resources, transactions, and agents
5) fact (transactions) and dimension (resources and agents

245
Q

Information Model

Transactions: how many sales
Agents (who): ho. many customers
Resources (what): how many products does KLUB have

A
246
Q

Information Modeling Pattern 9: Participates | Transaction-Who

The participates relationship describes 1) w__ is 2) inv__d in a transaction

A

1) who
2) invovled

247
Q

In KLUB’s data model there are three participates relationships:

From Sales to Customer (the customer is an external agent).
From Sales to Employee (the employee is an internal agent).
From Purchases to Vendor (the vendor is an external agent).

There are also other types of participates relationships that do not appear in the KLUB data set:

CashReceipts to Customer.
CashReceipts to Employee.
CashDisbursements to Vendor.
CashDisbursements to Employee.

Because the information captured by these relationships is similar, the same types of analyses apply to all of them.

A
248
Q

Information Modeling Pattern 10: Flows | Transaction-What

The flows relationship describes 1) w__ is inv___ in a transaction, such as goods and services.

A transaction results in either an 2) in___e or a dec__ of the 3) re__es

A

1) what is involved
2) increase or a decrease
3) resources

249
Q

Information Modeling Pattern 10: Flows | Transaction-What

In KLUB’s data model there are two flows relationships:

From Sales to Merchandise.
From Purchases to Merchandise.

There are also other types of flows relationships that do not appear in the KLUB data set:

Cash Receipts to Cash.
Cash Disbursements to Cash.

Because the information captured by these relationships is similar, the same types of analysis apply to all of them.

A
250
Q

Information Modeling Pattern 10: Flows | Transaction-What

The first step is to develop measures for the transactions, such as sales, that can be broken down by dimensions that are part of the Resource tables, such as merchandise.

A
251
Q

Information Modeling Pattern 11: Occurs | Transaction-When

The occurs relationship describes when a transaction takes 1) p__ by connecting it to a 2) Cal__ table

A

1) place
2) calender

252
Q

Information Modeling Pattern 11: Occurs | Transaction-When

There are two occurs relationships in KLUB’s data model:

Sales to Calendar.
Purchases to Calendar.

There are other occurs relationships that do not appear in the KLUB data set:

CashReceipts to Calendar.
CashDisbursements to Calendar.
Because the information captured by these relationships is similar, the same types of analyses apply to all of them.

A
253
Q

Information Modeling Pattern 12: Who-What-When Star Schema

While patterns 9, 10, and 11 analyze the who, what, and when dimensions of accounting transactions, this pattern integrates these 1) t__e re__ps

A

1) three relationships

254
Q

Information Modeling Pattern 13: Integrated Star Schemas

shows how 1) w___o-wh__-w__ stars are connected through resources.

Resources are being acquired for a purpose–to sell or to use them for manufacturing, for example

A

1) who-what-when