Final Exam Flashcards

1
Q

According to the text, which of the following is NOT true?

A

An example of an SSBI tool is PowerPoint.

2
Q

When preparing data, analysts use the ETL process. ETL stands for Explore, Transfer, Load.

A

False

3
Q

According to the text, the data analysis process comprises three equally important stages. Which of the following is NOT one of those stages?

A

Review

4
Q

Understanding “why” something is happening in your analysis is called _________ analytics.

A

Diagnostic

5
Q

A visualization that compares actual vs. expected monthly revenue would probably be found in the _________ area.

A

auditing

6
Q

In preparing data, the process of reviewing the data for possible issues is called

A

profiling

7
Q

In the data analysis process, “C” in the MOSAIC model stands for “Cleaning”.

A

False

8
Q

The CPA Exam and the CMA Exam both include topics on data analytics

A

True

9
Q

Which is consistent with the Data Analytics Mindset?

A

all of these

10
Q

Which of the following is best defined as a measure of dispersion

A

variance
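
As a rough illustration of variance as a measure of dispersion, the sketch below uses Python's standard statistics module. The list of values is made up for the example and does not come from the text.

```python
import statistics

# Hypothetical values (not from the text).
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: the average squared deviation from the mean.
population_variance = statistics.pvariance(values)

# Sample variance: divides by n - 1 instead of n.
sample_variance = statistics.variance(values)

print(population_variance, sample_variance)
```

The more spread out the values are around their mean, the larger both figures become.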

11
Q

Most of the data you will work with will come from

A

relational databases

12
Q

Questions with single dimensions should be answered with pivot tables; questions with multiple dimensions should be answered with Excel functions.

A

False

13
Q

Database elements can be represented in the REA model. The model’s elements are:

A

resources, events, agents

14
Q

Which of the following is NOT one of the basic Excel functions used in foundational analysis?

A

DISPLAYIF

15
Q

In a relational database table, a primary key is

A

a unique value

16
Q

A __________ is a bar chart of frequency distributions where the height of the bar represents the count of items in the interval

A

histogram
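
A minimal sketch of the idea, assuming a made-up list of invoice amounts and an arbitrary interval width of 20 (neither comes from the text): each value is assigned to an interval, and the bar for that interval is as long as its count.

```python
from collections import Counter

# Hypothetical invoice amounts (not from the text).
amounts = [12, 18, 25, 31, 38, 44, 47, 52, 58, 95]
bin_width = 20  # assumed interval width

# Map each amount to the lower bound of its interval, then count per interval.
counts = Counter((amount // bin_width) * bin_width for amount in amounts)

# Text histogram: the bar length is the count of items in the interval.
for lower in sorted(counts):
    print(f"{lower:>3}-{lower + bin_width - 1:<3} {'#' * counts[lower]}")
```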

17
Q

There are 4 types of joins used to link tables together. Which type of join DOES NOT result in any null values being produced?

A

Inner
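
To see why an inner join produces no nulls, here is a small sketch in plain Python. The customers and orders tables are hypothetical, and the left join is included only for contrast.

```python
# Hypothetical tables (not from the text).
customers = {1: "Ava", 2: "Ben", 3: "Cruz"}   # customer_id -> name
orders = [(101, 1), (102, 1), (103, 4)]       # (order_id, customer_id)

# Inner join: keep only orders whose customer_id exists in customers.
inner = [(oid, cid, customers[cid]) for oid, cid in orders if cid in customers]

# Left join (orders on the left): keep every order; an unmatched
# customer name becomes None, the Python stand-in for a null.
left = [(oid, cid, customers.get(cid)) for oid, cid in orders]

print(inner)
print(left)
```

Order 103 refers to a customer that is not in the customers table, so it is dropped by the inner join and carries a null name in the left join.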

18
Q

Simultaneously filtering for multiple dimensions is called

A

data slicing

19
Q

An action request made to a database is called a(n)

A

query

20
Q

Which is the best tool when the desired result is known, but not the input value of a single variable that will achieve that result?

A

Goal Seek
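
The idea behind Goal Seek can be sketched in a few lines of Python. This is only an illustration: the profit model and its figures are hypothetical, and simple bisection is swapped in here because the text does not describe the method Excel itself uses.

```python
def profit(units_sold, price=25.0, unit_cost=15.0, fixed_costs=4000.0):
    """Hypothetical profit model; all figures are made up for illustration."""
    return units_sold * (price - unit_cost) - fixed_costs


def goal_seek(func, target, low, high, tolerance=1e-6):
    """Find an input in [low, high] where func(input) is close to target.

    Uses bisection, so func must be increasing over the interval and the
    target must lie between func(low) and func(high).
    """
    while high - low > tolerance:
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


# Known result: a profit of 6,000. Unknown input: units sold.
units_needed = goal_seek(profit, target=6000.0, low=0.0, high=10000.0)
print(round(units_needed, 2))
```

Only one input (units sold) is varied, which mirrors the single-variable restriction in the question.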

21
Q

An analysis prepared to support a predetermined belief is an example of

A

confirmation bias

22
Q

an anomaly is

A

an observation that deviates from what is normal/expected

23
Q

When examining the relationship between two variables, if one variable increases as the other variable decreases the relationship is

A

a negative correlation
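
A short sketch of how the sign of the relationship shows up numerically, using the Pearson correlation coefficient on made-up price and units-sold figures (the data are hypothetical):

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical data: as price rises, units sold fall.
price = [10, 12, 14, 16, 18]
units_sold = [200, 180, 150, 130, 90]

r = pearson_r(price, units_sold)
print(round(r, 3))
```

A coefficient below zero indicates a negative correlation; the closer it is to -1, the stronger the inverse relationship.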

24
Q

In a regression model prepared to predict revenue, which of the following is the correct interpretation of an adjusted R-squared of 0.85?

A

the independent variables in the model can explain 85% of the change in revenue
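
For reference, adjusted R-squared is derived from ordinary R-squared by penalizing the number of predictors. The sketch below applies the standard formula; the R-squared value, sample size, and predictor count are assumptions chosen for illustration.

```python
def adjusted_r_squared(r_squared, n_observations, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_squared) * (n_observations - 1) / (
        n_observations - n_predictors - 1
    )


# Hypothetical model: R-squared of 0.87, 50 observations, 6 predictors.
adjusted = adjusted_r_squared(0.87, n_observations=50, n_predictors=6)
print(round(adjusted, 3))
```

The adjusted figure is never higher than the unadjusted one, which is why it is preferred when comparing models with different numbers of independent variables.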

25
Q

A spreadsheet model that allows evaluating how changes to values and assumptions affect an outcome is called a

A

what-if analysis

26
Q

Determining if the analysis makes sense is associated with...

A

data analysis interpretation

27
Q

An appropriate analysis to use to determine how many times an event has occurred would be

A

a frequency distribution

28
Q

Which of the following analyses can predict a future outcome?

A

linear regression

29
Q

if the objective is to use historical data to identify patterns, which is the best analysis to use?

A

Trend analysis

30
Q

which of the following describes part of the goal of the ETL process

A

Identify and obtain the data needed for solving the problem

31
Q

the purpose of transforming data is

A

to validate the data for completeness and integrity

32
Q

Mastering the data can also be described via the ETL process. ETL stands for:

A

Extract, Transform, Load

33
Q

the advantages of storing data in a relational database include

A

help in enforcing business rules and integrating business processes

34
Q

why is supplier ID considered to be a primary key for a supplier table

A

it contains a unique identifier for each supplier

35
Q

Which of the following questions is not suggested by the Institute of Business Ethics to allow a business to create value from data use and analysis while still protecting the privacy of stakeholders?

A

Does the data used by the company include personally identifiable information?

36
Q

which of the following is not a common way that data will need to be cleaned after extraction and validation

A

Clean up trailing zeroes

37
Q

which attribute is required to exist in each table of a relational database and serves as the “unique identifier” for each record in a table?

A

Primary key

38
Q

what are attributes that exist in a relational database that are neither primary nor foreign keys?

A

Descriptive attributes

39
Q

the metadata that describes each attribute in a database is

A

data dictionary

40
Q

which of the following best describes an unsupervised approach to the evaluation of data?

A

data exploration looking for potential patterns of interest

41
Q

These data are organized and reside in a fixed field within a record or a file. Such data are generally contained in a relational database or spreadsheet and are readily searchable by search algorithms.

A

structured data

42
Q

which approach to data analytics attempts to assign each unit in a population into a small set of classes where the unit belongs

A

classification

43
Q

an observation about the frequency of leading digits in many real-life sets of numerical data

A

Benford’s law

44
Q

which approach to data analytics attempts to predict a relationship between two data items

A

link prediction

45
Q

models associated with regression and classification data approaches have all these important parts except:

A

test data

46
Q

auditing financial statements, and its desire to look for errors, anomalies, and possible fraud, is most consistent with which type of analytics?

A

Diagnostic analytics

47
Q

in general, the simpler the model, the greater the chance of

A

underfitting the data

48
Q

test data

A

set of data used to assess the degree and strength of a predicted relationship

49
Q

in general, the more complex the model, the greater the chance of

A

overfitting the data

50
Q

ratio data

A

considered the most sophisticated type of data

51
Q

In the late 1960s, Ed Altman developed a model to predict whether a company was at severe risk of going bankrupt. He called his statistic Altman’s Z-score, now a widely used score in finance. Based on the name of the statistic, which statistical distribution would you guess this came from?

A

standardized normal distribution
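
The "z" in the name points to standardization: subtracting the mean and dividing by the standard deviation. The sketch below shows that generic calculation on made-up values. It is not Altman's bankruptcy model, whose formula is not given in these cards.

```python
import statistics

# Hypothetical observations (not from the text).
values = [40, 50, 60, 70, 80]

mean = statistics.mean(values)
std_dev = statistics.pstdev(values)  # population standard deviation

# A z-score expresses each value as a number of standard deviations from the mean.
z_scores = [(value - mean) / std_dev for value in values]

print([round(z, 2) for z in z_scores])
```

After standardization the values have a mean of 0 and a standard deviation of 1, which is the scale of the standardized normal distribution.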

52
Q

the Fahrenheit scale of temperature measurement would best be described as an example of

A

interval data

53
Q

Conceptual (Qualitative)

A

Comparison: Bar Chart, Pie Chart, stacked bar chart, Tree map, Heat map
Geographic data: Symbol map
Text Data: word cloud

54
Q

Data-driven (quantitative)

A

Outlier detection: box and whisker plot
Relationship between two variables: scatter plot
Trend over time: line chart
Geographic data: filled map

55
Q

least sophisticated type of data

A

nominal

56
Q

not a typical example of nominal data

A

SAT scores

57
Q

Anscombe’s quartet suggests that

A

visualizations should be used in tandem with statistics

58
Q

line charts are not recommended for

A

qualitative data

59
Q

letter grades would be best described as

A

ordinal data

60
Q

which testing approach would be used to predict whether certain cases should be evaluated as having fraud or no fraud

A

classification

61
Q

describes finding correspondences between at least two types of text or entries that may not match perfectly

A

fuzzy matching
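
One possible sketch of fuzzy matching uses Python's standard difflib module. The vendor names, the helper name best_fuzzy_match, and the 0.6 similarity cutoff are all assumptions made for illustration.

```python
import difflib

# Hypothetical vendor master list (not from the text).
vendor_master = ["Acme Corporation", "Apex Industries", "Zenith Ltd"]


def best_fuzzy_match(name, candidates, cutoff=0.6):
    """Return the candidate most similar to name, or None if none is close enough."""
    # Compare in lowercase so differences in capitalization do not count.
    lowered = {candidate.lower(): candidate for candidate in candidates}
    matches = difflib.get_close_matches(
        name.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


print(best_fuzzy_match("ACME Corp", vendor_master))
print(best_fuzzy_match("Globex", vendor_master))
```

Raising the cutoff reduces false matches at the cost of missing entries that differ more, so the threshold is a judgment call for the analyst.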

62
Q

the determinants for sample size include all of the following except:

A

potential risk of account

63
Q

Benford’s law suggests that the first digit of naturally occurring numerical datasets follows an expected distribution where

A

the leading digit of 8 is more common than 9
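
The expected distribution can be computed directly: under Benford's law the share of numbers with leading digit d is log10(1 + 1/d). A short sketch:

```python
import math

# Expected share of each leading digit d under Benford's law: log10(1 + 1/d).
benford = {digit: math.log10(1 + 1 / digit) for digit in range(1, 10)}

for digit, share in benford.items():
    print(f"{digit}: {share:.1%}")
```

Each digit is expected less often than the one before it, so 1 is the most common leading digit and 9 the least common.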

64
Q

What type of analysis would help auditors find missing checks?

A

sequence check
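
A sequence check can be as simple as comparing the recorded numbers against the full expected range. The check numbers below are hypothetical.

```python
# Hypothetical check numbers recorded in a disbursements listing (not from the text).
recorded_checks = [1001, 1002, 1004, 1005, 1008]

# Every number between the first and last recorded check is expected to appear.
expected = set(range(min(recorded_checks), max(recorded_checks) + 1))

# Any expected number that was not recorded is a gap in the sequence.
missing_checks = sorted(expected - set(recorded_checks))

print(missing_checks)  # [1003, 1006, 1007]
```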

65
Q

CAAT (Computer assisted audit techniques)

A

Automated scripts that can be used to validate data, test controls, and enable substantive testing of transaction details or account balances and generate supporting evidence for the audit

66
Q

which testing approach would be useful in assessing the value of inventory shrinkage given multiple environmental factors

A

regression

67
Q

which items would be currently out of the scope of data analytics

A

direct observation of processes

68
Q

which type of audit analytics might be used to find hidden patterns/variables linked to abnormal behavior

A

diagnostic analytics

70
Q

what allows tax departments to view multiple years, periods, jurisdictions (state/federal/international) and differing scenarios of data, typically through use of a dashboard

A

tax data visualizations

71
Q

the task of tax accountants and tax departments to minimize the amount of taxes paid in the future

A

tax planning

72
Q

an example of a tax risk KPI would be

A

levels of late filing or error penalties

73
Q

an example of a tax cost KPI would be

A

ETR (Effective tax rate)

74
Q

an example of a tax efficiency and effectiveness KPI would be

A

amount of time spent on compliance vs strategic activities

75
Q

tax departments interested in maintaining their own data are likely to have their own

A

tax data mart

76
Q

in which stage of the IMPACT model would the use of tax cockpits fit?

A

track outcomes

77
Q

predictive analysis of potential tax liability and the formulation of a plan to reduce the amount of taxes paid is

A

tax planning

78
Q

the evaluation of the impact of different tax scenarios/alternatives on various outcome measures including the amount of taxable income or tax paid

A

what-if scenario analysis

79
Q

an example of a tax sustainability KPI would be

A

number of audits closed and significance of assessment over time

80
Q

dependent variable is

A

Y

81
Q

To remove null values

A

Go to Power Query, right-click, and choose Remove Empty

82
Q

binary values

A

either 0 or 1

83
Q

IMPACT

A

I-Identify the questions
M-Master the data
P-Perform the test
A-Address and refine results
C-Communicate insights
T-Track outcomes

84
Q

A data approach that attempts to discover associations between individuals based on transactions involving them.

A

co-occurrence grouping

85
Q

A data approach that attempts to characterize the “typical” behavior of an individual, group, or population by generating summary statistics about the data (including mean, standard deviations, etc.).

A

profiling

86
Q

A data approach that attempts to estimate or predict, for each unit, the numerical value of some variable using some type of statistical model.

A

regression

87
Q

Data that do not adhere to a predefined data model in a tabular format.

A

unstructured data

88
Q

An information system for managing all interactions between the company and its current and potential customers.

A

Customer Relationship Management (CRM) system

89
Q

Centralized repository of descriptions for all of the data attributes of the dataset.

A

data dictionary

90
Q

A means of storing data in one place, such as in an Excel spreadsheet, as opposed to storing the data in multiple tables, such as in a relational database.

A

flat file

91
Q

An information system that helps manage all the company’s interactions with suppliers.

A

Supply Chain Management (SCM) system

92
Q

A data approach that attempts to divide individuals (like customers) into groups (or clusters) in a useful or meaningful way.

A

clustering

93
Q

Procedures that summarize existing data to determine what has happened in the past. Some examples include summary statistics (e.g., Count, Min, Max, Average, Median), distributions, and proportions.

A

descriptive analytics

94
Q

A numerical value (0 or 1) to represent categorical data in statistical analysis; values assigned a 1 indicate the presence of something and 0 represents the absence.

A

dummy variable
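
As a quick illustration, a categorical field can be turned into a dummy variable with a single comparison. The payment methods below are made up for the example.

```python
# Hypothetical categorical data (not from the text).
payment_methods = ["card", "cash", "card", "wire"]

# 1 indicates the presence of the category "card"; 0 indicates its absence.
is_card = [1 if method == "card" else 0 for method in payment_methods]

print(is_card)  # [1, 0, 1, 0]
```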

95
Q

One way to categorize quantitative data, as opposed to discrete data. Continuous data can take on any value within a range. An example of continuous data is height.

A

continuous data

96
Q

One way to categorize quantitative data, as opposed to continuous data. Discrete data are represented by whole numbers. An example of discrete data is points in a basketball game.

A

discrete data

97
Q

The second most sophisticated type of data on the scale of nominal, ordinal, interval, and ratio; a type of qualitative data. Ordinal can be counted and categorized like nominal data and the categories can also be ranked. Examples of ordinal data include gold, silver, and bronze medals.

A

ordinal data

98
Q

The least sophisticated type of data on the scale of nominal, ordinal, interval, and ratio; a type of qualitative data. The only thing you can do with nominal data is count, group, and take a proportion. Examples of nominal data are hair color, gender, and ethnic groups.

A

nominal data

99
Q

interval data

A

The third most sophisticated type of data on the scale of nominal, ordinal, interval, and ratio; a type of quantitative data. Interval data can be counted and grouped like qualitative data, and the differences between each data point are meaningful. However, interval data do not have a meaningful 0. In interval data, 0 does not mean “the absence of” but is simply another number. An example of interval data is the Fahrenheit scale of temperature measurement.

100
Q

Procedures used to generate a model that can be used to determine what is likely to happen in the future. Examples include regression analysis, forecasting, classification, and other predictive modeling.

A

predictive analytics

102
Q

Procedures that work to identify the best possible options given constraints or changing conditions. These typically include developing more advanced machine learning and artificial intelligence models to recommend a course of action, or optimizing, based on constraints and/or changing conditions.

A

prescriptive analytics

103
Q

Analysis technique of business processes used to diagnose problems and suggest improvements where greater efficiency may be applied.

A

process mining

104
Q

tax legislation offering major change to existing tax code

A

Tax Cuts and Jobs Act (TCJA) tax reform, enacted in December 2017 and generally effective in 2018

105
Q

A subset of the data warehouse focused on a specific function or department to assist and support its needed data requirements.

A

data mart

106
Q

A repository of data accumulated from internal and external data sources, including financial data, to help management decision making.

A

data warehouse

107
Q

A subset of a company-owned data warehouse focused on the specific needs of the tax department.

A

tax data mart