PCLC Flashcards
Which of the following are differences between a spreadsheet (Excel) and a table in a relational database?
- each column in a relational database must contain the same data type
- a table in a relational database must contain a unique identifier
T/F: Each table in a relational database can have multiple foreign keys and the foreign keys cannot contain duplicate values
F: Foreign keys can contain duplicate values
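A minimal sketch of this point, using Python's built-in sqlite3 module. The customers and orders tables and their columns are hypothetical, chosen only for illustration: the foreign key column repeats a value, while a repeated primary key is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical tables: customers is the parent, orders references it
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customers(customer_id),"
    " amount REAL)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")

# Two orders share the same foreign key value; this is allowed
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, 25.0), (11, 1, 40.0)])
fk_values = [row[0] for row in conn.execute("SELECT customer_id FROM orders ORDER BY order_id")]
print(fk_values)  # [1, 1]

# A duplicate primary key (the unique identifier) is rejected
try:
    conn.execute("INSERT INTO customers VALUES (1, 'Bob')")
    duplicate_pk_allowed = True
except sqlite3.IntegrityError:
    duplicate_pk_allowed = False
print(duplicate_pk_allowed)  # False
```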
T/F: Relational databases are the optimal way to store data, while flat files are the optimal way for humans to visually consume data and are optimal for analyzing data.
T
If you would like to query multiple fields from a relational table and simultaneously rename some of the fields, you would use the following SQL commands
SELECT
FROM
AS
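A short sketch of these three keywords working together, run through Python's built-in sqlite3 module. The employees table and its column names are hypothetical and exist only to show the renaming.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with terse column names
conn.execute("CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, fname TEXT, sal REAL)")
conn.execute("INSERT INTO employees VALUES (1, 'Ada', 50000.0)")

# SELECT picks the fields, AS renames two of them, FROM names the table
cursor = conn.execute("SELECT emp_id, fname AS first_name, sal AS salary FROM employees")
column_names = [col[0] for col in cursor.description]
rows = cursor.fetchall()

print(column_names)  # ['emp_id', 'first_name', 'salary']
print(rows)          # [(1, 'Ada', 50000.0)]
```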
T/F: The default order for the ORDER BY command in SQL is ascending order
T
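The default sort direction can be checked with a small sketch in Python's sqlite3 module. The scores table and its values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table of test scores
conn.execute("CREATE TABLE scores (student TEXT, score INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?, ?)", [("A", 88), ("B", 72), ("C", 95)])

# No direction given: ORDER BY sorts ascending
default_order = [r[0] for r in conn.execute("SELECT score FROM scores ORDER BY score")]
# DESC must be stated explicitly to reverse the order
descending = [r[0] for r in conn.execute("SELECT score FROM scores ORDER BY score DESC")]

print(default_order)  # [72, 88, 95]
print(descending)     # [95, 88, 72]
```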
A piece of XBRL data is referred to as a ___
Fact
T/F: XBRL data includes all amounts reported on a company’s financial statements filed with the SEC, but does not extend to data outside of the financial statements (textual data on the face of the report and data from the notes to the financial statements)
F
T/F: The FASB is responsible for keeping the GAAP Taxonomy current and aligned with FASB Codification
T
The ___ tool allows us to see the entire contents of a dataset and allows us to assess the quality, distribution, and attributes of the data
Browse
In the Alteryx FORMULA tool, where do we enter expressions?
Expression Editor
We can use ___ functions to remove unwanted characters, including whitespace, from string data
Trim
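As a rough analogy only, the same idea can be sketched with Python's str.strip family. This is Python, not the Alteryx expression language, and the sample strings are invented for illustration.

```python
raw = "  Clemson University \t"

both_ends = raw.strip()    # removes leading and trailing whitespace
left_only = raw.lstrip()   # removes leading whitespace only
right_only = raw.rstrip()  # removes trailing whitespace only

# Passing characters strips those characters from the ends instead of whitespace
no_dollar = "$1,200$".strip("$")

print(repr(both_ends))   # 'Clemson University'
print(repr(left_only))   # 'Clemson University \t'
print(repr(right_only))  # '  Clemson University'
print(no_dollar)         # 1,200
```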
An INPUT DATA tool and a ___ tool can be used together to input data from multiple sheets within the same Excel file
Dynamic input
Basic conditional statements in Alteryx are composed of how many parts?
4
IF, THEN, ELSE, ENDIF
The Alteryx TEXT TO COLUMNS tool belongs to this category of tools in the tool palette
Parse
What Alteryx tool can accomplish similar tasks as a VLOOKUP formula in Excel?
Find Replace
In order to parse data with RegEx, we must identify ___ in the data
patterns
Consider the following regular expression: \w+
Which of the following string records would NOT be completely identified by this regular expression?
Goodbye
1234
TIGERS
Clemson University
Clemson University
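The answer can be checked with a small sketch using Python's re module, assuming it treats \w the same way as the course's RegEx tool for these plain ASCII strings. The space in "Clemson University" is not a word character, so \w+ cannot cover the whole string in one match.

```python
import re

pattern = re.compile(r"\w+")
candidates = ["Goodbye", "1234", "TIGERS", "Clemson University"]

# fullmatch succeeds only when the pattern covers the entire string
full_matches = {text: pattern.fullmatch(text) is not None for text in candidates}
print(full_matches)
# {'Goodbye': True, '1234': True, 'TIGERS': True, 'Clemson University': False}

# The pattern still finds each word separately, just not the string as a whole
pieces = pattern.findall("Clemson University")
print(pieces)  # ['Clemson', 'University']
```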
Which of the following string records would be completely identified by this regular expression? \d
1
12
123
1234
1
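A similar sketch with Python's re module, under the same assumption about regex behavior: \d with no quantifier matches exactly one digit, so only the one-character record is matched in full.

```python
import re

pattern = re.compile(r"\d")
candidates = ["1", "12", "123", "1234"]

# Keep only the records that the pattern covers from start to end
fully_matched = [text for text in candidates if pattern.fullmatch(text)]
print(fully_matched)  # ['1']
```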
T/F: Both the FORMULA tool and the MULTI-ROW FORMULA tool a) can create new columns or modify existing columns, b) use an expression editor to input functions, and c) can apply multiple expressions per tool
F: Multi-Row Formula tool only allows one expression per tool
T/F: When transposing data using the TRANSPOSE tool, the tool will output at least two columns of data with standardized names (called NAME and VALUE).
T
T/F: The OUTPUT tool allows us to output data to one single file or can group data and output this grouped data into separate files
T
Stacked Bar Charts, Line Charts, and Combo Charts are all examples of what type of visualizations?
Comparison
Which type of visualization is not a distribution visualization?
Line Chart
What has research found to be the least accurate way to compare magnitudes in a visualization?
Color
T/F: Exploratory visualizations are used as part of the analysis phase (not the communication phase) and are used to develop a question/problem that has not been clearly defined and to help assess a question without a clear answer
T
T/F: Any visualization can be exploratory or explanatory. The distinction relates to how/why they are being used and how they are presented (if presented at all)
T
T/F: Pre-attentive attributes require conscious thought in order to be detected by your brain
F
T/F: Removing redundant labels from visualizations helps increase the “data-to-ink ratio”
T
T/F: Benford’s law applies to all datasets, as long as there are a sufficient number of observations in the data
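F: Benford’s law only applies to datasets in which every digit 1 through 9 is able to occur as the leading digit (for example, values not restricted to a narrow range); in such data, smaller leading digits occur more often than larger ones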
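For reference, Benford's law predicts that a leading digit d appears with probability log10(1 + 1/d). The sketch below only tabulates those expected shares; it does not test any real dataset.

```python
import math

# Expected share of each leading digit under Benford's law
expected = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for digit, share in expected.items():
    print(digit, round(share, 3))

# The nine shares cover every possible leading digit, so they sum to 1
total = sum(expected.values())
```

A leading 1 is expected in roughly 30% of records and a leading 9 in under 5%, which is why an audit test compares observed leading-digit counts against these shares.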
T/F: A common audit area in which analytics can streamline the audit is performing tests on massive journal-entry populations to identify risks and items of audit interest
T
Assume you want to estimate the following model:
collGPA = a + B1hsGPA + B2ACT + e
collGPA = college GPA, hsGPA = high school GPA, and ACT = ACT score
Assume the estimated model is:
collGPA = 1.29 + 0.453hsGPA + 0.0094ACT
According to our model, what is the expected college GPA of a student with a high-school GPA of 3.49 and an ACT score of 21?
Round to 2 decimal places.
3.07
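The arithmetic behind this answer, as a short sketch. The function name predict_coll_gpa is a made-up helper; the coefficients come straight from the estimated model above.

```python
def predict_coll_gpa(hs_gpa, act):
    """Plug values into the estimated model: 1.29 + 0.453*hsGPA + 0.0094*ACT."""
    return 1.29 + 0.453 * hs_gpa + 0.0094 * act

# 1.29 + 0.453 * 3.49 + 0.0094 * 21 = 1.29 + 1.58097 + 0.1974 = 3.06837
predicted = predict_coll_gpa(3.49, 21)
print(round(predicted, 2))  # 3.07
```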
Assume you want to estimate the following model:
collGPA = a + B1hsGPA + B2ACT + e
collGPA = college GPA, hsGPA = high school GPA, and ACT = ACT score
Assume the estimated model is:
collGPA = 1.29 + 0.453hsGPA + 0.0094ACT
Our model has an R² of 0.13.
T/F: An R² of 0.13 indicates that the sample students’ high school GPAs and ACT scores explain 13% of the variance in college GPAs.
T
Assume you want to estimate the following model:
collGPA = a + B1hsGPA + B2ACT + e
where collGPA = college GPA, hsGPA = high-school GPA, and ACT = ACT score.
Assume the estimated model is:
collGPA = 1.29 + 0.453hsGPA + 0.0094ACT
T/F: Assuming two students have the same ACT score, if student 1 has a high school GPA of 3.0 and student 2 has a high school GPA of 4.0, then this model would predict that student 1’s college GPA is 0.453 points lower than student 2’s college GPA.
T
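A quick check of this interpretation, as a sketch. The helper predict_coll_gpa and the shared ACT score of 21 are assumptions made for the example; any common ACT score cancels out of the difference.

```python
def predict_coll_gpa(hs_gpa, act):
    """Plug values into the estimated model: 1.29 + 0.453*hsGPA + 0.0094*ACT."""
    return 1.29 + 0.453 * hs_gpa + 0.0094 * act

shared_act = 21  # assumed value; both students have the same ACT score

student_1 = predict_coll_gpa(3.0, shared_act)
student_2 = predict_coll_gpa(4.0, shared_act)

# A one-point gap in high school GPA shifts the prediction by the hsGPA coefficient
difference = student_2 - student_1
print(round(difference, 3))  # 0.453
```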