Lecture 6 - Data analytics in accounting Flashcards
(!) Describe big data & the four V’s
General:
- Too large & complex dataset for existing system & traditional capabilities to capture, manage & analyze
- Raw data = Big data
- Push capability limit of IS
Four V's:
Volume:
- Data amount
- Amount of involved data
Velocity:
- Data speed
- Speed of data generation or analysis
- Quick speed & real time data
- Eg. Streaming
Variety:
- Data structure
- Data form
- Structured data: Organized & fit in tables & databases. Eg. BS or P/L
- Unstructured data: Eg. Instagram
- Semistructured
Veracity:
- Data quality
- Cleanliness: Error or integrity issue
- Reliability: Fact > fiction
- Representational: Faithful data
- Eg. Cash in bank vs. estimate
(!) What is meant by data analytics & what are the benefits, costs & impact on accounting
General:
- Science of examining raw data
- Incl. tech, systems, practices, methods, databases & apps
- New poss. to analyze & assess data due to availability
- Increase in being critical
- Storage: Lower cost if in cloud
Benefits:
- Remove noise
- Organize data for DM
- Raw data –> info value
- Identify patterns for predictions
- Help sound & timely DM
- Discover risk & opportunities
- Investigate anomalies
- Forecast future behavior
- Produce value externally
- Produce value internally: Processes, productivity, utilization & growth
- Improve productivity, utilization & growth
Costs:
- Time consuming: Extract transform & load / ETL
- Sometimes req. impossible processing power
Impact on accounting:
- Discover risk & opportunities
- More time to present findings & DM
- Expand capabilities: Test fraud & automate compliance-monitoring
- Higher quality & consistency
- Quick & accurate forecasting
(!) Describe the terms extract, transform & load / ETL and its costs
Extract:
- Pull data from the source & scrub it of unfamiliar data & noise so it is analyzable & useful
Transform:
- Reformat, clean & consolidate from multiple sources
- Time consuming: 50-90% of time
Load:
- Load cleaned data into the target database or analysis tool
Costs:
- Salary to data analytics scientist for scrubbing data
- Cost of tech to prepare & analyze data
- Cost to produce data
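The three ETL steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the raw CSV text, the column names and the `sales` table are hypothetical and not from the lecture.

```python
import csv
import io
import sqlite3

# Hypothetical raw export: inconsistent casing, stray spaces, one missing amount.
RAW_CSV = """customer,amount,date
 Alpha A/S ,1200.50,2023-01-15
beta aps,,2023-01-16
ALPHA A/S,300,2023-02-01
"""

def extract(text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: clean names, drop rows with a missing amount, convert types."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:  # missing value: skip rather than guess
            continue
        cleaned.append((row["customer"].strip().upper(), float(amount), row["date"]))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the analysis database."""
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL, date TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer"
).fetchall()
print(totals)
```

Most of the code sits in the transform step, which fits the note that cleaning takes the bulk of the time.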
(!) Describe the data life cycle
.
Describe the different datatypes
Give some examples on use of data analysis in research
Management:
- HR-challenges
- Customer behavior & marketing
- Product dev. & innovation
- Global Value chain & future resilience
- Sustainability, governance & public policy issues
___________
Auditing:
General:
- Slow on adopting big data tech
- Extent of use in practice unknown
Financial distress modelling:
- Data mining to detect & forecast financial failure of firms
- Important for going concern evaluations
Financial fraud modelling:
- Assess risk of fraud
Stock market prediction & quantitative modelling:
- Predictive analysis & invest. advice to managers & investors
(!) Describe integration of data analytics in ERP systems for MA
.
(!) Describe the AMPS model
General:
- Circular
- About data analytics process
- Make DM’er more knowledgeable
__________
A - Ask the question:
- Must be narrow
___________
M - Master the data:
General:
- ETL: Extract, transform, load
- Ref. ADS
Data Accessibility:
- Data needed
- Data access
- Cost to acquire & process data
- Cost vs. benefit
Data Reliability:
- Clean & reliable data?
- Missing values?
- Need cleaning bf. use?
- Age of data: Usable?
Data Integrity:
- Accurate, valid & consistent data
- Reliability vs. relevant data
Data type:
- Privacy concern: Risk & allowed use?
- Structured, unstructured or semi structured: Impact on use.
- Internal/external?
- Readable?
- Numerical or categorical?
____________
P - Perform the analysis:
General:
- Req. appropriate data analytic technique
Descriptive analysis:
- What happened?
- Characterize, summarize & organize past performance
- Try understand
- Eg. Did we make profit last year?
Diagnostic analysis:
- Why did it happen?
- Investigate underlying cause
- Eg. Why did ad expenses increase while sales fell?
Predictive analysis:
- Will it happen in future?
- Foresight: Past pattern
- Eg. What is the risk of the firm going bankrupt?
Prescriptive analysis:
- What to do based on expectations
- Identify best poss. options within constraint & changing conditions
- Eg. Sales level to break-even
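The break-even example can be worked through as a small calculation. This sketch assumes the standard single-product formula (fixed costs divided by contribution margin per unit); the figures are made up for illustration.

```python
def break_even_units(fixed_costs, price, variable_cost):
    """Units that must be sold so that total contribution covers fixed costs."""
    margin = price - variable_cost  # contribution margin per unit
    if margin <= 0:
        raise ValueError("price must exceed variable cost per unit")
    return fixed_costs / margin

# Hypothetical figures: 50,000 fixed costs, price 25, variable cost 15 per unit.
units = break_even_units(fixed_costs=50_000, price=25.0, variable_cost=15.0)
print(units)  # 5000.0
```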
____________
S - Share the story:
- Interpretation
- Share results
- Data visualization
(!) Describe the audit data standard / ADS
General:
- Standard for data files & fields
- Provider & user w. same standard
- Reduce clean & format data costs
- Support external auditing
- Ensure complete & valid population
Benefits:
- Less time & effort to access data
- Work well w. standard audit & risk analytic tests
- Allow software vendors (ACL) to prod. data extraction programs for given enterprise system: Detect & prevent fraud & manage risk
- Facilitate test of N > Sample
- Work well w. XBRL GL Standards
(!) Describe Altman's Z-score
General:
- Predict bankruptcy: Likelihood
- Credit-strength test
- Five financial ratios calc. as basis
- From data in annual report
- Below 1.8 = Heading for bankruptcy
- Above 3 = Solid financial position
Ratios:
- Working capital / Total assets
- Retained earnings / Total assets
- EBIT / Total assets
- MV of equity / Total liabilities
- Sales / Total assets
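A sketch of how the five ratios combine into one score. The weights (1.2, 1.4, 3.3, 0.6, 1.0) are not in these notes; they are the commonly cited weights of Altman's original model for listed manufacturing firms. The cut-offs follow the 1.8 and 3 given above, and the input figures are hypothetical.

```python
def altman_z(working_capital, retained_earnings, ebit,
             market_value_equity, sales, total_assets, total_liabilities):
    """Weighted sum of the five ratios (weights: commonly cited original model)."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_value_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5

def zone(z):
    """Interpret the score using the cut-offs from the notes (1.8 and 3)."""
    if z < 1.8:
        return "heading for bankruptcy"
    if z > 3.0:
        return "solid financial position"
    return "grey zone"

# Hypothetical annual-report figures.
z = altman_z(working_capital=200, retained_earnings=300, ebit=150,
             market_value_equity=600, sales=1000,
             total_assets=1000, total_liabilities=400)
print(round(z, 3), zone(z))
```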
(!) Describe data visualization & the process
General:
- Ref: Share the story
- Present info graphically
- Data –> info
- Present info for DM'er
__________
Process:
Understand data:
- Ref. ETL
Select data visualization tool:
- Excel
- Tableau
- Power-BI
Develop & present visualization:
- Design critical for effective info presentation
- Focus attention
- Avoid info overload: Less than processing power
(!) Describe considerations on data visualization
General:
- Consider axes
- Remember user
- Choose right chart type
- Use color & size for focus
- Provide key insights
- Consider delivery: Web or in person?
(!) Describe the elements of performing & sharing data analysis
Get data:
- Clean data
Set relationships among tables:
- Data > Relation
- Foreign & primary key
- Use structure to develop insights
Select visualization attributes:
- Attributes supporting story
- Sometimes need calc.
Select & modify visualization:
- Right chart type
- Relevant filters
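The primary/foreign key relationship can be illustrated outside Excel as well. A minimal sketch using Python's built-in `sqlite3`; the `customers` and `invoices` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,          -- primary key
    name        TEXT NOT NULL
);
CREATE TABLE invoices (
    invoice_id  INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),  -- foreign key
    amount      REAL NOT NULL
);
INSERT INTO customers VALUES (1, 'Alpha'), (2, 'Beta');
INSERT INTO invoices VALUES (10, 1, 500.0), (11, 1, 250.0), (12, 2, 100.0);
""")

# Use the relationship to develop an insight: revenue per customer.
revenue = conn.execute("""
    SELECT c.name, SUM(i.amount)
    FROM customers AS c
    JOIN invoices AS i ON i.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(revenue)

# An invoice pointing at a non-existent customer is rejected.
try:
    conn.execute("INSERT INTO invoices VALUES (13, 99, 75.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```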
Describe examples in the era of digital transformation
- Blockchain technology
- Artificial Intelligence / AI
- Algorithms
- Big Data
- Robotics
- Cloud Computing
- Internet of Things
- Cybercrime
- Fraud
(!) Describe blockchain
General:
- Shared ledger
- Data structure of transactions in blocks & chains
- Eliminate need for intermediaries in trustless, online, peer-to-peer digital currency transactions
- Eliminate middlemen in peer-to-peer transaction: Bitcoin
- Nodes = Computers
- Before smart contracts: No regulation, open network & anonymous transactions
- A secure form of AIS
Situations for use:
- Lack trust: Since agreement
- Errors or fraud by middleman
- If manual verification
- Supply chain: Transport
- Loyalty program: Customer files
- Auto industry: Process until delivery
Benefits:
- Faster
- Cut cost & resources
Consequences:
- Complex
(!) Describe the three requirements to blockchain
Distributed & decentralized:
- Data distributed & synchronized w. whole network
- Fair participation
Consensus:
- Whole network validate & aware of transactions
- Confirmation of blocks
Immutability:
- When transactions confirmed: Not changeable
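Why confirmed transactions are hard to change can be shown with a toy hash chain. This is a simplified sketch, not a real blockchain: there is no network or consensus, and the block layout and helper names are assumptions made for illustration. It only shows that editing an old block breaks the stored hashes.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(index, transactions, previous_hash):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "transactions": transactions, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def add_block(chain, transactions):
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "transactions": transactions,
        "previous_hash": previous_hash,
        "hash": block_hash(index, transactions, previous_hash),
    })

def is_valid(chain):
    """Recompute every hash and check each block points at its predecessor."""
    for i, block in enumerate(chain):
        expected_previous = chain[i - 1]["hash"] if i else GENESIS_HASH
        if block["previous_hash"] != expected_previous:
            return False
        recomputed = block_hash(block["index"], block["transactions"], block["previous_hash"])
        if block["hash"] != recomputed:
            return False
    return True

chain = []
add_block(chain, [{"from": "A", "to": "B", "amount": 5}])
add_block(chain, [{"from": "B", "to": "C", "amount": 2}])
valid_before = is_valid(chain)

chain[0]["transactions"][0]["amount"] = 500  # tamper with a confirmed transaction
valid_after = is_valid(chain)
print(valid_before, valid_after)
```

In a real network every node holds a copy and runs this kind of check, so a tampered copy is rejected by the others.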
(!) Compare traditional transactions vs. Blockchain
Traditional transactions:
- Middleman: Yes
- Delays: Days
- Service fee: Yes
- System: Centralized ledger
- Data: Mid.man approve & record
- Copies: One
Blockchain:
- Middleman: No
- Fast transaction time: Minutes
- Service fee: Low
- System: Decentralized distributed ledger
- Data: All in network has copy
- Copies: Multiple
- Nodes in sync if new transactions
- Network see if add or delete info
- Secure, distributed data store
- Write-once, read-many system: No edit after committed
Describe popular consensus algorithms
General:
- New blocks added to longest chain
- How to agree on data
Proof of work:
- Miners compete to create block by solving complex math problem
- Require computer power to solve: Prevent attacks
Proof of authority:
- Few network members are administrators
- Administrator create blocks for rest of network
- Administrator identity known: Ensure no bad behavior
- Network can vote to remove admin
Proof of stake:
- Validator proposing next block locks up crypto to ensure honest behavior
- Reduce computer cost & centralization risk
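The "complex math problem" in proof of work can be sketched as a search for a nonce that gives a hash with a required prefix. This is a simplified illustration: Bitcoin's actual rule compares a double SHA-256 hash against a numeric target, and the block text and difficulty below are assumptions.

```python
import hashlib

def mine(block_data, difficulty):
    """Try nonces until the hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

# Finding the nonce takes many attempts; checking it takes one hash.
nonce, digest = mine("block 1: A pays B 5", difficulty=4)
print(nonce, digest)
```

Each extra zero digit multiplies the expected work by 16, which is why rewriting history requires so much computing power.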
(!) Describe crypto currency
- No intrinsic value: Not redeemable
- No physical form: Only in network
- Network decentralized
- No bank
Describe how blockchains works
(!) Describe the difference btw. ERP & block chain
ERP:
- Centralized
- Manipulation risk: High
- Data: Many operations
- Database: Relational
- Labour-intensive: Yes. Humans
- Self-enforcing contracts: None
- Controls: Designed specific
- Accounting specific modules: Yes
Block chain:
- Decentralized & distributed
- Manipulation risk: Low
- Data: Only add
- Database: Linear transactional
- Labour-intensive: No
- Self-enforcing contracts: Easy by smart contracts
- Controls: By smart contracts. Program run when conditions met
- Accounting specific modules: No
(!) Describe Bitcoin & Ethereum
Bitcoin:
- First cryptocurrency
- Eliminate ability to double spend
- Often rejected since anonymous
- Transaction fee: Higher
- Business rules: Fixed
- Anonymous P2P transactions: No middleman
- Public blockchain: Anyone can join or leave & at any time
- Not changeable transaction history
- Add block to blockchain about every 10 min
- Validate by proof of work
- Rewards as economic incentive via mining: Resource intensive computation
- Distributed ledger
- Transaction fee differ by block size
- Mining reward halves about every 4 years
- Mining reward: 12.5 bitcoin (2016-2020 level; halved since)
- Unspent transaction output / UTXO system: Spend cash - receive change
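The halving rule can be written as a one-line formula. The sketch assumes the well-known protocol parameters, which are not stated in these notes: an initial reward of 50 bitcoin and a halving every 210,000 blocks (about 4 years at one block per 10 minutes). It uses floats for readability, whereas the real protocol works in whole satoshis.

```python
INITIAL_REWARD = 50.0        # bitcoin per block at launch (assumed protocol parameter)
HALVING_INTERVAL = 210_000   # blocks between halvings (assumed protocol parameter)

def block_reward(height):
    """Mining reward in bitcoin for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_REWARD / (2 ** halvings)

print(block_reward(0))        # 50.0
print(block_reward(420_000))  # 12.5
print(block_reward(840_000))  # 3.125
```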
Ethereum:
- Currency: Ether
- Transaction fee: Lower & differ by computation complexity
- Program smart contracts
- Constant mining
- Add block to b-chain each 10-12s
- Business rules: Programmable
- Mining reward: 3 Ether
- Account debited or credited
Describe cryptocurrency
General:
- Application of BC technology
- No intrinsic value: Not redeemable
- No physical form: Only in network
- Network decentralized
- No bank
Examples on transaction data:
- Sender
- Receiver
- Quantity of transferred bitcoin
- Timestamp of block transactions in chronological order
- Transaction blocks chained together
(!) Describe a smart contract
- Facilitate digital asset exchange
- Programmable in Ethereum: Not just hard coded fixed business rule
- More flexible
- Program run when conditions met
- Emerged w. Blockchain 2.0
- Can’t change when executed
- Business logic into software code
- Software code define terms, business rules & asset transfer
- Ensure following business rules
- Logic & business rules into contracts
- Computer program
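"Program run when conditions met" can be illustrated with a toy escrow. Real Ethereum smart contracts are usually written in a contract language such as Solidity and run on the network; this Python class is only an analogy, and `EscrowContract` and its rules are hypothetical.

```python
class EscrowContract:
    """Pay the seller automatically once payment AND delivery are both recorded."""

    def __init__(self, price):
        self.price = price
        self.deposit = 0
        self.delivered = False
        self.settled = False
        self.seller_balance = 0

    def pay_in(self, amount):
        self.deposit += amount
        self._settle_if_conditions_met()

    def confirm_delivery(self):
        self.delivered = True
        self._settle_if_conditions_met()

    def _settle_if_conditions_met(self):
        # Business rule written as code: no one decides manually.
        if self.settled:
            return  # an executed contract is not run again
        if self.delivered and self.deposit >= self.price:
            self.deposit -= self.price
            self.seller_balance += self.price
            self.settled = True

contract = EscrowContract(price=100)
contract.pay_in(100)
paid_before_delivery = contract.seller_balance  # still 0: delivery not confirmed
contract.confirm_delivery()
paid_after_delivery = contract.seller_balance   # 100: both conditions met
print(paid_before_delivery, paid_after_delivery)
```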
(!) Describe the different types of block chains
Public blockchain
- Blockchain: Permissionless
- View & participate: No access restriction
- Economic reward for mining: Computing proof of work
- Eg. Bitcoin or Ethereum
Private blockchain
- Blockchain: Permissioned
- Join network: Need invite
- Enterprise blockchain
- No public exposure of internal info
- Transaction data & validation: Restricted
Consortium blockchain
- Blockchain: Permissioned
- Join network: Only relevant
- Allow more complicated enterprise behavior
- Org. participation: Allowed
- Allow private channels
- Consensus only on limited set of trusted nodes
- Eg. Hyperledger or Corda
Describe Corda
- Open source BC platform
- Use smart contract for business rules
- Only relevant party can join: Consortium BC
- Only related party informed on each transaction
- Administrators define & restrict users access rights
- Only restricted set of trusted nodes execute consensus protocol
- Developed by R3
(!) Describe Hyperledger
General:
- Open source BC platform
- Seek cross industry collab
- Smart contracts
- Private channels: Only visible for granted parties
- Configurable consensus
- Member management services
- By Linux foundation
(!) Describe the challenges of adopting Blockchain technology, solutions & the future for BC
General:
- Protocols lack req.: Speed, confidentiality & governance
- Sometimes visibility only needed for part of network
- Firm often start w. permissioned or private networks: Req. method to justify network participants
- Hard integrating private BC network w. existing enterprise solutions
- Blockchain not built for enterprise use
Solution:
- Blockchain 3.0
- Expand BC systems further: Beyond financial & business apps.
Future of BC:
- Cloud storage products
- Voting systems
- Attestation services
- Government adm.
Describe the impact BC has on auditing & assurance
Challenges:
- Require skillset: Programming & technology
- Auditor must understand how firm has implemented & uses BC technology in its end-to-end business applications
- Illegal, fraudulent, or unauthorized transactions can appear in BC
- Records in BC network may have insufficient info for auditors
- Complicated protocols in the distributed ledger system
- Diff. BC platforms w. diff. protocols & terms make applying traditional auditing to BC technology difficult
- Understanding smart contracts is crucial to audit BC use cases
__________
Benefits:
Continuous Audit possible:
- Since distributed & real-time info
(!) What is AI & its tasks
General:
- Machine intelligence > Human int.
- Computer ability of human tasks
- Cognitive technology
- Eg. Self driving cars
Tasks:
- Think logic
- Act rational
- Visual perception
- Speech recognition
- Language translation
Cognitive technologies:
- Self-learning algorithms: Computer examine connections & notice patterns w/o human intervention
- Eg. Machine learning, bots, robotics process automation, neural networks
(!) Describe machine learning
- A type of AI
- Computers ability to learn from experience > instructions
- Learn from training cases or data
- Self learning: Notice patterns
- Gather, classify & correlate data
- Learn from training w. data
- Predict new product or data
- Eg. Deep learning
(!) Describe neural networks
General:
- Engines of machine learning
- Math models convert input to output
- In-& output can be nested together
- Clean, well trained data set is used to optimize predictions
- Output = Predictions
- Trained model applied to more data
- Include loops
Feed-forward neural networks:
- Info in one direction
Recurrent neural networks:
- Connection btw. neurons has loops
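A feed-forward pass can be sketched in plain Python: each layer takes the previous layer's output as its input, and information only moves forward. The weights below are arbitrary, untrained numbers chosen for illustration, and the sigmoid activation is an assumption.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One layer: weighted sum of the inputs per neuron, then activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

def feed_forward(inputs, network):
    """Pass the data through the layers in one direction (no loops back)."""
    activations = inputs
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations

# Hypothetical network: 2 inputs -> 2 hidden neurons -> 1 output neuron.
network = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, 0.1]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                   # output layer
]
prediction = feed_forward([1.0, 2.0], network)
print(prediction)
```

Training would adjust the weights and biases so that predictions match known outputs; a recurrent network would additionally feed some activations back in as inputs.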
(!) Describe deep learning
- Form of machine learning
- Multilayer neural network
- More complex
- More than two non-output layers
- Solve more sophisticated problems
- Often 2-3 hidden layers
(!) Describe what machine learning is designed to perform
Two types of machine learning performances:
Classification:
- Assign labels: Eg. Yes or no
- Divide input in output groups
Regression:
- Seek to predict real numbers
- Eg. House price or revenue
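The difference between the two outputs can be shown with a tiny example. This sketch uses ordinary least squares on one variable (a simple statistical method, not a neural network) to predict a number, then a threshold rule to assign a label; the house sizes, prices and threshold are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data: size in m2 and price in thousands.
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]
slope, intercept = fit_line(sizes, prices)

# Regression: the output is a real number.
predicted_price = slope * 90 + intercept

# Classification: the output is a label from a fixed set.
def classify(price, threshold=250):
    return "expensive" if price > threshold else "affordable"

label = classify(predicted_price)
print(predicted_price, label)
```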
(!) Describe different types of machine learning
Supervised learning:
- Output = Known set of values
- Neural network try predict output by input dataset
- Data has in-& output pairs
- Output often number or label
- Most often used
Semisupervised learning:
- Only some data labeled: Other labels incorrect or missing
- Computer distinguish values from incorrect & missing output
- Active learning: Seek user to discover right label/output
- Eg. Netflix
Unsupervised learning:
- Use unlabeled > labeled data
- Seek to find clusters or reduce dim.
- Try identify outliers in input data
- Diff parameters = Diff. results
- No input-output pairs
- Discover pattern in data
- Cluster if alike or not
Reinforcement learning:
- Trial/error
- Program act & learn by feedback on actions toward goal
- Goal is dynamic = Change by situ.
- Req. clear defined goals
- Req. not too complex situ.
- Eg. Games
(!) Assess model performance for classification
Confusion matrix:
General:
- Summary of predicted vs. actual results
True positive /TP:
- Correct predict positive class
- Eg. It was a spam email
True negative / TN:
- Correct predict negative class
- Eg. It was NOT a spam mail
False positive / FP:
- Incorrect predict positive class
- Type 1 error
- Eg. It was actually NOT a spam mail
False negative / FN:
- Incorrect predict negative class
- Type 2 error
- Eg. It was actually a spam mail
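The four cells can be counted directly from actual and predicted labels. A minimal sketch with made-up spam labels; accuracy, precision and recall are standard measures derived from the cells and are added here for illustration.

```python
def confusion_matrix(actual, predicted, positive="spam"):
    """Count TP, TN, FP and FN for a two-class problem."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1  # FP = Type 1 error
        else:
            counts["FN" if a == positive else "TN"] += 1  # FN = Type 2 error
    return counts

# Hypothetical results for six emails.
actual    = ["spam", "spam", "ham", "ham",  "spam", "ham"]
predicted = ["spam", "ham",  "ham", "spam", "spam", "ham"]

cm = confusion_matrix(actual, predicted)
accuracy = (cm["TP"] + cm["TN"]) / len(actual)
precision = cm["TP"] / (cm["TP"] + cm["FP"])  # share of predicted spam that was spam
recall = cm["TP"] / (cm["TP"] + cm["FN"])     # share of actual spam that was caught
print(cm, accuracy, precision, recall)
```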
(?) Describe some AI applications in accounting
Natural Language Processing / NLP & Natural Language Understanding / NLU:
- Machine communication
- Understand text
- Extract semantic meaning
- Discern sentiment from natural language
- Eg. Alexa
Robotic process automation / RPA:
- Reduce human labor: Automation
- Tool performing high-volume repetitive accounting tasks
- Not necessarily an AI tool
Machine Learning in audit & assurance:
- GL.ai (PWC): Replicate thinking & DM
- Deloitte’s machine: Review contracts & identify key terms
- Helix GLAD (EY): Detect anomalies in large database
- CLARA (KPMG): Potential risks
- BDO neural networks: Manage info in multiple languages globally
(?) Describe the elements of performing & sharing data analysis in Excel
Examine data & determine how tables connect:
Insert tables for data in each sheet:
- Rename tables
- Adjust format
Set relationships btw. tables:
- Data > Relationships
- Link table w. foreign & primary key
Summarize w. PivotTable:
- Format fields
- Change names
- Add slicers as filters
- Chart results in PivotChart
Describe the difference btw. a database, a data warehouse & big data
Database:
- Single data source
Data warehouse:
- Multiple data sources
- Only structured data
Big data:
- Also unstructured data