Lecture 10: Big Data Flashcards
Define Big Data and describe its key characteristics (the Four V’s)
Big data are “datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyse”
- Volume
- Quantity of data
- Exponential growth in generation, collection and storage of data
- Challenge: datasets are of a size beyond the ability of typical database software tools to capture, store, manage and analyse. New forms of physical infrastructure (e.g. the cloud), hardware and software (e.g. Hadoop) are required.
- Velocity
- The speed at which data is acquired and used
- Challenge: Determine meaning at a faster rate and in real time
- Variety
- The various types of data
- Data can be structured or unstructured. It is estimated that 20% of big data is structured and 80% unstructured.
- This is a challenge because traditional information systems and data analyses are structured. New sources of data require new analytical methodologies.
- Structured data: has a defined length and format
• E.g. numbers, dates, strings
• Usually stored in a database; can be queried using e.g. SQL (Structured Query Language)
• ‘Traditional’ accounting and financial data, including customer relationship management (CRM) data and Enterprise Resource Planning (ERP) data
Computer/Machine-generated
• Sensor data (e.g. smart meters, medical devices, GPS data)
• Web log data—servers, applications, networks
• Point-of-sale data
• Financial data (e.g. automated stock trading data)
Human-generated
• Input data
• Click-stream data
- Unstructured data:
Machine-generated
• Satellite images; scientific data; photographs and video; radar or sonar data
Human-generated
• Internal business textual data
• Social media data
• Mobile data
• Website content
- Veracity
- Refers to the truthfulness (reliability) of the data
- Challenge: Many data sources (esp. unstructured) are of doubtful veracity. Data may:
• Come from untrusted sources
• Be ‘dirty’ (inaccurate or incomplete)
• Have a low signal-to-noise ratio
- Quality of data analytics is only as good as quality of underlying data.
- Structured vs unstructured data
- Block-chain technology
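The points above on structured data (fixed format, queryable with SQL) and on veracity ('dirty' data) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the table name `sales`, its columns, the sample point-of-sale records and the two quality rules (amount present, amount not negative) are hypothetical and do not come from the lecture.

```python
import sqlite3

# Hypothetical point-of-sale records: (sale_id, sale_date, amount).
# None marks a missing value; the negative amount is a deliberate error.
SALES = [
    (1, "2016-03-01", 19.99),
    (2, "2016-03-01", None),   # incomplete: amount missing
    (3, "2016-03-02", -5.00),  # inaccurate: negative sale amount
    (4, "2016-03-02", 42.50),
]


def load_sales(rows):
    """Store the rows in an in-memory SQLite table with a fixed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (sale_id INTEGER, sale_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn


def dirty_sale_ids(conn):
    """Return ids of rows that are incomplete or obviously inaccurate."""
    cur = conn.execute(
        "SELECT sale_id FROM sales "
        "WHERE amount IS NULL OR amount < 0 ORDER BY sale_id"
    )
    return [row[0] for row in cur.fetchall()]


def total_clean_sales(conn):
    """Sum only the rows that pass the basic quality checks."""
    row = conn.execute(
        "SELECT SUM(amount) FROM sales "
        "WHERE amount IS NOT NULL AND amount >= 0"
    ).fetchone()
    return row[0]


conn = load_sales(SALES)
print(dirty_sale_ids(conn))     # [2, 3]
print(total_clean_sales(conn))  # total of sales 1 and 4 only
```

Because every record has the same defined fields, one SQL statement can both find the dirty rows and exclude them from the total. Unstructured sources such as video or free text offer no such schema to query against, which is why they need different analytical methods.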
Tutorial 10 Q2:
Why is big data analytics relevant to accountants? What are the potential applications and hence skills required by accounting graduates?
- Accounting records almost exclusively recorded in digital form nowadays
- Big data will impact on accounting in how data are accumulated and recorded, how mgmt. use data to measure and improve performance; how elements of reporting are processed and communicated, and how veracity is assured.
- Thus, changing skill sets required of the accounting profession:
• New audit skills, e.g. research and identify anomalies in risk factors underlying data; assurance of cybersecurity
• New tax skills, e.g. gather large amt of data in various forms and use it to help make tax dept business decisions; use technology to verify that remedial actions for calculations are within compliance limits
• New risk mgmt. skills, e.g. use simple vendor risk dashboards and filters to minimise inefficiencies + human errors
• New advisory skills, e.g. identify and frame key business decisions and their related metrics to deliver solutions more effectively and efficiently; extract the right data from different sources and run the most appropriate analytics solutions to generate insights
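As one possible illustration of the audit skill of identifying anomalies in underlying data, the sketch below flags unusual transaction amounts using a modified z-score based on the median absolute deviation. This is a swapped-in technique chosen for illustration, not a method named in the tutorial. The function name `flag_anomalies`, the sample amounts and the 3.5 cut-off (a common rule of thumb) are all assumptions.

```python
from statistics import median

# Hypothetical transaction amounts; index 5 is deliberately unusual.
AMOUNTS = [120.0, 115.5, 130.0, 118.0, 125.0, 9800.0, 122.5]


def flag_anomalies(amounts, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute
    deviation (MAD), so one extreme value does not distort the baseline.
    """
    med = median(amounts)
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        return []  # no spread in the data to measure against
    return [
        i
        for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]


print(flag_anomalies(AMOUNTS))  # [5]
```

A flagged item is only a prompt for follow-up work; it does not by itself show that the transaction is wrong.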
Tutorial 10 Q3:
How will video, image, audio and textual data (i.e. data beyond traditional accounting numbers) change accounting?
Video, image, audio and textual data made available via Big Data can provide for improved managerial accounting, financial accounting and financial reporting practices.
Video and Image Data:
- Surveillance footage to reveal when and how many times restricted-access area is entered
- Video of inventory to assess quantity changes in real time and identify bottlenecks
Audio Data:
- Include conference calls, s/h and board of director meetings, customer calls, internal employee phone calls, audio from video sources
- Can provide additional evidence to support accounting records—e.g. interviews with client employees to understand asset valuations; analysis of customer phone calls to assess customer satisfaction etc.
Textual Data:
- All non-financial documentation: regulatory filings, emails, corporate webpages, news media and social media.
- Use data to analyse customer satisfaction, employee satisfaction and identify fraud risks
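A very simple way to picture the use of textual data to identify fraud risks is a keyword scan over documents. The sketch below is illustrative only: the watch-list in `RISK_TERMS`, the sample emails and the function name `flag_documents` are hypothetical, and real text analytics would go well beyond substring matching.

```python
# Hypothetical watch-list; in practice the terms would be set by the audit team.
RISK_TERMS = ("backdate", "off the books", "override")

# Hypothetical internal emails keyed by document id.
EMAILS = {
    "email-1": "Please backdate the invoice to June.",
    "email-2": "Lunch on Friday?",
    "email-3": "Keep this payment off the books and override the approval.",
}


def flag_documents(docs, terms=RISK_TERMS):
    """Return {doc_id: [matched terms]} for documents containing any term.

    Matching is a case-insensitive substring test, so 'override' would
    also match 'overrides'.
    """
    flagged = {}
    for doc_id, text in docs.items():
        lowered = text.lower()
        hits = [term for term in terms if term in lowered]
        if hits:
            flagged[doc_id] = hits
    return flagged


print(flag_documents(EMAILS))
# {'email-1': ['backdate'], 'email-3': ['off the books', 'override']}
```

Flagged documents would still need human review, since a keyword hit says nothing about context or intent.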
2016 Semester 1 Exam Q7c)
Identify and describe key ethical and security concerns around the use of big data.
Ethics is an examination of principles, values, duties and norms, the consideration of available choices in order to make a decision, and the strength of character to act in accordance with that decision.
Key issues around the use of big data include:
- Data security and data breaches, such as the theft of data. Need to maintain confidentiality of data.
How secure is the data if it is shared with third parties? What is to stop third parties from on-selling the data?
- Information privacy, since personal information including names, addresses, medical records, bank details, photos etc. is often captured through an individual’s interactions and transactions online. Businesses often undertake data mining and customer profiling to target advertising to individuals (e.g. through in-store loyalty cards, use of ‘cookies’ to determine which banner ads to display).
Question is, would this be an invasion of individual privacy?
- Consent, since information about users can be gathered without consent, with informed consent or with implied consent.
But it is questionable whether individuals are providing informed consent for data collection. Can consent ever be fully informed?
- Ownership of personal data, i.e. whether it is the individual or the corporation who ‘owns’ and controls the data.
Is personal data an ‘asset’ in accounting terms?
- Compliance with privacy laws, given the difficulty of regulating transactions on the global internet.
** What processes could businesses put in place to make their actions ethically acceptable?