Big Data Analytics Flashcards
What is the promise of AI?
Unbiased, consistent decision-making that leverages data optimally
What are the key benefits of AI for financial managers?
-Data processing: using both structured and unstructured data
-Improving efficiency: reducing costs by automating day-to-day assistance in risk management
-Real-time analysis and prediction
-Business decisions: greater predictive insight, visibility of risk
What is the HIPPO principle?
Decisions are often driven by the Highest Paid Person’s Opinion rather than by evidence. AI attempts to make decisions objective by making them data-driven instead.
What does HIPPO stand for?
Highest Paid Person’s Opinion
What is the key (“oil”) to AI?
Data: brings key advantages expressed in the 5Vs
What are the 5Vs?
Volume, Velocity, Value, Veracity, Variety
Where is the Value Creation in Data?
Google, Amazon and Facebook are integrated models that own their communities.
Pure “pipes” are worth 10x less than communities.
Pure players are niche providers.
What is the nº1 data business?
Advertising
What are the valuation drivers?
The GAFA (Google, Apple, Facebook, Amazon), mostly Google and Amazon
What is the Gartner data value ladder?
It’s a graph describing different business analytics types and their impact on corporate culture.
What are the 4 types of analytics in the Gartner value ladder?
Descriptive: what happened?
Diagnostic: Why did it happen?
Predictive: What will happen?
Prescriptive: How can we make it happen?
What are the key success factors of AI projects?
Human Acceptance: 70%
Tech: 20%
Algorithms: 10%
What is the discriminatory risk in AI?
AI is a clustering tool, and consumer clustering is inherently discriminatory.
e.g. if it sees that the main demographic buying beer is men, it will only promote beer discounts to men.
What is the ethical problem with AI?
How can AI make moral trade-offs, and how can we agree on a moral code that all AI should follow (law, ethics, “societal value”)?
What are the steps of the 5-step data process?
Collect data
Clean and format data
Store data
Transform data
Use data
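The five steps above can be sketched end to end in a few lines. This is only a minimal illustration under stated assumptions: the sample records, the `sales` table and its field names are hypothetical, and an in-memory SQLite database stands in for real storage.

```python
import json
import sqlite3

# 1. Collect: a hypothetical JSON payload standing in for a real source.
raw = (
    '[{"product": "beer", "amount": "12.5"},'
    ' {"product": "wine", "amount": null},'
    ' {"product": "beer", "amount": "7.5"}]'
)
records = json.loads(raw)

# 2. Clean and format: drop records with a missing amount, cast the rest to float.
clean = [
    (r["product"], float(r["amount"]))
    for r in records
    if r["amount"] is not None
]

# 3. Store: an in-memory SQLite table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", clean)

# 4. Transform: aggregate the amounts per product.
totals = dict(
    conn.execute("SELECT product, SUM(amount) FROM sales GROUP BY product")
)

# 5. Use: report the result.
print(totals)
```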
Where do you collect data?
Databases
Internet
Social Networks
Files (JSON)
How do you gather data from the internet?
By web scraping: collecting data published directly on websites
What do you need for internet scraping?
It requires interpreting the content of HTML pages in order to extract the content fields
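As a rough sketch of what "interpreting the content of HTML pages" can look like, the example below extracts fields with Python's standard `html.parser` module. The HTML snippet, the `product`, `name` and `price` class names, and the `ProductParser` class are all hypothetical; a real page would first have to be downloaded and would rarely be this regular.

```python
from html.parser import HTMLParser

# Hypothetical product-catalogue snippet standing in for a downloaded page.
PAGE = """
<ul>
  <li class="product"><span class="name">Lager</span><span class="price">2.50</span></li>
  <li class="product"><span class="name">Stout</span><span class="price">3.10</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price"> tags."""

    def __init__(self):
        super().__init__()
        self.products = []   # one dict per product
        self._field = None   # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "product":
            self.products.append({})          # a new product starts here
        elif tag == "span" and css_class in ("name", "price"):
            self._field = css_class           # remember which field follows

    def handle_data(self, data):
        if self._field and self.products:
            self.products[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None                # the field text has ended


parser = ProductParser()
parser.feed(PAGE)
print(parser.products)
```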
When does internet scraping work best?
It works well with structured content such as product catalogues and pages generated by a CMS.
What are the benefits of the JSON format?
Very Flexible
Very Easy to Parse
Strong Momentum
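To illustrate the flexibility and ease of parsing, here is a small sketch using Python's standard `json` module. The record and its field names are made up for the example.

```python
import json

# Hypothetical record: nested objects and lists need no predefined schema.
text = '{"user": "alice", "tags": ["finance", "risk"], "profile": {"age": 34}}'

record = json.loads(text)          # parse the text into dicts and lists
print(record["profile"]["age"])    # nested fields are plain lookups
record["tags"].append("ai")        # the structure can grow freely
print(json.dumps(record))          # serialize back to JSON text
```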
What is data cleaning?
It’s the process of detecting and correcting (or removing) corrupt or inaccurate records
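A minimal sketch of detecting, correcting and removing records might look like the following. The records, the `clean` function and the accepted age range are assumptions chosen for illustration; real cleaning rules depend on the dataset.

```python
# Hypothetical raw records with typical defects.
raw = [
    {"id": 1, "age": "34"},
    {"id": 2, "age": " 41 "},   # stray whitespace: correctable
    {"id": 3, "age": "abc"},    # not a number: corrupt, removed
    {"id": 1, "age": "34"},     # duplicate id: removed
    {"id": 4, "age": "-5"},     # outside the assumed range: removed
]


def clean(records, min_age=0, max_age=120):
    """Return corrected records, dropping those that cannot be repaired."""
    seen, cleaned = set(), []
    for rec in records:
        if rec["id"] in seen:              # detect duplicates
            continue
        try:
            age = int(rec["age"].strip())  # correct formatting, cast to int
        except ValueError:                 # detect corrupt values
            continue
        if not min_age <= age <= max_age:  # detect inaccurate values
            continue
        seen.add(rec["id"])
        cleaned.append({"id": rec["id"], "age": age})
    return cleaned


cleaned = clean(raw)
print(cleaned)
```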
What is the Tidy Data principle?
Tidy datasets are easy to manipulate, model and visualize, and have a specific structure.
How do you apply the Tidy Data principle?
- Each variable: one column.
- Each observation: one row.
- There should be one table for each “kind” of variable.
- If you have multiple tables, they should include a column that allows them to be linked.
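The rules above can be illustrated by reshaping a small "wide" table into a tidy one. This is a sketch on invented data: the `customer_id`, `sales_2022` and `sales_2023` columns and the `customers` table are hypothetical.

```python
# Hypothetical "wide" table: the year variable is hidden in the column names.
wide = [
    {"customer_id": 1, "sales_2022": 100, "sales_2023": 120},
    {"customer_id": 2, "sales_2022": 80, "sales_2023": 95},
]

# Tidy version: one column per variable (customer_id, year, sales),
# one row per observation (one customer in one year).
tidy = [
    {
        "customer_id": row["customer_id"],
        "year": int(col.split("_")[1]),
        "sales": value,
    }
    for row in wide
    for col, value in row.items()
    if col.startswith("sales_")
]

# A second table for a different "kind" of variable,
# linked to the first through the customer_id column.
customers = [
    {"customer_id": 1, "name": "Alice"},
    {"customer_id": 2, "name": "Bob"},
]
names = {c["customer_id"]: c["name"] for c in customers}

for obs in tidy:
    print(names[obs["customer_id"]], obs["year"], obs["sales"])
```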
Why do we store data?
Big data storage keeps large datasets organized so that they can easily be accessed, sorted and processed using the right tools.