17. Data in Banking Flashcards
Are the following examples of data sources accessed by banks internally or externally?
- credit references
- credit scoring
- credit rating agencies
All external sources of data
True or false - banks collect data from sources which are in the public domain?
True
What is meant by ‘Big Data’? What are the defining properties/characteristics of big data? (3/4)
Large, complex datasets.
Defined initially by the ‘3 Vs’:
1. Volume
2. Velocity
3. Variety
IBM added the 4th V:
4. Veracity
What is the aim of Big Data analytics?
To find patterns in the datasets which create business value.
In relation to data, what do we need to consider when we look at its VOLUME?
The SCALE of the data
In relation to data, what do we need to consider when we look at its VELOCITY?
The FREQUENCY at which data is GENERATED, CAPTURED & SHARED.
In relation to data, what do we need to consider when we look at its VARIETY?
DIFFERENT FORMS originating from DIFFERENT SOURCES (e.g. video, audio, text)
In relation to data, what do we need to consider when we look at its VERACITY?
The UNCERTAINTY of the data.
Big Data is measured in the following units:
- Exabytes
- Yottabytes
- Petabytes
- Zettabytes
- Terabytes
Put them in order from smallest to largest.
How many of each are needed to make one of the next unit up?
Smallest - Largest:
- Terabytes
- Petabytes = 1024 Terabytes
- Exabytes = 1024 Petabytes
- Zettabytes = 1024 Exabytes
- Yottabytes = 1024 Zettabytes
(1024 is the binary convention; under SI decimal prefixes each step is 1000.)
Remember:
T-PEZY (sounds like Tepees; of the last two, Zetta comes before Yotta)
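The ordering and the 1024 multiplier can be checked with a small sketch. The names `UNITS`, `STEP` and `convert` are illustrative, and the factor of 1024 simply follows the card (it is an assumption; SI decimal prefixes would use 1000).

```python
# Units from smallest to largest, as ordered in the card above.
UNITS = ["Terabytes", "Petabytes", "Exabytes", "Zettabytes", "Yottabytes"]
STEP = 1024  # assumption: binary convention, as used in the card


def convert(value, from_unit, to_unit):
    """Convert a quantity between two of the units listed in UNITS."""
    # Positive when converting down to a smaller unit, negative when going up.
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * STEP ** steps


print(convert(1, "Petabytes", "Terabytes"))     # 1024
print(convert(2048, "Terabytes", "Petabytes"))  # 2.0
```

Each step up the list multiplies by 1024, so one Yottabyte is 1024 ** 4 Terabytes.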
What is Structured Data? What are its main characteristics? (3)
Data which has been highly organised within a relational database so that it’s easily accessible.
Info should be:
1. Easily searchable
2. Organised
3. Displayed by search engines
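As a rough illustration of why structured data is easy to search, the sketch below stores a few rows in an in-memory relational table and queries them. The table name, columns and customer rows are all made up for the example.

```python
import sqlite3

# In-memory relational database with a hypothetical accounts table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, holder TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO accounts (holder, balance) VALUES (?, ?)",
    [("Asha", 1200.0), ("Ben", 85.5), ("Chloe", 4300.0)],
)


def holders_above(threshold):
    """Return account holders whose balance exceeds the threshold, sorted by name."""
    rows = conn.execute(
        "SELECT holder FROM accounts WHERE balance > ? ORDER BY holder",
        (threshold,),
    )
    return [row[0] for row in rows]


print(holders_above(1000))  # ['Asha', 'Chloe']
```

Because every row has the same named fields, a single query can filter and sort the whole table.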
What is unstructured data?
Data which has not been organised in a database format.
The following are examples of unstructured data. Determine which are examples that are human generated and which are machine generated.
- Satellite Images
- Seismic Imagery
- Social media data
- Security
- Surveillance/Traffic Video
- Organisational Internal Documents
- Radar/Sonar data
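- Satellite Images = Machine
- Seismic Imagery = Machine
- Social media data = Human
- Security = Machine
- Surveillance/Traffic Video = Machine
- Organisational Internal Documents = Human
- Radar/Sonar data = Machine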
What is predictive analysis?
Using data to provide future insights
What type of data analysis can be used by banks to combat financial crime/fraud?
Predictive analysis - it models the behaviour that would be expected of the customer and flags any unusual behaviour as potential fraud.
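One very simple way to picture "expected behaviour versus unusual behaviour" is a z-score check against a customer's past spending. This is only a sketch under stated assumptions: the function name `is_unusual`, the threshold of 3 standard deviations and the sample amounts are all hypothetical, and real fraud models are considerably more sophisticated.

```python
from statistics import mean, stdev


def is_unusual(history, amount, threshold=3.0):
    """Flag an amount lying more than `threshold` standard deviations from the past mean."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # No variation in the history: anything different from it is unusual.
        return amount != mu
    return abs(amount - mu) / sigma > threshold


# Hypothetical past card payments for one customer.
past_spend = [42.0, 38.5, 51.0, 45.2, 40.3, 47.8]

print(is_unusual(past_spend, 44.0))   # False
print(is_unusual(past_spend, 950.0))  # True
```

A flagged payment is a candidate for review, not proof of fraud.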
The following are types of predictive analysis models:
- Customer Lifetime Value (CLV) Model
- Customer Segmentation Model
- Predictive Maintenance Model
- Quality Assurance Model
Explain briefly what each of these models can predict.
- CLV Model = which customers are likely to invest in more products & services
- Customer Segmentation Model = how to best group customers based on similar characteristics/behaviours
- Predictive Maintenance Model = the chances of essential equipment breaking down
- Quality Assurance Model = defects in products & services
What is a Decision Tree? What is it used for? How does it achieve this?
A modelling technique that uses a schematic, tree-shaped diagram.
Used to determine which course of action to take by showing the statistical probability of each outcome. Shows how one choice may lead to the next.
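A minimal sketch of the idea, assuming a single decision with made-up probabilities and payoffs: each choice branches into chance outcomes, and the choice with the highest expected value is picked. The names `tree`, `expected_value` and `best_choice` are illustrative only.

```python
# Hypothetical decision: each choice leads to chance outcomes (probability, payoff).
tree = {
    "approve loan": [(0.9, 500.0), (0.1, -3000.0)],  # repaid vs. default
    "decline loan": [(1.0, 0.0)],                    # nothing gained or lost
}


def expected_value(outcomes):
    """Probability-weighted average payoff of one branch."""
    return sum(probability * payoff for probability, payoff in outcomes)


def best_choice(decision_tree):
    """Return the choice whose branch has the highest expected value."""
    return max(decision_tree, key=lambda choice: expected_value(decision_tree[choice]))


print(best_choice(tree))  # approve loan
```

A fuller tree would nest further choices under each outcome, which is how one choice leads to the next.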
What are Regression Techniques used for? How do they achieve this?
Modelling technique used to forecast asset values.
Helps users to understand the relationship between variables, e.g. commodity prices and stock prices.
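The relationship between two variables can be sketched with an ordinary least-squares line fit. The data below is invented so that the stock price is exactly 2 times the commodity price plus 5; the function name `fit_line` and all figures are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: commodity price (x) and stock price (y).
commodity = [10.0, 20.0, 30.0, 40.0]
stock = [25.0, 45.0, 65.0, 85.0]

slope, intercept = fit_line(commodity, stock)
forecast = slope * 50.0 + intercept  # forecast stock price at a commodity price of 50

print(slope, intercept, forecast)  # 2.0 5.0 105.0
```

The slope describes how strongly one variable moves with the other, and the fitted line gives the forecast.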
What are Neural Networks? What are they used for? How do they achieve this?
Modelling technique which uses cutting-edge algorithms to identify relationships within a dataset.
Does this by mimicking how the human brain works.
How is data gathered initially so that it can later be used in predictive analysis modelling techniques? (3) Briefly describe each method.
- Data mining = looking for correlations/patterns within large datasets. Relies heavily on statistics.
- Data analysis = finding expectations, checking hypotheses, querying existing data
- Machine learning = programmes look for trends & reconfigure themselves as they go along.
True or False - data is information which is made up of words and figures?
False. Data can be words and figures, but not only words and figures. It can also be:
- Swipes on a screen
- Images
- Sounds
- Mouse movements
etc.
Why did incumbent banks initially struggle to make use of predictive data?
Because they had to be compliant with regulations surrounding data privacy.
What is an Algorithm?
A SEQUENCE OF INSTRUCTIONS for
- analysing data
- solving problems
- performing tasks
What is Analytics?
The use of
- data
- statistical modelling
- algorithms
to create INSIGHTS, PREDICT OUTCOMES & OPTIMISE DECISIONS
What are Cognitive Technologies?
The underlying technologies that enable AI
What is Artificial Intelligence (AI)?
Computer systems which are able to perform tasks that normally require human intelligence.
What is Machine Learning?
Computer programmes which improve their own performance through exposure to data.
What are Neural Networks?
Computer models which are used for machine learning.
They are designed to mimic human brain structure - layers of virtual neurons recognise patterns in data which has been input into the system.
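To make "layers of virtual neurons" concrete, here is a tiny two-layer network whose weights are hand-picked rather than learned (an assumption made to keep the sketch deterministic). It recognises one pattern: exactly one of two inputs is switched on. All names (`neuron`, `layer`, `predict`, `HIDDEN`, `OUTPUT`) are illustrative.

```python
def step(z):
    """Activation: the neuron fires (1) only if its weighted input is positive."""
    return 1 if z > 0 else 0


def neuron(inputs, weights, bias):
    """One virtual neuron: weighted sum of inputs plus bias, passed through step."""
    return step(sum(i * w for i, w in zip(inputs, weights)) + bias)


def layer(inputs, neurons):
    """A layer is a list of (weights, bias) pairs applied to the same inputs."""
    return [neuron(inputs, weights, bias) for weights, bias in neurons]


# Hand-picked weights (not learned): hidden neurons detect "at least one on"
# and "both on"; the output fires when the first is true and the second is not.
HIDDEN = [([1.0, 1.0], -0.5), ([1.0, 1.0], -1.5)]
OUTPUT = [([1.0, -1.0], -0.5)]


def predict(x1, x2):
    hidden = layer([x1, x2], HIDDEN)
    return layer(hidden, OUTPUT)[0]


for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pair, predict(*pair))  # outputs 0, 1, 1, 0 in that order
```

In machine learning the weights would be adjusted automatically from data instead of being set by hand.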