17. Data in Banking Flashcards

1
Q

Are the following examples of data sources accessed by banks internally or externally?

  • credit references
  • credit scoring
  • credit rating agencies
A

All external sources of data

2
Q

True or false - banks collect data from sources which are in the public domain?

A

True

3
Q

What is meant by ‘Big Data’? What are the defining properties/characteristics of big data? (3/4)

A

Large, complex datasets.

Defined initially by the ‘3 V’s’:
1. Volume
2. Velocity
3. Variety

IBM added the 4th V:
4. Veracity

4
Q

What is the aim of Big Data analytics?

A

To find patterns in the datasets which create business value.

5
Q

In relation to data, what do we need to consider when we look at its VOLUME?

A

The SCALE of the data

6
Q

In relation to data, what do we need to consider when we look at its VELOCITY?

A

The FREQUENCY at which data is GENERATED, CAPTURED & SHARED.

7
Q

In relation to data, what do we need to consider when we look at its VARIETY?

A

DIFFERENT FORMS originating from DIFFERENT SOURCES (e.g. video, audio, text)

8
Q

In relation to data, what do we need to consider when we look at its VERACITY?

A

The UNCERTAINTY of the data.

9
Q

Big Data is measured in the following measurements:

  • Exabytes
  • Yottabytes
  • Petabytes
  • Zettabytes
  • Terabytes

Put them in order from smallest to largest

How many of each is needed to make the one that is the next measurement up?

A

Smallest to largest:

  1. Terabytes
  2. Petabytes = 1024 Terabytes
  3. Exabytes = 1024 Petabytes
  4. Zettabytes = 1024 Exabytes
  5. Yottabytes = 1024 Zettabytes

Remember:

T-PEZY (like Tepees; note that Z comes before Y here, the reverse of alphabetical order)
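
The unit ladder can be sketched in a few lines of Python. This is only an illustration: the `UNITS` list and the `convert` helper are made-up names, and the 1,024 multiplier follows the answer above (the decimal SI convention would use 1,000 instead).

```python
# Units in order from smallest to largest; each is 1,024 of the one before
# (binary convention, as used in the card above).
UNITS = ["terabyte", "petabyte", "exabyte", "zettabyte", "yottabyte"]
STEP = 1024


def convert(value, from_unit, to_unit):
    """Convert a quantity between two of the units listed in UNITS."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * STEP ** steps


print(convert(1, "petabyte", "terabyte"))     # 1024
print(convert(2048, "terabyte", "petabyte"))  # 2.0
```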

10
Q

What is Structured Data? What are its main characteristics? (3)

A

Data which has been highly organised within a relational database so that it’s easily accessible.

Info should be:
1. Easily searchable
2. Organised
3. Displayed by search engines

11
Q

What is unstructured data?

A

Data which has not been organised in a database format.

12
Q

The following are examples of unstructured data. Determine which are examples that are human generated and which are machine generated.

  1. Satellite Images
  2. Seismic Imagery
  3. Social media data
  4. Security
  5. Surveillance/Traffic Video
  6. Organisational Internal Documents
  7. Radar/Sonar data
A
  1. Machine
  2. Machine
  3. Human
  4. Machine
  5. Machine
  6. Human
  7. Machine
13
Q

What is predictive analysis?

A

Using data to provide future insights

14
Q

What type of data analysis can be used by banks to combat financial crime/fraud?

A

Predictive analysis - can predict behaviours that would be expected of the customer & flag up any unusual behaviours as fraud.
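
One very simple way to picture this is to compare a new transaction with the customer's own spending history. The sketch below is an assumption for illustration only, not how any particular bank does it: the `flag_unusual` helper, the threshold of 3 standard deviations and the sample amounts are all made up.

```python
from statistics import mean, stdev


def flag_unusual(history, new_amount, threshold=3.0):
    """Return True if new_amount is far from the customer's usual spend.

    "Far" means more than `threshold` standard deviations from the
    historical mean (a basic z-score test).
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Every past amount was identical, so any different amount is unusual.
        return new_amount != mu
    return abs(new_amount - mu) / sigma > threshold


past_spend = [42.0, 38.5, 51.0, 45.2, 40.3, 47.8]  # made-up card payments
print(flag_unusual(past_spend, 44.0))   # False: in line with expected behaviour
print(flag_unusual(past_spend, 950.0))  # True: flagged for review
```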

15
Q

The following are types of predictive analysis models:

  1. Customer Lifetime Value (CLV) Model
  2. Customer Segmentation Model
  3. Predictive Maintenance Model
  4. Quality Assurance Model

Explain briefly what each of these models can predict.

A
  1. Which customers are likely to invest in more products & services
  2. How to best group customers based on similar characteristics/behaviours
  3. The chances of essential equipment breaking down
  4. Defects in products & services
16
Q

What is a Decision Tree? What is it used for? How does it achieve this?

A

A schematic, tree-shaped diagram used as a modelling technique.

Used to determine which course of action to take by showing the statistical probability of each outcome. Shows how one choice may lead to the next.
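
As a rough illustration, one branch point of a decision tree can be evaluated by weighting each outcome by its probability. The choices, probabilities and payoffs below are invented for the example, and `expected_value` is a hypothetical helper name.

```python
def expected_value(outcomes):
    """Sum of probability * payoff over the outcomes of one choice."""
    return sum(probability * payoff for probability, payoff in outcomes)


# Each choice leads to one or more (probability, payoff) outcomes.
choices = {
    "launch product": [(0.5, 200_000), (0.5, -100_000)],
    "do nothing": [(1.0, 0)],
}

for name, outcomes in choices.items():
    print(name, expected_value(outcomes))

best = max(choices, key=lambda name: expected_value(choices[name]))
print("Best choice:", best)  # launch product
```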

17
Q

What are Regression Techniques used for? How do they achieve this?

A

Modelling technique used to forecast asset values.

Helps users to understand the relationship between variables e.g. commodities and stock prices
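
A minimal sketch of the idea, assuming a straight-line relationship between one commodity price and one stock price. The figures are made up and `fit_line` is an illustrative helper that applies ordinary least squares.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


commodity_price = [50.0, 55.0, 60.0, 65.0, 70.0]  # made-up data
stock_price = [101.0, 111.0, 121.0, 131.0, 141.0]  # made-up data

slope, intercept = fit_line(commodity_price, stock_price)
forecast = slope * 75.0 + intercept  # forecast the stock at a commodity price of 75
print(slope, intercept, forecast)
```

The slope describes how the two variables move together; the forecast simply extends that relationship to a new value.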

18
Q

What are Neural Networks? What are they used for? How do they achieve this?

A

Modelling technique which uses cutting edge algorithms to identify relationships within a dataset.

Does this by mimicking how the human mind works.

19
Q

How is data gathered initially so that it can later be used in predictive analysis modelling techniques? (3) Briefly describe each method.

A
  1. Data mining = looking for correlations/patterns within large datasets. Relies heavily on statistics.
  2. Data analysis = finding expectations, testing hypotheses, querying existing data
  3. Machine learning = looks for trends & the programmes reconfigure themselves as they go along.
20
Q

True or False - data is information which is made up of words and figures

A

False. Can be words and figures, but not just words and figures. Can also be:

  • Swipes on a screen
  • Images
  • Sounds
  • Mouse movements
    etc
21
Q

Why did incumbent banks initially struggle to make use of predictive data?

A

Because they had to be compliant with regulations surrounding data privacy.

22
Q

What is an Algorithm?

A

A SEQUENCE OF INSTRUCTIONS for
- analysing data
- solving problems
- performing tasks

23
Q

What is Analytics?

A

The use of
- data
- statistical modelling
- algorithms

to create INSIGHTS, PREDICT OUTCOMES & OPTIMISE DECISIONS

24
Q

What are Cognitive Technologies?

A

The underlying technologies that enable AI

25
Q

What is Artificial Intelligence (AI)?

A

Computer systems which are able to perform tasks that normally require human intelligence.

26
Q

What is Machine Learning?

A

Computer programmes which improve their own performance through exposure to data.

27
Q

What are Neural Networks?

A

Computer models which are used for machine learning.

They are designed to mimic human brain structure - layers of virtual neurons recognise patterns in data which has been input into the system.

28
Q

What is Deep Learning?

A

An advanced technique which is used to implement machine learning.

Uses multiple layers of Neural Networks stacked on top of each other. Simulates the more complex decision-making that can be carried out by the human brain.

29
Q

What is a Binary Digit? What is another name for it?

A

Aka a BIT

The smallest unit of information that can be found within a computer, used for storing information.

Its values are binary, so they are things like 1/0, On/Off, True/False
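
To see bits in practice, the short Python sketch below writes a number out as eight bits and reads each bit as True/False. The value 13 is an arbitrary example.

```python
value = 13

# Write the number as 8 binary digits (bits).
bits = format(value, "08b")
print(bits)  # 00001101

# Each bit can be read as On/Off or True/False.
flags = [bit == "1" for bit in bits]
print(flags)

# Converting the bits back gives the original number.
print(int(bits, 2))  # 13
```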

30
Q

What is Hyperpersonalisation?

A

Using data to personalise products and services for each of the life stages, plus individualised personalisation based on an individual’s behaviours/actions

31
Q

How did the legacy systems used by incumbent banks prove challenging with regards to the use of data?

A

Legacy systems were ‘silent data streams’, i.e. data silos: they did not communicate with each other. Each was built into an individual business structure and not designed for sharing data.

32
Q

What is a Third-Party Provider (TPP)? Which piece of legislation concerns TPPs and what does it permit them to do?

A

TPPs are a new type of Payment Service Provider, introduced as a result of the EU’s second Payment Services Directive (PSD2).

Under PSD2’s open banking rules, TPPs that have the customer’s consent are permitted to access that customer’s account data, either to initiate a transaction or to provide information services.

33
Q

What do third-party providers (TPPs) use to access customer data?

A

Open APIs (Application Programming Interfaces)

e.g. open software shared between banks

34
Q

What are the two types of third-party providers? Describe the difference between the two & provide an example of each.

A

Account Information Service Providers (AISPs)
= only allowed to gather read-only account data. E.g. Money management apps

Payment Initiation Service Providers (PISPs)
= can both access data and initiate payments from the accounts being accessed. E.g. Klarna

35
Q

True or False - all data shared via Open Banking is subject to EU GDPR rules?

A

True

36
Q

What is Open Finance? How does it differ from Open Banking?

A

We currently have open banking. This means the sharing of payment, loans, savings data

The next step is open finance, where, as well as banking information being shared, data would also be shared by other financial establishments, e.g. insurance, pensions, tax etc.

37
Q

According to the principle of open finance, who owns customers’ data?

A

The data is owned & controlled by the customers themselves.

The data can be shared ethically & safely only once the customers give their consent.

38
Q

True or false - some people are more vulnerable to data scams than others?

A

True - everyone has different levels of financial capability and resilience, so some people are more vulnerable to data scams.

39
Q

How can banks use data to help vulnerable customers? What types of data can they collect to help with this? (3)

A

Banks can use data to monitor and learn about their vulnerable customers. This helps them to ensure they are offering these customers suitable products & services, and protect them from financial abuse.

Banks can monitor the following as an indicator of low financial resilience:
1. Signs of stress
2. Low average balances
3. Heavy overdraft usage
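
Two of these indicators are easy to express as simple rules. The sketch below is purely illustrative: `resilience_flags`, the 100.0 balance threshold and the "overdrawn on at least half the days" rule are assumptions, not figures from any bank or regulator.

```python
from statistics import mean


def resilience_flags(daily_balances, low_balance=100.0, overdraft_share=0.5):
    """Return a list of low-resilience indicators found in the balances."""
    flags = []
    if mean(daily_balances) < low_balance:
        flags.append("low average balance")
    days_overdrawn = sum(1 for balance in daily_balances if balance < 0)
    if days_overdrawn / len(daily_balances) >= overdraft_share:
        flags.append("heavy overdraft usage")
    return flags


print(resilience_flags([-50.0, -20.0, 30.0, -10.0]))
# ['low average balance', 'heavy overdraft usage']
print(resilience_flags([500.0, 450.0, 600.0]))
# []
```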

40
Q

What is the difference between AI and Machine Learning?

A

AI = machines processing huge amounts of data in order to analyse, identify patterns, make predictions through a range of methods.

Machine Learning = sub-category of AI. Basically, the computer is able to learn from the data without instruction.

41
Q

How does machine learning work?

A

Algorithms are used to identify patterns in data that has been processed.

The machine learning algorithms then learn from this data in order to make predictions and recommendations.

E.g. Netflix recommending programs based on a library of data on user viewing behaviour. They can predict what you want to watch.

42
Q

Describe the following types of Machine Learning:

  1. Supervised Learning
  2. Unsupervised Learning
  3. Reinforcement Learning
A
  1. Data is LABELLED/TAGGED to tell the machines what patterns to look for
  2. Machines look for patterns through CLUSTERING DATA
  3. Machine algorithms learn through TRIAL AND ERROR
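
The first two types can be contrasted with a toy example (reinforcement learning is left out for brevity). Everything here is made up for illustration: the spend figures, the labels and the helper names `train_centroids`, `predict` and `two_clusters`.

```python
from statistics import mean

# Supervised: each example comes with a LABEL telling the machine the answer.
labelled = [(120, "low"), (150, "low"), (900, "high"), (1100, "high")]


def train_centroids(examples):
    """Learn the average value for each label."""
    groups = {}
    for value, label in examples:
        groups.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in groups.items()}


def predict(centroids, value):
    """Assign the label whose learned average is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


centroids = train_centroids(labelled)
print(predict(centroids, 200))  # low


# Unsupervised: no labels, so the machine groups the data itself (CLUSTERING).
def two_clusters(values):
    """Split values into two groups at the largest gap between neighbours."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


print(two_clusters([150, 1100, 120, 900]))  # ([120, 150], [900, 1100])
```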
43
Q

What is the main risk of using algorithms?

A

Algorithms run in the background as ‘black boxes’, i.e. you cannot see the workings behind their predictions; these are hidden. Therefore, they are susceptible to fraud.

44
Q

What are some examples of how machine learning can be used in retail banking? (4)

A
  1. Recommending products and services
  2. Voice assistance
    - matching words to sounds that come from a customer
  3. Facial recognition identification
  4. Text reading/web page describing services for the visually impaired
45
Q

What is a Distributed Ledger?

A

A book or computer file used to STORE OR RECORD the ECONOMIC VALUE of transactions in a monetary unit.

46
Q

What is Distributed Ledger Technology (DLT)?
How is it maintained?

A

A decentralised record-keeping database.

It is maintained COLLABORATIVELY by a number of parties who regularly agree on a Peer-to-Peer basis how to perform updates to the information held using a MUTUAL CONSENSUS VERIFICATION MECHANISM.

There is a master copy of the data which cannot be altered or deleted, so everyone shares the same single version of the truth.

47
Q

How does new data get added to Distributed Ledger Technology (DLTs)?

A

Each party sends their new data to the shared single ledger.

The data is verified using ZERO KNOWLEDGE PROOFS. This is a form of CRYPTOGRAPHY that allows the system to determine whether the data being submitted is true or not without accessing the dataset as a whole, i.e. the system gets just enough information to confirm the claim. The data stays encrypted, making it more secure.

Once the data is verified, it gets added to the ledger in BLOCKS as part of a BLOCKCHAIN. It now cannot be changed or deleted.

48
Q

What is a Smart Contract? What is the main benefit of using a smart contract? How can it be used with Distributed Ledger Technology (DLT)?

A

Coded instructions which can self-execute on the occurrence of an event.
(E.g. can be written into a smart contract for an insurance company to automatically release funds upon submission of a valid claim)

Benefit = Removes the need for human intervention. Trustworthy & reduces operational risk

DLT - A smart contract can be stored within the blockchain. The contract will contain obligations of all parties which are agreed in advance, removing the need for lawyers.
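
The insurance example can be sketched as code that pays out by itself when a valid claim event arrives. This is plain Python for illustration only; a real smart contract would run on a blockchain platform, and `InsuranceContract`, `insured_limit` and `on_claim` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class InsuranceContract:
    """Terms agreed in advance by all parties."""

    insured_limit: float
    paid_out: float = 0.0

    def on_claim(self, amount, is_valid):
        """Self-execute: release funds as soon as a valid claim arrives."""
        if not is_valid:
            return 0.0
        # Never pay more than what is left of the agreed limit.
        payout = min(amount, self.insured_limit - self.paid_out)
        self.paid_out += payout
        return payout


contract = InsuranceContract(insured_limit=1000.0)
print(contract.on_claim(400.0, is_valid=True))   # 400.0
print(contract.on_claim(900.0, is_valid=True))   # 600.0 (limit reached)
print(contract.on_claim(50.0, is_valid=False))   # 0.0
```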

49
Q

Name 4 things that smart contracts can be used to support.

A
  1. Automatic re-ordering of stock
  2. Automatic upload of purchase orders
  3. Automatic document preparation
  4. Potential application of AI for compliance checks
50
Q

What is EOS? What system does it use to validate its data? What is the name of its cryptocurrency?

A

A Blockchain-based decentralised system which uses a delegated proof-of-stake system to validate its data.

Cryptocurrency = EOS Token

51
Q

What is Proof-of-Stake? How does it work?

A

A consensus mechanism used by some public blockchains (e.g. Ethereum) to validate their data. Bitcoin, by contrast, uses proof-of-work.

It was introduced to reduce the amount of computational work that validation requires.

Blocks are verified using the machines of coin owners, so there is less need for centralised computational work. Each block needs to be validated by multiple validators before being added to the blockchain.

“Validators” are selected randomly, but there is a minimum stake (number of coins) that someone needs to hold to become eligible. The validator can collect a fee for their work.

52
Q

What is Delegated Proof-of-Stake? How does it work?

What are its benefits? (2)

A

Different to normal Proof-of-Stake as instead of the “validators” being chosen randomly, they are chosen democratically.

Network users vote for which “delegates” get to validate the next block.

The delegates work together more effectively/efficiently because they want to keep being voted in. If they do not validate within 24 hours, they get removed from the voting list.

Benefits = greater scalability & speed compared with proof-of-stake.
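
The delegate election described above can be pictured as a simple tally. This sketch assumes each vote is weighted by the number of coins the voter holds; that weighting, the node names and the `elect_delegates` helper are illustrative assumptions rather than the rules of any specific blockchain.

```python
from collections import Counter


def elect_delegates(votes, seats):
    """Return the `seats` delegates with the most stake-weighted votes.

    `votes` is a list of (voter_stake, delegate_name) pairs.
    """
    tally = Counter()
    for stake, delegate in votes:
        tally[delegate] += stake
    return [name for name, _ in tally.most_common(seats)]


votes = [
    (50, "node_a"),
    (30, "node_b"),
    (10, "node_c"),
    (15, "node_b"),
    (20, "node_a"),
]
print(elect_delegates(votes, 2))  # ['node_a', 'node_b']
```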

53
Q

What is a party node?

A

A computer on the blockchain network.

54
Q

How do public blockchains work?

A

Anyone can join by setting up their own party node. The node then creates a transaction, which leads to the creation of a ‘block’ which represents this transaction.

The block gets broadcast to all party nodes on the blockchain network. The nodes validate the block & the transaction gets verified/executed.

55
Q

What is a Cryptographic Hash?

A

In public blockchains, it is a unique fixed-length code, calculated from a block’s contents, which identifies that block. Each block securely stores its own hash and the hash of the previous block, which is what links the chain together.
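
How hashes link blocks together can be sketched with Python's standard `hashlib` module. This is a simplified illustration, not a real blockchain: the block layout and the helpers `block_hash`, `add_block` and `is_valid` are assumptions made for the example.

```python
import hashlib
import json


def block_hash(block):
    """Compute a SHA-256 hash of a block's contents."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain, data):
    """Append a block that records the hash of the previous block."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"data": data, "previous_hash": previous})


def is_valid(chain):
    """Check that every block still points at its predecessor's hash."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
add_block(chain, "A pays B 10")
add_block(chain, "B pays C 4")
print(is_valid(chain))  # True

# Tampering with an earlier block changes its hash and breaks the link.
chain[0]["data"] = "A pays B 1000"
print(is_valid(chain))  # False
```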

56
Q

How do private/closed blockchains work?

A

Not open to the public. Party nodes are verified in advance.

Verified party nodes then create a private transaction and a block is created to represent that transaction.

An authorised party called a NOTARY validates the transaction & the block gets verified/executed.

57
Q

Are blockchains currently used within the financial services industry?

A

Not really, most have centralised ledgers.

But banks are beginning to explore DLT as something which could be of benefit in the future.

58
Q

What are the benefits of cloud computing? (3)

A
  1. Speed to market
  2. Reduced barriers to entry
    - bank doesn’t have to buy its own technology application, just needs to pay to use the service
  3. Access to the latest technology
59
Q

In relation to cloud computing, what is a public cloud deployment model?

What are the benefits of using this type of cloud computing? (2)

A

The cloud infrastructure is based on the service provider’s premises but it is subject to open use and can be operated by a user.

Benefits:
1. Easy to implement
2. Low cost.

60
Q

In relation to cloud computing, what is a private cloud deployment model?

What are the benefits of using this type of cloud computing? (2)

A

The cloud infrastructure is owned by an individual organisation. It can be hosted either internally or externally.

Benefits:
1. High levels of security
2. Scalability

61
Q

In relation to cloud computing, what is a Hybrid cloud deployment model?

What is one key reason why some businesses might prefer hybrid?

A

Applications are run using a combination of public & private clouds. You get the best of both worlds - performance, security & scalability.

Hybrid supports a BYOD - Bring Your Own Device policy to access business critical applications.

62
Q

What is ‘Hosting’?

A

A type of MANAGED SERVICE, i.e. the HOUSING OR MAINTENANCE of one or more websites.

63
Q

One model for providing a managed service on the cloud is ‘Platform as a Service’. Describe what this involves.

A

For web/app development.

The service provides firms with the ability to develop apps and websites, all within the cloud. It provides support for the entire web development cycle, from building & testing to final deployment.

64
Q

One model for providing a managed service on the cloud is ‘Infrastructure as a Service’. Describe what this involves.

A

For business data infrastructure systems.

The service provides basic data centre infrastructure (like storage or virtualisation) but the firm buying the service is responsible for the operating systems and applications etc.

Least comprehensive, bare bones. Basically provides a cloud alternative to hardware.

65
Q

One model for providing a managed service on the cloud is ‘Software as a Service’. Describe what this involves.

A

For providing complete and entire software systems.

The service provides software to businesses and supports them with the management of applications/data/middleware/storage/networking

Most comprehensive. Commonly used by businesses as it makes it easier to streamline maintenance & support

66
Q

Hosted services can either be Single or Multi Tenant. What does each of these mean?

A
  1. Single Tenant = data / software is exclusive, it is not shared between clients/customers
  2. Multi Tenant = all users share a single instance of software or infrastructure which serves the needs of multiple customers. Each user is supported by a single database.
67
Q

What are the main challenges of cloud computing? (2)

A
  1. Integration with existing core infrastructure
  2. Compliance, regulation & security
    - rules vary between jurisdictions re: data storage & usage