Chapter 3 - Data Flashcards

1
Q

What time frame of data do actuaries require for considering the future and why?

A

Actuaries usually require data about the present, to give an accurate starting point for projecting into the future, and past data, to use as a guide for constructing models and setting assumptions. Ex: exposed to risk or numbers of deaths/claims.
Knowledge of structural drivers is also essential, ex: a recent pandemic or social trends.

2
Q

Give examples of the different data that might be needed by a bank seeking to lend in the form of mortgages (data that is fact, uncertain, or judgement).

A

Facts obtained: amount of the requested mortgage, address of the home on which the loan is secured, house purchase price.
Reasonably accurate data: past loan experience, for example default rates for different types of borrowers or house price movements.
Assumptions/judgement: the bank’s assessment of the current economy.

3
Q

Define explicit assumption

A

Explicit assumptions are those that have been expressed and shared.

4
Q

Define implicit assumptions

A

Implicit assumptions are those that haven’t been articulated. We make implicit assumptions based on our personal experience and position, often without even realising that that’s what we’re doing.

5
Q

Describe the two main uses of data by actuaries

A

To carry out actuarial tasks directly, ex: setting up a pension scheme.
To develop models that predict what might happen in the future.

6
Q

Compare the scientist’s view of the world with the actuary’s in terms of modelling and predictions.

A

Scientists can often predict future outcomes because the same thing happens every time under the same conditions. For actuaries the situation is more difficult, for two reasons:
First, the future is random. Model output is either a probability distribution or characteristics of a probability distribution.
The second difficulty is that the probability distribution or its characteristics are rarely known.
So, while physicists, chemists, economists and actuaries all have theories about how the world works, only the first two work in a reasonably stable, consistent and predictable environment.
Actuaries also have to recognise that observations from the past may or may not be representative of the conditions that will apply in the future.

7
Q

Explain why insurance companies often collaborate in collecting data, even though collaborating with competitors is not intuitive.

A

Insurance companies have long recognised the need to acquire large amounts of data and often do so by collaborating in its collection.
This may be viewed as anti-competitive, but it may also be viewed as increasing competition in the market, because new market entrants have much to learn from the other players.

8
Q

When specifying data requirements, what must actuaries consider before data collection?

A

What data they need. Actuaries may have the opportunity to be part of the process when a new product or system is introduced, which lets them request data fields that may be useful for future analyses, so think about future data needs.
What data is available to them.
The nature of the solution, which should be understood in advance of data collection.
The definition of each field in the data being collected.

9
Q

Give examples of data that might be required and also desired from employees for a DB pension scheme calculation of PV of promised benefits

A

Date of birth, date hired, current status, salary history, benefit amount and annuity choice for actives.
Forecasts may also benefit from knowing: gender and job classification and time in that job. These may help because mortality and retirement rates may differ by both these factors. Future salary values may differ by job classification and time served in that position.

10
Q

Why do employers collect data that they cannot discriminate based on?

A

It is collected to demonstrate compliance with anti-discrimination rules, to check that the data is representative of the population, and for reserving purposes. Insurers setting technical provisions can collect gender data: it is not permitted to affect pricing, but reserving needs to take account of gender.

11
Q

Describe the balance equation between grouping data and credibility

A

Ideally, the data to be analysed in a mortality investigation should be split into homogeneous groups. There is a balance to be struck between splitting data into homogeneous groups and having sufficient data in each group.
Where data is scarce, such as for numbers of deaths at young ages, splitting data into homogeneous groups may result in groups that are too small to enable any credible analysis.
There is also a need to carry out sensitivity testing, to check that the same results are obtained if the data are grouped in a different way.
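
The trade-off can be illustrated with a small sketch. The death records, the grouping fields and the minimum count of 3 are all made up for illustration; a real investigation would set its own credibility standard.

```python
from collections import Counter

# Hypothetical death records: (age_band, sex, smoker_status).
deaths = [
    ("20-29", "F", "N"), ("20-29", "M", "N"), ("20-29", "M", "S"),
    ("60-69", "F", "N"), ("60-69", "F", "N"), ("60-69", "F", "S"),
    ("60-69", "M", "N"), ("60-69", "M", "N"), ("60-69", "M", "S"),
    ("60-69", "M", "S"),
]

MIN_DEATHS = 3  # illustrative credibility threshold, not a standard value


def thin_groups(records, key_fields):
    """Return the groups whose death count falls below MIN_DEATHS."""
    counts = Counter(tuple(r[i] for i in key_fields) for r in records)
    return {group: n for group, n in counts.items() if n < MIN_DEATHS}


# Splitting by age band alone leaves every group at or above the threshold.
by_age = thin_groups(deaths, key_fields=[0])
# Splitting further by sex is more homogeneous, but the young-age groups become too thin.
by_age_sex = thin_groups(deaths, key_fields=[0, 1])
```

Here `by_age` is empty, while `by_age_sex` flags both of the 20-29 groups, which is the scarce-data problem at young ages described above.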

12
Q
A
13
Q

When using industry wide data why wouldn’t data supplied by different organisations be comparable?

A

Heterogeneity due to:
Different geographical or socio-economic sections of the market
Different sales methods
Different practices, ex: underwriting
The nature of the data stored by different companies will not be the same
The coding used for the risk factors may vary

14
Q

What problems can arise from using industry wide data?

A

Heterogeneity - data being on different bases
Data is less detailed or less flexible
External data are often more out of date than internal data
Data quality will depend on the quality of the data systems of all of its contributors
Not all organisations contribute - not representative.

15
Q

What is a key difference between sampling and surveying?

A

Sampling = truly random selection, forced responses
Survey = biased by voluntary returns

16
Q

Define stratified sampling and give an example

A

Stratified sampling (risk-based sampling) is deliberately biased towards large claims / important segments.
Ex: A full valuation of insurance liabilities or a pension scheme may need to use whole population data in order to demonstrate sufficient accuracy whereas customer satisfaction analysis might use survey data as accuracy is less critical.
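
As a rough sketch of sampling that is deliberately weighted towards large claims, the example below takes every claim in a "large" stratum and only one in ten from the "small" stratum. The claim amounts, the 10,000 boundary and the one-in-ten rate are all assumptions chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical claim amounts: many small claims and a few large ones.
claims = [500 + 10 * i for i in range(100)] + [50_000, 75_000, 120_000, 200_000, 310_000]

LARGE = 10_000  # illustrative stratum boundary

large = [c for c in claims if c >= LARGE]
small = [c for c in claims if c < LARGE]

# Take every large claim, but only a random one in ten of the small claims.
sample = large + random.sample(small, k=len(small) // 10)
```

The sample holds 15 claims out of 105, yet covers all five large claims, where most of the monetary amount sits.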

17
Q

Define Cross sectional vs longitudinal data

A

Cross sectional data means looking at multiple individuals over a short period of time, while longitudinal data looks at a (usually smaller) number of individuals over a longer period of time.

18
Q

Define a record

A

A record is a collection of data referring to one individual or one contract. A field is a property that a record might have.

19
Q

Define a relational database

A

A relational database uses multiple table structures and cross-references records between tables. This can reduce run times for searching or sorting records.
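
A minimal sketch of the idea, using Python's built-in `sqlite3` module with an in-memory database. The table names, column names and values are hypothetical; the point is only that policies refer to policyholders through a shared key rather than repeating the holder's details on every record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policyholder (
        holder_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE policy (
        policy_id   INTEGER PRIMARY KEY,
        holder_id   INTEGER NOT NULL REFERENCES policyholder(holder_id),
        sum_assured REAL NOT NULL
    );
    INSERT INTO policyholder VALUES (1, 'A. Murphy'), (2, 'B. Kelly');
    INSERT INTO policy VALUES (10, 1, 100000), (11, 1, 50000), (12, 2, 75000);
""")

# Cross-reference the two tables on holder_id to summarise policies per holder.
rows = conn.execute("""
    SELECT h.name, COUNT(*), SUM(p.sum_assured)
    FROM policy AS p
    JOIN policyholder AS h ON h.holder_id = p.holder_id
    GROUP BY h.holder_id
    ORDER BY h.holder_id
""").fetchall()
conn.close()
```

Each row of `rows` holds a policyholder's name, their number of policies and their total sum assured.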

20
Q

What are three steps to do to ensure high quality data before the data is used?

A

Prevention - eliminate errors before they arise. Ex: data capture form tests, feasible values (gender M or F), automatic checks, typing an email address twice.
Detection - study the collected data for errors. Is it in line with my expectations?
Treatment - deal with errors that have been detected. It may be possible to repair the data, or to use imputation.

21
Q

Define imputation

A

The assignment of a value to something by inference from the value of the products or processes to which it contributes

22
Q

Define deterministic data checks and give examples

A

Deterministic checks look for specific errors that are likely to occur.
Entries restricted to a specific list of possibilities such as male / female,
Entries are restricted to certain numerical ranges, such as range of allowable ages
Entries must bear specific relationships to each other. Ex: surrender value < death benefit
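
The three kinds of deterministic check listed above could be sketched as follows. The field names, the permitted codes and the age range are assumptions for illustration only.

```python
ALLOWED_SEX = {"M", "F"}  # illustrative list of permitted codes
AGE_RANGE = (0, 120)      # illustrative allowable age range


def deterministic_errors(record):
    """Return a list of rule violations for one policy record (a dict)."""
    errors = []
    # Check 1: entry restricted to a specific list of possibilities.
    if record["sex"] not in ALLOWED_SEX:
        errors.append("sex not in allowed list")
    # Check 2: entry restricted to a numerical range.
    if not AGE_RANGE[0] <= record["age"] <= AGE_RANGE[1]:
        errors.append("age out of range")
    # Check 3: entries must bear a specific relationship to each other.
    if not record["surrender_value"] < record["death_benefit"]:
        errors.append("surrender value not below death benefit")
    return errors


good = {"sex": "F", "age": 42, "surrender_value": 5_000, "death_benefit": 100_000}
bad = {"sex": "X", "age": 140, "surrender_value": 120_000, "death_benefit": 100_000}
```

Calling `deterministic_errors(good)` returns an empty list, while `deterministic_errors(bad)` reports all three violations.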

23
Q

Define an exploratory check with examples

A

Examines various global characteristics of the data to see if anything unusual has been recorded.
Ex: calculations of max, min, means, stdevs, histograms, correlation / scatter plots etc.
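
A short sketch of an exploratory check using Python's standard `statistics` module. The claim amounts are invented, and the "more than ten times the median" rule is only an illustrative rule of thumb for deciding what to look at more closely.

```python
import statistics

claims = [1200, 950, 1100, 1300, 1050, 98000, 1150]  # hypothetical claim amounts

summary = {
    "min": min(claims),
    "max": max(claims),
    "mean": statistics.mean(claims),
    "median": statistics.median(claims),
    "stdev": statistics.stdev(claims),
}

# A mean far above the median hints at an extreme value worth investigating.
suspicious = [x for x in claims if x > 10 * summary["median"]]
```

Here the single very large claim pulls the mean well above the median and ends up in `suspicious`. Whether it is an error or a genuine large claim still has to be investigated.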

24
Q

What data checks should you perform according to Mailander 2000?

A

Know where the data comes from, why and how it was captured
Understand the incentives inherent in the data’s original use
Examine several randomly selected records - anything unexpected?
Look for mistakes ex: blanks and duplicates
Ask for the definition of the critical data items - ex: does Smoker mean "are you a smoker" or "have you ever smoked tobacco"?
Develop ways to verify the data.

25
Q

When checking data, detail some other things you should do to verify it.

A

Check and reconcile data with other sources, ex: compare internal demographics.
Check that a liability or asset exists on a given date and that an appropriate value has been recorded;
Check that a liability is held or an asset is owned on a given date;
Check that, when an event is recorded, the time of the event and the associated income or expenditure are allocated to the correct accounting period;
Check data is complete, consistent and free from unusual values
Perform random spot checks on data for individual members/policies or assets.

26
Q

Detail how you should go about data repair

A

Ideally return to the source, but this is expensive.
It may be possible to work out what missing data should have been by imputation: filling in missing fields based on comparison with other records where the data is complete. This can involve complicated multivariate analysis and assumes the data is missing at random, which is risky.
Also, if you spot mistakes, try to fix them - it is unrealistic to assume there are no mistakes.
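
As a deliberately simple sketch of imputation, the example below fills a missing salary with the mean salary of complete records in the same grade. The records and field names are hypothetical, and group-mean imputation is just one basic approach, not the multivariate analysis mentioned above.

```python
from statistics import mean

# Hypothetical member records; None marks a missing salary.
members = [
    {"id": 1, "grade": "A", "salary": 40_000},
    {"id": 2, "grade": "A", "salary": 44_000},
    {"id": 3, "grade": "A", "salary": None},
    {"id": 4, "grade": "B", "salary": 60_000},
    {"id": 5, "grade": "B", "salary": None},
]


def impute_salary(records):
    """Fill missing salaries with the mean salary of complete records in the same grade."""
    filled = []
    for r in records:
        if r["salary"] is None:
            # mean() raises StatisticsError if no complete record shares the grade.
            peers = [p["salary"] for p in records
                     if p["grade"] == r["grade"] and p["salary"] is not None]
            # "imputed" flags the value so later users know it was not observed.
            r = {**r, "salary": mean(peers), "imputed": True}
        filled.append(r)
    return filled


completed = impute_salary(members)
```

Flagging imputed values keeps the repair visible, which matters because the result is only as good as the assumption that the missing salaries resemble the observed ones.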

27
Q

What risks arise with use of summarized data

A

When valuing benefits for a scheme say, it may be appropriate to use summarised data instead of detailed membership data in some circumstances. It should be recognised that the reliability of the values will be reduced, as full validation of the data will be impossible. Summarised data may miss significant differences between the nature of benefits that have been grouped together so only suitable if such inaccuracy is recognised by the users of the results of the calculations.

28
Q

How should insurers use summarised data provided by national agencies?

A

In some countries there are organisations that collect data from their member offices and then make summaries of all the data available to their members. These summaries can be used to determine bases for pricing. They cannot be used in place of policy data to set provisions, but they are a starting point or a place to compare figures. Ex: CMIB figures

29
Q

Explain what the CMIB does.

A

The Continuous Mortality Investigation Bureau (CMIB) of the Institute and Faculty of Actuaries in the UK does a large amount of work on mortality and morbidity statistics. The CMIB accepts data from insurers and publishes mortality tables based on it.

30
Q

Why might industry-pooled mortality data from the CMIB or an equivalent differ from an insurer’s own population?

A

Underwriting standards or distribution channels can differ between insurers, and some insurers may have a particularly strong franchise with a certain occupation.

Distribution channels would have different mortality because policyholders self-select for life assurance. Some policyholders may have a policy because they sought financial advice, others because they saw ads on social media or on location.

31
Q

What does data governance mean

A

Data governance is the term used to describe the overall management of the availability, usability, integrity, and security of data employed in an organisation.

32
Q

Define a data governance policy

A

A data governance policy is a documented set of guidelines for ensuring the proper management of an organisation’s data. It will set out guidelines with regard to:
Roles and responsibilities with regard to data (these have to be signed off every year);
How an organisation will capture, analyse, and process data;
Issues with respect to data security and privacy;
Data controls that will be put in place; and
How the adequacy of the controls will be monitored on an ongoing basis for data usability, reliability, accessibility, integrity and security.

33
Q

What risks is an organisation exposed to if it does not have an adequate data governance policy?

A

Legal/regulatory non-compliance: if someone asks how their data is used and the data is not centralised and controlled, there can be issues;
Inability to rely on data for decision making;
Reputational issues; and
Incurring additional costs

34
Q

What data standards are actuaries held to?

A

All data standards in the EU, but also Technical Actuarial Standard (TAS) 100

35
Q

State the 5 standards of TAS 100

A

Data is relevant for the purpose of technical actuarial work
If data is insufficient or unreliable, it shall be improved to the extent that is proportionate
Checks and controls that have been applied, and actions taken to improve data, shall be documented
Communications shall describe the data used in technical actuarial work, its source, any material uncertainty, and the approach taken to deal with this material uncertainty.
Communications must state limitations in actuarial information resulting from use of this data and provide an indication of their impact on the information.

36
Q

Give the 15 principles of the European statistics code of practice

A
  1. Professional independence
  2. Mandate for data collection and access to data
  3. Adequacy of resources
  4. Commitment to quality
  5. Statistical confidentiality and data protection
  6. Impartiality and objectivity
  7. Sound methodology
  8. Appropriate statistical procedures
  9. Non-excessive burden on respondents
  10. Cost effectiveness
  11. Relevance
  12. Accuracy and reliability
  13. Timeliness and punctuality
  14. Coherence and comparability
  15. Accessibility and clarity
37
Q

Explain professional independence from the European statistics code of practice

A

Professional independence of statistical authorities from other policy, regulatory or administrative departments and bodies, as well as from private sector operators, ensures the credibility of European Statistics. The desire is for government data to fully reflect reality and not to have been tampered with

38
Q

Explain mandate for data collection and access to data from the European statistics code of practice

A

Statistical authorities have a clear legal mandate to collect and access information from multiple data sources for European statistical purposes.

39
Q

Explain adequacy of resources from the European statistics code of practice

A

The resources available to statistical authorities are sufficient to meet European Statistics requirements.

40
Q

Explain impartiality and objectivity from the European statistics code of practice

A

Statistical authorities develop, produce and disseminate European Statistics respecting scientific independence and in an objective, professional and transparent manner in which all users are treated equitably.

41
Q

What are the risks associated with using data

A

Errors/ Omissions
Insufficient historic data available to estimate credibly the extent of a risk or the extent of a risk in very adverse circumstances
If this is the case, may have to use other sources, whose data may not be as good a proxy for the risk being assessed.
Historic data may not be a good reflection of future experience.
Balance of homogeneity vs credibility may be wrong.

42
Q

Why might historic data not be a good reflection of future experience? Give some reasons.

A

Significant random fluctuations;
Future trends not being reflected sufficiently in past data;
Changes in the way in which past data was recorded;
Changes in the balance of any homogeneous groups underlying the data;
Heterogeneity with the group to which the assumptions are to relate;
The past data may not be sufficiently up to date, societal and tech changes

43
Q

What are the risks associated with grouping data

A

May be too small a group for credible analysis
Data set may not be sufficiently homogeneous.
May not be in a form that is appropriate for the purpose
A lack of confidence in the available data will reduce the confidence in an actuary’s conclusions.

44
Q

What is Solvency II, its purpose and requirements?

A

The European Solvency II regulations prescribe the characteristics of acceptable data for two purposes: the calculation of technical provisions, which applies to all insurers, and the calibration of an internal model for capital calculations, which applies only to insurers whose internal models have been specifically approved.
Came into force in 2016. It’s a directive (law across the EU).
Because it’s standardised, EU countries can recognise each other’s insurance.
The data requirements are: completeness, accuracy and appropriateness.

45
Q

When is data considered complete under Solvency II regulation?

A

The data include sufficient historical information to assess the characteristics of the underlying risks and to identify trends in the risks;
The data are available for each of the relevant homogeneous risk groups used in the calculation of the technical provisions, and no relevant data is excluded without justification.

46
Q

When is data considered accurate under Solvency II regulation?

A

The data are free from material errors;
Data from different time periods used for the same estimation are consistent;
The data are recorded in a timely manner and consistently over time.

47
Q

When is data considered appropriate under Solvency II regulation?

A

Data are consistent with the purposes for which they will be used;
Data ensure that the estimations made in the calculation of the technical provisions do not include material estimation error;
Data are consistent with the assumptions underlying the actuarial and statistical techniques that are applied to them;
Data appropriately reflect the risks to which the insurance undertaking is exposed;
Data were collected, processed and applied in a transparent and structured manner, which is documented (as per requirements);
Data are used consistently over time.

48
Q

What are the documentation requirements to ensure data is considered appropriate under Solvency II regulation

A

Data are documented such that:
The criteria for the quality of data and an assessment of the quality of data, including specific qualitative and quantitative standards for different data sets, are defined;
The assumptions used in the collection, processing and application of data are defined;
The process for carrying out data updates is set out.

49
Q

What are the key checks an organisation should do for appropriateness of data under Solvency II

A

Definitions are set out in data collection.
Materiality checks
Assumptions were defined based on the way the data was collected
Are ENIDs (events not in data) considered?

50
Q

To use external data over internal what requirements must be met?

A

Insurance or reinsurance undertakings must demonstrate that the use of external data is more suitable than the use of data which are exclusively available from an internal source;
The insurer must identify any trends in that data and the variation, over time, of the assumptions or methodologies in the use of that data;
The insurer must demonstrate that the assumptions and methodologies reflect the characteristics of the insurance or reinsurance undertaking’s portfolio of obligations.

51
Q

What is Solvency II’s regulation around data limitations?

A

Where data does not comply with the articles set out, insurance and reinsurance undertakings shall document appropriately the limitations of the data including a description of whether and how such limitations will be remedied.

52
Q

Describe how lack of data can lead to adverse selection and what this means

A

With a lack of data, adverse selection is a problem: if you don’t understand the risk as well as your customers do, those with more risk are more likely to buy the insurance.
If you don’t understand the risk, at least limit the amount of business that you write.

53
Q

Define a controller and processor under GDPR

A

A controller determines the purposes and means of processing personal data.
A processor is responsible for processing personal data on behalf of a controller.

54
Q

To what does GDPR apply?

A

The GDPR applies to processing carried out by organisations operating within the EU or those outside the EU that offer goods or services to individuals in the EU.
The GDPR does not apply to certain activities, including processing covered by the Law Enforcement Directive, processing for national security purposes and processing carried out by individuals purely for personal activities.
Exception example: the Garda (Irish police) can hold information about you that you do not know about.
GDPR applies to personal data only

55
Q

What is personal data

A

Information that relates to an identified or identifiable individual.
If it is possible to identify an individual directly from the information you are processing, and that information ‘relates to’ the individual, then it may be personal data.

56
Q

What is Pseudonymisation

A

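Pseudonymisation means replacing or removing direct identifiers such as names, so that the data can only be linked back to an individual using additional information held separately. It does not get you off the hook: pseudonymised data is still personal data under the GDPR.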
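
One common way to pseudonymise a record is to swap the name for a keyed hash, sketched below with Python's standard `hmac` and `hashlib` modules. The key, the field names and the 16-character truncation are illustrative assumptions, not a prescribed method.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it would be stored separately from the data.
SECRET_KEY = b"keep-this-key-apart-from-the-data"


def pseudonymise(record, id_field="name"):
    """Replace the identifying field with a keyed-hash pseudonym."""
    token = hmac.new(SECRET_KEY, record[id_field].encode("utf-8"),
                     hashlib.sha256).hexdigest()[:16]
    out = {k: v for k, v in record.items() if k != id_field}
    out["pseudonym"] = token
    return out


a = pseudonymise({"name": "Jane Doe", "age": 51, "claim": 1200})
b = pseudonymise({"name": "Jane Doe", "age": 52, "claim": 300})
```

The same person gets the same pseudonym in both records, so the records can still be linked for analysis. Anyone holding the key could recompute the pseudonym for a known name and so re-identify the records, which is why data treated this way still counts as personal data.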

57
Q

What is the ICO and their power

A

The ICO (Information Commissioner’s Office) has the power to take action against controllers and processors under the GDPR.

58
Q

What are the seven principles of GDPR

A

Lawfulness, fairness and transparency
Purpose limitation
Data minimisation
Accuracy
Storage limitation
Integrity and confidentiality (Security)
Accountability

59
Q

Explain lawfulness, fairness and transparency as one of the seven principles of GDPR

A

The need to have a lawful basis (valid grounds) for processing personal data and to be open with data subjects about how it will be used. You must determine your lawful basis before processing begins; it should be documented and cannot be changed at a later date.
If you are processing special category data or criminal conviction/offences data you need to identify both a lawful basis for general processing and an additional condition for processing this type of data.

60
Q

Explain Purpose limitation as one of the seven principles of GDPR

A

The requirement to specify at the outset the purpose of the processing and to have safeguards to prevent the use of the data for other purposes without consent.

61
Q

Explain Data minimisation as one of the seven principles of GDPR

A

To ensure the data is adequate to fulfil your stated purpose, relevant, and limited to what is necessary for the processing. Stops companies hoovering up data and holding on to it in case it might become handy in the future.

62
Q

Explain storage limitation as one of the seven principles of GDPR

A

The data should only be kept for as long as is necessary, and disposed of according to a set schedule. Data should also be periodically reviewed.

63
Q

Explain Integrity and confidentiality as one of the seven principles of GDPR

A

Security principle: this requires that data is held in conditions where ‘appropriate technical and organisational measures’ are in place to protect the data. Should be appropriate to the sensitivity of the data.

64
Q

Explain Accountability as one of the seven principles of GDPR

A

This reflects the need to evidence compliance and take responsibility for processing data in line with the law and the other 6 principles.

65
Q

What are the lawful bases for collecting data under GDPR?

A

Consent: the individual has given consent for you to process their personal data for a specific purpose.
Contract: necessary for a contract you have with the individual. Ex: letting agency for renters
Legal obligation: to comply with the law.
Vital interests: Necessary to protect someone’s life.
Public task: Necessary for you to perform a task in the public interest or for your official functions. Ex: maintaining a register of people who can vote.
Legitimate interests

66
Q

Define special category data

A

Special category data is personal data which the GDPR says is more sensitive, and so needs more protection. Includes information on an individual such as: race; ethnic origin; politics; religion; trade union membership; genetics; biometrics (where used for ID purposes); health; sex life; or sexual orientation.
Processing it poses a risk to a person’s fundamental rights and freedoms.

67
Q

What are the 10 possible conditions for processing special category data

A
  1. Consent
  2. Necessary for the purposes of carrying out the obligations and exercising specific rights of the controller or of the data subject in the field of employment and social security and social protection law - authorised by Union/State law
  3. To protect vital interests of data subject or someone else if data subject cannot give consent
  4. Legitimate activities with appropriate safeguards by a foundation, association or any other not-for-profit body on condition that the processing relates solely to the members, or former members
  5. Personal data manifestly made public
  6. Necessary for establishment, exercise or defence of legal claims
  7. Necessary for reasons of substantial public interest, by Union or Member State law
  8. Purposes of preventive or occupational medicine
  9. Public interest in area of public health ex: cross border threats protection
  10. Archiving purposes in the public interest, scientific or historical research purposes or statistical purposes
68
Q

What additional permissions does criminal offence data need as well as a lawful basis?

A

Criminal offence data needs either legal authority or official authority for processing as well as a lawful basis.

69
Q

As per GDPR what are the rights of the individual?

A

The right to be informed: the provision of clear privacy information at the point of collection.
The right of access: obtain a copy of any personal data held in a timely manner.
The right to erasure: personal data permanently destroyed.
The right to restrict processing: processing of personal data limited or stopped altogether.
The right to data portability: have a copy of the data in a transferable format. Allows the subject to move personal data easily in a safe and secure way.
The right to object: Have data processing stopped in certain circumstances for example: direct marketing.
Rights in relation to automated decision making and profiling.

70
Q

What must an organisation do if their processing falls under Automated decision making or profiling under GDPR?

A

Must give individual information about the processing, introduce simple ways for them to request human intervention, carry out regular checks to make sure that systems are working as intended.

71
Q

What measures under accountability and governance can organisations take under GDPR - sometimes they must

A

Implementing data protection policies;
Putting written contracts in place with organisations that process personal data on your behalf;
Maintaining documentation of your processing activities;
Implementing appropriate security measures;
Recording and reporting personal data breaches;
Carrying out data protection impact assessments
Appointing a data protection officer.
Adhering to relevant codes of conduct and signing up to certification schemes.
Accountability obligations are ongoing.

72
Q

What rules does GDPR outline around documentation

A

You must maintain records on several things, such as processing purposes, data sharing and retention.
You may be required to make the records available to the Information Commissioner (local regulator) on request.
Records must be kept in writing, be up to date and reflect the current processing activities.

73
Q

What does “data protection by design and default” refer to?

A

GDPR requires “data protection by design and by default”.
In essence, this means you have to integrate data protection into your processing activities and business practices.

74
Q

What is a DPIA - requirements under GDPR

A

A Data Protection Impact Assessment (DPIA) is a process to help you identify and minimise the data protection risks of a project.
You must do a DPIA for processing that is likely to result in a high risk to individuals.
DPIA must:
Describe the nature, scope, context and purposes of processing
Assess necessity, proportionality and compliance measures;
Identify and assess risks to individuals - likelihood and severity;
Identify any additional measures to mitigate those risks.

75
Q

What is the role of a DPO - this is a requirement of GDPR

A

DPOs monitor internal compliance, inform and advise on data protection obligations, provide advice regarding Data Protection Impact Assessments (DPIAs) and act as a contact point for data subjects and the supervisory authority. They represent data subjects and report on data protection risks, advise on all aspects of data protection.
The DPO must be independent, an expert in data protection, adequately resourced, and report to the highest management level.

76
Q

What does GDPR detail about data security

A

Security of data must be appropriate to your circumstances, the sensitivity of the data and the risk your processing poses.
Where appropriate, you should look to use measures such as pseudonymisation and encryption.

77
Q

What does GDPR state on personal data breaches

A

Duty on all organisations to report certain types of personal data breach to the relevant supervisory authority within 72 hours of becoming aware of the breach, where feasible.
If the breach is likely to result in a high risk of adversely affecting individuals’ rights and freedoms, you must also inform those individuals without undue delay.

78
Q

Give an example of an exemption from GDPR cover

A

Ex: Exam scripts and exam marks. This exemption exempts you from the GDPR’s provisions on: the right to be informed; the right of access; and all the principles, but only so far as they relate to the right to be informed and the right of access.