Chapter 18-Data Flashcards

1
Q
  1. Explain why the use of data is an increasingly significant issue for organisations.
A

Organisations often accumulate large amounts of information relating to individuals as part of their ongoing operations. The increasing use of technology has now made it possible to collect, store and use very large amounts of information about individuals in ever more diverse ways.
Organisations have particular responsibilities when acquiring and maintaining personal data.

2
Q
  1. Define personal data.
A

Personal data relates to information in respect of an individual where the individual can be identified, or where the data combined with other information could allow the individual to be identified.

3
Q
  1. What is the main ethical responsibility for organisations in relation to personal data?
A

Organisations have an ethical responsibility to deal responsibly with personal data. In particular, they need to balance the privacy of individuals with the need of the organisation to make fair and reasonable use of the personal data in their operations.

4
Q
  1. Explain the purpose of data protection legislation, giving examples of how it compares between different countries.
A

Many countries have data protection laws to safeguard the rights of individuals with regard to how organisations can process and maintain personal data.
While the relevant regulations vary by jurisdiction, the objectives and expected behaviour are often similar. Examples of legislation that are broadly similar include the Data Protection Act in the UK, Personal Information Protection and Electronic Documents Act in Canada, and Personal Data (Privacy) Ordinance in Hong Kong.
However, not all countries have equivalent data protection legislation. For example, the USA has much less stringent personal data / privacy laws or regulations than the UK. Organisations need to take extra care where data is being transferred between countries, even if the purpose is valid.

5
Q
  1. State the eight principles of the UK’s Data Protection Act that relate to processing personal data.
A

Personal data must:
1. be processed fairly and lawfully
2. be obtained and processed for specified purposes
3. be adequate, relevant and not excessive for the purposes concerned
4. be accurate and, where necessary, kept up to date
5. not be kept longer than necessary for the purposes concerned
6. be processed in accordance with the individual’s rights under the Act
7. be processed securely
8. not be transferred to a country or territory outside the European Economic Area unless that country or territory ensures an adequate level of protection.

6
Q
  1. Give three examples of possible consequences of non-compliance with data protection legislation when processing personal data.
A

The consequences of non-compliance with the relevant data protection laws when processing personal data can be significant.
For example:
* Individuals who commit criminal offences may be prosecuted.
* Organisations can be fined for serious breaches. For example, in the UK, organisations can be fined up to £500,000.
* In addition to prosecution and/or financial penalties, breaching data protection rules could attract adverse publicity, which can cause significant reputational damage for an organisation.

7
Q
  1. Explain the relevance of anonymity to the definition of personal data.
A

The ability to identify the individual to whom the information relates is crucial to the definition of personal data. For anonymous data (ie where that individual cannot be identified) the obligations on an organisation are often considerably less. For example, in the UK anonymous data does not constitute personal data and the duties and obligations of the Data Protection Act do not apply.

8
Q
  1. Explain the relevance of competition laws to data protection, including an example.
A

In addition to data protection laws, jurisdictions may have competition laws that further limit the uses to which data can be put.
For example, anti-competitive agreements may be prohibited, eg sharing data among a small number of companies in order to fix prices in a particular market.

9
Q
  1. Outline the potential consequences of non-compliance with competition legislation.
A

There can be significant consequences of non-compliance with competition laws, including fines, awards for damages and disqualification of company directors.

10
Q
  1. List seven examples of information that can constitute sensitive personal data.
A

Sensitive personal data can include information related to:
* racial or ethnic origin
* political opinions
* religious or other similar beliefs
* membership of trade unions
* physical or mental health condition
* sexual life
* convictions, proceedings and criminal acts.

11
Q
  1. Give four examples of conditions that might have to be satisfied in order to permit sensitive personal data to be processed.
A
  • The data subject has given explicit consent.
  • It is required by law for employment purposes.
  • It is needed in order to protect the vital interests of the individual or another person. For example, if an individual with a medical condition has an accident at work, it would be in the individual’s vital interest to disclose this condition to medical staff treating the individual.
  • It is needed in connection with the administration of justice or legal proceedings.
12
Q
  1. Explain what is meant by ‘big data’, including its key characteristics.
A

The increasing use of technology has now made it possible for the public and private sector to collect and analyse very large data sets of information. This is often referred to as ‘big data’.
Big data can be characterised by:
* very large data sets
* data brought together from different sources
* data which can be analysed very quickly - such as in real time.

13
Q
  1. Describe the main data protection considerations for organisations using big data.
A

Big data can include personal data (such as data from social media or loyalty cards), but can also include non-personal data (such as climate change data). If personal data is held by a company, then the company needs to comply with the relevant data protection rules. Given the large amount of information that could be held on an individual, privacy considerations are likely to be a concern for individuals whose data is held.
Anonymisation can potentially aid big data analytics, as it means that the information being analysed is no longer considered personal data. This can assist organisations to carry on research or develop products and services. It also enables these organisations to give an assurance to the people whose data was collected that the organisation is not using data that identifies them for big data analytics.
A key feature of big data is using ‘all’ the data, which contrasts with the concept of data minimisation in the data protection principles. This raises questions about whether big data is excessive, while the variety of data sources often used in big data analytics may also prompt questions over whether the personal information being used is relevant. Organisations need to be clear from the outset what they expect to learn or be able to achieve by processing the data, as well as satisfying themselves that the data is relevant and not excessive.
Organisations that hold big data also need to be transparent when they collect data. Explaining how the data will be used is an important element in complying with data protection principles. The complexity of big data analytics will not be an acceptable excuse for failing to obtain consent where it is required.
Regulators expect organisations that hold big data to be proactive in considering any information security risks posed by big data. Data governance is becoming increasingly important for holders of big data. This must take account of data protection and privacy issues.

14
Q
  1. Define data governance.
A

Data governance is the term used to describe the overall management of the availability, usability, integrity and security of data employed in an organisation.
A data governance policy is a documented set of guidelines for ensuring the proper management of an organisation’s data.
A data governance policy will set out guidelines with regards to:
* the specific roles and responsibilities of individuals in the organisation with regards to data
* how an organisation will capture, analyse and process data
* issues with respect to data security and privacy
* the controls that will be put in place to ensure that the required data standards are applied
* how the adequacy of the controls will be monitored on an ongoing basis with respect to data usability, accessibility, integrity and security.
The data governance policy will also provide a mechanism for ensuring that the relevant legal and regulatory requirements in relation to data management are met by the organisation.

15
Q
  1. Describe the purpose and typical contents of a data governance policy.
A

A data governance policy is a documented set of guidelines for ensuring the proper management of an organisation’s data.
A data governance policy will set out guidelines with regards to:
* the specific roles and responsibilities of individuals in the organisation with regards to data
* how an organisation will capture, analyse and process data
* issues with respect to data security and privacy
* the controls that will be put in place to ensure that the required data standards are applied
* how the adequacy of the controls will be monitored on an ongoing basis with respect to data usability, accessibility, integrity and security.
The data governance policy will also provide a mechanism for ensuring that the relevant legal and regulatory requirements in relation to data management are met by the organisation.

16
Q
  1. State four risks to which organisations are exposed if they do not have adequate data governance procedures.
A

Organisations that do not have adequate data governance procedures can be exposed to risks relating to:
* legal and regulatory non-compliance
* inability to rely on data for decision making
* reputational issues, which can in turn lead to
○ existing customers moving to another provider
○ a compromised ability to attract new customers
* incurring additional costs (for example fines and legal costs).

17
Q
  1. Describe the key data issues when businesses are combined by merger or takeover.
A

Where businesses are combined by merger or takeover, one of the key issues is whether the data for the two businesses should be combined onto one system and, if so, which.
The saving in overhead costs such as system maintenance and management is frequently cited as a justification for the transaction. In practice, the costs of converting the data from one working system to another are high. New developments are carried out on one system and the other is left to decline as a legacy system, often requiring proportionately higher maintenance costs. Thus the aim of cost saving is often not achieved.
There is a risk in aggregating data sourced from different systems, and a data governance policy needs to address this risk.

18
Q
  1. Give four examples of possible risks that arise when using data, which relate to data volume or quality.
A

Examples of possible risks associated with using data are:
* The available data might contain errors or omissions, which could lead to erroneous results or conclusions.
* There may be insufficient historical data available to estimate credibly the extent of a risk, and the likelihood of the occurrence of that risk in future.
* Even where there is sufficient data to estimate credibly future experience in normal conditions, there may be insufficient data available to provide a credible estimate of a risk in very adverse circumstances, which may be necessary for some purposes (eg estimating the tails of a distribution).
* Where there is insufficient data it may be possible to use data from other sources (eg industry data, other countries, competitors), but there is a risk that data from other sources may not be a sufficiently good proxy for the risk being assessed.

19
Q
  1. List eight reasons why historical data may not be a good reflection of future experience.
A

Historical data may not be a good reflection of future experience. This could be due to:
* past abnormal events
* significant random fluctuations
* future trends not being reflected sufficiently in past data
* changes in the way in which past data was recorded
* changes in the balance of any homogeneous groups underlying the data
* heterogeneity with the group to which the assumptions are to relate
* the past data may not be sufficiently up to date
* other changes - eg medical changes, social changes, economic changes etc.

20
Q
  1. Outline the two risks that arise where an actuary attempts to group data into broadly homogeneous groups.
A

There are risks where an actuary attempts to group data into broadly homogeneous groups. The risks associated with this are:
* the individual data groups may be too small for a credible analysis
* if data groups are merged so there is sufficient data in each group to be credible, the combined data set may not be sufficiently homogeneous.
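The trade-off between credibility and homogeneity can be illustrated with a minimal Python sketch. It assumes a hypothetical minimum event count per group and merges adjacent groups until that threshold is met. The age bands, counts and threshold are invented for illustration, and merging neighbours in order is only one possible approach.

```python
from typing import List, Optional, Tuple

Group = Tuple[str, int]  # (label, number of observed events, eg deaths or claims)


def merge_small_groups(groups: List[Group], min_count: int) -> List[Group]:
    """Merge adjacent groups until each merged group has at least min_count events."""
    merged: List[Group] = []
    label: Optional[str] = None
    count = 0
    for name, n in groups:
        label = name if label is None else f"{label}+{name}"
        count += n
        if count >= min_count:
            merged.append((label, count))
            label, count = None, 0
    # A leftover tail is too small to stand alone, so fold it into the last group.
    if label is not None:
        if merged:
            last_label, last_count = merged.pop()
            merged.append((f"{last_label}+{label}", last_count + count))
        else:
            merged.append((label, count))
    return merged


# Hypothetical event counts by age band.
age_bands = [("20-29", 3), ("30-39", 8), ("40-49", 25), ("50-59", 60), ("60-69", 4)]
result = merge_small_groups(age_bands, min_count=30)
print(result)  # [('20-29+30-39+40-49', 36), ('50-59+60-69', 64)]
```

Each merged group now has enough events, but it spans a wider age range and so is less homogeneous than the original bands.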

21
Q
  1. Give three examples of other risks that are associated with the use of data.
A
  • The available data may not be in a form that is appropriate for the purpose required.
  • A lack of confidence in the available data will reduce the confidence in an actuary’s conclusions.
  • The available data may have been collected for a particular purpose, which means that it is not appropriate for a different purpose.
22
Q
  1. Explain what is meant in general by electronic or automated trading, including its advantages.
A

In the last few decades, the trading of financial assets has increasingly been carried out electronically. Advances in computer power, communication technology and programming capability have offered new tools for investment decisions, trading execution and risk management.
Electronic trading has the advantages of increased speed and efficiency of trading, and can result in lower dealing costs on trades. In addition, automated trading can potentially facilitate the execution of complex trading strategies that would not have previously been possible.

23
Q
  1. Explain what is specifically meant by algorithmic trading.
A

Algorithmic trading is a form of automated trading that involves buying or selling financial securities electronically to capitalise on price discrepancies for the same stock or asset in different markets. Often many trades are carried out very quickly to take advantage of temporary price discrepancies, with the aim of making small profits on each trade. The trader will use a formula (or algorithm) to decide whether a financial asset should be bought or sold.
The parameters underlying the algorithm used to determine when assets would be bought or sold will need to be derived using data from an appropriate source(s).
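As a rough illustration of such a formula, the Python sketch below compares the price of the same asset in two markets and signals a trade only when the discrepancy exceeds the dealing cost. The market labels, prices and the `cost_per_unit` parameter are hypothetical, and real algorithms are far more involved.

```python
from typing import Optional, Tuple


def arbitrage_signal(
    price_a: float, price_b: float, cost_per_unit: float
) -> Optional[Tuple[str, str, float]]:
    """Return (buy_market, sell_market, expected_profit_per_unit), or None if no trade."""
    gap = abs(price_a - price_b)
    if gap <= cost_per_unit:
        return None  # discrepancy too small to cover dealing costs
    if price_a < price_b:
        return ("A", "B", gap - cost_per_unit)  # buy where cheaper, sell where dearer
    return ("B", "A", gap - cost_per_unit)


# Hypothetical prices for the same stock in markets A and B.
signal = arbitrage_signal(price_a=100.00, price_b=100.50, cost_per_unit=0.25)
no_trade = arbitrage_signal(price_a=100.00, price_b=100.25, cost_per_unit=0.50)
print(signal)    # ('A', 'B', 0.25)
print(no_trade)  # None
```

Here `cost_per_unit` plays the role of a parameter that would need to be derived from appropriate data. If it were set too low, each trade would lose money rather than make a small profit.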

24
Q
  1. Describe four risks that are associated with algorithmic trading.
A
  • There could be an error in the algorithm or the data used to parameterise the model could be wrong, leading to potential losses on each trade, rather than the expected profits. This is an issue when a large number of trades could be completed very quickly.
  • The algorithm may not operate properly in adverse conditions. For example, the algorithm could stop trading an asset in turbulent markets, reducing liquidity of the asset and increasing volatility.
  • In very turbulent conditions, trading in individual stocks, or even entire markets, may be suspended before an algorithmic trade can be completed.
  • The main risk of algorithmic trading is the possible impact on the financial system. An example of this was a 5%-6% plunge and rebound in major US equity indices within the span of a few minutes due to a large number of trades done at erroneous prices in May 2010. The increasing integration between markets and asset classes means that a meltdown in one market could impact other markets and asset classes.
25
Q
  1. Explain what is meant by programmed trading.
A

Algorithmic trading looks at prices of stocks across all markets. An early development was known as programmed trading, which just considered automated rules for trading individual stocks on a single market. It gives a good example of the advantages and disadvantages of algorithmic trading.

26
Q
  1. When data is required for a number of tasks, what is a key principle in its provision?
A

The overriding principle is that the data for all the tasks should be controlled through one single, integrated data system. However, this ideal is not always achieved in practice. In a smaller organisation it is easier to ensure that the data used for different applications are consistent, because it is likely that the same small group of people will carry out the applications.

27
Q
  1. Comment on how different sources of data (ie publicly available and internal) may be suitable for different purposes.
A

For some purposes, data may only be required on a ‘big picture’ basis. Here, data will be publicly available from published company accounts and regulatory returns.
Product providers need data relating to the individual risks that they provide cover for. The quantity and quality of these data are both important. Without sufficient quantity, data groupings will either be non-homogeneous or lack credibility. However, even where there are plenty of data available, poor quality data will mean that any results produced are not reliable.

28
Q
  1. Outline two sources of problems of data quality and quantity and why the current management are not necessarily to blame for these problems.
A

Problems of data quality and quantity can be a result of poor management control of data recording or its verification processes, or due to poor design of the data systems. This may not necessarily be a reflection on the current management, as good quality data cannot necessarily be obtained quickly. After implementing a process for maintaining extensive records, it may take many years for enough data to be collected for analysis purposes.

29
Q
  1. Describe how it can be ensured that good quality data is obtained from proposal and claim forms.
A

Proposal forms
The proposal form needs to be designed to:
* collect data at an appropriate level, including data that may be needed in the future
* ask clear and unambiguous questions, so that the proposer will give the full, correct information and the underwriting department can process the application readily
* have inputs that are quantitative as far as possible

Claim forms
The claim form should be designed with the aim of producing information that can be both analysed accurately and transferred easily to the computer system. It should also link to the proposal form so that information can be cross-checked.
As well as data relating to current risks covered, it is important to retain the history of past policy and claim records.

Input of data
The system should take inputs in the same order as they appear on the proposal form, so that the person inputting the information does not need to interpret it.
Staff inputting the information should be well trained.
Financial incentives could perhaps be offered for accuracy of input.
The data system should have validation checks, eg checks on:
* blank entry fields
* sensible entry values
The insurer may send the policyholder a copy of the key information and ask them to verify that it is correct.
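
The two validation checks listed above (blank entry fields and sensible entry values) could be sketched in Python as follows. The field names and permitted ranges are hypothetical and would depend on the product and the data system in question.

```python
from typing import Dict, List

# Hypothetical field names and limits, chosen only for illustration.
REQUIRED_FIELDS = ["name", "date_of_birth", "sum_assured"]
SENSIBLE_RANGES = {"age": (0, 120), "sum_assured": (1, 10_000_000)}


def validate_record(record: Dict[str, object]) -> List[str]:
    """Return a list of validation messages; an empty list means no issues were found."""
    problems: List[str] = []
    # Check 1: blank entry fields
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"{field}: blank")
    # Check 2: sensible entry values
    for field, (low, high) in SENSIBLE_RANGES.items():
        value = record.get(field)
        if isinstance(value, (int, float)) and not (low <= value <= high):
            problems.append(f"{field}: {value} outside {low}-{high}")
    return problems


good = validate_record(
    {"name": "A Smith", "date_of_birth": "1980-05-01", "age": 45, "sum_assured": 250_000}
)
bad = validate_record(
    {"name": " ", "date_of_birth": "1980-05-01", "age": 145, "sum_assured": 250_000}
)
print(good)  # []
print(bad)   # ['name: blank', 'age: 145 outside 0-120']
```

Checks of this kind catch only obvious input errors. They do not confirm that a plausible-looking value is actually correct, which is why verification by the policyholder may still be useful.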

Other features of the system
The system should be capable of storing information, so that historical data is available for future pricing exercises.
The system should be robust and flexible.
The system should be secure, ie only certain individuals are allowed to amend the data.
At regular intervals checks of movement analyses should be carried out.

30
Q
  1. Discuss the data issues, data requirements and data sources in benefit scheme valuations.
A

There may be occasions when the actuary does not have full control over the data available. For example, when valuing benefits under an employee benefit scheme, the scheme sponsor will usually provide data on the operation of the scheme and the scheme membership. It will be particularly important to validate this type of data.
Data will be required to place a value on the benefit entitlements of individuals. Data will be required in respect of individuals who have an entitlement to receive a benefit in the future and also individuals who are currently receiving benefits. The data will need to be sufficiently detailed to provide all information that is likely to be financially significant to the level or timing of future benefits.
For example, if a pension is to be provided, the age of the individual will be significant. However, if a pension were also to be paid to a spouse after the death of the member, the existence and age of a spouse of a young member may not be financially significant as the marital status of the member may change in the future.
Any equivalent data used when previously valuing benefits will be useful to the actuary as it will enable reconciliations to be performed that help to indicate the validity of the current data. Accounting data may also help in this process.
Where reserves are built up for benefits, a balance sheet and income and expenditure statement may exist. This will provide information about the total value of the assets held and perhaps information relating to recent benefit outgo and premium / contribution income. This information will be useful in verifying other data or in considering the assumptions to be used. If audited accounts exist, they will enable greater reliance to be placed on the figures when verifying the data.
To place a value on assets that is reliable and consistent with a value placed on future benefits, it is necessary to obtain a full listing of the individual assets held. These individual holdings should then be checked to determine whether they are permitted or are subject to valuation restrictions imposed by regulation or legislation.

31
Q
  1. List five assertions regarding data that an actuary should aim to check.
A

Whether using data provided by their own organisation or a third party, an actuary will have to make and check certain assertions about that data. Such assertions include:
* that a liability or asset exists on a given date
* that a liability is held or an asset is owned on a given date
* that when an event is recorded the time of the event and the associated income or expenditure are allocated to the correct accounting period
* that data is complete, ie there are no unrecorded liabilities, assets or events
* that the appropriate value of an asset or liability has been recorded.

32
Q
  1. Outline twelve data checks that might be carried out on valuation data.
A

Possible checks could include:
* reconciliation of the total number of members / policies and changes in membership / policies, using previous data and movement data
* reconciliation of the total benefit amounts and premiums and changes in them, using previous data and movement data
* the movement data should be checked against any appropriate accounting data, especially with regard to benefit payments
* checks should be made for any unusual values, such as impossible dates of birth, retirement ages or start dates
* consistency between salary-related contributions and in-payment benefit levels indicated by membership data and the corresponding figures in the accounts
* the average sum assured or premium for each class of business should be sensible, and consistent with the figure for the previous investigation
* consistency between investment income implied by the asset data and the corresponding totals in the accounts
* where assets are held by a third party, reconciliation between the beneficial owner’s and the custodian’s records
* full deed audit for certain assets, such as checking the title deeds to large real property assets
* consistency between shareholdings at the start and end of the period, adjusted for sales and purchases, and also bonus issues, etc
* random spot checks on data for individual members / policies or assets
* cross-checks against data from different sources, eg proposal forms.
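
The first of these checks, reconciling membership numbers using previous data and movement data, can be sketched in Python as below. The movement categories and figures are hypothetical, and a real scheme would have more categories (eg retirements or transfers).

```python
from typing import Dict


def reconcile_membership(
    previous_count: int, movements: Dict[str, int], current_count: int
) -> int:
    """Return the unexplained difference; zero means the data reconcile."""
    expected = (
        previous_count
        + movements.get("new_entrants", 0)
        - movements.get("exits", 0)
        - movements.get("deaths", 0)
    )
    return current_count - expected


# Hypothetical movement data between two valuation dates.
movements = {"new_entrants": 120, "exits": 45, "deaths": 5}
difference = reconcile_membership(1_000, movements, current_count=1_068)
print(difference)  # -2, ie two members fewer than the movements explain
```

A non-zero difference does not show which data set is wrong. It only indicates that the previous data, the movement data and the current data are inconsistent and need investigating.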

33
Q
  1. Give examples of two sets of circumstances where data will not be ‘ideal’.
A
  • Data have not been captured at a sufficiently detailed level. For example, a benefit scheme may not analyse membership by whether the employee is a clerical or a manual worker. Changes in the structure of the membership may have a material effect on scheme benefits such as early death or accident benefits. Similarly, if limited information is collected at the point of sale then only limited data will be captured on the insurer’s database.
  • There may be insufficient data to provide a credible result. A provider may have recently launched a new product or branched out into a new target market. Alternatively the provider may simply be too small to attach any credibility to its own experience. This is particularly the case with benefit schemes, where very few employers will be of sufficient size to have credible experience to assess mortality rates before retirement.
34
Q
  1. Describe the drawbacks of using summarised data for valuation purposes.
A

When valuing benefits it may be appropriate to use summarised data instead of detailed membership data in some circumstances. However, it should be recognised that the reliability of the values will be reduced, as full validation of the data will be impossible. Additionally, the summarised data may miss significant differences between the nature of benefits that have been grouped together. Summarised data is therefore only suitable if such inaccuracy is recognised by the users of the results of the calculations.
It is also unlikely that summarised data could be used to value options or guarantees that may or may not apply on an individual basis.

35
Q
  1. Describe, with examples, what is meant by ‘industry-wide’ data collection schemes.
A

In some countries there are organisations that collect data from their member offices and then make available summaries of all the data to their members. For example, in the UK, the Association of British Insurers collects and collates a wide variety of insurance data. This cannot be used in place of policy data to establish provisions for a particular policy or scheme, but could be used to determine bases or be used in product pricing.
One of the best examples is the Continuous Mortality Investigation of the Institute and Faculty of Actuaries in the UK, which does a large amount of work on mortality and morbidity statistics. The volume of data that can be collected from across a whole industry greatly improves the statistical significance of the resulting analysis.

36
Q
  1. Discuss the advantages of having access to industry-wide data.
A

An insurer participating in an industry-wide scheme has the prospect of being able to compare its own experience with that of the industry as a whole (or that part of it represented by the participating insurers) with regard to both the overall level and the pattern of the experience by the categories into which the data are classified. Any significant differences point to a need for explanation. Since an insurer is likely to be seeking to expand by attracting business from its competitors, it may be important to have an indication of the ways in which the characteristics of the business it is seeking may differ from those of the business it already has.

37
Q
  1. Discuss the disadvantages of having access to industry-wide data.
A

When using industry-wide data, there is potential for distortions arising from heterogeneity. This is because the data supplied by different organisations may not be precisely comparable because:
* companies operate in different geographical or socio-economic sections of the market
* the policies sold by different companies are not identical
* sales methods are not identical
* the companies will have different practices, eg underwriting or claim settlement standards
* the nature of the data stored by different companies will not always be the same
* the coding used for the risk factors may vary from organisation to organisation.

Other problems with using industry-wide data may be:
* the data will usually be less detailed, or less flexible, than those available internally
* external data are often much more out of date than internal data
* the data quality will depend on the quality of the data systems of all of its contributors
* not all organisations contribute, and the organisations that do contribute may not be representative of the market as a whole.

38
Q
  1. List two further examples of sources of data for an insurance company.
A

It may also be possible to obtain data from a reinsurer or from national statistics.

39
Q
  1. Describe the process of risk classification.
A

The main aim of risk classification is to obtain homogeneous data. The reduction of heterogeneity within the data for a group of risks makes the experience in each group more stable and characteristic of that group. Furthermore, it enables the data to be used more appropriately for projection purposes. This is important when monitoring claims and mortality experience. Any heterogeneity in data groups will serve to distort the results and can lead to setting provisions that are too big or too small and calculating premiums or contributions that are incorrect.
Ideally data to be analysed should be split into homogeneous groups, for example, by age and gender in a mortality investigation. However, where data is scarce, such as for numbers of deaths at young ages, splitting data into homogeneous groups may result in data groups that are too small to enable any credible analysis to be carried out. In such cases data may need to be combined into groups which are less homogeneous, but which are large enough to be credible. Whenever data is to be analysed there needs to be a balance between splitting the data into homogeneous groups and having sufficient data in each group to enable a credible analysis to be carried out.
There is also a need to carry out sensitivity testing to check that if the data are grouped in a different way the same results are obtained.
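
The way heterogeneity can distort results may be seen in a small Python sketch. It uses two hypothetical risk classes (clerical and manual workers) with invented exposure and death counts. The rate within each class is the same in both years, yet the rate for the combined group changes because the mix of the two classes changes.

```python
from typing import Dict, Tuple

# Hypothetical figures: (lives exposed, deaths) for each risk class.
Year = Dict[str, Tuple[int, int]]

year_1: Year = {"clerical": (8_000, 40), "manual": (2_000, 30)}
year_2: Year = {"clerical": (5_000, 25), "manual": (5_000, 75)}


def class_rate(year: Year, risk_class: str) -> float:
    """Observed rate within a single (homogeneous) risk class."""
    exposed, deaths = year[risk_class]
    return deaths / exposed


def crude_rate(year: Year) -> float:
    """Observed rate for the combined (heterogeneous) group."""
    total_exposed = sum(exposed for exposed, _ in year.values())
    total_deaths = sum(deaths for _, deaths in year.values())
    return total_deaths / total_exposed


for name, year in (("year 1", year_1), ("year 2", year_2)):
    print(
        name,
        f"clerical={class_rate(year, 'clerical'):.3f}",
        f"manual={class_rate(year, 'manual'):.3f}",
        f"combined={crude_rate(year):.3f}",
    )
# year 1 clerical=0.005 manual=0.015 combined=0.007
# year 2 clerical=0.005 manual=0.015 combined=0.010
```

An analysis of the combined group alone would suggest that experience has worsened, when in fact only the balance between the two classes has shifted.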