Chapter 13 Flashcards

1
Q

selection bias

A

the data is not selected randomly enough to be representative of the population

eg sample = 3% errors
population = 4% errors
the gap may be caused by selection bias

2
Q

observer bias

A

an observer lets their assumptions (which may be unconscious) influence their observations

3
Q

omitted variable bias

A

the researcher omits a key variable that results in an incorrect finding

relates to EXPLORATORY DATA ANALYSIS not descriptive analysis

4
Q

cognitive bias

A

relates to human perception + how data is presented

5
Q

self-selection bias

A

individuals select themselves to be part of a study

6
Q

confirmation bias

A

the researcher accepts data that confirms their belief + ignores data that disagrees

7
Q

survivorship bias

A

the sample contains only data that has survived some earlier event, so anything that did not survive is left out

8
Q

null hypothesis

A

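the starting assumption that there is no statistically significant difference; if the difference is statistically significant at the 95% confidence level = reject the null hypothesis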

9
Q

descriptive stats

A

the statistical summarisation of data
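
A minimal sketch of what "summarising" can look like in practice, using Python's standard `statistics` module. The `monthly_sales` figures are made up for illustration and are not taken from the chapter.

```python
import statistics

# Hypothetical monthly sales figures, invented for this example.
monthly_sales = [120, 135, 128, 150, 142, 138]

# Descriptive statistics describe the data itself; they make no claim
# about any wider population.
summary = {
    "mean": statistics.mean(monthly_sales),
    "median": statistics.median(monthly_sales),
    "range": max(monthly_sales) - min(monthly_sales),
    "stdev": statistics.stdev(monthly_sales),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```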

10
Q

inferential stats

A

the statistical findings from a sample (a small subset of the data) are taken to apply to the characteristics of the larger population

11
Q

exploratory data analysis

A

the identification of relationships within a dataset

12
Q

confirmatory data analysis

A

using stats to confirm a pre-determined hypothesis

13
Q

A company correctly records and analyses all its sales transactions. At the end of each month, a
report is produced for the sales director listing details of every sales transaction: customer, products,
quantities and prices. Which of the following describes the quality of the report’s data and
information?
A good quality of data, but poor quality of information
B good quality of both data and information
C poor quality of data, but good quality of information
D poor quality of both data and information

A

A

14
Q

Big Dave Ltd collects data about customers and what they buy as well as certain items of personal
information. It analyses this data to identify relationships between the different variables, such as
what products appeal most to people of certain age groups.
Requirement
What type of analysis is this an example of?
A Descriptive statistics
B Exploratory data analysis
C Confirmatory data analysis
D Relativity analysis

A

B

15
Q

Which of the following is the best description of professional scepticism?
A All information should be challenged, and should be assumed to be incorrect until it has been
proved otherwise.
B Forecasts that appear optimistic should be ignored, while forecasts that are pessimistic should
be assumed to be correct.
C Assessing the information critically, being alert to possible misstatements due to error and fraud.
D Refusing to accept that information is correct until it has been certified by a qualified accountant.

A

C

16
Q

A schools inspector sits in on classes and makes an assessment of the teacher. The teacher is given a
grade from 1 to 5 where 5 is excellent and 1 is inadequate. Without realising that she is doing this,
the inspector tends to give more generous grades to teachers of maths and science, and lower
grades to teachers of arts and humanities. Before becoming an inspector, she was a science teacher.
Requirement
What type of bias is the inspector introducing into her rankings?
A survivorship bias
B cognitive bias
C observer bias
D self-selection bias

A

C

17
Q

A data analyst at a major retail chain has performed some analysis in which he has calculated the
mean monthly expenditure, and the average number of visits per month by the chain’s customers.
The information was prepared by using details of credit and debit card payments to enable the
analyst to track all purchases made by a particular customer. Approximately 60% of purchases in the
stores are made using credit and debit cards, and the analyst claims to have tracked the purchases of
10% of card users.
Requirement
In evaluating the statistics produced by the analyst, which of the following conclusions would you
reach?
A Given that 10% of the card users were used in the sample, the statistics are likely to be
representative of the population.
B The data in the sample may suffer from selection bias so it should be recognised that the
statistics may not be an accurate reflection of the whole population.
C Since only a sample of customers was used, the data analysis is likely to be wrong and should
therefore be ignored.
D The data in the sample suffers from omitted variable bias so it should be

A

B

18
Q

The directors of a business have asked you to prepare a presentation in which you provide an
overview of the trends in sales over the last 10 years. The company has three main product lines.
Requirement
Which type of chart would be most useful for providing a good overview of the trends in sales over
the last 10 years?
A A clustered bar chart
B A component bar chart
C A pie chart
D A line chart

A

D

line = best way of identifying trends - good for an overview of sales

19
Q

Arkwright Ltd analyses huge quantities of data about a wide variety of issues from a wide variety of
sources. Arkwright Ltd is seeking competitive advantage from:
A its transaction processing system
B big data
C cybersecurity
D its strategic process

A

B - a company that uses big data for competitive advantage streams in huge quantities of data from a variety of internal + external sources + applies data analytics to obtain as much value from the data as possible

20
Q

The ability to stream big data into an organisation’s systems in real time is an example of which
feature of big data?
A Volume
B Veracity
C Velocity
D Variety

A

C = the speed of data

21
Q

Which of the following features of big data concerns the fact that data sets contain anomalies and
errors?
A Veracity
B Variety
C Volume
D Velocity

A

A = concerns the trustworthiness or accuracy of data

22
Q

a random number generator is used to select a sample from within the population

A

simple random sampling

25
Q

disadvantage of simple random sampling

A

data may not be representative - through chance
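
A small sketch of simple random sampling and of how a sample can differ from the population purely by chance. The population of 1,000 invoices with a 4% error rate, the sample size of 30 and the helper name `simple_random_sample` are all assumptions made for illustration.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Pick k items so that every item has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, k)

# Hypothetical population: 1,000 invoices, 40 of which (4%) contain an error.
population = [True] * 40 + [False] * 960  # True = invoice contains an error

sample = simple_random_sample(population, 30, seed=1)
sample_error_rate = sum(sample) / len(sample)
population_error_rate = sum(population) / len(population)

# The two rates will often differ even though the selection was fully random.
print(f"population error rate: {population_error_rate:.1%}")
print(f"sample error rate:     {sample_error_rate:.1%}")
```

Changing the seed draws a different sample, and the sample error rate moves around the 4% population figure without any bias being involved.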

26
Q

every nth observation from within a population is selected

A

systematic sampling - reduces chance of an unrepresentative sample being taken

27
Q

disadvantage of systematic sampling

A

patterns in the data may line up with the sampling interval, eg invoices raised weekly

28
Q

the population is divided into sub-populations based on a particular characteristic

A

stratified sampling - strata = sub-populations (one stratum = one sub-population)
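
A rough sketch of stratified sampling, assuming the same fraction is drawn at random from every stratum. The customer records, the age groups and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=None):
    """Draw the same fraction at random from each stratum (sub-population)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)  # group records by characteristic
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical customers tagged with an age group (60 / 30 / 10 split).
customers = (
    [{"id": i, "age_group": "18-34"} for i in range(60)]
    + [{"id": i, "age_group": "35-54"} for i in range(60, 90)]
    + [{"id": i, "age_group": "55+"} for i in range(90, 100)]
)

sample = stratified_sample(
    customers, key=lambda c: c["age_group"], fraction=0.1, seed=0
)
print(len(sample), "customers sampled")
```

Because each age group contributes in proportion to its size, the small "55+" group cannot be missed by chance, which is the risk with simple random sampling.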

29
Q

type II error

A

false negative = the null hypothesis is false but it is not rejected because the sample result is not statistically significant

30
Q

type I error

A

false positive = the null hypothesis is true but it is rejected because the sample result is statistically significant
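
One way to keep the two error types apart is to write the four possible outcomes as a small decision function. This is only a memory aid; the name `classify_outcome` is made up for the example.

```python
def classify_outcome(null_is_true: bool, null_rejected: bool) -> str:
    """Name the outcome of a hypothesis test given the (unknown) truth."""
    if null_is_true and null_rejected:
        return "type I error (false positive)"
    if not null_is_true and not null_rejected:
        return "type II error (false negative)"
    return "correct decision"

# Walk through all four combinations of truth and decision.
for null_is_true in (True, False):
    for null_rejected in (True, False):
        outcome = classify_outcome(null_is_true, null_rejected)
        print(f"null true={null_is_true}, rejected={null_rejected}: {outcome}")
```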

31
Q

An external auditor is auditing the accuracy of purchase invoices in the accounting system of a client.
The auditor has formulated a hypothesis that errors occur in 2% of invoices. A sample of 30 invoices
was taken, and errors occurred in 3% of these. This was not considered to be significantly different
from the hypothesis, and the auditor concluded that errors occur in 2% of invoices.
Subsequently, the internal auditors checked all of the invoices entered into the system for the
previous year and identified an error rate of 4%.
Requirement
What type of error did the external auditor make in concluding that the error rate was 2%?
A a type I error
B a type II error
C omitted variable bias
D selection bias

A

B type II error - the null hypothesis was accepted (a negative result) when it should have been rejected in favour of the alternative hypothesis, ie there is a statistically significant difference

NOT
omitted variable bias - no variable has been omitted

NOT
selection bias - not necessarily, as a sample mean will not always equal the population mean
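
The auditor's situation can be sketched with an exact binomial calculation. Several things here are assumptions and not part of the question: "3%" of 30 invoices is treated as 1 error, the test is one-sided at a 5% significance level, and the helper names are invented.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k errors in n invoices when the error rate is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k, n, p):
    """Probability of k or more errors in n invoices when the error rate is p."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

n = 30
null_rate = 0.02       # the external auditor's hypothesis
true_rate = 0.04       # the rate the internal auditors later found
observed_errors = 1    # assumption: "3%" of 30 invoices is roughly 1 error
alpha = 0.05           # assumption: 5% significance level

# p-value: chance of at least this many errors if the 2% hypothesis holds.
p_value = prob_at_least(observed_errors, n, null_rate)
reject_null = p_value < alpha

# Smallest error count that would have led to rejecting the 2% hypothesis.
critical = next(k for k in range(n + 1) if prob_at_least(k, n, null_rate) < alpha)

# Chance of a type II error: the true rate is 4% but the sample shows
# fewer errors than the critical count, so the null is not rejected.
type_ii_probability = 1 - prob_at_least(critical, n, true_rate)

print(f"p-value = {p_value:.3f}, reject null: {reject_null}")
print(f"critical error count = {critical}")
print(f"P(type II error | true rate 4%) = {type_ii_probability:.2f}")
```

Under these assumptions a sample of only 30 invoices is unlikely to detect the difference between 2% and 4%, which is why failing to reject the null here is a type II error and not evidence that the 2% figure is right.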

32
Q

Big Data analytics typically involves the analysis of unstructured data. Which of the following is an
example of unstructured data for a company that operates a chain of coffee shops?
A Spreadsheet analysis of the purchase of new chairs for store refurbishments
B Data tables showing monthly sales figures at each store
C A table of supplier names and addresses
D Email communications between a customer and the marketing department

A

structured data: any data that is contained within a field in a data record or file - includes data in databases + spreadsheets

unstructured data: data that is not easily contained within structured data fields, including content of pictures, webpages, videos, emails etc

D

33
Q

Data science has great importance to big data and data analytics.
Statement (1) Data science deals with analysing and interpreting data rather than how it is collected.
Statement (2) One objective of data science is to extract value from data.
Requirement
Identify whether each statement is accurate.
A Statement (1) accurate; Statement (2) inaccurate
B Statement (1) inaccurate; Statement (2) inaccurate
C Statement (1) accurate; Statement (2) accurate
D Statement (1) inaccurate; Statement (2) accurate

A

accurate + accurate

34
Q

Finax plc is a credit referencing organisation. It collects data on the credit transactions of millions of people and commercial organisations worldwide. Vertex Ltd has an arrangement with Finax plc to access its store of big data so that it can make decisions on whether to extend credit to new
customers.
Requirement
Which type of big data does Finax plc supply?
A Created big data
B Provoked big data
C Transacted big data
D Compiled big data

A

D - compiled big data is collected by a third party (Finax plc) and accessed by a business (Vertex Ltd)

35
Q

There are several sources of big data available to business organisations.
Statement (1): Open data is data primarily sourced from public sector data, such as transport data
and government financial data.
Statement (2): Human-sourced data is primarily from social networks, emails and text messages.
Requirement
Identify whether each statement is accurate.
A Statement (1) accurate; Statement (2) inaccurate
B Statement (1) inaccurate; Statement (2) inaccurate
C Statement (1) accurate; Statement (2) accurate
D Statement (1) inaccurate; Statement (2) accurate

A

C - both statements are accurate

36
Q

unstructured data : captured data

A

unstructured = obtained without a particular objective so has no inherent structure

captured = data which is created passively from unrelated activity and captured without a specific purpose

37
Q

unstructured data : user-generated data

A

unstructured = obtained without a particular objective so has no inherent structure

user-generated = data which internet users create and voluntarily place online
eg tweets, Instagram photos

38
Q

structured data : compiled data

A

data collected by a third party, eg a market research, credit rating or polling organisation

39
Q

structured data : provoked data

A

data obtained from people who have been given the option to express their opinion, eg questionnaires

40
Q

structured data : transacted data

A

data collected about actual transactions, eg sales

41
Q

structured data : created data

A

data that has been created on purpose by an organisation, usually for product or market research

42
Q

sources of big data : processed data

A

comes from information systems held by traditional businesses, eg an EPOS cash till

43
Q

sources of big data : open data

A

large amounts of data in the public domain, eg government financial data, public service data, geo-spatial data, transport data

44
Q

sources of big data : human-sourced data

A

obtained from social networks, blogs, emails etc

45
Q

sources of big data : machine-generated data

A

can be obtained from the internet of things (devices connected to the internet), eg a FitBit

46
Q

The characteristic that determines whether information is both of good quality and valuable is:
A accuracy
B accessibility
C timeliness
D relevance

A

D

good quality info must be accurate + timely
valuable info must be accessible

both good quality + valuable = relevant

47
Q

Last year Rillet Ltd invested in a new information system to collect and analyse big data in regards to
the sale of one of its products. Once collected and analysed, the information was then reported to the
board of directors. The directors were so happy with the system that it was expanded to report on all
of the company’s products. However, for one of the products, an error in the system meant that a
pattern in regards to sales was identified that did not actually exist. As a consequence, the directors
made a business decision that made a substantial loss.
Requirement
Which of the following risks of big data does this describe?
A Information overload
B Workforce skills
C Data dependency
D Data security

A

C - the business became too dependent on the data which put it at risk of data errors and errors in interpreting the data

48
Q

The following statements about using spreadsheets in the budgeting process have been made:
(1) The budgets of the different departments can be easily consolidated into a budget for the whole
organisation
(2) Spreadsheets ensure that no errors occur in calculations
(3) Budget templates can be used to help department managers prepare their budgets
Requirement
Which of the statements above are correct?

A

1 and 3

2 is wrong - there may be errors in the formulas!

49
Q

The ICAEW has issued a guidance document ‘Principles of good spreadsheet practice’.
Requirement
Which of the following is an aim of the ICAEW principles of good spreadsheet practice?
A To provide detailed guidance about spreadsheet design
B To reduce the amount of time wasted by poor spreadsheet design
C To provide detailed guidance on the use of formulae in spreadsheets
D To provide detailed guidance on formatting spreadsheets

A

B and to reduce the number of errors caused by poor spreadsheet design

the principles provide a framework rather than providing detailed advice about spreadsheet design

50
Q

The following statements about spreadsheet practice have been made
(1) Never embed in a formula a number that might change or need to be changed
(2) Keep formulae as short and simple as practicable
(3) Have a formula for the calculation of an important value in several different cells to ensure that the calculation is not lost
Requirement
Which of the above statements are consistent with the ICAEW ‘Principles of good spreadsheet
practice?’
A 1 and 2 only
B 2 and 3 only
C 1 and 3 only
D 1, 2 and 3

A

A

statement 3 = the OPPOSITE - each calculation should be performed only once to avoid inconsistent calculations

51
Q

The following are principles of good spreadsheet practice:
(1) Build in checks, controls and alerts from the outset
(2) Include an ‘About’ or ‘Welcome’ sheet to document the spreadsheet
(3) Separate and clearly identify inputs, workings and outputs
Requirement
Which of the principles above can reduce the problems that occur when a spreadsheet is inherited
by a new user?
A 1 and 2 only
B 2 and 3 only
C 1 and 3 only
D 1, 2 and 3

A

inheriting a new spreadsheet - the user may be unaware of the implications of making changes to the spreadsheet as they do not understand the design

B - 2 + 3 mitigate this

1 = designed to identify errors when the spreadsheet is created rather than helping new users when the spreadsheet is handed over to them

52
Q

Data Analysts often use samples taken from a population in order to find out more about that population. A sample should be representative.
Requirement
Which of the following best describes the meaning of a representative sample?
A A sample where the size of the sample is greater than 30 items
B A sample that includes data from all sub classes of the population
C A sample that reflects characteristics of the population as a whole
D A sample that is chosen randomly

A

C

53
Q

A market research company wishes to select a sample of voters to call ahead of a local election. The
company has obtained a copy of the electoral register for the local area, which contains the names of
all individuals who are entitled to vote. All individuals have been assigned a register number, starting
from 0001, and there are approximately 10,000 individuals on the register.
The market research company uses a random number generator to generate a number (n) between
0 and 500. The first individual chosen for the sample will be the individual with the register number n.
Subsequent individuals are chosen by choosing the individual whose register number is 500 higher
than the previous individual (ie, the second individual will have a register number equal to n + 500,
the third n+ 1,000 etc). In this way, the market research company aims to select a sample of 20
individuals from the local area.
Requirement
Which method of sampling is this an example of?
A Stratified sampling
B Simple random sampling
C Cluster sampling
D Systematic sampling

A

D
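
The selection rule in this question can be sketched in a few lines. One assumption: the random start is drawn from 1 to 500 so that it always matches a real register number (the question says "between 0 and 500", but the register starts at 0001). The function name `systematic_sample` is made up.

```python
import random

def systematic_sample(population_size, interval, seed=None):
    """Random start within the first interval, then every interval-th item."""
    rng = random.Random(seed)
    start = rng.randint(1, interval)  # random starting register number n
    return list(range(start, population_size + 1, interval))

# 10,000 people on the register, every 500th person after the random start.
register_numbers = systematic_sample(10_000, 500, seed=42)

print(len(register_numbers), "individuals selected")
print("first three register numbers:", register_numbers[:3])
```

Only the starting point is random; every later pick is fixed by the interval, which is what separates systematic sampling from simple random sampling.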