Chapter 13 - Data Analysis Flashcards

1
Q

What is Data?

A

Consists of numbers, letters, symbols, raw facts, events and transactions which have been recorded but not yet processed into a form which is suitable for use by management

2
Q

What is information?

A

Data that have been processed in such a way that they are meaningful to the person who receives them
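The distinction between data and information can be illustrated with a minimal sketch. The transaction list and the `summarise_sales` helper below are hypothetical, invented purely for illustration: the raw records are the data, and the per-product totals are one possible form of information.

```python
from collections import defaultdict

# Hypothetical raw data: individual sales transactions (product, amount).
transactions = [
    ("tea", 12.50),
    ("coffee", 8.00),
    ("tea", 7.50),
    ("coffee", 4.00),
]


def summarise_sales(records):
    """Process raw transactions into total sales per product."""
    totals = defaultdict(float)
    for product, amount in records:
        totals[product] += amount
    return dict(totals)


# The summary is "information": processed into a form a manager can use.
sales_by_product = summarise_sales(transactions)
print(sales_by_product)
```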

3
Q

Why is information useful to management?

A
  • Helps planning
  • Helps making decisions
  • Helps controlling day-to-day operations, for example by comparing actual results with those planned
4
Q

What are the four types of data?

A
  • Quantitative data = numerical data that provides measurements or quantities, expressed as numbers, e.g. the number of kg needed to make a unit of product
  • Qualitative data = data that cannot be expressed as numbers or values and is much harder to analyse
  • Discrete data = non-continuous data that can take only certain distinct values, e.g. the number of units produced
  • Continuous data = data that can take on any value (within a range), e.g. time or distance
5
Q

What are internal sources of data?

A
  • Accounting records
  • HR/payroll records
  • Machine logs/computer systems
  • Procurement data systems
  • Timesheets
  • Communication to/from staff
6
Q

What are the sources of external information?

A

Formally gathered
- Market research e.g. new trends, customer tastes, competitor products
- Research and development
- Tax and accounting specialists
- Legal specialists

Informally gathered
- Any information gathered on an ongoing basis e.g. newspapers, internet, meetings with external business colleagues

7
Q

What is the Internet of Things (IoT)?

A

A network of internet-connected devices that continually collect and exchange data

8
Q

Using the mnemonic ACCURATE - What are qualities of good information?

A

A - Accurate e.g. no typos, appropriate rounding and categorisation, assumptions stated
C - Complete e.g. all information provided with purpose
C - Cost-beneficial e.g. benefit > cost of producing information
U - User-targeted e.g. understandable and useful to recipient
R - Relevant for purpose intended
A - Authoritative e.g. genuine, highest quality for purpose, source should be known and reliable
T - Timely e.g. produced in advance of when needed
E - Easy to use e.g. clear, concise, constructive, communicated appropriately

9
Q

What are the steps in data analysis?

A
  1. Identify the information needs
  2. Collect the data
  3. Analyse the data
  4. Present the information
  5. Use the information
10
Q

What are ways in which data can be analysed?

A
  1. Inferential statistics: uses a set of data taken from a population to describe and make inferences about that population
  2. Exploratory data analysis: looks for patterns in the data. This type of analysis may use regression and correlation analysis.
  3. Confirmatory data analysis: confirms (or not) a hypothesis using statistical methods, for example that a price increase of 3% will reduce demand by 5%
  4. Sampling: analysing a sample, i.e. a group of items drawn from a population. The population may consist of items such as metal bars, invoices or packets of tea
11
Q

What is sampling?

A

Collecting a sample by selecting units (e.g. people, organisations) and then using the information gathered to generalise to the wider population

12
Q

What are the three main reasons why sampling is necessary?

A
  1. Whole population may not be known
  2. Even if the population is known the process of testing every item can be extremely costly in time and money e.g. gaining information about the popularity of TV programmes by interviewing every viewer
  3. Items being tested may be completely destroyed in the process, e.g. to check the lifetime of an electric light bulb it is necessary to leave the bulb burning until it fails and is of no further use
13
Q

What are the rules involved with sampling?

A

The sample must be chosen in such a way that it is representative of the population

The sample must be of a certain size. In general, the larger the sample, the more reliable the results will be
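One way to see why larger samples tend to be more reliable is the standard error of a sample mean, which shrinks with the square root of the sample size. The sketch below assumes simple random sampling with independent observations; the standard deviation of 10 and the `standard_error` helper are hypothetical values and names chosen for illustration.

```python
import math


def standard_error(std_dev, sample_size):
    """Standard error of a sample mean: std_dev / sqrt(sample_size)."""
    return std_dev / math.sqrt(sample_size)


# Quadrupling the sample size halves the standard error.
for n in (25, 100, 400):
    print(n, standard_error(10.0, n))
```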

14
Q

What are the four types of sampling?

A
  1. Random
  2. Systematic
  3. Surveys
  4. Stratified
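Three of the methods listed above (random, systematic and stratified) can be sketched in a few lines. This is an illustrative sketch only: the invoice population, the `value_band` grouping and all function names are assumptions made for the example, and surveys are not covered.

```python
import random


def random_sample(population, n, seed=0):
    """Random sampling: every item has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, step, start=0):
    """Systematic sampling: every step-th item, beginning at index start."""
    return population[start::step]


def stratified_sample(population, stratum_of, fraction, seed=0):
    """Stratified sampling: sample the same fraction from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


def value_band(invoice_number):
    """Hypothetical grouping: invoices above 80 form the 'high' stratum."""
    return "high" if invoice_number > 80 else "low"


# Hypothetical population: invoices numbered 1 to 100.
invoices = list(range(1, 101))

print(random_sample(invoices, 10))
print(systematic_sample(invoices, 10, start=4))
print(stratified_sample(invoices, value_band, 0.1))
```

In practice the starting point of a systematic sample is usually itself chosen at random; it is fixed here so the result is repeatable.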
15
Q

What are spreadsheets?

A
  1. Computer package used to manipulate data
16
Q

What is the SUM function used for?

A

Totals the values in the list

17
Q

What is the AVERAGE function used for?

A

Returns the average of the values in the list

18
Q

What is the MAX function used for?

A

Returns the highest value in the list

19
Q

What is the MIN function used for?

A

Returns the lowest value in the list
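For comparison, the four spreadsheet functions above have close equivalents in Python. The values and the cell range A1:A4 mentioned in the comments are hypothetical and used only for illustration.

```python
from statistics import mean

# Hypothetical values, as if held in spreadsheet cells A1:A4.
values = [120, 95, 140, 105]

total = sum(values)      # like =SUM(A1:A4)
average = mean(values)   # like =AVERAGE(A1:A4)
highest = max(values)    # like =MAX(A1:A4)
lowest = min(values)     # like =MIN(A1:A4)

print(total, average, highest, lowest)
```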

20
Q

What are the disadvantages of using spreadsheets?

A
  • Can be time consuming
  • Not able to identify data input errors or prevent accidental deletion so training of staff is important
  • Sharing violations among users wishing to view or change data at the same time
  • Difficult to identify an error in the design of the spreadsheet as some formulae are very complicated
  • Spreadsheets are open to cyber-attack through viruses, hackers and general system failure
  • Spreadsheets are restricted to a finite number of records and they may not be a true reflection of the ‘real’ world.
21
Q

What are the two problems with data?

A
  • Comparability: is it possible to compare data from different sources?
  • Data bias: when a sample is chosen, does it truly represent the population?
22
Q

What are the 7 types of bias?

A
  • Selection bias
  • Self-selection bias
  • Observer bias
  • Omitted variable bias
  • Cognitive bias
  • Confirmation bias
  • Survivorship bias
23
Q

What is selection bias?

A

When selecting a sample, all items in a population should have the same chance of being picked (true random sampling)

If the selection is not random, selection bias can occur and the sample may not be representative

24
Q

What is self-selection bias?

A

When an individual selects whether or not to include themselves as part of a sample

25
Q

What is observer bias?

A

When the assumptions of a researcher unintentionally influence observations

26
Q

What is omitted variable bias?

A

When a variable that could affect the analysis is left out when data is being analysed, such as age or gender when analysing shopping habits

27
Q

What is cognitive bias?

A

How data is perceived can influence the understanding of the results and lead someone to misinterpret the information

28
Q

What is confirmation bias?

A

Confirmation bias can occur when information is processed that favours previously existing beliefs. It can lead to inconsistent information being ignored

29
Q

What is survivorship bias?

A

If a sample only contains items that have survived a previous event, survivorship bias can occur.

The act of focussing on successful people, businesses or strategies and ignoring those that have failed
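A small numerical sketch shows how survivorship bias can distort a result. The businesses and their returns are invented for illustration; the point is only that averaging the survivors gives a very different picture from averaging everything.

```python
from statistics import mean

# Hypothetical data: (business, annual return in %, still trading?).
businesses = [
    ("A", 12, True),
    ("B", 8, True),
    ("C", -30, False),
    ("D", 10, True),
    ("E", -40, False),
]

all_returns = [ret for _, ret, _ in businesses]
survivor_returns = [ret for _, ret, survived in businesses if survived]

# Looking only at survivors overstates typical performance.
print("All businesses:", mean(all_returns))
print("Survivors only:", mean(survivor_returns))
```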

30
Q

What is hypothesis testing?

A

Where data is used to test whether an idea or hypothesis is true.

31
Q

What is a null hypothesis?

A

A type of hypothesis used in statistics that proposes there is no difference between certain characteristics of a population

32
Q

What is statistical significance?

A

Where the results are deemed to have occurred due to a specific cause rather than by chance

33
Q

What is a Type I error?

A

A false positive error: occurs when a null hypothesis is rejected even though it is true and should not be rejected

34
Q

What is a Type II error?

A

A false negative error: occurs when a null hypothesis is false but is accepted (not rejected)
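The two error types can be summarised as a simple decision table. The sketch below is illustrative only; the function name `classify_outcome` and its parameters are hypothetical.

```python
def classify_outcome(null_is_true, null_rejected):
    """Label a test decision given the (normally unknown) truth of the null."""
    if null_is_true and null_rejected:
        return "Type I error (false positive)"
    if not null_is_true and not null_rejected:
        return "Type II error (false negative)"
    return "Correct decision"


# Walk through all four combinations of truth and decision.
for null_is_true in (True, False):
    for null_rejected in (True, False):
        print(null_is_true, null_rejected,
              classify_outcome(null_is_true, null_rejected))
```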

35
Q

What is data visualisation?

A

Use of charts and diagrams to present information

36
Q

What are forms of data visualisations?

A
  • Bar charts
  • Pie charts
  • Line graphs (e.g. showing data over time)
37
Q

What is big data?

A

Datasets with sizes beyond the ability of typical database software to capture, store, manage, and analyse

38
Q

What are the key features of big data (the four Vs)?

A
  • Volume: Considers the amount of data fed into the organisation
  • Variety: Considers the variety of formats of data received
  • Velocity: Considers the speed that data is fed into the organisation
  • Veracity: Considers the reliability of the data being received
39
Q

What relevance does volume have on big data?

A
  • Does the organisation have resources to store and manage data?
  • Does it have the financial resources required to invest in or upgrade IT/IS?
40
Q

What relevance does variety have on big data?

A
  • Are systems compatible and capable of accepting various forms of data?
  • Legally, is the data owned by the organisation or by the third party?
41
Q

What relevance does Velocity have on big data?

A
  • Are systems able to capture and process ‘real time’ data?
  • Does the organisation have the skills to provide timely analysis of this data?
42
Q

What relevance does Veracity have on big data?

A
  • Can the organisation challenge data received from a third party?
  • Is the data received fully representative of the whole data population?
43
Q

What is the importance of big data?

A
  • Potential to achieve competitive advantage
  • Huge array of new data sources: Social media, Internet of things
  • Exponential growth in computing power and storage capacity
  • New avenues of knowledge creation such as crowdsourcing and open source software
44
Q

What is data science?

A

Collecting, preparing, managing, analysing, interpreting and visualising large and complex datasets

45
Q

What is data analytics?

A
  • The process by which data scientists extract value from big data
  • Source data is analysed to turn it into information that is useful to the business
46
Q

What are the benefits of big data, data science and data analytics?

A
  • Decision making: real time analysed information allows managers to make better decisions
  • Customer analysis: Market segmentation and customisation can occur from having a greater insight into customer needs
  • Innovation: Analysed big data can reveal completely new ideas and lead to innovation
  • Risk management: Big data can assist with the identification, quantification and management of risk
47
Q

What are the risks of big data, data science and data analytics?

A
  • Storage e.g. systems must be reviewed and upgraded to cope with the data and processing required
  • Skills e.g. data scientists and analysts are in short supply making it difficult for organisations to recruit and retain the right staff
  • Data dependency e.g. data led decisions lead to significant risk should the data be weak, erroneous or corrupted
  • Overload e.g. too much information and analysis can make businesses lose sight of the key data and also slow down decision making and responsiveness
  • Data privacy e.g. there is a risk that data privacy legislation could be breached
  • Data security e.g. protection needs to be put in place to protect any data from cyber security risks