Chapter 8 Flashcards
What is the difference between Data and Information?
Data is raw facts, numbers or transactions that have not yet been processed.
Information is data which has been processed in such a way that it is meaningful.
NB: The ‘process’ could be a simple sorting or summarising process.
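A minimal sketch of that point, assuming some made-up sales figures (the numbers and the function name summarise are illustrative only): the raw list is data, and the sorted and summarised result is information.

```python
# Hypothetical raw data: individual sales values, not yet processed.
raw_sales = [120, 80, 200, 150, 50]

def summarise(values):
    """Process raw values into simple information: sorted order, total and mean."""
    ordered = sorted(values)
    total = sum(values)
    return {"sorted": ordered, "total": total, "mean": total / len(values)}

info = summarise(raw_sales)
print(info)
```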
What are the characteristics of good information?
Mnemonic: ACCURATE
Accurate
Complete
Cost effective
Understandable
Relevant
Authoritative
Timely
Easy to use
Characteristic: Accurate
Information should be precise enough to allow the appropriate decision to be made
Characteristic: Complete
Information should contain all relevant facts
Characteristic: Cost effective
It must be more valuable than the cost of producing it
Characteristic: Understandable
Information should be clearly presented so that it is easily and quickly understood
Characteristic: Relevant
The information must be pertinent to the user and within the control of the responsible party
Characteristic: Authoritative
The information must be based on trustworthy sources
Characteristic: Timely
The information should be available within any relevant time periods
Characteristic: Easy to use
The information should be accessible
What is Quantitative data?
Data that can be measured (e.g. age)
What is Qualitative data?
Data that cannot be measured (e.g. how blue your eyes are)
What is Discrete data?
Values/observations are distinct and separate - they can be counted
e.g. the number of students in a class
What is Continuous data?
Values/observations may take on any value within a finite or infinite interval.
You can count, order and measure continuous data,
e.g. height, weight, temperature, time
What is Primary data?
Data that has been gathered for a specific purpose by the user
This could be via interviews, questionnaires etc
What is Secondary data?
Data gathered by someone else for another purpose but that you are using for your purpose
e.g. financial statements or statistics
What are the different sampling techniques?
1) Random
2) Stratified random
3) Systematic
4) Cluster
5) Multistage
6) Quota
Sampling technique: Random
-Items are picked at random from the sampling frame until the sample is complete
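A minimal sketch of random sampling, assuming a hypothetical sampling frame of 100 numbered items and sampling without replacement (the function name and the seed parameter are illustrative choices):

```python
import random

def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Pick sample_size distinct items at random from the sampling frame."""
    rng = random.Random(seed)  # the seed only makes the example repeatable
    return rng.sample(sampling_frame, sample_size)

frame = list(range(1, 101))  # hypothetical frame: items numbered 1 to 100
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```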
Sampling technique: Stratified Random
In a stratified sample, the sampling frame is divided into non-overlapping groups or strata. A random sample is then taken from each stratum
-e.g. geography, age group, gender
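A minimal sketch of stratified random sampling, assuming a hypothetical list of people tagged with an age group and the same number of picks from each stratum (in practice the picks are often made proportional to the size of each stratum):

```python
import random
from collections import defaultdict

def stratified_sample(sampling_frame, stratum_of, per_stratum, seed=None):
    """Split the frame into strata, then take a random sample from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in sampling_frame:
        strata[stratum_of(item)].append(item)  # each item falls in exactly one stratum
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical frame: 20 people, half in each age group.
people = [("P%d" % i, "under 30" if i % 2 else "30 and over") for i in range(20)]
sample = stratified_sample(people, lambda person: person[1], per_stratum=3, seed=1)
print(sample)
```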
Sampling technique: Systematic
Every nth item in the sampling frame is taken for the sample
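A minimal sketch of systematic sampling, assuming a hypothetical frame of 100 numbered items. Starting at a random position among the first n items is an assumption added here; the card itself only says every nth item is taken.

```python
import random

def systematic_sample(sampling_frame, n, seed=None):
    """Take every nth item, starting from a random position among the first n."""
    rng = random.Random(seed)
    start = rng.randrange(n)  # assumed random starting point: 0 to n-1
    return sampling_frame[start::n]

frame = list(range(1, 101))  # hypothetical frame: items numbered 1 to 100
sample = systematic_sample(frame, 10, seed=1)
print(sample)
```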
Sampling technique: Cluster
Cluster sampling divides the population into groups, or clusters. A number of clusters are selected randomly to represent the population and then all units within selected clusters are included in the sample
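A minimal sketch of cluster sampling, assuming hypothetical branches as the clusters (all names below are made up): clusters are chosen at random and every unit in a chosen cluster goes into the sample.

```python
import random

def cluster_sample(clusters, clusters_to_pick, seed=None):
    """Randomly choose whole clusters, then include every unit in each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), clusters_to_pick)
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample

# Hypothetical population split into four clusters.
clusters = {
    "Branch A": ["A1", "A2", "A3"],
    "Branch B": ["B1", "B2"],
    "Branch C": ["C1", "C2", "C3", "C4"],
    "Branch D": ["D1", "D2"],
}
chosen, sample = cluster_sample(clusters, 2, seed=1)
print(chosen, sample)
```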
Sampling technique: Quota
A quota of ‘x’ items is set and items are taken until the quota is filled
e.g. a street interviewer
There may be bias in the sample because the items are not selected at random
Sampling technique: Multistage
Multi-stage sampling is like cluster sampling, but involves selecting a sample within each chosen cluster, rather than including all units in the cluster. Thus, multi-stage sampling involves selecting a sample in at least two stages. In the first stage, large groups or clusters are selected. These clusters are designed to contain more population units than are required for the final sample.
-In the second stage, population units are chosen from selected clusters to derive a final sample. If more than two stages are used, the process of choosing population units within clusters continues until the final sample is achieved.
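A minimal two-stage sketch of multistage sampling, assuming hypothetical towns as clusters and households as units (all names are made up). Unlike cluster sampling, only a random sample of units is taken from each chosen cluster.

```python
import random

def multistage_sample(clusters, clusters_to_pick, units_per_cluster, seed=None):
    """Stage 1: choose clusters at random. Stage 2: sample units within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), clusters_to_pick)
    sample = []
    for name in chosen:
        units = clusters[name]
        sample.extend(rng.sample(units, min(units_per_cluster, len(units))))
    return chosen, sample

# Hypothetical population: 5 towns with 10 households each.
towns = {"Town %d" % t: ["T%d-H%d" % (t, h) for h in range(1, 11)] for t in range(1, 6)}
chosen, sample = multistage_sample(towns, clusters_to_pick=2, units_per_cluster=3, seed=1)
print(chosen, sample)
```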
What is sampling?
When data is gathered from a sampling frame (a numbered list of all items in the population)
What is a grouped frequency distribution? Example
A table of data with various ranges identified and a frequency column indicating the number of items in each range
e.g. data = ages of 20 students
Age (range) | Number of students (Frequency)
19-21
Ungrouped frequency distribution:
Age | Frequency
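A minimal sketch of both kinds of table, assuming 20 made-up student ages and made-up ranges (neither comes from the course material):

```python
from collections import Counter

# Hypothetical data: ages of 20 students.
ages = [19, 20, 20, 21, 21, 21, 22, 22, 23, 23,
        24, 24, 24, 25, 25, 26, 26, 27, 28, 30]

def ungrouped_frequency(values):
    """Frequency of each individual value."""
    return dict(sorted(Counter(values).items()))

def grouped_frequency(values, ranges):
    """Frequency of values falling in each inclusive (low, high) range."""
    return {(low, high): sum(1 for v in values if low <= v <= high)
            for low, high in ranges}

ranges = [(19, 21), (22, 24), (25, 27), (28, 30)]  # assumed age ranges
grouped = grouped_frequency(ages, ranges)
ungrouped = ungrouped_frequency(ages)

for (low, high), freq in grouped.items():
    print("%d-%d | %d" % (low, high, freq))
```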
What are the different types of bar charts?
Simple bar chart
Multiple bar chart
Component bar chart
Percentage component bar chart
What do bar charts show?
Data is shown by bars of equal width, the height of which corresponds to the value of the data
What is a histogram?
A histogram is a frequency distribution where the frequency of occurrence is represented by the area of the bars, not the height of the bars.
In CIMA, intervals will be of equal width, allowing you to calculate the height of the bar using the formula
Height = frequency of the interval
Histogram with unequal widths - what to do?
You may see a histogram where the bars have unequal widths. This reflects the fact that the frequency is demonstrated by area, not height.
Height of bar = Frequency × Standard class width / Actual class width
For example, if the width of the block is one and a half times the standard width, we must divide the frequency by 1.5, i.e. multiply by 0.67 (1/1.5).
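A minimal sketch of the height adjustment, assuming a hypothetical class with frequency 30 that is 15 wide against a standard width of 10 (one and a half times the standard width):

```python
def bar_height(frequency, actual_width, standard_width):
    """Height of bar = frequency x standard class width / actual class width."""
    return frequency * standard_width / actual_width

# Hypothetical class: frequency 30, width 15, standard width 10.
height = bar_height(30, actual_width=15, standard_width=10)
print(height)  # 30 divided by 1.5
```

When the class has the standard width, the height is simply the frequency.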
What is an Ogive?
An ogive is a cumulative frequency curve which shows the total number of items less than a certain value
The ogive can be used to estimate the number of items below a certain value, e.g. the number of people receiving below a certain amount
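A minimal sketch of reading an estimate off an ogive, assuming made-up class boundaries and frequencies and assuming the plotted points are joined by straight lines (so the estimate is a linear interpolation):

```python
from itertools import accumulate

# Hypothetical grouped data: classes 0-10, 10-20, 20-30, 30-40.
upper_bounds = [10, 20, 30, 40]
frequencies = [5, 12, 8, 3]
cumulative = list(accumulate(frequencies))  # running totals plotted on the ogive

def estimate_below(value, lower_start, upper_bounds, cumulative):
    """Estimate how many items fall below value by interpolating along the ogive."""
    xs = [lower_start] + upper_bounds
    ys = [0] + cumulative
    if value <= xs[0]:
        return 0.0
    if value >= xs[-1]:
        return float(ys[-1])
    for i in range(1, len(xs)):
        if value <= xs[i]:
            fraction = (value - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + fraction * (ys[i] - ys[i - 1])

estimate = estimate_below(25, 0, upper_bounds, cumulative)
print(cumulative, estimate)
```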
What are the three characteristics of big data?
The 3 Vs: Volume, Velocity, Variety
V: Volume: More data than ever before is being collected.
-This is driven by social media, transactional-based data recorded by large organisations and internal systems
V: Velocity: refers to the speed at which data is being streamed into the organisation, in real time.
-Transactions may be constant, particularly for multi-national organisations and those with a significant ecommerce presence
V: Variety: Big data includes data from many different systems which will be in many different formats.
Structured data includes transaction files; unstructured data includes social media posts, emails etc
Making sense of these many different sources and formats may require significant investment
How is big data used?
-Companies that identified the potential of big data ahead of competitors have obtained competitive advantage.
-Big data can be used to identify and/or analyse opportunities to increase revenues and improve product offerings, or to reduce costs through process efficiencies
-Organisations can build a more complete picture of their customers, gained from multiple sources
-Big data can be used to identify/analyse opportunities to save costs (inc. improving logistics, reducing the cost of fraud)
-Big data is large and complex, so systems need to ensure information can be made available and acted upon quickly.
What is big data?
Big data involves capturing and processing data on a vast scale and converting it into information that is able to be utilised by the organisation.
Analysis of big data can uncover unexpected relationships and provide new insights into business performance.
What does big data include?
-Big data includes data collected from many previously separate systems. The data includes both structured data and unstructured data.
Unstructured data may be tagged using metadata
What is metadata?
Metadata is data that describes other data, e.g. what time a photo was taken.