Sources of data Flashcards
PRIMARY, SECONDARY DATA
Primary data - collected specifically for the purpose of the survey. You are aware of the possible shortfalls and limitations. It's expensive to collect and store.
Secondary data - collected for some other purpose. Must be ACCURATE and RELIABLE.
DISCRETE, CONTINUOUS DATA VARIABLES
Discrete - can take only a countable number of values (e.g. number of goals in a match, which can only be a whole number)
Continuous - can take any value within a range (e.g. height)
INTERNAL, EXTERNAL SOURCE
- INTERNAL
- Accounting records (cost, general, purchase, sales ledger)
- Data relating to personnel (payroll)
- Production department (ascribing costs to the physical information produced by the source)
- Service business - accounting, solicitors
- EXTERNAL
- Received from customers and suppliers
a) Primary SOURCE
- As close as you can get to the origin of the item of the data
b) Secondary SOURCE
- Second-hand data
SOURCES OF 2ND DATA
- Banks (money supply, government debt, financial transactions)
- Government (population data)
- Financial newspapers (forex, interest rates, gilts)
- Trade Journals (industry, competitors' products, industry costs and prices)
- Websites
PROS and CONS of 2nd DATA
PROS
- Cheap
- Large quantity
- Quick to obtain
- Great for analyzing the past and patterns
CONS
- Inadequacies, limitations
- Out of date
- Not relevant
- Incorrect
ECONOMIC ENVIRONMENT
Affecting firms at national and global level, both in general level of economic activity and in particular variables, such as exchange rates, interest rates, inflation.
MACROECONOMIC FACTORS THAT INFLUENCE COMPANIES
- Overall growth or fall in GDP - increased/decreased demand for goods/services. In an expansion we try to identify the demand. In a recession we focus on costs, competition, profitability.
- Local economic trends - type of industry, labour rate, house prices.
- Inflation - disrupts decision making, wages go up
- Interest rate - cost of borrowing, how much consumers can afford to spend
- Tax levels - how much profit to retain/give to shareholders, VAT - how much customers pay for the product
- Government spending - affects you if you are a supplier to the government
- The business cycle - expansion, contraction
3 Vs of BIG DATA
- Volume - scale of information
- Velocity (speed) - timeliness
- Variety - structured and unstructured
BIG DATA - overall
- Collection
- Analysis
- Make predictive models (only digitalized data can be used)
- Find trends, understand customers
- Focus resources more effectively to make better decisions - increase profit, decrease costs
DECISION MAKING AND BIG DATA
Decision based on big data made:
- Quicker
- More flexibility, respond earlier
- Based on current situation, can take potential future situations into account
- Hard data evidence that can be quantified
- Collaborative basis (data can be shared, presented)
- Higher probability of out of the box decisions (all factors taken into account)
BENEFITS OF BIG DATA ANALYTICS
- Analysis of vast quantities of data in a relatively short time
- Improving decision making
- Focus on individual customer
- Cost reduction
SAMPLING, CENSUS (pros and cons)
Data are often collected from a sample rather than the whole population. If the whole population is examined, it is a CENSUS.
Cons of census
- Costly
- Data may be out of date by the time the census is completed
Pros of sample
- Once a certain sample size is reached, very little accuracy is gained by examining more items
- Possible to ask more questions
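The diminishing gain in accuracy from larger samples can be illustrated with the standard error of a sample mean, which falls with the square root of the sample size. A minimal sketch, assuming a simple random sample from a large population; the standard deviation of 10.0 is a hypothetical figure chosen only for illustration.

```python
import math


def standard_error(population_sd: float, n: int) -> float:
    """Standard error of the sample mean for a simple random sample of size n."""
    return population_sd / math.sqrt(n)


# Hypothetical population standard deviation (illustration only).
SD = 10.0

for n in (25, 100, 400, 1600):
    print(n, standard_error(SD, n))
```

Quadrupling the sample size only halves the standard error, so each extra item examined adds less accuracy than the one before.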
CHOICE OF SAMPLE
Completeness
- Covering all areas of the population in the sample
- Lack of completeness = BIAS
(NON)-PROBABILITY SAMPLING METHOD
- Probability - known chance of each member of the population appearing in the sample.
- Non-probability - chance of each member of the population appearing in the sample is not known. Example: QUOTA sampling.
PROBABILITY SAMPLING METHOD
- Random
- Stratified random
- Systematic
- Multistage
- Cluster
RANDOM SAMPLING, SAMPLING FRAME
- Every item in the population has an equal chance of being included
- A random sample is not necessarily a perfect sample
- SAMPLING FRAME = numbered list of all items in the population
CONS
- Selected items are subject to the full range of variation in the population
- Unrepresentative sample is possible
- Costly
- Numbering the population is laborious
- Difficult to obtain data
- An adequate sampling frame might not exist
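Random sampling from a sampling frame can be sketched as follows. This is a minimal illustration, assuming the frame is simply the numbers 1 to 100; the function name `simple_random_sample` and the `seed` parameter are hypothetical, added only to make the draw repeatable.

```python
import random


def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw items without replacement; every item has an equal chance."""
    rng = random.Random(seed)
    return rng.sample(sampling_frame, sample_size)


# Hypothetical sampling frame: a numbered list of all 100 items in the population.
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=42)
print(sorted(sample))
```

Nothing stops the draw from landing mostly at one end of the frame, which is why an unrepresentative sample is possible.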
STRATIFIED RANDOM
Dividing population into strata or categories. Random samples are then taken from each stratum or category.
PROS
- Representative
- Reflecting population
- Conclusions about each stratum can be made
- Increased precision, less variation
CONS
- Prior knowledge of each item in the population is required
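Stratified random sampling can be sketched as: group the population by stratum, then take a random sample from each group in proportion to its size. The population of employees, the department names and the 10% sampling fraction below are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(items, stratum_of, fraction, seed=None):
    """Take a random sample of the given fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        # Requires knowing the stratum of every item in advance.
        strata[stratum_of(item)].append(item)
    sample = {}
    for name, members in strata.items():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, k)
    return sample


# Hypothetical population: (employee_id, department).
population = [(i, "sales") for i in range(60)]
population += [(i, "production") for i in range(60, 100)]

result = stratified_sample(population, lambda p: p[1], 0.1, seed=1)
print({name: len(members) for name, members in result.items()})
```

Because each stratum is sampled separately, the sample mirrors the 60/40 split of the population and supports conclusions about each department.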
SYSTEMATIC
Selecting every n-th item after a random start.
PROS
- Easy
- Cheap
CONS
- Biased sample
- Not completely random (some items have a zero chance of being selected)
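Systematic sampling can be sketched with a random start followed by a fixed step. The frame of 100 numbered items and the interval of 10 are hypothetical values for illustration.

```python
import random


def systematic_sample(sampling_frame, interval, seed=None):
    """Pick a random start within the first interval, then every n-th item."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start: 0 .. interval - 1
    return sampling_frame[start::interval]


# Hypothetical sampling frame of 100 numbered items.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=7)
print(sample)
```

Once the start is fixed, every item not on the step pattern has no chance of selection, and a regular pattern in the frame that coincides with the interval would bias the sample.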
MULTISTAGE
Dividing the population into a number of sub-populations and then selecting a small sample of these sub-populations at random.
PROs
- Fewer investigators are needed
- Not so costly
CONs
- Bias
- Not truly random (once the areas are chosen, the rest of the population cannot be in the sample). Selected areas should reflect the full range of diversity.
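A two-stage version of multistage sampling can be sketched as: pick a few areas at random, then pick items at random within each chosen area. The area names, household labels and sample sizes below are hypothetical.

```python
import random


def multistage_sample(areas, n_areas, n_per_area, seed=None):
    """Stage 1: choose areas at random. Stage 2: choose items within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(areas), n_areas)
    return {
        area: rng.sample(areas[area], min(n_per_area, len(areas[area])))
        for area in chosen
    }


# Hypothetical population: 8 areas of 50 households each.
areas = {
    f"area_{k}": [f"area_{k}_household_{i}" for i in range(50)]
    for k in range(8)
}

sample = multistage_sample(areas, n_areas=2, n_per_area=5, seed=3)
for area, households in sample.items():
    print(area, households)
```

Investigators only need to visit the two chosen areas, which is where the cost saving comes from; households in the other six areas can no longer be selected.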
CLUSTER
- Non-random
- Selecting one definable subsection of the population as the sample; that subsection is taken to be representative of the population.
PROs
- Alternative to multistage
- Inexpensive
CONs
- Bias
QUOTA SAMPLING
- Non-probability sampling method
- Interview people as they are met until the quota is filled
PROs
- Cheap, administratively easy
- Much larger sample can be studied
- No sampling frame required
- May be the only possible approach
- Yields sufficiently accurate information
CONs
- Bias
- Sampling error due to its non-random nature
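Quota sampling can be sketched as: take respondents in the order they are met and accept each one only while the quota for their category is still open. The age categories, quota sizes and list of passers-by are hypothetical.

```python
def quota_sample(respondents, category_of, quotas):
    """Accept respondents in arrival order until every quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in respondents:
        category = category_of(person)
        if remaining.get(category, 0) > 0:
            sample.append(person)
            remaining[category] -= 1
        if not any(remaining.values()):
            break  # all quotas met, stop interviewing
    return sample


# Hypothetical passers-by in the order they are met: (name, age group).
passers_by = [
    ("p1", "under_30"),
    ("p2", "under_30"),
    ("p3", "under_30"),
    ("p4", "30_plus"),
    ("p5", "30_plus"),
]

sample = quota_sample(passers_by, lambda p: p[1], {"under_30": 2, "30_plus": 1})
print(sample)
```

No sampling frame is needed, but who ends up in the sample depends entirely on who happens to be met first, which is the source of the bias.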