Lecture 2 REVISED Flashcards
what are common applications of statistics?
- predictive modelling
- pattern recognition
- anomaly detection
- classification
- sentiment analysis
what are common business use cases of statistics?
- customer analytics
- targeted advertising
- website personalisation
- risk management
- investment optimisation
- fraud detection
examples of challenges in statistics?
- varied/massive amounts of data
- varied types of data (structured, semi-structured and unstructured)
- eliminating bias
“numbers don’t lie”?
false: even when the numbers are correct, people and organisations with their own agendas may use them to mislead
selective presentation can skew the story and hide relevant facts
unethical uses of statistics
- biased sampling
- discarding data that doesn’t support your views
- discarding data without a justifiable reason
- using jargon
- deliberately using the wrong method of analysis
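To make the first item concrete, here is a minimal sketch of how biased sampling can distort a summary. The customer records, the loyalty-member flag and the helper `mean_score` are all invented for illustration; they are not from the lecture.

```python
from statistics import mean

# Hypothetical satisfaction scores (1-10) for a made-up population of customers.
# Each record is (score, is_loyalty_member); all values are invented.
population = [
    (9, True), (8, True), (9, True), (7, True),
    (4, False), (5, False), (3, False), (6, False), (5, False), (4, False),
]

def mean_score(records):
    """Average satisfaction score across the given records."""
    return mean(score for score, _ in records)

# Biased sample: only loyalty members are surveyed.
biased_sample = [record for record in population if record[1]]

# Comparison sample: every second record, regardless of membership.
every_second = population[::2]

population_mean = mean_score(population)
biased_mean = mean_score(biased_sample)
every_second_mean = mean_score(every_second)

print(population_mean, biased_mean, every_second_mean)
```

With this toy data the members-only sample reports a noticeably higher average than the full population, even though every individual number is correct.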
statistical enquiry cycle? (PPDAC)
- problem
- plan
- data
- analysis
- conclusion
primary/secondary data?
primary = data collected directly from the source
secondary = data previously collected by someone else
differences between data and information?
data = raw facts/figures, the input, meaningless until put in context
information = processed data with context, the output, meaningful and easier to understand
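A small sketch of the data/information distinction, assuming made-up sales figures. The labels in `context` and the helper `to_information` are hypothetical names used only for this example.

```python
from statistics import mean

# Data: raw figures with no context (values invented for illustration).
raw = [12, 15, 9, 20]

# Context that gives the figures meaning (hypothetical labels).
context = {
    "what": "units sold",
    "where": "Store A",
    "period": ["Mon", "Tue", "Wed", "Thu"],
}

def to_information(values, ctx):
    """Attach context to raw values and summarise them as a readable sentence."""
    best_day = ctx["period"][values.index(max(values))]
    return (
        f"{ctx['where']} averaged {mean(values)} {ctx['what']} per day; "
        f"the best day was {best_day} with {max(values)}."
    )

information = to_information(raw, context)
print(information)
```

The list `raw` on its own says nothing; the returned sentence is the same numbers turned into output a reader can act on.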
qualitative/quantitative data?
quantitative data = measures or counts, always numeric (interval/ratio scale)
qualitative data = names or labels used to identify an attribute (nominal/ordinal scale)
what does level of measurement determine?
the amount of information contained in the data, and therefore which summaries and analyses are appropriate
what are the 4 levels of measurement?
- nominal
- ordinal
- interval
- ratio
nominal data
- consists of labels/names used for identification
- can be numeric or non-numeric
- categories are in no logical order and have no particular relationship
ordinal data
- exhibits the properties of nominal data, and the categories can also be meaningfully ranked or ordered
interval data
represented by numbers; differences between values are meaningful, but there is no true 0 (e.g. temperature in °C)
ratio data
represented by numbers and has a true 0, so ratios between values are meaningful (e.g. weight, height)
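The four levels can be sketched by looking at which summaries make sense at each one. This is an illustration under assumptions, not a rule set from the lecture: the example values (eye colours, sizes, temperatures, weights) are invented.

```python
from statistics import mode

# Nominal: labels only, so the mode (most frequent label) is a sensible summary.
eye_colours = ["brown", "blue", "brown", "green"]
most_common = mode(eye_colours)

# Ordinal: labels with a rank order, so a median category is meaningful.
sizes_in_order = ["small", "medium", "large"]  # assumed ranking
observed = ["small", "large", "medium", "medium", "large"]
ranks = sorted(sizes_in_order.index(size) for size in observed)
median_size = sizes_in_order[ranks[len(ranks) // 2]]

# Interval: differences are meaningful, ratios are not, because 0 °C is arbitrary.
c_low, c_high = 10.0, 20.0
difference_c = c_high - c_low   # a 10-degree difference is meaningful
ratio_c = c_high / c_low        # 2.0, but 20 °C is not "twice as hot"
# The same temperatures in kelvin (which has a true 0) give a different ratio.
ratio_k = (c_high + 273.15) / (c_low + 273.15)

# Ratio: a true 0 means ratios survive a change of units.
kg_light, kg_heavy = 40.0, 80.0
ratio_kg = kg_heavy / kg_light
ratio_g = (kg_heavy * 1000) / (kg_light * 1000)

print(most_common, median_size, difference_c, ratio_c, round(ratio_k, 3), ratio_kg, ratio_g)
```

The Celsius ratio changes once the zero point moves, while the weight ratio stays the same in kilograms and grams; that is the practical meaning of a "true 0".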