Week 2: Nature and forms of data Flashcards
why is statistics relevant in business
Statistics plays an important role in virtually all aspects of business (e.g. strategy, marketing, operations, supply chain).
what are some common applications of statistics in business
Common applications of statistics include predictive modelling, pattern recognition, anomaly detection, classification, and sentiment analysis.
Data analysis cycle (statistical enquiry cycle)
problem: define the problem, the question, and the hypothesis
plan: study design and variables
data: collect and treat dataset
analysis: exploratory data analysis (EDA)
Modelling effort
Relating findings with context
Conclusion:
answer the question
present results and insights
new questions may emerge
Data science process
The data analysis process includes a set of activities that business analysts/data scientists perform to gather, prepare, and analyse data, and to present the results/findings to business users.
What are the two main categories into which data collection is typically divided, and why is data collected?
Data is collected for specific purposes.
In terms of how it is collected, data may be distinguished as primary or secondary.
what is primary data
Primary data refers to data collected directly from the data source without going through any existing sources (e.g. survey conducted by a researcher, answers of an online questionnaire).
what is secondary data
Secondary data consists of data previously collected and compiled by someone else (e.g. stock market index).
Data vs information
data:
raw facts or figures
meaningless and useless until it is organised and processed
usually difficult to understand
input is treated as data
information:
data with context
processed and meaningful form of data
comparatively easier to understand
output is treated as information
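As a small illustration of this distinction, the Python sketch below takes a list of raw figures (data) and processes it into a labelled summary (information). The sales figures and the names `daily_sales` and `summarise` are invented for illustration and are not part of the course material.

```python
from statistics import mean

# Hypothetical raw data: bare figures with no context attached.
daily_sales = [120, 135, 90, 160, 145]


def summarise(values, label):
    """Process raw values into a small labelled summary (information)."""
    return {
        "label": label,          # context that gives the figures meaning
        "count": len(values),    # how many observations were processed
        "total": sum(values),
        "average": mean(values),
    }


# Input is treated as data; output is treated as information.
info = summarise(daily_sales, "units sold per day")
print(info)
```

The input list on its own says little; the output dictionary attaches context and processed values that a business user can read directly.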
qualitative data
Qualitative data are names or labels used to identify an attribute of each element. They may be numeric or nonnumeric and use the nominal or ordinal scale.
quantitative data
Quantitative data represent measurements or counts. They are always numeric and use the interval or ratio scale.
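A minimal Python sketch of the difference, assuming a few invented customer records: the qualitative variables (`segment`, and the numeric but still qualitative `store_code`) are summarised by counting labels, while the quantitative variable (`spend`) supports arithmetic such as an average. All field names and values are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical customer records, invented for illustration.
customers = [
    {"store_code": 101, "segment": "retail", "spend": 250.0},
    {"store_code": 102, "segment": "wholesale", "spend": 900.0},
    {"store_code": 101, "segment": "retail", "spend": 310.0},
]

# Qualitative data: count how often each label occurs.
segment_counts = Counter(c["segment"] for c in customers)

# store_code is numeric but still qualitative: it only identifies a store,
# so counting is meaningful while averaging the codes would not be.
store_counts = Counter(c["store_code"] for c in customers)

# Quantitative data: arithmetic on the values is meaningful.
average_spend = mean(c["spend"] for c in customers)

print(segment_counts, store_counts, average_spend)
```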
what does the level of measurement determine
The level of measurement determines the amount of information contained in the data.
what does the level of measurement also indicate
The level of measurement also indicates the data summarisation and statistical analyses that are most appropriate.
what are the four levels of measurement
There are four levels of measurement: nominal, ordinal, interval, and ratio.
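One way to picture how the level of measurement guides the choice of summary is the Python sketch below. The mapping from each level to its summaries follows a common textbook convention and is an assumption here, not something stated on these cards; the names `LEVELS`, `NEW_AT_LEVEL`, and `appropriate_summaries` are hypothetical.

```python
# The four levels, ordered from least to most information.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Summaries that first become meaningful at each level
# (assumed textbook convention, for illustration only).
NEW_AT_LEVEL = {
    "nominal": {"counts", "mode"},
    "ordinal": {"median", "ranking"},
    "interval": {"mean", "differences"},
    "ratio": {"ratios"},
}


def appropriate_summaries(level):
    """Return the summaries usable at a level, including those of lower levels."""
    position = LEVELS.index(level)
    allowed = set()
    for name in LEVELS[: position + 1]:
        allowed |= NEW_AT_LEVEL[name]
    return allowed


print(sorted(appropriate_summaries("ordinal")))
```

Each higher level keeps everything available at the levels below it, which is one way to read the idea that the level determines the amount of information in the data.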
what does nominal data consist of
Nominal data consists of labels or names used for identification, and may be non-numeric or numeric.
information about nominal data
The categories are in no logical order and have no particular relationship. The categories are said to be mutually exclusive since an individual, object, or measurement can be included in only one category.
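The short Python sketch below illustrates these properties with a hypothetical payment-method variable that is stored as numeric codes. Because each observation falls into exactly one category, the category counts add up to the number of observations, and the most frequent category (the mode) is a sensible summary. The codes, labels, and observations are invented for illustration.

```python
from collections import Counter
from statistics import mode

# Hypothetical nominal variable: numeric codes that are only labels.
PAYMENT_LABELS = {1: "cash", 2: "card", 3: "voucher"}

# Invented observations, one code per transaction.
payments = [2, 1, 2, 3, 2, 1]

# Each observation belongs to exactly one category (mutually exclusive),
# so the category counts add up to the number of observations.
counts = Counter(PAYMENT_LABELS[code] for code in payments)
total_counted = sum(counts.values())

# The mode is meaningful for nominal data; the order of the codes is not.
most_common_label = PAYMENT_LABELS[mode(payments)]

print(counts, total_counted, most_common_label)
```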