Week 2 Flashcards

1
Q

Why is statistics relevant in business?

A

Statistics plays an important role in virtually all aspects of
business (e.g. strategy, marketing, operations, supply
chain).

2
Q

What do statistics provide?

A

Statistics provides, for example, information about customers
that helps companies create stronger marketing campaigns
and targeted advertising to increase product sales.

3
Q

What does statistics aid in?

A
  • Statistics aids in managing financial risks, detecting fraudulent
    transactions, and preventing equipment breakdowns in
    manufacturing plants, among others.
4
Q

What are the common applications of statistics?

A

Common applications of statistics include predictive modeling,
pattern recognition, anomaly detection, classification, and
sentiment analysis.

5
Q

What are the business use cases?

A

Business use cases include, but are not limited to, the following:
* Customer analytics
* Targeted advertising
* Website personalisation
* Risk management
* Investment/trading optimisation
* Fraud detection
* Predictive maintenance
* Logistics and supply chain management

6
Q

What are examples of statistical software?

A

Examples of statistical software include jamovi, Stata, EViews,
Minitab, and SPSS.

7
Q

What are examples of data wrangling and modelling packages?

A

Examples of data wrangling and modelling packages/libraries
include tidyr in R and NumPy in Python.

8
Q

What are examples of data visualisation tools?

A

Examples of data visualisation tools and packages/libraries
include ggplot2 in R and Matplotlib in Python.

9
Q

Why are statistical analyses inherently challenging?

A
  • Statistical analyses are inherently challenging because of the
    advanced nature of the analytical process within a particular
    business context or problem.
10
Q

What does the varied and massive amount of data add to?

A
  • The varied and massive amount of data adds to the complexity
    and increases the time it takes to complete projects.
11
Q

What are the challenges of large data sets?

A
  • Working with large datasets that may contain a variety of
    structured, unstructured and semi-structured data further
    complicates the statistical data analysis process.
12
Q

What is one of the biggest challenges?

A

One of the biggest challenges is eliminating bias in datasets
and analytical applications.

13
Q

What can biases result in if they are not identified?

A

Biases may skew results if they are not identified
and addressed, creating flawed findings that lead to
misguided decisions.

14
Q

What kind of impact can biases have?

A

Biases may have a harmful impact on groups of people,
for example in the case of gender or racial bias in AI
systems.

15
Q

What is an additional challenging task when analysing data?

A

Finding the right modelling approach and/or
appropriate data to analyse is an additional challenging
task.

16
Q

How can correct numbers and data be misleading?

A

Even when numbers and data are correct, people and
organisations with their own agendas may use them to
mislead by not telling the whole story or by hiding
relevant facts.

17
Q

Why must we evaluate statistics critically?

A

All types of information (e.g. statistics/data visualisation)
may be, intentionally or unintentionally, misleading.
That's why you must evaluate them critically.

18
Q

Why are data and numbers powerful tools?

A

Data and numbers are powerful tools for building
arguments: they add credibility and may help prove a
particular point.

19
Q

What can data provide?

A

Data can provide insights into the world around us and
help address related problems.

20
Q

What are some examples of unethical use of statistics?

A

Unethical use of statistics:
* Biased sampling.
* Throwing out data that does not support your views.
* Throwing out data without a valid statistical reason.
* Presenting findings and analysis in terminology designed to
confuse or overwhelm its audience.
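
The effect of biased sampling can be sketched in Python. This is only an illustration: the population of satisfaction scores, the fixed seed, and the cut-off of 7 are all assumptions and are not part of the course material.

```python
import random
from statistics import mean

# Hypothetical population: satisfaction scores (1-10) for 1,000 customers.
rng = random.Random(42)  # fixed seed so the sketch is reproducible
population = [rng.randint(1, 10) for _ in range(1000)]

# Random sample: every customer has the same chance of being selected.
random_sample = rng.sample(population, 100)

# Biased sample: only customers who scored 7 or higher are surveyed.
biased_sample = [score for score in population if score >= 7][:100]

population_mean = mean(population)
random_mean = mean(random_sample)
biased_mean = mean(biased_sample)

print(f"population: {population_mean:.2f}")
print(f"random sample: {random_mean:.2f}")
print(f"biased sample: {biased_mean:.2f}")
```

Because the biased sample excludes every score below 7, its mean cannot fall below 7, whatever the true population mean is.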

21
Q

What is the statistical enquiry cycle?

A

Problem, Plan, Data, Analysis, Conclusion

22
Q

What does the data analysis process include?

A

The data analysis process includes a set of activities that business
analysts/data scientists perform to gather, prepare, and analyse data, and
to present the results/findings to business users.

23
Q

Why is data collected?

A

Data are collected for specific purposes.

24
Q

How can data be distinguished?

A

In terms of data collection, data may be distinguished
as primary or secondary.

25
Q

What does primary data refer to?

A

Primary data refers to data collected directly from the
data source without going through any existing sources
(e.g. survey conducted by a researcher, answers of an
online questionnaire).

26
Q

What does secondary data consist of?

A

Secondary data consists of data previously collected
and compiled by someone else (e.g. stock market
index).

27
Q

What may data be classified as?

A
  • In terms of data type, data may be classified as
    quantitative and qualitative.
28
Q

What does the statistical analysis depend on?

A

The statistical analysis that is appropriate for a particular
variable depends on whether the data for the variable are
qualitative or quantitative.

29
Q

What are qualitative data?

A

Qualitative data are names or labels used to identify an
attribute of each element.
They may be numeric or non-numeric and use the nominal or
ordinal scale.

30
Q

What does quantitative data represent?

A

Quantitative data represent measurements or counts.
They are always numeric and use the interval or ratio scale.
Arithmetic operations are meaningful only with
quantitative data.

31
Q

What does level of measurement determine?

A

The level of measurement determines the amount of
information contained in the data.

32
Q

What does level of measurement also indicate?

A

The level of measurement also indicates the data
summarisation and statistical analyses that are most
appropriate.
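
One way to picture this is a small Python helper that picks a summary statistic by level of measurement. It is a sketch under assumptions: the function name `summarise`, the rating codes, and the sample values are hypothetical, and the mode/median/mean pairing is a common rule of thumb rather than something stated in these cards.

```python
from statistics import mean, median_low, mode

def summarise(values, level):
    """Return a (statistic name, value) pair suited to the level of measurement."""
    if level == "nominal":
        # Labels have no order, so only the most frequent category makes sense.
        return "mode", mode(values)
    if level == "ordinal":
        # Values are rank codes; median_low returns an actual observed rank.
        return "median", median_low(values)
    if level in ("interval", "ratio"):
        # Distances between values are uniform, so averaging is meaningful.
        return "mean", mean(values)
    raise ValueError(f"unknown level of measurement: {level}")

# Hypothetical data for illustration only.
departments = ["sales", "operations", "sales"]   # nominal
ratings = [3, 4, 2, 3, 1]                        # ordinal: 1 = poor ... 4 = excellent
temperatures_c = [20.0, 22.0, 24.0]              # interval

print(summarise(departments, "nominal"))
print(summarise(ratings, "ordinal"))
print(summarise(temperatures_c, "interval"))
```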

33
Q

How many levels of measurement are there?

A

There are four levels of measurement: nominal, ordinal,
interval, and ratio.

34
Q

What does nominal data consist of?

A

Nominal data consists of labels or names used for
identification, which may be non-numeric or numeric.
The categories are in no logical order and have no particular
relationship. The categories are said to be mutually exclusive
since an individual, object, or measurement can be included in
only one category.

35
Q

What is ordinal data?

A
  • Ordinal data exhibits properties of nominal data and may be
    rank-ordered.
    Thus, values in one category are larger or smaller than values
    in other categories (e.g. rating: excellent, good, fair, poor).
36
Q

What is interval data?

A

Interval data have the properties of ordinal data but also
show uniform distances between successive values.

37
Q

What is ratio data?

A

Ratio data have all the properties of interval data and the
ratio of two values is meaningful.
Scale must have a natural zero point (i.e. there is a
nonarbitrary zero point).
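
The difference between interval and ratio data can be illustrated with temperature, an example that is not taken from these cards: Celsius has an arbitrary zero (interval scale), while kelvin starts at absolute zero (ratio scale). A minimal Python sketch with made-up values:

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to kelvin (zero at absolute zero)."""
    return celsius + 273.15

# Interval scale: differences are meaningful, ratios are not.
cool_c, warm_c = 10.0, 20.0
difference_c = warm_c - cool_c   # a 10-degree difference is meaningful
naive_ratio = warm_c / cool_c    # 2.0, but 20 °C is not "twice as hot" as 10 °C

# Ratio scale: kelvin has a non-arbitrary zero, so the ratio is meaningful.
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(f"Celsius ratio: {naive_ratio:.2f}")
print(f"Kelvin ratio:  {kelvin_ratio:.4f}")
```

The Celsius ratio changes if the zero point moves (for example when converting to Fahrenheit), which is why ratios are only meaningful on a scale with a natural zero.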

38
Q

Why care about level of measurement?

A
  • Measurement levels of most variables are not inherently fixed.
  • The higher the level of measurement, the more precise the
    data is, but a precise measure does not ensure accuracy.
  • You can always collapse a variable into fewer, more general
    categories, but you cannot recover more specific categories
    from more general ones.
  • For example, different education classes can be collapsed
    into degree/non-degree.
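
Collapsing categories can be sketched in Python as follows. The education labels and the choice of which ones count as a degree are assumptions made for illustration.

```python
# Hypothetical education categories; which labels count as a degree is an assumption.
DEGREE_LEVELS = {"bachelor", "master", "doctorate"}

def collapse_education(level):
    """Collapse a detailed education class into degree / non-degree."""
    return "degree" if level in DEGREE_LEVELS else "non-degree"

detailed = ["high school", "bachelor", "diploma", "master", "doctorate"]
collapsed = [collapse_education(level) for level in detailed]

# The reverse is impossible: "degree" alone cannot tell bachelor from master.
print(collapsed)
```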
39
Q

What is it critical to ask about every dataset?

A
  • With every dataset, it is critical to ask whether the contents of
    the dataset are understood and well-motivated:
    – How and why were the specific cases in this dataset selected?
    – Is there information about the variables that we care about?
    – Why are the variables measured in the way that they are?
    – Are they good measures of the concept?
40
Q

What does big data refer to?

A

Big data refers to the large and diverse sets of information
that grow at ever-increasing rates.

41
Q

What are the three V’s of big data?

A

Three V’s of Big Data: The volume of information, velocity
(or speed) at which data are created and collected, and
the variety of data available.

42
Q

Where does big data often come from?

A

Big data often comes from data mining and arrives in
multiple formats.

43
Q

How can big data be characterised?

A

Big data is a great quantity of diverse information that
arrives in increasing volumes and with ever-higher
velocity.

44
Q

How is big data structured?

A

Big data may be structured (often numeric, easily
formatted and stored) or unstructured (more free-form,
less quantifiable).

45
Q

Where is big data most often stored?

A

Big data is most often stored in computer databases and
analysed using software specifically designed to handle
large, complex data sets.

46
Q

Where can big data be collected from?

A

Big data can be collected from publicly shared comments
on social networks and websites, voluntarily gathered
from personal electronics and apps, through
questionnaires, product purchases, and electronic check-ins, among others.