Intro to Data and Data Science Flashcards

1
Q

What is Analysis?

1.2

A

‘how’ and ‘why’ something happened

performed on past data

2
Q

What are Analytics?

1.2

A

Analytics apply logical reasoning to info obtained from analysis

Explores the future and looks for patterns

2 types:
Qualitative and
Quantitative

3
Q

What are Qualitative Analytics?

1.2

A

The use of:
intuition
experience and
analysis

to plan the next business move

4
Q

What are Quantitative analytics?

1.2

A

The application of formulas and algorithms to numbers gathered from analysis

5
Q

What is Business Intelligence?

1.4

A

Process of analysing and reporting historical business data

Preliminary step to predictive analytics

6
Q

What is Machine Learning?

1.4

A

Ability of machines to predict outcomes without being explicitly programmed to do so

The machines use data to:

  • Make predictions
  • analyse patterns
  • give recommendations
7
Q

What are advanced analytics?

1.4

A

all types of analytic processes

8
Q

Symbolic reasoning is a type of AI that is an exception: it does not use ML or deep learning.
It is based on high-level human-readable representations of problems and logic.

True or False:
Symbolic reasoning is commonly used in practice

1.4

A

False:

Very rarely used in practice.

9
Q

What are the 5 primary columns on the 365 infographic?

1.5

A
traditional data
big data
business intelligence
Applying traditional data science techniques
Using ML techniques
10
Q

What is “Data”?

2.0

A

information stored in a digital format

used for:

a) analysis
b) decision making

2 Types:

a) Traditional
b) Big Data

11
Q

What is traditional data?

2.0

A

Data in the form of tables containing numeric or text values;

Data that is structured and stored in databases

12
Q

What is big data?

2.0

A

Extremely large data;

It can be in various formats:

  • structured
  • semi-structured
  • unstructured

often characterized by the ‘Vs’ (volume, variety, velocity, etc.)

13
Q

What is Data Science?

2.0

A

an interdisciplinary field that combines:

statistical,

mathematical,

programming,

problem-solving, and

data-management tools.

14
Q

What are Traditional Methods?

2.0

A

derived from stats and adapted for business

15
Q

What is Raw Data?

4.1

A

AKA Primary Data

  • cannot be analysed immediately
  • accumulated but not yet organized; gathering it is called data collection
16
Q

What is Class labelling?

4.1

A

Labelling each data point with the correct data type

17
Q

What is data cleansing?

4.1

A

AKA Data Scrubbing

  • Deals with inconsistent data, e.g. data containing typos or missing info
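
A minimal sketch of data cleansing, assuming a few made-up customer records. The names `records`, `KNOWN_FIXES`, and `cleanse` are hypothetical and chosen only for illustration; real cleansing rules depend on the data set.

```python
# Hypothetical raw records: inconsistent spelling and a missing value.
records = [
    {"name": "Ann", "country": "germany"},
    {"name": "Bob", "country": "Germnay"},   # typo
    {"name": "Cid", "country": None},        # missing info
]

# Assumed lookup of known spellings -> canonical value.
KNOWN_FIXES = {"germnay": "Germany", "germany": "Germany"}

def cleanse(rows):
    """Return copies of the rows with typos fixed and missing values flagged."""
    cleaned = []
    for row in rows:
        country = row["country"]
        if country is None:
            country = "UNKNOWN"  # flag missing info explicitly
        else:
            key = country.strip().lower()
            country = KNOWN_FIXES.get(key, country.strip())
        cleaned.append({**row, "country": country})
    return cleaned

cleaned = cleanse(records)
```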
18
Q

What is data balancing?

4.1

A

Ensuring the sample gives equal priority to each class
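
One possible way to balance a sample is undersampling: keep only as many rows per class as the rarest class has. This is a sketch under that assumption (other approaches exist); `balance` and the `survey` data are hypothetical.

```python
import random
from collections import defaultdict

def balance(rows, label_key, seed=0):
    """Undersample so every class appears as often as the rarest class."""
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)
    smallest = min(len(group) for group in by_class.values())
    rng = random.Random(seed)  # fixed seed for reproducibility
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, smallest))
    return balanced

# Made-up survey with 8 respondents of one class and 2 of the other.
survey = [{"gender": "F"}] * 8 + [{"gender": "M"}] * 2
balanced = balance(survey, "gender")
```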

19
Q

What is Data Shuffling?

4.1

A

Shuffles data to ensure data is free from unwanted patterns from collection
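
A small sketch of shuffling, assuming the observations were collected in a fixed order (for example by date). The variable names are illustrative only.

```python
import random

# Hypothetical observations, stored in collection order.
observations = list(range(10))

rng = random.Random(42)      # fixed seed so the shuffle is reproducible
shuffled = observations[:]   # copy, so the original order stays intact
rng.shuffle(shuffled)        # in-place random reordering of the copy
```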

20
Q

What is a numerical variable?

4.2

A

Numbers that can be manipulated and that provide useful information

21
Q

What is a categorical variable?

4.2

A

Values (including numbers) that carry no numerical meaning.

Dates are also considered categorical

22
Q

What is text data mining?

4.3

A

The process of deriving valuable, unstructured data from text.

23
Q

What is data masking?

4.3

A

data masking conceals the original data with random and false data,

allows you to conduct analysis and keep confidential information in a secure place.
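
A minimal sketch of masking, assuming the confidential field is a customer name. Each distinct name is replaced by a random token, so analysis per customer still works. `mask_column` and the `purchases` data are hypothetical.

```python
import secrets

def mask_column(rows, key):
    """Replace each distinct value in `key` with a random, unique token."""
    mapping = {}
    masked = []
    for row in rows:
        original = row[key]
        if original not in mapping:
            token = "user_" + secrets.token_hex(4)
            while token in mapping.values():  # avoid rare collisions
                token = "user_" + secrets.token_hex(4)
            mapping[original] = token
        masked.append({**row, key: mapping[original]})
    return masked

purchases = [
    {"customer": "Alice Smith", "amount": 30},
    {"customer": "Bob Jones", "amount": 45},
    {"customer": "Alice Smith", "amount": 12},
]
masked = mask_column(purchases, "customer")
```

The same customer always maps to the same token, while the real names never appear in `masked`.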

24
Q

What is a metric?

4.5

A

a value derived from obtained measures

aims at gauging business performance/progress (has business meaning)

25
Q

What is a measure?

4.5

A

simple stats of past performance (no business meaning)

26
Q

What is a KPI?

4.5

A

Key Performance indicator

metrics + business objective
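
The measure, metric, and KPI cards can be tied together in a tiny sketch. All numbers and the target are made up for illustration.

```python
# Measures: simple stats of past performance (hypothetical numbers).
visitors = 4000
purchases = 200

# Metric: derived from measures, carries business meaning.
conversion_rate = purchases / visitors

# KPI: a metric tied to a business objective (assumed target).
TARGET_CONVERSION = 0.04
kpi_met = conversion_rate >= TARGET_CONVERSION
```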

27
Q

What is clustering?

4.7

A

grouping the data in neighbourhoods to analyse meaningful patterns
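
As an illustration, here is a tiny k-means sketch on one-dimensional data. K-means is only one clustering technique, chosen here as an assumption; `kmeans_1d` and the `ages` data are hypothetical.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Tiny k-means on 1-D data: assign to nearest centroid, then re-average."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ended up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

ages = [21, 23, 25, 61, 64, 67]
centroids, clusters = kmeans_1d(ages, centroids=[20.0, 70.0])
```

With these starting centroids, the ages split into a younger and an older neighbourhood.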

28
Q

What is a time series?

4.7

A

used in economics and finance

shows the development of certain values over time (e.g. stock prices, sales volume)

29
Q

What is a model in machine learning?

4.9

A

an algorithm to recognize certain patterns

30
Q

What is an objective function?

4.9

A

The specification of a machine learning problem;

a function to be maximized or minimized depending on the task

31
Q

What is an optimization algorithm?

4.9

A

Algorithm that compares previous solutions until reaching the optimal solution
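
A sketch of how an objective function and an optimization algorithm fit together, assuming gradient descent as the optimizer and mean squared error as the objective. Both are common choices but are assumptions here; the data is made up.

```python
# Hypothetical training data following y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

def objective(w):
    """Mean squared error of the model y = w * x (to be minimized)."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    """Derivative of the objective with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

w = 0.0               # initial guess
learning_rate = 0.05
for step in range(200):
    w -= learning_rate * gradient(w)   # step against the gradient
```

Each step improves on the previous solution, so `w` approaches 2 and the objective approaches 0.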

32
Q

What are the three main types of machine learning?

A

Supervised

Unsupervised

Reinforcement

33
Q

What is supervised learning?

4.10

A

The algorithm receives feedback on whether it did ‘good’ or needs to improve

Uses labelled data
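
A minimal supervised-learning sketch, assuming a 1-nearest-neighbour classifier (one simple technique among many). The labelled heights are made up, and `predict` is a hypothetical helper.

```python
# Labelled training data: (height_cm, label). Values are made up.
training = [(150, "child"), (155, "child"), (175, "adult"), (182, "adult")]

def predict(height):
    """1-nearest-neighbour: copy the label of the closest training example."""
    _, label = min(training, key=lambda pair: abs(pair[0] - height))
    return label

# Feedback: compare predictions with the known labels of held-out examples.
held_out = [(152, "child"), (178, "adult")]
accuracy = sum(predict(h) == label for h, label in held_out) / len(held_out)
```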

34
Q

What is unsupervised learning?

4.10

A

In this case, the algorithm trains itself

algorithm uses unlabelled data

35
Q

What is reinforcement learning?

4.10

A

A reward system is introduced.

maximize a reward (not minimize an error)
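
A deliberately simplified sketch of reward maximization: the agent tries each action once, then keeps choosing the action with the highest observed reward. Real reinforcement learning usually involves states and uncertain rewards; `REWARDS` and `run_agent` are hypothetical.

```python
# Hypothetical environment: each action yields a fixed reward.
REWARDS = {"left": 1.0, "right": 5.0}

def run_agent(steps=20):
    """Try every action once, then keep choosing the best-known action."""
    estimates = {}
    total_reward = 0.0
    for _ in range(steps):
        untried = [a for a in REWARDS if a not in estimates]
        if untried:
            action = untried[0]                          # explore
        else:
            action = max(estimates, key=estimates.get)   # exploit
        reward = REWARDS[action]
        estimates[action] = reward   # remember the observed reward
        total_reward += reward
    return estimates, total_reward

estimates, total_reward = run_agent()
```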

36
Q

What is deep learning?

4.10

A

modern state-of-the-art approach to machine learning

– leverages the power of neural networks

can be both supervised and unsupervised

37
Q

Python and R have their limitations. They are not able to address problems specific to some domains. One example is ‘relational database management systems’. In these instances, ______ works best

5.

A

SQL
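
To illustrate, here is a sketch that runs a SQL query from Python using the standard-library `sqlite3` module. The `sales` table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway in-memory database
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 100.0), ("South", 250.0), ("North", 50.0)],
)

# SQL states the aggregation declaratively; the database does the work.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()
```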

38
Q

Data architect

6

A

designs the way data will be retrieved, processed, and consumed

39
Q

Data engineer

6

A

processes the data for analysis

40
Q

Database administrator

6

A

handles the control of data; works with traditional data

41
Q

BI analyst

6

A

performs analyses and reporting of historical data

42
Q

BI consultant

6

A

– ‘external BI analyst’

43
Q

BI developer

6

A

performs analyses specifically designed for the company

44
Q

Data scientist

6

A

employs traditional statistical methods or unconventional machine learning techniques for making predictions

45
Q

Data analyst

6

A

prepares advanced analyses

46
Q

Machine learning engineer

6

A

applies state-of-the-art ML techniques

47
Q

200,000 lines of data constitute big data – TRUE or FALSE?

A

FALSE

It is not just volume that defines a data set as ‘big’; variety, variability, velocity, veracity and other characteristics play an important role as well

48
Q

Qualitative analyses such as SWOT are not used for quantitative analysis. Hence, they are not part of business intelligence – TRUE or FALSE?

A

False