Lecture 7 Flashcards

1
Q

Transaction processing system (TPS)

A

A system that records all of an organisation's transactions (i.e. its fundamental operations) and saves them in a database.

2
Q

What are the two main types of TPS?

A
  1. Batch processing: puts transactions into temporary storage and then processes them all together (in a batch) at a specific time.
    Advantage: more efficient, as all transactions can be processed when computing resources are less busy.
    Disadvantage: the database does not reflect the current state of the business until the batch has run.
  2. Online transaction processing (OLTP): all transactions are processed immediately, in real time.
    Advantage: the current state of the business is always reflected in the database.
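
The difference between the two types can be sketched in a few lines. This is only an illustration under simple assumptions: the "database" is a plain dict of account balances, and the class and method names (`BatchTPS`, `OnlineTPS`, `record`, `run_batch`) are hypothetical, not taken from the lecture.

```python
# Hypothetical illustration: the "database" is just a dict of account balances.
class BatchTPS:
    def __init__(self, database):
        self.database = database
        self.pending = []  # temporary storage for unprocessed transactions

    def record(self, account, amount):
        # The transaction is stored but not yet applied to the database.
        self.pending.append((account, amount))

    def run_batch(self):
        # Process everything together, e.g. when resources are less busy.
        for account, amount in self.pending:
            self.database[account] = self.database.get(account, 0) + amount
        self.pending.clear()


class OnlineTPS:
    def __init__(self, database):
        self.database = database

    def record(self, account, amount):
        # Applied immediately, so the database is always current.
        self.database[account] = self.database.get(account, 0) + amount


batch_db, online_db = {}, {}
batch, online = BatchTPS(batch_db), OnlineTPS(online_db)
for tps in (batch, online):
    tps.record("alice", 100)
    tps.record("alice", -30)

before_batch = dict(batch_db)  # snapshot taken before the batch run
batch.run_batch()
```

Before `run_batch()` is called, the batch database is still empty while the online database already shows the balance. That is exactly the trade-off described in the card.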
3
Q

What are enterprise systems?

A

They aim to consolidate the data that is collected and processed in the various departments of the company.

These systems provide only the interfaces, not the underlying infrastructure (they are all about the front end).

4
Q

What is an Enterprise Resource Planning System (ERP)?

A

The best-known type of enterprise system; it sits at the core of the enterprise.

Integrates the core functions of the company into a homogeneous system.

Focus: to allocate resources to specific departments

Smaller companies often pick smaller, more customisable systems.

5
Q

Who are the leading ERP vendors, and what kind of market do they operate in?

A

SAP and Oracle

The market is fragmented, which means neither of them has a very high market share.

6
Q

What is Customer Relationship Management (CRM)?

A

Integrates customer data to be used by various departments.

Interface to the end customer.

7
Q

What is Supply Chain Management (SCM)?

A

Provides a holistic overview of the value chain and focuses on managing the company's inventory.

8
Q

What is a Data Warehouse?

A

It collects and stores data from several different transactional systems in an organisation.

The data is consolidated, formatted and cannot be altered once it is there –> standardisation.

Provides tools for querying, reporting, analysis which helps to make sense of the data.

9
Q

What is a Data Mart?

A

A Data Mart contains specific (focused) data from the data warehouse, and possibly third-party external data, to help particular users solve a particular problem.

10
Q

What is Business Intelligence?

A

It is the output side of data. Responsible for producing information and outputs from the data which can then be used to make decisions.

“Refers to tools for consolidating, analysing, and accessing data to support organisational decision-making”

11
Q

What is online analytical processing (OLAP)?

A

Aggregates and summarises statistics/operations. The results of this analysis are stored in a data cube.

12
Q

What is a data cube? Why are they useful?

A

It stores results of OLAP analysis. Updated every time a new transaction is made.

Running a query on a data cube gives a much quicker response time than running it on the original database, since less data needs to be analysed.

Multidimensional
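
As a rough sketch of the idea, the snippet below pre-aggregates a handful of made-up sales transactions into a three-dimensional cube (region, product, quarter) and then answers queries from the cube instead of the raw rows. The data and the helper names `build_cube` and `slice_total` are assumptions for illustration only; real OLAP tools are far more elaborate.

```python
from collections import defaultdict

# Hypothetical sales transactions: (region, product, quarter, revenue)
transactions = [
    ("North", "Laptop", "Q1", 1200),
    ("North", "Laptop", "Q1", 800),
    ("North", "Phone", "Q2", 500),
    ("South", "Laptop", "Q1", 1000),
    ("South", "Phone", "Q2", 700),
]


def build_cube(rows):
    """Pre-aggregate revenue for every cell (region, product, quarter)."""
    cube = defaultdict(int)
    for region, product, quarter, revenue in rows:
        cube[(region, product, quarter)] += revenue
    return dict(cube)


def slice_total(cube, region=None, product=None, quarter=None):
    """Sum the cells matching the given dimension values; None means 'all'."""
    wanted = (region, product, quarter)
    return sum(
        value
        for key, value in cube.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    )


cube = build_cube(transactions)
north_total = slice_total(cube, region="North")
laptop_q1 = slice_total(cube, product="Laptop", quarter="Q1")
```

The cube here holds four aggregated cells for five transactions. With realistic data volumes the reduction is much larger, which is where the quicker response time comes from.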

13
Q

What is data mining, and what are the three basic patterns it uncovers?

A

It is the use of specific algorithms to identify hidden patterns in large sets of data.

The three basic patterns uncovered through data mining:

  1. Associations
  2. Clustering
  3. Sequential relationships (time series)
14
Q

What are Associations?

A

Certain attribute values that frequently occur together within a data set.

15
Q

What is a standard application of association analysis?

A

Market basket analysis

16
Q

What is association rule mining, and what are its two central concepts?

A

Seeks to identify the most common affinities among items.

  1. Support s(X): the fraction of all transactions that contain a certain set of items X.
  2. Confidence c(X –> Y): the fraction of the transactions containing X that also contain Y.

An association is strong when there is both high confidence and high support.
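
The two measures can be computed directly from their definitions. The sketch below uses four made-up market baskets; the item names and the helper functions `support` and `confidence` are illustrative assumptions, and confidence is computed as s(X ∪ Y) / s(X), which is equivalent to the definition in the card.

```python
# Hypothetical market baskets; each transaction is a set of items.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(x, y, transactions):
    """c(X -> Y) = s(X and Y together) / s(X). Assumes s(X) > 0."""
    return support(set(x) | set(y), transactions) / support(x, transactions)


s_bread = support({"bread"}, baskets)                    # 3 of 4 baskets
s_bread_butter = support({"bread", "butter"}, baskets)   # 2 of 4 baskets
c_bread_butter = confidence({"bread"}, {"butter"}, baskets)
```

Here bread appears in three of four baskets and bread with butter in two, so the rule bread –> butter has support 0.5 and confidence 2/3.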

17
Q

What is clustering?

A

Clustering seeks to identify natural groupings in data. The optimum number of clusters is unknown in advance, and the clusters require interpretation.
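
One common clustering algorithm is k-means, which the card does not name; it is used here purely as an example. The sketch below is a minimal one-dimensional version under stated assumptions: the number of clusters and the starting centroids are picked by the caller (in practice this choice is part of the problem), and the spend figures are made up.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Tiny k-means for one-dimensional data; k = len(centroids)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customer spend figures with two visible groups.
spend = [10, 12, 11, 95, 100, 105]
centroids, clusters = kmeans_1d(spend, centroids=[0, 50])
```

The algorithm returns the groups but not their meaning. Labelling them as, say, "low spenders" and "high spenders" is the interpretation step mentioned in the card.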

18
Q

What are the four Vs of big data?

A

Velocity = the speed at which new data is coming in to be processed

Variety = the kind of data and variety of formats that it comes in

Volume = the amount of data that needs to be processed

Veracity = the reliability of the data

19
Q

Analytics applies traditional statistical methods and AI to derive actionable insights from big data. What is an example of such a method?

A

Neural networks: trained using huge historical data sets on the outcome of interest and other variables.
–> black box method

20
Q

What is a black box method?

A

A method whose internal workings are hard to interpret: it is extremely hard to quantify the impact of a particular input variable on the outcome. Neural networks are usually this type of method.

21
Q

What is a database management system (DBMS)?

A

It stores and retrieves the data that an application creates and uses.

Different enterprise systems can share a DBMS to share common data –> it is located between the operating system and the application level.

Improves efficiency since data is all in one place
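
A minimal sketch of two applications sharing one DBMS, using Python's built-in `sqlite3` module with an in-memory database. The table layout and the function names `crm_add_customer` and `erp_lookup_customer` are hypothetical stand-ins for two enterprise systems, not part of the lecture.

```python
import sqlite3

# One shared database; both "applications" below talk to it through the DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")


def crm_add_customer(db, name):
    """Hypothetical CRM-side function: stores a new customer."""
    cur = db.execute("INSERT INTO customers (name) VALUES (?)", (name,))
    db.commit()
    return cur.lastrowid


def erp_lookup_customer(db, customer_id):
    """Hypothetical ERP-side function: reads the same customer record."""
    row = db.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


new_id = crm_add_customer(conn, "Acme Ltd")
name_seen_by_erp = erp_lookup_customer(conn, new_id)
```

Neither function keeps its own copy of the customer data. Both go through the same DBMS, so a record written by one is immediately visible to the other.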

22
Q

What is Hadoop?

A

It is open-source software used for the storage and analysis of big data sets.

Allows distributed computing, i.e. the data is split into multiple parts that run on different machines, and the results are then put back together again.

Uses the MapReduce framework to process data.
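
The MapReduce idea can be imitated on a single machine with a word count. The sketch below is plain Python and does not use Hadoop's actual API; the function names `map_phase`, `shuffle` and `reduce_phase` and the two text chunks are assumptions for illustration. In a real cluster each chunk would be mapped on a different machine.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of the data."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in grouped.items()}


# The data set is split into parts, each of which could be mapped independently.
chunks = ["big data big", "data sets"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(shuffle(mapped))
```

The result is the same as counting over the whole text at once, but every map call only ever sees its own chunk.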

23
Q

What are the four advantages of Hadoop?

A
  1. Flexibility: it can handle any kind of data from any source.
  2. Scalability: it can run on a single PC or be scaled efficiently to work on hundreds of computers.
  3. Cost efficiency: it is fully open-source software.
  4. Fault tolerance: it is designed to avoid single points of failure.