Module 16 Data Analytics Flashcards

1
Q

4 key features of big data

A
  • Volume
  • Velocity
  • Variety
  • Veracity
2
Q

Big data - volume

A
  • The volume of data is beyond the processing power of a simple IT infrastructure
3
Q

Big data - velocity

A

Big data can be processed at speed, allowing businesses to change their strategy

4
Q

Big data - variety

A
  • Big data can be any of a multitude of types of data and can be very far reaching
5
Q

Big data - veracity

A
  • The information extrapolated from big data must be trustworthy
6
Q

Uses and outcomes of big data

A
  • Making informed business decisions
  • Improving products and/or customer service and improving operating efficiency
  • Assisting in identifying weaknesses
7
Q

Limitations to the implementation of big data analysis

A
  • Cost of implementing
  • Compliance and security of data
  • Employing the correct people
  • Data quality
8
Q

Data retention policy

A
  • how long data is to be stored for
  • how it is to be stored and the security associated with it
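A minimal sketch of how the "how long" part of a retention policy might be checked in code. The categories, retention periods and the `is_expired` helper are hypothetical, chosen only for illustration; a real policy would come from the organisation and its legal requirements.

```python
from datetime import date, timedelta

# Hypothetical policy: how long each category of data is kept, in days.
RETENTION_DAYS = {"payroll": 6 * 365, "marketing": 2 * 365}


def is_expired(category: str, created: date, today: date) -> bool:
    """Return True if a record has been kept longer than its retention period."""
    limit = timedelta(days=RETENTION_DAYS[category])
    return today - created > limit
```

A record for which `is_expired` returns True would then be a candidate for secure deletion or archiving, depending on what the policy says.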
9
Q

Conceptual modelling

A
  • Shows the mapping between information
  • Example: an employee’s record is linked to their national insurance number and all their payslips
10
Q

Logical modelling

A
  • Describes the actual tables and columns to be used in the system
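As an illustration, the employee and payslip example from the conceptual modelling card could be turned into tables and columns roughly as follows. This is a sketch using Python's built-in `sqlite3` module; the table names, column names and the helpers `build_schema` and `table_columns` are assumptions made for the example, not part of the course material.

```python
import sqlite3


def build_schema() -> sqlite3.Connection:
    """Create an in-memory database holding a hypothetical logical model."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employee (
            employee_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            ni_number   TEXT NOT NULL UNIQUE  -- national insurance number
        );
        CREATE TABLE payslip (
            payslip_id  INTEGER PRIMARY KEY,
            employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
            pay_date    TEXT NOT NULL,
            gross_pay   REAL NOT NULL
        );
        """
    )
    return conn


def table_columns(conn: sqlite3.Connection, table: str) -> list[str]:
    """List the column names of a table, in definition order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
```

Each payslip row points back to one employee through `employee_id`, which is how the link from the conceptual model shows up at the logical level.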
11
Q

Physical modelling

A

Describes the storage of the data

12
Q

Examples of database management systems

A
  • Oracle
  • IBM DB2
  • Microsoft Access
  • Microsoft SQL Server
13
Q

Storage devices

A
  • Hard drives
  • Cloud Storage
  • Compact Disc (CD)
  • Flash memory (USB)
  • Digital Versatile Discs (DVD)
  • Blu-ray discs
14
Q

Advantages of edge computing

A
  • Quicker sharing of data between machines, reducing the latency experienced when using cloud computing
  • Increased privacy
  • Bandwidth savings
15
Q

Disadvantages of edge computing

A
  • Considerable time is needed to develop and implement edge computing
  • Once implemented, it needs to be maintained
  • Significant capital expenditure
  • Devices may not be compatible with each other
16
Q

Data Mining

A
  • Way of identifying patterns or trends from a large data set
  • Uses mathematical algorithms to predict likely outcomes based on historical information
  • Done using a mix of statistics, Artificial Intelligence and Machine Learning.
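One very simple instance of "predicting likely outcomes based on historical information" is fitting a straight-line trend to past figures. The sketch below uses ordinary least squares; it stands in for the far richer statistical and machine learning methods used in practice, and the function names are hypothetical.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Least-squares fit of y = slope * x + intercept to historical points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(xs: list[float], ys: list[float], x_new: float) -> float:
    """Predict the value at x_new from the fitted trend line."""
    slope, intercept = fit_line(xs, ys)
    return slope * x_new + intercept
```

For example, with monthly sales of 10, 12, 14 and 16 in months 1 to 4, `predict([1, 2, 3, 4], [10, 12, 14, 16], 5)` continues the trend and gives 18 for month 5.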
17
Q

The 6 Steps of KDD

A
  1. Business Understanding - what the objectives and goals are
  2. Data Understanding - data is collected and explored to check it contains the right information
  3. Data Preparation - data is cleansed and formatted
  4. Data Mining/Modelling - data is now analysed by the system and any patterns identified
  5. Evaluation - results are analysed to see if they are suitable to meet the business objectives
  6. Deployment - results are presented so they are easily understood by all stakeholders and decisions are made
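Steps 3 to 5 above can be sketched as a tiny pipeline. The sales rows, the region names and the target figure are made up for illustration, and the three functions are hypothetical stand-ins for what would normally be much larger processes.

```python
from collections import defaultdict


def prepare(rows):
    """Data preparation: drop incomplete rows and tidy up the formatting."""
    cleaned = []
    for region, amount in rows:
        if region is None or amount is None:
            continue  # cleanse: skip rows with missing values
        cleaned.append((region.strip().lower(), float(amount)))
    return cleaned


def mine(cleaned):
    """Data mining/modelling: total sales per region."""
    totals = defaultdict(float)
    for region, amount in cleaned:
        totals[region] += amount
    return dict(totals)


def evaluate(totals, target):
    """Evaluation: list the regions falling short of the business target."""
    return sorted(region for region, total in totals.items() if total < target)
```

Given rows such as `(" North ", 100)`, `("north", "50")` and `("South", 30)`, preparation merges the two spellings of "north", mining totals each region, and evaluation against a target of 100 flags only "south".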
18
Q

Graphical or pictorial representation of data can be used to help

A
  • See trends and outliers
  • Understand the key features of the data set
  • Make data understandable even for non-experts in a specific subject area
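The same outliers that stand out on a chart can also be flagged numerically. The sketch below marks values lying more than a chosen number of standard deviations from the mean; the threshold of 2 is an assumption picked for the example, not a fixed rule, and `find_outliers` is a hypothetical helper.

```python
from statistics import mean, pstdev


def find_outliers(values: list[float], threshold: float = 2.0) -> list[float]:
    """Return values more than `threshold` standard deviations from the mean."""
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - centre) / spread > threshold]
```

For the data set 10, 11, 9, 10, 12, 10, 11, 50, only the value 50 is flagged, which matches what a bar chart or scatter plot of the same figures would show at a glance.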
19
Q

Considerations when sharing data

A
  • Is the data confidential, and how can it be kept secure?
  • Is the data complete, accurate and unbiased, to allow a fair decision to be made?
  • How is the data most appropriately shared: electronically or in hard copy?
  • Who needs to see the data, and are they aware of any data protection and retention policies?
20
Q

Supercomputers

A
  • Powerful
  • Expensive
  • Fast to process
  • Used for massive data manipulation
21
Q

Mainframes

A
  • Powerful
  • Expensive
  • Allows many concurrent users
  • Operates at high speed
  • Typical users are manufacturers, insurance companies and airlines
22
Q

Servers

A
  • Accommodates simultaneous multiple users
  • Used for running networks and internet applications
  • Large memory and storage capacities
  • Fast and efficient for multiple users
  • Susceptible to failure
23
Q

Microcomputers

A
  • Includes personal computers and workstations
  • Common
  • Can be understood and easily used by most people
  • Can be networked together within an organisation
  • Can often break
24
Q

Portable Computers

A
  • Allow “off-site” working
  • Similar capabilities to a microcomputer
25
Q

Handhelds

A
  • Portable
  • Supports basic functions but lacks processing power of more complex machines
  • User friendly
  • Most people have access to a handheld device
  • Often cannot perform difficult tasks