Exam 3 Flashcards

1
Q

Rate of Data Growth

A

Doubles every 6 months
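
As a rough illustration of what "doubles every 6 months" implies, the sketch below computes the growth factor after a given number of months. The function name and the sample periods are assumptions for illustration only.

```python
def growth_factor(months: float, doubling_months: float = 6) -> float:
    """How many times larger the data is after `months`, if it doubles every `doubling_months`."""
    return 2 ** (months / doubling_months)

print(growth_factor(12))  # one year = two doublings
print(growth_factor(60))  # five years = ten doublings
```

Two doublings per year means the amount of data quadruples annually under this assumption.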

2
Q

Information Abundance

A

World has changed, jobs have changed - so much information - need to geek up

3
Q

Business Intelligence

A

Firms that are basing decisions on hunches aren’t managing, they are gambling.
Having good data gives the business the power to make an informed decision.

4
Q

Analytics

A

The extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions

5
Q

Data

A

Raw facts and figures

tells you nothing

6
Q

Information

A

Data presented in a context so it can ‘answer a question’ or ‘support decision making’

7
Q

Knowledge

A

Insight derived from experience and expertise

what humans bring to the table

8
Q

Structured Data

A

Organized

Predefined Characteristics “Schema”

9
Q

Unstructured Data

A

Not Organized – No Schema

Text – email, Facebook pages, news stories, etc.
Binary – Images, audio, video

10
Q

Table

A

An organized collection of data

11
Q

Records

A

Rows in a database table

12
Q

Fields

A

Columns in a database table

13
Q

Relational database

A

Multiple tables that are related

Uses a Key Field (unique identifier) to link tables together
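
A minimal sketch of two related tables linked by a key field, using Python's built-in sqlite3 module. The table names, column names, and rows are hypothetical and only illustrate the idea.

```python
import sqlite3

# Hypothetical tables for illustration; names and rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                         customer_id INTEGER REFERENCES customers(customer_id),
                         item TEXT);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 'laptop'), (11, 2, 'mouse'), (12, 1, 'cable');
""")

# The key field customer_id links each order record to its customer record.
rows = conn.execute("""
    SELECT c.name, o.item
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(rows)
```

Each row of the result combines fields from both tables, matched on the shared key.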

14
Q

What is a “transaction”?

What are its two key characteristics?

A

Any business exchange

  • Standardized schema
  • Occurs repeatedly
15
Q

Point of Sale system

A

Retail sales transactions - a cash register

Tracks transactions when item is scanned at checkout and sold to a customer

16
Q

How do loyalty cards generate valuable data?

A

The company is paying you for data about you that you otherwise would not give them
(helps the company see who is buying which items, instead of anonymous cash purchases)

17
Q

ERP

A

Enterprise Resource Planning

Paychecks, invoices, and payments each become a business transaction and data

18
Q

SCM

A

Supply Chain Management

Each order for finished goods and each order for raw materials is a transaction and data

19
Q

Sources of customer-provided data

A

Customer surveys
Product registration cards
Contests

20
Q

Data Aggregator

A

Firms that trawl the Internet and other sources for data, then package that data up for resale

21
Q

Business operations – examples

A

Healthcare Industry – patient data (pharmaceutical research)

Michigan – tags cows at birth

Transportation – engines on Boeing aircraft (new Airbus aircraft have over 100k sensors gathering data)

Switzerland – put sensors on 9k trains and 5k km of tracks

22
Q

Top CIOs say that data growth is the #1 challenge today. What two problems arise from that challenge?

A

Handling explosive growth with constrained budgets

and Exploiting all that data

23
Q

What is an SSD? How does it address the problems with data growth?

A
Solid State Drive 
Uses flash memory (faster)
Lower power consumption (less heat)
RAID
Prices dropping
24
Q

What is Automated Data Tiering?

A

Match storage performance to access frequency (automatically make data decisions)

Top Tier: Currently working data
Mid Tier: Recently used data
Bottom Tier: Historical data
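
The tiering rule above can be sketched as a small function that picks a tier from how recently data was accessed. The 7-day and 90-day cutoffs and the function name are assumptions for illustration; real storage systems use their own policies.

```python
from datetime import date

def choose_tier(last_accessed: date, today: date) -> str:
    """Pick a storage tier from how recently the data was used."""
    age_days = (today - last_accessed).days
    if age_days <= 7:    # assumed cutoff: currently working data
        return "top"
    if age_days <= 90:   # assumed cutoff: recently used data
        return "mid"
    return "bottom"      # everything older is treated as historical data

today = date(2024, 6, 30)
for last_used in (date(2024, 6, 28), date(2024, 5, 1), date(2020, 1, 1)):
    print(last_used, "->", choose_tier(last_used, today))
```

The point of automating the decision is that fast, expensive storage is reserved for data that is actually being used.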

25
Q

What is DeDupe?

A

Software that identifies where there are duplicates and eliminates the extra data in order to tame the growth of unstructured data
(eliminates growth in area of unstructured data)
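
One common way to find duplicates is to compare content hashes. The sketch below works on whole files and keeps one stored copy per distinct content. The file names and contents are made up, and commercial dedupe products may work differently (for example, on blocks rather than whole files).

```python
import hashlib

def dedupe(files):
    """Store each distinct content once; map every file name to its content hash."""
    store = {}  # content hash -> the single stored copy
    index = {}  # file name -> content hash
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)  # keep only the first copy
        index[name] = digest
    return store, index

files = {
    "a.pdf": b"quarterly report",
    "copy_of_a.pdf": b"quarterly report",  # same bytes as a.pdf
    "b.jpg": b"photo bytes",
}
store, index = dedupe(files)
print(len(files), "files,", len(store), "stored copies")
```

Every file can still be retrieved through the index, but the duplicate bytes are stored only once.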

26
Q

Data Silos

A

No sharing / communication possible

Can be caused by data trapped in obsolete legacy systems or incompatible systems

Causes us to miss opportunities to see correlations, patterns and trends

27
Q

Operational data

A

Data that is continually generated in the day-to-day operations of a business. When an order is entered, operational data is created and is used immediately by systems that pick inventory from the warehouse, print labels, and arrange for shipping

Things like customer, inventory, and purchase data fall into this category. This type of data is pretty straightforward and will generally look the same for most organizations

28
Q

How does the analysis of operational data compete with customers?
What can a company do about this problem?

A

Analysis puts extra load on the operational system, which slows it down, so customers and sales can be lost

Add separate data repositories

29
Q

Data Warehouse? What are its characteristics

A

Collection of databases that supports decision making

Many sources
Fast queries and Exploration

Best way to let your managers do analytics without harming the performance of your operational system

30
Q

How is a Data Mart different from a Data Warehouse?

A

Similar, but the scale is different.

Instead of looking at an enterprise, we are looking at a specific problem and a specific unit

31
Q

What three characteristics are necessary for something to be “Big Data”?

Explain what each “V” means

A

Volume – “too big” to be analyzed
(all credit card transactions on a day within Europe)

Velocity – “too fast” Rapid arrival; Feedback Loop
(Twitter messages or Facebook posts)

Variety – “too all over the place” too little consistency (text, images, sounds, human input, sensors)

32
Q

Hadoop

A

An open source system that is designed to be able to consume any kind of data

33
Q

Data Mining

A

The process of using computer software to identify patterns in enormous data sets and build models from that data. The idea is that the models will help identify current and future trends.

34
Q

Canned Report

A

Canned reports are preformatted reports distributed to a whole organization or to specifically defined user groups.
Answers specific questions

Needs to be easy to use with almost zero training required! Fortunately, each employee needs to see the same output; there’s no need for each user to be able to customize his/her report.

Pro: “Easy for users”
Con: Inflexible, IT overhead

35
Q

Ad-Hoc Report

A

Tools that allow users to create custom reports based on the questions they need answered. Users define their own reports and the tool is powerful and flexible.

Pro: powerful, flexible
Con: “demanding on user”, potentially steep learning curve, business knowledge, understand data schema

36
Q

OLAP (Online Analytical Processing)

A

Pros:
Huge data
Pre-processed and Summarized
User reports fast!

Con:
No access to details (due to summarization – never going to get all the way down to the detailed data)
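
A toy sketch of the trade-off: detail rows are summarized once, after which reports are fast lookups, but the individual rows can no longer be recovered from the summary. The sales rows and the (region, month) grouping are hypothetical.

```python
from collections import defaultdict

# Hypothetical detail rows: (region, month, sale amount)
sales = [
    ("East", "Jan", 100.0),
    ("East", "Jan", 50.0),
    ("West", "Jan", 70.0),
    ("East", "Feb", 30.0),
]

# Pre-process once: summarize the details by (region, month).
cube = defaultdict(float)
for region, month, amount in sales:
    cube[(region, month)] += amount

# A user report is now a quick lookup instead of a scan of every sale.
print(cube[("East", "Jan")])
```

The summary holds one total per region and month, so a question about a single sale cannot be answered from it.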

37
Q

Network

A

Collection of devices connected together via communications devices and transmission media

38
Q

LAN

A

Network covering a limited geographical area, such as a single building

39
Q

Protocol

A

Rules that govern networking communications

40
Q

MAC Address

A

Uniquely identifies a node from every other one on the planet; however provides no information about its location

Similar to a social security number, each device on the planet has a unique MAC address. This number is permanent and never changes.

41
Q

Router

A

A specialized computer with multiple network ports and specialized software that is used to interconnect networks

42
Q

DNS

A

The technology that converts hostnames like www.espn.com into IP addresses like 199.181.132.250

43
Q

The single greatest cause of lost data? How do you protect against that risk?

A

User Error

Back up everything to protect against that risk. You need a good backup program that backs up the data files and an image backup.

Backup to tape cartridges (cheap but slow) or an array of backup disks (faster than tapes).

44
Q

Peer-to-Peer

A

A network where each computer has the ability to both share and use resources

45
Q

Server

A

A computer that’s attached to the network, whose primary purpose is to provide service to other nodes on that network

46
Q

Node

A

Any device that is attached to a network

47
Q

Packet

A

Software divides data into packets so that it can be sent across the Internet more efficiently
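
A simplified sketch of the idea: split a message into numbered chunks, then rebuild it at the other end even if the chunks arrive out of order. The 8-byte packet size and the function names are assumptions; real protocols add headers, addresses, and error checks.

```python
def to_packets(data: bytes, size: int):
    """Split data into (sequence number, chunk) packets of at most `size` bytes."""
    return [(seq, data[i:i + size])
            for seq, i in enumerate(range(0, len(data), size))]

def reassemble(packets):
    """Rebuild the original data by sorting packets on their sequence number."""
    return b"".join(chunk for _, chunk in sorted(packets))

message = b"Data is divided into packets"
packets = to_packets(message, 8)
print(packets)

# Packets may arrive in any order; the sequence numbers restore the original.
print(reassemble(reversed(packets)))
```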

48
Q

WAN

A

Wide Area Network- links LANs together

Only works if private data circuit goes to area

Does not work for mobile users (needs private data circuit)
Works for remote office needs because data circuit exists

49
Q

What are the key differences between copper UTP and fiber Ethernet cabling?

A
UTP = Unshielded Twisted Pair 
Fiber = Long runs, High speed
50
Q

Fiber Optic cable is non-conducting – why is that good?

A

Avoids electrical interference (Florida summers/weather)

Made of a type of glass

51
Q

Ethernet Switch

A

What you use to connect a group of nodes together

Some switches have management capabilities. You can look at them and see the status of the device.

52
Q

PoE

A

Power over Ethernet

Utilizes unused cables, saves money, can be used for security cameras

53
Q

Backhoe problem - How do you protect your network against this problem?

A

If you have only a single cable, it can be destroyed by a digger during construction

Protect this by having multiple paths, separate, different pathways instead of just one.

54
Q

What guidance did Mr. Olson offer about WiFi range?

A

WiFi signals don’t like things that get in the way, like walls.
As for how far the radio waves can travel: maybe 100 ft indoors, more outdoors

55
Q

What common devices can interfere with WiFi networks? Which radio spectrum is affected?

A

2.4 GHz is affected

Microwave, cordless phone, baby monitor can interfere

56
Q

AP – Define the acronym. What does it do? How does it connect to rest of the corporate LAN?

A

Access Point

Allows for other Wi-Fi devices to connect to a wired network

57
Q

Site survey

A

Walking around the building to find weak and strong signals for best coverage

58
Q

Why is WiFi security important?

A

Prevents hackers from potentially seeing your internet traffic

59
Q

Rogue APs

A

Someone installs their own personal router at the office, which creates a security threat

60
Q

Client-Server service model

A

Clear division of labor, IT does the services, client/user just receives it

Client sends a request to the server; the server sends back a response.

61
Q

RAID

A

Redundant Array of Inexpensive Disks

Protects against data loss

62
Q

How does RAID 1 (mirroring) work?

A

2 hard drives; RAID system copies the information onto both drives so they’re mirror images of each other

63
Q

How does RAID 5 work?

A

3+ drives

Error correction – You use a bunch of hard drives. The RAID system takes your data and divides it across several different drives.

n+1 – The extra drive stores error-correcting data, which is mathematically computed from the other drives. If one of the drives crashes, the data on the surviving drives plus the error-correcting data can be used with the same math formula to calculate the data on the dead drive
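
The "math formula" in RAID 5 is XOR parity. The sketch below is simplified: it uses three data drives plus one parity block with made-up contents, whereas real RAID 5 spreads the parity across all the drives. The function name is an assumption.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data drives (n) plus one block of error-correcting data (+1).
data_drives = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(data_drives)

# Drive 1 crashes: apply the same formula to the survivors plus the parity.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)
```

The same XOR operation both creates the parity and rebuilds the lost drive, which is why one failed drive can be tolerated but not two.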

64
Q

How can a company protect itself against LAN hardware failure?

A

Put multiple NICs in the server; multiple switches

65
Q

What technology was described that can protect against total server failure? Two basic versions of this technology were described. How do they work? How are they different?

A

Clustering – communicate to make sure work load is shared and protect against crashing

Active-Passive – Active server handles all requests, passive only goes up once active crashes

Active-Active – Both servers are working, if one fails then the other just picks up the load
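
A toy sketch of how the two versions choose which server answers a request. The server names, the health dictionary, and the round-robin rule for sharing load are assumptions for illustration; real cluster software is far more involved.

```python
def active_passive(healthy):
    """The first healthy server in priority order handles every request."""
    for name, up in healthy.items():
        if up:
            return name
    raise RuntimeError("no healthy server")

def active_active(healthy, request_id):
    """Requests are spread across all healthy servers (simple round robin)."""
    up = [name for name, ok in healthy.items() if ok]
    if not up:
        raise RuntimeError("no healthy server")
    return up[request_id % len(up)]

both_up = {"server_a": True, "server_b": True}
a_down = {"server_a": False, "server_b": True}

print(active_passive(both_up), active_passive(a_down))
print([active_active(both_up, i) for i in range(4)])
print([active_active(a_down, i) for i in range(4)])
```

In the active-passive case the second server sits idle until a failure; in the active-active case both share the work until one fails.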

66
Q

URL – what are the component parts and what does each do for you?

A

Uniform Resource Locator

Application transfer protocol – defines how the data will be handled (http, https, ftp, itpc)

Host name - (www.)

Organization Domain name - (youtube)

Top level domain name - (.com, .edu)

Path - The folder in the file system (/tech)

File - The specific content you want. This part of the URL is case sensitive. (/index.html, /index.mp4)
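
The parts of a URL can be pulled apart with Python's standard urllib.parse module. The URL below is made up from the card's own examples, and splitting the host name on dots is a simplification that assumes exactly three labels.

```python
from urllib.parse import urlsplit

url = "https://www.youtube.com/tech/index.html"  # made-up example URL
parts = urlsplit(url)

print(parts.scheme)    # application transfer protocol
print(parts.hostname)  # host name + organization domain + top level domain
print(parts.path)      # path and file

# Simplified split; assumes the host has exactly three dot-separated labels.
host, domain, tld = parts.hostname.split(".")
folder, _, filename = parts.path.rpartition("/")
print(host, domain, tld, folder, filename)
```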

67
Q

Domain Registrar – what is it? why would you use one?

A

Pay an annual fee to reserve a domain name; first come, first served

68
Q

What is meant by the term “last mile”? Why should an organization care about the last mile?

A

Core of the internet is fast, but when you get to the edge of coverage, speed plummets

69
Q

DAS

A

DAS = Distributed Antenna System

Cell phones often don’t work as well in buildings. You can install multiple small antennas to boost the signal. You can also do this in fields and arenas where you expect large crowds so there is higher signal strength to handle the load.

70
Q

Satellite wireless - what is “latency”? What are the differences between MEO and GEO?

A

Latency - the delay while the signal travels up to the satellite and back down

GEO - geostationary orbit; a satellite looking down from a fixed spot above in space

MEO - medium Earth orbit; has to go faster than earth is rotating, uses multiple satellites
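
Satellite latency follows from distance: the signal travels at the speed of light, so a higher orbit means a longer delay. The sketch below estimates the minimum up-and-down delay, assuming the satellite is directly overhead. The geostationary altitude is standard; the MEO altitude is an assumed example, since MEO orbits vary widely.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458

def round_trip_ms(altitude_km: float) -> float:
    """Minimum delay for a signal to go up to the satellite and back down, in ms."""
    return 2 * altitude_km / SPEED_OF_LIGHT_KM_S * 1000

GEO_ALTITUDE_KM = 35_786  # geostationary orbit altitude
MEO_ALTITUDE_KM = 8_000   # assumed example; MEO altitudes vary

geo_ms = round_trip_ms(GEO_ALTITUDE_KM)
meo_ms = round_trip_ms(MEO_ALTITUDE_KM)
print(f"GEO: about {geo_ms:.0f} ms, MEO: about {meo_ms:.0f} ms")
```

A request and its reply each make the trip, so the delay a user notices is at least double these figures, before any processing time.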

71
Q

Net Neutrality - what’s the basic issue? Who is on each side of the issue and why?

A

The principle that all internet traffic should be treated equally by ISPs
Consumers vs providers, going to cost us/ISPs more money if it passes

72
Q

Last Mile

A

Users that have unacceptably slow links to the Internet

73
Q

Each of the last mile technologies and describe it in very general terms

A

Analog modems – Standard telephone lines (POTS)

Broadband – digital connections.

Cable broadband – Used for cable TV

DSL – Digital subscriber line. Using existing telephone wires

FTTH – Fiber to the home. 100% Fiber Optic. Super high performance but extremely expensive.

Cellular Wireless – You don’t need to wire individual premises, but it is still expensive: the wireless system is expensive to license, and people hate the idea of having more cell towers (NIMBY – not in my backyard).

74
Q

CAT Ratings

A

The CAT number tells you specifically how that cable has been engineered and how fast it can safely transmit data.

CAT 5 is the minimum remotely acceptable wiring. If it is below that, it should be replaced.

CAT 6 is the standard you find now.

75
Q

What is “information overload”? What is its alleged impact?

A

Some allege there is so much information available that people cannot do their jobs

They allege there is a $900 billion cost to the economy

76
Q

TPS

A

Transaction Processing Systems

77
Q

What are two things to worry about with Data mining?

A

CLEAN data – your data needs to be clean. If you have inconsistent data, you could wind up with false results.

REPRESENTATIVE data – if the past data is not representative of current or future events, you could wind up with bogus models.