Scavenger Hunt 8 Flashcards

1
Q

What is the rate of data growth?

A

Doubling every 6 months. This growth is unprecedented and will continue regardless of budget constraints
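To get a feel for what doubling every 6 months implies, here is a minimal arithmetic sketch. The function name `growth_factor` and the chosen time spans are illustrative assumptions, not figures from the course material.

```python
def growth_factor(years: float, doubling_period_years: float = 0.5) -> float:
    """Return how many times larger the data volume is after `years`."""
    return 2 ** (years / doubling_period_years)

# Doubling twice a year compounds quickly.
for years in (1, 2, 5):
    print(f"After {years} year(s): {growth_factor(years):,.0f}x the starting volume")
```

Under this assumption the volume is 4x after one year and 1,024x after five.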

2
Q

What is “information overload”? What is its alleged impact?

A

The vast flood of info is disruptive, like drinking from a fire hydrant
People can't do their jobs with so much data

3
Q

By contrast, what is “information abundance” and what are its implications for knowledge workers?

A

Take advantage of the presence of all the new data to learn how to make meaningful info from it and gain managerial knowledge
Geek up!

4
Q

Business Intelligence

A

Combining aspects of reporting, data exploration & ad hoc queries, & sophisticated data modeling & analysis

5
Q

Analytics

A

Statistical and quantitative analysis of data
Explanatory and predictive models, and fact-based management to drive decisions and actions

6
Q

Data

A

Raw facts and figures
Tells you nothing alone

7
Q

Information

A

Data that has been presented in such a way that it answers questions or supports decision making

8
Q

Knowledge

A

Insight derived from experience and enterprise-savvy information

9
Q

Structured Data

A

Organized data that conforms to a model so that it can be searched, analyzed, and queried using traditional analysis tools

10
Q

Unstructured Data

A

Not organized, no schema
Ex. Text, email, Facebook pages, news stories
Binary

11
Q

Table

A

Organized collection of data made up of records and fields

12
Q

Record

A

Part of table
Row of data
Individual observation

13
Q

Fields

A

Part of table
Column
Attribute for data (fixed schema-textual data)
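The table, record, and field terms can be seen together in a small sketch using Python's built-in `sqlite3` module. The `customers` table and its contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# The table: an organized collection with three fields (columns).
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT, city TEXT)")

# Each inserted tuple is one record (row), i.e. one individual observation.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ana", "Boston"), (2, "Raj", "Denver")],
)

cursor = conn.execute("SELECT * FROM customers ORDER BY customer_id")
fields = [column[0] for column in cursor.description]  # field names
records = cursor.fetchall()                            # list of records

print(fields)
print(records)
```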

14
Q

Relational Database

A

A database that correlates data from multiple tables

15
Q

Relational Database Benefits

A

Combines and simplifies data

16
Q

Relational Database Key Field

A

A field in a table designated as the key, so that its value uniquely identifies each row
Key values never repeat
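A minimal sketch of the "never repeat" rule, assuming a hypothetical `students` table in SQLite: declaring the key field as `PRIMARY KEY` makes the database reject a second row with the same key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO students VALUES (1, 'Ana')")

try:
    # Same key value as the existing row, so the insert must fail.
    conn.execute("INSERT INTO students VALUES (1, 'Raj')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM students").fetchone()[0]
print(duplicate_rejected, row_count)
```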

17
Q

Relational Database Valid Relationship Types

A

One:One -> A key value occurs exactly once in Table A and matches exactly one occurrence in Table B
One:Many -> A key value occurs once in its own table but can match many occurrences in other tables
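As a sketch of a One:Many relationship, assume two hypothetical tables: `customers`, where each `customer_id` appears once, and `orders`, where the same `customer_id` may appear many times.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id),
    total       REAL
);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Raj');
INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# One customer row can match many order rows.
orders_per_customer = conn.execute("""
    SELECT c.name, COUNT(o.order_id)
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    ORDER BY c.customer_id
""").fetchall()

print(orders_per_customer)
```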

18
Q

Relational Database Views

A

Display data from multiple related tables by combining them for reporting and display
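A view can be sketched in SQLite with `CREATE VIEW`. The `products` and `sales` tables and the view name `sales_report` are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT, price REAL);
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, product_id INTEGER, quantity INTEGER);
INSERT INTO products VALUES (1, 'Pen', 2.0), (2, 'Notebook', 5.0);
INSERT INTO sales VALUES (100, 1, 3), (101, 2, 1), (102, 1, 2);

-- The view combines both tables into one report-ready result.
CREATE VIEW sales_report AS
SELECT p.name AS product,
       SUM(s.quantity) AS units,
       SUM(s.quantity * p.price) AS revenue
FROM sales s
JOIN products p ON p.product_id = s.product_id
GROUP BY p.product_id, p.name;
""")

# The view is queried like an ordinary table.
report = conn.execute(
    "SELECT product, units, revenue FROM sales_report ORDER BY product"
).fetchall()
print(report)
```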

19
Q

SQL

A

Structured Query Language
Most common language for creating & manipulating databases
Reigning champion database language in the business world

20
Q

TPS

A

Transaction Processing Systems
Ex. ATM, retail sales transactions, websites, searches

21
Q

(TPS) What is a “transaction?” What are its two key characteristics?

A

Any business exchange
1. Standardized-schema
2. Occurs repeatedly

22
Q

(TPS) Point of Sale system

A

Retail computer systems that collect sales data and are hooked directly into the store’s inventory-control system
Scan barcode, transaction happens

23
Q

(TPS) How do loyalty cards generate valuable data?

A

Membership program in which company is paying you through bonuses for data about you that you otherwise would not give them

24
Q

Enterprise Software

A

CRM, ERP, SCM
Applications that address the needs of multiple users throughout an organization or work group

25
CRM
Customer Relationship Management System
Every sales call, every customer inquiry, every follow-up call = data
26
ERP
Enterprise Resource Planning System
Paychecks, invoices, payments = business transactions/data from which to seek insights
27
SCM
Supply Chain Management System
Each order for finished goods/raw materials = a transaction
28
Business operations — examples
Health care patient data
Michigan tags cows at birth, which gives a lifetime stream of data for each cow raised in the state
Transportation industry: a plane engine produces 10 TB of data every 30 min (sensors on aircraft)
Swiss Rails: 100 data items a second (sensors on train & track)
29
Sources of customer-provided data
Customer surveys: customer insight on products/services they received
Product registration cards: data about customer income, where they live, highest education, hobbies
Contests: cheap data where thousands of people apply, giving the company new data from all the entries
30
Data aggregator
A company whose sole job is to collect data from a wide variety of sources, organize it, clean it, link related items together, and then sell access to it
31
Data silos
Data collections completely separated with no possibility of communication or sharing
32
How do data silos come into being?
Company may have some data trapped inside of obsolete legacy systems
33
Why are data silos a problem?
Incompatible systems lead to missed opportunities to see patterns, trends, and correlations, and to develop new insights that answer questions and support decisions
34
How do inconsistent data formats impact a business?
Makes it hard to sort data
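A small sketch of why mixed formats make sorting hard: the same kind of value (a date) stored three ways sorts incorrectly as text, and sorts correctly only after being normalized. The sample dates and the list of known formats are illustrative assumptions.

```python
from datetime import datetime

raw_dates = ["03/15/2021", "2020-12-01", "15.01.2021"]  # three source systems
KNOWN_FORMATS = ("%m/%d/%Y", "%Y-%m-%d", "%d.%m.%Y")

def parse_date(text):
    """Try each known format in turn and return a date object."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")

sorted_as_text = sorted(raw_dates)                   # character order, not time order
sorted_as_dates = sorted(raw_dates, key=parse_date)  # true chronological order

print(sorted_as_text)
print(sorted_as_dates)
```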
35
Operational data
Data that is produced by your organization's day-to-day operations
Things like customer, inventory, and purchase data
36
How does the analysis of operational data compete with customers?
Analytical queries add a significant load to the system during business hours, causing delays and lost sales (best if we do not query operational data directly)
37
What can a company do about the operational data vs customers problem?
Separate data repositories:
- One for operational data
- One for reporting and analytics
Combine data from many sources and clean it
Historical data builds up as days and months go by; used as a resource to see trends
Periodic imports from operational systems keep the analytical system up to date enough to draw inferences
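The separation described above can be sketched with two in-memory SQLite databases standing in for the operational system and the analytics repository. The table names and the `periodic_import` helper are hypothetical; a real import would typically run on a schedule and include cleaning steps.

```python
import sqlite3

# Operational system: serves customers during business hours.
operational = sqlite3.connect(":memory:")
operational.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, sold_on TEXT, amount REAL)"
)

# Separate repository used only for reporting and analytics.
analytics = sqlite3.connect(":memory:")
analytics.execute(
    "CREATE TABLE sales_history (sale_id INTEGER PRIMARY KEY, sold_on TEXT, amount REAL)"
)

def periodic_import(source, target):
    """Copy sales not yet in the analytics store; return how many were copied."""
    last_id = target.execute(
        "SELECT COALESCE(MAX(sale_id), 0) FROM sales_history"
    ).fetchone()[0]
    new_rows = source.execute(
        "SELECT sale_id, sold_on, amount FROM sales WHERE sale_id > ?", (last_id,)
    ).fetchall()
    target.executemany("INSERT INTO sales_history VALUES (?, ?, ?)", new_rows)
    target.commit()
    return len(new_rows)

operational.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(1, "2024-01-05", 20.0), (2, "2024-01-06", 35.0)],
)
first_batch = periodic_import(operational, analytics)

operational.execute("INSERT INTO sales VALUES (3, '2024-02-01', 50.0)")
second_batch = periodic_import(operational, analytics)

# Trend queries run against the history, not the operational system.
monthly_totals = analytics.execute(
    "SELECT substr(sold_on, 1, 7) AS month, SUM(amount) "
    "FROM sales_history GROUP BY month ORDER BY month"
).fetchall()
print(first_batch, second_batch, monthly_totals)
```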
38
What is a Data Warehouse? What are its characteristics?
Collection of databases that supports decision making
- Many sources
- Operational systems: periodic transfer
- Historical data
- Fast queries
- Exploration
39
How is a Data Mart different from a Data Warehouse?
Same thing, different scale
A Data Mart looks at a specific problem/unit rather than the entire enterprise
40
What three characteristics are necessary for something to be “Big Data”? (three V’s) Explain each
Volume: the notion that data is "too big" to be analyzed with traditional methods (hundreds of millions of data items)
Velocity: rapid arrival & feedback loop. Data arrives too fast to react to in time.
Variety: text, images, sound, video, human input, sensors, servers; so many types of data
41
What is Hadoop?
Open source system designed to consume ANY data you want (structured, unstructured, all data types)
Distributed computing platform
Scalable, cost-effective, flexible, fault-tolerant
42
Examples of big data - How do you see the Three V’s in each?
Predictive policing in LA (historical crime data)
Tesco grocery chain (optimized fridge costs, with in-store fridges providing 70M data points per store per year; proactive maintenance reduced maintenance costs by identifying problems before they happen)
Actions speak louder than words (veteran therapy; military suicide prevention in the US; uses pattern recognition on video measurements to identify signs and types of psychological distress)
43
Canned Reports + Pros & Cons
Reports that provide regular summaries of information in a predetermined format
Answer specific questions
Pros: Easy & useful
Cons: Inflexible & IT overhead
44
Ad-Hoc Reporting Tools + Pros & Cons
Tools that put users in control so that they can create custom reports on an as-needed basis by selecting fields, ranges, summary conditions, and other parameters
Pros: Users define their own reports, Powerful/flexible
Cons: Demanding of the user, Potentially steep learning curve, Requires business knowledge, Must understand the data schema
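A toy sketch of the ad-hoc idea: the user, not IT, picks the fields and the range at report time. The `sales` data and the `ad_hoc_report` function are invented for illustration and do not represent any particular reporting tool.

```python
sales = [
    {"region": "East", "product": "Pen", "amount": 20.0},
    {"region": "West", "product": "Pen", "amount": 35.0},
    {"region": "East", "product": "Notebook", "amount": 50.0},
]

def ad_hoc_report(rows, fields, min_amount=0.0, max_amount=float("inf")):
    """Return only the chosen fields for rows whose amount falls in the range."""
    return [
        {field: row[field] for field in fields}
        for row in rows
        if min_amount <= row["amount"] <= max_amount
    ]

# The user must know the schema (field names) to ask for what they want.
report = ad_hoc_report(sales, fields=["region", "amount"], min_amount=30.0)
print(report)
```

Note that the caller has to know the field names up front, which is the "understand the data schema" cost listed above.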
45
Dashboards
Graphic view of what is happening inside the software system
Some customization
A picture is worth a thousand words
46
OLAP + Pros & Cons
Online Analytical Processing
The manipulation of information to create business intelligence in support of strategic decision making; used for enormous amounts of data
Pros: Huge data, Pre-processed + summarized, User reports are fast
Cons: No access to details; user only sees the summary
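The pre-processed and summarized nature of OLAP can be sketched as a tiny two-dimensional "cube" built once from made-up transactions. The data, the `"ALL"` marker, and the `total` helper are assumptions for illustration; real OLAP engines are far more elaborate.

```python
from collections import defaultdict

transactions = [
    ("2024-01", "East", 20.0),
    ("2024-01", "West", 35.0),
    ("2024-02", "East", 50.0),
    ("2024-02", "East", 10.0),
]

# Pre-processing step: summarize once, ahead of time, at every level.
# Assumes no real month or region is literally named "ALL".
cube = defaultdict(float)
for month, region, amount in transactions:
    cube[(month, region)] += amount
    cube[(month, "ALL")] += amount
    cube[("ALL", region)] += amount
    cube[("ALL", "ALL")] += amount

def total(month="ALL", region="ALL"):
    """Answer a summary question with a single lookup, no rescan of the data."""
    return cube.get((month, region), 0.0)

print(total("2024-02", "East"), total(region="East"), total())
```

The cube answers summary questions quickly, but the two separate February sales in the East are no longer visible from it, which matches the "user only sees the summary" drawback.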
47
Data Mining
The process of analyzing data to extract information not offered by the raw data alone
Enormous historical datasets
Identify patterns
Build models
Predict the future
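As a deliberately tiny sketch of the pattern, model, predict sequence, the code below fits a straight-line trend to made-up monthly sales using ordinary least squares. Real data mining works on far larger datasets and richer models; the numbers and the `predict` helper are assumptions.

```python
# Historical data: (month number, units sold). Values are invented.
history = [(1, 110.0), (2, 120.0), (3, 130.0), (4, 140.0)]

n = len(history)
mean_x = sum(x for x, _ in history) / n
mean_y = sum(y for _, y in history) / n

# Build the model: least-squares slope and intercept of the trend line.
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in history)
    / sum((x - mean_x) ** 2 for x, _ in history)
)
intercept = mean_y - slope * mean_x

def predict(month):
    """Predict units sold for a future month from the fitted trend."""
    return intercept + slope * month

print(slope, intercept, predict(5))
```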