Chapter 5: Data + Knowledge Management Flashcards
Data --> Information --> Knowledge
- Facts are digitized into data (e.g. numbers like 27, 17, etc.)
- Data is put into context (analysed) to make information, e.g. 22 mph or 17 mph = speeding numbers
- Information is applied to specific situations to create knowledge (e.g. finding the reason why people are speeding on a road = Gov can make solutions for those people, like alt routes)
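A rough sketch of the data --> information --> knowledge chain using the speeding example. The 15 mph limit, the extra readings and the 50% threshold are assumptions made up for illustration; they are not from the chapter.

```python
# Assumed speed limit for the road (hypothetical value).
SPEED_LIMIT_MPH = 15

# Data: bare numbers with no context.
raw_data = [22, 17, 12, 14]

# Information: the same numbers put into context (a unit + a comparison to the limit).
information = [
    {"speed_mph": s, "speeding": s > SPEED_LIMIT_MPH} for s in raw_data
]

# Knowledge: information applied to a specific situation to support a decision.
share_speeding = sum(r["speeding"] for r in information) / len(information)
needs_action = share_speeding >= 0.5  # assumed threshold for "do something about this road"

print(information[0])
print(f"{share_speeding:.0%} of drivers were speeding; act: {needs_action}")
```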
Fact
The confirmation or validation of an event or an object (e.g. it’s a fact that elephants can’t jump)
Information Age
The present time, during which infinite quantities of facts are available to anyone who uses a computer
Data
Raw facts that describe the characteristics of an event or object (e.g. Unit price, Product #, etc.)
Information
Data converted into a meaningful and useful context (e.g. based on sales numbers, who is Jerry’s best Cu? It’s Target at $272,000)
Business Intelligence
Info collected from multiple sources such as:
Suppliers
Cu
Competitors
Partners
+ Industries
BI analyses patterns, trends + relationships in this info for strategic decision making
Knowledge
Skills, Experience + Expertise coupled with Info + Intelligence that creates a person’s intellectual resources
Knowledge Worker
Individual valued for their ability to interpret + analyse info
Difficulties in Managing Data
- Data increases exponentially with time
- Multiple sources of data
- Data Degradation: e.g. Customers move to new addresses, change their names, etc.
- Data Rot: Refers primarily to problems with the media on which data is stored (e.g. in old times, magnets could disrupt phones/data storage + cause problems)
- Data Security (protecting data), Quality (good or bad quality data), and Integrity (truthful/accurate data or not)
- Gov regulation (like with Patriot Act, Cookies need permission)
Multiple Sources of Data
- Internal Sources (Corporate Databases + Company docs)
- Personal Sources (Personal Thoughts, Opinions + Experiences)
- External Sources (Commercial Databases, Gov Reports + Corporate Web Sites)
Data Governance
An approach to managing info across an entire org
Master Data
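A set of core data (e.g. Cu, product, employee, vendor, geographic location) that spans the org’s info systems
Master Data Management
A process that spans all of the org’s Bu processes + apps to store, maintain, exchange + synchronize a consistent, accurate + timely “single version of the truth” for the org’s master data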
Database Approaches include
- Old: Data Hierarchy
- New: Relational Database Model
Data Hierarchy
Data Hierarchy Term (DHT) 1. Bit (Binary digit)
Represents the smallest unit of data a computer can process; it consists only of a 0 or a 1 (0 for no electric current, 1 for a current)
DHT 2. Byte
A group of eight bits that represents a single character (e.g. a letter, #, or a symbol)
DHT 3. Field
A column of data containing a logical grouping of characters into a word or a small group of words (e.g. Last Name, Social Security #, etc.)
DHT 4. Record
A logical grouping of related fields in a row (e.g. Student’s name, courses taken, the date + the grade)
DHT 5. Data File
A logical grouping of related records is called a data file or a table; similar in appearance to an Excel Spreadsheet, consisting of multiple columns + multiple rows
DHT 6. Database
Logical Grouping of related data files (aka database tables)
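The six levels above could be pictured roughly like this in Python. This is only an illustration of how the levels nest, not how a DBMS actually stores anything; every name and value is made up.

```python
# Bit -> byte: one ASCII character fits in a single byte of eight bits.
char = "A"
bits = format(ord(char), "08b")          # the eight bits as a string of 0s and 1s
byte = int(bits, 2).to_bytes(1, "big")   # the same eight bits as one byte

# Field -> record -> data file (table) -> database.
field = "Smith"                                                  # a Last Name field
record = {"last_name": field, "course": "MIS 101", "grade": "A"}  # related fields in a row
data_file = [record, {"last_name": "Lee", "course": "MIS 101", "grade": "B"}]  # related records
database = {"grades": data_file, "courses": [{"course": "MIS 101"}]}          # related tables

print(bits, byte, len(database["grades"]))
```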
Hierarchy vs Databases
Seen after this card.
My thoughts/trying to remember class stuff:
Hierarchies make it difficult to access data sometimes/let specific apps access specific data
Database = more streamlined/efficient?
Database Management Systems Minimize these 3 Problems
- Data Redundancy: The same data is stored in multiple locations (i.e Photos In Phone, that takes up storage)
- Data Isolation: Apps can’t access data associated with other apps (like WhatsApp not accessing contacts, so you have to type them in)
- Data Inconsistency: Various copies of data don’t agree (with each other?)
Database Management Systems (DBMS)
A set of programs that provide users with tools to create + manage a database
DBMS Maximizes these 3 things
- Data Security: Because data are “put in 1 place,” there’s a risk of losing a lot of data at 1 time
Therefore Databases must have extremely high security measures in place to minimise mistakes + deter attacks
- Data Integrity: Data meet certain constraints, e.g. there are no Alphabetic characters in a Social Security number field
- Data Independence: Apps + Data are independent of one another (not linked to each other), so all apps are able to access the same data
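A minimal sketch of the Data Integrity point, assuming SQLite (via Python's built-in sqlite3 module) as the DBMS. The employee table and its columns are hypothetical; the CHECK constraint is one possible way to say "digits only", not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        name TEXT NOT NULL,
        -- ssn must be exactly 9 characters, none of them a non-digit
        ssn  TEXT NOT NULL CHECK (length(ssn) = 9 AND ssn NOT GLOB '*[^0-9]*')
    )
    """
)

def try_insert(name, ssn):
    """Return True if the DBMS accepts the row, False if it rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?)", (name, ssn))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert("Ana", "123456789")   # all digits
rejected = try_insert("Bob", "12345678A")   # contains a letter
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(accepted, rejected, row_count)
```

The constraint lives in the database, so every app that writes to the table gets the same rule.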
Relational Database Model
Entity
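A person, place, thing, or event about which info is maintained (e.g. a Car, a Person, an Object)
Instance
A specific, unique representation of an entity; each row in a table (e.g. one particular Blue Car)
Attribute
A characteristic that describes an entity (e.g. color)
Primary Key
A field that uniquely identifies each instance of an entity (e.g. Student ID for Students)
Foreign Or Secondary Keys
Secondary Key: a field with some identifying info that usually doesn’t uniquely identify the record
Foreign Key: a field that is the Primary Key in one table (e.g. Student ID in Student) + also appears in a different table (e.g. Registration) to link the 2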
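A small sketch of primary + foreign keys, assuming SQLite through Python's sqlite3 module. Table and column names are hypothetical. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """
    CREATE TABLE registration (
        student_id INTEGER NOT NULL REFERENCES student(student_id),  -- foreign key
        reg_date   TEXT
    )
    """
)
conn.execute("INSERT INTO student VALUES (1, 'Ana')")
conn.execute("INSERT INTO registration VALUES (1, '2024-01-15')")  # student 1 exists, so OK

def insert_fails(sql):
    """Return True if the DBMS rejects the statement with an integrity error."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# A primary key must be unique, so a second student with ID 1 is rejected.
duplicate_pk_rejected = insert_fails("INSERT INTO student VALUES (1, 'Bob')")
# A foreign key must point at an existing primary key, so student 99 is rejected.
orphan_fk_rejected = insert_fails("INSERT INTO registration VALUES (99, '2024-01-15')")
print(duplicate_pk_rejected, orphan_fk_rejected)
```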
Structured Query Language
Most Popular Query Lang used for interacting with a database
SQL allows people to perform complicated searches by using relatively simple statements or key words
Query By Example (QBE)
The user fills out a grid or template (also known as: form) to construct a sample or a description of the data desired
SQL in a nutshell
SQL is a standard computer language for accessing + manipulating relational databases
What can SQL do?
Execute Queries against a database
Retrieve Data from a database
Insert records in a database, etc.
SQL Commands
- SQL SELECT FROM Statement
Used to select Data from database
Data returned is stored in a result table, called the result set
SELECT column1, column2
FROM table_name;
e.g. SELECT CustomerName, City FROM Customers; (CustomerName + City = attributes, Customers = entity table)
= Selects all Cu names + corresponding cities from the Customers table
- SQL WHERE Clause
Is used to filter records
Is used to extract only those records that fulfill a specified condition
SELECT column1, column2
FROM table_name
WHERE condition;
e.g. SELECT * FROM Customers
WHERE Country = 'Mexico';
Country = Attribute (a column)
'Mexico' = one value in that column; each row that has it is a matching instance
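To try the SELECT + WHERE statements from these cards, here is a minimal sketch using Python's built-in sqlite3 module. The Customers table and column names follow the cards; the three rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customers (CustomerName TEXT, City TEXT, Country TEXT)")
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?)",
    [
        ("Ana", "Mexico City", "Mexico"),
        ("Ben", "Berlin", "Germany"),
        ("Caro", "Puebla", "Mexico"),
    ],
)

# SELECT ... FROM: pick two attributes for every row in the table.
all_names_cities = conn.execute(
    "SELECT CustomerName, City FROM Customers"
).fetchall()

# WHERE: keep only the rows that fulfil the condition.
mexico_only = conn.execute(
    "SELECT * FROM Customers WHERE Country = 'Mexico'"
).fetchall()

print(all_names_cities)
print(mexico_only)
```

Without an ORDER BY, the row order of a result set is not guaranteed, so don't rely on it.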
Entity Relationship Modeling
A process by which designers plan + create databases using an entity relationship diagram
ER Diagrams
Consists of: Entities, Attributes, and Relationships
To properly identify Entities, Attributes, + Relationships, Database designers 1st identify the Bu rules for the particular data model
Bu Rules
Precise descriptions of Policies, Procedures, or Principles in any org that stores + uses data to generate info
Data Dictionary
Provides Info on each attribute, such as: its name, whether it’s a key, part of a key, or a non-key attribute, the type of data expected (Alphanumeric, Numeric, Dates, etc.) + Valid values
Relationships
Illustrate an association between entities
Degree of a Relationship
Indicates the # of Entities associated with a relationship
Degree of a Relationship P2
Unary Relationship: Exists when an association is maintained within a single entity (e.g. a Person is the Mother of another Person; both sides are the same entity)
Binary Relationship: Exists when 2 Entities are associated (e.g. Professor and Class)
Ternary Relationship: Exists when 3 Entities are associated (e.g. Professor, Class, Classroom)
Cardinality
Refers to the # of times an instance of 1 entity can be associated with an instance in a related entity
Cardinality can be:
Mandatory Single (exactly 1): ____||_|Entity|
Optional Single (0 or 1): ____●|_|Entity|
Mandatory Many (1 or more): ____|<_|Entity|
Optional Many (0 or more): ____●<_|Entity|
(| = one, ● = optional, < = many, drawn as a crow’s foot)
One to One Relationship e.g.
|Student| ||____●| |Parking Permit|
Student --> Parking Permit = May have
Parking Permit --> Student = Must belong to
Student:
Student ID # (Prim Key)
Student Name
Student Address (attributes)
Parking Permit:
Permit # (Prim Key)
Foreign Key (links back to Student)
Student Name
Car Type (attributes)
One to Many Relationships I.e
|Professor| ||______●< |Class|   (< = many symbol / crow’s foot)
Professor --> Class = Can teach (0 or many classes)
Class --> Professor = Must have (exactly 1 professor)
Professor:
Professor ID # (Prim Key?)
Professor Name
Professor Department (attributes?)
Class:
Class ID (Prim Key?)
Class name
Class Time
Class Place
Many to Many Relationships I.e
|Student| |●____●< |Registration| >●____●| |Class|
< or > = Many symbol (crow’s foot)
Student --> Registration = Can have
Registration --> Student = Can have
Registration --> Class = Can have
Class --> Registration = Can have
Student:
Student ID # (Primary key?)
Student Name
Student Address (Attributes?)
Registration:
Student ID #
Class ID # (Foreign Keys)
Registration date (attributes?)
Class:
Class ID # (Prim Key?)
Class Name
Class Time
Class Place (attributes?)
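A sketch of the Student / Registration / Class layout above, assuming SQLite via Python's sqlite3 module. Registration is the table in the middle that turns the many to many relationship into two one to many relationships. All rows are made up, and using the (Student ID, Class ID) pair as Registration's primary key is an assumption, not something the diagram states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE class   (class_id   INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE registration (
        student_id INTEGER REFERENCES student(student_id),
        class_id   INTEGER REFERENCES class(class_id),
        reg_date   TEXT,
        PRIMARY KEY (student_id, class_id)  -- assumed: one registration per student per class
    );
    INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO class   VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO registration VALUES
        (1, 10, '2024-01-15'), (1, 20, '2024-01-15'), (2, 10, '2024-01-16');
    """
)

# Follow both foreign keys to list which student is registered for which class.
pairs = conn.execute(
    """
    SELECT s.name, c.name
    FROM registration r
    JOIN student s ON s.student_id = r.student_id
    JOIN class   c ON c.class_id   = r.class_id
    ORDER BY s.name, c.name
    """
).fetchall()
print(pairs)
```

Ana has many classes and Databases has many students, which is exactly what a single foreign key in either Student or Class could not express.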
Normalization
Is a method for Analyzing + Reducing a Relational Database to its Most Streamlined form to ensure Minimum Redundancy, Maximum Data Integrity + Optimal Processing Performance
Functional Dependencies
A means of expressing that the value of 1 particular Attribute is associated with a specific single value of another attribute
i.e.:
A relationship between attributes such that the values in the 1st attribute (or set) always determine the values in the 2nd attribute (or set)
E --> A does not mean A --> E
a1 + a2 --> b1: Yes
a1 --> b1 and a1 --> b2: No (1 value can’t determine 2 different values)
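One way to check a functional dependency against actual rows, as a rough sketch. `fd_holds` is a hypothetical helper written for this card, and the pizza rows are made up. It also shows that a dependency can hold in one direction but not the other.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every combination of lhs values maps to exactly one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key, then returns it.
        if seen.setdefault(key, value) != value:
            return False  # same lhs value, two different rhs values
    return True

rows = [
    {"pizza_code": "P1", "pizza_name": "Margherita", "price": 9},
    {"pizza_code": "P2", "pizza_name": "Veggie", "price": 9},
    {"pizza_code": "P1", "pizza_name": "Margherita", "price": 9},
]

# Each pizza code always has the same price, so pizza_code --> price holds.
code_determines_price = fd_holds(rows, ["pizza_code"], ["price"])
# Price 9 goes with both P1 and P2, so price --> pizza_code does not hold.
price_determines_code = fd_holds(rows, ["price"], ["pizza_code"])
print(code_determines_price, price_determines_code)
```

A check like this can only disprove a dependency from sample data; whether it truly holds is decided by the Bu rules.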
Normal Forms
1st Normal Form:
A table is in 1NF if every field contains only 1 value (Pizza Shop e.g.: each column = one value, like Order #, Pizza Code, etc.)
2nd Normal Form:
A table is in 2NF if it is in 1NF + each Non-Key attribute is functionally dependent on the entire primary key, not just part of it (this situation only arises with tables that have multiple-attribute keys)
(e.g. Order #, Pizza Code + Quantity columns sliced off from the 1NF table + put in their own table)
3rd Normal Form:
A table is in 3NF if it’s in 2NF + NO Non-key attribute (or set) is functionally dependent on any other Non-key attribute (or set)
No FDs among Non-Key Attributes (e.g. Order columns in 1 table, Cu info in another, Order-Pizza info in another, + just Pizza info in another)
Denormalization
Going back from 3NF to 2NF based on Comp resources/need for streamlining
Join Process with Tables of 3NF
Produce an Order
Step 1: An Order is placed in the Order table (Order # = Prim Key, Cu ID = Foreign Key, Order Date + Total Price = attributes) + the Order-Pizza table (Order # = Foreign Key, Pizza Code = Foreign Key, Quantity = attribute)
Step 2: From the Order-Pizza table, a line is drawn to the Pizza info table (Pizza Code = Prim Key, Pizza Name + Price = attributes)
Step 3: A line is drawn from Order to Cu (Cu ID = Prim Key; Cu 1st Name, Cu Last Name, Cu Address + Zip Code = attributes)
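The three steps above could be sketched as a SQL JOIN, here run through Python's sqlite3 module. Table and column names are adapted from the cards; the rows are made up, and a few columns (Cu Address, Zip Code, stored Total Price) are left out to keep it short. The total is computed from Quantity x Price instead, which is a simplification of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (cu_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE orders (order_no INTEGER PRIMARY KEY,
                         cu_id INTEGER REFERENCES customer(cu_id),
                         order_date TEXT);
    CREATE TABLE pizza (pizza_code TEXT PRIMARY KEY, pizza_name TEXT, price REAL);
    CREATE TABLE order_pizza (order_no INTEGER REFERENCES orders(order_no),
                              pizza_code TEXT REFERENCES pizza(pizza_code),
                              quantity INTEGER);
    INSERT INTO customer VALUES (1, 'Ana', 'Lopez');
    INSERT INTO orders VALUES (100, 1, '2024-01-15');
    INSERT INTO pizza VALUES ('P1', 'Margherita', 9.0), ('P2', 'Veggie', 10.0);
    INSERT INTO order_pizza VALUES (100, 'P1', 2), (100, 'P2', 1);
    """
)

# Step 1: Order + Order-Pizza; Step 2: on to Pizza; Step 3: Order to Cu.
order_lines = conn.execute(
    """
    SELECT o.order_no, c.last_name, p.pizza_name, op.quantity, op.quantity * p.price
    FROM orders o
    JOIN order_pizza op ON op.order_no = o.order_no
    JOIN pizza p        ON p.pizza_code = op.pizza_code
    JOIN customer c     ON c.cu_id = o.cu_id
    WHERE o.order_no = 100
    ORDER BY p.pizza_code
    """
).fetchall()
order_total = sum(line[4] for line in order_lines)
print(order_lines, order_total)
```

Each JOIN follows one foreign key back to the table where it is the primary key, which is why the data only needs to be stored once.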
Data Warehouse
A Repository of Historical Data that are organized by Subject to support decision makers in the Org
Data Mart
Low - cost, Scaled down version of Data Warehouse designed for End-user needs in a Strategic Bu Unit (SBU) or Individual department
Data Warehouse Framework
On Left to Right direction:
Operational Systems/data (e.g. ERP, External Web Docs) have their data extracted via Extraction, Transformation + Load (ETL)
This puts data in a Federated Data Warehouse
Metadata Repository replicates info to go into Marketing Data Mart,
Supply Chain data goes into Enterprise Data Warehouse
The data is also replicated into Data Marts for diff Bu Functions (e.g. Management Data Mart)
Middleware allows for data access (in a language all can understand, like English)
This goes into many diff Business Intelligence tools (e.g. DSS, Custom-Built Apps (4GL langs), EIS, Reporting, Relational Query tools, OLAP/ROLAP, Data Mining, etc.)
After going through the Internet, data can be viewed in a Web Browser too
Basic Characteristics of Data Warehouses + Data Marts
- Organized by Bu Dimension or Subject (e.g. Marketing, Finance, etc.)
- Use Online Analytical Processing (OLAP)
- Integrated (data is collected from multiple systems, then integrated around subjects)
- Nonvolatile (users can’t change or update the data)
- Multidimensional (data is stored in more than 2 dimensions, e.g. as a data cube)
Generic Data Warehouse Environment (explanation for written down diagram)
Source Systems (ERP, POS, etc.)
Data Integration (done via Extraction, Transformation + Load)
Storing the Data (in Data Warehouse + Marts)
Metadata (data about the data, so users can understand it)
Data Quality (must be high quality; low quality data leads to low quality decisions)
Data Governance (managing the data across the org so it stays secure + accurate)
Relational Database
e.g. Tables from 2012, 2013 + 2014
Each yr’s table shows Product, Region + Sales columns
Data Cube
Organizes Data in a more 3D way (Region on the height/left side, Product types on the bottom, Years on the remaining side), which can be broken down into smaller cubes for each year
Equivalence between Relational + Multidimensional Databases
Data must be the same in either database
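A rough sketch of the same sales facts held both ways, using plain Python structures (no real OLAP tool). The products, regions and sales figures are made up. It builds a cube from relational rows, slices off one year, then flattens the cube again to show that both forms carry the same data.

```python
# Relational form: one row per (product, region, year, sales).
rows = [
    ("Nuts", "East", 2012, 50), ("Nuts", "West", 2012, 60),
    ("Bolts", "East", 2012, 40), ("Nuts", "East", 2013, 55),
]

# Multidimensional form: the same facts addressed by three dimensions.
cube = {}
for product, region, year, sales in rows:
    cube.setdefault(year, {}).setdefault(region, {})[product] = sales

# Slicing off one year gives the smaller cube for that year.
slice_2012 = cube[2012]

# Flattening the cube recovers the relational rows: nothing gained, nothing lost.
flattened = [
    (product, region, year, sales)
    for year, regions in cube.items()
    for region, products in regions.items()
    for product, sales in products.items()
]
same_data = sorted(flattened) == sorted(rows)
print(slice_2012, same_data)
```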
Knowledge Management
A Process that helps orgs manipulate important knowledge that comprises part of the Org’s memory, usually in an unstructured format
Knowledge
Info that is Contextual, Relevant + Useful (CoReUs)
It’s Info in Action
Intellectual Capital (or Intellectual Assets) is another term for knowledge
Explicit Knowledge
More Objective, Rational + Technical Knowledge (ObRaTe) (i.e. Procedure to make Dairy Queen Ice Cream)
In an org, Explicit Knowledge consists of:
Policies
Procedural Guides
Reports
Products
Strat’s
Goals
Core Competencies
+ IT Infrastructure of the Enterprise
Tacit Knowledge
The cumulative Store of Subjective or Experiential (SuEx) Learning (i.e. Car Mechanic knowing problem with the car just by sound of the engine)
Tacit Know consists of:
An Org’s Experiences,
Insights,
Expertise,
Know-how,
Trade Secrets,
Skill Sets,
Understanding,
+ Learning
It’s generally imprecise + Costly to transfer
Knowledge Management Systems
Refers to the use of Modern Info technologies (e.g. the Internet, Intranets, Extranets, + Databases) to Systematize, Enhance, + Expedite (SyEnEx) Intrafirm + Interfirm Knowledge Management
KMS Cycle
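- Create Knowledge (people find new ways of doing things or develop know-how)
- Capture Knowledge (new knowledge is identified as valuable + represented in a reasonable way)
- Refine Knowledge (put into context so it’s actionable)
- Store Knowledge (in a reasonable format in a knowledge repository so others can access it)
- Manage Knowledge (kept current; reviewed regularly to verify it’s relevant + accurate)
- Disseminate Knowledge (made available in a useful format to anyone in the org who needs it)
(CreCapRefStoManDis Know)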
Big Data
Difficult to define
But we have 2 descriptions:
Gartner Research:
“Diverse, high-volume, high-velocity information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization.” (www.gartner.com)
Big Data Institute:
Big Data:
Exhibits Variety
Includes Structured, Unstructured + Semistructured Data
Is generated at High Velocity with an uncertain pattern
Doesn’t fit neatly into Traditional, Structured, Relational Databases
Can be Captured, Processed, Transformed + Analyzed (CaPrTrAn) in a reasonable amount of time only by sophisticated Information Systems
Defining BData
BData generally consists of:
Traditional Enterprise Data
Machine-generated/sensor data (e.g. smart meters, manufacturing sensors, equipment logs)
Social Data (e.g. Cu feedback comments, posts on social media sites)
Images Captured by Billions of devices located around the World (Digital Cameras, Camera Phones, Medical Scanners + Security Cameras)
Characteristics of Bdata (5 V’s)
- Volume:
Much Larger Quantity of data than typical for Relational Databases
- Variety:
Lots of Diff Data Types + Formats (e.g. text, images like PNG, video, sensor readings)
- Velocity:
Data comes at very Fast Rate (i.e. Mobile Sensors, Web Click stream)
- Veracity:
Traditional data quality methods don’t apply;
How to judge the data’s accuracy + relevance?
- Value:
BData is valuable to the Bottom Line ($$$) + for Fostering good Organizational actions + decisions
(VolVarVelVerVal)
Issues with BData
- Untrusted Data sources (e.g. external sources the org doesn’t control)
- BD is Dirty:
Dirty Data refers to: Inaccurate, Incomplete, Incorrect, Duplicate or Erroneous data
(Inac,Incom,Incor,Dupli,Erron)
- BData Changes, especially in Data Streams:
Orgs must be aware that Data Quality in an Analysis CAN Change, because the Conditions under which the Data are captured Can Change
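A minimal sketch of flagging two of the dirty data types above (Incomplete + Duplicate). `find_dirty` is a hypothetical helper and the records are made up. Inaccurate or incorrect values can't be caught this way, since that needs something trusted to compare against.

```python
def find_dirty(records, required):
    """Split records into clean ones and (record, reason) pairs for dirty ones."""
    clean, dirty, seen = [], [], set()
    for rec in records:
        key = tuple(sorted(rec.items()))  # hashable fingerprint of the whole record
        if any(rec.get(field) in (None, "") for field in required):
            dirty.append((rec, "incomplete"))
        elif key in seen:
            dirty.append((rec, "duplicate"))
        else:
            seen.add(key)
            clean.append(rec)
    return clean, dirty

records = [
    {"id": 1, "city": "Austin"},
    {"id": 1, "city": "Austin"},   # exact duplicate of the first record
    {"id": 2, "city": ""},         # incomplete: city is missing
]
clean, dirty = find_dirty(records, required=["id", "city"])
reasons = [reason for _, reason in dirty]
print(len(clean), reasons)
```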
Managing BData
When properly analyzed, BData can reveal valuable Patterns + Info
Database environment (1st step: find a suitable one)
Traditional Relational Databases vs NoSQL Databases (NoSQL = can handle structured + unstructured data)
Open Source Solutions (e.g. Hadoop, MapReduce)
Putting BData to Use
Making BData Available
Enabling Orgs to conduct Experiments
Micro-segmentation of Cu’s
Creating new Bu Models
Orgs Can Analyze Far more Data
BData used in Functional Areas of the Org
HR (workplace stuff? Empl data?)
Product Development (see segments of pop/preferences?)
Operations (make Operation channels?)
Marketing (do more Market research?)
Gov Operations (pay taxes faster?)