pic 3 - Data and Knowledge Management Flashcards

1
Q

Difficulties of managing data

A
  1. Amount of data is increasing exponentially.
  2. Data is scattered throughout organizations.
  3. Data is generated from multiple sources.
  4. New sources of data are constantly being developed. Data becomes less current over time.
  5. Data rot
  6. Data security, quality, and integrity are critical, yet they are easily jeopardized.
  7. Federal government regulations require companies to account for how information is being managed within their organizations. Companies are drowning in data, much of which is unstructured.
  8. Big data
2
Q

Data silo

A

A collection of data held by one group that is not easily accessible by other groups. Data silos hinder the process of gaining actionable insight from organizational data and create barriers to an overall view of the enterprise and its data.

3
Q

Data streams

A

Data that are continuously generated by sources such as point-of-sale systems, clickstreams, social media, and sensors.

4
Q

Data rot

A

Refers primarily to problems with the media on which the data are stored.
- Temperature, humidity, and exposure to light can cause physical problems with storage media and make it difficult to access data.
- Another aspect is finding the machines needed to access the data.

5
Q

Data governance

A

An approach to managing information across an entire organization.
- Involves a formal set of business processes and policies that are designed to ensure that data are handled in a certain, well-defined fashion.
- The objective is to make information available, transparent, and useful for the people who are authorized to access it, from the moment it enters an organization until it becomes outdated and is deleted.

6
Q

Master data management

A

A process that spans all of an organization’s business processes and applications.
- Strategy for implementing data governance.
- Provides companies the ability to store, maintain, exchange, and synchronize a consistent, accurate, and timely “single version of the truth” for the company’s master data.

7
Q

Master data

A

A set of core data (e.g., customer, product, employee, vendor, geographic location) that span the enterprise’s information systems.

8
Q

Transactional data

A

Data generated and captured by operational systems that describe the business’s activities, or transactions.

9
Q

Data file

A

A collection of logically related records.

10
Q

Database systems minimize the following problems:

A
  • Data redundancy – same data stored in multiple locations.
  • Data isolation – applications cannot access data associated with other applications.
  • Data inconsistency – various copies of the data do not agree.
11
Q

Database systems also maximize the following benefits:

A
  • Data security – Because data are put in one place in a database, there is a risk of losing a lot of data at once. Databases therefore have extremely high security measures in place to minimize mistakes and deter attacks.
  • Data integrity – Data meet certain constraints (e.g., there are no letters in a social insurance number, or SIN).
  • Data independence – Applications and data are independent of one another; that is, applications and data are not linked to each other, so all applications are able to access the same data.
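
The integrity constraint mentioned above (no letters in a SIN) can be sketched as a simple format check. This is only an illustration: the function name is hypothetical, and the rule assumed here is that a SIN is nine digits once spaces and dashes are removed. A real database would enforce such a rule with its own constraint mechanism.

```python
import re


def is_valid_sin_format(value: str) -> bool:
    """Return True if the value is exactly nine digits after removing spaces and dashes.

    Assumption: only the format is checked, not whether the number was ever issued.
    """
    digits = value.replace(" ", "").replace("-", "")
    return re.fullmatch(r"[0-9]{9}", digits) is not None


print(is_valid_sin_format("123 456 789"))  # digits only, so the constraint is met
print(is_valid_sin_format("12A 456 789"))  # contains a letter, so the constraint is violated
```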
12
Q

A bit (binary digit)

A

The smallest unit of data a computer can process. Binary means a bit can hold only a 0 or a 1.

13
Q

A byte

A

A group of 8 bits that represents a single character, which can be a letter, a number, or a symbol.
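
As a small illustration of the bit and byte definitions, the sketch below shows the 8 bits behind one character. The variable names are illustrative only. It assumes an ASCII character; in encodings such as UTF-8, some characters take more than one byte.

```python
char = "A"

# ord() gives the numeric code of the character (65 for "A").
code = ord(char)

# Format the code as 8 binary digits: one byte, written bit by bit.
bits = format(code, "08b")

# Encoding the character as ASCII produces exactly one byte.
encoded = char.encode("ascii")

print(char, code, bits, len(encoded))
```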

14
Q

Field

A

a characteristic of interest that describes an entity. It is a logical grouping of characters into a word, a small group of words, or an identification number.

15
Q

Record

A

a logical grouping of related fields.

16
Q

Table

A

a logical grouping of related records. AKA a data file.

17
Q

Database

A

a logical grouping of related files.
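
The hierarchy from the preceding cards (field, record, table, database) can be pictured with plain Python structures. This is a rough sketch for study purposes, not how a DBMS stores data; all names and values are made up.

```python
# A field is one characteristic of an entity, e.g. "name".
# A record is a logical grouping of related fields.
record_1 = {"student_number": 1001, "name": "Ada", "major": "Finance"}
record_2 = {"student_number": 1002, "name": "Ben", "major": "Marketing"}

# A table (data file) is a logical grouping of related records.
students = [record_1, record_2]
courses = [{"course_code": "MIS201", "title": "Information Systems"}]

# A database is a logical grouping of related files (tables).
database = {"students": students, "courses": courses}

for table_name, table in database.items():
    print(table_name, "has", len(table), "record(s)")
```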

18
Q

Database management system (DBMS)

A

A set of programs that provide users with tools to create and manage a database.
- Managing a database refers to the processes of adding, deleting, accessing, modifying, and analyzing data that are stored in a database.

19
Q

Relational database model

A

Data model based on the simple concept of tables in order to capitalize on characteristics of rows and columns of data.

20
Q

Data model

A

A diagram that represents entities in the database and their relationships

21
Q

Entity

A

A person, a place, a thing, or an event about which an organization maintains information.
- A record generally describes an entity.

22
Q

Instance

A

Each row in a relational table, which is a specific, unique representation of the entity.

23
Q

Attribute

A

Each characteristic or quality of a particular entity.

24
Q

Primary key

A

A field (or attribute) of a record that uniquely identifies that record so that it can be retrieved, updated, and sorted (e.g., student number)

25
Q

Secondary key

A

A field that has some identifying information, but typically does not uniquely identify a record with complete accuracy (e.g., a student’s major, if a user wanted to identify all of the students majoring in a particular field of study).

26
Q

Foreign key

A

A field (or a group of fields) in one table that uniquely identifies a row of another table.
- Used to establish and enforce a link between two tables.
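
The three kinds of key described in the cards above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the table and column names (student, enrolment, student_number, major) are invented for illustration, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# student_number is the primary key: it uniquely identifies each student record.
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " major TEXT)"
)
# enrolment.student_number is a foreign key that links each row to one student.
conn.execute(
    "CREATE TABLE enrolment ("
    " enrolment_id INTEGER PRIMARY KEY,"
    " student_number INTEGER NOT NULL REFERENCES student(student_number),"
    " course_code TEXT NOT NULL)"
)
conn.execute("INSERT INTO student VALUES (1001, 'Ada', 'Finance')")
conn.execute("INSERT INTO student VALUES (1002, 'Ben', 'Finance')")
conn.execute("INSERT INTO enrolment VALUES (1, 1001, 'MIS201')")

# major acts as a secondary key: it narrows the search but is not unique.
finance = conn.execute(
    "SELECT name FROM student WHERE major = 'Finance' ORDER BY name"
).fetchall()
print(finance)

# The foreign key rejects an enrolment for a student who does not exist.
try:
    conn.execute("INSERT INTO enrolment VALUES (2, 9999, 'MIS201')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("orphan enrolment rejected:", rejected)
```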

27
Q

Structured data

A

highly organized data in fixed fields in a data repository such as a relational database that must be defined in terms of field name and type (e.g., alphanumeric, numeric, and currency).

28
Q

Unstructured data

A

Data that do not reside in a traditional relational database (e.g., email messages, word processing documents, videos).

29
Q

Big Data

A

A collection of data that is so large and complex that it is difficult to manage using traditional database management systems.

30
Q

Characteristics of Big Data

A

1. Volume – massive amounts of data.
- A single jet engine can generate 10 terabytes of data in 30 minutes.
2. Velocity – the rate at which data flow into an organization is rapidly increasing.
- Critical because it increases the speed of the feedback loop between a company, its customers, its suppliers, and its business partners.
3. Variety – Big Data formats change rapidly.
- Includes satellite imagery, broadcast audio streams, digital music files, web page content, scans of government documents, and comments posted on social media networks.
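
To get a feel for the jet engine figure quoted under Volume, the arithmetic can be worked out as a data rate. This sketch assumes decimal units (1 terabyte = 10^12 bytes, 1 gigabyte = 10^9 bytes); the variable names are illustrative.

```python
TERABYTE = 10**12  # assumption: decimal terabyte
GIGABYTE = 10**9   # assumption: decimal gigabyte

bytes_generated = 10 * TERABYTE  # 10 terabytes, as stated on the card
seconds = 30 * 60                # 30 minutes

# Average rate at which the engine produces data.
rate_gb_per_second = bytes_generated / seconds / GIGABYTE

print(f"about {rate_gb_per_second:.2f} GB per second")
```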

30
Q

Big Data generally consists of the following:

A
  • Traditional enterprise data (e.g., customer relationship management systems and operations data).
  • Machine-generated/sensor data (e.g., smart meters, manufacturing sensors, and sensors integrated into smartphones and automobiles).
  • Social data (e.g., customer feedback comments, microblogging sites such as Twitter, and social media sites).
  • Images captured by billions of devices located throughout the world, from digital cameras and camera phones to medical scanners and security cameras.
31
Q

Issues with Big Data

A
  1. Big Data can come from untrusted sources.
  2. Big Data is dirty.
    Dirty data – inaccurate, duplicate, or erroneous data.
    Ex. Problems such as misspelled words, and duplicate data such as retweets or company press releases that appear multiple times in social media.
  3. Big Data changes.
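
One part of the dirty data problem above, duplicates such as reposted messages, can be sketched as a small cleaning step. The function name and sample posts are hypothetical, and the rule used here (ignore case and extra spaces, drop empty entries) is only one simple assumption about what counts as a duplicate. Misspellings and inaccurate values need other techniques.

```python
def remove_duplicate_posts(posts):
    """Keep the first copy of each post, ignoring case and extra whitespace."""
    seen = set()
    cleaned = []
    for post in posts:
        # Normalize so that trivially different copies compare as equal.
        normalized = " ".join(post.lower().split())
        if normalized and normalized not in seen:
            seen.add(normalized)
            cleaned.append(post.strip())
    return cleaned


raw_posts = [
    "New product launch!",
    "new  product LAUNCH!",  # same text reposted with different case and spacing
    "   ",                   # empty entry
    "Great service",
]
print(remove_duplicate_posts(raw_posts))
```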
32
Q

Massively parallel processing

A

the coordinated processing of an application by multiple processors that work on different parts of the application, with each processor utilizing its own operating system and memory.

33
Q

Cold data

A

Relatively inactive data that do not have to be accessed frequently or rapidly.

34
Q

Hot data

A

Data that must be accessed frequently and rapidly.

35
Q

Data warehouse

A

A repository of historical data that are organized by subject to support decision makers within the organization.

36
Q

Data mart

A

a low-cost, scaled-down version of a data warehouse that is designed for the end-user needs in a strategic business unit or an individual department.

37
Q

Knowledge management (KM)

A

a process that helps organizations manipulate important knowledge that comprises part of the organization’s memory, usually in an unstructured format.

38
Q

Explicit knowledge

A

The more objective, rational, and technical types of knowledge.

39
Q

Tacit knowledge

A

The cumulative store of subjective or experiential learning, which is highly personal and hard to formalize.

40
Q

Knowledge management systems (KMSs)

A

the use of modern technologies – the internet, intranets, extranets, and databases – to systemize, enhance, and expedite knowledge management both within one firm and among multiple firms.

41
Q

The KMS Cycle

A
  1. Create knowledge – Knowledge is created as people determine new ways of doing things or develop know-how. Sometimes external knowledge is brought in.
  2. Capture knowledge – New knowledge must be identified as valuable and be presented in a reasonable way.
  3. Refine knowledge – New knowledge must be placed in context so that it is actionable.
  4. Store knowledge – useful knowledge must be stored in a reasonable format in a knowledge repository so that other people in the organization can access it.
  5. Manage knowledge – the knowledge must be kept current. Must be reviewed regularly to verify that it is relevant and accurate.
  6. Disseminate knowledge – knowledge must be made available in a useful format to anyone in the organization who needs it, anywhere and anytime.
42
Q

Structured query language (SQL)

A

The most popular query language for requesting information from a relational database.
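
A short example of what an SQL request looks like, run through Python's built-in sqlite3 module. The student table, its columns, and the sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_number INTEGER PRIMARY KEY, name TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1001, "Ada", 3.9), (1002, "Ben", 2.8), (1003, "Cy", 3.5)],
)

# SELECT names the fields wanted, FROM names the table,
# WHERE filters the records, and ORDER BY sorts the result.
rows = conn.execute(
    "SELECT name, gpa FROM student WHERE gpa >= ? ORDER BY gpa DESC",
    (3.4,),
).fetchall()

for name, gpa in rows:
    print(name, gpa)
```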

43
Q

Query by example (QBE)

A

A method to obtain information from a relational database by filling out a grid or template – also known as a form – to construct a sample or a description of the data desired.

44
Q

Entity-relationship modelling

A

The process of designing a database by organizing data entities to be used and identifying the relationships among them.

45
Q

Entity-relationship (ER) diagram

A

Document that shows data entities and attributes and relationships among them.

46
Q

Data dictionary

A

A collection of definitions of data elements; data characteristics that use the data elements; and the individuals, business functions, applications, and reports that use these data elements.

47
Q

Normalization

A

a method for analyzing and reducing a relational database to its most streamlined form to ensure minimum redundancy, maximum data integrity, and optimal processing performance.
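
A rough sketch of the idea behind normalization, using plain Python lists and dictionaries instead of a real database. The order data and field names are invented. The flat list repeats each customer's city on every order; splitting it into a customers table and an orders table stores each city once.

```python
# Unnormalized: the customer's city is repeated on every order (redundancy).
flat_orders = [
    {"order_id": 1, "customer_id": "C1", "customer_city": "Toronto", "item": "Pizza"},
    {"order_id": 2, "customer_id": "C1", "customer_city": "Toronto", "item": "Salad"},
    {"order_id": 3, "customer_id": "C2", "customer_city": "Halifax", "item": "Pizza"},
]

customers = {}  # one entry per customer, keyed by customer_id
orders = []     # one entry per order, linked to a customer by customer_id

for row in flat_orders:
    # Customer details are stored once, however many orders the customer has.
    customers[row["customer_id"]] = {"customer_city": row["customer_city"]}
    # Orders keep only their own fields plus the customer_id that links them.
    orders.append(
        {"order_id": row["order_id"], "customer_id": row["customer_id"], "item": row["item"]}
    )

print(customers)
print(orders)
```

With this layout, a change of city is made in one place only, which avoids copies that disagree.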