DATABASES 1 INF2603 CH 1 Flashcards
CHAPTER 1 AND INTRO
what is data?
Data are raw facts. The word raw indicates that the facts have not yet been processed to reveal their meaning.
what is Information?
is the result of processing raw data to reveal its meaning.
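A minimal sketch of the data-to-information step, assuming some made-up sale amounts as the raw facts. The name `summarize` and the figures are hypothetical and only illustrate that processing (here, a total and an average) is what gives the raw data meaning.

```python
# Raw data: individual sale amounts (hypothetical figures for illustration).
raw_sales = [120.0, 80.0, 200.0, 100.0]


def summarize(sales):
    """Process raw facts into information: a total and an average."""
    total = sum(sales)
    average = total / len(sales)
    return {"total": total, "average": average}


# The summary is "information": it answers a question the raw list does not.
info = summarize(raw_sales)
print(info)
```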
knowledge (data)
-body of information and facts about a specific subject.
-familiarity, awareness, and understanding of information as it applies to an environment.
Data management
acronym(g,s,r)
proper generation, storage, and retrieval of data
Data quality (6 characteristics )
acronym(A,R,C,T,U,U)
■ Accuracy: Is the data accurate and has it been obtained from a verifiable source?
■ Relevance: Is the data relevant to the organization?
■ Completeness: Is the required data being stored?
■ Timeliness: Is the data updated frequently to meet the business requirements?
■ Uniqueness: Is the data unique and without redundancy?
■ Unambiguous: Is the meaning of the data clear?
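Two of the characteristics above, completeness and uniqueness, lend themselves to simple automated checks. The sketch below assumes a hypothetical list of customer records and hypothetical field names; it is one possible way to test the idea, not a standard data quality tool.

```python
# Hypothetical required fields and sample records, for illustration only.
REQUIRED_FIELDS = ("customer_id", "name", "email")

customers = [
    {"customer_id": 1, "name": "Ann", "email": "ann@example.com"},
    {"customer_id": 2, "name": "Bob", "email": ""},
    {"customer_id": 1, "name": "Ann", "email": "ann@example.com"},
]


def incomplete_rows(rows):
    """Completeness: indexes of rows with a missing or empty required field."""
    return [
        index
        for index, row in enumerate(rows)
        if any(not row.get(field) for field in REQUIRED_FIELDS)
    ]


def duplicate_ids(rows):
    """Uniqueness: customer_id values that appear more than once."""
    seen, dupes = set(), set()
    for row in rows:
        key = row["customer_id"]
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return sorted(dupes)


print(incomplete_rows(customers))
print(duplicate_ids(customers))
```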
explain Data governance.
- a strategy or methodology defined by an organization to safeguard data quality.
-Each company makes its own policies and procedures for managing the availability, usability, quality, integrity, and security of data.
-It defines who owns the data and who is authorized to create, update, and delete records in the database.
Master Data Management (MDM)
- technology of data governance strategy
- consistent and accurate data
- auditing, reporting, and compliance of data
A DATABASE
is a shared, integrated computer structure that stores a collection of:
■ end-user data, or raw facts of interest to the end user
■ metadata, or data about data, through which the end-user data are integrated and managed.
metadata
- provides a description of the data characteristics and the set of relationships that link the data found within the database.
- essentially encapsulates the different properties, history, origin, versions, and other information about the data.
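To see end-user data and metadata side by side, the sketch below uses Python's built-in `sqlite3` module as a stand-in DBMS. The `customer` table is hypothetical. `PRAGMA table_info` is SQLite-specific; other DBMSs expose their metadata differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO customer (name) VALUES ('Ann')")

# End-user data: the raw facts of interest to the end user.
rows = conn.execute("SELECT id, name FROM customer").fetchall()

# Metadata: data about the data. Each row returned by PRAGMA table_info is
# (cid, name, type, notnull, dflt_value, pk).
columns = conn.execute("PRAGMA table_info(customer)").fetchall()
column_types = {name: col_type for _, name, col_type, *_ in columns}

print(rows)
print(column_types)
```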
DATABASE MANAGEMENT SYSTEM (DBMS)
is a collection of programs that manages the database structure and controls access to the data stored in the database
Advantages of DBMS (6)
■ IMPROVED DATA SHARING.
■ BETTER DATA INTEGRATION.
■ MINIMISED DATA INCONSISTENCY.
■ IMPROVED DATA ACCESS (queries).
■ IMPROVED DECISION MAKING.
■ INCREASED END-USER PRODUCTIVITY.
QUERY (DBMS)
-is a specific request for data manipulation (for example, to read or update the data) issued to the DBMS.
- Simply put, a query is a question.
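A minimal sketch of a read query and an update query, using Python's built-in `sqlite3` module as a stand-in DBMS. The `product` table, its columns, and the prices are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (code TEXT PRIMARY KEY, price REAL)")
conn.execute("INSERT INTO product VALUES ('P1', 10.0)")

READ_PRICE = "SELECT price FROM product WHERE code = ?"

# A read query: ask the DBMS a question about the stored data.
price_before = conn.execute(READ_PRICE, ("P1",)).fetchone()[0]

# An update query: ask the DBMS to change the stored data.
conn.execute("UPDATE product SET price = ? WHERE code = ?", (12.5, "P1"))

price_after = conn.execute(READ_PRICE, ("P1",)).fetchone()[0]
print(price_before, price_after)
```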
AD HOC QUERY(DBMS)
is a spur-of-the-moment question
QUERY RESULT SET(DBMS)
-The answer that the DBMS sends back to the application in response to a query. For example, end users, when dealing with large amounts of sales data, might want quick answers to questions (ad hoc queries) such as:
- What was the volume of sales by product during the past six months?
5 ways in which databases can be classified
- NUMBER OF USERS SUPPORTED,
- WHERE THE DATA ARE LOCATED,
- THE TYPE OF DATA STORED
- THE INTENDED DATA USAGE
- DEGREE TO WHICH THE DATA ARE STRUCTURED.
SINGLE-USER DATABASE
Supports only one user at a time. If user A is using the database, users B and C must wait until user A is done.
DESKTOP DATABASE.
A single-user database that runs on a personal computer
MULTI-USER DATABASE
Supports multiple users at the same time.
WORKGROUP DATABASE
A multi-user database that supports a relatively small number of users (usually fewer than 50) or a specific department within an organization.
ENTERPRISE DATABASE
A database that is used by the entire organization and supports many users (more than 50, usually hundreds) across many departments.
CENTRALISED DATABASE
a database that supports data located at a single site.
DISTRIBUTED DATABASE
supports data distributed across several different sites.
The most popular way of classifying databases today?
Based on how they will be used and on the time sensitivity of the information gathered from them.
Other names for OPERATIONAL DATABASEs
ONLINE TRANSACTION PROCESSING (OLTP), TRANSACTIONAL OR PRODUCTION DATABASE
OPERATIONAL DATABASE, also called an ONLINE TRANSACTION PROCESSING (OLTP), TRANSACTIONAL OR PRODUCTION DATABASE.
Supports a company’s daily operations. Transactions such as product or service sales, payments, and supply purchases reflect critical day-to-day operations and must be recorded accurately and immediately.
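One way to picture "recorded accurately and immediately" is a transaction that either records the whole sale or none of it. The sketch below uses `sqlite3` as a stand-in for an operational DBMS. The `stock` and `sale` tables and the helper `record_sale` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (product TEXT PRIMARY KEY, "
    "on_hand INTEGER CHECK (on_hand >= 0))"
)
conn.execute("CREATE TABLE sale (product TEXT, units INTEGER)")
conn.execute("INSERT INTO stock VALUES ('widget', 10)")
conn.commit()


def record_sale(connection, product, units):
    """Record a sale and reduce stock as one all-or-nothing transaction."""
    try:
        # The connection context manager commits on success and rolls back
        # if an exception is raised inside the block.
        with connection:
            connection.execute("INSERT INTO sale VALUES (?, ?)", (product, units))
            connection.execute(
                "UPDATE stock SET on_hand = on_hand - ? WHERE product = ?",
                (units, product),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected negative stock; nothing was recorded.
        return False


ok = record_sale(conn, "widget", 4)
rejected = record_sale(conn, "widget", 20)
on_hand = conn.execute(
    "SELECT on_hand FROM stock WHERE product = 'widget'"
).fetchone()[0]
sale_count = conn.execute("SELECT COUNT(*) FROM sale").fetchone()[0]
print(ok, rejected, on_hand, sale_count)
```

The rejected sale leaves no partial row in `sale`, which is the kind of accuracy an operational database is expected to guarantee.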
Typically, analytical databases comprise which two main components?
-A DATA WAREHOUSE
-ONLINE ANALYTICAL PROCESSING (OLAP) front end.
THE DATA WAREHOUSE
is a specialized database that stores data in a format optimized for decision support that contains historical data obtained from operational databases and other external sources.
-used for tactical and strategic decision support
ONLINE ANALYTICAL PROCESSING(OLAP)
-is a set of tools that work together to provide an advanced data analysis environment for retrieving, processing, and modeling data from the data warehouse.
UNSTRUCTURED DATA
- data that exist in their original (raw) state, in the format in which they were collected.
-Therefore, unstructured data exist in a format that does not lend itself to the processing that yields information.
STRUCTURED DATA
are the result of formatting unstructured data to facilitate its storage and use, and the generation of information.
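A minimal sketch of turning unstructured text into structured records. The invoice lines, the pattern, and the field names are hypothetical; real source text is rarely this regular.

```python
import re

# Unstructured data: free-text lines in the format in which they were collected.
raw_notes = [
    "Invoice 1001: Ann bought 3 widgets for 29.97",
    "Invoice 1002: Bob bought 1 gadget for 15.00",
]

PATTERN = re.compile(r"Invoice (\d+): (\w+) bought (\d+) (\w+) for ([\d.]+)")


def to_record(line):
    """Format one raw line into a structured record, or None if it does not fit."""
    match = PATTERN.fullmatch(line)
    if match is None:
        return None
    invoice, customer, quantity, item, amount = match.groups()
    return {
        "invoice": int(invoice),
        "customer": customer,
        "quantity": int(quantity),
        "item": item,
        "amount": float(amount),
    }


# Structured data: typed fields that can be stored, queried, and totalled.
records = [to_record(line) for line in raw_notes]
print(records)
```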
Semi-structured data
have already been processed to some extent
XML
Extensible Markup Language
Extensible Markup Language (XML)
is a special language used to represent and manipulate data elements in a textual format.
An XML database
supports the storage and management of semi-structured XML data.
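A small illustration of semi-structured XML data, read with Python's standard `xml.etree.ElementTree` module. The `orders` document and its element names are hypothetical. Note that the second order carries a `note` element the first one lacks, which is typical of semi-structured data.

```python
import xml.etree.ElementTree as ET

xml_text = """
<orders>
  <order id="1001">
    <customer>Ann</customer>
    <total>29.97</total>
  </order>
  <order id="1002">
    <customer>Bob</customer>
    <note>Gift wrap</note>
    <total>15.00</total>
  </order>
</orders>
""".strip()

root = ET.fromstring(xml_text)

# findtext returns None when an optional element is absent.
orders = [
    {
        "id": order.get("id"),
        "customer": order.findtext("customer"),
        "note": order.findtext("note"),
        "total": float(order.findtext("total")),
    }
    for order in root.findall("order")
]
print(orders)
```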
'Data Massaging' (Data Manipulation)
Manipulating data from analytical databases to extract information used to formulate pricing decisions, sales forecasts, market positioning, etc., in support of tactical or strategic decisions.
OPEN SOURCE
database system which allows users to build and modify a database of their choice, distribute the database, and improve the actual product