Databases-chap 1 Flashcards
What is Colossus?
The world’s first electronic, digital, programmable computer used by British code breakers to help read encrypted German messages during WW2
Data vs Information
Data: Raw facts, must be properly formatted for storage, processing and presentation
Information: Data processed to reveal meaning; it must be placed in context to be meaningful
Pyramid
Knowledge: Body of information and facts about a specific subject
Information
Data
Data quality aspects
Accuracy
Relevance
Completeness
Timeliness
Uniqueness vs redundancy
Unambiguous
Data management
Discipline that focuses on the proper generation, storage and retrieval of data
Core activity of any business, gov agency, service org or charity
Data governance
Describes a strategy or methodology defined by an org to safeguard data quality
Policies and procedures to manage availability, usability, quality, integrity and security of data
What does DBMS do?
Manages the interaction between end user and the database
Database
Shared, integrated computer structure that stores:
- End-user data (raw facts)
- Metadata (data about data)
– Name of each data element
– Type of values stored (numeric, dates or text), i.e. the data type
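A minimal sketch of the difference between end-user data and metadata, using Python's built-in sqlite3 module. The table name customer and its columns are hypothetical, chosen only for illustration.

```python
import sqlite3

def describe_table(conn, table):
    """Return (column name, declared data type) pairs, i.e. the table's metadata."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row has the shape (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]

conn = sqlite3.connect(":memory:")
# Hypothetical table: the column names and types are assumptions for this example.
conn.execute("CREATE TABLE customer (cus_id INTEGER, cus_name TEXT, cus_since DATE)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada', '2020-01-15')")

# End-user data: the raw facts stored in the table.
end_user_data = conn.execute("SELECT * FROM customer").fetchall()
# Metadata: the name and data type of each data element.
metadata = describe_table(conn, "customer")

print(end_user_data)
print(metadata)
```

The first query returns the stored facts; the second returns only descriptions of those facts, which the DBMS keeps alongside the data.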
DBMS
Collection of programs that manages database structure and controls access to data
Intermediary between the user and database
Possible to share data among multiple applications or users
Makes data management more efficient and effective
Roles and advantages of the DBMS(notes)
- Improved data sharing – enables quick responses
- Improved data security – enforce policies
- Better data integration – see organization as a whole
- Minimized data inconsistency – different versions of data
- Improved data access – use of queries
- Improved decision making – can be competitive
- Increased end-user productivity – empowers employees
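As an illustration of "improved data access – use of queries", here is a small sketch of an ad hoc query answered by the DBMS rather than by a purpose-written program. It assumes Python's built-in sqlite3 module; the sale table and its values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sales table with invented rows.
conn.execute("CREATE TABLE sale (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 50.0)],
)

def total_by_region(conn):
    """Ad hoc question: what are the total sales per region?"""
    query = "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
    return conn.execute(query).fetchall()

totals = total_by_region(conn)
print(totals)
```

The question is expressed as a single query; the DBMS works out how to read and aggregate the stored data.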
Disadvantages of DBMS
1. Increased cost
– Hardware, software, staff, training, licensing
2. Management complexity
3. Maintaining currency (keeping the system up to date)
4. Vendor dependence
5. Frequent upgrade/replacement cycles
Types of databases
- Classified according to:
– Number of users supported
– Where the data is located
– Type of data stored
– Intended data usage
– Degree to which data is structured
Number of users supported
Single user
Multi-user
Workgroup
Enterprise
Location
Centralized: Supports data located at a single site
Distributed: Supports data distributed across several sites
Type of data stored
General purpose databases
– Wide variety of data used in multiple disciplines e.g. Google
Discipline-specific databases
– Data focused on specific subject area e.g. medical or academic database
Intended data usage
Operational database
– Day to day operations e.g. sales
Analytical databases
– Storing historical data used for tactical or strategic decision making
Degree of data structuring
Unstructured data
– Exists in the format in which it was collected
– Difficult to turn into information
Structured data
– Unstructured data that has been formatted to facilitate storage and use
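A rough sketch of turning unstructured data into structured data, in Python. The raw note, its layout and the field names are all assumptions made up for this example.

```python
import re
from datetime import date

# Unstructured: the data exactly as it was collected (hypothetical note).
RAW_NOTE = "Invoice 1042 issued 2024-03-05 total 19.99"

# Assumed layout of the note; real collected data is rarely this regular.
PATTERN = re.compile(r"Invoice (\d+) issued (\d{4})-(\d{2})-(\d{2}) total (\d+\.\d+)")

def structure(note):
    """Format a raw note into typed fields that are easy to store and use."""
    match = PATTERN.fullmatch(note)
    if match is None:
        raise ValueError("note does not follow the assumed layout")
    number, year, month, day, total = match.groups()
    return {
        "invoice_number": int(number),
        "issued": date(int(year), int(month), int(day)),
        "total": float(total),
    }

record = structure(RAW_NOTE)
print(record)
```

Once the note is split into typed fields, it can be stored in columns, sorted by date or summed by total, none of which is practical on the raw text.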
Why is database design important
A poorly designed database generates errors > leads to bad decisions > can lead to failure of the org
Problems with File System Data Processing
- Lengthy development times
- Difficulty of getting quick answers
- Complex system administration
- Lack of security and limited data sharing
- Extensive programming
Evolution of file systems
Manual file systems
* Paper and pencil systems
– Organized to facilitate the expected use of data
– Amount of data needs to be small
– Few reporting requirements
Computerized file systems
* Make use of a data processing specialist
– Each query needed a new program to be written
– Initially files were created to be similar to manual files, e.g. data could be added to, updated in and deleted from files
– Time consuming from query to final report
Structural dependence
– Access to a file depends on its structure
– To add one field in a file system requires many changes
– None of the previous programs will work; all must be modified to conform to the new file structure
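A small sketch of structural dependence, in Python. The fixed-width record layout (4-character id, 10-character name) and the added 2-character region field are hypothetical.

```python
def read_name_v1(record):
    """Program written for the original layout: id (4 chars), name (10 chars)."""
    return record[4:14].strip()

def read_name_v2(record):
    """Same program modified for the new layout: id (4), region (2), name (10)."""
    return record[6:16].strip()

# Original file structure.
old_record = "0001" + "Ada".ljust(10)
# New file structure: one field (region) added after the id.
new_record = "0001" + "UK" + "Ada".ljust(10)

name_old_layout = read_name_v1(old_record)  # works as intended
name_broken = read_name_v1(new_record)      # reads part of the region field
name_fixed = read_name_v2(new_record)       # works only after the program is changed

print(name_old_layout, name_broken, name_fixed)
```

The program hard-codes where each field sits in the file, so adding a single field silently breaks it until the program itself is rewritten.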
Structural independence
Ability to change the file structure without affecting the application's ability to access the data
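A minimal sketch of structural independence, assuming Python's built-in sqlite3 module. The customer table and its columns are hypothetical.

```python
import sqlite3

def customer_names(conn):
    """Application code written before the structure changed."""
    rows = conn.execute("SELECT cus_name FROM customer ORDER BY cus_id")
    return [row[0] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_id INTEGER, cus_name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
before = customer_names(conn)

# Change the structure: add a new field to the table.
conn.execute("ALTER TABLE customer ADD COLUMN cus_region TEXT")
conn.execute("INSERT INTO customer VALUES (2, 'Grace', 'UK')")
# The unchanged application code still works against the new structure.
after = customer_names(conn)

print(before, after)
```

The application asks for columns by name and leaves the physical layout to the DBMS, so adding a column does not require touching the application code.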
Data dependence
Data storage characteristics cannot be changed without affecting the programs that access the data
Data independence
Data storage characteristics can be changed without affecting the application program's ability to access the data