Chapter 12 Flashcards
How does ineffective data administration lead to poor data quality?
- Multiple data defintions, causing data integration problems
- Missing data elements, causing reduction in data value
- Inappropriate data sources and timing, cuasing lowered reliability
- Inadequate familitary, causing ineffective use of data for planning and strategy
- Poor response time and excessive downtime
- Damaged, sabotage, and stolen data
- Unauthorized access leading to embarassment to organization
What is data administration?
A high-level function that is responsible for the overall management of data resources in an organization, including maintaining corporte-wide definitions and standards
What is Database Administration?
A technical function that is responsible for physical database design and for dealing with technical issues such as security enforcement, database performance, and backup and recovery
What are traditional data administration functions?
- Data policies, procedures, standards
- Planning
- Data conflict (ownership) resolution
- Managing the information repository
- Internal marketing of DA concepts
What are traditional database administration functions?
- Analyzing and designing databases
- Selecting DBMS and software tools
- Intalling/Upgrading DBMS
- Tuning database performace
- Improving query processing performace
- Managing data security, privacy, and integrity
- Data backup and recovery
What is data warehouse administration?
New role that is coming with the growth in data warehouses.
Similar to DA/DBA roles but with an emphasis on integrations and coordination of metadata/data across many data sources.
What are the specific roles of Data warehouse administration?
- Support decision support applications
- Build a stable architecture - corporate information factory
- Establish service level agreements regarding data warehouses and data marts
What is Open Source DBMS?
An alternative to propriertary packages such as Oracle, Microsoft SQL Server, or DB2
Examples: MySQL, PostgreSQL
What are the advantages of Open Source DB Management?
- Pool of volunteer developers and testers
- Less expensive than proprietary packages
- Source code available, for modification
What are the disadvantages of Open Source DB Management?
- Absence of complete documentation
- Ambiguous licensing concerns
- Not as feature-rich as proprietary DBMSs
- Vendors may not have certification programs
What are some considerations when selecting an open source DBMS?
- Features
- Support
- Ease of use
- Stability
- Speed
- Training
- Licensing
What is database security?
Protection of the data against accidental or intentional loss, destruction, or misuse.
What makes database security tougher?
There is increased difficulty due to Internet access and client/server technologies
What are threats to data security?
- Accidental losses attributed to Human error, Software failure, Hardware failure
- Theft and fraud
- Loss of privacy (personal data) and confidentiality(corporate data)
- Loss of data integrity
- Loss of availability (sabotage, etc.)
Static HTML are easy to secure, how can you secure them?
- Standard database access controls
- Place Web files in protected directories on server
Dynamic pages are harder to secure, what can be done to secure them?
- User authentication
- Session security
- SSL for encryption
- Restrict number of users and open ports
- Remove unnecessary programs
For the W3C Web privacy standard, what is P3P?
Platform for Privacy Protection
What does P3P address?
- Who collects data
- What data is collected and for what purpose
- Who is data shared with
- Can users control access to their data
- How are disputes resolved
- Policies for retaining data
- Where are policies kept and how can they be accessed
What are features of database software security?
- Views or subschemas
- Integrity controls
- Authorization rules
- User-defined procedures
- Encryption
- Authentication schemas
- Backup, journalizing, and checkpointing
What are views?
- Subset of the database that is presented to one or more users
- User can be given access privilage to view without allowing access privilege to underlying tables
What are integrity controls?
- Protect data from unauthorized use
- Domains - set allowable values
- Assertions - Enforce database conditions
- Triggers - prevent innapropriate actions, invoke special handling procedures, write to log files
What are authorization rules?
Controls incorporated in the data management system.
What do authorization rules restrict?
- Access to data
- Actions that people can take on data
What does the authorization matrix for authorization rules include that expresses the rules in the form of a table?
- Subjects
- Objects
- Actions
- Constraints
Some DBMS also provide capabilities for what to customize the authorization process?
User-defined procedures
What is encryption?
The coding or scrambling of data so that humans cannot read them
What is the goal of authentication schemes?
To obtain a positve identification of the user
What is the first line of defense in authentication?
Passwords
They should:
- Be at least 8 characters long
- Combine alphabetic and numeric data
- Not be complete words or personal information
- Be changed frequently
Why are passwords flawed?
- Users share them with each other
- They get written down, could be copied
- Automatic logon scripts remove need to explicitly type them in
- Unencrypted passwords travel the Internet
What are some solutions to provide stronger authentication in addition to passwords?
- Two factor authentication (Smart care plus PIN)
- Three factor authentication (Smart care, biometric, PIN)
What is the purpose of the Sarbanes-Oxley (SOX) act?
- Requires companies to audit the access to sensitive data
- Designed to ensure integrity of public companies’ financial statements
What do SOX audits involve?
- IT change management
- Locical access to data
- IT operations
What is IT change management?
The process by which changes to operational systems and databases are authorized
What are the most common type of changes to databases?
Schema, Database configuration, updates to DBMS software
What three areas are required to be segragated in IT change management?
Development, test, production
What are two types of security policies and procedures for logical access to data?
Personnel controls
Physical Acces controls
What are personnel controls?
- Hiring practices, employee monitoring, security training, separation of duties
What are physical access controls?
Swipe cards, equipment locking, check-out procedures, screen placement, laptop protection
What are IT operations?
Policies and procedures for day-to-day management of infrastructure applications, and databases in an organization.
What is also involved in IT operations?
Vendor Management
What are responsibilies of vendor management?
- Review external maintenance agreements
- Access source code? (If vendor goes out of business)
What is database recovery?
A mechanism for restoring a database quickly and accurately after loss or damage
What are some recovery facilities for database recovery?
- Backup facilities
- Journalizing facilities
- Checkpoint facility
- Recovery manager
What should a DBMS provide that produces a backup copy (or save) of the entire database plus conrol files and journals?
Back-up facilities
How often should back-up facilities back up databases?
Periodically (nighty, weekly, etc.)
What is a cold backup?
Database is shut down during backup
What is a hot backup?
Selected portion is shut down and backed up at a given time
Where should backups be stored?
Secure, off-site location
What do journalizing facilities do?
Provide audit trail of transactions and database updates
What is a transaction log?
Record of essential data for each transaction processed against the database.
What is a database change log?
Contains before and after images of updated data
What are checkpoint facilities?
DBMS periodically refuses to accept new transactions.
This means the system is in a quite state.
In this time the Database and transaction logs are synchronized.
How do checkpoint facilities help the recovery manager?
It allows the recovery manager to resume processing from short period, instead of repeating the entire day.
What is the Recovery Manager?
DBMS module that restores the database to a correct condition when a failure occurs and then resumes processing user requests.
What are the Recovery and Restart Procedures?
- Disk Mirroring
- Restore/Run
- Transaction Integrity
- Backward Recovery (Rollback)
- Forward Recovery (Roll Forward)
What is disk mirroring?
Switching between identical copies of databases
What is restore/rerun?
Reprocess transactions against the backup (only done as a last resort)
What is transaction integrity?
Commit or Abort all transaction changes.
What is backward recovery (Rollback)?
Apply before images
What is Forward recovery (Roll forward)?
Apply after images (preferable to restore/run)
What are the transaction ACID properties?
- Atomic - Transaction cannot be subdivided
- Consistent - Constraints don’t change from before transaction to after transaction
- Isolated - Database changes not revealed to users until after transaction has completed
- Durable - Database changes are permanent
Types of failure and their recommended actions
What is a problem with controlling concurrent access?
In a multi-user environment, simultaneous access to data can result in interference and data loss (lost update problem)
What is the solution to the lost update problem?
Concurrency Control
The process of managing simultaneous operations against a database so that data integrity is maintained and the operations do not interfere with each other in a multi-user evironment
What happens to updates if there is simultaneous access to a database?
Updates cancel each other.
What is serializability?
Finish one transaction before starting another
What is a locking mechanism?
- The most common way of achieving serialization
- Data that is retrieved for the purpose of updating is locked for the updater
- No other user can perform update until unlocked
What are the locking levels?
Database - used during database updates
Table - used for bulk updates
Block or Page - very commonly used
Record - only requested row; fairly commonly used
Field - requires significant overhead; impractical
What are the types of locks?
Shared lock
Exclusive lock
What is a shared lock?
Read but no update permitted. Used when just reading to prevent another user from placing an exclusive lock on the record.
What is an exclusive lock?
No access permitted. Used when preparing to update.
What is a deadlock?
An impasse that results when two or more transactions have locked common resources, and each waits for the other to unlock their resources.
How do you prevent a deadlock?
- Lock all records required at the beginning of a transaction
- Two-phase locking protocol (Growing phase, Shrinking phase)
- Maybe be difficult to determine all needed resources in advance
How do you resolve a deadlock?
- Allow deadlocks to occur
- Mechanisms for detecting and breaking them (Resource usage matrix)
What is versioning?
An optimistic approach to concurrency control instead of locking
What is the assumption with versioning?
That simultaneous updates will be infrequent
How does versioning work?
- Each transaction can attempt an update as it wishes
- The system will create a new verison of a record instead of replacing the old one
What happens if a conflict occurs during versioning?
It accepts one user’s update and informs the other user that its update needs to be tried again.
What is a data dictionary?
Documents data elements of a database
What is a system catalog?
Systems-created database that describes all database objects
What is an information repository?
Stores metadata describing data and data processing resources