Test 1 Flashcards
Traditional applications?
Numeric and textual databases
Recent applications?
Multimedia
GIS
Biological
Data warehouse
Mobile
Real time
Social Networks started capturing a lot of information about people and about communications among people-posts, tweets, photos, videos in systems such as:
- Linked-In
Some part of the real world about which data is stored in a database. For example, student grades and transcripts at a university.
Mini-world
A database is a coherent collection of data with some inherent meaning . A random assortment of data cannot correctly be referred to as a database
Logically coherent
A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested.
Designed with a purpose
What is a database?
Mini-world
Logically coherent
Designed with a purpose
A collection of related data
Database
Known facts that can be recorded and have an implicit meaning
Data
A software package/ system to facilitate the creation and maintenance of a computerized database.
Database management system (DBMS)
The DBMS software together with the data itself. Sometimes, the applications are also included
Database system
Impact of databases and database technology
Businesses,
service industries,
education,
personalized application,
social networks,
environmental/scientific application,
medicine
Typical DBMS functionality
Define
Construct
Manipulating
Processing & sharing
What does manipulating a database intel?
Retrieval
Modification
Accessing the database through web application
Application interact with a database by generating:
Queries
Transactions
Access different parts of data and formulate the result of a request
Queries
May read some data and “update” certain values or generate new data and store that in the database
Transactions
Applications must not allow __________ users to access data
Unauthorized
Must keep up with changing user requirements against the database
Application
DBMS may additionally provide?
Protection/security measures to prevent unauthorized access
“Active ” processing to take internal action on data
Prsentation/visualization of data
Maintenance of the database
Maintenance of the database and associated programs over the lifetime of the database application
Called Database, software, and system maintenance
A DBMS _________ stores the description of a particular database.
The description is called:
Catalog
Metadata
Insulation between programs and data are called:
Program-data independence
Allows changing data structures and storage organization without having to change the DBMS access programs
Insulation between programs and data
Some newer systems such as a few __________ systems need no meta- data: they store the data definition within its structure making it self describing
NOSQL
is used to hide storage details and present the users with a conceptual view of the database.
A data model
refer to the data model constructs rather than data storage details
Programs
Each user may see a different view of the database, which describes _______ the data of interest to that user .
only
Allowing a set of _________________ to retrieve from and to update the database .
concurrent users
within the DBMS guarantees that each transaction is correctly executed or aborted
Concurrency control
ensures each completed transaction has its effect permanently recorded in the database
Recovery subsystem
is a major part of database applications. This allows hundreds of concurrent transactions to execute per second .
OLTP (Online Transaction Processing)
Users may be divided into:
Called: actors in the scene
Those who control the database, who develop/maintain database application
Called: workers behind the scene
Those who develop the DBMS software, related tools, and computer systems operators
Actors on the scene include
Database administrators
Database designers
•Responsible for authorizing access to the database, for coordinating and monitoring its use, acquiring software and hardware resources , controlling its use and monitoring efficiency of operations.
•Responsible to define the content, the structure, the constraints, and functions or transactions against the database. They must communicate with the end-users and understand their needs.
Database administrators
Database designers
Actors on the scene include:
End-users
They use the data for queries, reports and some of them update the database content.
End-users:
End-users can be categorized into:
- access database occasionally when needed
- they make up a large section of the end -user population .
- Casual
- Naïve or Parametric
These include business analysts, scientists, engineers, others thoroughly familiar with the system capabilities. Many use tools in the form of software packages that work closely with the stored database.
Sophisticated
Mostly maintain personal databases using ready-to-use packaged applications. An example is the user of a tax program that creates its own internal database. Another example is a user that maintains a database of personal photos and videos.
Stand-alone
This category currently accounts for a very large proportion of the IT work force.
System Analysts and Application Developers
They understand the user requirements of naïve and sophisticated users and design applications including canned transactions to meet those requirements.
System Analysts
Implement the specifications developed by analysts and test and debug them before deployment .
Application Programmers
There is an increasing need for such people who can analyze vast amounts of business data and real- time data (“Big Data”) for better decision making related to planning , advertising , marketing etc.
Business Analysts
Design and implement DBMS packages in the form of modules and interfaces and test and debug them. The DBMS must interface with applications, language compilers, operating system components, etc.
System Designers and Implementors
Design and implement software systems called tools for modeling and designing databases, performance monitoring, prototyping, test data generation, user interface creation, simulation etc. that facilitate building of applications and allow using database effectively.
Tool Developers
They manage the actual running and maintenance of the database system hardware and software environment
Operators and Maintenance Personnel
Advantages of using the database approach:
Controlling redundancy in data storage and in development and maintenance efforts.
Restricting unauthorized access to data. Only the DBA staff uses privileged commands and facilities.
Providing persistent storage for program Objects
Providing Storage Structures (e.g. indexes ) for efficient Query Processing
Providing optimization of queries for efficient processing.
Providing backup and recovery services .
Providing multiple interfaces to different classes of users .
Representing complex relationships among data .
Enforcing integrity constraints on the database .
Drawing inferences and actions from the stored data using deductive and active rules and triggers .
This is very crucial for the success of database applications in large organizations. _________ refer to data item names , display formats , screens , report structures , meta-data (description of data), Web page layouts , etc.
Standards
Implications of using the database approach:
Potential for enforcing standards
Reduced application development time
Flexibility to change data structure
Availability of current information
Economies of scale
When not to use a DBMS
Main inhibitors (costs) of using a DBMS
When a DBMS may be unnecessary
When a DBMS may be infeasible
When no DBMS may suffice
A set of concepts to describe the structure of a database, the operations for manipulating these structures, and certain constraints that the database should obey.
Data model
__________ are used to define the database structure.
__________specify some restrictions on valid data; these must be enforced at all times
Constructs
Constraints
Data Model operations are used for specifying database _________ and ________ by referring to the constructs of the data model.
retrievals
updates
Operations on the data model may include __________________ and _________________
basic model operations
user-defined operations
Categories of data models
Conceptual
Physical
Implementation
Self describing
Provide concepts that are close to the way many users perceive data:
Provide concepts that describe details of how data is stored in the computer. These are usually specified in an ad-hoc manner through DBMS design and administration manuals:
Provide concepts that fall between the above two, used by many commercial DBMS implementations (e.g. relational data models used in many commercial systems ):
Combine the description of data with the data values. Examples include XML, key-value stores and some NOSQL systems:
Conceptual
Physical
Implementation
Self describing
The description of a database. Includes descriptions of the database structure, data types, and the constraints on the database .
An illustrative display of ( most aspects of ) a database schema.
A component of the schema or an object within the schema , e.g. , STUDENT , COURSE .
The actual data stored in a database at a particular moment in time. This includes the collection of all the data in the database. Also called database instance (or occurrence or snapshot).
Database schema
Schema Diagram
Schema Construct
Database State
Refers to the content of a database at a moment in time .
Refers to the database state when it is initially loaded into the system .
A state that satisfies the structure and constraints of the database .
Database State
Initial Database State
Valid State
Changes very infrequently:
Changes every time the database is updates:
Database schema
Database state
Schema is also called:
State is also called:
Intension
Extension
Three schema architecture defines at three levels:
Internal
Conceptual
External
level to describe physical storage structures and access paths ( e.g indexes):
level to describe the structure and constraints for the whole database for a community of users:
level to describe the various user views:
Internal schema
Conceptual schema
External schemas
The capacity to change the conceptual schema without having to change the external schemas and their associated application programs:
The capacity to change the internal schema without having to change the conceptual schema:
Logical Data Independence
Physical Data Independence
When a schema at a lower level is changed, only the ___________ between this schema and higher level schemas need to be changed in a DBMS that fully supports data independence .
The higher-level schemas themselves are _____________.
mappings
unchanged
Used by the DBA and database designers to specify the conceptual schema of a database . In many DBMSs, is also used to define internal and external schemas (views ) .
Used to specify database retrievals and updates commands (data sublanguage ) can be embedded in a general -purpose programming languages. A library of functions can also be provided to access the DBMS from a programming language Alternatively, stand-alone commands can be applied directly (called a query language ).
Data Definition Language (DDL)
Data Manipulation language (DML)
Types of DML:
High level/ non procedural language
Low level/ procedural language
For example, the SQL relational language Are “set”-oriented and specify what data to retrieve rather than how to retrieve it. Also called declarative languages:
Retrieve data one record -at -a-time ; Constructs such as looping are needed to retrieve multiple records , along with positioning pointers:
High Level or Non- procedural Language
Low Level or Procedural Language
Programmer interfaces for embedding DML in a programming languages:
•Embedded Approach
•Procedure Call Approach
•Database Programming Language Approach
•Scripting Languages
e.g embedded SQL (for CC++, etc.), SQLJ (for Java )
e.g. JDBC for Java, ODBC (Open Databse Connectivity) for other programming languages as API’s (application programming interfaces)
e.g. ORACLE has PL/SQL, a programming language based on SQL; language incorporates SQL and its data types as integral components
PHP (client-side scripting) and Python (server-side scripting) are used to write database programs.
•Embedded Approach
•Procedure Call Approach
•Database Programming Language Approach
•Scripting Languages
User friendly DBMS interfaces:
Menu based
Forms based
Graphic based
Natural languages
is accessed by DBMS software and users/DBA:
is accessed by users /DBA only:
Active data dictionary
Passive data dictionary
Combines everything into single system including DBMS software , hardware , application programs , and user interface processing software.
Centralized DBMS
Specialized servers with what specialized functions?
Print server
File server
DBMS server
Web server
Email server
Client and server must install appropriate client module and server module software for ODBC or JDBC
A client program may connect to several DBMSs , sometimes called the data sources.
In general , data sources can be files or other non-DBMS software that manages data.
Two tier client server architecture
Common for web applications
Intermediate later called application sever or web server
Can enhance security
Three tier client-server architechure
Classification of DBMS
Legacy
Currently used
Recent technologies
NOSQL systems
Network, hierarchical:
Relational , Object -oriented , Object relational:
Key - value storage systems:
document based , column - based, graph-based and key -value based. Native XML DBMSS:
Legacy
Currently Used
Recent Technologies
NOSQL systems
Variations of distributed DBMS/DDBMS
Homogeneous DDBMS
Heterogenous DDBMS
Federated or multi-database systems
Distributed database systems
Cost considerations for DBMS?
Cost range
Free relational DBMS
Commercial DBMS
Different licensing options
Type of access paths
General purpose vs special purpose
_________ is a basic concept for the ER model . Are specific things or objects in the mini- world that are represented in the database .
____________ are properties used to describe an entity.
A specific entity will have a value for each of its:
Each attribute has a ________ associated with it - e.g. integer , string , date , enumerated type , ..
Entity
Attributes
attributes
value set
Types of attributes
Each entity has a single atomic value for the attribute. For example, or Sex:
The attribute may be composed of several components. For example: Address(Apt#, House#, Street, City, State, ZipCode, Country), or Name(FirstName, MiddleName, LastName):
An entity may have multiple values for that attribute. For example, Color of a CAR or PreviousDegrees of a STUDENT Denoted as Coloror PreviousDegrees:
Simple
Composite
Multi-valued
In general , _____________ and _____________attributes may be nested arbitrarily to any number of levels , although this is rare.
composite and multi -valued
Entities with the same basic attributes are ________ into an entity type . For example , the entity type EMPLOYEE and PROJECT .
An attribute of an entity type for which each entity must have a unique value is called a _______________ of the entity type . For example , SSN of EMPLOYEE .
Grouped
key attribute
A key attribute may be ____________.
An _______ type may have more than one key.
Each key is ___________ (Note: this is different from the relational schema where only one is)
composite
entity
underlined
Each entity type will have a collection of entities stored in the database. This called the ___________ or sometimes _____________
Entity set is the current _______ of the entities of that type that are stored in the database
entity set
entity collection
state.
A ____________ specifies the set of values associated with an attribute
value set
A ____________ relates two or more distinct entities with a specific meaning .
Relationships of the same type are grouped or typed into a ________________.
The degree of a relationship type is the number of participating _________. Both MANAGES and WORKS ON are binary relationships.
relationship
relationship type
entity types
_______________: the schema description of a relationship. Identifies the relationship name and the participating entity types. Also identifies certain relationship constraints
_______________: The current set of relationship instances represented in the database. The current state of a relationship type
Relationship Type
Relationship Set
A relationship type between the same participating entity type in ____________.
Also called a _____________ relationship type.
distinct roles
self-referencing
The relational Model of Data is based on the concept of a:
Relation
is a mathematical concept based on the ideas of sets:
Relation
- Informally , a relation looks like a _________ of values .
- A relation typically contains a:
- The data elements in each row represent certain facts that correspond to a real-world:
- In the formal model, rows are called:
- Each _________ has a header that gives an indication of the meaning of the data items in that column
- In the formal model, the column header is called an:
- table
- set of rows
- world entity or relationship
- tuples
- column
- attribute
Each row has a value of a data item (or set of items) that uniquely identifies that row in the table called the:
key
Sometimes row-ids or sequential numbers are assigned as keys to identify the rows in a table called:
artificial key or surrogate key
The description of a relation is called:
Schema
Each attribute has a ________ or a set of valid values.
Domain
A ________ is an ordered set of values (enclosed in angled brackets ‘<…>’)
Tuple
A relation is a set of __________(rows)
Tuples
A ________ has a logical definition.
Domain
The _______________ is a subset of the Cartesian product of the domains of its attributes
relation state
The tuples are not considered to be _________, even though they appear to be in the tabular form .
ordered
Values in a tuple
All values are considered ________ (indivisible).
Each value in a tuple must be from the __________ of the attribute for that column
A special ______ value is used to represent values that are unknown or not available or inapplicable in certain tuples .
atomic
domain
null
- These are based on the data model itself.
- They are expressed in the schema by using the facilities provided by the model.
- These are beyond the expressive power of the model and must be specified and enforced by the application programs
- Inherent or Implicit Constraints
- Schema-based or Explicit Constraints
- Application based or semantic constraints
Constraints are ___________ that must hold on ___ valid relation states .
conditions
all
There are three main types of constraints that can be expressed in the relational model:
Key constraints
Entity integrity constraints
Referential integrity constraints
Another schema-based constraint is the ___________ constraint.
domain
Domain constraint means every value in a tuple must be from the domain of its ____________ (or it could be null, if allowed for that attribute)
attribute
A set of attributes with the following condition:
No two tuples in any valid relation state will have the same value.
This condition must hold in any valid state.
Superkey
A minimal superkey.
Key
A _________ is a _________ but not vice versa.
Key
Superkey
If a relation has several ________________, one is chosen arbitrarily to be the primary key.
candidate keys
A _______________ is a set of relation states a state of and such that the relation states satisfy the integrity constraints specified.
A relational database state is sometimes called a relational database:
We will not use the term _________ since it also applies to single tuples .
A database state that does not meet the constraints is an ________ state
relational database state
snapshot or instance
instance
invalid
Each _________ will have many tuples in its current relation state.
The ___________________ is a union of all the individual relation states
relation
relational database state
The primary key attributes of each relation schema cannot have null values in any tuple:
Entity integrity
In referential integrity
constraints involve _____ relations.
Used to specify a ______________ among tuples in two relations.
The two relations being:
Two
Relationship
Referencing relation
Referenced relation
Several update operations may have to be _________ together.
Updates may __________ to cause other updates automatically . This may be necessary to maintain integrity constraints.
grouped
propagate
In case of integrity violation, several actions can be taken:
Cancel the operation that causes the violation (RESTRICT or REJECT option)
Perform the operation but inform the user of the violation
Trigger additional updates so the violation is corrected ( CASCADE option, SET NULL option )
Execute a user -specified error- correction routine
INSERT may violate any of the constraints:
Domain constraint
Key constraint
Referential integrity
Entity integrity
one of the attribute values provided for the new tuple is not of the specified attribute domain:
if the value of a key attribute in the new tuple already exists in another tuple in the relation:
if a foreign key value in the new tuple references a primary key value that does not exist in the referenced relation:
if the primary key value is null in the new tuple:
Domain constraint
Key constraint
Referential integrity
Entity integrity
DELETE may violate only referential integrity:
If the primary key value of the tuple being deleted is referenced from other tuples in the database
How can a DELETE violation be remedied?
RESTRICT option
CASCADE option
SET NULL option
__________ option: reject the deletion
__________ option: propagate the new primary key value into the foreign keys of the referencing tuples
__________ option: set the foreign keys of the referencing tuples to NULL
RESTRICT
CASCADE
SET NULL
UPDATE may violate domain constraint and ___________ constraint on an attribute being modified
NOT NULL
UPDATE what may cause a violation:
Updating the primary key
Updating a foreign key
Updating an ordinary attribute
Goals of mapping:
Preserve all information
Maintain the constraints to the extent possible
Minimize null values
ER-to-Relational mapping algorithm steps:
Step 1: mapping or regular entity types
Step 2: mapping of weak entity types
Step 3: mapping of binary 1:1
Step 4: mapping or binary 1:N
Step 5: mapping or binary M:N
Step 6: mapping of multivalued attributes
Step 7: mapping of N-ary relationship types
Three possible approaches of ER-to-relational mapping algorithm
Foreign key approach
Merged relation option
Cross-reference or relationship relation option
Choose one of the relations-say S-and include a foreign key in S the primary key of T. It is better to choose an entity type with total participation in R in the role of S:
An alternate mapping of a 1:1 relationship type is possible by merging the two entity types and the relationship into a single relation. This may be appropriate when both participations are total:
The third alternative is to set up a third relation R for the purpose of cross referencing the primary keys of the two relations S and T representing the entity types:
Foreign Key (2 relations) approach:
Merged relation (1 relation ) option:
Cross- reference or relationship relation (3 relations ) option :