Topic 2 Flashcards
collection of data which is related by some aspect
Database
stores data, in such a way which is
easier to retrieve, manipulate and helps to produce information
database management system
CHARACTERISTICS
Real-world entity
Relation-based tables
Isolation of data and application
Less redundancy
Consistency
Query Language
ACID Properties
Multiuser and Concurrent Access
Multiple views
: DBMS allows entities and relations among them to form as tables. This eases the concept of data saving. A user can understand the architecture of database just by
looking at table names etc.
Relation-based tables:
Modern DBMS are more realistic and uses real world entities to design its
architecture. It uses the behavior and attributes too.
Real-world entity
: A database system is entirely different than its data. Where
database is said to active entity, data is said to be passive one on which the database works and
organizes. DBMS also stores metadata which is data about data, to ease its own process.
Isolation of data and application
DBMS follows rules of normalization, which splits a relation when any of its
attributes is having redundancy in values. Following normalization, which itself is a
mathematically rich and scientific process, make the entire database to contain as less
redundancy as possible.
Less redundancy
DBMS always enjoy the state on consistency where the previous form of data storing
applications while file processing does not guarantee this. Consistency is a state where every
relation in database remains consistent. There exist methods and techniques, which can detect
attempt of leaving database in inconsistent state.
Consistency
DBMS is equipped with query language, which makes it more efficient to
retrieve and manipulate data. A user can apply as many and different filtering options, as he or
she wants. Traditionally it was not possible where file-processing system was used.
Query Language
DBMS follows the concepts for __ ___, which stands for ____. These concepts are applied on transactions, which
manipulate data in database. A____ maintains database in healthy state in multi- transactional environment and in case of failure.
ACID Properties
ACID STANDS FOR
Atomicity,
Consistency, Isolation and Durability
DBMS support multi-user environment and allows them to
access and manipulate data in parallel. Though there are restrictions on transactions when they
attempt to handle same data item, but users are always unaware of them.
- Multiuser and Concurrent Access
: DBMS offers multiples views for different users. Security: Features like multiple
views offers security at some extent where users are unable to access data of other users and
departments. DBMS offers methods to impose constraints while entering data into database and
retrieving data at later stage. DBMS offers many different levels of security features, which
enables multiple users to have different view with different features.
Multiple views
DBMS architecture can be seen as
single tier or multi tier
or n-tier architecture.
DBMS is the only entity where user directly sits
on DBMS and uses it.
1 Tier Architecture
DBMS is ____ then must have some application,
which uses the DBMS. Programmers use ___
where they access DBMS by means of
application.
2-tier Architecture
Most widely used architecture ___. it separates it tier
from each other on basis of users.
3-tier
architecture
tells how the logical structure of a database is modeled.
Data model
represents the nature of data, and the business
logic to control the data. It is also organize the
database
Data Modelling
a class of real world objects having
common attributes (e.g., sites, variables,
methods).
Entity–
A characteristic or property of an
entity (site name, latitude, longitude)
- Attribute
an association between two or
more entities
- Relationship–
– the number of entities on either
end of a relationship (one-to-one, one-to many, many-to-many, etc.)
Cardinality–
is based on the notion of
real world entities and relationship among them
Entity-Relationship model
Every attribute is
defined by its set of values called
domain.
The logical association among entities is
called
Relationship
are represented by means of
rectangles. Rectangles are named with the
__ set they represent.
Entities
are the properties of entities. are represented by means of ellipses
Attributes
If the attributes are composite , they are further
divided in a tree like structure.
- ellipses that are connected
with an ellipse
composite attributes
are depicted by dashed
ellipse
Derived attributes
are depicted by double
ellipse.
Multivalued attributes
Types of Attributes
Simple attribute
Composite attribute
Derived attribute
Single-value attribute
Multi-value attribute
Entity-Set
and Keys
Super Key
Candidate Key −
Primary Key
are the
attributes that do not exist in the physical database, but
their values are derived from other attributes present in
the database.
Derived attribute
are atomic values,
which cannot be divided further. For example, a
student’s phone number is an atomic value of 10 digits
Simple attribute
are made
of more than one simple attribute. For example, a
student’s complete name may have first_name and
last_name.
Composite attribute
contain
single value. For example − Social_Security_Number.
- Single-value attribute
may
contain more than one values. For example, a person
can have more than one phone number, email_address,
etc.
Multi-value attribute
A set of attributes (one or
more) that collectively identifies an entity in
an entity set.
Super Key
A minimal super key; is a set of attributes that
uniquely identify tuples in a table
Candidate Key
is one of the
candidate keys chosen by the database
designer to uniquely identify the entity set.
Primary Key
is a group of single or multiple
keys which identifies rows in a table. A ___ ____may have additional attributes that are
not needed for unique identification.
Super Key
defines the number of entities in one
entity set, which can be associated with the
number of entities of other set via relationship set.
Cardinality
One entity from entity set A can be
associated with at most one entity of entity set B
and vice versa.
One-to-one
More than one entities from
entity set A can be associated with at most one
entity of entity set B, however an entity from
entity set B can be associated with more than
one entity from entity set A.
Many-to-one
One entity from entity set A can
be associated with more than one entities of
entity set B however an entity from entity set B,
can be associated with at most one entity
One-to-many
More than one entity
from A can be associated with more than
one entity from B and vice versa
Many-to-many −
Each entity is involved in the
relationship. __ ___ is represented by double
lines.
Total Participation
Not all entities are involved in the
relationship. __ ___ is represented by single
lines
Partial participation
has the power of expressing
database entities in a conceptual hierarchical
manner.
.
ER Model
the process of generalizing entities, where the generalized
entities contain the properties of all the generalized entities, is
called
Generalization
a group of entities is
divided into sub-groups based on their
characteristics.
Specialization
is an important
feature of Generalization and
Specialization. It allows lowerlevel entities to inherit the
attributes of higher-level entities.
Inheritance
Codd’s 12 Rule
Rule 1: Information Rule -
Rule 2: Guaranteed Access Rule
Rule 3: Systematic Treatment of NULL Values
Rule 4: Active Online Catalog
Rule 5: Comprehensive Data Sub-Language Rule
Rule 6: View Updating Rule
Rule 7: High-Level Insert, Update, and Delete Rule
Rule 8: Physical Data Independence
Rule 9: Logical Data Independence -
Rule 10: Integrity Independence
Rule 11: Distribution Independence
Rule 12: Non-Subversion Rule
The data stored in a database, may it be user data or metadata, must be a value of
some table cell. Everything in a database must be stored in a table format.
Rule 1: Information Rule
Every single data element (value) is guaranteed to be accessible logically with
a combination of table-name, primary-key (row value), and attribute-name (column value). No other means,
such as pointers, can be used to access data
Rule 2: Guaranteed Access Rule
The NULL values in a database must be given a systematic and
uniform treatment. This is a very important rule because a NULL can be interpreted as one the following − data
is missing, data is not known, or data is not applicable
Rule 3: Systematic Treatment of NULL Values
The structure description of the entire database must be stored in an online
catalog, known as data dictionary, which can be accessed by authorized users. Users can use the same query
language to access the catalog which they use to access the database itself.
Rule 4: Active Online Catalog
A database can only be accessed using a language having
linear syntax that supports data definition, data manipulation, and transaction management operations. This
language can be used directly or by means of some application. If the database allows access to data without
any help of this language, then it is considered as a violation
Rule 5: Comprehensive Data Sub-Language Rule
All the views of a database, which can theoretically be updated, must also be
updatable by the system
Rule 6: View Updating Rule
A database must support high-level insertion,
updation, and deletion. This must not be limited to a single row, that is, it must also support union,
intersection and minus operations to yield sets of data records.
Rule 7: High-Level Insert, Update, and Delete Rule
The data stored in a database must be independent of the
applications that access the database. Any change in the physical structure of a database must not have
any impact on how the data is being accessed by external applications.
Rule 8: Physical Data Independence
The logical data in a database must be independent of its user’s
view (application). Any change in logical data must not affect the applications using it. For example, if
two tables are merged or one is split into two different tables, there should be no impact or change on
the user application. This is one of the most difficult rule to apply
Rule 9: Logical Data Independence -
If a system has an interface that provides access to low-level records,
then the interface must not be able to subvert the system and bypass security and integrity constraints.
Rule 12: Non-Subversion Rule
Data is stored in tables called relations.
* Relations can be normalized.
* In normalized relations, values saved are
atomic values.
* Each row in relation contains unique value
* Each column in relation contains values
from a same domain.
Relational Model
The end-user must not be able to see that the data is distributed
over various locations. Users should always get the impression that the data is located at one site only.
This rule has been regarded as the foundation of distributed database systems
Rule 11: Distribution Independence
A database must be independent of the application that uses it. All its
integrity constraints can be independently modified without the need of any change in the application.
This rule makes a database independent of the front-end application and its interface
Rule 10: Integrity Independence
Relational Model Concepts
Tables
Tuple
Relational Instance
Relation schema
Relation Key
Attribute domain
− In relational
data model, relations
are saved in the format
of Tables. This format
stores the relation
among entities. A table
has rows and columns,
where rows represents
records and columns
represent the
attributes
Tables
A single row of
a table, which contains
a single record for that
relation is called a
Tuple
describes the relation
name (table name),
attributes, and their
names
Relation schema
− A
finite set of tuples in
the relational database
system represents
Relation instance
Every attribute has
some pre-defined
value scope, known as
Attribute domain
Each
row has one or more
attributes, known as
_____, which can
identify the row in the
relation (table)
uniquely.
relation key
Every relation has some conditions that must hold for it to be a valid relation. These
conditions are called
Relational Integrity Constraints
There are three main integrity
constraints
- Key constraints
- Domain constraints
- Referential integrity constraints
- in a relation with a key attribute, no two
tuples can have identical values for key
attributes. - a key attribute can not have NULL values
Key Constraints
There must be at least one minimal subset of attributes in the relation, which can identify a tuple uniquely. This minimal subset of attributes is
called ___ for that relation. If there are more than
one such minimal subsets, these are
called ___ ___.
key, candidate keys
Attributes have specific value in real world scenario. For example, age can only be a positive integer. The same constraints have been tried to employ on the attribute is bound to have a specific range of values.
Domain Constraints
- work on
the concept of Foreign Keys. A foreign key
is a key attribute of a relation that can be
referred in other relation - states that
if a relation refers to a key attribute of a
different or same relation, then that key
element must exist
Referential integrity constraints
- is a key used to link two tables together
- is a field (or collection of fields) in one table that
refers to the PRIMARY KEY in another table
Foreign Key