Data and Database Flashcards
Data
is a collection of facts, statistics, or information that can be represented and stored in various formats
Database
is a structured and organized storage system for data. It is designed to efficiently store, manage, retrieve, and manipulate data
Data Independence
→ Data independence is the characteristic in which we are able to modify the schema at one level of database system without altering the schema at the next higher level.
→ Data independence is a concept in database management systems (DBMS) that refers to the separation of the logical structure of the database from its physical storage details.
→ This separation allows for greater flexibility, maintainability, and scalability of a database system.
Physical Data Independence
→ Physical data independence allows you to make changes to the physical storage and access methods of the database without affecting the logical schema.
→ This means you can change how data is stored on disk, reorganize files, or choose different indexing techniques without impacting the way users or applications interact with the data.
→ Applications and users remain unaware of changes made to the underlying physical storage, ensuring that they can continue to interact with the data without disruptions.
→ This type of independence is particularly useful for database administrators and system architects who need to optimize database performance, security, or storage management.
Logical Data Independence
→ Logical data independence allows you to make changes to the logical schema of the database without affecting the external schema or the applications that use the database.
→ This means you can modify the structure of the database, such as adding or removing tables, changing data types, or modifying relationships between tables, without requiring changes in the application code.
→ Ensures that the external schema, which defines how data is presented to users and applications (through views, queries, and interfaces), remains unchanged when modifications are made to the internal schema (the way data is stored).
→ This type of independence helps in maintaining data consistency and facilitating application development and evolution.
Advantages of data independence
→ Data security and privacy
→ Easier maintenance
→ Scalability: As the database grows, data independence facilitates the addition of new data elements or tables without affecting existing queries or applications.
→ Flexibility: allows for changes to be made in the database schema (structure) without affecting the way data is accessed or presented to users.
main purpose of data abstraction
to achieve data independence
Levels of abstraction
- Physical level (Internal level):
- The internal level, also known as the physical level, is the level of abstraction that defines how the data is physically stored and organized in the database.
- It includes the details of how the data is stored on disk, the access methods used to retrieve the data, and the algorithms used to perform operations on the data.
- Logical level (Conceptual level):
* The conceptual level, also known as the logical level, is the level of abstraction that defines the overall structure of the database.
* It defines the relationships between the data elements, the constraints on the data, and the operations that can be performed on the data. - View level (External level):
* The external level, also known as the View level, is the level of abstraction that is closest to the end users.
* It defines how the data is viewed by the users or applications that access the database.
It provides a high-level view of the database that is tailored to the specific needs of each user or application. (highest level of abstraction)
Schema
A database schema defines how data is organized within a relational database.
Types of Schema
- Internal Schema (Physical Schema):
- The physical schema describes how data is stored on the underlying hardware, including details such as file organization, storage allocation, and indexing methods.
- It focuses on optimizing database performance and storage efficiency.
- It is concerned with the physical implementation of tables, indexes, and other database objects.
- Conceptual Schema) Logical Schema:
- The logical schema defines the logical structure of the database, including the tables, relationships, and constraints, without specifying how data is physically stored.
- It is independent of the underlying hardware and storage details.
- It serves as a high-level abstraction that helps users and application developers understand the database’s structure and relationships.
- External Schema (User Schema):
- An external schema defines the view of the database as seen by the end users or applications. It presents a specific subset of the data and functionality to different user groups.
- External schemas are often tailored to meet the needs of specific user roles or applications.
- They provide a level of data abstraction and security by controlling what data and operations are accessible to different user groups.
Relational Data Model
→ The relational model represents how data is stored in Relational Databases.
→ A relational database consists of a collection of tables, each of which is assigned a unique name.
Relation Model Concepts
▪ Attribute: An attribute is a characteristic or property of an entity that is stored in a database. It corresponds to a column in a table and defines the type of data that can be stored in that column (e.g., name, age, address).
▪ Tables: Tables are two-dimensional structures in a relational database that organize data into rows (tuples) and columns (attributes). Tables are also referred to as relations in the relational model.
▪ Tuple: A tuple is a single data record or row in a table (relation). It contains values for each attribute, representing a specific instance of data.
▪ Relation Schema: The relation schema defines the logical blueprint of a relation or table. It specifies the names and data types of the attributes, as well as any constraints or rules that apply to the data in the table.
▪ Degree: The degree of a relation is the total number of attributes (columns) present in that relation. It indicates the “width” of the table.
▪ Cardinality: Cardinality refers to the number of tuples (rows) in a relation or table. It indicates the “height” of the table.
▪ Column: A column is a vertical component of a table that represents a single attribute. It contains values for that attribute for each tuple in the table.
▪ Relation Instance: A relation instance is a specific set of data contained within a table at a given point in time. It represents the actual data in the table, which consists of tuples.
▪ Relation Key: A relation key is one or more attributes (columns) in a table that uniquely identify each tuple in the table. Common types of keys include primary keys and candidate keys.
▪ Attribute Domain: The attribute domain specifies the set of all possible values that an attribute can take. It defines the data type, range, and constraints for the attribute.
Relational Integrity Constraints
Domain, Key, and Referential Integrity Constraints
Domain Constraint
§ Domain constraints define the permissible values for attributes (columns) within a table. These constraints specify the data type and range of valid values for each attribute.
§ Domain constraints ensure that data entered into a column adheres to specific data types (e.g., integers, strings) and meets any predefined limitations (e.g., a specific range of values, character length).
Key Constraint
Key constraints are used to enforce uniqueness within a table. A key constraint designates one or more attributes (columns) as keys, which guarantee that the values in those columns are unique for each row (tuple) in the table.