Test 1 Flashcards

1
Q

What is a Database (informal Definition)

A

A database is a collection of related data. For example, consider the names, telephone numbers, and addresses of the people you know. Nowadays, this data is typically stored in mobile phones, which have their own simple database software. This data can also be recorded in an indexed address book or stored on a hard drive, using a personal computer and software such as Microsoft Access or Excel. This collection of related data with an implicit meaning is a database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Data

A

known facts that can be recorded and that have implicit meaning.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What is a Database (Formal Definition)

A

A model of structures of reality
- A database represents some aspect of the real world, sometimes called the miniworld or the universe of discourse (UoD). Changes to the miniworld are reflected in the database.
- A database is a logically coherent collection of data with some inherent meaning. A random assortment of data cannot correctly be referred to as a database.
- A database is designed, built, and populated with data for a specific purpose. It has an intended group of users and some preconceived applications in which these users are interested.

In other words, a database has some source from which data is derived, some degree of interaction with events in the real world, and an audience that is actively interested in its contents.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Database Management System (DBMS)

A

A database management system (DBMS) is a computerized system that enables users to create and maintain a database. The DBMS is a general-purpose software system that facilitates the processes of defining, constructing, manipulating, and sharing databases among various users and applications

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Database System

A

To complete our initial definitions, we will call the database and DBMS software together a database system.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Defining a Database

A

Defining a database involves specifying the data types, structures, and constraints of the data to be stored in the database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Designing a Database

A

Design of a new application for an existing database or design of a brand new database starts off with a phase called requirements specification and analysis. These requirements are documented in detail and transformed into a conceptual design that can be represented and manipulated using some computerized tools so that it can be easily maintained, modified, and transformed into a database implementation. (We will introduce a model called the Entity-Relationship model in Chapter 3 that is used for this purpose.)

The design is then translated to a logical design that can be expressed in a data model implemented in a commercial DBMS. (Various types of DBMSs are discussed throughout the text, with an emphasis on relational DBMSs in Chapters 5 through 9.)

The final stage is physical design, during which further specifications are provided for storing and accessing the database. The database design is implemented, populated with actual data, and continuously maintained to reflect the state of the miniworld.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

meta-data

A

The database definition or descriptive information is also stored by the DBMS in the form of a database catalog or dictionary

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Query VS. Transaction

A

A query typically causes some data to be retrieved; a transaction may cause some data to be read and some data to be written into the database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Main Characteristics of Database

A
  • Self-describing nature of a database system
  • Insulation between programs and data, and data abstraction
  • Support of multiple views of the data
  • Sharing of data and multiuser transaction processing
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Main Characteristics of Database: Self-describing nature of a database system

A

A database system is considered self-describing because it not only contains the data but also metadata, which is data about data. This metadata describes the structure of the database, the types and formats of stored data, and the relationships between parts of the data. This self-describing nature allows the database system to be more flexible and adaptable to changes than traditional file-processing systems, where such metadata might be embedded in the application programs themselves.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Main Characteristics of Database: Insulation between programs and data, and data abstraction

A
  • In a database system, there is a level of insulation (also known as program-data independence) between the data and the programs that access the data. This means that changes in the structure of the database do not necessarily require changes in the programs that access that data.
  • Data abstraction refers to the process of hiding certain details of how data is stored and maintained while presenting a simple interface to the user. For example, a user can interact with the data through high-level queries without needing to know how the data is physically stored.

The characteristic that allows program-data independence and program-operation independence is called data abstraction. A DBMS provides users with a conceptual representation of data that does not include many of the details of how the data is stored or how the operations are implemented. Informally, a data model is a type of data abstraction that is used to provide this conceptual representation. The data model uses logical concepts, such as objects, their properties, and their interrelationships, that may be easier for most users to understand than computer storage concepts. Hence, the data model hides storage and implementation details that are not of interest to most database users.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Main Characteristics of Database: Support of multiple views of the data

A

A database system can provide multiple views of the data to different users. A view is a subset of the database or a particular presentation of the data designed for a specific purpose or user. Different users or user groups can have different views depending on their needs and privileges, allowing for a personalized interaction with the database.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Main Characteristics of Database: Sharing of data and multiuser transaction processing

A
  • One of the key advantages of a database system over a file-processing system is the ability to allow multiple users to access and modify the data concurrently. This is essential in many environments where different users need to access the same data at the same time.
  • Multiuser transaction processing refers to the system’s ability to handle and manage transactions from multiple users without letting the transactions interfere with each other. This includes maintaining data integrity and consistency even when multiple transactions are occurring simultaneously.

The DBMS must include concurrency control software to ensure that several users trying to update the same data do so in a controlled manner so that the result of the updates is correct. For example, when several reservation agents try to assign a seat on an airline flight, the DBMS should ensure that each seat can be accessed by only one agent at a time for assignment to a passenger. These types of applications are generally called online transaction processing (OLTP) applications. A fundamental role of multiuser DBMS software is to ensure that concurrent transactions operate correctly and efficiently.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Database Operation

A

In some types of database systems, such as object-oriented and object-relational systems (see Chapter 12), users can define operations on data as part of the database definitions. An operation (also called a function or method) is specified in two parts.
The interface (or signature) of an operation includes the operation name and the data types of its arguments (or parameters). The implementation (or method) of the operation is specified separately and can be changed without affecting the interface. User application programs can operate on the data by invoking these operations through their names and arguments, regardless of how the operations are implemented. This may be termed program-operation independence.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Database Administrators

A

The DBA is responsible for authorizing access to the database, coordinating and monitoring its use, and acquiring software and hardware resources as needed. The DBA is accountable for problems such as security breaches and poor system response time. In large organizations, the DBA is assisted by a staff that carries out these functions.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Database Designers

A

**Database designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data. **These tasks are mostly undertaken before the database is actually implemented and populated with data. It is the responsibility of database designers to communicate with all prospective database users in order to understand their requirements and to create a design that meets these requirements. In many cases, the designers are on the staff of the DBA and may be assigned other staff responsibilities after the database design is completed. Database designers typically interact with each potential group of users and develop views of the database that meet the data and processing requirements of these groups. Each view is then analyzed and integrated with the views of other user groups. The final database design must be capable of supporting the requirements of all user groups.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

End Users

A

End users are the people whose jobs require access to the database for querying, updating, and generating reports; the database primarily exists for their use. There are several categories of end users:

  • Casual end users occasionally access the database, but they may need different information each time. They use a sophisticated database query interface to specify their requests and are typically middle- or high-level managers or other occasional browsers.
  • Naive or parametric end users make up a sizable portion of database end users. Their main job function revolves around constantly querying and updating the database, using standard types of queries and updates—called canned transactions—that have been carefully programmed and tested. Many of these tasks are now available as mobile apps for use with mobile devices. The tasks that such users perform are varied. A few examples are:
    – Bank customers and tellers check account balances and post withdrawals and deposits.
    – Reservation agents or customers for airlines, hotels, and car rental companies check availability for a given request and make reservations.
    – Employees at receiving stations for shipping companies enter package identifications via bar codes and descriptive information through buttons to update a central database of received and in-transit packages.
    – Social media users post and read items on social media Web sites.
  • Sophisticated end users include engineers, scientists, business analysts, and others who thoroughly familiarize themselves with the facilities of the DBMS in order to implement their own applications to meet their complex requirements.
  • Standalone users maintain personal databases by using ready-made program packages that provide easy-to-use menu-based or graphics-based interfaces. An example is the user of a financial software package that stores a variety of personal financial data.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

System Analysts and Application Programmers (Software Engineers)

A

System analysts determine the requirements of end users, especially naive and parametric end users, and develop specifications for standard canned transactions that meet these requirements.
Application programmers implement these specifications as programs; then they test, debug, document, and maintain these canned transactions. Such analysts and programmers—commonly referred to as software developers or software engineers—should be familiar with the full range of capabilities provided by the DBMS to accomplish their tasks.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Workers behind the Scene

A

In addition to those who design, use, and administer a database, others are associated with the design, development, and operation of the DBMS software and system environment. These persons are typically not interested in the database content itself. We call them the workers behind the scene, and they include the following categories:

  • DBMS system designers and implementers design and implement the DBMS modules and interfaces as a software package. A DBMS is a very complex software system that consists of many components, or modules, including modules for implementing the catalog, query language processing, interface processing, accessing and buffering data, controlling concurrency, and handling data recovery and security. The DBMS must interface with other system software, such as the operating system and compilers for various programming languages.
  • Tool developers design and implement tools—the software packages that facilitate database modeling and design, database system design, and improved performance. Tools are optional packages that are often purchased separately. They include packages for database design, performance monitoring, natural language or graphical interfaces, prototyping, simulation, and test data generation. In many cases, independent software vendors develop and market these tools.
  • Operators and maintenance personnel (system administration personnel) are responsible for the actual running and maintenance of the hardware and software environment for the database system.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

Why use models

A
  • Models can be useful when we want to examine or manage parts of the real world
  • The costs of using a model are often considerably lower than the bosts of using or experimenting with the real world itself.
22
Q

What is a model

A
  • A model is a means of communication
  • Users of a model must have a certain amount of knowledge in common
    Features of a model:
  • Only emphasizes selected aspects
  • is described in some language
  • Can be erroneous
  • may have features that do not exist in reality
23
Q

When to Use/Benefits of Databases

A
  • Best used for data-intensive applications
  • Persistent storage of data
  • centralized control of data
  • control of redundancy
  • control of consistency and integrity (relates back to redundancy)
  • multiple user support
  • sharing of data
  • data documentation
  • data independence
  • contral of access and security
  • backup and recovery
24
Q

When not to use DBMS

A
25
Q

Data Models

A

the tool used to create DBMS
3 different types of models:
1) Entity-Relationship Model
2) Relational Model
3) Hierarchial Model

26
Q

Relational Model- Data Structures

A

Schema:
- The table name, column names, and data types represent the schema.
- Represents the structure of the data
- The schema represents aspects of the data that will stay stable over time (i.e. will not change)
- We call each object in the schema—such as STUDENT or COURSE—a schema construct.

State/Snapshot
- Represented by the rows of the table reflects aspects that are dynamic and change over time.
- Represents the current state of reality that is modelled by the table

The distinction between database schema and database state is very important. When we define a new database, we specify its database schema only to the DBMS. At this point, the corresponding database state is the empty state with no data. We get the initial state of the database when the database is first populated or loaded with the initial data. From then on, every time an update operation is applied to the database, we get another database state. At any point in time, the database has a current state. The DBMS is partly responsible for ensuring that every state of the database is a valid state—that is, a state that satisfies the structure and constraints specified in the schema. Hence, specifying a correct schema to the DBMS is extremely important and the schema must be designed with utmost care. The DBMS stores the descriptions of the schema constructs and constraints—also called the meta-data—in the DBMS catalog so that DBMS software can refer to the schema whenever it needs to. valid state—that is, a state that satisfies the structure and constraints specified in the schema.

The schema is sometimes called the intension, and a database state is called an extension of the schema.

27
Q

Relational Model- Constraints

A

Constraints are not expressed by the table structure or data types of the column

28
Q

Data Model-Operations

A

Operations support change and retrieval of data

29
Q

Data Models: Keys and Identifiers

A

Keys are uniqueness constraints

30
Q

Integrity and Consistency

A

Integrity: For row with Louise Mark, it is assumed the person is a boy. This is a poor reflection of reality
Consistency: For 1st row, notice the cities are different when comparing city and address between the two tables. This is bad consistency.

31
Q

Named-Based Vs Surrogate-Based Representation

A

A surrogate:
- System-generated artificial identifier for an entity.
- Represents an entity of the real world inside the database
- is immutable by the application programs

Surrogate-Based Representation is preferred. Think of a lady who changes her last name or someone who moves. In both instances, we have no way of telling if that is a different person or the same person.

32
Q

Data Normalization

A

In the database approach, the views of different user groups are integrated during database design. Ideally, we should have a database design that stores each logical data item—such as a student’s name or birth date—in only one place in the database. This is known as data normalization, and it ensures consistency and saves storage space (data normalization is described in Part 6 of the text).

33
Q

Controlled Redundancy & Denormalization

A

However, in practice, it is sometimes necessary to use** controlled redundancy** to improve the performance of queries. For example, we may store Student_name and Course_number redundantly in a GRADE_REPORT file (Figure 1.6(a)) because whenever we retrieve a GRADE_REPORT record, we want to retrieve the student name and course number along with the grade, student number, and section identifier. By placing all the data together, we do not have to search multiple files to collect this data. This is known as denormalization.

In such cases, the DBMS should have the capability to control this redundancy in order to prohibit inconsistencies among the files. This may be done by automatically checking that the Student_name–Student_number values in any GRADE_REPORT record in Figure 1.6

(a) match one of the Name–Student_number values of a STUDENT record (Figure 1.2). Similarly, the Section_identifier–Course_number values in GRADE_REPORT can be checked against SECTION records. Such checks can be specified to the DBMS during database design and automatically enforced by the DBMS whenever the GRADE_REPORT file is updated. Figure 1.6(b) shows a GRADE_REPORT record that is inconsistent with the STUDENT file in Figure 1.2; this kind of error may be entered if the redundancy is not controlled. Can you tell which part is inconsistent?

34
Q

ANSI/SPARC 3-Level DBB Architecture

A

This architecture divides the database system into three levels:
External Level (or View Level):
- The highest level of the ANSI/SPARC database architecture.
- It consists of several external schemas or user views.
- Each external schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group.
- This level provides a customized view of the data based on the specific needs and rights of different users. For example, a sales manager and a human resources manager would have different views of the same database.
- It helps in achieving data security, since users can be limited to accessing only certain parts of the database.

Conceptual Level (or Logical Level):
- The middle level of the architecture.
- This level contains the conceptual schema which describes the structure and constraints for the whole database. This schema is a complete view of the data requirements of the organization that the database serves.
- It defines what data is stored in the database and the relationships among those data.
- The conceptual level provides a level of abstraction that hides the details of how the data is stored and managed (physical level details).
- It ensures that the database’s structure can be modified without affecting the users or user applications, as changes are managed through the external level.

Internal Level (or Physical Level):
- The lowest level of the architecture.
- It includes the internal schema, which describes the physical storage structures and access paths (indexes, for example). Essentially, it specifies how the data is actually stored in the database.
- This level deals with the efficiency of the database, like how quickly and efficiently data can be retrieved.
- It allows the database administrator to optimize database performance without affecting the higher-level schemas.

Overall, the ANSI/SPARC 3-level architecture provides a framework that separates the user’s view of the database (external level), the way the database is logically organized (conceptual level), and how the data is physically stored (internal level). This separation enhances the flexibility, security, and data independence of the database system.

35
Q

External Schema

A

Definition and Purpose:
- The external schema, also known as the view level, consists of various user views of the database.
- Each external schema presents a tailored view of the database to meet the specific needs and permissions of individual users or user groups. It filters and formats the database information so that users can focus on the most relevant data for their tasks.

Customization and Security:
- The external schemas are highly customizable. For instance, a database may have one external schema for a sales department, showing sales-related data, and another for the human resources department, showing employee data.
- This customization helps in implementing security measures, as users are restricted to accessing only the data they are authorized to see. It helps in enforcing data privacy and confidentiality policies.

Data Independence:
- External schemas contribute to data independence by isolating the user applications from changes in the conceptual schema. If the structure of the database (conceptual schema) changes, these changes do not necessarily reflect in the external schemas.
- This means that users can continue to interact with the database through their familiar views even if the underlying database structure is altered.

Interface with Applications:
- Applications interact with the database through these external schemas. They serve as the interface between the users and the database.
- This level translates requests and responses between the database’s conceptual schema and the user’s view, ensuring that users can access and manage data as per their specific context and requirements.

Use in Database Design:
- When designing a database, defining external schemas is a critical step to ensure that the database serves its diverse user base effectively.
- It involves understanding the different requirements of various user groups and creating views that cater to these requirements efficiently.

In summary, external schemas in a database system provide customized views for different user groups, ensuring security, data independence, and ease of use. They allow users to interact with the database in a way that is most relevant to their specific needs, without exposing the entire database structure or unnecessary data.

36
Q

Conceptual Schema

A

At the conceptual level, the only thing that is visible to the query language is the table and the columns in the table

Conceptual Schema Overview:
- The conceptual schema is a representation of the entire content of a database at a high level of abstraction, typically the logical level.
- It includes all the entities, attributes, relationships, constraints, and other elements that are relevant to the organization’s data model.

What Applications See at the Conceptual Schema Level:
- At this level, applications are exposed to the structures of the database, such as the names of tables and their columns (attributes).
- It provides a map of entities and their relationships, which is crucial for understanding how data is organized and interconnected within the database.
- However, it abstracts away the physical storage details. It doesn’t reveal how the data is stored, indexed, or the specific physical data paths.

Importance for Applications:
- The conceptual schema is important for application development and interaction because it provides a clear and consistent view of the data structure without burdening the application with underlying physical storage details.
- This allows applications to be developed and function independently of the physical database design, promoting database longevity and flexibility in the face of changes to the physical layout.

37
Q

Internal Schema

A

Definition and Role:
- The internal schema, also known as the physical level,** defines the physical storage structure of the database.**
- It specifies how the data is stored in the database, including the data storage format, indexing strategies, data paths, physical data structures, and storage allocation.

Physical Storage Details:
- This schema includes details about the data files, record layouts, file structures, and indexes used to achieve efficient access to stored data.
- It also determines how data is compressed, encrypted, or split across multiple storage locations (like in partitioning).

Optimization for Performance:
- The internal schema is primarily focused on optimizing database performance and efficient data storage.
- Decisions made at this level impact the speed and efficiency of database operations, such as retrieval, update, and transaction processing.

Data Independence:
- Just as the external schema provides independence from the conceptual schema, the internal schema provides physical data independence.
-** Changes made to the internal schema (like using a different storage device or changing the file structure) do not affect the conceptual or external schemas. This means applications and users can remain unaffected by changes in physical storage or optimization strategies.**

Invisible to End Users:
- Unlike the external schema, the internal schema is completely hidden from end-users and application developers.
- Users interact with the database at the conceptual and external levels, unaware of the complexities involved in how the data is physically stored and managed.

In summary, the internal schema of a database system deals with the physical aspects of data storage and management. It is crucial for optimizing database performance, ensuring efficient data management, and providing physical data independence. The internal schema enables the database to operate effectively and efficiently, without exposing these complexities to the end users or application developers.

38
Q

Physical Data Independence

A

The three levels of the ANSI-SPARC Architecture provides two types of independence. The first one is Physical Data Independence. Database technology allows you to make that measurement almost perfect.

Generally, physical data independence exists in most databases and file environments where physical details, such as the exact location of data on disk, and hardware details of storage encoding, placement, compression, splitting, merging of records, and so on are hidden from the user. Applications remain unaware of these details.

39
Q

Logical Data Independence

A

The second part of independence provided by ANSI Sparc Architecture. It is more difficult than Physical Data Independence. The reason why is because the external schema is logically derived from the conceptual schema

logical data independence is harder to achieve because it allows structural and constraint changes without affecting application programs—a much stricter requirement.

40
Q

client/server DBMS architecture

A

In a basic client/server DBMS architecture, the system functionality is distributed between two types of modules.
A client module is typically designed so that it will run on a mobile device, user workstation, or personal computer (PC). Typically, application programs and user interfaces that access the database run in the client module. Hence, the client module handles user interaction and provides the user-friendly interfaces such as apps for mobile devices, or forms- or menu-based GUIs (graphical user interfaces) for PCs.

The other kind of module, called a server module, typically** handles data storage, access, search, and other functions**.

41
Q

Data Abstraction

A

Data abstraction generally refers to the suppression of details of data organization and storage, and the highlighting of the essential features for an improved understanding of data. One of the main characteristics of the database approach is to support data abstraction so that different users can perceive data at their preferred level of detail.

42
Q

Data Model

A

A data model—a collection of concepts that can be used to describe the structure of a database—provides the necessary means to achieve this abstraction.

By structure of a database we mean the data types, relationships, and constraints that apply to the data. Most data models also include a set of basic operations for specifying retrievals and updates on the database.

In addition to the basic operations provided by the data model, it is becoming more common to include concepts in the data model to specify the dynamic aspect or behavior of a database application. This allows the database designer to specify a set of valid user-defined operations that are allowed on the database objects. An example of a user-defined operation could be COMPUTE_GPA, which can be applied to a STUDENT object.

Basic Data Model Operations-generic operations to insert, delete, modify, or retrieve any kind of object are often included in the basic data model operations.

43
Q

High-Level/Conceptual Data Models

A

Conceptual data models use concepts such as entities, attributes, and relationships. An entity represents a real-world object or concept, such as an employee or a project from the miniworld that is described in the database. An attribute represents some property of interest that further describes an entity, such as the employee’s name or salary. A relationship among two or more entities represents an association among the entities, for example, a works-on relationship between an employee and a project.

44
Q

Low-level/physical data models

A

Physical data models describe how data is stored as files in the computer by representing information such as record formats, record orderings, and access paths. An access path is a search structure that makes the search for particular database records efficient, such as indexing or hashing.

45
Q

Representational Models

A

Between High-Level/Conceptual and Low-Level/Physical data models s a class of representational (or implementation) data models, which provide concepts that may be easily understood by end users but that are not too far removed from the way data is organized in computer storage

Representational or implementation data models are the models used most frequently in traditional commercial DBMSs.
Representational data models represent data by using record structures and hence are sometimes called record-based data models.

46
Q
A
47
Q

Schema Evolution

A

Although, as mentioned earlier, the schema is not supposed to change frequently, it is not uncommon that changes occasionally need to be applied to the schema as the application requirements change. For example, we may decide that another data item needs to be stored for each record in a file, such as adding the Date_of_birth to the STUDENT schema in . This is known as schema evolution. Most modern DBMSs include some operations for schema evolution that can be applied while the database is operational.

48
Q

DBMS Catalog

A
  • The DBMS catalog is a repository of metadata, which contains information about the database structure, such as schema definitions, data types, and mappings.
  • In a multiple-level DBMS, this catalog must include metadata for all three levels. It needs to describe not only the data and its structure but also the relationships and mappings between the different levels.
49
Q

data definition language (DDL)

A

In many DBMSs where no strict separation of levels is maintained, one language, called the data definition language (DDL), is used by the DBA and by database designers to define both schemas. The DBMS will have a DDL compiler whose function is to process DDL statements in order to identify descriptions of the schema constructs and to store the schema description in the DBMS catalog.

  • In many DBMSs, the conceptual and internal schemas are defined using a language known as the Data Definition Language (DDL).
  • The DDL allows database administrators (DBAs) and designers to define both the conceptual and internal schemas in a standardized format.
  • In systems where a strict separation of levels is not maintained, the DDL serves as a unified way to define both the logical structure and physical storage details of the database.
  • In DBMSs where a clear separation is maintained between the conceptual and internal levels, the DDL is used to specify the conceptual schema only.
50
Q
A
51
Q
A