WSI Flashcards
Introduction to Information
refers to databases that don’t follow the traditional relational model used by SQL databases. The term can be interpreted as “non-SQL” or “not only SQL.” NoSQL databases offer more flexibility and natural data storage compared to SQL databases. NoSQL is a broader database management approach, while SQL is specifically a query language.
NoSQL
database
Key-value database
Document database
Wide-column database
Graph database
is information that’s set up for easy access, management and updating.
database
HISTORY: Early databases used flat files, which were unstructured and inefficient. Hierarchical and network models were developed to improve organization and relationships between data. The relational database model, introduced by Edgar F. Codd, revolutionized database management by using tables with rows and columns. His ideas led to the development of SQL (Structured Query Language).
flat files -> Hierarchical -> network -> relational (Edgar F. Codd)
define what constitutes a true relational database management system. The rules cover various aspects of data storage, access, manipulation, and integrity.
While no commercial database fully implements all 12 rules, they
serve as an ideal standard for relational database systems.
Codd’s rules
Pioneered relational concepts and SQL.
IBM System R
Early commercial RDBMS inspired PostgreSQL
Ingres
First commercial RDBMS, gaining market dominance.
Oracle
Popularized relational databases for microcomputers
BASE II
Robust, flexible, and advanced features.
PostgreSQL
Fast, simple, and widely used for web applications.
MySQL
Fork of MySQL with additional features and optimizations
MariaDB
Microsoft’s RDBMS for enterprise environments.
MS SQL Server
IBM’s RDBMS for high availability and scalability.
DB2
The central component that stores, manages,
and processes data.
Server
Applications that interact with the database
by sending requests to the server.
Client
Clients send requests to the server,
and the server processes them and returns results.
Communication
Centralized management, improved security,
better resource allocation, and scalability
Benefits
- Developed by IBM researchers Donald D. Chamberlin and Raymond F. Boyce in the 1970s
- Based on Codd’s relational model.
- Designed for querying, manipulating, and managing data in relational databases.
- Standardized by ANSI and ISO.
- Widely used in RDBMS such as MySQL, PostgreSQL, Oracle, and Microsoft SQL
SQL - Structured Query Language
- Unique Features: AUTO_INCREMENT, LIMIT, REPLACE INTO
- Use Case: Web development and applications
MySQL
- Unique Features: JSON/JSONB, WINDOW FUNCTIONS, RETURNING clause
- Use Case: Complex queries, data integrity, high concurrency
PostgreSQL
- Unique Features: TOP, TRY-CATCH, MERGE, CTEs
- Use Case: Enterprise-level applications, Microsoft ecosystem
Microsoft SQL Server (T-SQL)
- Unique Features: CURSORS, TRIGGERS, SEQUENCES, procedural extensions
- Use Case: Large enterprises requiring complex business logic
Oracle SQL (PL/SQL)
- Unique Features: Single-file database, limited concurrency, no user management
- Use Case: Embedded systems, mobile apps, small-scale software
SQLite
- Unique Features: Virtual columns, storage engines, compatibility with MySQL
- Use Case: Performance-enhanced alternative to MySQL
MariaDB
- Unique Features: PureXML, multi-dimensional clustering, RCAC
- Use Case: Large organizations requiring high throughput and complex queries
IBM DB2
For text fields like Title, Author, Category, ISBN, Publisher, Address, and Phone.
NVARCHAR(200):
The process of organizing data to minimize redundancy and ensure data integrity.
Normalization
A set of rules for achieving normalization.
Normal Forms
Data in a row-column intersection should be indivisible
Atomicity
- Unique Identifier: A primary key uniquely identifies each record.
- Avoid Changing: Primary keys should be stable to prevent errors.
- Artificial Primary Key: A generated integer or other unique value.
Primary Key
Separate Related Data: Move columns related to other entities into separate tables (e.g., publisher, author).
Redundancy
- Foreign Key Constraints: Automatically enforce referential integrity by ensuring foreign key values point to existing primary key values.
- Data Integrity: Prevent orphaned records and broken references.
Constraints: Ensuring Data Integrity
- SQL is declarative, allowing users to specify the desired result without detailing the process.
- This simplifies querying and makes it efficient for large datasets.
Declarative Language
- Different database systems have their own SQL dialects with variations in syntax and features.
- This provides flexibility and optimization for specific use cases.
SQL Dialects
- DQL (Data Query Language): Retrieves data.
- DDL (Data Definition Language): Defines database structure.
- DML (Data Manipulation Language): Modifies data.
- DCL (Data Control Language): Manages permissions and security.
- TCL (Transaction Control Language): Manages transactions.
SQL Sublanguages
- DQL (Data Query Language):
Retrieves data.
- DDL (Data Definition Language):
Defines database structure.
- DML (Data Manipulation Language):
Modifies data.
- DCL (Data Control Language):
Manages permissions and security.
- TCL (Transaction Control Language):
Manages transactions.
- CREATE TABLE: Creates a new table.
- ALTER TABLE: Modifies an existing table.
- DROP TABLE: Deletes a table.
- CREATE INDEX: Creates an index for faster querying.
- DROP INDEX: Deletes an index.
DDL Statements
Returns rows that have a match in both tables.
INNER JOIN
Returns all rows from the left table, along with matching rows from the right table.
LEFT JOIN
Returns all rows from the right table, along with matching rows from the left table.
RIGHT JOIN
Returns all rows from both tables, whether or not they have a match in the other table.
FULL OUTER JOIN
Types of Data Models
Hierarchical Data Model
Network Data Model
Relational Data Model
Entity-Relationship (ER) Model
Object-Oriented Data Model
NoSQL Data Models
- Tree-like structure with parent-child relationships.
- Limited flexibility due to rigid hierarchy.
Hierarchical Data Model