SQL - Introduction Flashcards
Review Major Concepts of SQL Programming Language Basics
What does the acronym SQL stand for?
Sequel or Structured Query Language.
Originally SEQUEL, or Structured English Query Language. SEQUEL ultimately became SQL, the Structured Query Language. Ironically, the SQL standard treats “SQL” as the name of this language and not as an acronym. Also, “SQL Query Language.”
What is SQL for?
SQL is the most widely used language to administer and query relational databases.
Research: relational databases and Relational database management systems (RDBMS)
An RDBMS is defined as a system whose users view data as a collection of tables related to each other through common data values. Data is stored in tables, which are composed of rows and columns. Tables of independent data can be linked (or related) to one another if they each have unique, identifying columns of data (called keys) that represent data values held in common.
When was SQL first created and by whom?
In the 1970’s. In the early 1970s, the seminal work of IBM research fellow Dr. E. F. Codd led to the development of a relational data model product called SEQUEL.
E. F. Codd first described relational database theory in his landmark paper “A Relational Model of Data for Large Shared Data Banks,” published in the Communications of the ACM (Association for Computing Machinery) in June 1970.
Research: IBM research fellow Dr. E. F. Codd
Who publishes the standard for SQL?
Originally, the American National Standards Institute (ANSI) in cooperation with the International Standards Organization (ISO).
Now the International Standards Organization (ISO) publishes the standard.
Research: what is the latest version of the SQL standard?
hint: search ISO/IEC 9075
ISO/IEC 9075-11:2023
2023 as of June 2024
not all of 2023 standard is implemented yet.
List: some vendors of databases that use SQL.
Commercial databases: Oracle and Microsoft SQL Server.
Open source databases: MySQL, PostgreSQL, and MariaDB
How is math related to SQL?
based upon earlier mathematical concepts—in particular, relational algebra.
Codd applied rigorous mathematical theories (primarily relational algebra and set theory) to the management of data, from which he compiled a list of criteria a database must meet to be considered relational.
This table-centric orientation is the heart of set programming. Consequently, almost all SQL commands operate much more efficiently against sets of data within or across tables than against individual records. Said another way, effective SQL programming requires that you think in terms of sets of data, rather than of individual rows.
Relational databases use relational algebra and tuple relational calculus to mathematically model the data in a given database and queries acting upon that data
Research: relational algebra and set theory.
What is ‘normalization’ in SQL?
Dr. E. F. Codd also articulated rules that governed how a relational database should be designed. The process for applying these rules is now known as normalization.
List and Describe: Codd’s Twelve Principles of Relational Databases
Rule 0: For any system that is advertised as, or claimed to be, a relational database management system, that system must be able to manage databases entirely through its relational capabilities.
Rule 1: Information is represented logically in tables.
Rule 2: Data must be logically accessible by table, primary key, and column.
Rule 3: Null values must be uniformly treated as “missing information,” not as empty strings, blanks, or zeros.
Rule 4: Metadata (data about the database) must be stored in the database just as regular data is.
Rule 5: A single language must be able to define data, views, integrity constraints, authorization, transactions, and data manipulation.
Rule 6: Views must show the updates of their base tables and vice versa.
Rule 7: A single operation must be available to do each of the following operations: retrieve data, insert data, update data, or delete data.
Rule 8: Batch and end-user operations are logically separate from physical storage and access methods.
Rule 9: Batch and end-user operations can change the database schema without having to re-create it or the applications built upon it.
Rule 10: Integrity constraints must be available and stored in the metadata, not in an application program.
Rule 11: The data manipulation language of the relational system should not care where or how the physical data is distributed and should not require alteration if the physical data is centralized or distributed.
Rule 12: Any row processing done in the system must obey the same integrity rules and constraints that set-processing operations do.
How is data shown in the relational model?
In the relational model,
Tables: Data is shown logically as a two-dimensional table that describes a single entity (for example, business expenses). Academics refer to tables as entities.
Rows & Columns: Tables are composed of rows, or records (academics call them tuples), and columns (called attributes, since each column of a table describes a specific attribute of the entity).
Values: The intersection of a record and a column provides a single value. However, it is quite common to hear this referred to as a field, from spreadsheet parlance.
Primary Key: The column or columns whose values uniquely identify each record can act as a primary key.
Relational design handles data on a table-by-table basis, not on a record-by-record basis.
Describe: Use SQL’s terminology used to describe the hierarchical data structures used by a relational database.
The Hierarchical Data Structures used by relational databases:
**Catalogs **: catalogs contain sets of schemas;
Schemas: schemas contain sets of objects, such as tables and views; and
Tables: tables are composed of sets of columns and records.
Define the term ‘atomic’ in SQL terminology.
atomic; that is, each column is supposed to contain one, and only one, value.