Course 4: Introduction to Relational Databases (RDBMS) Flashcards
information model
abstract, formal representation of entities that includes their properties, relationships and the operations that can be performed on them
data models
concrete, specific and include details
hierarchal data model
organizes its data using tree structure with root of the tree being parent and followed by child nodes
relational data model
most used model. data stored in tables, providing logical data independence, physical data independence, and physical storage independence
entities
noun such as a person, place, or thing and represented by rectangle in ER diagram
attributes
properties or characteristics of an entity
relationship sets
represented by diamonds with lines connecting associated entities to show relationship and whether one-to-one, one-to-many, or many-to-many relationship
relation
another term for table
relation instance
table made up of rows and columns
degree
number of attributes (columns) in a relation
cardinality
number of tuples (rows)
single-tier database architecture
database resides on user’s system and access is often limited to a single user. useful for dev or testing or when database is embedded in a local application
client-server (2-tier) database architecture
resides on a remote server and users access it from client systems often through web page or local app. used for multi-user scenarios and typical for production environments
cloud database architecture
easy for users to access and dont have to maintain support infrastructure. client apps and users typically access through an application server layer or interface in the cloud. flexible and used for dev, testing, and full production environments
layers of a database management system software
1 data access layer
2 database engine layer
3 database storage layer
4 data access layer
3-tier database architecture
resides on a remote server and users access it through a middle-tier
distributed architectures
- mission critical, large scale workloads
- high availability and scalability
- databases distributed on cluster of servers
- shared disk storage
- shared nothing architecture - replication and partitioning
replication
changes taking place on db server are replicated to one or more db serverse
high availability replica
when the replica is within the same location as primary db. clients can be rerouted to replica in case of primary failure
partitioning
separating large quantities of data into multiple logical subsets
sharding
when partitions are placed on separate nodes in a cluster and has its own compute resources
3 main classes of database users
1 data engineers
2 data scientists and business analysts
3 app developers
common data engineer and data administrator task tools
gui or web based management tools
graphical tools
command line interfaces
APIs
common data scientist task tools
jupyter r studio zeppelin SAS SPSS SQL
common business analyst tools
excel ibm congos power bi tableau microstrategy SQL
application development popular ORM (object relational mapping) frameworks
ActiveRecord in Ruby apps Django in python Entity Framework in .NET Hibernate in Java Sequelize in JavaScript
storage engine
component that handles the SQL ops on a table and defines what features that table can use
PostgreSQL
free open source object-relational database management system
MySQL
object-relational database that supports many operating systems, a range of languages for client application development, relational and JSON data, multiple storage engines, and high availability and scalability options
Data Definition Language Statements (DDL)
used to define, change, or drop database objects such as tables
Data Manipulation Language Statements (DML)
are used to read and modify data in tables. These are also sometimes referred to as CRUD operations, that is, Create, Read, Update and Delete rows in a table
Common DDL statement types
CREATE, ALTER, TRUNCATE, and DROP.
Common DML statement types
INSERT, SELECT, UPDATE, and DELETE