INTRODUCTION Flashcards
_ refers to the process of collecting, storing, organizing, and sharing information in a way that makes it accessible, useful, and secure. It involves handling data and information from various sources, ensuring it is accurate, easy to retrieve, and protected from unauthorized access or loss. Good _ helps businesses make better decisions, stay compliant with regulations, and improve efficiency by ensuring that the right information is available to the right people when they need it.
Information Management
Gathering information from multiple sources, such as internal databases, external systems, or manual input.
Data Collection
Structuring and categorizing information so it can be easily found. This might include creating taxonomies, tags, or databases.
Organization
Choosing how and where information will be stored. This could be physical _ (e.g., paper records) or digital _ (e.g., cloud systems, databases).
Storage
Ensuring that authorized individuals can easily retrieve the information they need. This involves creating efficient search systems and using appropriate indexing techniques.
Access and Retrieval
Safeguarding sensitive information and ensuring compliance with relevant laws and regulations (e.g., GDPR, HIPAA).
Security and Compliance
Facilitating effective sharing of information among teams, departments, or stakeholders in a timely and secure manner.
Communication
Raw facts and figures without context
DATA
Data that is processed, organized, and meaningful
INFORMATION
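As a rough illustration of the two cards above, the sketch below turns raw figures (data) into a labeled summary (information). The readings, label, and field names are made up for the example.

```python
# Raw data: bare numbers with no context (hypothetical values).
raw_readings = [21.5, 22.0, 19.5, 23.0]


def summarize(readings, label, unit):
    """Process raw figures and attach context so they become meaningful."""
    return {
        "label": label,
        "unit": unit,
        "count": len(readings),
        "average": sum(readings) / len(readings),
        "maximum": max(readings),
    }


# Information: the same numbers, processed, organized, and given meaning.
info = summarize(raw_readings, label="Office temperature", unit="degrees Celsius")
print(info)
```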
A _ is an organized collection of data that is stored and accessed electronically, typically in a digital format. It allows for efficient storage, retrieval, management, and manipulation of data. _ are structured in a way that makes it easy to store large amounts of information and retrieve it quickly using queries.
Database
_ are software tools used to create, manage, and interact with databases.
Different types of _ exist based on how data is organized and how users interact with it.
Database Management Systems (DBMS)
Data is organized into tables (rows and columns), and relationships are established using keys.
Relational DBMS (RDBMS)
Examples of Relational DBMS (RDBMS)
MySQL, PostgreSQL, Oracle, SQL Server
Uses Structured Query Language (SQL) for data manipulation.
Data is highly structured, making it easy to query and join tables.
Ideal for transactional applications, such as banking systems.
Key Features of RDBMS
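The relational cards above can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not a production schema: the table names, column names, and sample rows are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Two tables; customer_id is the primary key of one and a foreign key in the other.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        order_date  TEXT NOT NULL
    )
    """
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, '2024-01-15')")

# The key relationship lets SQL join the two tables.
joined = conn.execute(
    """
    SELECT c.name, o.order_id, o.order_date
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    """
).fetchall()

# An order that points at a customer who does not exist is rejected.
try:
    conn.execute("INSERT INTO orders VALUES (101, 999, '2024-01-16')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```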
More flexible than relational databases; data can be stored in various formats such as key-value pairs, documents, graphs, or wide columns.
NoSQL DBMS Structure
Examples of NoSQL DBMS
MongoDB (document-based), Cassandra (wide-column), Redis (key-value), Neo4j (graph-based)
Designed for handling large volumes of unstructured or semi-structured data.
Suitable for big data applications, real-time web apps, and scalable systems.
Typically does not require a fixed schema or predefined relationships, unlike relational databases.
Key Features of NoSQL DBMS
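To make the "no fixed schema" idea concrete, here is a sketch using plain Python dictionaries as stand-ins for a document collection and a key-value store. It does not use MongoDB or Redis; the records and the helper `find_by_field` are hypothetical.

```python
import json

# Two "documents" in one collection with different fields: no shared schema.
products = [
    {"_id": 1, "name": "Laptop", "specs": {"ram_gb": 16, "cpu": "8-core"}},
    {"_id": 2, "name": "T-shirt", "sizes": ["S", "M", "L"]},
]

# A key-value store maps a key to a value the store does not interpret.
session_store = {"session:42": json.dumps({"user": "ada", "cart": [1, 2]})}


def find_by_field(collection, field):
    """Return the documents that contain the given top-level field."""
    return [doc for doc in collection if field in doc]


with_sizes = find_by_field(products, "sizes")
cart = json.loads(session_store["session:42"])["cart"]
```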
Data is stored as objects, similar to how data is represented in object-oriented programming (OOP).
Object-Oriented DBMS (OODBMS) Structure
Examples of Object-Oriented DBMS (OODBMS)
db4o, ObjectDB.
Supports object-oriented concepts such as classes, inheritance, and encapsulation.
Suitable for applications with complex data and relationships, such as CAD or multimedia systems.
Reduces the mismatch between application objects and stored data, because objects are persisted directly.
Key Features of Object-Oriented DBMS (OODBMS)
Data is organized in a tree-like structure, where each record has a single parent and can have multiple children.
Hierarchical DBMS Structure
Examples of Hierarchical DBMS
IBM’s Information Management System (IMS).
Data is stored in a hierarchy, resembling parent-child relationships.
Suitable for applications with a clear, predefined structure, like telecom or banking systems.
Less flexible than relational models and can be harder to scale.
Key Features of Hierarchical DBMS
Similar to hierarchical DBMS, but allows more complex relationships with multiple parent-child connections (many-to-many relationships).
Network DBMS Structure
Examples of Network DBMS
Integrated Data Store (IDS), TurboIMAGE.
Uses a graph-like structure where records can have multiple relationships.
Better for complex relationships compared to hierarchical models, but still rigid compared to relational databases.
Key Features of Network DBMS
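The difference between the hierarchical and network models can be sketched with plain Python structures. This is only an illustration of the two shapes, not how IMS or IDS store records; all record names are invented.

```python
# Hierarchical model: every record has exactly one parent (a tree).
parent_of = {
    "Sales": "Company",
    "Engineering": "Company",
    "Alice": "Engineering",
}


def path_to_root(record):
    """Follow the single parent link from a record up to the root."""
    path = [record]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


# Network model: a record may have several parents (a graph).
owners_of = {
    "Alice": {"Engineering", "Project X"},
    "Bob": {"Sales", "Project X"},
}


def members_of(owner):
    """List every record that has the given owner among its parents."""
    return sorted(name for name, owners in owners_of.items() if owner in owners)
```

In the tree, "Alice" can only be reached through "Engineering"; in the graph, the same record is reachable from both a department and a project.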
A modern take on relational databases, designed to provide the horizontal scalability of NoSQL systems while retaining SQL and transactional guarantees.
NewSQL DBMS Structure
Examples of NewSQL DBMS
Google Spanner, NuoDB, CockroachDB
Designed to offer the same ACID properties (Atomicity, Consistency, Isolation, Durability) as RDBMS, but with higher scalability and performance.
Suitable for cloud applications and large-scale, distributed systems.
Key Features of NewSQL DBMS
Data is stored in columns rather than rows, which can be more efficient for certain types of queries.
Columnar DBMS Structure
Examples of Columnar DBMS
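ClickHouse, Amazon Redshift, Vertica. Apache Cassandra and HBase are sometimes listed here, but they are wide-column stores rather than column-oriented analytical databases.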
Optimized for read-heavy workloads and analytical queries.
Well-suited for data warehousing and business intelligence tasks where queries often involve aggregating data over large volumes.
Key Features of Columnar DBMS
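A small sketch of why column storage helps aggregation: the same records are held once row by row and once column by column. Real columnar engines add compression and on-disk layouts that are not modeled here; the sample records are invented.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Column-oriented layout: all values of one column are stored together.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# Summing from rows touches every field of every record...
total_from_rows = sum(row["amount"] for row in rows)
# ...while the columnar layout only reads the one column it needs.
total_from_columns = sum(columns["amount"])
```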
Stores data primarily in RAM rather than on disk for faster access and processing.
In-Memory DBMS Structure
Examples of In-Memory DBMS
Redis, Memcached, SAP HANA
Extremely fast read and write operations due to memory-based storage.
Typically used for real-time applications, caching, and performance-critical systems.
Key Features of In-Memory DBMS
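The caching use case mentioned above can be sketched with a dictionary held in RAM in front of a slower lookup. This is not Redis or Memcached; `load_from_disk` is a hypothetical stand-in for a slow, disk-backed read.

```python
cache = {}                  # lives in memory, so lookups are fast
stats = {"slow_calls": 0}   # counts how often the slow path is taken


def load_from_disk(key):
    """Hypothetical stand-in for a slow, disk-backed lookup."""
    stats["slow_calls"] += 1
    return f"value-for-{key}"


def get(key):
    """Serve from memory when possible; fall back to the slow source once."""
    if key not in cache:
        cache[key] = load_from_disk(key)
    return cache[key]


first = get("user:1")    # misses the cache and takes the slow path
second = get("user:1")   # served from memory
```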
A _ is a type of database management system that organizes data into structured tables with rows and columns, following a relational model. It allows for the storage, retrieval, and management of data efficiently by defining relationships between tables using keys (primary, foreign, etc.).
Relational Database Management System (RDBMS)
Data is stored in tables (also called relations), which are made up of rows (records) and columns (attributes or fields).
Each _ represents an entity (e.g., Customers, Orders) and each column represents a property or attribute of that entity (e.g., Customer ID, Order Date).
Tables (Relations)
_ is the standard language used to interact with an RDBMS. It is used to query, insert, update, and delete data.
_ commands include operations like SELECT, INSERT, UPDATE, and DELETE, which allow you to manage the data.
SQL (Structured Query Language)
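The four commands named above can be run against an in-memory SQLite database from Python. This is a minimal sketch; the `customers` table and its rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# INSERT: add new rows.
conn.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Ada", "London"), ("Grace", "New York"), ("Linus", "Helsinki")],
)

# UPDATE: change existing rows that match a condition.
conn.execute("UPDATE customers SET city = ? WHERE name = ?", ("Cambridge", "Ada"))

# DELETE: remove rows that match a condition.
conn.execute("DELETE FROM customers WHERE name = ?", ("Linus",))

# SELECT: query the data.
remaining = conn.execute(
    "SELECT name, city FROM customers ORDER BY name"
).fetchall()
```

The `?` placeholders pass values separately from the SQL text, which avoids building queries by string concatenation.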
Data Integrity
ACID
Atomicity
Consistency
Isolation
Durability
_ properties ensure reliable transactions in an RDBMS.
ACID
A transaction is all-or-nothing.
Atomicity
The database remains in a valid state before and after the transaction.
Consistency
Transactions do not interfere with each other.
Isolation
Once committed, the transaction’s effects are permanent.
Durability
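Atomicity and consistency can be observed with a small money-transfer sketch in SQLite. The `accounts` table, the CHECK constraint, and the `transfer` helper are assumptions for illustration; isolation and durability are not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move money as one transaction; return True if it was committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)
# This transfer would overdraw alice, so the credit to bob is undone as well.
failed = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

The second call credits bob first and only then hits the constraint on alice, so the unchanged balance for bob shows the all-or-nothing behavior.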
Relationships
One-to-One
One-to-Many
Many-to-Many
Each record in one table is linked to exactly one record in another.
One-to-One
One record in a table is related to multiple records in another table (e.g., one Customer can have multiple Orders).
One-to-Many
Multiple records in one table are related to multiple records in another (e.g., Students and Courses).
Many-to-Many
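A many-to-many relationship such as Students and Courses is commonly modeled with a third, linking table. The sketch below assumes a table named `enrollments` and invented sample rows; it is one usual way to do this, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE students (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE courses  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Linking table: one row per (student, course) pair.
    CREATE TABLE enrollments (
        student_id INTEGER NOT NULL REFERENCES students(student_id),
        course_id  INTEGER NOT NULL REFERENCES courses(course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO students VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO courses  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollments VALUES (1, 10), (1, 20), (2, 10);
    """
)


def courses_for(student_name):
    """Titles of all courses a student is enrolled in."""
    rows = conn.execute(
        """
        SELECT c.title FROM courses AS c
        JOIN enrollments AS e ON e.course_id = c.course_id
        JOIN students AS s ON s.student_id = e.student_id
        WHERE s.name = ? ORDER BY c.title
        """,
        (student_name,),
    ).fetchall()
    return [title for (title,) in rows]


def students_in(course_title):
    """Names of all students enrolled in a course."""
    rows = conn.execute(
        """
        SELECT s.name FROM students AS s
        JOIN enrollments AS e ON e.student_id = s.student_id
        JOIN courses AS c ON c.course_id = e.course_id
        WHERE c.title = ? ORDER BY s.name
        """,
        (course_title,),
    ).fetchall()
    return [name for (name,) in rows]
```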
The process of organizing data to reduce redundancy and dependency by dividing large tables into smaller, more manageable ones.
Helps improve data integrity and efficiency, avoiding data anomalies during operations like updates or deletions.
Normalization
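As an informal illustration of normalization, the sketch below splits one wide list of order rows, where customer details repeat, into two smaller structures. The field names and the `normalize` helper are hypothetical, and formal normal forms are not covered.

```python
# Before: customer details are repeated on every order row (redundancy).
flat_orders = [
    {"order_id": 100, "customer_id": 1, "customer_name": "Ada", "customer_city": "London"},
    {"order_id": 101, "customer_id": 1, "customer_name": "Ada", "customer_city": "London"},
    {"order_id": 102, "customer_id": 2, "customer_name": "Grace", "customer_city": "New York"},
]


def normalize(rows):
    """Split the wide rows into a customers table and an orders table."""
    customers = {}
    orders = []
    for row in rows:
        # Each customer is stored once, keyed by customer_id.
        customers[row["customer_id"]] = {
            "name": row["customer_name"],
            "city": row["customer_city"],
        }
        # Orders keep only a reference to the customer.
        orders.append({"order_id": row["order_id"], "customer_id": row["customer_id"]})
    return customers, orders


customers, orders = normalize(flat_orders)

# After: a change of city is a single update, so rows cannot disagree.
customers[1]["city"] = "Cambridge"
```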
An _ improves the speed of data retrieval operations on a table. It acts like a lookup table to quickly find rows based on specific columns.
Commonly used on columns that are frequently queried or used in joins.
Index
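One way to see an index at work is to ask SQLite for its query plan before and after creating one. The table, the index name `idx_orders_customer`, and the `query_plan` helper are assumptions for this sketch; the exact plan wording differs between SQLite versions, so no expected output is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, order_date) VALUES (?, ?)",
    [(i % 100, "2024-01-01") for i in range(1000)],
)


def query_plan(sql, params=()):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last field of each plan row is the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


lookup = "SELECT * FROM orders WHERE customer_id = ?"

plan_before = query_plan(lookup, (42,))  # no index yet: the table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(lookup, (42,))   # the plan now refers to the index

print(plan_before)
print(plan_after)
```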
Advantages of RDBMS
Structured Data Management
Data Integrity
Flexibility with Queries
Scalability
RDBMS excels in managing structured data with clear relationships.
Structured Data Management
ACID properties ensure reliability and consistency of data during operations.
Data Integrity
Complex queries can be easily run using SQL, which supports filtering, sorting, joining, and aggregating data across tables.
Flexibility with Queries
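The four operations listed in that card (filtering, sorting, joining, aggregating) can appear in a single query. The sketch below uses SQLite with invented tables and amounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO orders VALUES (100, 1, 40.0), (101, 1, 60.0), (102, 2, 25.0), (103, 3, 5.0);
    """
)

totals = conn.execute(
    """
    SELECT c.name, COUNT(*) AS order_count, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id  -- joining
    WHERE o.amount >= 10                                -- filtering
    GROUP BY c.name                                     -- aggregating
    ORDER BY total DESC                                 -- sorting
    """
).fetchall()
```

Linus drops out of the result because his only order is below the filter threshold.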
An RDBMS can scale for large applications, though scaling horizontally (across servers) is often more challenging than with NoSQL systems.
Scalability
Disadvantages of RDBMS
Performance Overhead
Rigid Schema
Not Ideal for Unstructured Data
As databases grow in size, complex queries and joins can slow down performance.
Performance Overhead
Schema changes can be difficult, especially in large, mature databases.
Rigid Schema
An RDBMS is best suited to structured data; handling unstructured data (e.g., images, text) can be cumbersome.
Not Ideal for Unstructured Data