TB2 878u7 Flashcards

Question 1

Q

Normalisation

Answer

A

Normalisation is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like Insertion, Update and Deletion Anomalies.

1NF: A table is in 1NF if it contains no repeating groups or arrays, which means each cell contains a single value. Each record needs to be unique.
Key Points: Elimination of duplicate columns from the same table, creation of separate tables for each group of related data, identification of each record with a unique attribute or set of attributes known as the primary key (PK).

2NF: A table is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the PK. This means each non-key attribute must depend on the whole key, not just on part of a composite key.
Key Points: Removal of subsets of data that apply to multiple rows of a table and placement of them in separate tables, creation of relationships between these new tables and their predecessors through the use of FKs.

3NF: A table is in 3NF if it is in 2NF and there are no transitive dependencies of non-key attributes on the PK. That is, every non-key attribute is non-transitively dependent on the PK.
Key Points: Elimination of fields that do not depend on the PK; that is, any field that is functionally dependent on another field which is not a PK itself should be moved to a separate table.
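
The decomposition described above can be sketched in code. This is only an illustration: it uses SQLite through Python's sqlite3 module as a stand-in DBMS, and the table and column names (orders_flat, customers, orders) are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Unnormalised design: customer details are repeated on every order row.
cur.execute(
    "CREATE TABLE orders_flat ("
    "order_id INTEGER PRIMARY KEY, customer_email TEXT, customer_city TEXT, item TEXT)"
)
cur.executemany(
    "INSERT INTO orders_flat VALUES (?, ?, ?, ?)",
    [
        (1, "ann@example.com", "Leeds", "keyboard"),
        (2, "ann@example.com", "Leeds", "mouse"),
        (3, "bob@example.com", "York", "monitor"),
    ],
)

# 3NF design: customer_city depends on the customer, not on the order,
# so customer facts move to their own table and orders keep only an FK.
cur.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT UNIQUE, city TEXT)"
)
cur.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, "
    "customer_id INTEGER REFERENCES customers(customer_id), item TEXT)"
)
cur.execute(
    "INSERT INTO customers (email, city) "
    "SELECT DISTINCT customer_email, customer_city FROM orders_flat"
)
cur.execute("""
    INSERT INTO orders (order_id, customer_id, item)
    SELECT f.order_id, c.customer_id, f.item
    FROM orders_flat AS f
    JOIN customers AS c ON c.email = f.customer_email
""")
conn.commit()

customer_count = cur.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
order_count = cur.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# Updating a city now touches one row instead of every order for that customer.
cur.execute("UPDATE customers SET city = 'Hull' WHERE email = 'ann@example.com'")
rows_changed = cur.rowcount
```

In the flat table the same city change would have to be repeated on every order row for that customer, which is exactly the update anomaly normalisation is meant to remove.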

Question 2

Q

TYPES OF KEYS

Answer

A

In the relational model, keys are important because they are used to ensure that each row in a table is uniquely identifiable. They are also used to establish relationships among tables and to ensure the integrity of the data. A key consists of one or more attributes that determine other attributes.

Foreign Key
Primary Key
Composite Key
Alternate Key

An Alternate Key is a Candidate Key that was not chosen as the Primary Key. It has all the properties needed to be the PK and can consist of a single attribute or multiple attributes.
Its values must be UNIQUE, although many DBMSs allow NULL in a UNIQUE column. (e.g. email or phone number)
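
A small sketch of the four key types listed above, again using SQLite via Python's sqlite3 as an assumed stand-in DBMS. The students and enrolments tables and the is_rejected helper are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is enabled

conn.execute("""
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,   -- primary key
        email      TEXT UNIQUE            -- alternate key: unique, NULL allowed
    )
""")
conn.execute("""
    CREATE TABLE enrolments (
        student_id  INTEGER NOT NULL REFERENCES students(student_id),  -- foreign key
        module_code TEXT NOT NULL,
        PRIMARY KEY (student_id, module_code)                          -- composite key
    )
""")

conn.execute("INSERT INTO students VALUES (1, 'ann@example.com')")
conn.execute("INSERT INTO students VALUES (2, NULL)")
conn.execute("INSERT INTO enrolments VALUES (1, 'DB101')")
conn.commit()

def is_rejected(sql):
    """Return True if the statement violates a key constraint."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

duplicate_email = is_rejected("INSERT INTO students VALUES (3, 'ann@example.com')")
duplicate_enrolment = is_rejected("INSERT INTO enrolments VALUES (1, 'DB101')")
orphan_enrolment = is_rejected("INSERT INTO enrolments VALUES (99, 'DB101')")
second_null_email = is_rejected("INSERT INTO students VALUES (4, NULL)")
```

The first three statements are refused by the alternate, composite and foreign keys respectively. The last one is accepted because SQLite, like PostgreSQL, treats NULLs in a UNIQUE column as distinct from each other.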

Question 3

Q

JOINs

Answer

A

In SQL, a JOIN is an operation that combines rows from two or more tables based on a related column between them (The PKs/FKs). This allows data from multiple tables to be integrated and queried as if it were in a single table.

INNER JOIN: Also known as a simple JOIN, it returns rows when there is a match in both tables.

LEFT JOIN (or LEFT OUTER JOIN): Returns all rows from the left table and the matched rows from the right table. Where no match is found, the right-side columns are NULL.

RIGHT JOIN (or RIGHT OUTER JOIN): Returns all rows from the right table and the matched rows from the left table. Where no match is found, the left-side columns are NULL.

FULL JOIN (or FULL OUTER JOIN): Returns all rows from both tables, combining them where there is a match. Where a row has no match, the columns from the other table are NULL.

SELF JOIN: A self join is a regular join, but the table is joined with itself.

CROSS JOIN: A cross join returns the CARTESIAN product of rows from both tables.
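
To make the INNER, SELF and CROSS variants concrete, here is a minimal sketch run against SQLite through Python's sqlite3 (an assumption made so the example is runnable). The departments and employees tables are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employees (
        emp_id     INTEGER PRIMARY KEY,
        name       TEXT,
        dept_id    INTEGER REFERENCES departments(dept_id),
        manager_id INTEGER REFERENCES employees(emp_id)
    );
    INSERT INTO departments VALUES (1, 'Sales'), (2, 'IT'), (3, 'HR');
    INSERT INTO employees VALUES
        (1, 'Ann', 1, NULL),
        (2, 'Bob', 2, 1),
        (3, 'Cy',  NULL, 1);
""")

# INNER JOIN: only employees whose dept_id matches a department.
inner_rows = conn.execute("""
    SELECT e.name, d.name
    FROM employees AS e
    INNER JOIN departments AS d ON d.dept_id = e.dept_id
    ORDER BY e.emp_id
""").fetchall()

# SELF JOIN: employees joined to employees to look up each manager's name.
self_rows = conn.execute("""
    SELECT e.name, m.name
    FROM employees AS e
    JOIN employees AS m ON m.emp_id = e.manager_id
    ORDER BY e.emp_id
""").fetchall()

# CROSS JOIN: every employee paired with every department (3 x 3 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM employees CROSS JOIN departments"
).fetchone()[0]
```

Cy has no department and Ann has no manager, so each drops out of the join that needs the missing value.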

Question 4

Q

LEFT JOINs

Answer

A

A LEFT JOIN is a type of join that combines rows from two or more tables based on a related column between them. The key feature of a LEFT JOIN is that it includes all rows from the left table plus any matching rows from the right table.
If there are no matches in the right table, the result still includes the rows from the left table, with NULL in the columns from the right table.
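
A short sketch of that NULL-padding behaviour, using SQLite via Python's sqlite3 as an assumed environment and hypothetical customers and orders tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 'keyboard');
""")

# Every customer is returned; Bob has no order, so o.item comes back as NULL (None).
left_rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id
""").fetchall()

# Filtering on the NULL-padded side finds left rows with no match.
no_orders = conn.execute("""
    SELECT c.name
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    WHERE o.order_id IS NULL
""").fetchall()
```

The second query is a common use of LEFT JOIN: listing the rows that have nothing on the other side.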

Question 5

Q

RIGHT JOINs

Answer

A

A RIGHT JOIN is a type of join that combines rows from two or more tables based on a related column between them. The key feature of a RIGHT JOIN is that it includes all rows from the right table, plus any matching rows from the left table.
If there are no matches in the left table, the result still includes the rows from the right table, with NULL in the columns from the left table.

Question 6

Q

FULL JOINs

Answer

A

A FULL JOIN is a type of join that combines rows from two or more tables based on a related column between them. The key feature of a FULL JOIN is that it includes all rows from both the left and the right tables. When there is a match in the related column between the two tables, it combines the matched rows into a single row.

If there is no match, the result will still show every row from both tables, but with NULL in the columns from the table that lacks a corresponding match.

Thus, a FULL JOIN provides a complete set of records from both tables, with matching records from both sides where available. For non-matching records from either table, the query fills in with NULLs to indicate the absence of a match. This type of join is useful when you need to understand the relationship between two tables, including the presence of unmatched records on either side.
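
The sketch below shows the result a FULL JOIN produces. Because older SQLite versions lack FULL JOIN, it emulates one with a LEFT JOIN plus the unmatched right-hand rows. In PostgreSQL the same result would come from a single FULL OUTER JOIN. The tables and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 'keyboard'), (11, 99, 'mouse');
""")

full_rows = conn.execute("""
    -- All customers, with their orders where they exist.
    SELECT c.name, o.item
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    UNION ALL
    -- Plus the orders that matched no customer (c.name is NULL here).
    SELECT c.name, o.item
    FROM orders AS o
    LEFT JOIN customers AS c ON c.customer_id = o.customer_id
    WHERE c.customer_id IS NULL
""").fetchall()
```

The result holds one matched row, one customer without an order, and one order without a customer, each padded with NULL on the side that has no match.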

Question 7

Q

VIEWs

Answer

A

A VIEW in databases is a virtual table created by a query that selects data from one or more underlying tables. It does not store data itself but presents data from these tables in a format specified by the query. Views can simplify complex queries, provide a specific data perspective or subset, and enhance data security by restricting access to certain data.
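
A minimal sketch of a view that hides a sensitive column, assuming SQLite via Python's sqlite3. The staff table and staff_directory view are illustrative names only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (staff_id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary INTEGER);
    INSERT INTO staff VALUES (1, 'Ann', 'Sales', 42000), (2, 'Bob', 'IT', 51000);

    -- The view exposes names and departments but hides the salary column.
    CREATE VIEW staff_directory AS
        SELECT staff_id, name, department FROM staff;
""")

cursor = conn.execute("SELECT * FROM staff_directory")
view_columns = [column[0] for column in cursor.description]
before = len(cursor.fetchall())

# The view stores no data of its own: a new base-table row shows up immediately.
conn.execute("INSERT INTO staff VALUES (3, 'Cy', 'HR', 39000)")
after = conn.execute("SELECT COUNT(*) FROM staff_directory").fetchone()[0]
```

Querying the view looks exactly like querying a table, which is what makes it useful for both simplification and access restriction.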

Question 8

Q

Views adv

Answer

A

Advantages of Using VIEWs:
Data Abstraction: Views can simplify the complexity of data by providing a specific, focused perspective of the data, which is especially useful when the underlying data structures are complex with multiple tables and joins.

Security: Views can act as a layer of security to restrict user access to specific rows and columns of data, ensuring users only see the data they are authorized to access.

Query Simplification: Views allow users to save complex queries as virtual tables, so they can be reused without needing to rewrite the query. This can make frequent complex queries much more manageable.

Question 9

Q

Views Disadv

Answer

A

Disadvantages of Using VIEWs:
Performance: Views can sometimes lead to performance issues, especially when they are built on top of other views or involve complex calculations, as the database must execute the underlying queries each time the view is accessed.

Update Restrictions: Some views are not updatable or insertable, particularly those that contain aggregations, distinct clauses, or joins across multiple tables, which can limit their use in dynamic data environments.

Maintenance: If the underlying table structures change, the views may need to be updated or recreated. This can lead to additional maintenance overhead, especially if there are many views or if they are used in multiple applications.

Question 10

Q

INDEX-es

Answer

A

An INDEX in a database is a data structure that improves the speed of data retrieval operations on a table. It works similarly to an index in a book: just as a book index allows you to quickly find specific information without reading the entire book, a database index allows the database engine to find and retrieve specific rows much faster than it could by scanning the entire table.

Indexes are typically used on columns that are frequently searched or used as join keys. However, while they speed up data retrieval, they can slow down data insertion, deletion, and updating, as the index must be updated whenever the data it indexes is altered.
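
One way to see an index being used is to ask the engine for its query plan. The sketch below assumes SQLite via Python's sqlite3 and its EXPLAIN QUERY PLAN statement (PostgreSQL offers EXPLAIN for the same purpose). The table, index name and query_plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, email TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (email, city) VALUES (?, ?)",
    [(f"user{i}@example.com", "Leeds") for i in range(1000)],
)

def query_plan(sql):
    """Return SQLite's plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the readable detail

lookup = "SELECT * FROM customers WHERE email = 'user500@example.com'"

plan_before = query_plan(lookup)   # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_customers_email ON customers (email)")
plan_after = query_plan(lookup)    # the plan now searches via the new index
```

The exact wording of the plan text varies between SQLite versions, so it is safer to look for the index name in the plan than to rely on a particular phrase.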

Question 11

Q

Advantages of INDEX-es:

Answer

A

Improved Query Performance: Indexes provide a way to quickly locate and retrieve specific rows from a table, significantly speeding up the execution of SELECT statements that involve filtered or sorted data.

Faster Joins: When multiple tables are joined in a query, indexes on the join columns help optimize the join operation, reducing the time needed to combine data from different tables.

Efficient Data Retrieval: Indexes allow for efficient data retrieval without the need to scan the entire table, making them particularly useful for large datasets and complex queries.

Facilitates Unique Constraints: INDEX-es can enforce the uniqueness of values in one or more columns, ensuring that duplicate data is not allowed in those columns. This is especially useful for maintaining data integrity and preventing data duplication in database tables. (e.g. CREATE UNIQUE INDEX index_name ON table_name (column1, column2);)
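
The CREATE UNIQUE INDEX form quoted above can be tried out as follows. This is a sketch against SQLite via Python's sqlite3, with a made-up bookings table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (room TEXT, slot TEXT, booked_by TEXT)")
conn.execute("CREATE UNIQUE INDEX idx_bookings_room_slot ON bookings (room, slot)")

conn.execute("INSERT INTO bookings VALUES ('A1', '09:00', 'Ann')")
conn.execute("INSERT INTO bookings VALUES ('A1', '10:00', 'Bob')")  # same room, different slot: allowed

try:
    conn.execute("INSERT INTO bookings VALUES ('A1', '09:00', 'Cy')")  # same (room, slot) pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM bookings").fetchone()[0]
```

Uniqueness applies to the combination of indexed columns, so a room can be booked many times as long as the slot differs.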

Question 12

Q

Disadvantages of INDEX-es

Answer

A

Increased Storage Overhead: Indexes require additional storage space to store the index data structure. This can be a concern for large tables with multiple indexes, as it can consume a significant amount of disk space.

Slower Data Modification: While indexes improve data retrieval speed, they can slow down INSERT, UPDATE, and DELETE operations, as the index structures need to be maintained whenever the underlying data changes.

Maintenance Overhead: Indexes need to be maintained, which means that as data is modified, indexes must be updated to reflect these changes. This maintenance overhead can impact system performance and requires careful management.

Index Selection: Choosing the right columns to index is crucial. Creating too many indexes or indexing the wrong columns can lead to unnecessary overhead and may not necessarily improve query performance. Proper index design and analysis are essential to benefit from indexes effectively.

Question 13

Q

SUBQUERIES (NESTED QUERIES)

Answer

A

A subquery, also known as a nested query, is a SQL query embedded within another SQL query and enclosed in parentheses. It retrieves data from one or more tables and is used for operations that require multiple steps of logic. The result of the subquery can be used as part of the main query to filter, manipulate, or make decisions about the data being retrieved.
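
A minimal sketch of a subquery feeding a single value into the outer query's WHERE clause, assuming SQLite via Python's sqlite3 and a hypothetical staff table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (name TEXT, salary INTEGER);
    INSERT INTO staff VALUES ('Ann', 30000), ('Bob', 40000), ('Cy', 50000);
""")

# Conceptually the inner query yields one value (the average salary)
# and the outer query then filters on it.
above_average = conn.execute("""
    SELECT name
    FROM staff
    WHERE salary > (SELECT AVG(salary) FROM staff)
    ORDER BY name
""").fetchall()
```

The average here is 40000, so only the row strictly above it is returned.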

Question 14

Q

Advantages and Disadvantages of Nested Queries:

Answer

A

Complexity Management: They help in breaking down complex problems into simpler, manageable parts.

Reusability: Subqueries can be reused in different parts of the outer query to avoid redundancy.

Logical Grouping: They allow for logical grouping of data retrieval in a single query, making it easier to understand the relationship between different data sets.

Disadvantages:

Performance Issues: Nested queries can lead to performance degradation, especially if not optimized properly, as they may require multiple scans of the same table.

Readability: Complex nested queries can become difficult to read and maintain, especially for those not familiar with the database schema.

Limited Flexibility: In some cases, subqueries are less flexible than JOIN operations, particularly when working with multiple tables.

Question 15

Q

SUBQUERIES vs JOINS

Answer

A

A subquery is a query nested inside another query, used to perform operations that require multiple steps of logic.

A subquery is ideal for operations that require a single value returned, like in WHERE clauses or select lists.
Useful for conditions that depend on the result of another query.
Better for readability when performing complex queries that require a step-by-step approach.
Simplifies the SQL logic by breaking down complex conditions into manageable parts.
Can be slower and less efficient than JOINs, especially if the subquery is executed multiple times (e.g., in a correlated subquery).
Optimization by the SQL engine varies; performance can sometimes be improved by rewriting as JOINs.

A JOIN clause is used to combine rows from two or more tables, based on a related column between them.

Best suited for retrieving data from multiple tables where the tables are directly related.
Efficient for operations that require comparing large datasets.
Generally faster for straightforward data retrieval from multiple tables.
Allows for more dynamic and complex interactions between multiple tables.
Typically more efficient for operations involving multiple tables.
Better optimized by SQL engines, but can become complex and harder to read with multiple JOIN operations.
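
The same question can often be answered either way. The sketch below (SQLite via Python's sqlite3, with hypothetical customers and orders tables) asks which customers have placed an order over 100, once with a subquery and once with a JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 120), (11, 1, 300), (12, 2, 40);
""")

# Subquery form: which customers appear in the set of large orders?
via_subquery = conn.execute("""
    SELECT name FROM customers
    WHERE customer_id IN (SELECT customer_id FROM orders WHERE total > 100)
    ORDER BY name
""").fetchall()

# JOIN form: DISTINCT is needed because Ann matches two qualifying orders.
via_join = conn.execute("""
    SELECT DISTINCT c.name
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    WHERE o.total > 100
    ORDER BY c.name
""").fetchall()
```

Both forms return the same rows. Which one runs faster depends on the engine and the data, so it is worth measuring rather than assuming.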

Question 16

Q

CHOOSING SUBQUERIES vs JOINS

Answer

A

Consider the Query Purpose: Use subqueries for single-value calculations or when the intermediate result set is not needed. Choose JOINs for direct data correlations and operations involving multiple tables.

Performance Implications: Test both methods if performance is a concern. SQL engines optimize differently based on the query structure and database schema.

Readability and Maintenance: Use the approach that makes the SQL easier to understand and maintain, especially in the context of your practical sessions or company policies.

Question 17

Q

DATABASE SECURITY FUNDAMENTALS

Answer

A

Database security encompasses the tools, strategies, and measures designed to protect databases from unauthorized access, data breaches, and other forms of security threats. It involves safeguarding the data itself, the database management system (DBMS), and the associated applications and network links.

Question 18

Q

DATABASE SECURITY FUNDAMENTALS

Components:

Importance:

Challenges in Database Security:

Answer

A

Components:

Physical Security: Protecting the physical machines and devices where databases are stored.
Network Security: Securing the network connections that access the database to prevent interception or intrusion.
Access Control: Managing who has permission to access different parts of the database and what actions they can perform.
Application Security: Ensuring that applications accessing the database are securely designed and do not introduce vulnerabilities.

Importance:

Data Integrity: Guaranteeing that the data remains accurate, consistent, and reliable over its lifecycle.
Confidentiality: Ensuring that sensitive information is not disclosed to unauthorized individuals.
Availability: Making sure that the database is accessible to authorized users when needed, which is crucial for the continuity of operations.

Challenges in Database Security:

Rapid Technological Advancements: As technology evolves, so do the techniques employed by cybercriminals, requiring constant vigilance and updating of security measures.
Increasing Complexity of Cyber Attacks: Attackers use sophisticated methods such as Advanced Persistent Threats (APTs) and ransomware, which can be difficult to detect and counteract.
Regulatory Compliance: With the introduction of regulations such as GDPR (General Data Protection Regulation) in Europe, organizations must ensure their database security practices comply with legal standards, adding another layer of complexity to database security management.

Question 19

Q

THREATS TO DATABASE SECURITY

Answer

A

Cyber Attacks

SQL Injection: A malicious technique where attackers execute unauthorized SQL commands by exploiting vulnerabilities in the input fields of an application, leading to unauthorized access and manipulation of the database.
Phishing: Attackers deceive database users into revealing sensitive information, such as login credentials, which can then be used to gain unauthorized access to the database.
Malware and Ransomware: Malicious software is used to disrupt database operations, steal data, or encrypt database contents for ransom. These attacks can cause significant downtime and data loss.

Insider Threats

Accidental Misuse: Users might unintentionally delete important data or grant excessive privileges due to lack of awareness or training.
Malicious Insiders: Employees or trusted individuals with access to the database might intentionally misuse their privileges for personal gain or to harm the organization.

Physical Threats

Theft: Physical theft of servers or devices containing database information can lead to a direct breach of data.
Natural Disasters: Events like floods, earthquakes, and fires can damage physical infrastructure, resulting in data loss unless proper backups and disaster recovery plans are in place.

Network Attacks

Denial of Service (DoS) and Distributed Denial of Service (DDoS) Attacks: These attacks aim to overwhelm the database server’s resources, making the database unavailable to legitimate users.
Man-in-the-Middle (MitM) Attacks: Attackers intercept and possibly alter the communication between two parties (e.g., between a database client and the server), which can compromise data integrity and confidentiality.

Question 20

Q

CHALLENGES AND IMPACT

Answer

A

Challenges in Mitigating Threats

Rapid Evolution of Threats: As technology advances, so does the sophistication of attacks, making it challenging to stay ahead of new vulnerabilities.
Complexity of Database Systems: Modern databases are complex and interconnected with numerous applications and systems, increasing the potential attack surface.
Human Factor: Even with robust technical defences, the human element remains a weak link; social engineering attacks exploit this vulnerability effectively.

The Impact of Threats

Data Breach and Loss: Unauthorized access can lead to sensitive data being stolen, sold, or publicly disclosed, damaging the organization’s reputation and incurring legal penalties.
Financial Loss: Beyond the immediate impact of a breach, organizations can face significant financial losses due to fines, litigation, and loss of business.
Operational Disruption: Attacks can disrupt operations, leading to downtime, loss of customer trust, and long-term damage to business relationships.

Question 21

Q

Securing the Physical Database

Answer

A

Access Controls: Limit physical access to the database servers to authorized personnel only. Use biometric access controls, security badges, and surveillance systems to monitor and control access to server rooms.

Environmental Controls: Protect hardware against environmental threats, such as excessive heat, humidity, and water damage, with proper cooling systems, fire suppression systems, and waterproof enclosures.

Hardware Security: Use hardware security modules (HSMs) for encryption key management and secure encrypted data storage devices to enhance data protection at the physical level.

Question 22

Q

Roles and Access Control

Answer

A

Principle of Least Privilege (PoLP): Ensure that users and applications have only the minimum levels of access—or permissions—needed to perform their tasks. Regularly review access privileges and adjust them as roles change within the organization.

Role-Based Access Control (RBAC): Define roles within your organization and assign permissions to these roles rather than to individual users. This makes managing access rights more scalable and understandable.

Audit and Monitor Access: Implement auditing and monitoring tools to track who accesses the database, when they access it, and what actions they perform. This not only helps in detecting unauthorized access attempts but also in ensuring compliance with regulatory standards.

Question 23

Q

Data Sanitation (Input Validation)

Answer

A

Validation: Ensure that all data input into applications is validated for type, length, format, and range. This helps prevent SQL injection and other forms of injection attacks by ensuring that inputs cannot be interpreted as commands or queries.

Sanitization: Use data sanitization techniques to cleanse input data, removing potentially harmful characters or patterns that could be used in an injection attack. This is particularly important for data that will be used in SQL queries or that comes from untrusted sources.

Prepared Statements and Parameterized Queries: Use prepared statements with parameterized queries in your application code to separate SQL logic from data values. This approach ensures that user input is treated as data, not as executable code, effectively neutralizing SQL injection attacks.
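
The difference between building SQL from strings and using a parameterized query can be shown with a classic injection input. This sketch assumes SQLite via Python's sqlite3, where ? is the placeholder. Other drivers use different placeholder styles, and the users table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (username TEXT, secret TEXT);
    INSERT INTO users VALUES ('ann', 'a-secret'), ('bob', 'b-secret');
""")

malicious_input = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so the quote characters
# change the structure of the WHERE clause and every row matches.
unsafe_sql = "SELECT username FROM users WHERE username = '" + malicious_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the ? placeholder sends the input as a value, never as SQL text.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (malicious_input,)
).fetchall()
```

With the placeholder, the whole input is compared as one literal username, so no rows match and the injected OR condition never becomes part of the query.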

Question 24

Q

Implementing Comprehensive Security Measures

Answer

A

Encryption: Encrypt sensitive data both at rest and in transit to protect it from unauthorized access. Use strong encryption algorithms and keep your encryption keys secure.

Backup and Recovery Plans: Maintain regular, secure backups of your database and test your recovery procedures to ensure you can quickly restore data in case of loss or corruption.

Security Policies and Procedures: Develop and enforce comprehensive security policies and procedures that cover all aspects of database security, from physical security to user training and incident response plans.

Question 25

Q

ENCRYPTION

Answer

A

Encryption is a security method where information is encoded in such a way that only authorized parties can access it. It transforms readable data, or plaintext, into an unreadable format, known as ciphertext, using an encryption algorithm and a key. This ensures that even if data breaches occur, the information remains unreadable and secure.

Importance of Encryption

Data Security: Protects sensitive data such as personal information, financial records, and confidential corporate data from breaches and unauthorized access.
Privacy Compliance: Many industries are subject to regulations, such as GDPR and HIPAA, that require sensitive information to be protected, and encryption is a common way to meet those requirements.
Trust: Maintains the trust of customers and stakeholders by ensuring that their data is handled securely and responsibly.
Data Integrity: When combined with authentication (for example a message authentication code), helps detect data that has been altered or tampered with during storage or transmission, maintaining its accuracy and reliability.

Types of Data That Require Encryption

Personally Identifiable Information (PII): Names, addresses, social security numbers, etc.
Financial Information: Credit card numbers, bank account details, transaction records.
Confidential Business Information: Trade secrets, proprietary technology, strategic plans.

Question 26

Q

SYMMETRIC VS ASYMMETRIC ENCRYPTION

Answer

A

Databases use various encryption techniques to protect data at different levels, from individual cells to entire databases. These techniques can be broadly categorized into symmetric and asymmetric encryption, each with its advantages and use cases.

Symmetric Encryption

Utilizes a single key for both encryption and decryption. This method is efficient and typically faster, making it suitable for encrypting large volumes of data.
Algorithms - Advanced Encryption Standard (AES), Triple Data Encryption Standard (3DES), and Blowfish.
Ideal for encrypting data at rest, such as entire database files or backups, where encryption and decryption speed is crucial.

Asymmetric Encryption

Employs a pair of keys: a public key for encryption and a private key for decryption. Because the private key never has to be shared, key distribution is safer, but the method is more computationally intensive.
Algorithms - RSA (Rivest-Shamir-Adleman) and Elliptic Curve Cryptography (ECC); the Digital Signature Algorithm (DSA) is a related public-key algorithm used for signing rather than encryption.
Best suited for encrypting data in transit or for establishing secure connections for remote database access.

Question 27

Q

ENCRYPTION TECHNIQUES IN DATABASES

Answer

A

Encryption at Rest vs. Encryption in Transit

Rest: Protects data stored within the database or on disk. Techniques include Transparent Data Encryption (TDE) and file-system level encryption. It’s crucial for preventing data breaches resulting from physical theft or unauthorized access to storage media.
Transit: Secures data as it moves between the database and applications or between servers. Implemented through TLS (Transport Layer Security), the successor to the now-deprecated SSL (Secure Sockets Layer), it safeguards against interception and eavesdropping.

Application-Level vs. Database-Level Encryption

Application-Level Encryption: The application encrypts data before sending it to the database. This approach provides fine-grained control over what data is encrypted and allows for application-specific encryption schemes.
Database-Level Encryption: The database system itself manages encryption, offering a more straightforward implementation that requires less modification to existing applications. However, it may provide less flexibility in terms of which data is encrypted.

Question 28

Q

DATA RETRIEVAL AND DECRYPTION

Answer

A

The Retrieval Process

Request Authentication: Verifies the identity of the user or system requesting data, ensuring that only authorized parties can initiate data retrieval.
Decryption: Once access is granted, the database decrypts the requested data using the appropriate decryption key. This step requires careful management of encryption keys to ensure they are accessible to legitimate users while being protected from unauthorized access.
Data Presentation: The decrypted data is then presented to the user or application in a readable format. This step often involves additional security measures, such as secure transmission protocols and session management, to protect data as it’s delivered to the end user.

Challenges in Data Retrieval and Decryption

Performance Overhead: Encryption and decryption processes can introduce latency, affecting the performance of database queries and data retrieval operations.
Key Management Complexity: Managing the lifecycle of encryption keys, including generation, storage, rotation, and revocation, poses significant challenges. Mismanagement can lead to data loss or breaches.
Access Control and Authentication: Implementing robust access control mechanisms is essential to prevent unauthorized data access. This includes managing permissions and roles within the database management system.

Question 29

Q

PostgreSQL ENCRYPTION

Answer

A

The pgcrypto extension adds cryptographic functions to PostgreSQL, enabling encryption, decryption and secure storage of data directly within the database.

It supports both symmetric and limited asymmetric encryption methods, allowing for the protection of sensitive information and secure password storage through hashing.

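The pgcrypto extension in PostgreSQL supports symmetric encryption through its PGP functions (pgp_sym_encrypt and pgp_sym_decrypt, referred to here as PGP_SYM) and through raw AES (the encrypt and decrypt functions with the 'aes' algorithm), allowing secure data storage. Remember: Symmetric encryption means the same key is used to encrypt and decrypt data.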

PGP_SYM is versatile, ideal for encrypting text data with a passphrase, making it user-friendly for scenarios where data needs to be shared securely between parties who have the passphrase.

AES is known for its speed and robust security, suitable for encrypting large volumes of data efficiently. It’s recommended for situations demanding high performance and strong security, such as storing sensitive user information or financial records.

Question 30

Q

HASHING VS ENCRYPTION

Answer

A

In secure password storage practices, passwords are hashed, not encrypted. The distinction is crucial: hashing is a one-way process, meaning once a password is hashed, it cannot be reversed or decrypted back to its original plaintext form. This is why it’s impossible to “decrypt” and view the original password from a hash stored in the database.

Hashing is a one-way process used for verifying the integrity of data. The same input always produces the same output, but it’s computationally infeasible to reverse the process and retrieve the original input from the hash output.
Encryption is a two-way process that allows data to be made unreadable via encryption and then returned to its original readable form via decryption, using a specific key.

Why You Can’t Decrypt Hashed Passwords

The purpose of hashing passwords before storing them is to protect user credentials. Even if a database is compromised, the attackers cannot retrieve the actual passwords, only their hashes.
Secure password hashing algorithms, especially those designed for password storage like bcrypt, scrypt, or Argon2, are intentionally designed to make this reversal computationally impractical.
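
The one-way, deterministic behaviour of a hash can be seen with Python's hashlib. SHA-256 is used here only to illustrate the properties; as noted above, password storage should use a dedicated algorithm such as bcrypt, scrypt or Argon2. The digest helper is a made-up name.

```python
import hashlib

def digest(text):
    """Hex SHA-256 digest of a string (illustration only, not a password hash)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

first = digest("correct horse")
second = digest("correct horse")
other = digest("correct horsf")  # one character changed

same_input_same_hash = first == second
small_change_new_hash = first != other

# The output size is fixed (64 hex characters) whatever the input size,
# so the digest cannot contain enough information to rebuild a long input.
digest_length = len(first)
long_length = len(digest("x" * 10_000))
```

There is no decrypt counterpart to call: checking a value means hashing the candidate again and comparing digests.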

Question 31

Q

HASHING – CRYPT

Answer

A

Crypt Function

crypt() is a PostgreSQL function, provided by the pgcrypto extension, used to hash passwords.
Utilizes a cryptographic hash function to convert plaintext passwords into a secure, fixed-size hash.
Ensures that stored passwords are not kept in plain text, enhancing security.

Algorithms for hashing passwords

BF (Blowfish): This is the algorithm used by bcrypt, which is well-regarded for its security due to its adaptive cost factor. It allows you to scale the algorithm’s complexity and resistance to brute-force attacks as hardware capabilities improve.
MD5: An older hashing algorithm that is much faster but significantly less secure than Blowfish. It’s generally not recommended for new systems due to vulnerabilities to collision attacks and its susceptibility to fast brute-force attacks.
XDES (Extended DES): An extension of the traditional DES (Data Encryption Standard) algorithm, offering better security through a configurable number of encryption rounds. Like Blowfish, it can be set to be computationally intensive, though it’s generally less used than bcrypt.
DES: The original Data Encryption Standard algorithm. It’s considered obsolete for most purposes due to its short key length, which makes it vulnerable to brute-force attacks.

Question 32

Q

HASHING – SALT

Answer

A

Salt Function

A salt is a random sequence of characters added to the input of a hash function along with the password.
The same password with different salts will result in different hashes.
Significantly improves hash security by preventing pre-computation attacks.
Typically used with crypt via the gen_salt function, which supports multiple hashing algorithms.

Purpose

Uniqueness: By adding a salt to each password before it is hashed, even identical passwords will produce unique hash values, thus preventing attackers from using pre-computed hash tables (rainbow tables) to crack the passwords.
Security Enhancement: Salts increase the complexity and uniqueness of hashed passwords, making them much harder to crack. This is particularly important in a database breach scenario where attackers gain access to hashed passwords.
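
The effect of a salt can be sketched outside the database with Python's standard library. This uses PBKDF2 from hashlib as a stand-in for pgcrypto's crypt and gen_salt, which are not available here. The function names and the iteration count are assumptions made for the example.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch

def hash_password(password, salt=None):
    """Return (salt, hash); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, derived

def verify_password(password, salt, expected):
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)

# Two users pick the same password but get different salts, hence different hashes.
salt_a, hash_a = hash_password("hunter2")
salt_b, hash_b = hash_password("hunter2")
```

The salt is stored alongside the hash. It is not secret; its job is to make each stored hash unique so that pre-computed tables are useless. In pgcrypto the salt produced by gen_salt is embedded in the string that crypt returns.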

Question 33

Q

ASYMMETRIC ENCRYPTION

Answer

A

Encrypting Data with the Public Key

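-- pgcrypto expects an ASCII-armoured PGP key; the E'...' string form turns \n into a real newline.
SELECT pgp_pub_encrypt('Sensitive data here', dearmor(E'-----BEGIN PGP PUBLIC KEY BLOCK-----\n…ThePublicKey…\n-----END PGP PUBLIC KEY BLOCK-----')) AS encrypted_data;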

Decrypting Data with the Private Key

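-- The inner query encrypts with the public key; the outer query decrypts with the matching private key.
SELECT pgp_pub_decrypt(encrypted_data, dearmor(E'-----BEGIN PGP PRIVATE KEY BLOCK-----\n…ThePrivateKey…\n-----END PGP PRIVATE KEY BLOCK-----')) AS decrypted_data
FROM (SELECT pgp_pub_encrypt('Sensitive data here', dearmor(E'-----BEGIN PGP PUBLIC KEY BLOCK-----\n…ThePublicKey…\n-----END PGP PUBLIC KEY BLOCK-----')) AS encrypted_data)
AS Decrypted;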

Security Considerations

Key Management: Properly secure the private key. It should never be exposed or stored in an insecure manner.
Key Storage: Ideally, keys should not be stored directly in the database. Use a secure key management service or infrastructure for storing and accessing cryptographic keys.
Data Sensitivity: Be cautious when encrypting and decrypting sensitive data, ensuring that only authorized users have access to the necessary keys and encrypted data.