Inno VI- Database and protocols Flashcards
Normalization Process in DBMS
Database normalization is a systematic process of organizing a database schema so that data redundancy is minimized and update operations cause few or no anomalies.
In other words, it means dividing a large table into smaller ones so that redundant data is eliminated. The normalization procedure depends on the functional dependencies among the attributes of a table and uses several normal forms to guide the design process.
It works like a safety guarantee: the higher the normal form, the better the data is protected against redundancy and anomalies.
First Normal Form (1NF):
Ensures that each column contains only atomic values that cannot be divided, and each record is unique.
-all values in a column are of the same type (the DB won't let you do otherwise anyway)
-there's a primary key
Each column must contain only single (atomic) values.
Each row must be unique (unique identifier, i.e. a primary key).
There must be no repeating groups of columns.
Second Normal Form (2NF):
Builds on 1NF and removes partial dependencies: a non-key attribute that depends on only part of a composite primary key is moved to a separate table.
-removes the update/delete/insertion anomalies caused by partial dependencies
-each non-key attribute must depend on the entire primary key
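A minimal sketch of a 2NF decomposition, using plain Python dictionaries. The table, column names and data are hypothetical: an enrollment table keyed by (student_id, course_id) in which course_name depends only on course_id.

```python
# Hypothetical un-normalized rows: the composite key is (student_id, course_id).
# course_name depends only on course_id, which is a partial dependency.
enrollments = [
    {"student_id": 1, "course_id": "DB1", "course_name": "Databases", "grade": 5},
    {"student_id": 2, "course_id": "DB1", "course_name": "Databases", "grade": 4},
    {"student_id": 1, "course_id": "NET", "course_name": "Networks", "grade": 3},
]


def to_2nf(rows):
    """Split rows so every non-key attribute depends on the whole key."""
    courses = {}  # course_id -> course_name (depends on course_id only)
    grades = {}   # (student_id, course_id) -> grade (depends on the full key)
    for row in rows:
        courses[row["course_id"]] = row["course_name"]
        grades[(row["student_id"], row["course_id"])] = row["grade"]
    return courses, grades


courses, grades = to_2nf(enrollments)
print(courses)
print(grades)
```

After the split, each course name is stored once, so renaming a course touches a single entry.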
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form (BCNF) is a stricter version of Third Normal Form (3NF) that eliminates certain types of redundancy 3NF does not catch. A table is in BCNF when the following condition is met: for any nontrivial functional dependency X -> Y, X is a superkey.
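A rough sketch of how the BCNF condition can be checked: compute the attribute closure of each left-hand side and report the nontrivial dependencies whose left-hand side is not a superkey. The function names and the student/course/teacher relation are illustrative assumptions, not taken from these notes.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the nontrivial FDs whose left-hand side is not a superkey."""
    relation = set(relation)
    violations = []
    for lhs, rhs in fds:
        nontrivial = not set(rhs) <= set(lhs)
        is_superkey = relation <= closure(lhs, fds)
        if nontrivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations


# Hypothetical relation: each teacher teaches exactly one course.
relation = ("student", "course", "teacher")
fds = [
    (("student", "course"), ("teacher",)),
    (("teacher",), ("course",)),
]
print(bcnf_violations(relation, fds))
```

Here teacher -> course is reported because teacher alone does not determine student, so it is not a superkey.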
Denormalization in Database
Denormalization combines data from multiple tables so that queries run faster, at the cost of added redundancy. It helps avoid the costly joins that normalization introduces in a relational database.
Denormalization is a database optimization technique in which we add redundant data to one or more tables. This can help us avoid costly joins in a relational database. Note that denormalization does not mean ‘reversing normalization’ or ‘not to normalize’. It is an optimization technique that is applied after normalization.
Basically, the process of taking a normalized schema and making it non-normalized is called denormalization; designers use it to tune the performance of systems that support time-critical operations.
Denormalization
In a traditional normalized database, we store data in separate logical tables and attempt to minimize redundant data. We may strive to have only one copy of each piece of data in a database.
For example, in a normalized database, we might have a Courses table and a Teachers table. Each entry in Courses would store the teacherID for a Course but not the teacherName. When we need to retrieve a list of all Courses with the Teacher’s name, we would do a join between these two tables.
In some ways, this is great; if a teacher changes his or her name, we only have to update the name in one place. The drawback is that if tables are large, we may spend an unnecessarily long time doing joins on tables. Denormalization, then, strikes a different compromise. Under denormalization, we decide that we’re okay with some redundancy and some extra effort to update the database in order to get the efficiency advantages of fewer joins.
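The Courses/Teachers trade-off can be sketched with an in-memory SQLite database. The column names beyond teacherID and teacherName, the CoursesDenorm table and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teachers (teacherID INTEGER PRIMARY KEY, teacherName TEXT);
    CREATE TABLE Courses (courseID INTEGER PRIMARY KEY, title TEXT, teacherID INTEGER);
    -- Denormalized copy: teacherName is stored redundantly next to each course.
    CREATE TABLE CoursesDenorm (courseID INTEGER PRIMARY KEY, title TEXT,
                                teacherID INTEGER, teacherName TEXT);
    INSERT INTO Teachers VALUES (1, 'Smith');
    INSERT INTO Courses VALUES (10, 'Databases', 1), (11, 'Networks', 1);
    INSERT INTO CoursesDenorm VALUES (10, 'Databases', 1, 'Smith'),
                                     (11, 'Networks', 1, 'Smith');
""")

# Normalized read: the teacher name has to be fetched through a join.
normalized = conn.execute("""
    SELECT c.title, t.teacherName
    FROM Courses c JOIN Teachers t ON t.teacherID = c.teacherID
    ORDER BY c.courseID
""").fetchall()

# Denormalized read: same result, no join.
denormalized = conn.execute(
    "SELECT title, teacherName FROM CoursesDenorm ORDER BY courseID"
).fetchall()

# The cost: a rename touches one row in Teachers,
# but every matching row in the denormalized table.
rows_normalized = conn.execute(
    "UPDATE Teachers SET teacherName = 'Jones' WHERE teacherID = 1").rowcount
rows_denormalized = conn.execute(
    "UPDATE CoursesDenorm SET teacherName = 'Jones' WHERE teacherID = 1").rowcount
print(normalized, denormalized, rows_normalized, rows_denormalized)
```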
Examples of Synchronous Communication:
HTTP Request-Response: One microservice sends an HTTP request to another microservice and waits for the response, typically through an HTTP-based API style such as REST or SOAP.
RPC (Remote Procedure Call): Services can use RPC frameworks like gRPC to make remote procedure calls and wait for the response before continuing.
Synchronous Messaging: Some message brokers support synchronous messaging patterns, where a service sends a message and waits for a response from another service.
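A small sketch of the HTTP request-response case using only the Python standard library. The "price service", its /price path and the fixed response body are hypothetical; the point is that the caller blocks until the response arrives.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PriceHandler(BaseHTTPRequestHandler):
    """Hypothetical 'price service' that answers every GET with a fixed body."""

    def do_GET(self):
        body = b"42"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def call_price_service():
    server = HTTPServer(("127.0.0.1", 0), PriceHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", "/price")
        response = conn.getresponse()  # the caller blocks here until the reply arrives
        result = (response.status, response.read().decode())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


print(call_price_service())
```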
Applications of Synchronous Communication:
Real-Time Messaging Applications: Used in chat apps like WhatsApp or Slack where messages are exchanged instantly between users.
Database Operations: Suitable for operations requiring immediate confirmation, like reading or updating critical data in transactional systems.
Payment Gateways: Ensures immediate feedback for payment authorization or failure in online transactions.
APIs Requiring Immediate Response: Services like authentication APIs or search queries that require instant results.
Video Conferencing and Calls: Applications like Zoom or Google Meet use synchronous communication for real-time audio and video data transfer.
Remote Procedure Calls (RPCs): Often employed when one service needs an immediate response from another, as in microservices-based systems.
TCP - Transmission Control Protocol:
-connection-oriented protocol: the two computers must establish a connection first (three-way handshake: SYN -> SYN-ACK -> ACK)
-reliable: data arrives in the right order
-guarantees delivery of data: if data is missing, it is resent
-requires a session
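The connection-oriented behaviour can be sketched with Python's socket module on the loopback interface. The helper name tcp_echo_once is made up for this example; the handshake itself is done by the operating system inside connect/accept.

```python
import socket
import threading


def tcp_echo_once(message: bytes) -> bytes:
    """Open a TCP connection on localhost, send a message, return the echo."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 = any free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve():
        conn, _addr = server.accept()  # returns once a connection is established
        with conn:
            conn.sendall(conn.recv(1024))  # echo back what was received

    worker = threading.Thread(target=serve, daemon=True)
    worker.start()

    # create_connection triggers the three-way handshake before any data is sent.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(message)
        reply = client.recv(1024)

    worker.join(timeout=5)
    server.close()
    return reply


print(tcp_echo_once(b"hello"))
```

For messages larger than one recv call, a real client would loop until all expected bytes have arrived.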
UDP - User Datagram Protocol:
-connectionless
-does not establish a session and does not guarantee data delivery
-fire and forget
-faster than TCP, because there is no handshake or retransmission overhead
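By contrast, a UDP sender needs no connection at all. A minimal sketch on the loopback interface follows; the helper name udp_send_once is an assumption for illustration, and on a real network the datagram could be lost without the sender noticing.

```python
import socket


def udp_send_once(message: bytes) -> bytes:
    """Send one datagram to a local receiver and return what it received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 = any free port
    receiver.settimeout(5)           # delivery is not guaranteed, so do not wait forever
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No connect/accept step: sendto simply fires the datagram at the address.
        sender.sendto(message, receiver.getsockname())
        data, _addr = receiver.recvfrom(1024)
        return data
    finally:
        sender.close()
        receiver.close()


print(udp_send_once(b"ping"))
```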
HTTP
Hypertext Transfer Protocol
-used for viewing web pages
-all information is transferred in plain text (unencrypted)
HTTPS - Hypertext Transfer Protocol Secure
-adds a security layer to HTTP
-encrypts the data
It uses one of two protocols:
1. SSL - Secure Sockets Layer - uses public-key cryptography to secure data; now deprecated in favor of TLS
2. TLS - Transport Layer Security - successor to SSL; authenticates the server (and optionally the client) and encrypts the data
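As a rough illustration of what a TLS client is configured to do, the sketch below builds a default client context with Python's ssl module and inspects it without opening any network connection. Raising the minimum version to TLS 1.2 is a choice made for this example, not something these notes prescribe.

```python
import ssl

# Default client-side context: certificate verification and hostname checks are on.
context = ssl.create_default_context()

# Example policy (an assumption for this sketch): refuse anything older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2

settings = {
    "verifies_certificate": context.verify_mode == ssl.CERT_REQUIRED,
    "checks_hostname": context.check_hostname,
    "minimum_version": context.minimum_version.name,
}
print(settings)
```

A context like this can be passed to standard-library clients that accept one, for example http.client.HTTPSConnection(host, context=context).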
What is Messaging in the context of distributed systems?
Messaging is the process of sending and receiving messages between systems or components in a network. It enables asynchronous communication and decouples the sender from the receiver, allowing systems to scale independently.
What is RabbitMQ?
RabbitMQ is an open-source message broker software that facilitates message passing between systems. It uses the Advanced Message Queuing Protocol (AMQP) to send, receive, and store messages for reliable communication.
What is AMQP (Advanced Message Queuing Protocol)?
AMQP is an open standard application-layer protocol for message-oriented middleware. It defines how messages are sent, routed, queued and received between systems. It is the main protocol used by RabbitMQ.
How does RabbitMQ ensure reliable message delivery?
RabbitMQ supports reliable message delivery through features like consumer acknowledgments, publisher confirms, message persistence (durable queues and persistent messages), and redelivery of unacknowledged messages. Messages stay queued until they are successfully processed by consumers.
What are the main components of RabbitMQ?
The main components of RabbitMQ are:
Producer: Sends messages to RabbitMQ.
Exchange: Routes messages from producers to queues.
Queue: Holds messages until they are processed.
Consumer: Retrieves and processes messages from queues.
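How the four components fit together can be sketched with a toy in-memory model. This is not the RabbitMQ client API: the DirectExchange class, the routing keys and the messages are invented for illustration, and a real setup would talk to a broker through a client library.

```python
from collections import deque


class DirectExchange:
    """Toy exchange: routes a message to every queue bound to its routing key."""

    def __init__(self):
        self.bindings = {}  # routing key -> list of queues

    def bind(self, routing_key, queue):
        self.bindings.setdefault(routing_key, []).append(queue)

    def publish(self, routing_key, message):
        for queue in self.bindings.get(routing_key, []):
            queue.append(message)


# Queue: holds messages until a consumer takes them.
orders_queue = deque()
exchange = DirectExchange()
exchange.bind("orders", orders_queue)

# Producer: publishes to the exchange, never directly to a queue.
exchange.publish("orders", "order-1")
exchange.publish("orders", "order-2")
exchange.publish("invoices", "invoice-1")  # no binding, so this toy model drops it

# Consumer: takes messages from the queue in arrival order.
consumed = []
while orders_queue:
    consumed.append(orders_queue.popleft())
print(consumed)
```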
What is Kafka?
Kafka is a distributed event streaming platform used for building real-time data pipelines and streaming applications. It is designed to handle high throughput and low-latency messaging with large volumes of data.
What are the key differences between RabbitMQ and Kafka?
RabbitMQ: Primarily a message broker that supports traditional message queuing with reliability guarantees.
Kafka: A distributed streaming platform that handles high-throughput event streaming and long-term storage.
What is the Kafka Producer and Consumer?
Producer: An application that sends messages (events) to Kafka topics.
Consumer: An application that reads messages from Kafka topics.
What is the role of Kafka Topics?
In Kafka, Topics are logical channels to which messages are sent by producers and from which messages are consumed by consumers. Topics allow Kafka to organize and partition data for better scalability.
What is Kafka’s Partitioning mechanism?
Kafka partitions messages within a topic across multiple servers (brokers), enabling parallel processing and scaling. Each partition is ordered, and messages within a partition are consumed in the same order they were written.
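The idea of key-based partitioning can be sketched in plain Python. This is a toy model: CRC32 stands in for whatever hash a real Kafka client uses, and the function names and events are hypothetical.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition; the same key always lands in the same partition."""
    return zlib.crc32(key.encode()) % num_partitions


def append_all(events, num_partitions):
    """Append (key, value) events to per-partition lists, preserving arrival order."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in events:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


events = [("user-a", 1), ("user-b", 1), ("user-a", 2), ("user-b", 2), ("user-a", 3)]
partitions = append_all(events, 3)
for index, partition in enumerate(partitions):
    print(index, partition)
```

Because all events for one key share a partition, their relative order is preserved, while ordering across different partitions is not guaranteed.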
What is the difference between RabbitMQ and Kafka in message retention?
RabbitMQ: Deletes messages from the queue once they are acknowledged by consumers.
Kafka: Retains messages for a configurable period, even after they have been consumed, allowing consumers to replay messages.
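The retention difference can be sketched with two toy data structures: a queue that forgets a message once it is consumed, and an append-only log that is read by offset. Neither is the real RabbitMQ or Kafka API; the names are illustrative.

```python
from collections import deque

# Queue-style broker: consuming removes the message.
queue = deque(["m1", "m2", "m3"])
first_pass = [queue.popleft() for _ in range(len(queue))]
second_pass = list(queue)  # nothing is left to read again

# Log-style broker: messages stay in place; consumers only move an offset.
log = ["m1", "m2", "m3"]


def read_from(log, offset):
    """Return every retained message starting at the given offset."""
    return log[offset:]


first_read = read_from(log, 0)
replay = read_from(log, 0)  # rewinding the offset replays the same messages
print(first_pass, second_pass, first_read, replay)
```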
What is a Message Queue in RabbitMQ?
A Message Queue in RabbitMQ is a temporary storage area where messages are stored until they are consumed. It ensures the decoupling of producers and consumers, allowing asynchronous processing.
What is Message Acknowledgment in RabbitMQ?
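Message acknowledgment is a mechanism where the consumer signals to RabbitMQ that it has successfully processed a message. If a message is not acknowledged (for example, because the consumer's connection drops), RabbitMQ requeues it and delivers it again; a consumer can also reject a message explicitly, in which case it is requeued, discarded, or dead-lettered depending on configuration.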
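The acknowledgment flow can be sketched with a toy queue that tracks unacknowledged deliveries. The ToyQueue class and its method names are invented for this example and only loosely mirror the ack/nack operations a real client library exposes.

```python
from collections import deque


class ToyQueue:
    """Toy queue that keeps a delivered message until the consumer acks it."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.unacked = {}  # delivery tag -> message awaiting acknowledgment
        self._next_tag = 1

    def deliver(self):
        message = self.ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = message
        return tag, message

    def ack(self, tag):
        del self.unacked[tag]  # processing succeeded, the message is gone for good

    def nack(self, tag, requeue=True):
        message = self.unacked.pop(tag)
        if requeue:
            self.ready.appendleft(message)  # make it available for redelivery


q = ToyQueue(["job-1"])
tag, message = q.deliver()
q.nack(tag)               # processing failed, so the message goes back to the queue
tag2, message2 = q.deliver()
q.ack(tag2)               # second attempt succeeded
print(message, message2, len(q.ready), len(q.unacked))
```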