Backend Flashcards
What is the difference between a monolith and microservices?
A monolith is a development approach where all of the code lives inside one project: backend, frontend, and every other part of the application share the same space.
Microservice architecture means that we split the application into smaller logical parts, each responsible for a narrow piece of functionality, and treat each of them as a separate project.
Both monolith and microservice architectures have their advantages and disadvantages.
The main microservice advantages are the following:
First of all, microservices can be developed by smaller, separate teams. That potentially means higher developer productivity, because teams need less time to communicate and plan their work, and a new team member does not need to deeply learn how the whole application works, only how their microservice works.
Second, microservices are independent from each other. They can be developed, scaled, and delivered separately. If one microservice crashes, it won't cause problems in other services, unless they directly depend on each other.
Third, microservices can be developed using different technologies and programming languages. That means you can use the language most suitable for each task, instead of being limited to one more-or-less universal language.
To sum up, microservices give you flexibility, scalability, and convenience in most cases.
On the other side, the monolith.
A monolith also has certain advantages over microservice architecture.
First of all, a monolith is easier to manage and monitor: you have one service, not a hundred, and everything is in one place.
Second, data consistency. When data is distributed among multiple microservices, it is hard to keep all of it consistent at every moment in time, which can lead to bugs; in a monolith all data lives in one place, so this problem largely disappears.
Third, performance and data transfer. In a monolith, when you need to access another part of the application, you just call the required function or method. In a microservice architecture, you must send a request over the network, which is much slower and reduces the performance of the whole system.
Finally, testing can become more difficult in certain aspects. For example, it is much harder to test the interaction between application parts in microservices than in a monolith.
To sum up, a monolith is a good solution for small, low-load applications, because there it makes development faster and more reliable, while microservices are a better fit for large companies handling high load, because they provide better scalability and convenience.
What are first-class functions?
First-class functions are functions that can be treated like any other value: they can be assigned to variables, passed into other functions as parameters, and returned from functions as return values.
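A short Go sketch of all three uses:

```go
package main

import "fmt"

func main() {
	// Assigned to a variable.
	double := func(x int) int { return x * 2 }

	// Passed as a parameter.
	fmt.Println(apply(double, 21)) // 42

	// Returned from another function.
	triple := makeMultiplier(3)
	fmt.Println(triple(10)) // 30
}

func apply(f func(int) int, x int) int { return f(x) }

func makeMultiplier(n int) func(int) int {
	return func(x int) int { return x * n }
}
```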
What is Docker and what is it used for?
Docker is a tool that allows you to create and deploy containers.
A container is an isolated and portable software package containing all the components required to run a certain application in any environment.
What is Kafka and what is it used for?
Kafka is a message broker used to store and distribute messages.
The idea of Kafka is to handle messages sent by producers, store them for a long time in the order they were sent, make them accessible to multiple consumers, and do all of this with high performance and scalability.
So, in what situation would we need Kafka?
A typical use case is when multiple services have to communicate asynchronously and be sure that no important information is lost.
For example, imagine we have two services, A and B. Service A sends a request to service B with certain information and wants a response containing the results of processing that information.
There are two potential problems here.
First, information can be lost. Say we sent a request from service A to service B, service B received it, but then server B unexpectedly went down and lost the information. If we didn't store this information on service A, and we have no mechanism to resend the request after service B comes back up, we are very likely to lose everything related to this request.
With Kafka, that is not a problem: after server B reboots, it simply reads all the messages from Kafka that it has not yet successfully handled and restarts the process, so no information is lost.
The second problem is resource consumption. Of course, we can handle the request asynchronously, and while server B is processing it, the thread on server A can do something else, but this still consumes processing power and memory, which can be quite noticeable if request handling takes a long time.
With Kafka, service A can just send a message containing all the required information and forget about it. Service B then reads the message, does the work, and sends another message with the result, which service A can read later.
This way we increase the performance of our services.
We also need Kafka when multiple consumers consume the same information.
For example, imagine we have a service handling employees, and an employee resigns. We need to send that information to the service that works with documents, the service that works with salaries, and the service that works with, say, equipment issued to employees.
Without Kafka, you would have to send three separate HTTP requests, one to each of these services. That does not sound very time-saving, does it?
With Kafka, you can send one message and forget about it, and that message can later be read by multiple consumers.
At my current workplace, our team uses Kafka mainly to notify other services about events related to employee status and employee documents: whether a new employee was hired or fired, which documents an employee signed and which they refused to sign, and so on.
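As a rough sketch of this flow, assuming the segmentio/kafka-go client, a local broker, and a hypothetical employee-events topic (all names illustrative):

```go
package main

import (
	"context"
	"log"

	"github.com/segmentio/kafka-go"
)

func main() {
	// Producer side: publish the event once and forget about it.
	w := &kafka.Writer{
		Addr:  kafka.TCP("localhost:9092"), // assumed broker address
		Topic: "employee-events",           // hypothetical topic
	}
	defer w.Close()

	err := w.WriteMessages(context.Background(), kafka.Message{
		Key:   []byte("employee-42"),
		Value: []byte(`{"event":"resigned","employeeId":42}`),
	})
	if err != nil {
		log.Fatal(err)
	}

	// Consumer side: each service reads with its own GroupID, so the
	// documents, salaries, and equipment services all get the same event.
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		Topic:   "employee-events",
		GroupID: "documents-service", // a different group per consuming service
	})
	defer r.Close()

	msg, err := r.ReadMessage(context.Background())
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("got event: %s", msg.Value)
}
```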
What is a Kafka topic?
A Kafka topic is a kind of channel dedicated to messages of a certain type with a certain structure.
It exists to simplify message handling.
Without topics, if we had one common channel, we would have to somehow detect each message's structure, origin, and purpose manually on the consumer side, which is inconvenient.
With topics, consumers can be sure about the purpose and structure of every message they read.
What is the difference between the delivery guarantee types?
- At most once
- At least once
With at-most-once delivery, a message is delivered once, or not at all if any problem occurs on its way.
This is the fastest delivery option, because we don't have to persist messages to disk to guarantee delivery, and we don't need additional mechanisms to confirm they arrived.
It can be used when we can afford to lose some messages.
For example, web analytics collection or low-importance logs.
At-least-once delivery guarantees that a message is delivered one or more times.
In this case the message is persisted to disk, and when it is sent to Kafka or read from Kafka, Kafka verifies that each step succeeded; for example, it tells the producer if uploading a message failed.
But there is a chance of message duplication, so we need mechanisms to handle duplicates on the consumer side.
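On the producer side these guarantees roughly map to acknowledgement settings; a sketch with the same hypothetical kafka-go client and topic names:

```go
package delivery

import "github.com/segmentio/kafka-go"

// newWriters contrasts the producer-side settings behind the two guarantees.
func newWriters() (fast, safe *kafka.Writer) {
	// At-most-once flavour: no acknowledgement is required, so a lost
	// message is never retried. Fast, fine for low-value analytics or logs.
	fast = &kafka.Writer{
		Addr:         kafka.TCP("localhost:9092"),
		Topic:        "analytics", // hypothetical topic
		RequiredAcks: kafka.RequireNone,
	}

	// At-least-once flavour: wait until all in-sync replicas confirm the
	// write and retry on failure, accepting possible duplicates downstream.
	safe = &kafka.Writer{
		Addr:         kafka.TCP("localhost:9092"),
		Topic:        "payments", // hypothetical topic
		RequiredAcks: kafka.RequireAll,
		MaxAttempts:  10,
	}
	return fast, safe
}
```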
What does it mean for a Kafka consumer to be idempotent?
Consumer idempotency means that the consumer, through its internal mechanisms, protects itself against message duplication: processing the same message twice has the same effect as processing it once.
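A minimal in-memory Go sketch of one such mechanism (all names illustrative; a real implementation would persist the processed IDs in a database or cache):

```go
package consumer

// Handler skips messages whose unique ID it has already processed.
type Handler struct {
	seen map[string]bool
}

func NewHandler() *Handler {
	return &Handler{seen: make(map[string]bool)}
}

func (h *Handler) Handle(msgID string, process func() error) error {
	if h.seen[msgID] {
		return nil // duplicate delivery: processing again would change nothing
	}
	if err := process(); err != nil {
		return err // not marked as seen, so a redelivery can retry
	}
	h.seen[msgID] = true
	return nil
}
```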
What is the difference between classical databases and columnar databases?
The difference is that classical SQL databases store information in rows: the whole table is split into separate items, where each item is a kind of object consisting of the values of all the columns in the table.
Columnar databases, on the other hand, store data by columns.
Practically, that means that when we run a query in a regular database, it has to scan whole rows, and only after reading a row can it extract the values of the columns we need. A columnar database doesn't have this problem: it reads only the columns we ask for and ignores the rest of the data.
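In code, the difference in data layout might look like this (a Go sketch with illustrative types):

```go
package storage

// Row-oriented storage: each record keeps all its fields together,
// so reading one column still drags the whole row through memory.
type UserRow struct {
	ID    int
	Name  string
	Email string
	Age   int
}

// Column-oriented storage: each field lives in its own contiguous slice,
// so an aggregate like "average age" reads only the Age column.
type UserColumns struct {
	IDs    []int
	Names  []string
	Emails []string
	Ages   []int
}
```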
The second major difference is the usage scenarios. A regular database fits situations with frequent inserts, updates, and deletes, where we need access to all or most of the fields in a row. A typical example is a query returning a certain user from the database with all of their fields.
A columnar database is typically used to select thousands or even millions of items containing only a few columns. Also, we usually don't make small queries or inserts; we operate on large amounts of data.
What is Kubernetes?
Kubernetes is software for container orchestration: it automates container administration and monitoring, and the deployment and scaling of applications inside containers.
Kubernetes helps keep an application always accessible, suitable for operation under high load, and easily recoverable.
Kubernetes operates a group of interconnected servers. This group is called a cluster, and each server inside the group is called a node.
Nodes come in two types: master nodes and worker nodes.
The master node is responsible for control and for distributing tasks between worker nodes. It consists of a few main components:
- API server, the component responsible for communication between the master and the workers
- Controller Manager, responsible for observing and controlling the current cluster state, moving the system from its current state to the desired state by managing pods
- Scheduler, the component that decides which containers should be placed on which nodes, depending on factors like current node load or potential node performance
- etcd storage, a simple key-value database that stores information about the cluster: configuration data, node statuses, container statuses, and so on
A worker node also consists of a few components:
- The so-called kubelet, which communicates with the master node and receives instructions about what should run on this particular node and how
- The container runtime, responsible for container images, starting and stopping containers, and resource management
- Kube-proxy, responsible for communication and internal network balancing
The main mechanism Kubernetes uses to reach these goals is pods. Pods are lightweight wrappers around containers; by distributing them between nodes, adding and removing pods, and recovering the containers inside pods when problems occur, Kubernetes shares the load between nodes and makes the whole system reliable and recoverable.
What is the HTTP protocol?
HTTP is a text protocol that operates on the request-response principle.
An HTTP request consists of a start line (including a method, resource identifier, and protocol version), headers, and a message body.
An HTTP response is similar in structure to an HTTP request, but the start line contains a status code instead of a method and URI.
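For illustration, with a hypothetical host and resource, a minimal request-response exchange looks like this:

```
GET /users/42 HTTP/1.1          <- start line: method, URI, protocol version
Host: example.com
Accept: application/json

HTTP/1.1 200 OK                 <- status line: version, status code, reason
Content-Type: application/json
Content-Length: 27

{"id": 42, "name": "Alice"}
```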
What are the main HTTP methods?
GET, POST, PUT, DELETE
What is the difference between REST and RPC?
REST is an architectural style for developing networked applications whose operation is based on interaction with resources. REST uses a set of constraints and patterns to keep a system simple, reliable, and scalable.
Each resource is represented by an API endpoint.
Each request is independent; servers don't store client state.
The final principle is code on demand: the client does not have to hold all the code itself; instead, the server can send code to the client, and the client executes it. This gives flexibility, because we can freely change the code on the server side and it automatically affects the result on the client side.
REST is best for standard, scalable APIs with predictable CRUD operations.
RPC (Remote Procedure Call) is a call-based model where the client executes remote functions as if they were local.
It is focused on function execution, not on CRUD operations.
RPC also supports more formats than typical REST: JSON, XML, and binary formats, for example Protocol Buffers as used by gRPC.
RPC is optimized for high-performance needs.
RPC is best for microservices or high-performance systems centered on actions rather than resource management.
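A small sketch of the stylistic difference, assuming Go 1.22+ routing patterns (paths and names are illustrative):

```go
package main

import (
	"fmt"
	"net/http"
)

func main() {
	// REST style: the URL names a resource, the HTTP method names the action.
	http.HandleFunc("GET /users/{id}", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, `{"id": %q}`, r.PathValue("id"))
	})

	// RPC style: the URL names a procedure to execute; the verb is usually POST.
	http.HandleFunc("POST /rpc/DeactivateUser", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"ok": true}`)
	})

	http.ListenAndServe(":8080", nil)
}
```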
What are cookies?
Cookies are small pieces of data with a limited lifetime, exchanged between client and server through special HTTP headers. Both the server and the client can read and write cookies, and it is possible to make a cookie readable only by the server and not by client-side scripts (the HttpOnly flag), which is useful for storing authorization tokens. Cookies are usually used to store small pieces of information, for example about an authorization session, in which case they hold an access token.
What is a closure, and how/why would you use one?
A closure is a function that remembers the variables from the lexical environment where it was created and can access them later.
I would use a closure to create a callback, or in more specific cases, for example to create a function that returns another function acting as a counter, returning an increased value each time it is called.
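A minimal Go sketch of such a counter:

```go
package main

import "fmt"

// makeCounter returns a closure: the returned function keeps access to
// count, a variable from the environment where it was created.
func makeCounter() func() int {
	count := 0
	return func() int {
		count++
		return count
	}
}

func main() {
	next := makeCounter()
	fmt.Println(next())          // 1
	fmt.Println(next())          // 2
	fmt.Println(makeCounter()()) // 1: a fresh environment, a fresh count
}
```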
Can you give an example of a curry function and why this syntax offers an advantage?
A curry function is a higher-order function that takes its arguments one at a time and, at every step except the last, returns another function.
Why does this syntax offer an advantage? In some cases we may want an intermediate function with some arguments already partially applied, which we can then reuse later in the code without passing the same parameters again and again.
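A minimal curried function in Go (names illustrative):

```go
package main

import "fmt"

// add is curried: it takes its arguments one by one, returning a
// function at every step except the last.
func add(a int) func(int) int {
	return func(b int) int { return a + b }
}

func main() {
	addFive := add(5)        // intermediate function with 5 already applied
	fmt.Println(addFive(3))  // 8
	fmt.Println(addFive(10)) // 15
}
```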
What is SOLID?
SOLID is a set of principles that are recommended to follow to make code development easier.
There are five principles.
The first is Single Responsibility. It means that one class should act in only one aspect of the application. For example, if we have a shop and a class Product that describes one type of goods we sell, then Product can have properties describing the product, but it must not have a method that directly saves product information to the database, because that is an entirely different responsibility. For that we should create another class, focused on the interaction between any kind of product we have and the database.
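A Go sketch of this split (names and SQL are illustrative):

```go
package shop

import "database/sql"

// Product only describes the product itself.
type Product struct {
	Name  string
	Price float64
}

// ProductRepository is the separate class focused on database interaction.
type ProductRepository struct{ db *sql.DB }

func (r ProductRepository) Save(p Product) error {
	_, err := r.db.Exec(
		"INSERT INTO products (name, price) VALUES ($1, $2)",
		p.Name, p.Price,
	)
	return err
}
```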
The second is the Open-Closed principle. It means that any class should be open for extension but closed for modification. Put simply, we must provide ways to add new functionality to a class without touching existing code. This is possible with abstract classes and interfaces.
For example, say we have a function that takes an array of different geometric figures and must draw them. Each figure has the coordinates of its top-left point plus figure-specific information: for a square that can be the length of its side, for a triangle the lengths of its three sides, and so on for other figures.
There are two ways to do this. The first is to create a separate class for each shape, each with its own specific data and methods, and then, inside the function, check which class we are dealing with and draw it by calling class-specific methods.
The other way follows the open-closed principle. We define an interface for an abstract shape, and in this interface we declare only the top-left point and a Draw() method. We then implement all shape classes against that interface. During execution we handle each shape as an instance of this interface and simply call Draw().
This way we are free to add new shape classes without modifying the existing code of the function, as the sketch below shows.
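A Go sketch of that second way (names illustrative):

```go
package main

import "fmt"

type Point struct{ X, Y float64 }

// Shape is the abstraction the drawing code depends on.
type Shape interface {
	TopLeft() Point
	Draw()
}

type Square struct {
	Corner Point
	Side   float64
}

func (s Square) TopLeft() Point { return s.Corner }
func (s Square) Draw()          { fmt.Printf("square, side %v\n", s.Side) }

// DrawAll never changes when a new shape type is added.
func DrawAll(shapes []Shape) {
	for _, s := range shapes {
		s.Draw()
	}
}

func main() {
	DrawAll([]Shape{Square{Corner: Point{0, 0}, Side: 2}})
}
```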
The third is the Liskov Substitution principle. It says that any child of a base class may extend its functionality, but must never narrow it.
For example, we have a class describing a Rectangle, and at some point we want to add a class describing a Square. The problem is that while a Rectangle's width and height can differ, a Square's width and height are always equal. If we inherit Square from Rectangle and override it so that setting the width also modifies the height, and vice versa, we have violated the principle: any function that works with Rectangle and receives a Square instance at runtime will suffer from these width-and-height surprises.
The fourth is the Interface Segregation principle. It says that it is better to have a large number of small, specific interfaces than a small number of big ones.
This matters because it helps when there are classes with similar functionality that nevertheless differ in some respects.
For example, imagine a class describing a calculator, with one big interface covering its huge functionality, including many complex mathematical operations. At some point we decide to create another calculator class that needs only the simplest functionality: adding, subtracting, multiplying, and dividing numbers. If we reuse the same big interface as for the complex calculator, we have to implement many methods that will never be used. That is not good.
So instead of one big interface, we make many small ones: one interface describes a method that sums two numbers, another describes multiplication, and so on. Then the simple calculator class implements only the few small interfaces it needs, as sketched below.
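A Go sketch of such small interfaces (names illustrative):

```go
package calc

// Each interface describes one small capability.
type Adder interface{ Add(a, b float64) float64 }
type Multiplier interface{ Multiply(a, b float64) float64 }

// SimpleCalc implements only the interfaces it actually needs...
type SimpleCalc struct{}

func (SimpleCalc) Add(a, b float64) float64      { return a + b }
func (SimpleCalc) Multiply(a, b float64) float64 { return a * b }

// ...while a scientific calculator could implement further small
// interfaces (trigonometry, logarithms, and so on) without forcing
// SimpleCalc to stub out methods it never uses.
```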
The last principle is the Dependency Inversion principle.
It says that classes should depend on interfaces or abstract classes rather than on concrete classes or functions.
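A minimal Go sketch of this principle (names illustrative):

```go
package inventory

// Store is the abstraction the service depends on, instead of a
// concrete database type.
type Store interface {
	Save(product string) error
}

type ProductService struct {
	store Store // any Store implementation can be injected, e.g. a mock in tests
}

func (s ProductService) Create(name string) error {
	return s.store.Save(name)
}

// A concrete implementation is defined separately and swapped in freely.
type PostgresStore struct{ /* connection details */ }

func (PostgresStore) Save(product string) error { return nil } // stand-in
```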
What is KISS?
KISS is a principle saying that code must be kept as simple and understandable as possible.
What design patterns do you know?
Builder, Singleton, Event Delegation, Decorator, Worker Pool, Fan-In Fan-Out
Builder. This pattern is useful when we want a convenient, readable way to create a class instance with a certain set of characteristics.
There are three ways we can go.
The first is the telescoping constructor pattern: one very complicated constructor with many optional parameters, or multiple overloads with different parameter lists. The problem is that we have to provide a large number of arguments, which makes the code hard to write and read.
The second is the so-called JavaBeans pattern: we create the instance with a no-argument constructor and then initialize every required field by calling the corresponding setters. This solves the readability issue but is less reliable, because you can always forget to initialize a field, and it also makes it impossible to keep objects of the class immutable.
Finally, the builder pattern. We create yet another class and use it in the first class's constructor. The builder holds all the fields, optional and required, with getters and setters for each. We set up the builder's fields to match the needed configuration and then pass it as the only argument to the constructor.
This way we get the benefits of both the telescoping constructor and JavaBeans patterns: the code is easy to read and edit, and fields can be initialized in any order, as with JavaBeans, but we keep the ability to make the final instance immutable or to require certain parameters, as with the telescoping constructor.
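A minimal builder in Go (names and defaults illustrative):

```go
package main

import "fmt"

// Server is the final, effectively immutable product.
type Server struct {
	host string
	port int
	tls  bool
}

// ServerBuilder collects optional settings step by step.
type ServerBuilder struct {
	host string
	port int
	tls  bool
}

func NewServerBuilder(host string) *ServerBuilder {
	return &ServerBuilder{host: host, port: 80} // sensible defaults
}

func (b *ServerBuilder) Port(p int) *ServerBuilder { b.port = p; return b }
func (b *ServerBuilder) TLS() *ServerBuilder       { b.tls = true; return b }
func (b *ServerBuilder) Build() Server             { return Server{b.host, b.port, b.tls} }

func main() {
	srv := NewServerBuilder("example.com").Port(8443).TLS().Build()
	fmt.Printf("%+v\n", srv)
}
```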
The next pattern is Singleton. The idea is to have only one instance of a certain class in the application. This can be a class used to call an API or a database, or simply a class storing application settings.
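A common Go sketch of a singleton, using sync.Once to guarantee single creation under concurrency (names illustrative):

```go
package config

import "sync"

// Config holds application settings.
type Config struct{ Env string }

var (
	instance *Config
	once     sync.Once
)

// GetConfig returns the single shared instance; sync.Once guarantees
// the instance is created exactly once, even under concurrent calls.
func GetConfig() *Config {
	once.Do(func() {
		instance = &Config{Env: "production"}
	})
	return instance
}
```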
Another pattern is event delegation. …
Finally, the last pattern I can remember is Decorator. With the decorator pattern, we wrap an object or function in another object or function to modify its behavior without changing the original class or instance. The simplest examples are function debouncing and throttling in JavaScript. Say we have a function that logs some text to the console, but we don't want it called too often. Debouncing wraps it in a function that runs it only after a certain quiet period has passed since the last call, while throttling guarantees the function runs at most once per time interval.
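An idiomatic Go flavour of the same idea: a logging decorator around an HTTP handler (names illustrative):

```go
package main

import (
	"log"
	"net/http"
)

// withLogging decorates an http.HandlerFunc: it adds behavior (logging)
// without changing the wrapped function itself.
func withLogging(next http.HandlerFunc) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		log.Printf("%s %s", r.Method, r.URL.Path)
		next(w, r)
	}
}

func main() {
	hello := func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("hello"))
	}
	http.HandleFunc("/hello", withLogging(hello))
	http.ListenAndServe(":8080", nil)
}
```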
Worker Pool - when we want to run tasks in parallel, instead of starting a separate goroutine for each of them, we limit the number of goroutines that can exist at the same time, to avoid peaks of resource overconsumption (see the sketch after the next pattern).
Fan-In Fan-Out
We split tasks between multiple goroutines (workers). We don't control which data goes to which goroutine; instead, we use channels to hand each batch of data to whichever worker is ready, and we collect the results by reading from an output channel into which the goroutines send their execution results.
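A sketch combining both patterns: a fixed pool of goroutines fans out over one jobs channel, and their results fan in through one output channel (sizes and the work itself are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

func main() {
	jobs := make(chan int)
	results := make(chan int)

	// Worker pool: only 4 goroutines exist no matter how many jobs arrive.
	const workers = 4
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs { // fan-out: whichever worker is ready takes the job
				results <- j * j // stand-in for real work
			}
		}()
	}

	// Feed the jobs, then close the channel so workers can finish.
	go func() {
		for j := 1; j <= 10; j++ {
			jobs <- j
		}
		close(jobs)
	}()

	// Close results once every worker is done.
	go func() { wg.Wait(); close(results) }()

	// Fan-in: all results arrive on a single channel.
	for r := range results {
		fmt.Println(r)
	}
}
```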
What is CI/CD?
CI/CD means continuous integration and continuous delivery. The idea is to have an automated pipeline that analyzes and tests the code after developers make changes, and then deploys it to an environment where the application is accessible to developers or, for fully tested and approved features, to end users.
The aim of CI/CD is to make code development and deployment more reliable and to automate things that developers would otherwise have to do manually.
A typical CI/CD pipeline consists of the following steps:
First, detect changes in the version control system and build the project.
Then, move the built code into a test environment where it can run, and start the application and related components such as databases and API endpoints.
After that, exercise the application and test it for bugs.
If everything is okay, deploy the application; otherwise, stop the pipeline and notify the developers.
MVC, MVP, MVVM
MVC – Model-View-Controller
MVP – Model-View-Presenter
MVVM – Model-View-ViewModel
In MVC we have three parts of the application: the model, the view, and the controller.
The model is the business logic layer of the application, or in other words, the layer responsible for manipulating data.
The view is the interface layer, responsible for showing data to the user.
The controller is a layer between them, responsible for reliable communication between the other two. For example, when the user presses a button in the view layer, a request is sent to the controller to return some data from the model. The controller checks that the request is valid and, if so, returns the requested data.
The benefit of this organization is that the model, view, and controller are mostly separated from each other, so in an MVC application each of these layers can be modified independently.
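A toy Go sketch of this split, where the "view" is simply the JSON response sent to the client (all names illustrative):

```go
package main

import (
	"fmt"
	"net/http"
)

// Model: business logic / data access.
type UserModel struct{}

func (UserModel) GetName(id int) string { return "Alice" } // stand-in for a DB lookup

// Controller: validates the request, asks the model, shapes the response.
type UserController struct{ model UserModel }

func (c UserController) HandleGetUser(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, `{"name": %q}`, c.model.GetName(42))
}

func main() {
	c := UserController{model: UserModel{}}
	http.HandleFunc("/user", c.HandleGetUser)
	http.ListenAndServe(":8080", nil)
}
```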
What is a pure function?
A pure function is a function with two properties.
First, it has no side effects: all the changes it makes happen inside its own internal scope.
Second, given the same input, it always returns exactly the same output.
Pure functions are easier to test and maintain, and a pure function can also be wrapped in a caching decorator to make it faster in some cases.
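A Go sketch contrasting the two:

```go
package main

import "fmt"

// Pure: the result depends only on the input, and nothing outside changes.
func square(x int) int { return x * x }

// Impure: it reads and mutates state outside its own scope.
var total int

func addToTotal(x int) int {
	total += x // side effect
	return total
}

func main() {
	fmt.Println(square(4), square(4))         // always 16 16
	fmt.Println(addToTotal(4), addToTotal(4)) // 4 8: same input, different output
}
```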
How do you achieve clean architecture in your code?
There are a few things I can remember right now that you can use to make code better.
First of all, separation of responsibilities: code should be divided into modules, and each module should be responsible only for certain narrow functionality.
Then, the whole project structure must be predictable, so you need a clear convention about where different parts of the code live in the project.
Beyond that, the code itself should be clean: no five-hundred-line methods, no meaningless names, nothing like that. It should be easy to understand what the code is about and easy to edit it.
Then, you should follow certain architectural patterns, like MVC or another, so the project is divided into separate layers, each doing only a certain kind of task.
Finally, you should follow the SOLID and KISS principles to make the code more understandable and reliable.
What is Big O notation?
Big O notation is a concept used to measure how resource consumption grows under different algorithms. The resource can be time, the number of operations required to compute the result, or memory consumption.
Big O notation shows how much the consumption of a certain resource changes as the input data grows.
Right now I can remember a few Big O complexity classes.
Complexity can be linear, meaning resource consumption grows at the same rate as the amount of input data.
Complexity can also be logarithmic, linearithmic (n log n), quadratic or another power, factorial, constant, or exponential.
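As an illustration of linear versus logarithmic complexity, compare linear and binary search (a Go sketch):

```go
package main

import "fmt"

// O(n): linear search may scan every element.
func linearSearch(xs []int, target int) int {
	for i, x := range xs {
		if x == target {
			return i
		}
	}
	return -1
}

// O(log n): binary search halves the remaining range on each step
// (xs must be sorted).
func binarySearch(xs []int, target int) int {
	lo, hi := 0, len(xs)-1
	for lo <= hi {
		mid := (lo + hi) / 2
		switch {
		case xs[mid] == target:
			return mid
		case xs[mid] < target:
			lo = mid + 1
		default:
			hi = mid - 1
		}
	}
	return -1
}

func main() {
	xs := []int{1, 3, 5, 7, 9, 11}
	fmt.Println(linearSearch(xs, 7), binarySearch(xs, 7)) // 3 3
}
```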
What are primitives?
Primitives are types that are passed by value.