Backend Flashcards

1
Q

What is the difference between a monolith and microservices?

A

A monolith is a development approach where all our code lives in one project. Backend, frontend, and all the other parts of the application share the same space.
Microservice architecture means we split the application into smaller logical parts, each responsible only for narrow functionality, and treat each of them as a separate project.

Both monolith and microservice architectures have their advantages and disadvantages.
The main microservice advantages are these:

First of all, microservices can be developed by smaller, separate teams. That means potentially higher developer productivity, because teams need less time to communicate and plan their work, and when a new team member starts, there is no need to deeply learn how the whole application works; you just need to learn how your microservice works.

Next, microservices are independent of each other. They can be developed, scaled, and delivered separately. Also, if one microservice crashes, this won't lead to problems in the other services, unless they directly depend on it.

The third advantage is that microservices can be developed using different technologies and programming languages. That means you can use the language most suitable for each service's needs, instead of being limited to one more-or-less universal language.

To sum up, microservices give you flexibility, scalability, and, in most cases, convenience.

On the other side, the monolith. A monolith also has certain advantages over microservice architecture.

First of all, a monolith is easier to manage and monitor. You have one service, not a hundred, and everything is in one place.

Second, data consistency. When your data is distributed among multiple microservices, you can have problems keeping it all consistent at every moment in time, which can lead to bugs. In a monolith, all the data lives in one place, so consistency is much easier to maintain.

Third, performance and data transfer. In a monolith, when you need to access another part of the application, you just call the required functions or methods. In a microservice architecture, you must send a request over the network. This is much slower and reduces the performance of the whole system.

Finally, I would say that testing can become more difficult in certain aspects. For example, it is much harder to test the interaction between application parts in microservices than in a monolith.

To sum up, I would say that a monolith is a good solution for small, low-load applications, because in that case it makes development faster and more reliable, and microservices are perfect for large companies experiencing high load, because they provide better scalability and convenience.

2
Q

What are first-class functions?

A

First-class functions are functions that can be treated like any other value. They can be passed into another function as a parameter, or returned from another function as its return value.
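A minimal TypeScript sketch (function names are illustrative):

// A function stored in a variable.
const double = (n: number): number => n * 2;

// A function passed to another function as a parameter.
const doubled = [1, 2, 3].map(double); // [2, 4, 6]

// A function returned from another function.
function multiplier(factor: number): (n: number) => number {
  return (n) => n * factor;
}
const triple = multiplier(3);
console.log(triple(5)); // 15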

3
Q

What is Docker and what is it used for?

A

Docker is a tool that allows for the creation and deployment of containers.
A container is an isolated, portable software package containing all the components required to run a given application in any environment.

4
Q

What is Kafka and what is it used for?

A

Kafka is a message broker used to store and distribute messages.
The idea of Kafka is to handle messages sent by producers, store them for a long time in the order they were sent, make them accessible to multiple consumers, and do all of these things with high performance and scalability.

So in what situation could we need Kafka?
I would say a typical use case is a situation where multiple services have to communicate asynchronously and be sure that no important information is lost.

For example, let's imagine we have two services, A and B. Service A sends a request to service B, providing certain information, and wants to receive a response containing the results of processing that information.
There are two potential problems here.
The first is that the information can be lost. Say we sent a request from service A to service B, service B received it, but then server B was accidentally shut down, so it lost all the information. If we didn't store this information on service A, and we have no mechanism for this exact case that resends the request after service B comes back up, we have every chance of losing all the information related to that request.
But if we use Kafka, that is not a problem, because after server B reboots it will just read from Kafka all the messages it has not yet successfully handled and restart the process, so no information is lost at all.
The second problem is resource consumption. Yes, of course, we can try to handle this request asynchronously, and while server B is processing our request, the thread on server A can do something else, but this still consumes processing power and memory, which can be quite noticeable if handling the request takes a long time.
But if we use Kafka, service A can just send a message containing all the required information and forget about it. After that, service B reads the message, does the work, and sends another message containing the result, which service A can read later.
This way we increase the performance of both services.

We also need Kafka when we have multiple consumers consuming the same information.
For example, let's imagine we have a service handling employees, and an employee resigned. We need to send information about that to the service that works with documents, to the service that works with salaries, and to the service that tracks, say, equipment issued to employees.
Without Kafka, you would have to send three separate HTTP requests, one to each of these services. That does not sound very time-saving, does it?
That's why you need Kafka: you can send one message and forget about it, and this message can later be read by multiple consumers.

At my current workplace, our team uses it mainly to notify other services about events related to employee status and employee documents: whether a new employee was hired or fired, which documents an employee signed and which they refused to sign, everything like that.
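A minimal producer/consumer sketch using the kafkajs library (the broker address, topic name, and payload are illustrative):

import { Kafka } from "kafkajs";

const kafka = new Kafka({ clientId: "hr-service", brokers: ["localhost:9092"] });

// Producer side: publish the event and forget about it.
async function publishResignation(employeeId: string) {
  const producer = kafka.producer();
  await producer.connect();
  await producer.send({
    topic: "employee-events",
    messages: [{ key: employeeId, value: JSON.stringify({ type: "RESIGNED", employeeId }) }],
  });
  await producer.disconnect();
}

// Consumer side: each interested service reads the same event independently.
async function consumeEvents(groupId: string) {
  const consumer = kafka.consumer({ groupId });
  await consumer.connect();
  await consumer.subscribe({ topics: ["employee-events"], fromBeginning: false });
  await consumer.run({
    eachMessage: async ({ message }) => {
      console.log(`${groupId} received:`, message.value?.toString());
    },
  });
}

The documents, salary, and equipment services would each call consumeEvents with their own group ID, so every one of them receives its own copy of the message.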

5
Q

What is a Kafka topic?

A

A Kafka topic is a kind of channel dedicated only to messages of a certain type with a certain structure.
It is needed to simplify message handling.
Without topics, if we had one common channel, we would have to somehow detect each message's structure, origin, and purpose manually on the consumer side, which is not convenient.
But if we have topics, consumers can be sure about the purpose and structure of each message they read.

6
Q

What is the difference between the delivery guarantee types?
- At most once
- At least once

A

With at-most-once delivery, a message is delivered once or, in case of any problems on the way, not at all.
This is the fastest delivery option, because we don't have to persist our messages to disk to guarantee delivery, and we don't need additional mechanisms to ensure they were delivered.
It can be used in cases when we can afford to lose messages.
For example, web analytics collection or low-importance logs.

At-least-once delivery guarantees that a message is delivered one or more times.
In this case the message is persisted to disk, and when it is sent to Kafka or read from Kafka, Kafka ensures that each of these steps succeeded; for example, it will tell the producer if uploading a message failed.
But there is a chance of message duplication, so we need to implement mechanisms handling it on the consumer side.

7
Q

What does it mean for a Kafka consumer to be idempotent?

A

Consumer idempotence means that the consumer, through its internal mechanisms, protects itself against message duplication: processing the same message twice leaves the system in the same state as processing it once.
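A minimal TypeScript sketch of one such mechanism, deduplication by message ID (the in-memory set is illustrative; a real service would persist processed IDs):

// Track the IDs of already-processed messages.
const processed = new Set<string>();

function handleMessage(id: string, payload: string) {
  if (processed.has(id)) return; // duplicate delivery: safe to ignore
  processed.add(id);
  console.log("processing:", payload);
}

// Processing the same message twice has the same effect as once.
handleMessage("msg-1", "employee resigned");
handleMessage("msg-1", "employee resigned"); // ignored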

8
Q

What is the difference between classical (row-oriented) databases and columnar databases?

A

The difference is that in classical SQL databases, information is stored in rows: the whole table is split into separate items, where each item is a kind of object consisting of the values of all the columns the table has.
In columnar databases, meanwhile, data is stored by columns.
Practically, this means that when we make a request to a regular database, it has to scan whole rows, and only after reading them can it extract the values of the columns we need. Columnar databases don't have this problem, because we read only the columns we need and ignore the rest of the data.

The second major difference is the usage scenarios. A regular database fits situations with frequent inserts, updates, and deletes, where we need access to all or most of the fields in a row. A typical example is a request returning a certain user from the database with all of their fields.
A columnar database is typically used to select thousands or even millions of items containing only a few columns. Also, we usually don't make small requests or inserts; we usually operate on large amounts of data.

9
Q

What is Kubernetes?

A

Kubernetes is container-management software that automates container administration and monitoring, as well as the deployment and scaling of applications inside containers.
Kubernetes helps keep an application always accessible, better suited to operating under high load, and easily recoverable.

Kubernetes operates a group of servers connected to each other. This group is called a cluster, and each server inside the group is called a node.
Nodes are split into two types: master nodes and worker nodes.
The master node is responsible for control and for distributing tasks between worker nodes. It consists of a few main components:
- The API server, responsible for communication between the master and the workers
- The Controller Manager, responsible for observing and controlling the current cluster state, moving the system from the current state to the desired state through pod management
- The Scheduler, whose purpose is to decide which containers should be placed on which nodes depending on different factors, like current node load or potential node performance
- The etcd storage, a simple key-value database storing information about the cluster, like configuration data, node statuses, container statuses, etc.

A worker node also consists of a few components:
- The so-called kubelet, which communicates with the master node and receives instructions about what should run on this exact node and how
- The container runtime, responsible for container images, starting and stopping containers, and resource management
- kube-proxy, responsible for communication and internal network load balancing

Also, the main mechanism Kubernetes uses to reach its goals is pods. Pods are a kind of lightweight wrapper around containers, and by distributing them between nodes, adding or removing pods, and recovering containers inside pods in case of problems, Kubernetes shares the load between different nodes and makes the whole system reliable and recoverable.

// INFO
A container is an isolated, portable software package containing all the components required to run a given application in any environment.

10
Q

What is HTTP protocol?

A

HTTP is a text protocol that operates on the request-response principle.
An HTTP request consists of a start line (including a method, resource identifier, and protocol version), headers, and a message body.
An HTTP response is similar in structure to an HTTP request, but the start line contains a status code instead of a method and URI.
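An illustrative request/response pair (the host and payload are made up):

GET /users/42 HTTP/1.1
Host: api.example.com
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 27

{"id": 42, "name": "Alice"}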

11
Q

What are the main HTTP methods?

A

GET (read a resource), POST (create a resource), PUT (replace or update a resource), DELETE (remove a resource)

12
Q

What is the difference between REST and RPC?

A

REST is an architectural style for developing networked applications in which work is based on interaction with resources. REST uses a set of constraints and patterns to keep the system simple, reliable, and scalable.

Each resource is represented by an API endpoint.
Each request is independent; servers don't store client state.

Another, optional, principle is code on demand. The client does not hold source code itself; instead, the server sends source code to the client, and the client executes it. This gives us flexibility, because we can freely change the source code on the server side and it automatically affects the final result on the client side.

REST is best for standard, scalable APIs with predictable CRUD operations.

RPC (Remote Procedure Call) is a call-based model where the client executes remote functions as if they were local.

It is focused on function execution, not on CRUD operations.
RPC also supports more formats than typical REST: JSON, XML, and binary formats, as in gRPC.
RPC is optimized for high-performance needs.

RPC is best for microservices or high-performance systems where actions matter more than resource management.

13
Q

What are cookies?

A

Cookies are small pieces of data with a limited lifetime that can be sent to the server through special HTTP headers. Both server and client can read and write cookies, and it is possible to make a cookie accessible only from the server side, which is useful for storing authorization tokens. Usually cookies store small pieces of information about, for example, an authorization session inside the application; in that case they store an access token.
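An illustrative exchange (the cookie name and value are made up):

HTTP/1.1 200 OK
Set-Cookie: session=abc123; HttpOnly; Secure; Max-Age=900

GET /profile HTTP/1.1
Cookie: session=abc123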

14
Q

What is a closure, and how/why would you use one?

A

A closure is a function that remembers the variables from the lexical environment in which it was created and can access them later.

I would use a closure when I needed to create a callback, or in more specific cases, for example if I needed a function that returns another function acting as a counter, returning an increased value each time it is called.
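A minimal TypeScript sketch of the counter example (names are illustrative):

// makeCounter returns a function that closes over `count`.
function makeCounter(): () => number {
  let count = 0;
  return () => ++count; // `count` survives between calls
}

const next = makeCounter();
console.log(next()); // 1
console.log(next()); // 2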

15
Q

Can you give an example of a curry function and why this syntax offers an advantage?

A

A curried function is a higher-order function that takes its arguments one at a time and, at each step except the last, returns another function.

Why does this syntax offer an advantage? In some cases we may need an intermediate function with some arguments already applied, which we can then use later in the code without passing the same parameters again and again.
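A minimal TypeScript sketch (names are illustrative):

// A curried add: arguments are taken one at a time.
const add = (a: number) => (b: number) => a + b;

// Partially apply the first argument once...
const addTen = add(10);

// ...and reuse the intermediate function without repeating it.
console.log(addTen(5));  // 15
console.log(addTen(32)); // 42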

16
Q

What is SOLID?

A

SOLID is a set of principles that are recommended to follow to make code easier to develop and maintain.

There are five principles.
The first is Single Responsibility. This means that one class should act in only one aspect of the application. For example, if we have a shop and a class Product that describes one type of goods we sell, then Product can have properties describing the product, but it must not have a method that directly saves information about the product to the database, because that is entirely different functionality. To implement that functionality, we should create another class, focused on the interaction between any kind of product we have and the database.

The second is the Open-Closed principle. This means any class should be open for extension but closed for modification. Put simply, we must provide ways to add new functionality to a class without touching existing code. This is possible using abstract classes and interfaces.

For example, we have a function that takes an array of different geometric shapes and must draw them. Each shape has the coordinates of its top-left point plus shape-specific information: for a square that can be the length of its side, for a triangle the lengths of all three sides, and for other shapes something else.
There are two ways of doing this. The first way: we could create separate classes for each shape, each with its own specific data and methods. Then, in the function, we check which class we are interacting with and, depending on that, draw it by calling class-specific methods or functions.

The other way follows the open-closed principle. We create an interface for an abstract shape, and in this interface we define only the top-left point and a draw() method. After that, we create all the shape classes using that interface. During program execution, we handle each shape as an instance of an object implementing the interface and simply call draw().
Now we are free to add new shape classes without modifying the existing code of the function, as the sketch below shows.
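A minimal TypeScript sketch of this idea (shape classes and fields are illustrative):

// The rendering function depends only on the Shape interface.
interface Shape {
  x: number;
  y: number;
  draw(): void;
}

class Square implements Shape {
  constructor(public x: number, public y: number, private side: number) {}
  draw() { console.log(`square ${this.side} at (${this.x}, ${this.y})`); }
}

class Circle implements Shape {
  constructor(public x: number, public y: number, private radius: number) {}
  draw() { console.log(`circle r=${this.radius} at (${this.x}, ${this.y})`); }
}

// Adding a new shape class never requires changing this function.
function render(shapes: Shape[]) {
  for (const shape of shapes) shape.draw();
}

render([new Square(0, 0, 4), new Circle(2, 3, 1)]);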

The third is the Liskov Substitution principle. It says that any child class of a base class may extend its functionality but must never narrow it.

For example, we have a class describing a Rectangle. At some point we want to add a class describing a Square. The problem is that while a Rectangle's width and height can differ, a Square's width and height are always equal. If we inherit Square from Rectangle and override it so that setting the width also modifies the height, and vice versa, we have violated the principle. This is bad because any function or method that interacts with a Rectangle can, during program execution, receive an instance of Square in its place and suffer from these width and height surprises.

The fourth is the Interface Segregation principle. It says that it is better to have a large number of small, specific interfaces than a small number of big ones.
This is important because, first of all, it helps us develop the program in cases where there are classes with similar functionality that nevertheless differ in some respects.

For example, imagine we have a class describing a calculator, with one big interface for it. This calculator has huge functionality and allows us to use many complex mathematical functions. At some point we decide to create another calculator class, but this time we need only the simplest functionality: the ability to add, subtract, multiply, and divide numbers. And here is the problem: if we use the same interface as for the previous, complex calculator class, we have to implement many methods that will never be used. That is not good.
So, instead of making one big interface, we make many small ones. For example, one interface describes a method that adds two numbers, a second describes multiplication, and so on. Then we can implement just the few small interfaces needed for the simple calculator class.

The last principle is the Dependency Inversion principle.

It says that classes should depend on interfaces or abstract classes instead of concrete classes and functions.

17
Q

What is KISS?

A

KISS (Keep It Simple, Stupid) is a principle that says code must be kept as simple and understandable as possible.

18
Q

What design patterns do you know?

A

Builder. This pattern is useful when we want a convenient and understandable way to create a class instance with a certain set of characteristics.

There are three ways we can go.

The first way is to follow the telescoping constructor pattern. In this case we have one very complicated constructor with many optional parameters, or multiple overloads of the constructor with different parameter lists. The problem is that we have to provide a large number of arguments, which makes the code difficult to write and read.

The second way is the so-called JavaBeans pattern. Here we create the class instance with a no-argument constructor and then initialize all the required fields by calling the corresponding setters right after creation. This pattern solves the readability issues but is less reliable, because you can always forget to initialize some field. This approach also makes it impossible to make objects of the class immutable.

Finally, the builder pattern. Here we create yet another class, the builder, which we use in the first class's constructor. The builder class has all the fields, both optional and required, with getters and setters for each of them. We set up the builder's fields to match the needed configuration and then pass it as the only argument to the first class's constructor.
This way we get all the benefits of the telescoping constructor and JavaBeans patterns. On the one hand, it is easy to read and edit and we can initialize fields in any order, as with JavaBeans; on the other hand, we don't lose the ability to make the final class instance immutable or to require certain input parameters, as with the telescoping constructor.
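A minimal TypeScript sketch (the Pizza example and its fields are illustrative):

// The built class can stay immutable: all fields are readonly.
class Pizza {
  constructor(
    readonly size: number,     // required
    readonly cheese: boolean,  // optional
    readonly mushrooms: boolean,
  ) {}
}

class PizzaBuilder {
  private cheese = false;
  private mushrooms = false;
  constructor(private size: number) {} // required parameter up front

  withCheese() { this.cheese = true; return this; }
  withMushrooms() { this.mushrooms = true; return this; }
  build(): Pizza { return new Pizza(this.size, this.cheese, this.mushrooms); }
}

// Fields are set in any order, the call reads naturally, the result is immutable.
const pizza = new PizzaBuilder(32).withCheese().build();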

The next pattern is Singleton. The idea of the pattern is to have only one instance of a certain class in our application. This can be a class we use to call an API or a database, or simply a class storing our application settings.
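A minimal TypeScript sketch (the Settings class and URL are illustrative):

// Singleton: one shared Settings instance for the whole application.
class Settings {
  private static instance: Settings | null = null;
  private constructor(readonly apiUrl: string) {}

  static get(): Settings {
    if (!Settings.instance) {
      Settings.instance = new Settings("https://api.example.com");
    }
    return Settings.instance;
  }
}

console.log(Settings.get() === Settings.get()); // true: same instance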

Another pattern is event delegation. …

Finally, the last pattern I can remember is Decorator. With the decorator pattern, we wrap an object or function in another object or function to modify its behavior without changing the underlying class or class instance. The simplest examples of this pattern are function debouncing and throttling in JavaScript. Say we have a function that logs certain text to the console, but we don't want it called too often. For throttling, we create another function that takes ours as an input parameter and invokes it at most once per time period. For debouncing, the wrapper postpones the call until a certain period has passed with no new invocations.
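A minimal TypeScript sketch of a debounce decorator (the wait time is illustrative):

// Debounce: fn runs only after `ms` milliseconds of silence.
function debounce<T extends unknown[]>(fn: (...args: T) => void, ms: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), ms);
  };
}

const log = debounce((msg: string) => console.log(msg), 300);
log("a");
log("b"); // only "b" is logged, 300 ms after the last call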

19
Q

What is CI/CD?

A

CI/CD stands for continuous integration and continuous delivery. The idea is to have an automated pipeline that analyzes and tests code after developers make changes, and then deploys it to an environment where the application is accessible to developers or, in the case of fully tested and approved features, to end users.

The aim of CI/CD is to improve the reliability of the code development and deployment process and to automate things that would otherwise have to be done by developers manually.

A typical CI/CD pipeline consists of the following steps:
First, detect changes in the version control system and build the project.
Then, move the built code into a test environment where it can run, and start the application and related things like databases and API endpoints.
After that, exercise the application and test it for bugs.
If everything is okay, deploy the application; otherwise, stop the pipeline and notify the developers.

20
Q

MVC, MVP, MVVM

A

MVC – Model-View-Controller
MVP – Model-View-Presenter
MVVM – Model-View-ViewModel

In MVC we have three parts of the application: the Model, the View, and the Controller.
The Model is the business logic layer of the application, in other words, the layer responsible for manipulating data.
The View is the interface layer, responsible for showing data to the user.
And the Controller is the layer between them, responsible for reliable communication between the other two layers. For example, when the user presses a button in the View layer, a request is sent to the Controller layer to return some data from the Model. The Controller checks that everything is okay with the request and, if so, returns the requested data.
The benefit of this organization of code is that the Model, View, and Controller are mostly separated from each other, so in an MVC application each of these layers can be modified independently.

21
Q

What is a pure (clean) function?

A

A pure (clean) function is a function with two properties.

First of all, it has no side effects: all the changes it makes stay inside its internal scope.

Second, every time we provide the same input, we get exactly the same output.

Pure functions are easier to test and maintain, and a pure function can also be wrapped in a caching decorator to make it work faster in some cases.
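A minimal TypeScript sketch contrasting the two properties, plus the caching decorator mentioned above (function names are illustrative):

// Pure: same input always yields the same output, and no side effects.
const square = (n: number): number => n * n;

// Impure: the result depends on external mutable state.
let factor = 2;
const scale = (n: number): number => n * factor;

// Purity makes caching (memoization) safe.
function memoize(fn: (n: number) => number) {
  const cache = new Map<number, number>();
  return (n: number): number => {
    if (!cache.has(n)) cache.set(n, fn(n));
    return cache.get(n)!;
  };
}

const fastSquare = memoize(square);
console.log(fastSquare(9)); // computed: 81
console.log(fastSquare(9)); // served from cache: 81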

22
Q

How do you achieve clean architecture in your code?

A

There are a few things I can recall right now that you can use to make code better.

First of all, separation of responsibilities: your code should be divided into modules, and each module should be responsible only for certain narrow functionality.

Then, the whole project structure must be predictable, so you need an agreed convention for where different parts of the code live in the project.

After that, the code itself should be clean: no five-hundred-line methods, no meaningless names, nothing like that. It should be easy to understand what your code does and easy to edit it.

Then, you should follow established architectural patterns, like MVC or any other, so your project is divided into separate layers, each doing only a certain kind of task.

Finally, you should follow the SOLID and KISS principles to make your code more understandable and reliable.

23
Q

What is Big O notation?

A

Big O notation is a concept used to describe how an algorithm's resource consumption grows as the input grows. The resource can be time, the number of operations required to compute the result, or memory consumption.

Big O notation shows us how much the consumption of a certain resource changes when the input data changes in a certain direction.

Right now I can remember a few Big O complexity classes.
Complexity can be linear, in other words, resource consumption grows at the same rate as the amount of input data.
Complexity can also be logarithmic, linearithmic (n log n), quadratic or some other power, factorial, constant, or exponential.
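For example, a minimal TypeScript sketch contrasting an O(n) and an O(log n) lookup (function names are illustrative):

// O(n): linear search may scan every element.
function linearSearch(xs: number[], target: number): number {
  for (let i = 0; i < xs.length; i++) {
    if (xs[i] === target) return i;
  }
  return -1;
}

// O(log n): binary search halves the sorted range on each step.
function binarySearch(xs: number[], target: number): number {
  let lo = 0, hi = xs.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (xs[mid] === target) return mid;
    if (xs[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1;
}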

24
Q

What are primitives?

A

Primitives are basic types whose values are passed by value (copied), rather than by reference.
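A minimal TypeScript sketch of the difference (variable names are illustrative):

// Primitives are copied by value...
let a = 1;
let b = a;
b += 1;
console.log(a, b); // 1 2 — `a` is unchanged

// ...while objects are passed around by reference.
const p = { n: 1 };
const q = p;
q.n += 1;
console.log(p.n); // 2 — p and q point to the same object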

25
Q

What is the difference between different databases (MySQL, PostgreSQL, SQLite)?

A

Well… if we talk about differences between relational databases, there can be quite a lot of them, depending on the database.
Depending on the database, the data types that can be stored in its cells may differ, as may other things such as the completeness of the SQL implementation, performance, or extensibility.

For example, let's take a look at three of the most popular databases: SQLite, MySQL, and PostgreSQL.

SQLite is a simple file-oriented database. It provides the most basic functionality and stores the whole database as a single file on disk. It has some limitations compared to the full SQL standard and a rather limited set of data types, but on the other hand it is compact and very portable. An SQLite database is a perfect fit for small local storage that does not need to be accessed from multiple places.

MySQL is a more advanced, and the most popular, server-side database. It provides a much wider set of column types, good scalability, and good out-of-the-box safety, but it still does not implement all SQL operations and has some issues with simultaneous read and write operations.

PostgreSQL is the most advanced relational database, implementing SQL features most completely and having good support for concurrently executed tasks, but it has lower performance and higher complexity compared to the other instruments.

26
Q

What is the difference between monolith and microservices architectures?

A

The main difference is that in a monolith all parts of the solution are stored together, in tight integration, while with microservices the solution is separated into a few independent modules, each responsible for certain narrow functionality and communicating with the others over the network.

On the advantages and disadvantages of each approach:

A monolith is good because all project parts live together in one repository, and if you need to connect one part of the application with another, you can do it easily and quickly.
Microservices are worse in that respect, because if you need to access the functionality of one application part from another part, stored in another repository, you have to do it through HTTP requests or some other network mechanism, and this is much slower and less convenient.

The second point is scalability. It is much easier and faster to scale a module with narrow functionality than to scale a whole monolith.

Also, onboarding on microservice projects is much faster and easier compared to monoliths, because a new developer needs to learn less new information.

Finally, I would say it can be a problem if the microservices have not fully agreed on the format of their endpoints and their functionality. More interaction between different teams is needed to make everything work.
In a monolith, meanwhile, you can always see what works in what way and how the parts are connected to each other.

⏩Monolith:

✔️Simple: One codebase, easier to manage.
✔️Fast: No network delays, just quick performance.
✔️Unified: Easier data management.

❌Scaling: Hard to scale parts individually.
❌Changes: One change can affect the whole system.

⏩Microservices:

✔️Scalable: Scale what you need, when you need it.
✔️Flexible: Teams work independently, deploy faster.
✔️Resilient: One service fails, others keep going.

❌Complex: More moving parts to manage.
❌Slow: Network communication can add delays.
❌Data: Keeping everything in sync is tricky.

27
Q

What is a recent technical challenge you experienced and how did you solve it?

A

Talk about developing the idea for, and rolling out, the new role system.

28
Q

What is ORM?

A

ORM stands for Object-Relational Mapping. The idea of ORM is to make it possible to connect database tables to objects in code without writing SQL queries manually.

Instead, the classes in your code are used to create the database tables, class fields and database columns are linked, and when you make changes and save them, the database is changed automatically too.

So, ORM makes it easier for a regular developer to write database-related code: you don't need to write SQL by hand, which saves time and reduces the chance of writing buggy code. While a less experienced developer can write a complicated, hard-to-read query, an ORM usually generates optimized and reliable SQL, so in most cases it also improves safety.

On the other hand, you need one more dependency in your project, and more code means more possible vulnerabilities. Also, code generated by an ORM can in some cases be slower than code written by a programmer.
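A minimal sketch in the style of the TypeORM library (the entity, its fields, and the configured DataSource are illustrative assumptions):

import { Entity, PrimaryGeneratedColumn, Column, DataSource } from "typeorm";

// The class defines the table; its fields are linked to columns.
@Entity()
class User {
  @PrimaryGeneratedColumn()
  id!: number;

  @Column()
  name!: string;
}

// Saving the object issues the INSERT for us; no hand-written SQL.
async function createUser(dataSource: DataSource, name: string) {
  const repo = dataSource.getRepository(User);
  return repo.save(repo.create({ name }));
}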

29
Q

What is the difference between TCP and UDP?

A

To make a long story short, TCP is slower but more reliable, and UDP is faster but less reliable.

The main reason is that TCP ensures it has successfully connected to the destination server before sending any information, and after sending it ensures the information was delivered successfully.
UDP, meanwhile, doesn't care about that. In UDP, we just send data and forget about it. Whether it was delivered or not is not our problem anymore.

Of course, the question arises: if UDP is so unreliable, why is it still in use? Because in certain areas we need speed, and data loss is not so critical. For example, FPV drone video streaming. We need to send video as fast as possible, to give the operator enough time to react. If the video looks a bit strange because of lost fragments, that is not critical, but if we used TCP here and confirmed that every data chunk was delivered, we could get significant input lag that would make the drone completely impossible to control.

On the other side, there are areas where we need reliability. The simplest examples are websites and file transfer. There we need data integrity for things to work, because even if one percent of the data were lost, the result would be almost useless.

30
Q

What tools are there for monitoring and debugging microservices?

A

There are logs and related instruments like Prometheus and Grafana. Using them you can dynamically see statistics for all kinds of things, from the time needed to handle requests sent to a certain API endpoint to the current server load; it can be almost anything.
We can observe certain metrics and, if they go outside the allowed bounds, create alerts.

Also, we can use tracing instruments, like Jaeger for network request analysis, or Postman for debugging.

In addition, there are more specific things like database or code profilers, which can be used to improve performance.

31
Q

What is JWT?

A

JWT is a JSON Web Token.

This is a special signed authentication token used for access to protected API endpoints.

JWT is needed to avoid sending user credentials like login and password with every request, and to make this process safer.
Its main advantages over that procedure: first of all, a JWT has an expiration time and becomes outdated quite quickly, compared to how long a user's password stays the same. So even if somebody intercepted a token from a request to the server, it would soon be useless, because the server would no longer accept it.
The second reason is that with JWT there is no need to go to the database and compare the data in the database with the data in the token, because the token's signature already guarantees the data is correct.

There are two types of JWT tokens:

The first is the access token. This is the token used for accessing protected resources. It can be used multiple times but has a limited lifespan, usually something like 15 minutes.

The second is the refresh token. The refresh token is used to generate a new access token and refresh token after the existing access token expires.
It can be used only once but has a long lifespan, usually a few days.

A JWT consists of three parts divided by dots:
a header, a body, and a signature.
The header contains information about the signing algorithm, the body contains token information like the expiration time and user login, and the signature is computed from the header plus the body and is used to verify their authenticity.

The header and body are public and not encrypted, only Base64-encoded.

The whole system works like this: after the user successfully authenticates, the server generates the JWT and signs it using a secret key.
After that, the token is stored in client cookies or localStorage.
Every time the client wants to access a protected API endpoint, it also sends this token in the Authorization request header.
When the server receives the token, it recomputes the signature from the token's public part, and if the recomputed signature and the signature from the token match, this proves the token is valid and can be used for authorization purposes.
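A minimal sketch of issuing and verifying a token with the jsonwebtoken library (the secret and payload are illustrative):

import jwt from "jsonwebtoken";

const SECRET = "keep-me-out-of-source-control"; // illustrative secret key

// After successful authentication: issue a short-lived access token.
const accessToken = jwt.sign({ sub: "user-42" }, SECRET, { expiresIn: "15m" });

// On each request to a protected endpoint: verify signature and expiration.
try {
  const payload = jwt.verify(accessToken, SECRET) as jwt.JwtPayload;
  console.log("authorized as", payload.sub);
} catch {
  console.log("token invalid or expired"); // reject the request
}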

32
Q

What is the difference between identification, authentication and authorization?

A

Identification is the process of receiving information that identifies a user, like a login or email.

Authentication is the process of verifying the user's identity using a password or another method.

Authorization is the process of granting the user permission to do certain things, for example by issuing them a special token.

33
Q

Your CV says you have worked with Clickhouse; what did you do with it?

A

One of our services is the overtime service, and once a day a dedicated job collects statistics data for it from a PostgreSQL database, denormalizes it, and sends it to Clickhouse so that various metrics can be analyzed later. That is, what overtime there was in a given period, in which teams, how it was compensated, how much it ultimately cost the company, how it affected developer turnover, and so on.
Several times, when the database structure changed, I reworked this job so that it would not fail with an error.

34
Q

Do you write tests for your code?
What kinds of tests?

A

Yes, I write tests for my code.
I have only written unit tests.

35
Q

What is Prometheus? What does it consist of?
How does it work?

A

Prometheus is a system for collecting and storing metrics, built on a time-series database that stores data aggregated over time.

To collect metrics, Prometheus periodically polls the target services, pulling data from them.

As far as I know, Prometheus itself can visualize data, but on our projects Grafana was used separately for that.

36
Q

What problems have you run into with Kafka in practice?

A

Well, look…
Kafka really has three potential problems.
The first is system unavailability due to rebalancing.
The second is running out of disk space.
The third is that in old Kafka versions message duplication could occur, because only the at-most-once and at-least-once delivery options existed, and there was no exactly-once.

In practice I haven't run into any of them, because the first is solved by configuring the consumer to periodically retry requests; the second is handled in advance, since alerts are set up for the cases when some large amount of storage is already in use; and the third is fixed in newer Kafka versions by the addition of exactly-once delivery.

The point is that Kafka is not static: within the same topic the set of consumers can change, for example because the number of pods changed. Because of this, situations can arise from time to time in which some consumers handle very many partitions while others sit idle. Data is then processed slowly while capacity stands idle.
To avoid this, rebalancing is needed, that is, redistributing partitions between consumers.
While this process runs, the system is unavailable.

37
Q

How does message consumption by consumers work in Kafka?
Suppose there are several situations:
1. I want the same message to be processed by the same service exactly once, even though I may have several dozen instances of that service
2. I want the same message within the same topic to be processed several times, each time by a different service

A

This is implemented through consumer groups.
Consumers within one group share the work, so each message is processed by only one consumer in the group (situation 1), while consumers in different groups each receive their own copy of every message (situation 2); see the sketch below.
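A minimal kafkajs sketch of both situations (the broker address and group IDs are illustrative):

import { Kafka } from "kafkajs";

const kafka = new Kafka({ clientId: "demo", brokers: ["localhost:9092"] });

// Situation 1: all instances of one service share a groupId, so each
// message is processed by exactly one instance within that group.
const orderConsumer = kafka.consumer({ groupId: "order-service" });

// Situation 2: a service with a different groupId receives its own copy
// of every message in the topic, so the message is processed once per group.
const auditConsumer = kafka.consumer({ groupId: "audit-service" });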

38
Q

What is a partitioning key in Kafka?

A

The partitioning key is the value used to determine which partition a message will be written to.
The point is that messages with the same partitioning key are always written to the same partition.