Internet Technologies Flashcards
Can you explain why OAuth is important?
In the early days of the internet, sharing information was straightforward: we gave a service our username and password, and it could access anything it wanted.
Of course, this is not what we want as customers today, because it gives websites far too much power.
OAuth 2 is like giving someone a special key. It gives one application access to specific information in another application, say our Facebook user info but not our friends’ info. We control who gets access to our data without having to share our password, and we can revoke that key at any time.
An example is a photo storage app and a third-party application we want to print our pictures with. The printing app requests access to our photos, and once we grant it, the printing app can access them without ever knowing our login credentials.
How does OAuth work in a nutshell?
- Let’s say there is SnapStore, which stores our pictures, and PrintMagic, which we want to access those pictures and print some of them.
- PrintMagic needs to access our pictures without having our login credentials. SnapStore might run its own authorization server, or it could rely on an external identity provider.
- PrintMagic sends its client ID and a scope, which represents the requested access level, to SnapStore’s authorization server.
- As the resource owner, we authenticate directly with SnapStore and give PrintMagic consent to access our photos.
- Once consent is granted, the authorization server sends an authorization code back to PrintMagic.
- PrintMagic then sends this authorization code, its client ID, and its client secret to the authorization server. The client secret is a key shared only between PrintMagic and the authorization server.
- If the authorization server can verify the authorization code, the client ID, and the client secret, it issues an access token to PrintMagic.
- Finally, PrintMagic uses this access token to request our photos from the SnapStore server.
- It is important to note that the access token’s expiry can be set and that we can revoke the token, which provides another layer of security.
- OAuth 2 also supports refresh tokens: when an access token expires, PrintMagic can obtain a new one without our intervention.
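The steps above can be sketched as a toy, in-memory simulation. Everything in it is a hypothetical stand-in for illustration (the `AuthServer` class, the `photos:read` scope name, the `get_photos` function and the file names); a real deployment would use HTTPS redirects and a vetted OAuth library rather than code like this.

```python
import secrets
import time


class AuthServer:
    """Toy authorization server for illustration only (assumed design)."""

    def __init__(self):
        self.clients = {}  # client_id -> client_secret
        self.codes = {}    # authorization code -> (client_id, scope)
        self.tokens = {}   # access token -> (scope, expiry timestamp)

    def register_client(self, client_id):
        """Register an application and hand it a client secret."""
        secret = secrets.token_hex(16)
        self.clients[client_id] = secret
        return secret

    def authorize(self, client_id, scope, user_consents):
        """The resource owner consents, so a one-time authorization code is issued."""
        if client_id not in self.clients or not user_consents:
            raise PermissionError("authorization denied")
        code = secrets.token_urlsafe(16)
        self.codes[code] = (client_id, scope)
        return code

    def exchange(self, code, client_id, client_secret, ttl_seconds=3600):
        """Swap the code plus client credentials for an access token."""
        issued_to, scope = self.codes.pop(code, (None, None))  # a code works only once
        expected_secret = self.clients.get(client_id, "")
        if issued_to != client_id or not secrets.compare_digest(expected_secret, client_secret):
            raise PermissionError("invalid code or client credentials")
        token = secrets.token_urlsafe(24)
        self.tokens[token] = (scope, time.time() + ttl_seconds)
        return token

    def revoke(self, token):
        """The resource owner takes the key back."""
        self.tokens.pop(token, None)

    def check(self, token, required_scope):
        """True only for a known, unexpired token carrying the required scope."""
        scope, expires_at = self.tokens.get(token, (None, 0))
        return scope == required_scope and time.time() < expires_at


def get_photos(auth, token):
    """Resource server: returns photos only for a valid token."""
    if not auth.check(token, "photos:read"):
        raise PermissionError("invalid, expired, or revoked token")
    return ["beach.jpg", "cat.jpg"]  # made-up sample data


auth = AuthServer()
secret = auth.register_client("printmagic")
code = auth.authorize("printmagic", "photos:read", user_consents=True)
token = auth.exchange(code, "printmagic", secret)
photos = get_photos(auth, token)  # PrintMagic never saw our password
auth.revoke(token)                # after this, the same token is rejected
```

Note that the password never appears anywhere in the sketch: PrintMagic only ever holds the code, its own client secret, and the token.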
Tell me about Message Queues
A message queue is a software component that enables different parts of a system to work asynchronously by sending and receiving messages. It sits in the middle and lets the sender and receiver act independently.
Message queues are crucial for building scalable, loosely coupled and fault-tolerant systems.
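As a minimal sketch of the idea, Python’s standard-library `queue.Queue` can stand in for a real message broker inside a single process. The `orders` queue, the message shape, and the `None` sentinel are assumptions made for this example.

```python
import queue
import threading

orders = queue.Queue()  # the queue sitting between sender and receiver
processed = []


def producer():
    """Sender: drops messages on the queue and moves on without waiting."""
    for order_id in range(3):
        orders.put({"order_id": order_id})
    orders.put(None)  # sentinel meaning "no more messages" (assumed convention)


def consumer():
    """Receiver: handles messages at its own pace."""
    while True:
        message = orders.get()  # blocks until a message is available
        if message is None:
            break
        processed.append(message["order_id"])


worker = threading.Thread(target=consumer)
worker.start()
producer()
worker.join()
```

The producer never calls the consumer directly; either side could be slowed down or replaced without the other knowing, which is the loose coupling described above.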
Can you share some tips about building and securing APIs?
1. Use clear naming: It tells developers a lot, such as whether they are dealing with a collection of resources or a single resource.
2. Make APIs idempotent: APIs are likely to be called more than once, so we need to make sure they are idempotent, meaning repeated calls won’t create duplicate records or cause other unintended side effects on the backend. For this, we can require a client-generated unique ID on every request, so that on a repeated call the API can respond that the object already exists.
3. Add versioning: To avoid impacting our current API consumers and to support backward compatibility, we need proper versioning for new features.
For example, /api/v1/carts/123 can become /api/v2/carts/123.
4. Add pagination: To enhance the performance of our APIs and improve the user experience, we need to limit the amount of data sent to the client. For this, we can use pagination. There are two common approaches: cursor-based and offset-based (page number plus page size).
5. Use clear query strings for sorting and filtering data:
For example:
GET /users?sort_by=registered
GET /products?filter=color:blue
This has three benefits. First, developers can instantly see which filters and sorts are applied. Second, it is much easier to add new sorting or filtering criteria over time without breaking the existing ones. Third, we can cache filtered results and reuse them.
6. Think about security early on: Pass sensitive data like API keys in HTTP headers instead of URLs. Request headers can also be exposed, so use TLS encryption at every step. And enforce robust access control by verifying keys and tokens at every step of request processing.
7. Keep cross-resource references simple: For example, an item in a cart should be referenced simply as /api/v1/carts/123/items/456, not as /api/v1/items?cart_id=123&item_id=456. This avoids messy query parameters and helps developers consuming your API.
8. Plan for rate limiting: This prevents our systems from being overloaded in case of abuse and protects our infrastructure. Typical limits are 20 requests per second from one IP address, or 1,000 requests per day for free-tier clients.
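Tip 2 (idempotency through a client-generated ID) is probably the least obvious of these, so here is a minimal in-memory sketch. The `CartService` class, the status codes it returns, and the use of a UUID as the request ID are assumptions for illustration, not a prescribed design.

```python
import uuid


class CartService:
    """Toy backend that remembers which request IDs it has already handled."""

    def __init__(self):
        self.carts = {}          # cart_id -> cart record
        self.seen_requests = {}  # client-generated request ID -> cart_id
        self.next_id = 1

    def create_cart(self, request_id, owner):
        # A repeated request ID means a retry: return the existing cart
        # instead of creating a duplicate.
        if request_id in self.seen_requests:
            cart_id = self.seen_requests[request_id]
            return {"status": 200, "cart": self.carts[cart_id]}

        cart_id = self.next_id
        self.next_id += 1
        self.carts[cart_id] = {"id": cart_id, "owner": owner}
        self.seen_requests[request_id] = cart_id
        return {"status": 201, "cart": self.carts[cart_id]}


service = CartService()
request_id = str(uuid.uuid4())  # generated once by the client, reused on retry

first = service.create_cart(request_id, owner="alice")
retry = service.create_cart(request_id, owner="alice")  # e.g. after a timeout
```

Both calls return the same cart, and the backend ends up with exactly one record. A production version would also need to expire old request IDs and persist them somewhere shared, which this sketch leaves out.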
What is a REST API?
REST is the most common communication standard between computers over the internet. It is simple and good enough for most companies, which is why it is so widely used. API stands for Application Programming Interface; it is a way for two computers to talk to each other. REST, which stands for REpresentational State Transfer, is the API standard that most mobile and web applications use to talk to their servers.
REST is not a specification. It is a loose set of rules that has been the common standard for building web APIs since the early 2000s. An API that follows the REST standard is called a RESTful API.
What are the basics of a REST API?
- A REST implementation should be stateless. This means the two parties don’t need to store any information about each other, and every request is independent of the others. This makes web applications scalable and easy to manage.
- A RESTful API organizes resources into a set of unique URIs (uniform resource identifiers). The URIs group different types of resources on a server (like /products or /users).
- Resources should be named with nouns, not verbs. Example: /getallproducts is wrong; /products is correct.
- A request has a very specific format (POST /products HTTP/1.1). The request line contains the URI of the resource we’d like to access. The URI is preceded by an HTTP verb, which tells the server what we want to do with the resource.
- POST - Create, GET - Read, PUT - Update, DELETE - Delete. (Apart from POST, all of them are defined as idempotent by HTTP, but we still need to make sure our implementation actually behaves that way.)
- The server gets the request, processes it, and returns a response. The first line of the response contains the HTTP status code, a three-digit code that summarizes the outcome of the request.
- 200-level: success. 400-level: something is wrong with our request. 500-level: something is wrong with the server.
- Requests that fail with a 500-level code can be retried, but first it is important to know whether the API is idempotent. Otherwise we might get inconsistent results.
- Pagination, versioning, and careful handling of sensitive values are all really important.
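The status-code classes and the retry rule can be sketched as a small client-side helper. This is one possible policy, not a standard one: the function names and the `has_idempotency_key` flag are assumptions for this example. It relies on HTTP semantics, where GET, PUT, and DELETE are idempotent and POST is not.

```python
IDEMPOTENT_METHODS = {"GET", "PUT", "DELETE"}  # POST is deliberately absent


def classify(status):
    """Map an HTTP status code to the broad classes described above."""
    if 200 <= status < 300:
        return "success"
    if 400 <= status < 500:
        return "client_error"
    if 500 <= status < 600:
        return "server_error"
    return "other"


def should_retry(method, status, has_idempotency_key=False):
    """Retry only server errors, and only when repeating the call is safe."""
    if classify(status) != "server_error":
        return False  # a 400-level error will fail again until we fix the request
    return method.upper() in IDEMPOTENT_METHODS or has_idempotency_key
```

For instance, `should_retry("GET", 503)` is true, while `should_retry("POST", 503)` is false unless the API accepts a client-generated ID that makes the POST safe to repeat.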
What is GraphQL?
GraphQL is a query language for APIs developed by Meta. It provides a schema of the data in the API and gives clients the power to ask for exactly what they need.
How does GraphQL work?
GraphQL sits between the clients and the backend services. It can aggregate multiple resource requests into a single query. It also supports mutations and subscriptions.
- Mutations are GraphQL’s way of applying data modifications to resources.
- Subscriptions are GraphQL’s way for clients to receive notifications on data modifications.
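To make "ask for exactly what they need" concrete, here is a toy field-selection function in plain Python. It is not a GraphQL engine and uses no GraphQL library; the `select` function, the nested-dict selection format, and the sample book record are all made up for this sketch.

```python
def select(data, selection):
    """Return only the requested fields.

    `selection` maps field names to None (take the value as is) or to a
    nested selection dict (descend into a related object or list of objects).
    """
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub_selection in selection.items():
        value = data[field]
        result[field] = select(value, sub_selection) if sub_selection else value
    return result


# Made-up record as the backend might hold it.
book = {
    "id": "123",
    "title": "Example Book",
    "pages": 412,
    "authors": [{"name": "Ada", "born": 1815}],
}

# The client asks for the title and each author's name, nothing else.
wanted = {"title": None, "authors": {"name": None}}
response = select(book, wanted)
```

The response contains only `title` and the authors’ `name` fields; `pages` and `born` never leave the server because the client did not ask for them.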
Can you compare REST and GraphQL?
Similarities:
- Both send HTTP requests and receive HTTP responses.
- Both make a request via a URL.
- Both can return a JSON response in the same shape.
Differences:
- With GraphQL, we specify the exact resources we want, and also which fields we want.
GET /graphql?query={ book(id: "123") { title, authors { name } } }
In REST, the API implementer decides for us whether related resources such as authors are included in the response. In GraphQL, the client decides what to include. This is a benefit of GraphQL.
- GraphQL doesn’t use URLs to specify the resources that are available in the API. Instead, it uses a GraphQL schema. We can send a complex query that fetches additional data according to relationships defined in the schema. Doing the same in REST is more complicated: we would have to do it client side with multiple requests, which can lead to the N+1 query problem.
- In REST, we don’t need special libraries to consume someone else’s API. Requests can simply be sent using common tools like curl or a web browser. In contrast, GraphQL requires heavier tooling support on both the client and server sides. This requires a sizable upfront investment, which might not be worth it, especially for very simple CRUD APIs.
- Another criticism of GraphQL is that it is more difficult to cache. REST uses HTTP GET for fetching resources, and HTTP GET has a well-defined caching behavior that is leveraged by browsers, CDNs, proxies, and web servers. GraphQL has a single point of entry and uses HTTP POST by default. This prevents the full use of HTTP caching.
- The final concern with GraphQL is that while it allows clients to query for just the data they need, this flexibility also poses a great danger. Imagine a mobile application that ships a new feature which causes an unexpected table scan of a critical database table in a backend service. This could bring the database down as soon as the new application goes live. It is possible to mitigate this risk, but doing so adds even more complexity to a GraphQL implementation. The cost of safeguarding against risks like this must be factored in when considering GraphQL.
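The N+1 point in the list above is easier to see by counting round trips. The sketch below fakes both clients in memory; `RestClient`, `GraphClient`, their method names, and the sample data are hypothetical and exist only to count requests.

```python
# Made-up sample data held by the "server".
AUTHORS = {1: {"name": "Ada"}, 2: {"name": "Grace"}, 3: {"name": "Linus"}}
BOOK = {"id": "123", "title": "Example Book", "author_ids": [1, 2, 3]}


class RestClient:
    """One request per resource, as with /books/123 and /authors/{id}."""

    def __init__(self):
        self.requests = 0

    def get_book(self, book_id):
        self.requests += 1
        return BOOK

    def get_author(self, author_id):
        self.requests += 1
        return AUTHORS[author_id]


class GraphClient:
    """One request carrying a query that follows the book -> authors relationship."""

    def __init__(self):
        self.requests = 0

    def book_with_authors(self, book_id):
        self.requests += 1
        authors = [AUTHORS[i] for i in BOOK["author_ids"]]
        return {"title": BOOK["title"], "authors": authors}


rest = RestClient()
book = rest.get_book("123")                                             # 1 request
rest_names = [rest.get_author(i)["name"] for i in book["author_ids"]]  # + N requests

graph = GraphClient()
graph_names = [a["name"] for a in graph.book_with_authors("123")["authors"]]
```

With three authors, the REST-style client makes four round trips and the GraphQL-style client makes one, for the same list of names. The sketch only counts client-to-server requests; how the server resolves the authors internally is a separate question.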
Can you tell me the most popular API Architecture styles?
- SOAP:
  - Mature, enterprise-oriented, and XML-based; the financial world loves it.
  - Complex, and overkill for an MVP.
- RESTful:
  - The backbone of the internet; simple.
  - Not the best for real-time data or highly connected data models.
- gRPC: Uses protocol buffers by default.
  - Flexible and modern.
  - Limited browser support if you have browser clients.
- GraphQL: Allows the client to fetch the exact data it needs.
  - Flexible and efficient for web and mobile applications.
  - Steep learning curve, overhead for getting the tooling and everything else needed to work, and heavier on the server side.
- WebSocket: Real-time, bidirectional, and persistent connections.
  - Perfect for live-chat applications and online gaming.
  - If your app doesn’t require real-time data, WebSocket might be unnecessary overhead.
- Webhook: All about being event-driven; uses HTTP callbacks to provide async operations.
  - GitHub uses webhooks to trigger other systems whenever new code is pushed.
  - For immediate responses and synchronous communication, a webhook is not the best option.