Technical Interview Flashcards
Caching
Caching is about saving requested data to a faster or closer data store so that the data can be served more quickly the next time it is requested.
Caching takes advantage of locality of reference, which is the tendency to access the same information over and over again. For more details, refer to locality in the glossary.
Caching offers the following benefits:
1) Reduces user wait time
2) Saves network bandwidth
3) Eliminates unnecessary computation time
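As a minimal Python sketch of the idea, here is a read-through lookup; the names fetch_user_profile, cache, and slow_database are hypothetical stand-ins for a real lookup path:

def fetch_user_profile(user_id, cache, slow_database):
    # Check the faster store (the cache) first.
    profile = cache.get(user_id)
    if profile is None:
        # Cache miss: fall back to the slower source, then save the result
        # so repeat requests can be served from the cache.
        profile = slow_database[user_id]
        cache[user_id] = profile
    return profile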
API
API is an acronym that stands for application programming interface.
An API exposes programming functions to third-party developers. These developers can incorporate and execute those functions in their code. These third-party developers cannot see or change how the underlying functions are implemented.
For example, a ridesharing app can request and display map data via the Google Maps API.
REST APIs
REST stands for Representational State Transfer. REST is a way for two systems to communicate over HTTP, similar to how web browsers communicate with servers.
A REST API is an important standard for exchanging information between two systems. Other standards, like SOAP, were unnecessarily complex and arbitrary.
An API is called a REST API or RESTful API when it follows these six design principles:
1) Client-server architecture. The client and server applications are separate. Each can change independently of the other.
2) Stateless. The client application’s state is not stored on the server. Instead, the client’s state is passed around to every system that needs it.
3) Cacheable. Responses should indicate whether the data transferred between client and server is cacheable.
4) Uniform interface. The API must present a uniform interface. It should use descriptive naming conventions and consistent link and data formats, such as JSON.
5) Layered system. The API should sit within a layered architecture, similar in spirit to MVC. For example, the API, data, and authentication systems can sit on different servers.
6) Code on demand. This principle is optional; it allows the server to return executable code (such as JavaScript) to the client.
RESTful APIs typically use standard HTTP verbs such as:
GET - retrieves information from a server
POST - writes new information to a server
PUT - updates existing information on a server
DELETE - removes information from a server
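As an illustration, here is how these verbs might be used from Python with the requests library; the endpoint api.example.com and its routes are hypothetical:

import requests

BASE = "https://api.example.com"   # hypothetical REST service

resp = requests.get(f"{BASE}/users/42")                           # GET: retrieve a user
resp = requests.post(f"{BASE}/users", json={"name": "Ada"})       # POST: create a user
resp = requests.put(f"{BASE}/users/42", json={"name": "Ada L."})  # PUT: update a user
resp = requests.delete(f"{BASE}/users/42")                        # DELETE: remove a user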
Tell me about “Reading from Cache”
When reading from the cache, sometimes the requested data is there. Sometimes not. We refer to this data availability as a cache hit or miss:
Cache Hit: Data is available in cache
Cache Miss: Data not available in cache
Along with data availability, we care about cache freshness. Freshness refers to whether the cache’s information is up to date. Out of date, or stale, data can be a concern.
For example, stale bank data can alarm both clients and banks, while an older version of a personal web page is less catastrophic.
The most common way cache systems determine freshness is with age. Cache systems often delete cached data that exceeds an age threshold, called the time-to-live (TTL).
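A minimal Python sketch of hits, misses, and TTL-based freshness, assuming a simple dictionary cache and a hypothetical load_from_source function:

import time

cache = {}          # key -> (value, stored_at)
TTL_SECONDS = 60    # freshness threshold (time-to-live)

def read(key, load_from_source):
    entry = cache.get(key)
    if entry is not None:                          # cache hit
        value, stored_at = entry
        if time.time() - stored_at < TTL_SECONDS:  # still fresh
            return value
        del cache[key]                             # stale: evict and treat as a miss
    value = load_from_source(key)                  # cache miss: go to the slower source
    cache[key] = (value, time.time())
    return value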
What are the three common cache writing policies?
- Write-back: Write to the cache only.
Pro: Low latency and high throughput.
Con: Potential data loss, especially if the cache holds the only copy during a crash.
- Write-through: Write to the cache and permanent storage at the same time.
Pro: Data consistency between the cache and storage.
Con: Higher latency; every write operation has to be performed twice.
- No-write (AKA write-around): Write to permanent storage only.
Pro: The cache isn't flooded with write requests.
Con: Higher latency; a read of recently written data will cause a cache miss.
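The differences can be sketched in Python; cache, storage, and dirty_keys are hypothetical in-memory stand-ins:

def write_back(key, value, cache, dirty_keys):
    # Write to the cache only and flush to storage later.
    # Fast, but the data can be lost if the cache crashes before the flush.
    cache[key] = value
    dirty_keys.add(key)

def write_through(key, value, cache, storage):
    # Write to the cache and permanent storage at the same time.
    # Consistent, but every write is performed twice.
    cache[key] = value
    storage[key] = value

def write_around(key, value, cache, storage):
    # Write to permanent storage only; drop any cached copy.
    # The next read of this key will be a cache miss.
    storage[key] = value
    cache.pop(key, None)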
Tell me about “Replacing the Cache”
Caches do not have infinite space, so a cache system dictates rules on what should be evicted first. Here are some of the most common eviction policies:
1) FIFO
2) LIFO
3) Least Recently Used (LRU)
4) Most Recently Used (MRU)
5) Random Replacement
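As an illustration of one of these policies, here is a small LRU cache sketch in Python; the class name and capacity are arbitrary choices:

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None                       # cache miss
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry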
Machine Learning
Machine Learning
Machine learning refers to algorithms that perform tasks based on inference rather than rules.
These inferences are derived from mathematical models that “learn” from large amounts of structured data.
For example, a developer can create a music recommendation service based on rules, such as an explicit rule that says: if a user likes artist A, then recommend music from artist B.
Machine learning is different. A machine learning algorithm is instead given training data, and from that training data it infers which data points predict a particular outcome.
Examples of training data for a music recommendation service include:
1) Listening history. Users who listened to the same songs are likely to enjoy similar music, so one user’s history can inform another’s recommendations. This is called collaborative filtering (see the sketch after this list).
2) Keywords. The song’s metadata or lyrics can provide clues. For example, the song’s metadata might indicate that the song is appropriate for toddlers, or the lyrics might indicate that the song is related to New York.
3) Audio file analysis. The music recommendation service might analyze the song file’s characteristics including tempo, loudness, key and time signature.
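A toy Python sketch of collaborative filtering from listening history; the users, songs, and simple vote-counting rule are made up purely for illustration:

from collections import Counter

history = {                                   # hypothetical user -> songs listened to
    "alice": {"song_a", "song_b", "song_c"},
    "bob":   {"song_a", "song_c", "song_d"},
    "carol": {"song_b", "song_e"},
}

def recommend(user, history, top_n=3):
    listened = history[user]
    scores = Counter()
    for other, songs in history.items():
        if other == user or not (songs & listened):
            continue                          # skip the user and non-overlapping listeners
        for song in songs - listened:
            scores[song] += 1                 # each overlapping listener adds one vote
    return [song for song, _ in scores.most_common(top_n)]

print(recommend("alice", history))            # e.g. ['song_d']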
Other popular applications of machine learning include:
- Fraud detection
- Self-driving cars
- Voice recognition
- Email spam detection
- Shopping and movie recommendations
What are “features” when referring to Machine Learning?
Features
A feature is a property of an event, observation, or data point. Here are some examples:
1) For a music recommendation, a song’s tempo combined with genre may accurately predict music one would like.
2) In email spam detection, the word “FREE” in the subject line may more accurately predict spam.
3) In eCommerce, whether a user is browsing on an Apple device may more accurately predict how likely that user is to make a purchase.
Choosing features is very important. It can significantly impact prediction accuracy.
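For illustration, features for the spam-detection example above might be extracted like this in Python; the email fields and feature names are hypothetical:

def extract_features(email, known_contacts):
    # Turn one email (a dict of raw fields) into a small set of predictive features.
    return {
        "subject_contains_free": int("FREE" in email["subject"].upper()),
        "num_links": email["body"].count("http"),
        "sender_known": int(email["sender"] in known_contacts),
    }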
What is “Training data” when referring to Machine Learning?
Training Data
Training data, as the name implies, is data that trains machine learning models.
What is “Validation Data” when referring to Machine Learning?
Validation Data
After a machine learning model has been trained, the validation data set is run through the model to gauge its accuracy.
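A minimal sketch of splitting labeled examples into training and validation data; the fraction and function name are arbitrary:

import random

def train_validation_split(examples, validation_fraction=0.2, seed=0):
    # Shuffle the examples, then hold out a portion that the model never trains on.
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cutoff = int(len(shuffled) * (1 - validation_fraction))
    return shuffled[:cutoff], shuffled[cutoff:]   # (training data, validation data)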
Machine Learning: Supervised vs. Unsupervised Learning
Supervised vs. Unsupervised Learning
Supervised learning trains a model on labeled data: each input comes with a known output (label), and the model learns to predict that output for new inputs.
Unsupervised learning trains a model on unlabeled data: there are no known outputs, so the model must find patterns or structure in the inputs on its own.
For example, a self-driving car application may be given clear (labeled) input that certain intersections have red stop signs. Based on this labeled data, the machine learning algorithm can infer that the car should stop when it encounters an intersection with a stop sign. This is an example of supervised learning.
However, a self-driving car application may not be given labeled data indicating which intersections have red stop signs. Instead, it would have to infer, from the data available to it, when the car should stop. For example, it may wrongly infer:
Stop: When it approaches an intersection and other cars are slowing down.
vs
Not stop: When it approaches an intersection and other cars aren’t present
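The contrast can be sketched with scikit-learn, assuming that library is available; the tiny data set and its mapping to the stop-sign example are illustrative only:

from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Supervised: every observation comes with a label (1 = stop sign present).
X_labeled = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
y_labels  = [1, 1, 0, 0]
classifier = LogisticRegression().fit(X_labeled, y_labels)
print(classifier.predict([[0.85, 0.15]]))    # predicted label for a new intersection

# Unsupervised: no labels, so the model can only group similar observations.
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X_labeled)
print(clusters)                              # cluster ids with no built-in meaning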
Machine Learning: Neural networks
Neural networks
A neural network is another learning model. Like other machine learning models, it learns from data rather than from explicit rules.
The neural network’s quirky name comes from the “neurons” in the human brain. A node in a neural network processes signals (i.e., data) like a neuron does, and can then pass its signal on to the other nodes it is connected to.
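A toy forward pass in Python shows the idea; the weights are arbitrary and there is no training step here:

import numpy as np

def neuron(inputs, weights, bias):
    # Weight the incoming signals, add a bias, then apply a non-linearity.
    return np.tanh(np.dot(inputs, weights) + bias)

x = np.array([0.5, -1.0])                    # input signals
hidden = np.array([
    neuron(x, np.array([0.1, 0.4]), 0.0),    # two "neurons" in a hidden layer...
    neuron(x, np.array([-0.3, 0.2]), 0.1),
])
output = neuron(hidden, np.array([0.7, -0.5]), 0.0)   # ...signal a connected neuron
print(output)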
Machine Learning: Batching
Batching
Batching is the practice of grouping portions of the training data together. These groups (batches) are “learned” from together, instead of one example at a time. Batching the training data reduces the number of times gradients are calculated to adjust weights through backpropagation.
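A minimal sketch of iterating over batches; the batch size is arbitrary:

def batches(training_data, batch_size):
    # Yield fixed-size slices of the training data instead of single examples.
    for start in range(0, len(training_data), batch_size):
        yield training_data[start:start + batch_size]

# Gradients and weight updates happen once per batch, so larger batches mean
# fewer backpropagation passes over the same amount of data.
for batch in batches(list(range(10)), batch_size=4):
    print(batch)   # [0, 1, 2, 3], then [4, 5, 6, 7], then [8, 9]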
Machine Learning: Deep Learning
Deep Learning
Group of machine learning methods that are fundamentally based on neural networks. This includes variations of neural networks such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs).
What is Microservice Architecture?
Also called microservices, microservice architecture is an application designed with loosely coupled services or submodules.
An example of this is an eCommerce store: there might be an account service, an inventory service, and a shipping service, each with its own database. A mobile app and a browser would tie into these loosely coupled services.
Loose coupling makes it easy to develop and run each service independently, which makes the application easier to maintain, test, and scale. The microservices can even be written in different programming languages.
This is often considered better than a monolithic application, which is developed from start to finish as a single unit. Monolithic applications can become poorly organized, making them hard to debug, maintain, or extend.
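As a sketch of the idea, here is a stand-alone inventory service using Flask, assuming that library is available; the route and data are hypothetical, and the account and shipping services would be separate programs with their own storage:

from flask import Flask, jsonify

app = Flask(__name__)
inventory = {"sku-1": 12, "sku-2": 0}        # this service owns its own data

@app.route("/inventory/<sku>")
def get_stock(sku):
    # Other services and clients reach this data only through the HTTP API.
    return jsonify({"sku": sku, "in_stock": inventory.get(sku, 0)})

if __name__ == "__main__":
    app.run(port=5001)                       # each microservice runs and scales on its own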