Technical Interview Flashcards
CIDR notation - what is it used for? How would one use it?
Classless Inter-Domain Routing
Used to define a block of IP addresses.
Used to assign a certain number of IPs to a subnet, or in route tables to specify where traffic will route if it is destined for an IP within the given CIDR block. (A quick sketch with Python's ipaddress module follows this list.)
- 140.1.0.0/16 (a /16 if you wanted a subnet with 65,536 IPs)
- 140.1.0.0/28 (a /28 if you wanted a subnet with 16 IPs)
AWS reserves five addresses in every subnet:
xxx.xxx.xxx.0 - Network address
xxx.xxx.xxx.1 - VPC router
xxx.xxx.xxx.2 - DNS server
xxx.xxx.xxx.3 - Reserved for future use
xxx.xxx.xxx.255 - Network broadcast address (reserved; broadcast is not supported in a VPC)
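A minimal sketch of how the prefix-length math works out, using Python's standard ipaddress module (140.1.0.0 is just the example prefix from above):

    import ipaddress

    # /16 leaves 16 host bits -> 2**16 = 65,536 addresses in the block
    big = ipaddress.ip_network("140.1.0.0/16")
    print(big.num_addresses)        # 65536

    # /28 leaves 4 host bits -> 2**4 = 16 addresses in the block
    small = ipaddress.ip_network("140.1.0.0/28")
    print(small.num_addresses)      # 16
    print(list(small.hosts()))      # the usable host addresses inside the block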
Difference between DAS, SAN & NAS?
Direct Attached Storage - storage connected directly to a computer without going through a network.
Storage Area Network - a specialized, high-speed network that provides block-level network access to storage. When connected to a computer, it often appears as direct-attached storage.
Network Attached Storage - a file-level data storage server connected to a computer network, providing data access to a heterogeneous group of clients.
Accurately describe what deduplication is.
Deduplication is the reduction or elimination of redundant data by storing duplicated portions of a dataset only once.
FSx for Windows File Server - data compression is enabled by default when you use data deduplication, further reducing storage consumption by compressing the data after deduplication.
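A rough illustration of the idea (not any particular product's implementation): store each unique chunk once, keyed by its content hash, and keep only references for repeats.

    import hashlib

    store = {}        # content hash -> chunk (each unique chunk stored once)
    file_index = []   # ordered list of hashes that reconstructs the data

    def write_chunk(chunk: bytes):
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:        # new data: store it
            store[digest] = chunk
        file_index.append(digest)      # duplicate data: only a reference is kept

    for chunk in [b"hello", b"world", b"hello"]:   # "hello" is stored only once
        write_chunk(chunk)

    print(len(store), "unique chunks for", len(file_index), "logical chunks")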
Object Storage vs. block storage vs. file storage
(S3) Object Storage - objects are discrete units of data stored in a structurally flat data environment. Each object is a simple, self-contained repository that includes the data, metadata, and a unique identifier.
(EBS) Block storage - breaks up data into blocks and then stores those blocks as separate pieces
(EFS) File storage - hierarchical storage methodology used to organize and store data on a computer hard drive or on network-attached storage
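A quick contrast of the access models (bucket name, key, and mount path are placeholders I made up): objects are addressed by bucket + key in a flat namespace, files by a hierarchical path; block storage (EBS) isn't addressed in code at all, it shows up to the OS as a raw device (e.g. /dev/xvdf) that you put a filesystem on.

    import boto3

    s3 = boto3.client("s3")

    # Object storage: flat namespace, each object carries data plus metadata.
    s3.put_object(
        Bucket="my-example-bucket",              # placeholder bucket
        Key="reports/2024/summary.txt",
        Body=b"hello",
        Metadata={"owner": "analytics"},
    )

    # File storage: hierarchy of directories and files, addressed by path,
    # assuming an EFS filesystem mounted at /mnt/efs.
    with open("/mnt/efs/reports/2024/summary.txt", "wb") as f:
        f.write(b"hello")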
What is a CDN and describe how it works on a technical level?
Example: CloudFront
Content Delivery Network
Viewer wants content (an image) from your website -> DNS routes the request to the nearest edge location, usually based on latency -> CloudFront checks its cache (and returns what is cached in the distribution)
NOT CACHED:
If it's not cached, CloudFront compares your request with the specifications in the distribution and forwards the request to your specified origin; the origin then sends the requested content back to the edge location.
As soon as the first byte arrives from the origin, CloudFront begins to forward the object to the user.
CloudFront also adds the object to the cache in preparation for the next time someone requests it.
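Not CloudFront's actual internals, but a minimal cache-aside sketch of the edge behavior described above (the origin URL is a placeholder, and the real service streams the response rather than buffering it):

    import urllib.request

    ORIGIN = "https://origin.example.com"   # placeholder origin
    cache = {}                              # edge location's cache: path -> content

    def handle_request(path: str) -> bytes:
        if path in cache:                   # cache hit: serve from the edge
            return cache[path]
        # Cache miss: forward to the origin, return the content to the viewer,
        # and keep a copy for the next request.
        with urllib.request.urlopen(ORIGIN + path) as resp:
            body = resp.read()
        cache[path] = body
        return body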
Describe serverless functions. What is their advantage?
AWS Lambda is an example of a serverless function; it lets you use compute resources without having to launch or manage the underlying infrastructure.
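A minimal Python Lambda handler, just to show the shape: you supply the function, AWS runs it on demand in response to events and scales it for you, and you pay per invocation.

    import json

    def lambda_handler(event, context):
        # AWS invokes this in response to an event (API Gateway, S3, etc.)
        # and provisions/scales the underlying compute for you.
        name = event.get("name", "world")
        return {
            "statusCode": 200,
            "body": json.dumps({"message": f"hello, {name}"}),
        }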
What is an advantage of putting TLS on a Load Balancer vs. on the webserver itself?
It allows you to create secure, encrypted communications to your ELB within your VPC, which then communicates with your webservers, with very little operational overhead or administrative complexity around handling or updating certificates.
You can simply upload a certificate and tell your ELB which certificate to use when adding TLS, instead of handling termination within each EC2 instance, which adds load to the instance and requires you to install the certificate on every instance.
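A hedged sketch with boto3: one ACM certificate attached to an HTTPS listener and the load balancer terminates TLS, instead of installing and rotating certificates on every instance. All ARNs and the policy name below are placeholders.

    import boto3

    elbv2 = boto3.client("elbv2")

    # One certificate on the listener terminates TLS for everything behind it.
    elbv2.create_listener(
        LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:111122223333:loadbalancer/app/example/123",
        Protocol="HTTPS",
        Port=443,
        SslPolicy="ELBSecurityPolicy-TLS13-1-2-2021-06",
        Certificates=[{"CertificateArn": "arn:aws:acm:us-east-1:111122223333:certificate/example"}],
        DefaultActions=[{
            "Type": "forward",
            "TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/web/456",
        }],
    )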
Name the tiers of the three-tier architecture.
Presentation/Web Tier: Frontend - user interface, user interactions (GUI)
Application Tier: Logic or middle tier, this is the heart of the application and handles processing, sometimes against other information in the data tier
Data Tier: Database, where information that is processed by the application is stored and managed
A customer complained about a 3-tier architecture application being slow to load web pages. How would you solve the problem?
** TRIAGE **
I would first try to see exactly what the client is describing as slow - I need to know what they are seeing and exactly what they are experiencing.
I'd want to know whether this occurs intermittently or can be recreated every time.
After seeing the type of slowness, the speed at which the web pages respond, and which web pages appear to respond slowly, I'd develop a strategy to narrow down the moving parts for that piece.
I'd take a broad view of the web/application/data tiers by checking CloudWatch monitoring to view utilization across the environment, looking for anything that stands out, as well as checking for alarms (a sketch of pulling a CloudWatch metric follows this answer).
Can the issue be recreated outside of the client's work environment? Let's try to remove as many factors as possible and see if anything changes. After eliminating network speed and other OS-based issues (the issue appears widespread for everyone):
I'd take a broad approach to testing each system to see how quickly each tier responds to page-load requests, and once I located the tier that was not responding at the expected speed, I would take a deeper dive with more specific tools to track down what was making that particular piece slow.
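For the "broad view" step, a hedged boto3 sketch of pulling one CloudWatch metric (the instance ID is a placeholder); the same pattern works for database and load balancer metrics.

    import boto3
    from datetime import datetime, timedelta

    cw = boto3.client("cloudwatch")

    # Average CPU over the last hour for one web-tier instance (placeholder ID).
    stats = cw.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        StartTime=datetime.utcnow() - timedelta(hours=1),
        EndTime=datetime.utcnow(),
        Period=300,
        Statistics=["Average"],
    )
    for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
        print(point["Timestamp"], point["Average"])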
Difference between SQL vs NoSQL database? Give a use case when you would choose them.
SQL databases are relational and NoSQL databases are nonrelational.
Relational databases are collections of data items with pre-defined relationships between them.
Non-relational databases don't use the relational model (there are no related tables within the database); instead they opt for a design that allows a flexible schema, which is good when data is unstructured or unpredictable.
A customer needs a database but doesn't have a schema planned out in advance; the data may not be well structured or may be unpredictable. (DynamoDB) - looser consistency/ACID requirements
A customer needs a database to store predictable, structured data with a finite number of users/applications accessing it. (Aurora) - usually stricter requirements such as ACID compliance
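A side-by-side sketch of the two access patterns (table names, keys, and data are made up): SQL joins related tables under a fixed schema, while DynamoDB fetches a flexible item by its key.

    import sqlite3
    import boto3

    # Relational: predefined schema, relationships expressed with joins.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
    conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
    conn.execute("INSERT INTO orders VALUES (10, 1, 42.50)")
    rows = conn.execute(
        "SELECT c.name, o.total FROM customers c JOIN orders o ON o.customer_id = c.id"
    ).fetchall()

    # Non-relational: items addressed by key, attributes can vary per item.
    dynamodb = boto3.resource("dynamodb")
    table = dynamodb.Table("Orders")                        # placeholder table
    item = table.get_item(Key={"OrderId": "10"}).get("Item")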
A customer is requesting ways to implement network monitoring. What tools would you use to implement it?
Use CloudWatch metrics - to monitor network utilization for your VPC resources as well as on-premises resources
VPC Flow Logs - to monitor traffic at the VPC, subnet, or network interface level - of your AWS resources
Use ELB access logs - to track the IP addresses that requests to your Elastic Load Balancers come from (and CloudTrail to audit API calls made against your load balancers)
Web Application Firewall logs - to monitor inbound Layer 7 requests
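A hedged boto3 sketch of turning on VPC Flow Logs delivered to CloudWatch Logs (the VPC ID, log group name, and IAM role ARN are placeholders):

    import boto3

    ec2 = boto3.client("ec2")

    # Capture accepted and rejected traffic for the whole VPC.
    ec2.create_flow_logs(
        ResourceType="VPC",
        ResourceIds=["vpc-0123456789abcdef0"],
        TrafficType="ALL",
        LogDestinationType="cloud-watch-logs",
        LogGroupName="vpc-flow-logs",
        DeliverLogsPermissionArn="arn:aws:iam::111122223333:role/flow-logs-role",
    )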
Can you scale a RDBMS? If yes how would you do it?
Scaling for reads - add a read replica
Scaling to handle more writes - you may need to scale vertically (by sizing up the instance type). Horizontal scaling is more difficult due to the nature of relational databases, but it can be done: a "master-slave" model in which the "slaves" are additional servers that handle parallel processing and replicated data, or data that is "sharded" (divided and distributed among multiple servers, or hosts) to ease the workload on the master server.
Sharding is also an option with an RDBMS, where each shard contains part of the database without knowing about the others; each shard is its own server. This is good for online transaction processing (OLTP) but not for online analytics processing (OLAP), because when you run analysis you want the entire dataset and not just a single shard. (A toy shard-routing sketch follows below.)
(Personally, I'd like to move the customer to an Aurora multi-master cluster.)
*less likely the question*
Scaling storage - some systems perform this automatically, like Aurora; with Amazon RDS, you can enable RDS storage autoscaling.
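A toy illustration of the shard-routing idea mentioned above (not how any particular database implements it): hash each key onto one of several independent servers, each holding only its own slice of the data.

    import hashlib

    SHARDS = ["db-shard-0.example.internal",
              "db-shard-1.example.internal",
              "db-shard-2.example.internal"]   # each shard is its own server

    def shard_for(key: str) -> str:
        # Map the key onto one shard; each shard only ever sees its own slice
        # of the data (fine for OLTP, awkward for OLAP across the full set).
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return SHARDS[digest % len(SHARDS)]

    print(shard_for("customer-1001"))
    print(shard_for("customer-1002"))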
Difference between application and network load balancer.
Application Load Balancers are used for Layer 7 (application layer) requests and handle HTTP/HTTPS/gRPC (web apps, microservices, and containers)
Network Load Balancers are used for Layer 4 (transport layer) requests and handle TCP/UDP/TLS (ultra-low latency and millions of requests per second)
A customer wants to move a monolithic application to a microservices architecture in AWS.
Excellent! Let's break it down! They likely have a tightly coupled architecture (a single application with multiple functions) that we want to start splitting out into a distributed architecture with loose coupling.
You would identify each individual service or function and split that code into its own standalone service.
Use of APIs - easier integrations
Independently deployable blocks of code - can be scaled and maintained independently
Business-oriented architecture - services are organized around business capabilities rather than technical layers
Flexible use of technology - each microservice can be written using different technologies
Speed and Agility - fast to deploy and update
What strategy would you use and whether you will suggest to use ECS or EKS?
ECS is simplicity and EKS is flexibility.
ECS allows users to choose very basic options and take advantage of default standards to launch containers
Company wants to migrate Docker containers to AWS (with no special requirements) - ECS.
EKS will allow more broad use of options while still maintaining security and scalability in AWS or across multiple clouds.
A customer wants to migrate Docker containers and either wants a standardized management layer across multiple clouds and on-premises environments, or they're migrating a system into AWS and they're already using an open-source container orchestration platform (Kubernetes). (EKS)
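A hedged sketch of the "simple" ECS path with boto3: register a task definition for an existing Docker image and run it on Fargate. The family name, image URI, role ARN, cluster, and subnet ID below are placeholders.

    import boto3

    ecs = boto3.client("ecs")

    # Describe the container once; ECS supplies sensible scheduling defaults.
    ecs.register_task_definition(
        family="web-app",
        requiresCompatibilities=["FARGATE"],
        networkMode="awsvpc",
        cpu="256",
        memory="512",
        executionRoleArn="arn:aws:iam::111122223333:role/ecsTaskExecutionRole",
        containerDefinitions=[{
            "name": "web",
            "image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/web-app:latest",
            "portMappings": [{"containerPort": 80}],
            "essential": True,
        }],
    )

    # Launch the container on Fargate, so there are no hosts to manage.
    ecs.run_task(
        cluster="default",
        taskDefinition="web-app",
        launchType="FARGATE",
        networkConfiguration={"awsvpcConfiguration": {
            "subnets": ["subnet-0123456789abcdef0"],
            "assignPublicIp": "ENABLED",
        }},
    )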