AWS Q&A Flashcards
How would you design a resilient system in AWS?
To create a design-for-failure system, you have to create a backup database. In AWS it's very easy to replicate a database and create a backup. In case of failure, we can immediately switch over to the backup database, which is always kept in sync with the master database.
So, for a “Design to fail” system there are certain characteristics that have to be in place.
- Follow a Pessimistic Approach
- Automated Recovery
- Handle Design, Execution and Deploy Failures
Pessimistic Approach: You have to follow a pessimistic approach when you are designing an architecture in the Cloud. You have to assume that things will fail.
Automatic Recovery: To handle such failures, the system must have an inbuilt mechanism for automatic recovery, so that a "design to fail" system can restore itself without manual intervention.
Handle Design, Execution and Deploy Failures: The system should also be designed to automatically recover from failures in the design, execution and deployment stages. When all three stages of failure are handled, the system can handle any failure.
What are the tools in AWS that can be used for creating a system based on a "design to fail" principle?
- Elastic IPs
- Availability Zones
- Amazon RDS – Multi-AZ deployment
- Machine Image
- Amazon CloudWatch
- AutoScaling
- Amazon EBS
- Automated Backups
• Elastic IPs
AWS provides many tools for building a robust system based on a "design to fail" principle. The first is Elastic IPs. We can fail over gracefully using EIPs in AWS. An Elastic IP is a static IP address that is dynamically remappable. We can quickly remap it and fail over to another set of servers, so that application traffic is routed to the new set of servers. It is also very useful when we want to upgrade to, or move to, a new version of software.
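As a hedged illustration of remapping, here is a minimal boto3 sketch that reassociates an Elastic IP with a standby instance; the allocation and instance IDs are hypothetical placeholders.

```python
import boto3

# A minimal sketch of Elastic IP failover, assuming a pre-allocated EIP
# and a healthy standby EC2 instance (IDs below are placeholders).
ec2 = boto3.client("ec2", region_name="us-east-1")

ALLOCATION_ID = "eipalloc-0123456789abcdef0"  # hypothetical EIP allocation
STANDBY_INSTANCE = "i-0123456789abcdef0"      # hypothetical standby server

# Remap the static IP: traffic now routes to the standby instance.
ec2.associate_address(
    AllocationId=ALLOCATION_ID,
    InstanceId=STANDBY_INSTANCE,
    AllowReassociation=True,  # detach from the failed instance if needed
)
```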
• Availability Zones
For a design-to-fail system, we can use multiple Availability Zones to introduce resiliency into an AWS system. An Availability Zone is like a logical datacenter. By deploying applications in multiple Availability Zones, we can ensure high availability, so even in the eventuality of a failure in one zone, our system remains available in the other zones.
• Amazon RDS – Multi-AZ deployment
Then we have the option of Amazon RDS. RDS provides Multi-AZ deployment functionality to automatically replicate database updates across multiple Availability Zones. With this, we always have a backup database ready.
• Machine Image
Then we have the Machine Image, called an Amazon Machine Image (AMI). We can maintain an AMI to restore and clone environments easily in a different Availability Zone. As soon as the system goes down in one environment, we can start it up in the next environment. We can also use multiple database slaves across Availability Zones.
• Amazon CloudWatch
Amazon CloudWatch is a real-time monitoring service that provides visibility into the AWS Cloud. With monitoring, you will know when the system is about to fail and you can take corrective action. By setting up alerts on CloudWatch, we can take the appropriate actions in case of hardware failure or performance degradation.
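For instance, a CloudWatch alarm on CPU utilization can drive such alerts. Here is a minimal boto3 sketch, with a placeholder instance ID and a hypothetical SNS topic for notifications:

```python
import boto3

# A minimal sketch of a CloudWatch alarm that fires when average CPU on an
# EC2 instance stays above 80% for two consecutive 5-minute periods.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="high-cpu-on-web-server",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Average",
    Period=300,                 # evaluate in 5-minute windows
    EvaluationPeriods=2,        # two consecutive breaches -> alarm
    Threshold=80.0,
    ComparisonOperator="GreaterThanThreshold",
    # A hypothetical SNS topic to notify; alarm actions are optional.
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:ops-alerts"],
)
```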
• AutoScaling
Then we have AutoScaling. We can create an AutoScaling group to maintain a fixed number of servers. In case of failure or performance degradation, unhealthy Amazon EC2 instances are replaced with new ones. We can also use AutoScaling whenever we need to scale the system up or down, as sketched below.
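A minimal boto3 sketch of such a group, assuming a pre-existing launch template named "web-server" (a hypothetical name):

```python
import boto3

# A minimal sketch of an Auto Scaling group that maintains a fixed fleet of
# two servers across two Availability Zones.
autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-fleet",
    LaunchTemplate={"LaunchTemplateName": "web-server"},
    MinSize=2,
    MaxSize=2,
    DesiredCapacity=2,  # unhealthy instances are replaced to hold this count
    AvailabilityZones=["us-east-1a", "us-east-1b"],
)
```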
• Amazon EBS
Amazon EBS. We can set up cron jobs to take incremental snapshots of the database volumes, and the snapshots are stored automatically in Amazon S3. In this way, data is persisted independently of the instances, so we can use EBS for that.
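The kind of call such a cron job would make, sketched with boto3 and a placeholder volume ID:

```python
import boto3

# A minimal sketch of an incremental EBS snapshot, the kind of call a
# nightly cron job could run. EBS snapshots are incremental and stored in
# Amazon S3 behind the scenes.
ec2 = boto3.client("ec2", region_name="us-east-1")

snapshot = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",  # hypothetical database volume
    Description="Nightly incremental backup of the database volume",
)
print("Started snapshot:", snapshot["SnapshotId"])
```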
• Automated Backups
We can also set up Automated Backups, where we set a retention period for how long a backup will be kept. Performing automated backups introduces resiliency into the system: even if there is a failure, a backup can bring the system back.
Basic Design Concepts Part 2 - Data Proximity Principle
Why is it recommended to keep dynamic data closer to the compute, and static data closer to the end user in Cloud computing?
Diagram: dynamic (compute) data sits near the server; static data sits near the end user.
I. Keep the right kind of data at the right place
II. Static images near end-user
III. Processing data near backend server
This is a basic question on system design in cloud computing, widely asked for AWS: why is dynamic data kept closer to the compute, where computation takes place, and why is static data kept closer to the end user? First, you need to understand the data proximity principle. We have a server and the user.
The user has to access the static data, and the server has to access the dynamic (compute) data. Under the proximity principle, keeping the right kind of data in the right place helps build an excellent enterprise software system. The purpose of keeping dynamic data closer to compute resources is that it reduces latency during processing. If the dynamic data is near the compute, you don't need to spend time moving it to the server, and there is no need for servers to fetch data from remote locations.
What is the difference between Region, Availability Zone and EndPoint in AWS?
- AWS Region
- Availability Zone
- EndPoint
I. A Region Can have multiple Availability Zones
II. Low-Latency in Availability Zones of a region
III. An EndPoint is an Entry point.
Region: In AWS, every Region is an independent environment. It’s like an isolated datacenter.
Availability Zones: Within a region we can have multiple Availability Zones. This is the difference between Regions and Availability Zones. Every Availability Zone is an isolated area, but there are Low-Latency Links that connect one Availability Zone to another within a region. So with that, the data transfer between two Availability Zones of the same region is very fast.
EndPoint: An EndPoint is just an entry point for a web service, written in URL form. For example, https://dynamodb.us-east-1.amazonaws.com is an endpoint for the Amazon DynamoDB service. Most AWS services offer an option to select a Region that serves as the entry point for incoming requests, so for those services you can use a regional endpoint. Some services like IAM do not support Regions, so their EndPoints do not contain a Region. That is how Region, Availability Zone and EndPoint differ.
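A small boto3 sketch of how the regional endpoint is picked up when you choose a Region for a client:

```python
import boto3

# A minimal sketch showing regional endpoint selection: creating a DynamoDB
# client for a specific Region makes boto3 send requests to that Region's
# endpoint.
dynamodb = boto3.client("dynamodb", region_name="us-east-1")
print(dynamodb.meta.endpoint_url)  # https://dynamodb.us-east-1.amazonaws.com
```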
What are the important features of Amazon S3?
- Unlimited Storage
- Object Based Storage
- 99.999999999% durability
- Buckets
- Unique Bucket Names
- File Size
Amazon S3 has many features. S3 is mainly a storage service; it's called the Simple Storage Service, hence S3. Some of its features: it provides unlimited storage for files, and as of now you can store unlimited data in S3. S3 is object-based storage, so you store things as objects in S3. S3 claims 99.999999999% (eleven nines) durability, so the chances of losing data are very low.
In S3 you have to store data in buckets, and bucket names in S3 have to be globally unique. You cannot reuse common bucket names across Regions; you have to pick unique names. The file size in S3 can vary from zero bytes to five terabytes, so although you can store an unlimited number of objects, a single object can be at most five terabytes.
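A minimal boto3 sketch of the bucket-and-object model; the bucket name is a hypothetical placeholder and must be globally unique:

```python
import boto3

# A minimal sketch: create a globally unique bucket and store a file in it
# as an object, addressed by its key.
s3 = boto3.client("s3", region_name="us-east-1")

BUCKET = "example-unique-bucket-name-2024"  # must be unique across all of S3
s3.create_bucket(Bucket=BUCKET)
s3.put_object(
    Bucket=BUCKET,
    Key="docs/report.pdf",     # objects are addressed by key
    Body=b"hello from S3",     # a single object can be up to 5 TB
)                              # (very large files use multipart upload)
```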
What are the main limitations on buckets created in AWS S3?
- 100 buckets per account
- Transfer Ownership
- Number of objects
- Bucket inside another bucket
- DNS Compliant names
The limitations in Amazon S3 are as follows. S3 supports up to 100 buckets per account by default, so a user can, by default, create a maximum of one hundred buckets. If you want to go beyond this, you can submit a request to AWS to increase the limit.
Another thing is that we cannot transfer ownership of a bucket to another account. The ownership remains with the person who created it. You can give access to others, but you cannot transfer the ownership.
There is no limit to the number of objects that can be stored in a bucket. The objects are like files, and the number of files that can be stored in a bucket is limitless.
Also, we cannot create a bucket inside another bucket. You cannot have a nested relationship here, where one bucket contains another bucket; it is not allowed in S3.
Another restriction in Amazon S3 is that all bucket names have to be DNS-compliant in all Regions. You cannot have bucket names with spaces or other non-DNS-compliant characters. You have to work within all these limitations; if you follow these rules, you can get the maximum benefit out of Amazon S3.
What is the difference between Amazon S3 and Amazon EC2?
These are 2 popular products from AWS.
• S3 is a Storage Service
• EC2 is a computing environment
Amazon S3: S3 is a storage service in the Cloud. It is used to store large amounts of data as files, and these files can be image files, PDFs, etc. They can be static data, or dynamic data created at runtime that we simply store and access.
Amazon EC2: EC2 is a remote computing environment that runs in the Cloud. This is an environment where our servers run; we can install our own software and operating system on an EC2 instance, and we can use it to run our web servers, application servers and database servers.
S3, on one hand, is like a hard disk in the cloud, while EC2 is like a processor in the cloud. We use both of them in combination.
What are the different Tiers in Amazon S3 storage?
There are 3 different Tiers in Amazon S3:
• Standard Tier
• Standard-Infrequent Access
• Reduced Redundancy Storage
Standard Tier: In this Tier, S3 supports durable storage of files that become immediately available. As soon as you write, you are able to read them. This is generally used for frequently used files.
Standard-Infrequent Access: In this tier, S3 provides durable storage that is immediately available, but files in this tier are infrequently accessed. It is a cheaper, more cost-effective option.
Reduced Redundancy Storage: In this tier, S3 gives customers the option to store data at lower levels of redundancy. Data is still copied to multiple locations, but fewer than in the Standard Tier, and this is the cheapest option.
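A minimal boto3 sketch of selecting a tier per object at upload time, reusing the hypothetical bucket name from the earlier sketch:

```python
import boto3

# A minimal sketch of choosing a tier per object: the StorageClass
# parameter on upload selects Standard (the default) or
# Standard-Infrequent Access.
s3 = boto3.client("s3", region_name="us-east-1")

s3.put_object(
    Bucket="example-unique-bucket-name-2024",  # hypothetical bucket
    Key="archive/old-report.pdf",
    Body=b"...",                   # object contents
    StorageClass="STANDARD_IA",    # infrequently accessed, cheaper tier
)
```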
What is the difference between Volume and a Snapshot in AWS?
Volume: In AWS, a "Volume" is a durable, block-level storage device that can be attached to a single EC2 instance. Simply put, it is like a hard disk that we can write to and read from, and a computing resource can be attached to it.
Snapshot: A “Snapshot” is created by copying the data of a volume to another location at a specific time. We can even replicate the same snapshot to multiple Availability Zones.
A snapshot is a single-point-in-time view of a volume. What is stored in the volume at that point in time can change later on, but a snapshot doesn't change. We can create a snapshot only when we have a volume, and from a snapshot we can create a new volume.
In AWS, we have to pay for the storage used by a volume, as well as the storage used by snapshots. So if you create a volume, you pay for the volume's storage, and if you take a snapshot of that volume, you pay for the snapshot's storage as well. These are the main differences between a volume and a snapshot.
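A minimal boto3 sketch of the relationship, with placeholder IDs: snapshot a volume, then create a new volume from that snapshot in another Availability Zone.

```python
import boto3

# A minimal sketch: take a point-in-time snapshot of an existing volume,
# then build a brand-new volume from it in a different AZ.
ec2 = boto3.client("ec2", region_name="us-east-1")

snap = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",  # hypothetical source volume
    Description="Point-in-time view of the data volume",
)

# Wait until the snapshot completes before building a volume from it.
ec2.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

new_volume = ec2.create_volume(
    SnapshotId=snap["SnapshotId"],
    AvailabilityZone="us-east-1b",  # restore into another AZ
)
print("New volume:", new_volume["VolumeId"])
```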
What is the difference between Instance Store and EBS?
• Persistence
• Encryption
Instance Store: The main difference is that Instance Store data is not persisted for long-term use. If the instance terminates or fails, we lose the Instance Store data.
EBS: Any data stored in EBS is persisted for a longer duration. Even if an instance fails, we can take the data stored in EBS and attach it to another EC2 instance. As for encryption, EBS provides full-volume encryption of the data stored on it.
Instance Store is not considered good for encrypted data. So, if you want fast storage and the data doesn't matter much to you, you can go for Instance Store; whereas if you really want encryption, or you want to persist the data for a longer duration, you have to go for EBS (Elastic Block Store).
How does Amazon EC2 work?
EC2 (also known as Elastic Compute Cloud) is a computing environment that is provided by AWS. It supports a highly scalable computing capacity in AWS.
Behind the scenes, in a way that is transparent to us, it provides very high computing capacity along with an OS. Instead of buying hardware for servers, we can use Amazon EC2 to deploy our application. We don't need to buy hardware and install an OS before using it, because EC2 is a complete computing suite. There is no need to buy and maintain hardware in our own datacenter; we can just rent Amazon EC2 servers. Based on our varying needs, we can use as few or as many Amazon EC2 instances as we like.
If you want to scale up or scale down, you can do that. It even provides AutoScaling options, in which the instances scale up or down based on the load and traffic spikes. So that is another good option, and it is easy to deploy applications on EC2. It provides automated deployment.
Also, we can configure security and networking in Amazon EC2 much more easily than in our own custom datacenter. So there are many benefits to EC2.
What is the difference between Stop and Terminate in an Amazon EC2 instance?
- Stop an Instance
- Terminate an Instance
Stop and Terminate are two different things in EC2. So, what is the difference between them? First, we will address stopping an instance.
Stop an Instance: When we stop an instance, it performs a normal shutdown and moves into a stopped state. The instance can be restarted again at a later point in time, and we are not charged for additional instance hours while it remains stopped.
Terminate an Instance: When we terminate an instance, a normal shutdown takes place, and all the attached Amazon EBS volumes are deleted; in other words, all the connected data is deleted. The exception is if we set the DeleteOnTermination attribute to false, in which case the volume is not deleted; otherwise, all volumes are deleted. Once terminated, the instance can never be started again. So, when we stop an instance we can start it again, but after terminating it is not possible to start it again.
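A minimal boto3 sketch of both operations, including flipping the DeleteOnTermination attribute so a volume survives termination (the instance ID is a placeholder):

```python
import boto3

# A minimal sketch contrasting stop and terminate in EC2.
ec2 = boto3.client("ec2", region_name="us-east-1")
INSTANCE = "i-0123456789abcdef0"  # hypothetical instance

# Stop: normal shutdown; the instance can be started again later.
ec2.stop_instances(InstanceIds=[INSTANCE])

# Keep the root volume around after termination by flipping the
# DeleteOnTermination attribute to False first.
ec2.modify_instance_attribute(
    InstanceId=INSTANCE,
    BlockDeviceMappings=[
        {"DeviceName": "/dev/xvda", "Ebs": {"DeleteOnTermination": False}}
    ],
)

# Terminate: shutdown plus deletion of attached volumes (except the one
# we just protected); a terminated instance can never be started again.
ec2.terminate_instances(InstanceIds=[INSTANCE])
```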
What are the main uses of Amazon Elastic Compute Cloud (EC2)?
- Easy Configuration
- Control
- Fast Reboot
- Scalability
- Resilient
The main use of EC2 is that it provides scalable computing resources for creating a software infrastructure, and it is very easy to deploy an application in EC2.
So the main uses are:
Easy configuration: We can easily configure our servers in EC2 and manage capacity. So, if you want easy configuration, you can just go to EC2.
Control: EC2 also provides complete control of computing resources, even to developers. We don't need special skills or dedicated ops people to get access to EC2; even developers can access it. Users can run the EC2 environment according to their system needs. Whatever needs you have, you can create an environment to accommodate them.
Fast reboot: It is very fast to reboot an instance in EC2. Because of the fast reboot, the overall deployment and development time is reduced in EC2.
Scalability: In EC2 we can create a highly scalable environment, based on the load expected for our application. Scalability is a very good option in EC2, and we don't need to worry about it.
Resilient: It is very easy to create and terminate servers in EC2. Due to this, we can develop resilient applications in EC2. Basically, if we can create and terminate servers whenever we want, that ultimately provides resiliency to the application.
So, just to recap, the main uses of EC2 are: Easy Configuration, Control, Fast Reboot, Scalability and Resiliency of the whole system.
What are Spot Instances in Amazon EC2?
What is the difference between a Spot Instance and an On-Demand Instance in Amazon EC2? This is a follow-up question, just to test your knowledge.
Spot Instance: Spot Instances and On-Demand Instances are very similar in nature, and they can be used for a similar purpose. The main difference is in the commitment. With a Spot Instance, there is no commitment from either side, neither ours nor AWS's. As soon as the spot price rises above our bid price, the instance is taken away and given to the new highest bidder.
On-Demand Instance: With an On-Demand Instance, a user pays the on-demand rate specified by Amazon and keeps paying that rate for as long as they use the instance. With a Spot Instance, once the spot price exceeds the bid price, Amazon shuts it down and gives it to the new highest bidder. The benefit to the user is that they are not charged for the partial hour in which the instance was taken back from them.
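As a rough illustration, here is a minimal boto3 sketch of requesting Spot-priced capacity with a maximum price we are willing to pay; the AMI ID and price are hypothetical placeholders, and AWS can reclaim the instance if the spot price rises above our maximum.

```python
import boto3

# A minimal sketch of launching a Spot-priced EC2 instance with a bid cap.
ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical AMI
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {"MaxPrice": "0.01"},  # most we agree to pay per hour
    },
)
```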
What is Auto-Scaling and how does it work?
This is not only an operations question; nowadays it is also asked of developers.
Auto Scaling: Auto scaling is the ability of a system to scale itself automatically.
Based on triggers like the crashing of a server or low performance and high traffic, a system can automatically scale up or scale down.
AWS extensively supports Auto Scaling. It provides tools to create, configure and automatically start new instances without any manual intervention. As the name suggests, auto scaling takes place without any manual intervention.
Also, we can set the thresholds at which new instances will fire up, or we can monitor metrics like API response time and the number of requests per second, and based on these metrics let AWS provision and start new servers. If they ask you for a definition, you can just say: "Auto Scaling is the ability of a system to scale itself automatically based on triggers like the crashing of a server, low performance, or high traffic."
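To make the threshold idea concrete, here is a minimal boto3 sketch of a target-tracking policy that lets AWS add or remove servers to hold average CPU near a target; the group name ("web-fleet") is the hypothetical Auto Scaling group from the earlier sketch.

```python
import boto3

# A minimal sketch of a target-tracking scaling policy: AWS adds or removes
# instances in the group to keep average CPU near 50%.
autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-fleet",     # hypothetical group
    PolicyName="hold-cpu-at-50",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```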
How are an Amazon Machine Image (AMI) and an Amazon Instance related?
AMI vs. Instance
AMI: An AMI is the template of a machine that is used to create an Instance.
Instance: An Instance is the actual running server that is created from an AMI.
AMI: Amazon Machine Image (AMI) is a template in which we can store the configuration of a server. It can be an operating system (OS), application server, web server, etc.
Think of it like a template. For example, from a Word template you can generate multiple letters and documents. Likewise, with an AMI you can use the same template to create different kinds of servers.
However, an AMI in itself is not a server. An Instance is the actual server that is built by using an AMI. So Amazon Instances are like servers, and they typically run in the Amazon AWS cloud.
We can launch multiple types of instances from the same AMI. This is actually a good distinction: you can have only one AMI, but from that single AMI you can create different kinds of instances, each with different computing and memory configurations. For example, you can have an instance for Dev, another for QA, and another for Prod. They may have different configurations but come from the same template. The same AMI can launch multiple instances.
Instance: So, now let's look at an Instance. What is an Instance? An Instance represents the server hardware on which our workload actually runs. Each instance can have different capabilities, and we can work on an Instance the way we work on any other server. However, you cannot work on an AMI itself. That is the main difference.
An AMI can be used to create an Instance on which we can work. An AMI is just a template of the machine - this is what you have to say in the interview.
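As a small illustration of one template producing differently configured servers, here is a hedged boto3 sketch that launches a Dev-sized and a Prod-sized instance from the same (placeholder) AMI.

```python
import boto3

# A minimal sketch: two differently sized instances from one AMI, tagged
# for their environments.
ec2 = boto3.client("ec2", region_name="us-east-1")
AMI = "ami-0123456789abcdef0"  # hypothetical AMI

for env, instance_type in [("dev", "t3.micro"), ("prod", "m5.large")]:
    ec2.run_instances(
        ImageId=AMI,
        InstanceType=instance_type,  # different compute/memory per env
        MinCount=1,
        MaxCount=1,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Environment", "Value": env}],
        }],
    )
```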
What is a VPC and what are the benefits of using a Virtual Private Cloud in AWS?
A VPC - Virtual Private Cloud - is a network that is logically isolated from other networks in the cloud.
It allows you to easily customize your networking configuration.
It allows you to have your own IP address range, internet gateways and security groups.
The benefit of a VPC is that it helps with aspects of cloud computing like privacy, security and preventing the loss of proprietary data.
Subnets: A subnet can be thought of as dividing a large network into smaller networks, as in the sketch below.
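A minimal boto3 sketch of these pieces, assuming example CIDR ranges: a VPC with its own IP address range, a subnet carved out of it, and an internet gateway attached.

```python
import boto3

# A minimal sketch of a VPC with its own IP range, carved into a smaller
# subnet, plus an internet gateway for a route to the public internet.
ec2 = boto3.client("ec2", region_name="us-east-1")

vpc = ec2.create_vpc(CidrBlock="10.0.0.0/16")  # the whole private network
vpc_id = vpc["Vpc"]["VpcId"]

subnet = ec2.create_subnet(                    # one smaller slice of it
    VpcId=vpc_id,
    CidrBlock="10.0.1.0/24",
    AvailabilityZone="us-east-1a",
)

igw = ec2.create_internet_gateway()
ec2.attach_internet_gateway(
    InternetGatewayId=igw["InternetGateway"]["InternetGatewayId"],
    VpcId=vpc_id,
)
```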
What’s the difference between a VPN, VPS, and VPC?
A VPN makes the private network (such as a company network) of an entity accessible through public infrastructure, primarily the internet. A VPN can allow users to exchange data efficiently across shared or public networks, as though they are directly linked to the private network.
A VPN connects privately to a network to prevent unauthorized traffic interception and allow an efficient flow of data, without incurring the heavy costs of constructing a physical private network or corporate intranet infrastructure.
A VPS refers to the sharing of computing resources of a main host in a data center. Since a single host is partitioned into several virtual compartments where each unit is capable of functioning independently, each ‘instance’ is what is called a virtual private server.
A VPC, or Virtual Private Cloud, is similar to a VPS. But where a VPS uses a fixed portion of a server with fixed resources, a VPC can manage large numbers of virtual machines and is not limited to a single, fixed-resource server. Users are not bound by the limitations of the underlying hardware.
Furthermore, VPCs allow their users to manage their own service. They can turn servers on and off at their leisure. This allows an hourly pricing model instead of a monthly one.
What is AWS Lambda?
- Run Code
- No Server Provision
- AutoScale
- High Traffic
Run Code: AWS Lambda is a service from Amazon to run a specific piece of code in the Amazon Cloud without provisioning any server, so there is no effort involved in the administration of servers and we don't have to buy a server for this. We just run the code in AWS Lambda, and we are not charged until our code starts running. Therefore it is a very cost-effective way to run code.
AutoScale: Also, AWS Lambda can automatically scale our application when the number of requests to run the code increases. So we do not have to worry about the scalability of the application when using AWS Lambda.
High Traffic: Lambda can handle very high traffic by AutoScaling.
No Server Provision: There is no provisioning of servers. All these things make AWS Lambda a highly desired service from Amazon.
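To show how little is involved when there are no servers to provision, here is a minimal sketch of a Python Lambda handler; the event fields are hypothetical.

```python
# A minimal sketch of an AWS Lambda handler in Python: Lambda invokes this
# function per request, with no server for us to provision or manage.
import json

def lambda_handler(event, context):
    # "event" carries the request payload; "context" carries runtime info.
    name = event.get("name", "world")  # hypothetical event field
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```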