Design High-Performing Architectures Flashcards
You have just started work at a small startup in the Seattle area. Your first job is to help containerize your company’s microservices and move them to AWS. The team has selected ECS as their orchestration service of choice. You’ve discovered the code currently uses access keys and secret access keys in order to communicate with S3. How can you best handle this authentication for the newly containerized application?
Attach a role with the appropriate permissions to the task definition in ECS.
It’s always a good idea to use roles instead of hard-coded credentials. One of the best parts of using ECS is how easy it is to attach a role to a task definition. This allows each task to have its own role, even when it runs alongside other containers on the same EC2 instance.
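A minimal sketch of what that looks like with boto3 (the role ARN, image, and names are placeholder assumptions): the task role is set on the task definition, and the SDK inside the container then resolves those credentials automatically, so no access keys appear in the code.

```python
import boto3

ecs = boto3.client("ecs")

# Placeholder account ID, role, and image; the task role is what the
# container uses to call S3 instead of hard-coded access keys.
ecs.register_task_definition(
    family="orders-service",
    taskRoleArn="arn:aws:iam::123456789012:role/OrdersS3Access",
    containerDefinitions=[{
        "name": "orders",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/orders:latest",
        "memory": 512,
        "essential": True,
    }],
)

# Inside the container, the default credential chain picks up the task role:
# s3 = boto3.client("s3")  # no keys in code or environment variables
```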
A pharmaceutical company has begun to explore using AWS cloud services for the compute workloads that process incoming orders. Currently, they process orders on premises using self-managed virtual machines with batch software installed. The current infrastructure does not scale well and is cumbersome to update. In addition, each batch job takes roughly 30–45 minutes to complete. The processing times cannot be reduced due to the complexity of the application code, and they want the new solution to be as hands-off as possible, with automatic scaling based on the number of queued orders.
Which AWS service would you recommend they use for this application design that best meets their needs and is cost optimized?
AWS Batch
AWS Batch is a strong fit for long-running batch computation workloads (over 15 minutes, beyond Lambda's maximum duration) that should run on managed compute infrastructure. It automatically provisions compute resources and optimizes workload distribution based on the quantity and scale of your workloads.
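A minimal sketch of submitting a job with boto3 (the queue and job definition names are assumptions; the compute environment behind the queue is what scales with the number of queued jobs):

```python
import boto3

batch = boto3.client("batch")

# Placeholder queue and job definition; each incoming order batch becomes a job,
# and AWS Batch scales the underlying compute environment as the queue grows.
batch.submit_job(
    jobName="process-orders",
    jobQueue="orders-queue",
    jobDefinition="order-processor:3",
    containerOverrides={
        "environment": [{"name": "ORDER_BATCH_ID", "value": "batch-001"}],
    },
)
```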
You have just been hired by a large organization which uses many different AWS services in their environment. Some of the services that handle data include: RDS, Redshift, ElastiCache, DynamoDB, S3, and Glacier. You have been instructed to configure a web application using stateless web servers. Which services can you use to handle session state data?
ElastiCache and DynamoDB can both be used to store session state data.
Amazon RDS can store session state data. It is slower than Amazon DynamoDB, but may be fast enough for some situations.
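A minimal sketch of externalizing session state to DynamoDB with boto3 (the table name and schema are assumptions; TTL would be enabled on the expires_at attribute so stale sessions expire automatically):

```python
import time
import boto3

dynamodb = boto3.resource("dynamodb")
sessions = dynamodb.Table("web-sessions")  # assumed table, partition key: session_id

# Any stateless web server can write the session...
sessions.put_item(Item={
    "session_id": "sess-abc123",
    "user_id": "u-42",
    "cart_items": ["sku-1", "sku-2"],
    "expires_at": int(time.time()) + 3600,  # TTL attribute (must be enabled on the table)
})

# ...and any other server can read it on the next request.
item = sessions.get_item(Key={"session_id": "sess-abc123"}).get("Item")
```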
A fully managed service for loading streaming data into AWS. It’s designed to make it easy to capture, transform, and load streaming data into data lakes, data stores, and analytics tools.
Amazon Kinesis Data Firehose
It can capture, transform, and deliver streaming data to Amazon S3, Amazon Redshift, Amazon OpenSearch Service (formerly known as Amazon Elasticsearch Service), generic HTTP endpoints, and service providers like Datadog, New Relic, MongoDB, and Splunk.
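A minimal sketch of sending a record to an existing delivery stream with boto3 (the stream name is an assumption; the destination, such as S3 or Redshift, is configured on the stream itself, not in this call):

```python
import json
import boto3

firehose = boto3.client("firehose")

# Placeholder stream; Firehose buffers, optionally transforms, and delivers
# these records to the stream's configured destination.
firehose.put_record(
    DeliveryStreamName="clickstream-to-s3",
    Record={"Data": json.dumps({"user": "u-1", "action": "click"}).encode() + b"\n"},
)
```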
A time-series forecasting service that uses machine learning and provides business insights.
Amazon Forecast
A fully managed service that allows you to build and run applications that use Apache Kafka to process streaming data. Kafka is often used for real-time streaming of data pipelines and streaming analytics.
Amazon Managed Streaming for Apache Kafka (MSK)
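A minimal sketch of looking up an MSK cluster's broker addresses with boto3 (the cluster ARN is a placeholder); any standard Kafka client can then be pointed at the returned bootstrap string without code changes:

```python
import boto3

kafka = boto3.client("kafka")

# Placeholder ARN; MSK manages the brokers, you only need their addresses.
brokers = kafka.get_bootstrap_brokers(
    ClusterArn="arn:aws:kafka:us-east-1:123456789012:cluster/example/EXAMPLE-UUID"
)
bootstrap = brokers["BootstrapBrokerStringTls"]

# 'bootstrap' is what a Kafka producer or consumer would use as its
# bootstrap servers setting.
```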
A fully managed integration service that enables secure transfer of data between Software as a Service (SaaS) applications like Salesforce, Marketo, Slack, and AWS services like S3 and Redshift, in real time.
Amazon AppFlow
A managed message broker service for Apache ActiveMQ and RabbitMQ that makes it easy to set up and operate message brokers in the cloud.
Amazon MQ
Amazon MQ supports industry-standard APIs and protocols for messaging, including JMS, NMS, AMQP, STOMP, MQTT, and WebSocket. This means you can easily migrate your existing applications to the service without having to rewrite code.
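A minimal sketch of discovering a broker's protocol endpoints with boto3 (the broker ID is a placeholder); existing ActiveMQ or RabbitMQ clients connect to these endpoints with their usual protocol libraries, which is why no code rewrite is needed:

```python
import boto3

mq = boto3.client("mq")

# Placeholder broker ID; the response lists the wire-level endpoints
# (e.g. OpenWire, AMQP, STOMP, MQTT, WSS) that existing clients can use.
broker = mq.describe_broker(BrokerId="b-EXAMPLE-BROKER-ID")
for endpoint in broker["BrokerInstances"][0]["Endpoints"]:
    print(endpoint)
```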
A logical grouping of instances within a single Availability Zone.
A cluster placement group
A cluster placement group can span peered VPCs in the same Region. Instances in the same cluster placement group enjoy a higher per-flow throughput limit for TCP/IP traffic and are placed in the same high-bisection bandwidth segment of the network.
A group of instances that are each placed on distinct underlying hardware.
Spread Placement Groups
Spread placement groups are recommended for applications that have a small number of critical instances that should be kept separate from each other. They ensure that instances are placed on distinct racks, with each rack having its own network and power source.
A group of instances that are separated into logical segments, called partitions. Each partition has its own set of racks. Each rack has its own network and power source.
Partition Placement Groups
Partition placement groups are recommended for large distributed and replicated workloads, such as Hadoop, Cassandra, and Kafka.
____________ are a way to influence the distribution of instances, which can help in optimizing for latency, throughput, or resilience.
Placement Groups
You can’t merge placement groups or move an instance from one placement group to another after it’s been launched.
But you can create an AMI from your instance, launch a new instance from the AMI into a placement group, and then terminate the original instance.
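A minimal sketch of creating the three placement group strategies and launching an instance into one of them with boto3 (group names, AMI, and instance type are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")

# One group per strategy; names are placeholders.
ec2.create_placement_group(GroupName="hpc-cluster", Strategy="cluster")
ec2.create_placement_group(GroupName="critical-spread", Strategy="spread")
ec2.create_placement_group(GroupName="kafka-partitions", Strategy="partition", PartitionCount=3)

# Specify the placement group at launch time, since an instance cannot
# simply be moved into one after it has been launched.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # placeholder AMI
    InstanceType="c5n.18xlarge",
    MinCount=1,
    MaxCount=1,
    Placement={"GroupName": "hpc-cluster"},
)
```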
__________ allows businesses and developers to convert media files from their original source format into versions that are optimized for various devices, such as smartphones, tablets, and PCs.
Amazon Elastic Transcoder
__________ is a NoSQL database that supports key-value and document data models, and enables developers to build modern, serverless applications that can start small and scale globally to support petabytes of data and tens of millions of read and write requests per second.
Amazon DynamoDB
DynamoDB is designed to run high-performance, internet-scale applications that would overburden traditional relational databases.
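A minimal sketch of the "start small and scale" point with boto3 (the table name and key are assumptions): an on-demand table needs no capacity planning and grows with traffic.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Placeholder table; PAY_PER_REQUEST (on-demand) billing means no capacity
# to provision up front; the table scales with actual read/write traffic.
dynamodb.create_table(
    TableName="orders",
    AttributeDefinitions=[{"AttributeName": "order_id", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "order_id", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
)
```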
A data warehouse service designed for online analytical processing (OLAP) and business intelligence applications.
Amazon Redshift
It provides powerful query and data manipulation capabilities with high performance and scalability.
Near real-time complex querying on massive data sets.
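A minimal sketch of running an analytic query through the Redshift Data API with boto3 (cluster, database, user, and table names are assumptions); in practice you would poll describe_statement until the query finishes before fetching the result:

```python
import boto3

rsd = boto3.client("redshift-data")

# Placeholder cluster/database/user; the Data API avoids managing
# JDBC/ODBC connections from the application.
resp = rsd.execute_statement(
    ClusterIdentifier="analytics-cluster",
    Database="dev",
    DbUser="analyst",
    Sql="SELECT region, SUM(amount) AS revenue FROM sales GROUP BY region ORDER BY revenue DESC;",
)

# Once describe_statement reports the query as FINISHED:
result = rsd.get_statement_result(Id=resp["Id"])
```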