AWS Tech Flashcards

Question 1

Q

What is AWS?

Answer

A

AWS (Amazon Web Services) is a leading cloud computing platform that offers businesses and individuals a broad range of services, including computing power, storage, and databases.

Elasticity and Scalability: AWS offers elasticity, allowing resources to scale up or down based on demand, and scalability, allowing large amounts of computing resources to be added as workloads grow, billed on a per-use basis.

Cost-Effective Solution: Using AWS is often cheaper than maintaining your own server farm, because you pay for computing resources on a per-use basis instead of buying and running hardware up front.

Key points differentiating elasticity and scalability:

Elasticity:
Refers to the ability to dynamically adjust resources based on demand, scaling up or down automatically.

Scalability:
Involves the capability to handle increased workload by adding resources without impacting the system’s performance.
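
To make the distinction concrete, here is a minimal sketch of an elastic scaling rule. The function name `desired_instances`, the capacity of 100 requests per second per instance, and the minimum and maximum bounds are all illustrative assumptions; this is not an AWS API.

```python
import math

def desired_instances(requests_per_sec: float,
                      capacity_per_instance: float = 100.0,
                      min_instances: int = 1,
                      max_instances: int = 10) -> int:
    """Return how many instances the current load calls for.

    All numbers are hypothetical; a real autoscaler would rely on
    measured metrics and cooldown periods.
    """
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    # Clamp to the allowed range; max_instances is how far the system can scale.
    return max(min_instances, min(max_instances, needed))

# Elasticity: the fleet grows and shrinks as demand changes over time.
for load in (50, 450, 1200, 80):
    print(load, "req/s ->", desired_instances(load), "instance(s)")
```

Scalability is reflected in the ceiling (`max_instances`): how much load the system can absorb by adding resources. Elasticity is the automatic movement between the floor and that ceiling as load rises and falls.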

Question 2

Q

What is cloud computing?

Answer

A

Cloud computing is the on-demand delivery of IT resources over the Internet with pay-as-you-go pricing.

Instead of buying, owning, and maintaining physical data centers and servers, you can access technology services, such as computing power, storage, and databases, on an as-needed basis from a cloud provider like Amazon Web Services (AWS).


Question 3

Q

Give examples of the types of cloud computing.

Answer

A

Infrastructure as a Service (IaaS)

IaaS provides the foundational infrastructure components for users to build, deploy, and manage their applications and services in the cloud. Examples on AWS: EC2, S3, VPC.

Platform as a Service (PaaS)

PaaS removes the need to manage the underlying infrastructure (usually hardware and operating systems) and lets you focus on deploying and managing your applications. Examples on AWS: AWS Lambda, Amazon RDS, Amazon DocumentDB.

Software as a Service (SaaS)

SaaS provides you with a complete product that is run and managed by the service provider.

Infrastructure as a Service (IaaS) offerings on AWS:

Amazon EC2 (Elastic Compute Cloud):

Amazon EC2 provides resizable compute capacity in the cloud, allowing users to launch virtual servers, known as instances, to host their applications. EC2 offers various instance types optimized for different workloads, providing flexibility and scalability for computing resources.

Amazon S3 (Simple Storage Service):

Amazon S3 is a scalable object storage service that offers reliable and secure storage for a wide range of use cases. It provides the ability to store and retrieve any amount of data at any time, making it suitable for various storage needs, including data backup, content storage, and application hosting.

Amazon VPC (Virtual Private Cloud):

Amazon VPC allows users to create a logically isolated section of the AWS Cloud, where they can launch AWS resources in a virtual network. VPC provides control over the virtual networking environment, including IP address ranges, subnets, and routing tables, enabling users to customize network configurations for their applications.

These IaaS offerings on AWS provide the foundational infrastructure components for users to build, deploy, and manage their applications and services in the cloud.

AWS Lambda: AWS Lambda is a serverless compute service, often grouped with PaaS offerings (and more precisely described as Function as a Service). It lets developers run code without provisioning or managing servers.

Question 4

Q

What is Amazon Simple Storage Service?

What is Object storage?

Answer

A

Amazon S3 (Amazon Simple Storage Service) is a scalable, secure, and highly durable object storage service provided by Amazon Web Services.

Object storage is a type of storage architecture that manages data as objects, unlike traditional file systems that store data as files in a folder hierarchy. Each object typically includes the data itself, metadata, and a unique identifier. Object storage systems are highly scalable and can manage large amounts of unstructured data efficiently.
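
As a rough illustration of the object model described above, here is a toy in-memory store. The class names `StoredObject` and `ToyObjectStore` and the sample keys are hypothetical, and this is not the Amazon S3 API; it only shows that each object is data plus metadata under a unique key in a flat namespace.

```python
from dataclasses import dataclass, field

@dataclass
class StoredObject:
    data: bytes                                    # the object's content
    metadata: dict = field(default_factory=dict)   # descriptive attributes

class ToyObjectStore:
    """In-memory stand-in for a bucket: a flat map from key to object."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        # The key is the object's unique identifier within the store.
        self._objects[key] = StoredObject(data, dict(metadata))

    def get(self, key: str) -> StoredObject:
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> list:
        # "Folders" are only a naming convention: keys that share a prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))

store = ToyObjectStore()
store.put("reports/2024/q1.csv", b"region,sales\n", content_type="text/csv")
store.put("reports/2024/q2.csv", b"region,sales\n", content_type="text/csv")
store.put("images/logo.png", b"\x89PNG", content_type="image/png")

print(store.list_keys("reports/"))
print(store.get("images/logo.png").metadata)
```

Note that there is no real directory tree: listing by prefix merely filters keys, which is one reason object stores scale well for large volumes of unstructured data.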

Question 5

Q

What is a document database, and how does it differ from a relational database?

Answer

A

A document database stores semi-structured data as documents, using nested key-value pairs to provide the document’s structure or schema.
Unlike a relational database, it does not normalize data across multiple tables with a unique and fixed structure, allowing for flexibility in storing different types of documents in the same database.

Document databases are beneficial for use cases such as managing user profiles, real-time big data analysis, and content management. They offer a flexible schema for storing diverse attributes and data values, making them practical for online profiles, real-time data extraction, and content aggregation from various sources.
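
The flexible schema can be sketched with plain Python dictionaries. The `profiles` collection, its field names, and the `find` helper are made up for illustration; a real document database would provide its own query language and indexes.

```python
# Two documents in the same hypothetical "profiles" collection.
# They share some fields but are not forced into one fixed schema.
profiles = [
    {"_id": 1, "name": "Ana",
     "address": {"city": "Lisbon", "country": "PT"},
     "interests": ["cycling", "jazz"]},
    {"_id": 2, "name": "Ben",
     "social": {"github": "ben-example"}},
]

_MISSING = object()  # sentinel for "this document has no such field"

def get_path(doc, path):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return _MISSING
        current = current[part]
    return current

def find(collection, path, value):
    """Return the documents whose nested field at `path` equals `value`."""
    return [doc for doc in collection if get_path(doc, path) == value]

print([d["name"] for d in find(profiles, "address.city", "Lisbon")])
```

In a relational design, the address and interests would typically live in separate tables joined by keys; here each profile carries its own nested structure, and documents lacking a field are simply skipped by the query.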

Question 6

Q

How does Amazon DocumentDB enable machine learning and generative artificial intelligence models to work with data in real time?

Answer

A

Because vector search is built into Amazon DocumentDB, there is no need to manage separate infrastructure, write code to connect with another service, or duplicate data from the primary database. This lets machine learning and generative AI models work with data in real time.
With vector search for Amazon DocumentDB, you can store, index, and search millions of vectors with millisecond response times. A vector is a numerical representation that captures the semantic meaning of unstructured data such as text, images, and video. You can store vectors from Amazon Bedrock, Amazon SageMaker, and other third-party or proprietary models.

Question 7

Q

What is vector search for Amazon DocumentDB?

Answer

A

Vector search for Amazon DocumentDB is a feature that allows you to perform similarity search on the data stored in your Amazon DocumentDB database using vector representations of the documents.
This feature leverages machine learning and similarity algorithms to retrieve similar documents based on their content.
This approach enables efficient and accurate retrieval of similar documents, making it valuable for applications such as recommendation systems, content discovery, and information retrieval.

Suppose you have a collection of customer reviews stored in Amazon DocumentDB. With vector search, you can input a specific customer review and find other reviews that share similar sentiments, topics, or opinions. For instance, if you input a review expressing satisfaction with a particular product, the vector search feature can retrieve other reviews that convey similar levels of satisfaction or discuss similar product attributes, helping to identify patterns and trends in customer feedback.
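
The review example can be sketched in a few lines of Python. The three-dimensional vectors below are invented for illustration (real embeddings come from an embedding model and have far more dimensions), and the function names are hypothetical. This sketch compares the query against every stored vector, whereas a database would normally use a vector index to avoid scanning everything.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: values near 1.0 mean very similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings for three customer reviews.
reviews = {
    "Love it, works perfectly": [0.9, 0.1, 0.0],
    "Great product, very happy": [0.8, 0.2, 0.1],
    "Broke after two days": [0.1, 0.9, 0.2],
}

def most_similar(query_vector, k=2):
    """Return the k review texts whose vectors are closest to the query."""
    scored = [(cosine_similarity(query_vector, vec), text)
              for text, vec in reviews.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:k]]

# Query with the (made-up) embedding of another satisfied review.
print(most_similar([0.85, 0.15, 0.05]))
```

The two positive reviews rank above the complaint because their vectors point in nearly the same direction as the query, which is the sense in which vector search finds "similar" content.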

Question 8

Q

What is a data lake?

Answer

A

A data lake is a centralized repository that enables storage of structured and unstructured data at any scale.

It allows storing data without the need to structure it beforehand and supports various types of analytics, including dashboards, visualizations, big data processing, real-time analytics, and machine learning for informed decision-making.

AWS data lakes are collections of data stored in Amazon S3 that can be analyzed using services like Athena, Glue, and Lake Formation. This allows for centralized data storage and analysis across an organization.

Question 9

Q

What is a data warehouse?

Answer

A

A data warehouse is a database optimized to analyze relational data coming from transactional systems and line-of-business applications. The data structure and schema are defined in advance to optimize for fast SQL queries, and the results are typically used for operational reporting and analysis. Data is cleaned, enriched, and transformed so it can act as the “single source of truth” that users can trust.

Question 10

Q

How does a data warehouse compare to a data lake?

Answer

A

Depending on its requirements, a typical organization will need both a data warehouse and a data lake, as they serve different needs and use cases.
A data warehouse is designed to analyze relational data with a predefined structure for fast SQL queries, mainly for operational reporting and analysis. Conversely, a data lake can store structured and unstructured data without prior structuring, supporting various analytics like dashboards, visualizations, big data processing, real-time analytics, and machine learning for informed decision-making.
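
One way to picture the difference is that a warehouse enforces structure when data is written, while a lake applies structure when data is read. The sketch below uses plain Python lists as stand-ins for both systems; the schema, function names, and sample records are all hypothetical.

```python
import json

# Warehouse-style: the schema is fixed in advance and checked on write.
WAREHOUSE_SCHEMA = {"order_id": int, "amount": float}
warehouse_rows = []

def warehouse_insert(row: dict) -> None:
    if set(row) != set(WAREHOUSE_SCHEMA):
        raise ValueError(f"unexpected columns: {sorted(row)}")
    for column, expected_type in WAREHOUSE_SCHEMA.items():
        if not isinstance(row[column], expected_type):
            raise TypeError(f"{column} must be {expected_type.__name__}")
    warehouse_rows.append(row)

# Lake-style: raw records are stored as-is; structure is applied on read.
lake_records = []

def lake_store(raw: str) -> None:
    lake_records.append(raw)

def lake_read_amounts() -> list:
    """Interpret whatever records happen to be JSON objects with an amount."""
    amounts = []
    for raw in lake_records:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not JSON (for example, a log line); skip for this query
        if isinstance(record, dict) and "amount" in record:
            amounts.append(float(record["amount"]))
    return amounts

warehouse_insert({"order_id": 1, "amount": 19.99})
lake_store('{"order_id": 2, "amount": "5.50", "coupon": "SPRING"}')
lake_store("2024-05-01 12:00:03 user=42 action=login")

print(warehouse_rows)
print(lake_read_amounts())
```

The warehouse rejects anything that does not match its schema, which keeps queries fast and predictable. The lake accepts both the JSON order and the free-form log line, leaving each analysis to decide how to interpret them.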

Question 11

Q

What are microservices?

Answer

A

Microservices are an architectural paradigm that breaks an application into small autonomous services, each of which is independently deployable and loosely coupled to other microservices in the application.

Question 12

Q

What are containers?

Answer

A

Containers are semi-isolated execution environments that bundle applications with their dependencies to allow easy and consistent deployment regardless of the environment.

Containers are a common foundation for deploying, managing, and scaling microservices: because each service ships with its own dependencies, services can be scaled independently and packed efficiently onto shared hosts, which improves resource utilization.

Question 13

Q

What is Kubernetes?

Answer

A

Kubernetes is an open-source container orchestration system that automates the deployment, scaling, and management of containerized applications. It arranges containers into logical groupings for management and discoverability, then schedules them onto a cluster of machines; on AWS, these are typically Amazon Elastic Compute Cloud (Amazon EC2) instances.
Kubernetes allows you to run containerized applications, including microservices, batch processing workers, and platforms as a service (PaaS) using the same toolset on premises and in the cloud.

Question 14

Q

What is Continuous Integration (CI)?

Answer

A

Continuous Integration (CI) is a software development practice where developers regularly merge their code changes into a central repository. Each merge triggers an automated build and testing process to quickly identify and resolve integration errors.

Question 15

Q

What is Continuous Delivery (CD)?

Answer

A

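Continuous Delivery (CD) is an extension of CI that ensures software is always in a deployable state. Code changes that pass the automated build and test stages are automatically prepared for release and deployed to a staging environment; the final push to production is typically triggered by a manual approval. (When that last step is also automated, the practice is usually called continuous deployment.)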

Question 16

Q

What are Pipelines in the context of CI/CD?

Answer

A

Pipelines refer to a series of steps that automate the software delivery process, from code changes to deployment. These pipelines consist of various stages like build, test, and deploy, allowing for efficient and consistent software delivery.
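
The stage-by-stage idea can be sketched as a small runner that executes stages in order and stops at the first failure. The stage functions below are placeholders standing in for real build, test, and deploy commands, and `run_pipeline` is a hypothetical helper, not part of any CI/CD product.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]

def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order, stopping at the first failure; return log lines."""
    log = []
    for name, step in stages:
        ok = step()
        log.append(f"{name}: {'passed' if ok else 'failed'}")
        if not ok:
            # Later stages (such as deploy) never run after a failure.
            log.append("pipeline stopped")
            break
    return log

# Placeholder stages; real ones would invoke compilers, test runners, etc.
def build() -> bool:
    return True

def run_tests() -> bool:
    return 2 + 2 == 4

def deploy() -> bool:
    return True

print(run_pipeline([("build", build), ("test", run_tests), ("deploy", deploy)]))
```

Stopping on the first failed stage is what keeps a broken build or failing test from ever reaching the deploy step.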

Question 17

Q

What is an EC2 instance type?

Answer

A

Amazon Elastic Compute Cloud (EC2) is a web service that provides resizable compute capacity in the cloud.

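An instance type defines the hardware profile of an instance: a specific combination of CPU, memory, storage, and networking capacity. Amazon EC2 offers multiple instance types optimized for different workloads, along with flexible pricing options and the ability to customize and configure instances.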

Question 18

Q

What is Amazon Elastic Container Service (ECS)?

Answer

A

Amazon ECS is a fully managed container orchestration service that allows customers to run containers at scale in the cloud without the need to manage the container orchestration software.

Amazon ECS provides benefits such as automated container management, seamless hybrid environment support, security, cost control, simplicity, and integration with other AWS services.

Question 19

Q

What are Latent diffusion models?

Answer

A

Latent diffusion models are generative machine learning models, used mainly for image generation, that run the diffusion process in a compressed latent space rather than directly on pixels. An autoencoder first maps images into a lower-dimensional latent representation. The model is then trained to remove noise step by step from noisy latents, optionally guided by a condition such as a text prompt, and a decoder converts the denoised latent back into an image. Working in latent space makes training and generation much cheaper than diffusion in pixel space. Stable Diffusion is a well-known example.

Question 20

Q

What is an ETL pipeline?

Answer

A

An ETL (Extract, Transform, Load) pipeline is a process that pulls data from one or more sources, transforms it to meet specific requirements, and then loads it into a target database or data warehouse. The key steps in an ETL pipeline are:

Extract: Pulling data from various source systems.
Transform: Cleaning, formatting, and preparing the data for the target system.
Load: Transferring the transformed data into the target database or data warehouse.
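
The three steps above can be sketched with Python's standard library, using an inline CSV string as the source and an in-memory SQLite database as the target. The sample data, the `payments` table, and the cleaning rules are assumptions made for illustration only.

```python
import csv
import io
import sqlite3

# Hypothetical source data; in practice it would come from files, APIs, or databases.
RAW_CSV = """name,amount
 alice ,10.50
BOB,n/a
carol,7
"""

def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy names, parse amounts, drop rows that cannot be parsed."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows with an unparseable amount
        cleaned.append((row["name"].strip().title(), amount))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
print(conn.execute("SELECT name, amount FROM payments ORDER BY name").fetchall())
```

Keeping each step in its own function mirrors how larger pipelines are organized: sources, cleaning rules, and targets can each change without rewriting the others.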