AWS AI/ML Tech Flashcards

1
Q

What are the 5 ML development life cycle stages?

A
  1. Planning: Defining the problem and business goals.
  2. Data Preparation: Gathering, cleaning, and organizing data.
  3. Model Development: Selecting, training, and tuning the model.
  4. Deployment: Integrating the model into a live environment.
  5. Monitoring: Tracking performance and making adjustments as needed.
2
Q

How do you identify AI opportunities?

A

1. Understand Business Needs:

  • Identify Pain Points and Challenges: Begin by thoroughly understanding your organization’s current challenges and pain points. This could involve analyzing operational inefficiencies, customer churn, or areas where manual processes hinder productivity.
  • Explore Growth Opportunities: Look beyond immediate challenges and consider how AI can unlock new growth avenues. This might involve personalizing customer experiences, automating tasks, or developing innovative products and services.

2. Focus on Value Creation:

  • Quantify Potential Impact: Estimate the potential ROI of AI initiatives by considering factors like cost savings, revenue growth, and improved efficiency.
  • Prioritize High-Impact Use Cases: Focus on projects that offer the greatest potential for positive business outcomes and align with strategic goals.

3. Consider Feasibility:

  • Data Availability and Quality: Assess the availability and quality of data required to train and operate AI models effectively.
  • Resource Requirements: Evaluate the need for infrastructure, computing power, and skilled personnel to support AI initiatives.
  • Expertise and Skillset: Determine if your team possesses the necessary expertise or if additional training or external support is required.

Examples:
  • Start with readily available data: If you have existing customer data or operational logs, explore how they can be used to build initial AI models.
  • Utilize AWS Free Tier: Leverage AWS Free Tier to experiment with AI services and build proof-of-concept projects without significant upfront costs.
  • Engage with AWS Partner Network: Collaborate with AWS Partners to access specialized expertise and accelerate AI implementation.

3
Q

What are the 3 evaluation types of enterprise data?

A

1. Data Aware: Primarily records and stores data, without active use in decision-making; responsibility for data is not clearly defined.

2. Data Informed: Analyzes data to inform decisions, with specific roles assigned for managing data; employs interactive data tools for insights.

3. Data Driven: Integrates data into all decision-making processes, with a company-wide commitment to data; utilizes advanced technologies like AI for strategic actions.

1. Data Aware:

Capability: Focus is on the collection of data with an emphasis on knowing what has happened, signifying a basic level of data utilization.
Ownership: There’s no designated responsibility for data management, indicating a lack of strategic importance placed on data assets.
Technology: Relies on legacy systems (old guard databases) that support simple data storage without advanced analysis or integration capabilities.

2. Data Informed:

Capability: Uses insights gained from data analysis to understand why events have occurred, pointing to a more analytical approach to data.
Ownership: Data is managed by designated individuals or teams, reflecting an organizational move towards recognizing data as a valuable asset.
Technology: Incorporates more sophisticated tools like data warehouses and interactive queries, allowing for deeper analysis and pattern recognition.

3. Data Driven:

Capability: Prioritizes action based on data analytics, with the aim to influence future outcomes, demonstrating the highest level of data maturity.
Ownership: Data is a responsibility shared across the organization, indicating a culture that values and utilizes data in decision-making processes.
Technology: Employs cutting-edge technologies like cloud computing, artificial intelligence, machine learning, and generative AI, enabling real-time analysis and proactive decision-making.

4
Q

Define High-Impact AI Initiatives (HI-AIs)

A
  • A high-impact AI initiative (HI-AI) is an AI opportunity that is feasible, has a clear short-term and long-term business impact, and minimizes risks.

Defining HI-AIs involves:
  1. Identifying or recognizing potential AI initiatives (PAIs)
  2. Framing potential AI initiatives for clarity on benefits and measurability
  3. Scoring initiatives to shortlist and prioritize HI-AIs
  4. Verifying with experts on the viability of the initiatives
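
The scoring step can be sketched as a simple weighted sum. The criteria (feasibility, impact, risk), the weights, the 1 to 5 scale, and the example initiatives are all illustrative assumptions and do not come from any AWS framework.

```python
from dataclasses import dataclass

# Hypothetical weights chosen only for illustration.
WEIGHTS = {"feasibility": 0.4, "impact": 0.4, "risk": 0.2}


@dataclass
class Initiative:
    name: str
    feasibility: int  # 1 (hard to deliver) to 5 (easy to deliver)
    impact: int       # 1 (low business impact) to 5 (high)
    risk: int         # 1 (low risk) to 5 (high risk)


def score(initiative: Initiative) -> float:
    """Weighted score; risk is inverted so that lower risk scores higher."""
    return (
        WEIGHTS["feasibility"] * initiative.feasibility
        + WEIGHTS["impact"] * initiative.impact
        + WEIGHTS["risk"] * (6 - initiative.risk)
    )


def shortlist(initiatives, top_n=2):
    """Return the top_n initiatives ordered from highest to lowest score."""
    return sorted(initiatives, key=score, reverse=True)[:top_n]


candidates = [
    Initiative("Churn prediction", feasibility=4, impact=5, risk=2),
    Initiative("Fully autonomous pricing", feasibility=2, impact=5, risk=5),
    Initiative("Invoice data extraction", feasibility=5, impact=3, risk=1),
]
ranked = shortlist(candidates)
print([item.name for item in ranked])
```

A shortlist produced this way would still go to experts for the final verification step.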
5
Q

What is Amazon CodeWhisperer?

A

Amazon CodeWhisperer is an AI-powered coding assistant that provides real-time, context-aware code suggestions to enhance developer productivity.

6
Q

What is ML?

A

“ML” stands for Machine Learning, which is a branch of artificial intelligence (AI). It involves training algorithms to make predictions or decisions based on data. Machine learning models automatically improve their performance as they are exposed to more data over time. It’s used in a wide range of applications, from recommendation systems in online platforms to autonomous vehicles.

7
Q

What are the 3 types of ML algorithms?

A
  1. Reinforcement Learning (RL):
    • Dynamic Decision Processes: RL is suitable for scenarios involving dynamic decision-making where an agent interacts with an environment to achieve a specific goal through trial and error.
    • Sequential Decision-Making: Ideal for tasks that require sequential decision-making, learning from feedback to optimize actions over time.
    • Delayed Rewards: Effective for problems with delayed rewards, where the consequences of actions are not immediate.
    • Real-World Applications: Widely used in autonomous driving, robotics manipulation, NLP, finance, and industry automation.
    • Example: Training robots for complex movements or optimizing resource allocation in dynamic environments.
  2. Unsupervised Learning:
    • Pattern Discovery: Unsupervised learning is valuable for tasks focused on finding patterns, relationships, or structures within data without explicit guidance.
    • Exploratory Data Analysis: Commonly used for clustering, dimensionality reduction, and anomaly detection.
    • Data Segmentation: Ideal for segmenting data into meaningful groups based on underlying patterns.
    • Real-World Applications: Useful when there is a large amount of unlabeled data or to understand the inherent structure of the data.
    • Example: Segmenting customer preferences for targeted marketing or reducing data dimensionality for visualization.
  3. Supervised Learning:
    • Labeled Data: Supervised learning requires labeled training data where the model learns from input-output pairs to make predictions.
    • Predictive Modeling: Ideal for tasks focused on making predictions or classifying new data points based on historical data.
    • Regression & Classification: Commonly used for regression tasks (predicting continuous values) and classification tasks (predicting categories).
    • Real-World Applications: Widely applied in image recognition, natural language processing, and customer behavior prediction.
    • Example: Spam email detection, medical diagnosis, image classification, and predictive maintenance.

Examples:

Supervised Learning
* Spam prediction
* Fraudulent transaction detection
* Customer churn prediction
* Machine failure prediction
* Forecasting staffing levels
* Forecasting raw material prices
* Forecasting consumer demand

Unsupervised Learning
* Micro-segmentation of customers
* Recommendations of products to purchase
* Customer behavior analysis (market basket analysis)
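
The difference between learning from labels and finding structure without them can be sketched in plain Python. The data, the nearest-centroid classifier, and the two-cluster 1-D k-means below are toy assumptions chosen for brevity, not the algorithms any particular AWS service uses.

```python
from statistics import mean

# Supervised: every example carries a label (monthly logins, outcome).
labeled = [(1, "churn"), (2, "churn"), (3, "churn"),
           (10, "stay"), (11, "stay"), (12, "stay")]


def fit_centroids(examples):
    """Learn one average value per label from labeled examples."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Predict the label whose centroid is closest to the new value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


centroids = fit_centroids(labeled)
prediction = predict(centroids, 4)


# Unsupervised: no labels, only values; group them into two clusters.
def kmeans_1d(values, iterations=10):
    """Two-cluster k-means on 1-D data, seeded with the min and max value."""
    low_center, high_center = min(values), max(values)
    for _ in range(iterations):
        groups = ([], [])
        for value in values:
            nearer_low = abs(value - low_center) <= abs(value - high_center)
            groups[0 if nearer_low else 1].append(value)
        low_center, high_center = mean(groups[0]), mean(groups[1])
    return sorted(groups[0]), sorted(groups[1])


monthly_spend = [20, 22, 25, 190, 200, 210]
low_spenders, high_spenders = kmeans_1d(monthly_spend)
print(prediction, low_spenders, high_spenders)
```

The supervised half needs the "churn" and "stay" labels to learn anything; the unsupervised half discovers the two spending segments from the numbers alone.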

8
Q

What is Deep Learning?

A
  • Deep learning is a special class of ML that uses neural networks behind the scenes. Neural networks try to simulate the way the human brain works, with many densely interconnected brain cells. The idea is to replicate this inside a computer, so you can get it to learn things, recognize patterns, and make decisions in a humanlike way.
  • Unlike traditional ML, the beauty of deep learning is that it can automatically uncover features in data that it should use for learning to make optimal predictions.

Take a house price prediction problem as an example. With neural networks, you can present all possible inputs, from zip code to neighborhood to the median price of houses in the city. Neural networks can decide which features are essential and which ones can be excluded or relied on less. These networks also learn which combinations of inputs make the best prediction. In contrast, with traditional ML, you would have to experiment with different combinations of features yourself. Even though a neural network can automatically learn which features to use, for it to work as expected, the network needs to be designed correctly and fed large volumes of high-quality training data.

9
Q

Provide two types of GenAI models

A

Transformers
Description: Transformers are an innovative type of neural network architecture that can understand and process the context of words within a sentence or text sequence. They employ mechanisms such as self-attention to assign appropriate weights to the words based on their context within the input sequence.
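
A minimal sketch of the self-attention weighting described above, in plain Python. It assumes the simplest possible setup: the queries, keys, and values are the input vectors themselves, whereas real transformers apply learned projection matrices and several attention heads. The two-dimensional token vectors are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with Q = K = V = embeddings."""
    dim = len(embeddings[0])
    outputs, all_weights = [], []
    for query in embeddings:
        # Similarity of this token to every token in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        weights = softmax(scores)
        # Context-aware vector: weighted average of all token vectors.
        output = [
            sum(w * value[i] for w, value in zip(weights, embeddings))
            for i in range(dim)
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


tokens = ["river", "bank", "money"]
embeddings = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]  # hypothetical vectors
outputs, weights = self_attention(embeddings)
for token, row in zip(tokens, weights):
    print(token, [round(w, 2) for w in row])
```

Each row of `weights` shows how much one token attends to every token in the sequence, which is the "appropriate weights based on context" idea in miniature.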

GANs (Generative Adversarial Networks)
Description: GANs are a type of generative model characterized by their two main components: a generator and a discriminator. The generator attempts to produce synthetic data, such as images, while the discriminator aims to differentiate between real and generated data. GANs are built on an adversarial training paradigm whereby the generator and discriminator engage in a competitive dialogue, resulting in the generator improving its capacity to create more realistic outputs.

10
Q

How do large language models work?

A
  • A key factor in how LLMs work is the way they represent words. Earlier forms of machine learning used a numerical table to represent each word. However, this form of representation could not recognize relationships between words, such as words with similar meanings.
  • This limitation was overcome by using multi-dimensional vectors, commonly referred to as word embeddings, to represent words so that words with similar contextual meanings or other relationships are close to each other in the vector space.
  • Using word embeddings, transformers can pre-process text as numerical representations through the encoder and understand the context of words and phrases with similar meanings, as well as other relationships between words such as parts of speech. LLMs can then apply this knowledge of the language through the decoder to produce a unique output.
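
What "close to each other in the vector space" means can be sketched with cosine similarity. The three-dimensional vectors below are invented for illustration; real embeddings are learned from data and have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, hand-picked so related words point the same way.
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.82, 0.15],
    "banana": [0.1, 0.05, 0.9],
}

sim_royal = cosine_similarity(embeddings["king"], embeddings["queen"])
sim_fruit = cosine_similarity(embeddings["king"], embeddings["banana"])
print(round(sim_royal, 3), round(sim_fruit, 3))
```

Related words score close to 1, unrelated words score much lower, which is the relationship a plain numerical lookup table could not express.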
11
Q

What are applications of large language models?

A

There are many practical applications for LLMs.

Copywriting
Apart from GPT-3 and ChatGPT, Claude, Llama 2, Cohere Command, and Jurassic can write original copy. AI21 Wordspice suggests changes to original sentences to improve style and voice.

Knowledge base answering
Often referred to as knowledge-intensive natural language processing (KI-NLP), the technique refers to LLMs that can answer specific questions from information held in digital archives. An example is the ability of AI21 Studio playground to answer general knowledge questions.

Text classification
Using clustering, LLMs can classify text with similar meanings or sentiments. Uses include measuring customer sentiment, determining the relationship between texts, and document search.

Code generation
LLMs are proficient in code generation from natural language prompts. Examples include Amazon CodeWhisperer and OpenAI’s Codex used in GitHub Copilot, which can code in Python, JavaScript, Ruby, and several other programming languages. Other coding applications include creating SQL queries, writing shell commands, and website design.

Text generation
Similar to code generation, text generation can complete incomplete sentences, write product documentation or, like Alexa Create, write a short children’s story.

12
Q

What are large language models (LLMs)?

A
  • Large language models (LLMs) are very large deep learning models trained on massive datasets.
  • They use transformer neural networks, which consist of an encoder and a decoder with self-attention capabilities.
  • Transformer models are capable of unsupervised training and self-learning to understand languages and knowledge.
  • Unlike earlier models, transformers process entire sequences in parallel, reducing training time.
  • The large-scale transformer architecture allows the use of models with hundreds of billions of parameters and the ingestion of massive amounts of data.
13
Q

How do language models work?

A

Understanding Language Models

  • An essential aspect of how Large Language Models (LLMs) operate is the method of word representation.
  • Previous machine learning approaches used a numeric table for word representation, unable to recognize relationships between words like those with similar meanings.
  • Multi-dimensional vectors, known as word embeddings, have overcome this limitation by representing words in a manner where words with related contextual meanings or other connections are positioned closely together in the vector space.
  • Transformers, utilizing word embeddings, process text as numerical representations via the encoder, comprehending word and phrase contexts, as well as other word relationships, such as parts of speech.
  • Subsequently, the LLM applies this language knowledge via the decoder to generate a unique output.
14
Q

What is Amazon QuickSight (serverless BI with generative AI)?

A
  • Amazon QuickSight is a cloud-based business intelligence tool that helps users analyze data and gain insights. It has the following key features:
  • QuickSight allows users to connect to data from various sources like AWS, third-party data, spreadsheets and SaaS applications. It can combine different types of data into a single dashboard for reporting and analysis.
  • QuickSight authors can export data to SageMaker Canvas to build ML models without any coding. Various algorithms can be used to create predictive models for tasks like forecasting, anomaly detection, and more.
15
Q

Which AWS services support GANs?

A

AWS Services for GAN Support

  1. Amazon SageMaker
    • Fully managed service for preparing data and building, training, and deploying machine learning models.
    • Offers fully managed infrastructure, tools, and workflows for diverse model applications.
    • Features tailored to accelerate GAN development and training for various applications.
  2. Amazon Bedrock
    • Fully managed service providing access to foundation models (FMs) or trained deep neural networks from Amazon and leading AI startups.
    • Offers FMs through APIs, allowing flexibility to select the most suitable model for specific requirements.
    • Enables swift development and deployment of scalable, reliable, and secure generative AI applications without infrastructure management.
  3. AWS DeepComposer
    • Offers a creative approach for ML initiation, allowing hands-on experience with a musical keyboard and modern ML techniques to enhance ML skills.
    • Regardless of ML or music background, developers can engage in GAN training and optimization to create original music.
16
Q

What are AWS Deep Learning Services?

A

AWS Deep Learning Services
- Utilize cloud computing to scale deep learning neural networks cost-effectively and optimize for speed.
- AWS offers specific services for fully managing deep learning applications:
- Amazon Rekognition: Incorporate pretrained or customizable computer vision features into applications.
- Amazon Transcribe: Automatically recognize and transcribe speech accurately.
- Amazon Lex: Build intelligent chatbots proficient in understanding intent, conversational context, and task automation across multiple languages.
- Amazon SageMaker: Provides a swift and straightforward approach to constructing, training, and deploying neural networks at scale for deep learning on AWS.
- AWS Deep Learning AMIs: Create customized environments and workflows for deep learning applications.

17
Q

What are foundation models?

A
  • Foundation models (FMs) are extensive deep learning neural networks revolutionizing data scientists’ approach to machine learning (ML), enabling quicker and more cost-effective development of new AI applications by leveraging vast datasets.
  • The term foundation model was coined by researchers to describe ML models trained on a broad spectrum of generalized and unlabeled data and capable of performing a wide variety of general tasks such as understanding language, generating text and images, and conversing in natural language.
  • Foundation models are pre-trained: they are trained on vast amounts of data and contain hundreds of millions or even billions of parameters. This pre-training enables them to perform a range of tasks and adapt to a wide variety of downstream applications.
18
Q

How can AWS help with foundation models?

A
  • Amazon SageMaker JumpStart is an ML hub offering models, algorithms, and solutions. It provides access to hundreds of foundation models, including top-performing publicly available foundation models. New foundation models continue to be added, including Llama 2, Falcon, and Stable Diffusion XL 1.0.
  • Amazon Bedrock is the easiest way to build and scale generative AI applications with foundation models. Amazon Bedrock is a fully managed service that makes foundation models from Amazon and leading AI startups available through an API, so you can choose from various FMs to find the model that’s best suited for your use case. With Bedrock, you can speed up developing and deploying scalable, reliable, and secure generative AI applications without managing infrastructure.
  • Amazon Titan is a family of high-performing foundation models from Amazon.
19
Q

What are examples of foundation models?

A

Claude
- Claude 2 is Anthropic’s state-of-the-art model that excels at thoughtful dialogue, content creation, complex reasoning, creativity, and coding, built with Constitutional AI. Claude 2 can take up to 100,000 tokens in each prompt, meaning it can work over hundreds of pages of text, or even an entire book. Claude 2 can also write longer documents—like memos and stories on the order of a few thousand tokens—compared to its prior version.

GPT
- The Generative Pre-trained Transformer (GPT) model was developed by OpenAI in 2018.

AI21 Jurassic
- Released in 2021, Jurassic-1 is a 76-layer auto-regressive language model with 178 billion parameters. Jurassic-1 generates human-like text and solves complex tasks. Its performance is comparable to GPT-3.

Amazon Titan
- Exclusive to Amazon Bedrock, the Amazon Titan family of models incorporates Amazon’s 25 years of experience innovating with AI and machine learning across its business. Amazon Titan foundation models (FMs) provide customers with a breadth of high-performing image, multimodal, and text model choices, via a fully managed API.
- Amazon Titan models are created by AWS and pretrained on large datasets, making them powerful, general-purpose models built to support a variety of use cases, while also supporting the responsible use of AI. Use them as is or privately customize them with your own data.

20
Q

What is Amazon SageMaker designed to help data scientists and developers accomplish?

A

Amazon SageMaker is designed to help data scientists and developers prepare data, and build, train, and deploy machine learning models quickly by integrating purpose-built capabilities. This allows for the construction of highly accurate models with less effort spent on managing ML environments and infrastructure.

21
Q

How does SageMaker Data Wrangler assist in feature engineering?

A
  • Feature engineering is crucial because data in its raw form often doesn’t provide enough or optimal information for training models. By converting, transforming, or combining raw data into features, models can better learn from the data, reducing noise and enhancing the signal.
  • SageMaker Data Wrangler helps in converting, transforming, or combining raw tabular data into features in a fraction of the time it typically takes, making the feature engineering process more efficient.
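
The three operations named above (converting, transforming, combining) can be sketched in plain Python. This is not the Data Wrangler API; the song columns and the derived `total_beats` feature are hypothetical, and the scaling helper assumes the column holds at least two distinct values.

```python
# Hypothetical raw tabular rows.
rows = [
    {"length_s": 180, "bpm": 120, "genre": "pop"},
    {"length_s": 240, "bpm": 90, "genre": "jazz"},
    {"length_s": 300, "bpm": 150, "genre": "pop"},
]


def min_max_scale(values):
    """Rescale numbers to the 0..1 range (assumes max > min)."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def engineer_features(rows):
    genres = sorted({row["genre"] for row in rows})
    bpm_scaled = min_max_scale([row["bpm"] for row in rows])
    features = []
    for row, scaled in zip(rows, bpm_scaled):
        feature = {
            # Transform: put tempo on a common 0..1 scale.
            "bpm_scaled": scaled,
            # Combine: two raw columns become one derived feature.
            "total_beats": row["bpm"] * row["length_s"] / 60,
        }
        # Convert: a text category becomes one-hot numeric columns.
        for genre in genres:
            feature[f"genre_{genre}"] = 1 if row["genre"] == genre else 0
        features.append(feature)
    return features


features = engineer_features(rows)
for feature in features:
    print(feature)
```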
22
Q

What is the purpose of the SageMaker Feature Store?

A

The SageMaker Feature Store allows users to save, version, describe, and search for features. This facilitates the sharing and reuse of features across teams, improving collaboration and model development efficiency.

23
Q

Provide an example of a SageMaker implementation

A

To build a model that creates a musical playlist curated to the listener’s taste using Amazon SageMaker, we can summarize the process in five key points:

  • Data Collection and Preparation: Utilize Amazon SageMaker to connect and load a large quantity of song metadata from sources like Amazon S3 and Amazon Redshift. This metadata might include song length, beats per minute, genre, and rating, which are crucial for training the model.
  • Feature Engineering: Employ SageMaker Data Wrangler to convert, transform, or combine raw tabular data into meaningful features, such as “danceability” by combining beat and genre. This step is essential to enhance the model’s learning process by maximizing the signal and reducing noise in the data.
  • Utilize Feature Store: Save the engineered features to SageMaker Feature Store to facilitate versioning, description, and easy retrieval of features. This not only helps in training the model with relevant data but also ensures that features can be reused for other models, improving efficiency.
  • Balance and Clarify Data: Apply SageMaker Clarify to ensure the training data is well-balanced across various musical genres and to identify potential biases. This step is critical to create a model that accurately reflects a wide array of musical tastes and genres, leading to more personalized playlist recommendations.
  • Model Training and Deployment: After preparing and ensuring the quality of the training data, train the machine learning model using SageMaker. Once trained, deploy the model for making real-time predictions. Continuously improve the model over time by leveraging new data and insights from tools like SageMaker Clarify and SageMaker Debugger for systematic error and bias removal.
24
Q

How does SageMaker contribute to the productivity and cost-efficiency of data science teams?

A

According to AWS, data science teams using SageMaker can achieve up to a 10 times improvement in productivity and a 54% lower total cost of ownership (TCO) compared to other cloud platforms, thanks to its integrated development environment and automated processes.

25
Q

What is Amazon Titan?

A

High-performing foundation models from Amazon that support:

  • Text Generation
  • Summarization
  • Semantic Search
  • Image Generation
  • Retrieval Augmented Generation (RAG)
26
Q

What is precision?

A

Precision measures the accuracy of the positive predictions made by a model. Specifically, it answers the question: “Out of all the instances that the model predicted as positive, how many are actually positive?” It’s crucial when the cost of a false positive is high.

27
Q

What is recall in AI?

A

Recall assesses the model’s ability to identify all actual positive instances in the dataset. It answers: “Of all the positives that actually exist, how many did the model successfully identify?” This metric is vital when it’s crucial to capture as many positives as possible, even if some false positives are also picked up.
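
Precision and recall can both be computed from counts of true positives (TP), false positives (FP), and false negatives (FN): precision = TP / (TP + FP) and recall = TP / (TP + FN). The sketch below uses made-up labels and returns 0.0 when a denominator is zero, which is one common convention rather than the only one.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    fn = sum(1 for t, p in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels: 4 real positives, the model flags 3 items.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
print(round(precision, 3), recall)
```

Here 2 of the 3 flagged items are truly positive (precision of about 0.667), while only 2 of the 4 real positives were found (recall of 0.5).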

28
Q

What is AWS Inferentia?

A
  • AWS Inferentia is an AI inference chip developed by Amazon for deploying machine learning models in production.
  • It provides high performance and low cost inference capabilities compared to other alternatives like GPUs.
  • Inference workloads involve deploying pre-trained machine learning models to perform predictions or classifications on new input data in real time or in batch.
29
Q

What is AWS Trainium?

A
  • AWS Trainium is a machine learning chip purpose-built by AWS for model training workloads. It powers Amazon EC2 Trn1 instances.
  • Training workloads involve developing machine learning models by feeding large datasets to training algorithms such as deep neural networks.
30
Q

What is Amazon CodeWhisperer?

A

Amazon CodeWhisperer is an AI coding assistant created by AWS to help developers write code more efficiently.

31
Q

What are the AWS computer vision services?

A

Amazon Rekognition

  • Analyze images and videos
  • Catalog assets, automate workflows, and extract meaning from your media and applications.

Amazon Lookout for Vision

  • Detect defects and automate inspection
  • Identify missing product components, vehicle and structure damage, and irregularities for comprehensive quality control.

AWS Panorama

  • Utilize computer vision at the edge
  • Improve operations with automated monitoring to find bottlenecks and assess manufacturing quality and safety.

32
Q

What are the AWS services for automated data extraction and analysis?

A

Amazon Textract

  • Extract text and data
  • Pull valuable information from millions of documents at speed.

Amazon Comprehend

  • Acquire insights
  • Maximize the value of unstructured text with natural language processing (NLP).

Amazon Augmented AI (Amazon A2I)

  • Control quality
  • Add humans to the review process to ensure accuracy and compliance of sensitive data.
33
Q

What are the AWS services for language AI?

A

Amazon Lex

  • Build chatbots and virtual agents
  • Create automated conversation channels to improve customer service.

Amazon Transcribe

  • Automate speech recognition
  • Enhance your applications and workflows with automatic speech recognition.

Amazon Polly

  • Give your apps a voice
  • Convert text into life-like speech, improving user experience and accessibility.
34
Q

What are the AWS services for improving customer experience?

A

Amazon Kendra

  • Find accurate information faster
  • Enhance websites and applications with natural language search to help users quickly find what they need.

Amazon Personalize

  • Personalize online experiences
  • Use ML to customize applications and websites to each individual user.

Amazon Translate

  • Engage audiences in every language
  • Expand your reach and accessibility with fast, accurate, and customizable translation.

35
Q

What are the AWS services for business metrics?

A

Amazon Forecast

  • Forecast business metrics
  • Harness unique data types and time series data to create accurate end-to-end prediction models.

Amazon Fraud Detector

  • Detect online fraud
  • Stop adversaries and identify potential attacks with technology honed through years of use on Amazon.com.

Amazon Lookout for Metrics

  • Identify data anomalies
  • Detect and identify root causes of unexpected changes in metrics such as revenue and retention.
36
Q

What are the AWS services for code and DevOps?

A

Amazon DevOps Guru

  • Improve application availability
  • Simplify operational performance measurement and reduce application downtime.

Amazon CodeGuru Reviewer

  • Automated code reviews
  • Detect bugs and assess critical issues and vulnerabilities fast for higher quality code.

Amazon CodeGuru Profiler

  • Eliminate costly inefficient code
  • Use runtime behavior analysis to improve application performance and decrease compute costs.
37
Q

What are the AWS services for industrial AI?

A

Amazon Lookout for Equipment

  • Detect abnormal machine conditions
  • Automatically detect abnormal machine conditions by analyzing sensor data.

Amazon Monitron

  • Predictive maintenance
  • End-to-end predictive maintenance system that includes sensors, gateway, anomaly detection service, and end-user application.
38
Q

What are the AWS services for healthcare?

A

Amazon HealthLake

  • Store and analyze health data
  • Securely store, transfer, query, and analyze health data to offer a complete view of a patient’s medical history.

Amazon Comprehend Medical

  • Extract health data
  • Extract information from unstructured medical text accurately and quickly.

39
Q

What is IDA*?

A

Iterative deepening A* (IDA*) is a graph traversal and path-finding algorithm used in artificial intelligence. It combines the low memory use of depth-first search with the heuristic guidance of A* search to find the shortest path in a weighted graph.
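
A minimal sketch of IDA* in Python, assuming an admissible heuristic (one that never overestimates the remaining cost and is 0 at the goal). The graph, the heuristic values, and the function name `ida_star` are made up for illustration.

```python
import math


def ida_star(graph, heuristic, start, goal):
    """Return (cost, path) of a shortest path, or (math.inf, None)."""

    def search(path, cost_so_far, bound):
        node = path[-1]
        estimate = cost_so_far + heuristic[node]
        if estimate > bound:
            return estimate, None  # prune; report the exceeded estimate
        if node == goal:
            return cost_so_far, list(path)
        smallest = math.inf
        for neighbor, edge_cost in graph.get(node, []):
            if neighbor in path:  # skip cycles on the current path
                continue
            path.append(neighbor)
            result, found = search(path, cost_so_far + edge_cost, bound)
            path.pop()
            if found is not None:
                return result, found
            smallest = min(smallest, result)
        return smallest, None

    bound = heuristic[start]
    while True:
        # Depth-first search limited by the current cost bound.
        result, found = search([start], 0, bound)
        if found is not None:
            return result, found
        if result == math.inf:
            return math.inf, None  # goal is unreachable
        bound = result  # raise the bound to the smallest pruned estimate


# Hypothetical weighted graph: node -> list of (neighbor, edge cost).
graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
heuristic = {"A": 3, "B": 2, "C": 1, "D": 0}  # assumed admissible estimates

cost, path = ida_star(graph, heuristic, "A", "D")
print(cost, path)
```

Only the current path is kept in memory, as in depth-first search, while the cost bound grows step by step until the cheapest route is found.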