3. AI Technology Stack Flashcards

1
Q

What is an AI Platform?

A

An AI platform is software that allows an organization to develop, test, deploy and refresh AI applications.

Platforms can:
- Centralize data analysis
- Streamline development and production workflows
- Facilitate collaboration
- Automate systems-development tasks
- Monitor models and systems in production

Examples: Google Cloud Platform, Microsoft Azure, Amazon Web Services

2
Q

What are common uses of AI applications?

A
  • E-commerce
  • Education
  • Health care
  • Autonomous vehicles
  • Navigation
  • Facial recognition
  • Robotics
  • Human resources
  • Marketing
  • Social media
  • Chatbots
  • Finance
3
Q

What are common AI models?

A
  1. Linear and statistical models
  2. Decision trees
  3. Machine learning models
    - Neural networks
      * Computer vision models
      * Speech recognition models
      * Language models
      * Reinforcement learning models
  4. Robotics
4
Q

Describe Linear and Statistical Models:

A
  • Model the relationship between two variables (ex. how sales of a product are related to changes in pricing, based on historical data)
  • Linear and statistical models are not black-box algorithms and are more explainable
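
As a minimal sketch of the pricing example, the following fits a straight line to price and sales figures with ordinary least squares. The data values and the function name `fit_line` are made up for illustration and do not come from the card.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: unit price and units sold at that price.
prices = [10.0, 12.0, 14.0, 16.0]
units_sold = [200.0, 180.0, 160.0, 140.0]

slope, intercept = fit_line(prices, units_sold)
# The whole model is two readable numbers, which is what makes it explainable.
print(f"units_sold = {slope:.1f} * price + {intercept:.1f}")
```

With this made-up data the slope is negative, so the model says directly that each unit of price increase is associated with fewer units sold.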
5
Q

Describe the Decision Tree Models:

A
  • Predicts an outcome based on a flowchart of questions and answers
  • Explainable and not a black box
  • Disadvantage: changing the training data (even in a small way) can significantly impact the algorithm; subject to security attacks and hacks
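
The "flowchart of questions and answers" can be sketched as nested conditions. The loan scenario, the thresholds and the function name `approve_loan` below are hypothetical; a real decision tree would learn its questions from training data rather than have them written by hand.

```python
def approve_loan(income, has_defaulted, years_employed):
    """Walk a small hand-written decision tree; return (decision, path taken)."""
    path = []
    if has_defaulted:
        path.append("previous default? yes")
        return "reject", path
    path.append("previous default? no")
    if income >= 50_000:
        path.append("income >= 50000? yes")
        return "approve", path
    path.append("income >= 50000? no")
    if years_employed >= 5:
        path.append("employed >= 5 years? yes")
        return "approve", path
    path.append("employed >= 5 years? no")
    return "reject", path


decision, path = approve_loan(income=42_000, has_defaulted=False, years_employed=6)
print(decision)
for question in path:
    print("  ", question)
```

The recorded path is the explanation: every prediction can be traced back to the exact questions that produced it.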
6
Q

Describe the Machine Learning Models:

A
  • Often operate as black boxes
  • Lack transparency and explainability
  • Neural networks (based on the human brain)
    * Contain nodes, like neurons, in a layered structure and continuously improve the ability to find the right answer
  • Do not need to be explicitly programmed to make complex nonlinear inferences in unstructured data; they learn them from training data
  • Commonly behind technology such as facial recognition
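
To illustrate nodes arranged in layers, here is a minimal sketch of a two-layer network computing XOR, a simple nonlinear function that a single straight line cannot capture. The weights are picked by hand purely for illustration (an assumption; real networks learn their weights from training data), and the names `layer` and `xor_net` are hypothetical.

```python
def relu(x):
    """Nonlinear activation: pass positive values, clamp negatives to zero."""
    return max(0.0, x)


def layer(inputs, weights, biases, activation):
    """One fully connected layer: each node sums its weighted inputs, adds a bias, applies the activation."""
    return [
        activation(sum(w * x for w, x in zip(node_weights, inputs)) + bias)
        for node_weights, bias in zip(weights, biases)
    ]


# Hand-picked weights (illustrative assumption, not learned).
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -2.0]]
OUTPUT_BIASES = [0.0]


def xor_net(x1, x2):
    hidden = layer([x1, x2], HIDDEN_WEIGHTS, HIDDEN_BIASES, relu)
    output = layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES, lambda v: v)
    return output[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))
```

Even in this tiny case the answer comes out of weighted sums rather than readable rules, which hints at why larger networks are hard to explain.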
7
Q

Give examples of Neural Networks:

A
  1. Computer vision models: used to recognize objects in images and videos
  2. Speech recognition models: used in products like Alexa and transcription software (analyze speech across factors such as pitch, tone, language and accent)
  3. Language models: natural language processing; allow computers to understand human language using machine learning, deep learning models and linguistics (used to process and respond to large amounts of communications data, ex. customer service chatbots)
  4. Reinforcement learning models: train models to optimize their actions within a given environment to achieve a specific goal
    * Guided by feedback mechanisms of rewards and penalties
    * Conducted through trial and error: interactions or simulated experiences that do not require external data (ex. an algorithm trained to earn a high score in a video game by having its efforts evaluated and rated according to success towards the goal)
    * Disadvantage: lack of explainability and transparency
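
The reward-and-penalty loop of reinforcement learning can be sketched with tabular Q-learning, one common algorithm chosen here for illustration (the card does not name a specific one). The corridor environment, reward values and settings are all assumptions: an agent starts at position 0 and must learn to reach position 4.

```python
import random

N_STATES = 5            # positions 0..4 in a corridor; position 4 is the goal
ACTIONS = [-1, +1]      # step left or step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # learning rate, discount, exploration rate


def step(state, action):
    """Simulated environment: no external data needed. Returns (next_state, reward)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else -0.01   # reward at goal, small penalty otherwise
    return next_state, reward


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):                      # cap the length of one episode
            if rng.random() < EPSILON:            # trial and error: sometimes explore
                action = rng.choice(ACTIONS)
            else:                                 # otherwise use the best known action
                action = max(ACTIONS, key=lambda a: q[state][a])
            next_state, reward = step(state, action)
            done = next_state == N_STATES - 1
            target = reward if done else reward + GAMMA * max(q[next_state].values())
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


q_table = train()
policy = ["right" if q_table[s][+1] > q_table[s][-1] else "left" for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that "right" is correct; it only sees rewards and penalties, and the learned table of values is the only record of why it acts as it does.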
8
Q

Explain Robotics:

A
  • Multidisciplinary field encompassing the design, construction, operation and programming of robots
  • Allows AI systems and software to interact with the physical world without human
    intervention. (Ex. Roomba using machine learning to navigate a building)
9
Q

How does the technology stack pose challenges to AI, and how has it driven AI to the heights that we
see today?

A

Algorithmic innovation has been one of the true advances in pushing AI forward.

It started with phenotypic and image data capture systems in response to genomic research
and the accumulation of data:
* Supervised data
* Data collected in interactive environments
* Structured and unstructured data

10
Q

What are the main areas of AI infrastructure?

A
  1. Compute
  2. Storage
  3. Network and software development
11
Q

How does compute infrastructure advance AI?

A
  1. GPUs have thrust the AI movement forward. They are specialized chips that offload work from CPUs.
    * Provide better performance and keep pace with algorithmic advances
    * Allow hardware to be matched to the AI model for optimal performance
  2. Serverless: not limited to a particular server or piece of hardware; code runs on multiple hardware devices, providing two important functionalities for AI:
    * Loose coupling: taking data from a variety of sources
    * Scalability: running multiple instances of the code because it is not tied to a given server, which helps drive AI forward
  3. High-performance compute: isolated clusters of compute power with high-speed networking and specialized chipsets
  4. Trusted execution environments: good from a privacy perspective for AI, as human influence is taken out of the equation

12
Q

What is Quantum Computing?

A

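Processing data with quantum bits (qubits), which can represent 0 and 1 at the same time (superposition), unlike classical bits, which hold only one of the two values at a time.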

13
Q

What are the 4 general stages of AI storage?

A
  1. Ingestion
  2. Preparation
  3. Training
  4. Output (inference)

Each stage has different storage requirements that must be adhered to in order to avoid project failure.

14
Q

What are storage infrastructure considerations?

A
  1. Expense of a storage solution for massive amounts of data
  2. Different storage subsystems for different storage types (file, object, image, etc.), which may also affect expenses
  3. Storage types for structured vs. unstructured data
    * Easier to process structured data than unstructured data
    * AI must be done at scale; flexible storage makes it possible to run AI at scale
15
Q

How does network infrastructure advance AI?

A
  1. High-speed networks are needed to support AI models: complex AI models, deep learning models, natural language processing, large language models like ChatGPT
  2. Deliver training data to the AI algorithm in time, and support training and inference at scale, through high-speed networks
  3. High-performance compute: underlying infrastructure is housed in the same data center, usually in the same rack, and connected via fiber
  4. Edge computing: the Internet of Things (it is estimated that within the next three to five years, each individual will have five connected devices)
  5. Communication or network protocols: based on a congestion-free design, especially for larger language models and neural networks
  6. Transmission Control Protocol (TCP) is the standard (however, this requires a packet to be sent)
16
Q

How has software driven and contributed to AI challenges?

A
  1. Democratization of AI: AI is easier to use through low-code software and simpler AI interfaces
  2. Tuning AI systems: allows customization of AI models to generate more accurate outcomes and provide highly valuable insights into data. Usually done through trial and error by changing hyperparameters (very numerous in complex models); varies with model type and complexity
  3. Scaling AI models: trial-and-error tuning is done as you go; the goal is to move from trial-and-error tuning to educated tuning
  4. Transformation: data must be transformed for use in an AI model
    * Increases data compatibility: some AI pipelines require transforming data for compatibility with the data the AI model will be analyzing, optimizing data quality
    * Low data quality poses a challenge: transformation may have to be done inside the model, in preprocessing external to the model, or in post-processing of the model's output
  5. Labeling: enriches the data used for deployment, training and tuning (data labels need to be of a high quality and standard)
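
Trial-and-error tuning of a hyperparameter can be sketched as a small search over candidate values. The toy loss below stands in for a real model's validation error, and the candidate learning rates and the function name `final_loss` are illustrative assumptions.

```python
def final_loss(learning_rate, steps=20):
    """Run gradient descent on the toy loss (w - 3)^2 and return the loss reached."""
    w = 0.0
    for _ in range(steps):
        gradient = 2 * (w - 3)
        w -= learning_rate * gradient
    return (w - 3) ** 2


# Trial and error: try each candidate setting and keep the one with the lowest loss.
candidates = [0.001, 0.01, 0.1, 0.5, 1.1]
results = {lr: final_loss(lr) for lr in candidates}
best_lr = min(results, key=results.get)

for lr, loss in results.items():
    print(f"learning_rate={lr}: loss={loss:.4g}")
print("best:", best_lr)
```

Values that are too small barely move toward the answer and values that are too large overshoot and diverge. Complex models have many such settings at once, which is why moving from blind trial and error to educated tuning matters.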
17
Q

What are the main data labeling challenges?

A
  • Low-quality data labels
  • Scaling high-quality data
  • Data labeling operations
  • Lack of quality assurance in data labeling operations (needing to verify and validate that data labels are high-quality)
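
One possible quality-assurance step, sketched under the assumption that several annotators label each item, is to take the majority label and flag items where agreement is low. The item IDs, labels, threshold and function name `review_labels` are hypothetical.

```python
from collections import Counter


def review_labels(annotations, min_agreement=0.6):
    """Map each item to (majority_label, agreement, needs_review)."""
    report = {}
    for item_id, labels in annotations.items():
        label, votes = Counter(labels).most_common(1)[0]
        agreement = votes / len(labels)
        report[item_id] = (label, agreement, agreement < min_agreement)
    return report


# Hypothetical labels from three annotators per image.
annotations = {
    "img_001": ["cat", "cat", "cat"],
    "img_002": ["cat", "dog", "cat"],
    "img_003": ["dog", "cat", "bird"],
}

report = review_labels(annotations)
for item_id, (label, agreement, needs_review) in report.items():
    status = "REVIEW" if needs_review else "ok"
    print(f"{item_id}: {label} ({agreement:.0%} agreement) {status}")
```

Items flagged for review would go back to a human checker before the labels are used for training or tuning.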
18
Q

What is data observability and monitoring?

A
  1. Data observability is intended to monitor the overall health and status of an organization’s data ecosystem
  2. AI observability is a subcomponent of data observability focused on monitoring the performance of an AI algorithm, the data going in and coming out of an AI algorithm, and metrics of an AI system
  3. Data observability and AI observability are key to success on any AI project: they provide indices and metrics for performance and an in-depth analysis of AI data and models, and give the capability to investigate, resolve and prevent AI model issues
  4. Perform outcome validation to ensure that desired outcomes are delivered, align with the AI model and can be more or less predicted from tuning and transformations
19
Q

What are data observability and monitoring challenges?

A
  1. Data integrity: the training data is irrelevant, or not close enough, to the data that the analysis is based on, which may lead to inconsistent results
  2. Data drift: similar to data integrity; the algorithm is trained on a specific type of data and then applied to a different type of data. This may produce faulty predictions, lead to future mistakes and result in inaccurate data pipelines (ex. in health care: an AI model trained on one type of data is applied to a different type of data, leading to an incorrect diagnosis with a significant impact on an individual)
  3. Bias and discrimination: common, since humans create AI algorithms and the algorithms may inherit their biases
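
A very simple drift check, sketched here as one possible monitoring metric, compares a feature's average in production against its average in the training data. The patient-age figures, the threshold of 3 and the function name `drift_score` are assumptions for illustration only.

```python
import statistics


def drift_score(training_values, production_values):
    """Shift of the production mean from the training mean, in training standard deviations."""
    mean_train = statistics.mean(training_values)
    std_train = statistics.stdev(training_values)
    return abs(statistics.mean(production_values) - mean_train) / std_train


# Hypothetical patient ages: the model was trained on one population, then applied to another.
training_ages = [34, 41, 29, 38, 45, 36, 40, 33]
production_ages = [67, 72, 70, 65, 74, 69, 71, 68]

score = drift_score(training_ages, production_ages)
DRIFT_THRESHOLD = 3.0   # illustrative cut-off, not a standard value
print(f"drift score: {score:.2f}")
if score > DRIFT_THRESHOLD:
    print("Warning: production data no longer resembles the training data")
```

A check like this only catches a shift in the average of one feature; real monitoring would track more features and more detailed statistics.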
20
Q

What is the role of open-source AI?

A
  1. A maker culture that allows organizations greater freedom to innovate with AI
  2. Creates its own massive feedback loops that drive the free spread of ideas for applying AI transformation, tuning best practices and turning ideas into viable businesses or assets for organizations