P3 - Latency Flashcards
What is a chatbot?
A software application that simulates human conversation.
Commonly used in customer service, virtual assistants, and e-commerce.
Examples: Siri, Alexa, customer service bots on websites.
The RAKT scenario
RAKT’s chatbot: Designed for insurance queries but underperforming.
Key Problem: Slow response times (high latency).
Customers report frustration and dissatisfaction.
What is latency?
The delay between a user inputting a query and the chatbot responding.
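A minimal sketch of how this delay could be measured, assuming the chatbot is reachable as a plain Python function. The names `measure_latency` and `slow_bot` are hypothetical and stand in for a real system.

```python
import time

def measure_latency(respond, query):
    """Return (reply, seconds elapsed) for one call to respond(query)."""
    start = time.perf_counter()
    reply = respond(query)
    elapsed = time.perf_counter() - start
    return reply, elapsed

def slow_bot(query):
    # Hypothetical stand-in for a chatbot: sleeps to simulate processing time.
    time.sleep(0.05)
    return f"You asked: {query}"

reply, seconds = measure_latency(slow_bot, "Is flood damage covered?")
print(f"{reply!r} took {seconds * 1000:.0f} ms")
```

The printed time covers everything between sending the query and receiving the reply, which is the delay the user experiences.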
Why does latency matter?
Slow responses = poor user experience.
Impacts customer satisfaction and trust.
Real-world analogy: Waiting on hold during a phone call.
What causes latency?
Complex Decision Algorithms
Large Datasets
High Query Volume
How do complex decision algorithms cause latency?
Chatbots use multiple models to understand and respond to queries.
The “critical path” involves dependencies between these models.
How do large datasets cause latency?
Unoptimized data leads to slower processing times.
How does high query volume cause latency?
Increased user load strains system performance.
How do we reduce latency?
Streamline Dependencies
Optimize Datasets
Improve Algorithms
How does streamlining dependencies reduce latency?
Simplifies the critical path by removing unnecessary steps.
How does optimizing datasets reduce latency?
Use classified, domain-specific, and cleaned data.
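One possible illustration of cleaning and filtering data, assuming the dataset is a list of (text, topic label) pairs. The examples, labels, and the `clean` helper are all hypothetical.

```python
# Hypothetical raw training examples: (text, topic label).
raw_examples = [
    ("  How do I file a claim? ", "insurance"),
    ("how do i file a claim?", "insurance"),     # duplicate after normalizing
    ("What's the weather today?", "smalltalk"),  # outside the domain
    ("", "insurance"),                           # empty entry
    ("Is flood damage covered?", "insurance"),
]

def clean(examples, domain):
    """Keep non-empty, in-domain examples and drop duplicates."""
    seen = set()
    cleaned = []
    for text, label in examples:
        # Lowercase and collapse whitespace so near-identical entries match.
        normalized = " ".join(text.lower().split())
        if not normalized or label != domain or normalized in seen:
            continue
        seen.add(normalized)
        cleaned.append((normalized, label))
    return cleaned

cleaned = clean(raw_examples, "insurance")
print(cleaned)
```

Five raw entries shrink to two usable ones, so there is less data to search or train on.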
How do improved algorithms reduce latency?
They use faster and more efficient machine learning techniques.
What is the ‘Critical Path’ in this context?
The decision algorithm that defines the shortest and most efficient sequence of linked machine learning models required to transform a user’s input into the bot’s final response.
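A rough sketch of why the sequence of linked models matters for latency. It assumes each model can start only once the models it depends on have finished, and that independent models can run in parallel. The model names and millisecond values are hypothetical and are not taken from the RAKT scenario.

```python
from functools import lru_cache

# Hypothetical per-model latencies in milliseconds (illustrative values only).
LATENCY_MS = {
    "language_detection": 20,
    "intent_classification": 40,
    "entity_extraction": 60,
    "policy_lookup": 80,
    "response_generation": 100,
}

# Each model lists the models whose output it must wait for.
DEPENDS_ON = {
    "language_detection": [],
    "intent_classification": ["language_detection"],
    "entity_extraction": ["language_detection"],
    "policy_lookup": ["intent_classification", "entity_extraction"],
    "response_generation": ["policy_lookup"],
}

@lru_cache(maxsize=None)
def finish_time(model):
    """Earliest time (ms) at which `model` can finish."""
    # A model starts when the slowest model it depends on has finished.
    start = max((finish_time(dep) for dep in DEPENDS_ON[model]), default=0)
    return start + LATENCY_MS[model]

def slowest_chain(model):
    """Return the chain of dependent models that determines finish_time(model)."""
    deps = DEPENDS_ON[model]
    if not deps:
        return [model]
    slowest_dep = max(deps, key=finish_time)
    return slowest_chain(slowest_dep) + [model]

total_ms = finish_time("response_generation")
chain = slowest_chain("response_generation")
print(total_ms)  # 260
print(chain)
```

In this sketch the response cannot arrive sooner than 260 ms, because that is how long the slowest chain of dependent models takes. Speeding up a model outside that chain (here, intent classification) would not change the total.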
How do dependencies cause delays in software systems?
Dependencies are components that rely on other components to function properly. When one application or component is updated or improved, the rest of the system may need corresponding updates. If an update to one component is incompatible with another, it can cause delays across the whole system.
What strategies can be used to streamline dependency processes?
Offloading
Lightweight Models
Caching
How can offloading be used to streamline dependency processes?
Delegating processes (e.g., Siri offloads tasks to the cloud) allows multiple tasks to run concurrently.
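A small sketch of the concurrency benefit, using local worker threads as a stand-in for tasks delegated to another service. The two task functions and their 0.2-second delays are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def detect_intent(query):
    # Hypothetical task: sleeps to simulate a call to a remote model.
    time.sleep(0.2)
    return "claim_status"

def fetch_customer_record(customer_id):
    # Hypothetical task: sleeps to simulate a database lookup.
    time.sleep(0.2)
    return {"id": customer_id, "policy": "home"}

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as pool:
    # Both tasks are submitted before either result is awaited,
    # so they run at the same time instead of one after the other.
    intent_future = pool.submit(detect_intent, "Where is my claim?")
    record_future = pool.submit(fetch_customer_record, 42)
    intent = intent_future.result()
    record = record_future.result()
elapsed = time.perf_counter() - start

print(intent, record, f"{elapsed:.2f} s")
```

Run one after the other, the two tasks would take about 0.4 seconds. Run concurrently, the total is close to the time of the slower task alone.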
How can lightweight models be used to streamline dependency processes?
Using streamlined models offers shorter response times compared to larger, more complex counterparts.
How can caching be used to streamline dependency processes?
Storing commonly used data reduces the frequency of API calls, speeding up the overall process.
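A minimal sketch of caching using Python's built-in `functools.lru_cache`. The function `get_policy_summary` and the `api_calls` counter are hypothetical. The function body stands in for a slow API call.

```python
from functools import lru_cache

api_calls = 0  # counts how often the (simulated) API is actually called

@lru_cache(maxsize=128)
def get_policy_summary(policy_type):
    """Hypothetical slow lookup; repeated arguments are served from the cache."""
    global api_calls
    api_calls += 1
    return f"Summary of {policy_type} cover"

for policy_type in ["home", "car", "home", "home", "car"]:
    get_policy_summary(policy_type)

print(api_calls)  # 2
```

Five queries trigger only two real calls, because repeated questions about "home" and "car" are answered from the stored results.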