C1 Flashcards
What are the core functional requirements for a banking application?
What does the system need to do?
What are the system’s key features?
What are the end user’s expectations?
What are the system’s constraints?
1) Account creation and management for checking and savings accounts.
2) Fund transfers between accounts.
3) Viewing account balances and transaction history.
What are the non-functional requirements for a banking application?
1) Reliability
2) Security
3) High availability
4) Consistency for financial data
You will want to make sure that your detailed design can help you answer questions like:
What is the availability of your system, and is it fault-tolerant?
How is caching handled?
How are load balancers being used to distribute the load amongst servers?
What is the scale of the system?
How many users should it support?
How many requests should the server handle?
Are most use cases read-only?
Do users typically read the data shortly after someone else overwrites it?
Are most users on mobile devices?
What is the purpose of doing back-of-the-envelope calculations in system design?
To estimate the scale of the system, such as the number of accounts, transactions per second, and storage needs, helping inform infrastructure choices.
Estimating the number of servers: How many daily active users (DAU) do you expect to support?
Estimating the daily storage requirements: How many tweets are posted per day, and what is the percentage of tweets containing media?
Estimating network requirements: What is the maximum response time expected by the end user?
In our banking app, how many accounts would we have with 10 million users, each with an average of 2 accounts?
20 million accounts.
What is the peak transactions per second (TPS) if each user makes 2 transactions per day, with peak traffic in 10% of the day?
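Approximately 2,315 TPS, assuming all of the day’s transactions fall within the peak window.
Total transactions per day = 10M users × 2 = 20M
Peak period = 24 hours × 0.1 (10%) = 2.4 hours
2.4 hours × 3600 seconds/hour = 8,640 seconds
Peak TPS = total transactions per day / peak period duration in seconds = 20M / 8,640 ≈ 2,315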
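The estimate can be reproduced with a few lines of Python. This is a minimal sketch using only the figures from the cards; the variable names are illustrative, and it keeps the same worst-case assumption that every transaction of the day lands inside the peak window.

```python
# Back-of-the-envelope figures taken from the cards.
USERS = 10_000_000
ACCOUNTS_PER_USER = 2
TRANSACTIONS_PER_USER_PER_DAY = 2
PEAK_FRACTION_OF_DAY = 0.10  # the peak window is 10% of the day

SECONDS_PER_DAY = 24 * 3600

total_accounts = USERS * ACCOUNTS_PER_USER
daily_transactions = USERS * TRANSACTIONS_PER_USER_PER_DAY
peak_window_seconds = SECONDS_PER_DAY * PEAK_FRACTION_OF_DAY

# Worst case: all daily transactions are squeezed into the peak window.
peak_tps = daily_transactions / peak_window_seconds
# For comparison: the same volume spread evenly over the whole day.
average_tps = daily_transactions / SECONDS_PER_DAY

print(f"accounts:     {total_accounts:,}")
print(f"peak window:  {round(peak_window_seconds):,} s")
print(f"peak TPS:     {round(peak_tps):,}")
print(f"average TPS:  {round(average_tps):,}")
```

The gap between the average and the peak figure is the reason capacity is usually sized from the peak rather than the daily mean.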
What should the system interface for creating an account include?
POST createAccount(accountType string, accountInfo info), where info contains user details like name, email, etc. Returns account ID and account type.
How can we ensure duplicate accounts are not created for the same user?
By using unique identifiers (like email) and implementing distributed locks to avoid concurrency issues during account creation.
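One way to back this up at the storage layer is a database unique constraint, so that a retried or concurrent request cannot insert a second row. The sketch below uses an in-memory SQLite database as a stand-in for the real store. It does not show the distributed lock, and it assumes uniqueness per (email, account type) pair so one user can hold both a checking and a savings account; the cards only name email as the identifier. The table and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_id INTEGER PRIMARY KEY,"
    " email TEXT NOT NULL,"
    " account_type TEXT NOT NULL,"
    " UNIQUE (email, account_type))"  # assumption: one account per email and type
)


def create_account(conn, email, account_type):
    """Insert an account and return (account_id, created).

    If the row already exists, return the existing id with created=False.
    """
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO account (email, account_type) VALUES (?, ?)",
                (email, account_type),
            )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # The unique constraint rejected a duplicate: look up the original.
        row = conn.execute(
            "SELECT account_id FROM account WHERE email = ? AND account_type = ?",
            (email, account_type),
        ).fetchone()
        return row[0], False


first = create_account(conn, "ada@example.com", "checking")
retry = create_account(conn, "ada@example.com", "checking")
savings = create_account(conn, "ada@example.com", "savings")
row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(first, retry, savings, row_count)
```

The constraint acts as the last line of defence: even if two requests pass an application-level check at the same moment, only one insert succeeds.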
What should the system interface for updating account details include?
PUT manageAccount(accountID int64, accountInfo info) to update modifiable fields. Returns updated account details.
What is a potential solution for handling duplicate requests in a funds transfer operation?
Use a unique transaction ID for each transfer and check if a transaction with the same ID exists before processing it.
Common techniques are idempotency keys and deduplication caches. Options for building the unique ID:
User ID + Timestamp + Transaction Details: Combine the user’s ID, the timestamp of the request, and essential details like the recipient account and amount. This approach works well if there is a low chance of the user initiating identical transactions within milliseconds.
User ID + Idempotency Key: Have the client generate a unique idempotency key for each transaction, which is sent along with the request. This key could be a UUID or hash generated by the client at the time of request initiation.
Transaction UUID: Have the client generate and send a unique UUID for each transaction. If the server receives multiple requests with the same UUID, it knows they represent the same transaction.
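The idempotency-key check can be sketched as follows. This is an in-memory illustration only: `TransferService`, its fields, and the result dictionaries are hypothetical names, amounts are integer cents, and a real service would need the key lookup and the balance update to be atomic (for example, a unique constraint on the key inside the same database transaction).

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class TransferService:
    balances: dict                                  # account id -> balance in cents
    processed: dict = field(default_factory=dict)   # idempotency key -> stored result

    def transfer(self, idempotency_key, from_id, to_id, amount_cents):
        # A key we have already seen means a retry: replay the stored
        # result and do not move money a second time.
        if idempotency_key in self.processed:
            return self.processed[idempotency_key]

        if amount_cents <= 0 or self.balances[from_id] < amount_cents:
            result = {"status": "rejected"}
        else:
            self.balances[from_id] -= amount_cents
            self.balances[to_id] += amount_cents
            result = {"status": "completed"}

        self.processed[idempotency_key] = result
        return result


service = TransferService(balances={"A": 10_000, "B": 0})
key = str(uuid.uuid4())  # generated once by the client, reused on every retry

first_attempt = service.transfer(key, "A", "B", 2_500)
retried = service.transfer(key, "A", "B", 2_500)  # e.g. after a network timeout
print(first_attempt, retried, service.balances)
```

Because the client reuses the same key on retry, the second call returns the stored result and the balances change only once.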
What are two critical aspects to ensure in the View Account Balances API?
1) Consistency (ensuring users see up-to-date balances)
* Read-after-write (read-your-writes) consistency: serve the read from an up-to-date cache (e.g., Redis), OR read from the primary (writer) database to avoid read-replica lag
* Cache invalidation on the balance after each write, for accurate balance display
2) Security (restricting access to authorized users)
* OAuth / JWT Token
* RBAC
* HTTPS
How can we ensure balance consistency in a high-concurrency environment?
Use read-after-write consistency with caching, such as Redis, or rely on database transactions and cache invalidation on balance changes.
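A minimal sketch of invalidate-on-write, with plain dictionaries standing in for the primary database and for Redis. The class and method names are hypothetical, and the sketch ignores concurrency; it only shows the ordering of the write and the invalidation.

```python
class BalanceStore:
    """`db` stands in for the primary database, `cache` for Redis."""

    def __init__(self, balances):
        self.db = dict(balances)
        self.cache = {}
        self.db_reads = 0  # counts how often a read had to go to the database

    def get_balance(self, account_id):
        if account_id in self.cache:
            return self.cache[account_id]
        # Cache miss: read the source of truth and remember the answer.
        self.db_reads += 1
        balance = self.db[account_id]
        self.cache[account_id] = balance
        return balance

    def apply_delta(self, account_id, delta_cents):
        # Write to the source of truth first, then drop the cached copy
        # so the next read cannot return the stale balance.
        self.db[account_id] += delta_cents
        self.cache.pop(account_id, None)


store = BalanceStore({"acc-1": 10_000})
before = store.get_balance("acc-1")   # miss, loads from the database
again = store.get_balance("acc-1")    # hit, served from the cache
store.apply_delta("acc-1", -2_500)    # transfer out, cache entry invalidated
after = store.get_balance("acc-1")    # miss again, sees the new balance
print(before, again, after, store.db_reads)
```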
What indexes should be included on the Transaction table?
Primary key on transactionId, composite indexes on (fromAccountId, timestamp) and (toAccountId, timestamp), and an index on timestamp.
How can we restrict unauthorized access to balance data in a banking app?
Use JWT or OAuth tokens for authentication and implement role-based access control (RBAC).
Describe cursor-based pagination.
Cursor-based pagination uses a “cursor” that indicates the last item retrieved (e.g., transactionID), allowing efficient, scalable navigation through large datasets.
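A small sketch of the idea, using an in-memory SQLite table as a stand-in for the transaction store. The table layout and the `fetch_page` helper are hypothetical; the cursor is simply the last `transaction_id` returned, and each page asks for rows strictly older than it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE txn ("
    " transaction_id INTEGER PRIMARY KEY,"
    " account_id INTEGER NOT NULL,"
    " amount_cents INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO txn (transaction_id, account_id, amount_cents) VALUES (?, ?, ?)",
    [(i, 42, i * 100) for i in range(1, 8)],  # seven sample transactions
)


def fetch_page(conn, account_id, cursor=None, limit=3):
    """Return (rows, next_cursor), newest first.

    `cursor` is the last transaction_id the client has already seen.
    """
    if cursor is None:
        rows = conn.execute(
            "SELECT transaction_id, amount_cents FROM txn"
            " WHERE account_id = ?"
            " ORDER BY transaction_id DESC LIMIT ?",
            (account_id, limit),
        ).fetchall()
    else:
        rows = conn.execute(
            "SELECT transaction_id, amount_cents FROM txn"
            " WHERE account_id = ? AND transaction_id < ?"
            " ORDER BY transaction_id DESC LIMIT ?",
            (account_id, cursor, limit),
        ).fetchall()
    # A short page means there is nothing older left to fetch.
    next_cursor = rows[-1][0] if len(rows) == limit else None
    return rows, next_cursor


page1, cursor1 = fetch_page(conn, 42)
page2, cursor2 = fetch_page(conn, 42, cursor1)
page3, cursor3 = fetch_page(conn, 42, cursor2)
print([r[0] for r in page1], [r[0] for r in page2], [r[0] for r in page3])
```

Unlike OFFSET-based paging, the query never has to skip over earlier rows, and pages stay stable when new transactions are inserted at the head of the list.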
What indexes should be included on the Account table for a banking app?
Primary key on accountId, unique index on email, and an index on lastLogin.
What is the high-level system design for a banking app?
Client -> API Gateway -> Load Balancer -> Microservices (Account, Transaction, Notification) -> Database (Postgres with replicas), Redis Cache, Message Queue
What is the role of the API Gateway in the banking system architecture?
The API Gateway handles client requests, enforces authentication, rate limiting, and routes requests to the appropriate microservices.
Why use microservices for a banking app?
Microservices offer scalability, fault isolation, ease of maintenance, and allow each service to scale independently based on demand.
What are the main microservices in our banking app?
1) Account Management Service
2) Transaction Service
3) Notification Service
What is the role of the Account Management Service in a banking app?
Handles creating, updating, and deleting user accounts, including validation to prevent duplicate accounts.
What is the role of the Transaction Service in a banking app?
Manages fund transfers, balance checks, and retry mechanisms for consistency and idempotency.
What is the role of the Notification Service in a banking app?
Sends notifications (e.g., for transactions) asynchronously by reading messages from a message queue.
Why is Redis used in the banking system architecture?
Redis caches frequently accessed data (e.g., balances) to reduce database load and improve response times.
What purpose does a load balancer serve in the banking app architecture?
Distributes incoming requests across multiple instances of each microservice for load distribution and high availability.
Why is a message queue like Kafka used in the banking app?
To handle asynchronous messaging for long-running tasks (e.g., notifications) and ensure reliable delivery with retries.
- Kafka is a distributed, log-based event streaming platform, commonly used as a message queue
- High throughput and low latency
- Scalability (horizontal scaling: topics are partitioned across brokers, and consumers can read from these partitions concurrently, which further boosts scalability)
- Durability and Reliability
- Kafka stores messages on disk and can replicate data across multiple brokers, ensuring data persistence and fault tolerance.
- If a broker fails, Kafka can fail over to an in-sync replica and continue operating; acknowledged messages are not lost when replication and producer acknowledgements are configured appropriately.
- Data Retention and Replayability
- Fault Tolerance
- Decoupling of Producers and Consumers
- Real-time and Batch Processing Capabilities
- Exactly-Once Processing Semantics
How can circuit breakers improve the reliability of the Transaction Service?
Circuit breakers detect service failures and prevent further calls, enabling fallback responses and avoiding cascading failures.
- Transaction Processing: For each dependency within a transaction, wrap calls in a circuit breaker. If, for example, the Transaction Service needs to verify account balances and the balance service is unavailable, the circuit breaker will prevent further requests to it, avoiding a backlog.
- Cross-Service Dependencies: If the Transaction Service needs to interact with other services (e.g., fraud detection, payment authorization), circuit breakers ensure that failures in these services don’t disrupt the entire transaction processing workflow.
- Real-Time Alerts and Monitoring: Circuit breakers can be integrated with monitoring tools to alert administrators when certain services are unreachable, allowing proactive issue resolution.
Example
* Check Account Balance: If the balance service is down, the circuit breaker opens, and the Transaction Service can either return an error or use a cached balance (if available).
* Fraud Check: If fraud detection is unavailable, the circuit breaker will prevent retries, and the Transaction Service might flag the transaction for later review rather than delaying or failing the operation.
* Payment Processor: If the payment processor is down, the Transaction Service quickly returns an error response, allowing the user to retry later instead of experiencing extended delays.
Circuit breakers prevent cascading failures, improve response times by failing fast, enable graceful degradation, and support automatic recovery.
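The closed, open, and half-open behavior can be sketched in a few lines. This is an illustrative, single-threaded implementation, not a production library: `CircuitBreaker`, `CircuitOpenError`, the thresholds, and the two fake balance-service functions are all hypothetical, and the clock is injected so the example can simulate time passing.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call without trying the dependency."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("dependency unavailable, failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result


now = [0.0]  # fake clock so the example does not need to sleep
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])


def balance_service_down(account_id):
    raise ConnectionError("balance service unreachable")


def balance_service_up(account_id):
    return 7_500


outcomes = []
for _ in range(3):
    try:
        breaker.call(balance_service_down, "acc-1")
    except CircuitOpenError:
        outcomes.append("fast-fail")  # breaker is open, dependency not called
    except ConnectionError:
        outcomes.append("error")      # real failure from the dependency

now[0] = 11.0  # the reset timeout has passed and the service is back
recovered = breaker.call(balance_service_up, "acc-1")
print(outcomes, recovered)
```

The caller decides what a fast failure means: return an error, serve a cached balance, or flag the transaction for later review, as in the examples on the card.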
What is a dead-letter queue, and how is it used in a banking app?
A dead-letter queue stores messages that could not be processed, allowing for isolation of problematic messages and improving system resilience.
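A minimal in-memory sketch of the pattern, assuming a fixed retry limit before a message is dead-lettered. The queue, the `send_notification` handler, and the message shape are hypothetical stand-ins; a real deployment would use the broker's own retry and dead-letter facilities.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry limit before giving up on a message


def process_queue(messages, handler, max_attempts=MAX_ATTEMPTS):
    """Drain `messages` through `handler`; return (processed, dead_letters)."""
    queue = deque((msg, 1) for msg in messages)
    processed, dead_letters = [], []
    while queue:
        msg, attempt = queue.popleft()
        try:
            handler(msg)
            processed.append(msg)
        except Exception as exc:
            if attempt >= max_attempts:
                # Park the message with context so it can be inspected later
                # without blocking the messages queued behind it.
                dead_letters.append(
                    {"message": msg, "error": str(exc), "attempts": attempt}
                )
            else:
                queue.append((msg, attempt + 1))  # retry later
    return processed, dead_letters


def send_notification(msg):
    if "email" not in msg:
        raise ValueError("missing email")


processed, dead_letters = process_queue(
    [
        {"email": "ada@example.com", "text": "Transfer complete"},
        {"text": "no recipient"},  # can never succeed
    ],
    send_notification,
)
print(processed, dead_letters)
```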
How can horizontal scaling be achieved in a banking microservices architecture?
By adding more instances of each microservice, each capable of handling a portion of the load, ensuring better performance under high traffic.
What monitoring tools can help track performance in a microservices architecture for banking?
ELK Stack (Elasticsearch, Logstash, Kibana) or Grafana and Prometheus for real-time monitoring, logging, and alerting.
What is distributed tracing, and why is it important in a microservices architecture?
Distributed tracing tracks requests across services, helping to identify latency and bottlenecks in complex microservices interactions.
Why is connection pooling important for database performance in a high-traffic banking app?
Connection pooling reduces the overhead of opening and closing connections frequently, improving database throughput and performance.
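The mechanism can be illustrated with a tiny fixed-size pool. This is a single-threaded sketch using SQLite connections as stand-ins for database connections; `ConnectionPool` and its `opened` counter are hypothetical, and real applications would normally rely on the pool built into their database driver or framework.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        self.opened = 0  # how many connections were ever created
        for _ in range(size):
            self._idle.put(connect())
            self.opened += 1

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks while every connection is in use; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for reuse


pool = ConnectionPool(size=2, connect=lambda: sqlite3.connect(":memory:"))

results = []
for _ in range(10):
    with pool.connection() as conn:
        results.append(conn.execute("SELECT 1").fetchone()[0])

print(results, pool.opened)
```

Ten queries run here, but only two connections are ever opened; the pool size also caps how many connections the application can hold against the database at once.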
How can high availability be ensured for the primary database in a banking app?
By using a primary-replica (master-replica) setup where the primary handles writes and replicas handle reads; if the primary fails, a replica can be promoted to take over, providing redundancy and availability.
What are some caching strategies to improve user experience in a banking app?
Use Redis for caching balance data, implement read-after-write consistency, and consider caching at the API Gateway level for frequently accessed data.
- Near-Real-Time Cache Invalidation for Critical Data
- Some banking operations, like balance updates and funds transfers, are highly sensitive to consistency. In these cases, implement a near-real-time cache invalidation mechanism to clear or update cache entries immediately after a transaction.
- For instance, if a user makes a transfer, immediately invalidate or update the balance cache to ensure accurate information for the next balance check.
- This approach provides reliability for sensitive information and improves trust in the app’s data accuracy.
- Read-through-cache
- A read-through cache automatically loads data into the cache from the database or backend service when it’s requested but not already cached.
- In a banking app, this strategy is useful for frequently accessed but relatively static data like account information (e.g., account details, interest rates, and account types).
- This approach reduces latency for users by caching data the first time it’s requested, improving response times on subsequent requests.
- Write-Through Cache for Frequently Updated Data
- In a write-through cache, data is written to both the cache and the database simultaneously, ensuring the cache always has the latest data.
- This is beneficial for high-read, high-write items like user preferences and recent transactions.
- By keeping frequently updated data in the cache, users experience faster load times, while the cache remains consistent with the database.
- Client-Side Caching for Static Data
- For data that doesn’t change frequently (e.g., terms and conditions, product details), leverage client-side caching to reduce server requests.
- This allows the app to store static data on the user’s device, improving performance and reducing data usage, which can be important for users on mobile networks.
- Implement versioning to refresh client-side cache when these resources are updated, ensuring users always have access to the latest information.
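The read-through and write-through strategies can be contrasted in a short sketch. Both classes are hypothetical, dictionary-backed stand-ins (no eviction, no expiry, no concurrency control), intended only to show where the database is touched in each strategy.

```python
class ReadThroughCache:
    """On a miss, the cache itself loads the value from the backing store."""

    def __init__(self, loader):
        self._loader = loader  # function that reads from the database
        self._data = {}
        self.loads = 0

    def get(self, key):
        if key not in self._data:
            self.loads += 1
            self._data[key] = self._loader(key)
        return self._data[key]


class WriteThroughCache:
    """Every write goes to the backing store and to the cache together."""

    def __init__(self, store):
        self._store = store  # dict standing in for the database
        self._data = {}

    def put(self, key, value):
        self._store[key] = value  # persist first
        self._data[key] = value   # then keep the cache in step

    def get(self, key):
        if key in self._data:
            return self._data[key]
        return self._store.get(key)


# Read-through for relatively static data such as account types.
account_types_db = {"acc-1": "savings", "acc-2": "checking"}
account_types = ReadThroughCache(account_types_db.__getitem__)
first_read = account_types.get("acc-1")   # miss, loaded from the database
second_read = account_types.get("acc-1")  # hit, no database access

# Write-through for frequently updated data such as user preferences.
preferences_db = {}
preferences = WriteThroughCache(preferences_db)
preferences.put("user-1", {"theme": "dark"})

print(first_read, second_read, account_types.loads, preferences_db)
```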
What is the purpose of sharding the database in a large-scale banking application?
Sharding distributes data across multiple database instances, reducing load on each instance and enabling horizontal scaling.
- Enhanced Performance and Reduced Latency
- Each shard contains only a subset of the data, meaning queries are handled by a smaller dataset, which reduces the time taken to retrieve data.
- Improved Fault Isolation and Reliability
- Sharding isolates data into separate databases. If one shard encounters an issue (e.g., hardware failure or data corruption), the impact is limited to only that shard, not the entire database.
- Efficient Resource Management
- By segmenting data, each shard requires fewer system resources (memory, CPU, and storage) than a monolithic database would. This reduces the load on each database instance and avoids the resource limitations associated with a single, large database.
Examples
* Hash-Based Sharding (Random Distribution)
* Distribute customer accounts or transactions by hashing the account ID or transaction ID, ensuring an even distribution of data across shards.
* This approach helps balance the load across shards but requires a routing layer that applies the same hash function to map each request to the correct shard.
* Range-Based Sharding (Data Segmentation)
* Partition data by ranges (e.g., customer account numbers or date ranges for transactions), useful for workloads where data locality is beneficial.
* For instance, transaction records can be sharded by date range to keep recent data in a hot shard for quick access, while older records are stored in archival shards.
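Both routing schemes can be sketched as small pure functions. The shard count, the date boundaries, and the shard names ("archive", "warm", "hot") are assumptions made for illustration, not values from the cards.

```python
import bisect
import hashlib
from datetime import date

NUM_SHARDS = 4  # assumed shard count


def hash_shard(account_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Hash-based routing: a stable hash of the key, modulo the shard count."""
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Range-based routing: each boundary is the first date of the next, newer shard.
RANGE_BOUNDARIES = [date(2023, 1, 1), date(2024, 1, 1)]
RANGE_SHARDS = ["archive", "warm", "hot"]


def range_shard(txn_date: date) -> str:
    """Pick the shard whose date range contains `txn_date`."""
    return RANGE_SHARDS[bisect.bisect_right(RANGE_BOUNDARIES, txn_date)]


print(hash_shard("acc-1001"), hash_shard("acc-1002"))
print(range_shard(date(2022, 6, 1)), range_shard(date(2023, 7, 15)), range_shard(date(2024, 3, 9)))
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash()`, which is randomized per process for strings. Note that with plain modulo routing, changing the shard count remaps most keys; consistent hashing is a common way to reduce that movement.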