C1 Flashcards

1
Q

What are the core functional requirements for a banking application?

What does the system need to do?
What are the system’s key features?
What are the end user’s expectations?
What are the system’s constraints?

A

1) Account creation and management for checking and savings accounts.
2) Fund transfers between accounts.
3) Viewing account balances and transaction history.

2
Q

What are the non-functional requirements for a banking application?

A

1) Reliability
2) Security
3) High availability
4) Consistency for financial data

You will want to make sure that your detailed design can help you answer questions like:

What is the availability of your system, and is it fault-tolerant?
How is caching handled?
How are load balancers being used to distribute the load amongst servers?

What is the scale of the system?
How many users should it support?
How many requests should the server handle?
Are most use cases read-only?
Do users typically read the data shortly after someone else overwrites it?
Are most users on mobile devices?

3
Q

What is the purpose of doing back-of-the-envelope calculations in system design?

A

To estimate the scale of the system, such as the number of accounts, transactions per second, and storage needs, helping inform infrastructure choices.

Estimating the number of servers: How many daily active users (DAU) do you expect to support?

Estimating the daily storage requirements: How many tweets are posted per day, and what is the percentage of tweets containing media?

Estimating network requirements: What is the maximum response time expected by the end user?

4
Q

In our banking app, how many accounts would we have with 10 million users, each with an average of 2 accounts?

A

20 million accounts.

5
Q

What is the peak transactions per second (TPS) if each user makes 2 transactions per day, with peak traffic in 10% of the day?

A

Approximately 2,315 TPS.

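Peak window: 24 hours × 0.1 (10%) = 2.4 hours

2.4 hours × 3,600 seconds/hour = 8,640 seconds

Total transactions per day: 10M users × 2 = 20M (assuming all of them fall within the peak window)

Peak TPS = total transactions per day / peak period duration in seconds = 20M / 8,640 ≈ 2,315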
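
The arithmetic above can be checked with a short Python sketch. The 10 million user count comes from the earlier card, and the sketch keeps the same simplifying assumption that every daily transaction lands inside the peak window.

```python
# Back-of-the-envelope peak TPS, assuming all daily transactions fall in the peak window.
USERS = 10_000_000             # from the earlier card
TX_PER_USER_PER_DAY = 2
PEAK_FRACTION_OF_DAY = 0.10

total_tx_per_day = USERS * TX_PER_USER_PER_DAY        # 20,000,000
peak_seconds = 24 * 3600 * PEAK_FRACTION_OF_DAY       # 8,640 seconds
peak_tps = total_tx_per_day / peak_seconds

print(round(peak_tps))  # 2315
```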

6
Q

What should the system interface for creating an account include?

A

POST createAccount(accountType string, accountInfo info) where info contains user details like name, email, etc. Returns account ID and account type.

7
Q

How can we ensure duplicate accounts are not created for the same user?

A

By using unique identifiers (like email) and implementing distributed locks to avoid concurrency issues during account creation.
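
As a complement to the locking approach, a database-level unique constraint can act as a last line of defense against duplicates. The sketch below is only an illustration: it uses an in-memory SQLite database as a stand-in for the real store, and the `users` table and `create_user` function are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE)")

def create_user(email: str) -> bool:
    """Return True if the user was created, False if the email already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        return False

first = create_user("ada@example.com")
second = create_user("ada@example.com")  # duplicate is rejected by the constraint
```

Even if two requests race past an application-level check, only one insert can succeed.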

8
Q

What should the system interface for updating account details include?

A

PUT manageAccount(accountID int64, accountInfo info) to update modifiable fields. Returns updated account details.

9
Q

What is a potential solution for handling duplicate requests in a funds transfer operation?

A

Use a unique transaction ID for each transfer and check if a transaction with the same ID exists before processing it.

Common techniques are idempotency keys and deduplication caches. Options for building the key:

User ID + Timestamp + Transaction Details: Combine the user’s ID, the timestamp of the request, and essential details like the recipient account and amount. This approach works well if there is a low chance of the user initiating identical transactions within milliseconds.

User ID + Idempotency Key: Have the client generate a unique idempotency key for each transaction, which is sent along with the request. This key could be a UUID or hash generated by the client at the time of request initiation.

Transaction UUID: Have the client generate and send a unique UUID for each transaction. If the server receives multiple requests with the same UUID, it knows they represent the same transaction.
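
A minimal sketch of the idempotency-key idea, assuming an in-memory dictionary as the deduplication store (a real system would more likely use a database table or a cache, and would need to handle concurrent requests). The `transfer` function and `processed` store are hypothetical names.

```python
import uuid

processed: dict[str, dict] = {}     # idempotency key -> stored result
balances = {"A": 100, "B": 0}

def transfer(idempotency_key: str, src: str, dst: str, amount: int) -> dict:
    # A repeated key returns the stored result instead of moving money again.
    if idempotency_key in processed:
        return processed[idempotency_key]
    if balances[src] < amount:
        result = {"status": "rejected", "reason": "insufficient funds"}
    else:
        balances[src] -= amount
        balances[dst] += amount
        result = {"status": "ok", "amount": amount}
    processed[idempotency_key] = result
    return result

key = str(uuid.uuid4())              # generated once by the client
first = transfer(key, "A", "B", 30)
retry = transfer(key, "A", "B", 30)  # e.g. a network retry of the same request
```

The retry returns the original result and the balances change only once.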

10
Q

What are two critical aspects to ensure in the View Account Balances API?

A

1) Consistency (ensuring users see up-to-date balances)
* Read-after-write (read-your-writes) consistency: serve the balance from a cache such as Redis that is refreshed on every write, or read from the primary database to avoid read-replica lag
* Cache invalidation on the balance after each write, so the displayed balance is accurate
2) Security (restricting access to authorized users)
* OAuth / JWT Token
* RBAC
* HTTPS

11
Q

How can we ensure balance consistency in a high-concurrency environment?

A

Use read-after-write consistency with caching, such as Redis, or rely on database transactions and cache invalidation on balance changes.
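
The invalidate-on-write idea can be sketched as follows. Plain dictionaries stand in for the primary database and for a cache such as Redis, and the function names are hypothetical.

```python
db = {"acct-1": 100}         # stand-in for the primary database
cache: dict[str, int] = {}   # stand-in for a cache such as Redis

def get_balance(account_id: str) -> int:
    if account_id not in cache:           # cache miss: load from the database
        cache[account_id] = db[account_id]
    return cache[account_id]

def update_balance(account_id: str, new_balance: int) -> None:
    db[account_id] = new_balance          # write to the source of truth first
    cache.pop(account_id, None)           # then invalidate the cached value

before = get_balance("acct-1")
update_balance("acct-1", 70)
after = get_balance("acct-1")             # reloaded from db, not the stale entry
```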

12
Q

What indexes should be included on the Transaction table?

A

Primary Key on transactionId, composite indexes on (fromAccountId, timestamp) and (toAccountId, timestamp), and index on timestamp.
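
As a rough illustration, the indexes above could be declared like this. SQLite is used only so the sketch is runnable; the column types, the `amount` column, and the index names are assumptions, and Postgres syntax would differ slightly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (
    transactionId TEXT PRIMARY KEY,
    fromAccountId INTEGER NOT NULL,
    toAccountId   INTEGER NOT NULL,
    amount        INTEGER NOT NULL,
    timestamp     TEXT NOT NULL
);
CREATE INDEX idx_tx_from_ts ON transactions (fromAccountId, timestamp);
CREATE INDEX idx_tx_to_ts   ON transactions (toAccountId, timestamp);
CREATE INDEX idx_tx_ts      ON transactions (timestamp);
""")

# List the explicitly created indexes (the primary key gets an automatic one).
index_names = sorted(
    row[1] for row in conn.execute("PRAGMA index_list('transactions')")
    if not row[1].startswith("sqlite_autoindex")
)
```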

13
Q

How can we restrict unauthorized access to balance data in a banking app?

A

Use JWT or OAuth tokens for authentication and implement role-based access control (RBAC).

14
Q

Describe cursor-based pagination.

A

Cursor-based pagination uses a “cursor” that indicates the last item retrieved (e.g., transactionID), allowing efficient, scalable navigation through large datasets.
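
A small sketch of the technique, assuming an integer transaction ID as the cursor and an in-memory SQLite table as a stand-in for the real database. `fetch_page` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO transactions (id, amount) VALUES (?, ?)",
                 [(i, i * 10) for i in range(1, 8)])

def fetch_page(cursor_id: int, limit: int):
    """Return (rows, next_cursor); next_cursor is None once a page comes back short."""
    rows = conn.execute(
        "SELECT id, amount FROM transactions WHERE id > ? ORDER BY id LIMIT ?",
        (cursor_id, limit),
    ).fetchall()
    next_cursor = rows[-1][0] if len(rows) == limit else None
    return rows, next_cursor

page1, cur1 = fetch_page(0, 3)      # ids 1..3
page2, cur2 = fetch_page(cur1, 3)   # ids 4..6
page3, cur3 = fetch_page(cur2, 3)   # id 7, short page ends the iteration
```

Each request seeks straight to the cursor position through the primary key, so the cost of a page does not grow with how deep the client has scrolled.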

15
Q

What indexes should be included on the Account table for a banking app?

A

Primary Key on accountId, unique index on email, and index on lastLogin.

16
Q

What is the high-level system design for a banking app?

A

Client -> API Gateway -> Load Balancer -> Microservices (Account, Transaction, Notification) -> Database (Postgres with replicas), Redis Cache, Message Queue

17
Q

What is the role of the API Gateway in the banking system architecture?

A

The API Gateway handles client requests, enforces authentication, rate limiting, and routes requests to the appropriate microservices.

18
Q

Why use microservices for a banking app?

A

Microservices offer scalability, fault isolation, ease of maintenance, and allow each service to scale independently based on demand.

19
Q

What are the main microservices in our banking app?

A

1) Account Management Service
2) Transaction Service
3) Notification Service

20
Q

What is the role of the Account Management Service in a banking app?

A

Handles creating, updating, and deleting user accounts, including validation to prevent duplicate accounts.

21
Q

What is the role of the Transaction Service in a banking app?

A

Manages fund transfers, balance checks, and retry mechanisms for consistency and idempotency.

22
Q

What is the role of the Notification Service in a banking app?

A

Sends notifications (e.g., for transactions) asynchronously by reading messages from a message queue.

22
Q

Why is Redis used in the banking system architecture?

A

Redis caches frequently accessed data (e.g., balances) to reduce database load and improve response times.

23
Q

What purpose does a load balancer serve in the banking app architecture?

A

Distributes incoming requests across multiple instances of each microservice for load distribution and high availability.

23
Q

Why is a message queue like Kafka used in the banking app?

A

To handle asynchronous messaging for long-running tasks (e.g., notifications) and ensure reliable delivery with retries.

  • Kafka is a distributed event streaming platform that is commonly used as a message queue
  • High throughput and low latency
  • Scalability: brokers scale horizontally, topics are partitioned across brokers, and consumers can read from partitions concurrently
  • Durability and Reliability
    • Kafka stores messages on disk and can replicate data across multiple brokers, providing persistence and fault tolerance.
    • If a broker fails, a replica on another broker can take over, so acknowledged messages are not lost when replication and producer acknowledgements are configured appropriately.
  • Data Retention and Replayability
  • Fault Tolerance
  • Decoupling of Producers and Consumers
  • Real-time and Batch Processing Capabilities
  • Exactly-Once Processing Semantics
24
Q

How can circuit breakers improve the reliability of the Transaction Service?

A

Circuit breakers detect service failures and prevent further calls, enabling fallback responses and avoiding cascading failures.

  • Transaction Processing: For each dependency within a transaction, wrap calls in a circuit breaker. If, for example, the Transaction Service needs to verify account balances and the balance service is unavailable, the circuit breaker will prevent further requests to it, avoiding a backlog.
  • Cross-Service Dependencies: If the Transaction Service needs to interact with other services (e.g., fraud detection, payment authorization), circuit breakers ensure that failures in these services don’t disrupt the entire transaction processing workflow.
  • Real-Time Alerts and Monitoring: Circuit breakers can be integrated with monitoring tools to alert administrators when certain services are unreachable, allowing proactive issue resolution.

Example
* Check Account Balance: If the balance service is down, the circuit breaker opens, and the Transaction Service can either return an error or use a cached balance (if available).
* Fraud Check: If fraud detection is unavailable, the circuit breaker will prevent retries, and the Transaction Service might flag the transaction for later review rather than delaying or failing the operation.
* Payment Processor: If the payment processor is down, the Transaction Service quickly returns an error response, allowing the user to retry later instead of experiencing extended delays.

Circuit breakers prevent cascading failures, improve response times by failing fast, enable graceful degradation, and support automatic recovery.
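
The behaviour described above can be sketched with a small class. This is a simplified illustration, not a production implementation: the thresholds, the `CircuitBreaker` name, and the injectable `clock` parameter are all assumptions, and real deployments typically use an existing resilience library.

```python
import time

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None          # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()      # open: fail fast without calling the dependency
            self.opened_at = None      # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0              # success closes the circuit again
        return result
```

For the balance check example, `func` would call the balance service and `fallback` would return a cached balance or an error response.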

24
Q

What is a dead-letter queue, and how is it used in a banking app?

A

A dead-letter queue stores messages that could not be processed, allowing for isolation of problematic messages and improving system resilience.
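
A minimal sketch of the retry-then-park flow, using in-memory queues as stand-ins for a real broker. The retry limit of three attempts and the message shape are assumptions made for illustration.

```python
from collections import deque

MAX_ATTEMPTS = 3
main_queue = deque([{"id": 1, "body": "ok"}, {"id": 2, "body": "bad"}])
dead_letter_queue = []

def handle(message):
    if message["body"] == "bad":            # stand-in for a processing failure
        raise ValueError("cannot process")

while main_queue:
    msg = main_queue.popleft()
    try:
        handle(msg)
    except ValueError:
        msg["attempts"] = msg.get("attempts", 0) + 1
        if msg["attempts"] >= MAX_ATTEMPTS:
            dead_letter_queue.append(msg)   # park it for later inspection
        else:
            main_queue.append(msg)          # retry later
```

The failing message stops blocking the main queue after its final attempt and remains available for troubleshooting.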

25
Q

How can horizontal scaling be achieved in a banking microservices architecture?

A

By adding more instances of each microservice, each capable of handling a portion of the load, ensuring better performance under high traffic.

26
Q

What monitoring tools can help track performance in a microservices architecture for banking?

A

ELK Stack (Elasticsearch, Logstash, Kibana) or Grafana and Prometheus for real-time monitoring, logging, and alerting.

27
Q

What is distributed tracing, and why is it important in a microservices architecture?

A

Distributed tracing tracks requests across services, helping to identify latency and bottlenecks in complex microservices interactions.

27
Q

Why is connection pooling important for database performance in a high-traffic banking app?

A

Connection pooling reduces the overhead of opening and closing connections frequently, improving database throughput and performance.

28
Q

How can high availability be ensured for the primary database in a banking app?

A

By using a primary-replica (master-replica) setup where the primary handles writes and replicas handle reads, with a replica promoted through failover if the primary fails, providing redundancy and availability.

29
Q

What are some caching strategies to improve user experience in a banking app?

A

Use Redis for caching balance data, implement read-after-write consistency, and consider caching at the API Gateway level for frequently accessed data.

  • Near-Real-Time Cache Invalidation for Critical Data
    • Some banking operations, like balance updates and funds transfers, are highly sensitive to consistency. In these cases, implement a near-real-time cache invalidation mechanism to clear or update cache entries immediately after a transaction.
    • For instance, if a user makes a transfer, immediately invalidate or update the balance cache to ensure accurate information for the next balance check.
    • This approach provides reliability for sensitive information and improves trust in the app’s data accuracy.
  • Read-through-cache
    • A read-through cache automatically loads data into the cache from the database or backend service when it’s requested but not already cached.
    • In a banking app, this strategy is useful for frequently accessed but relatively static data like account information (e.g., account details, interest rates, and account types).
    • This approach reduces latency for users by caching data the first time it’s requested, improving response times on subsequent requests.
  • Write-Through Cache for Frequently Updated Data
    • In a write-through cache, data is written to both the cache and the database simultaneously, ensuring the cache always has the latest data.
    • This is beneficial for high-read, high-write items like user preferences and recent transactions.
    • By keeping frequently updated data in the cache, users experience faster load times, while the cache remains consistent with the database.
  • Client-Side Caching for Static Data
    • For data that doesn’t change frequently (e.g., terms and conditions, product details), leverage client-side caching to reduce server requests.
    • This allows the app to store static data on the user’s device, improving performance and reducing data usage, which can be important for users on mobile networks.
    • Implement versioning to refresh client-side cache when these resources are updated, ensuring users always have access to the latest information.
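
The read-through and write-through strategies above can be contrasted in a short sketch. The `Cache` class is hypothetical, and a dictionary stands in for the database.

```python
class Cache:
    def __init__(self, store: dict):
        self.store = store       # stand-in for the database
        self.data = {}
        self.loads = 0           # counts database reads

    def read_through(self, key):
        if key not in self.data:             # miss: load from the store, then cache
            self.loads += 1
            self.data[key] = self.store[key]
        return self.data[key]

    def write_through(self, key, value):
        self.store[key] = value              # write to the store...
        self.data[key] = value               # ...and keep the cache in sync

db = {"rate": "1.5%"}
cache = Cache(db)
cache.read_through("rate")
cache.read_through("rate")                   # served from the cache, no second load
cache.write_through("theme", "dark")
```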
30
Q

What is the purpose of sharding the database in a large-scale banking application?

A

Sharding distributes data across multiple database instances, reducing load on each instance and enabling horizontal scaling.

  • Enhanced Performance and Reduced Latency
    • Each shard contains only a subset of the data, meaning queries are handled by a smaller dataset, which reduces the time taken to retrieve data.
  • Improved Fault Isolation and Reliability
    • Sharding isolates data into separate databases. If one shard encounters an issue (e.g., hardware failure or data corruption), the impact is limited to only that shard, not the entire database.
  • Efficient Resource Management
    • By segmenting data, each shard requires fewer system resources (memory, CPU, and storage) than a monolithic database would. This reduces the load on each database instance and avoids the resource limitations associated with a single, large database.

Examples
  • Hash-Based Sharding (Random Distribution)
    • Distribute customer accounts or transactions by hashing the account ID or transaction ID, which spreads data evenly across shards.
    • This approach helps balance the load across shards but requires a routing layer that maps each request to the correct shard.

  • Range-Based Sharding (Data Segmentation)
    • Partition data by ranges (e.g., customer account numbers or date ranges for transactions), useful for workloads where data locality is beneficial.
    • For instance, transaction records can be sharded by date range to keep recent data in a hot shard for quick access, while older records are stored in archival shards.
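
A sketch of hash-based routing, assuming four shards and string account IDs. `shard_for` is a hypothetical helper.

```python
import hashlib

NUM_SHARDS = 4

def shard_for(account_id: str) -> int:
    # Use a stable hash; Python's built-in hash() is randomized per process for strings.
    digest = hashlib.sha256(account_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

assignments = {acct: shard_for(acct) for acct in ("acct-1", "acct-2", "acct-3")}
```

With plain modulo hashing, changing `NUM_SHARDS` remaps most keys. Consistent hashing is one common way to limit that movement.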
31
Q

Why use read replicas for the Postgres database in a banking app?

A

To offload read-heavy operations from the primary database, improving performance and allowing the primary to focus on write operations.

32
Q

What are some techniques to ensure ACID properties in a funds transfer operation?

A

Use atomic transactions in the database, employ two-phase commits if needed, and lock resources to prevent concurrent modifications.
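
The atomic-transaction part can be illustrated with a single-database sketch. SQLite is used only to keep the example runnable, the `accounts` schema is an assumption, and the credit is applied before the debit purely to show that a failed debit rolls the credit back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

def transfer(src: str, dst: str, amount: int) -> bool:
    try:
        with conn:  # one transaction: both updates commit, or neither does
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:  # CHECK failed: the credit above is rolled back
        return False

ok = transfer("A", "B", 60)
rejected = transfer("A", "B", 60)   # would overdraw A
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```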

33
Q

How does a two-phase commit protocol help in a distributed banking system?

A

It ensures atomicity across distributed services, preventing partial updates in multi-service transactions (though it can add complexity).

34
Q

What security measures should be implemented for a banking API?

A

Use HTTPS for data transmission, JWT or OAuth for authentication, role-based access control (RBAC), and data encryption at rest.

To secure a banking API, a combination of robust authentication (OAuth, MFA), encryption (TLS, AES), rate limiting, input validation, real-time monitoring, and secure integration with third-party services is necessary. Additionally, ensuring compliance with security standards and implementing fraud detection measures is critical for maintaining a secure and trustworthy API. By following these best practices, you can minimize the risk of data breaches, fraud, and other security threats in your banking system.

  • Authentication
    • OAuth 2.0: Implement OAuth 2.0 for secure, token-based authentication
    • MFA: Require MFA for users and administrators to reduce the risk of unauthorized access
    • API Keys: For service-to-service communication, use API keys that are securely stored and associated with specific API clients to control access to resources.
  • Authorization
    • Role-Based Access Control (RBAC): Implement RBAC to ensure users and applications have only the minimum permissions required.
    • Scope and Permissions: Use fine-grained access control, limiting API access based on specific user roles and API scopes.
  • Data Encryption
    • TLS (Transport Layer Security): Enforce HTTPS for all communications between clients and the API to encrypt data in transit.
    • Encryption at Rest: Sensitive data such as account information, transaction history, and user credentials should be encrypted when stored in databases or file systems
  • Rate Limiting and Throttling
    • Rate Limiting: Implement rate limiting to prevent abuse and DoS (Denial of Service) attacks.
    • Throttling: Throttling can be used to slow down traffic when rate limits are exceeded.
  • Input Validation and Sanitization
    • Sanitize Inputs: Validate and sanitize all user inputs to protect against injection attacks, including SQL injection, script injection (XSS), and other malicious payloads.
    • Output Encoding: Ensure that any data rendered back to the user (e.g., via API responses) is properly encoded to prevent injection attacks like cross-site scripting (XSS)
  • Logging and Monitoring
    • Audit Logs: Maintain detailed logs of all API calls, including successful and failed login attempts, account changes, transactions, and other critical operations.
    • Real-Time Monitoring: Implement real-time monitoring to detect unusual activities such as multiple failed login attempts, large transactions, or spikes in API requests that could indicate abuse or fraud.
    • Centralized Logging System: Use centralized logging systems like ELK Stack (Elasticsearch, Logstash, Kibana), Splunk, or CloudWatch for easier tracking and analysis of security events.
  • API Gateway and Web Application Firewall (WAF)
    • API Gateway: Place an API gateway in front of your API to handle security tasks such as routing, rate limiting, authentication, logging, and security policy enforcement
    • WAF (Web Application Firewall): Deploy a WAF to filter and monitor HTTP traffic to the API, blocking malicious requests such as SQL injection, cross-site scripting (XSS), and other attacks based on predefined rules.
  • IP Whitelisting and Geo-Blocking
    • IP Whitelisting: For sensitive API operations (e.g., admin access), restrict access to trusted IP addresses.
    • Geo-Blocking: In certain cases, you can block API access from countries or regions that are not relevant to your business, or where there are higher risks of fraud.
  • Session Management
    • Token Expiry: Use short-lived tokens for API authentication, and refresh tokens periodically to limit the risk of a token being compromised. Ensure that tokens have a reasonable expiry time (e.g., 15 minutes to 1 hour).
    • Session Timeouts: Implement session timeouts for user sessions, especially for operations that require higher levels of security (e.g., transferring funds or modifying account settings).
    • Logout Mechanism: Provide users with the ability to log out or revoke active sessions to protect their accounts.
  • Security Best Practices for External Integrations
    • Third-Party API Security: If the banking API integrates with external services (e.g., payment processors, fraud detection), ensure that these services use secure authentication methods (such as OAuth) and follow stringent security practices.
    • Limit Data Sharing: Ensure that only the necessary data is shared with third-party services, and use data anonymization or tokenization where possible to reduce the risk of exposing sensitive customer information.
  • Compliance with Security Standards
    • PCI DSS Compliance: Ensure that your API complies with PCI DSS (Payment Card Industry Data Security Standard) if dealing with payment information or card details.
    • GDPR Compliance: For applications that handle European Union residents’ data, comply with GDPR by implementing appropriate data protection measures, like consent management and data anonymization.
    • SOC 2 or ISO 27001: If applicable, ensure your system is certified for security standards like SOC 2 or ISO 27001, which indicate that your system follows recognized security controls.
35
Q

What are some observability best practices for a banking microservices system?

A

Use centralized logging, distributed tracing, real-time metrics, and alerting to quickly detect and troubleshoot issues.

36
Q

How does partitioning help optimize the Transaction table in a banking system?

A

Partitioning by time (e.g., monthly) keeps data manageable, improves query performance for recent transactions, and speeds up large table operations.

37
Q

What is the system interface design for the banking application?

A

The interface includes:
- POST createAccount: Creates an account with necessary user info, ensuring no duplicate accounts.
- PUT manageAccount: Updates user account info, validating changes.
- POST transfer: Initiates a fund transfer, handling retries to avoid double transactions.
- GET viewAccountBalances: Retrieves account balance with up-to-date data.
- GET transactionHistory: Fetches paginated transaction history using cursor-based pagination.
- Key considerations include idempotency for transfers, caching for balance retrieval, and strict authentication for sensitive data access.

38
Q

How can consistency be maintained when viewing recent account balances?

A

Use read-after-write consistency with caching strategies like:
- Cache Invalidation: Invalidate or update cache on balance change, ensuring latest values.
- Direct Database Reads: For critical operations, read from the primary database instead of cache.
- These techniques help avoid stale data when displaying balances after transactions.

39
Q

What are the main microservices in the banking app’s high-level design?

A

1) Account Management Service: Manages account creation, updates, and deletion.
2) Transaction Service: Manages fund transfers, balance checks, and idempotency for transactions.
3) Notification Service: Sends notifications asynchronously for events like successful transactions.

40
Q

What is a Two-Phase Commit (2PC), and how is it used?

A

2PC is a protocol ensuring distributed transactions are atomic:
- Phase 1 (Prepare): Services prepare for a transaction and confirm readiness.
- Phase 2 (Commit/Rollback): Services commit or rollback based on responses.
- Useful for cross-service consistency, though it can add complexity and potential latency.
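
The two phases can be sketched with an in-process coordinator. This is a toy model under strong assumptions: it ignores timeouts, crashes, and durable logging, which are the hard parts of real 2PC, and the `Participant` class is hypothetical.

```python
class Participant:
    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self) -> bool:           # phase 1: vote
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:            # phase 2: make the change durable
        self.state = "committed"

    def rollback(self) -> None:          # phase 2: undo prepared work
        self.state = "aborted"

def two_phase_commit(participants) -> bool:
    votes = [p.prepare() for p in participants]   # ask everyone before deciding
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False
```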

40
Q

What is the purpose of the Transaction Service in a banking application?

A

The Transaction Service handles:
- Fund transfers and balance verification.
- Ensures atomic transactions for consistency.
- Handles duplicate transactions by checking unique transaction IDs, ensuring no double-withdrawals.

41
Q

What is the Saga Pattern, and when would it be used in a banking app?

A

The Saga Pattern manages distributed transactions by coordinating smaller, independent transactions:
- Each service performs part of the workflow; on failure, compensating actions roll back partial changes.
- This pattern suits banking operations that span several services, such as fund transfers, where a single atomic rollback is not available.
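
A minimal sketch of compensating actions, assuming each step is paired with an undo function. `run_saga` and the step functions are hypothetical, and a real saga would also persist its progress so it can resume after a crash.

```python
def run_saga(steps) -> bool:
    """steps: list of (action, compensation) callables, run in order."""
    done = []
    for action, compensate in steps:
        try:
            action()
            done.append(compensate)
        except Exception:
            for undo in reversed(done):    # compensate completed steps, newest first
                undo()
            return False
    return True

balances = {"A": 100, "B": 0}

def debit():
    balances["A"] -= 40

def refund():
    balances["A"] += 40

def credit():
    raise RuntimeError("destination service unavailable")

succeeded = run_saga([(debit, refund), (credit, lambda: None)])
```

The failed credit triggers the refund, so the source account ends where it started.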

41
Q

How could the database be partitioned in a large-scale banking system?

A

Partition by Account ID or time (e.g., monthly) for large tables:
- Account-based partitioning allows isolation of user-specific data.
- Time-based partitioning improves performance for recent transaction queries and maintenance.

42
Q

How should authentication and authorization be handled in a banking app?

A
  • JWT or OAuth tokens for secure, stateless user authentication.
  • Role-based access control (RBAC) for different user permissions.
  • Encrypted communications (HTTPS) to protect data in transit.

Multi-Factor Authentication (MFA)
Password Policies
Least Privilege Principle
Fine-grained Authorization
Time-based Access Control
Audit Logs
Data Encryption

  • User Login:
    • The user enters their username and password.
    • If the credentials are valid, MFA is triggered (e.g., OTP sent to the user’s phone).
    • The system checks if the user’s device is trusted or if there are any anomalies (e.g., a login attempt from a new device).
    • If successful, the system generates a session token (with an expiration time).
  • Performing a Transaction:
    • The user requests to transfer money.
    • The system checks if the user has the necessary balance and if the requested transaction falls within their allowed limits.
    • The system validates the transaction through a second authentication step (e.g., biometric verification).
    • The system checks authorization (e.g., does the user have permission to make such a transfer?).
    • The transaction is logged and processed.
  • Admin Access:
    • An admin logs into the system with additional checks (e.g., stronger authentication, restricted to specific IPs).
    • The system verifies the admin’s role and grants access to sensitive data or settings based on the least privilege principle.
43
Q

What is CQRS (Command Query Responsibility Segregation), and when would it be useful?

A

CQRS separates write operations (commands) from read operations (queries):
- Useful in systems with distinct read/write requirements or complex business logic.
- In a banking app, CQRS could optimize balance queries by separating them from account updates.

44
Q

What is Event Sourcing, and when would it be used in a banking system?

A

Event Sourcing stores state changes as a series of events:
- Useful for retaining a full history of changes, enabling better auditing.
- In banking, it could be applied to transaction history, preserving every balance update as an event.
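
As a small illustration, the current balance can be derived by replaying the event log. The event names `Deposited` and `Withdrawn` are assumptions chosen for this sketch.

```python
events = [
    {"type": "Deposited", "amount": 100},
    {"type": "Withdrawn", "amount": 30},
    {"type": "Deposited", "amount": 5},
]

def replay(event_log) -> int:
    """Fold the events into a balance; the log itself is never modified."""
    balance = 0
    for event in event_log:
        if event["type"] == "Deposited":
            balance += event["amount"]
        elif event["type"] == "Withdrawn":
            balance -= event["amount"]
    return balance

current_balance = replay(events)         # 75
balance_after_two = replay(events[:2])   # 70, the state at an earlier point in time
```

Replaying a prefix of the log reconstructs any past state, which is what makes the approach useful for auditing.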

45
Q

What strategies can ensure high availability for the database in a banking app?

A
  • Primary-replica setup: Primary handles writes, replicas handle read-heavy operations.
  • Auto-failover: Automatic failover to a secondary replica in case of primary failure.
  • Partitioning and sharding: Distributes data across multiple nodes for improved resilience.
46
Q

Why would a message queue (e.g., Kafka or RabbitMQ) be used in a banking application?

A

Message queues provide:
- Asynchronous processing for non-critical tasks like notifications.
- Retries for failed messages, ensuring reliable event delivery.
- Decoupling between services, allowing independent service scaling.

47
Q

What is a dead-letter queue, and how does it help in a banking app?

A

A dead-letter queue stores messages that could not be processed:
- Allows for isolation of problematic messages.
- Helps identify and troubleshoot issues with transactions, ensuring system resilience.

48
Q

How does partitioning the database by account ID or date improve performance?

A

Partitioning reduces the search space for queries:
- Account ID partitioning allows efficient access to user-specific data.
- Time-based partitioning makes recent transactions faster to query, especially beneficial for retrievals and maintenance.

49
Q

How does cursor-based pagination work?

A

Cursor-based pagination uses a “cursor” pointing to the last item retrieved:
- Instead of requesting a page number, clients request items after a specific cursor ID.
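- Reduces database load compared to offset-based pagination: given an index on the cursor column, the database seeks directly to the cursor position instead of scanning and discarding skipped rows.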

50
Q

What are key considerations in designing an API Gateway for a banking app?

A

The API Gateway should handle:
- Authentication and authorization for secure access.
- Rate limiting to prevent misuse or DDoS attacks.
- Request routing to the appropriate microservices.

51
Q

What is read-after-write consistency, and why is it important for balance data?

A

Read-after-write consistency ensures that a read issued after a write reflects that write:
- Important for balances so users see accurate information post-transaction.
- Can be achieved with cache invalidation or reading directly from the primary database.

51
Q

How can the Notification Service be designed to work asynchronously in a banking app?

A
  • Use a message queue to receive notifications from other services.
  • Consumers in the Notification Service read from the queue and send alerts (e.g., emails, SMS).
  • Ensures reliability, even if the Notification Service experiences temporary delays.
52
Q

What are common strategies for cache invalidation to ensure up-to-date data?

A
  • Time-based expiration: Set TTL for cache entries to refresh data periodically.
  • Event-based invalidation: Clear or update cache on specific events, such as a transaction completion.
  • Versioning: Use cache versions to ensure outdated data is not accessed.
53
Q

How does Redis caching help in a banking application?

A

Redis can store frequently accessed data, like account balances, reducing load on the main database:
- Low-latency reads improve response time for common queries.
- Cache invalidation strategies maintain data consistency.

54
Q

What are the primary indexes needed for the Account table?

A
  • Primary Key on accountId for unique identification.
  • Unique Index on email to prevent duplicate accounts.
  • Index on lastLogin for querying recent activity.
54
Q

What are the primary indexes needed for the Transaction table?

A
  • Primary Key on transactionId.
  • Composite Indexes on (fromAccountId, timestamp) and (toAccountId, timestamp) for efficient history queries.
  • Index on timestamp for time-based pagination.
54
Q

What are some key benefits of using microservices in a banking app?

A
  • Scalability: Each service scales independently.
  • Fault isolation: Failures are contained within each service, improving resilience.
  • Ease of maintenance: Clear separation of concerns aids code management and updates.
54
Q

What is distributed tracing, and why is it important for microservices?

A

Distributed tracing tracks request flows across services:
- Helps identify latency, bottlenecks, and failure points.
- Useful for troubleshooting in complex service interactions.

54
Q

How does CQRS improve system performance?

A

CQRS separates command (write) and query (read) responsibilities:
- Optimizes for scalability by allowing independent tuning of read and write pathways.
- In banking, CQRS allows high-frequency balance reads without impacting account updates.
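
A minimal in-process sketch of the separation, assuming a command side that emits events and a query side that maintains a denormalized balance view. The class names and the `BalanceChanged` event shape are hypothetical.

```python
class AccountWriteModel:
    """Command side: validates and applies changes, then emits events."""

    def __init__(self):
        self.balances = {}
        self.subscribers = []

    def handle_deposit(self, account_id, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.balances[account_id] = self.balances.get(account_id, 0) + amount
        event = {
            "type": "BalanceChanged",
            "account_id": account_id,
            "balance": self.balances[account_id],
        }
        for notify in self.subscribers:
            notify(event)


class BalanceReadModel:
    """Query side: a denormalized view that only serves reads."""

    def __init__(self):
        self.view = {}

    def apply(self, event):
        if event["type"] == "BalanceChanged":
            self.view[event["account_id"]] = event["balance"]

    def get_balance(self, account_id):
        return self.view.get(account_id, 0)


writes = AccountWriteModel()
reads = BalanceReadModel()
writes.subscribers.append(reads.apply)

writes.handle_deposit("acc-1", 100)
writes.handle_deposit("acc-1", 50)
balance = reads.get_balance("acc-1")
```

Here events are delivered synchronously, so the read view is always current. In a real system they would usually travel over a queue, and the read side could briefly lag behind the write side.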

54
Q

How do load balancers improve the banking system’s availability?

A

Load balancers distribute requests across multiple instances:
- Prevents overload on any single instance.
- Supports health checks to route requests away from failed instances.
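
A sketch of round-robin distribution with health-based routing. `RoundRobinBalancer` and the instance names are hypothetical, and `mark_down` / `mark_up` stand in for the result of a periodic health check.

```python
import itertools


class RoundRobinBalancer:
    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._cycle = itertools.cycle(self.instances)

    def mark_down(self, instance):
        self.healthy.discard(instance)

    def mark_up(self, instance):
        if instance in self.instances:
            self.healthy.add(instance)

    def next_instance(self):
        # Try each instance at most once per call, skipping unhealthy ones.
        for _ in range(len(self.instances)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")


lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
first_round = [lb.next_instance() for _ in range(3)]
lb.mark_down("app-2")  # a health check failed for app-2
after_failure = [lb.next_instance() for _ in range(4)]
```

After `app-2` is marked down, requests alternate between the two remaining instances until it is marked up again.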

55
Q

Why might a banking app use eventual consistency, and where?

A

Eventual consistency allows more flexible scalability at the cost of slight delays:
- Suitable for non-critical data, like certain analytics, where real-time updates aren’t essential.
- Not typically used for balances or sensitive operations.

55
Q

What security measures are essential for a banking app?

A
  • Encryption: Use HTTPS (TLS) for data in transit, and encrypt sensitive data at rest.
  • Authentication: Use a standard login flow such as OAuth 2.0 with OpenID Connect, with signed tokens (e.g., JWTs) identifying the user on each request.
  • Access Control: Role-based access control (RBAC) ensures users access only allowed resources.
55
Q

What monitoring tools can aid in observing a banking app’s microservices?

A
  • ELK Stack (Elasticsearch, Logstash, Kibana): Useful for centralized logging and error tracking.
  • Prometheus and Grafana: For real-time metrics and visualization.
  • Jaeger: Distributed tracing tool to track request flows across services.
56
Q

How does sharding help in managing large-scale database loads?

A

Sharding splits data across multiple nodes, enabling:
- Improved performance by distributing read/write operations.
- Scalability by adding more shards as data grows.
- Banking apps might shard by account range or region.
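
A sketch of range-based shard routing, assuming numeric account IDs from 0 up to 20 million split into four equal ranges. The boundaries and shard names are hypothetical.

```python
import bisect

# Exclusive upper bound of each shard's accountId range (hypothetical values).
SHARD_UPPER_BOUNDS = [5_000_000, 10_000_000, 15_000_000, 20_000_000]
SHARD_NAMES = ["shard-0", "shard-1", "shard-2", "shard-3"]


def shard_for_account(account_id):
    """Return the shard whose range contains account_id."""
    if account_id < 0 or account_id >= SHARD_UPPER_BOUNDS[-1]:
        raise ValueError(f"accountId {account_id} is outside every shard range")
    index = bisect.bisect_right(SHARD_UPPER_BOUNDS, account_id)
    return SHARD_NAMES[index]


low = shard_for_account(0)
boundary = shard_for_account(5_000_000)
high = shard_for_account(19_999_999)
```

Range sharding keeps neighboring IDs together, which helps range scans but can create hot shards if new accounts all land in the last range. Hashing the account ID is a common alternative when an even spread matters more.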

56
Q

What role does a circuit breaker play in microservices reliability?

A

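A circuit breaker tracks failed calls to a downstream service and, once failures cross a threshold, stops sending it requests for a cooldown period:
- Reduces load on troubled services and prevents cascading failures.
- Enables graceful degradation (for example, a fast error or fallback response instead of a long timeout), improving user experience.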
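
A simplified sketch of the closed / open / half-open cycle. The `CircuitBreaker` class, its thresholds, and the injectable `clock` are assumptions for illustration and do not reflect any particular library's API.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the service."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, so let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        self.failures = 0
        self.opened_at = None  # success closes the circuit
        return result


now = [0.0]  # fake clock so the example does not need to sleep
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])


def flaky():
    raise ConnectionError("downstream unavailable")


outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("fast-fail")
    except ConnectionError:
        outcomes.append("error")

now[0] = 11.0  # the cooldown has passed
recovered = breaker.call(lambda: "ok")
```

The third call never reaches the failing service, which is the load reduction the card describes.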

56
Q

What is the approximate storage requirement for account data if each account record takes up 136 bytes and there are 20 million accounts?

A

Around 2.72 GB (20 million accounts * 136 bytes per account).

56
Q

How many accounts are expected in the system if there are 10 million users and each user has an average of 2 accounts?

A

20 million accounts (10 million users * 2 accounts per user).

56
Q

How many transactions are expected in the system per day if each user makes 2 transactions daily?

A

20 million transactions per day (10 million users * 2 transactions per user per day).

57
Q

How many peak transactions per second (TPS) should the system handle if 10% of daily transactions occur during peak hours over a 2-hour period?

A

Approximately 278 TPS (10% of 20 million = 2 million transactions during peak / 7,200 seconds).

58
Q

What is the storage requirement for transaction data if each transaction takes up 41 bytes, with 20 million transactions daily, and data is retained for 3 years?

A

Approximately 897 GB (820 MB per day * 365 days * 3 years).

59
Q

What components of account data are used to calculate storage requirements?

A

Components include accountId (8 bytes), userId (8 bytes), balance (8 bytes), accountType (4 bytes), creationTime (8 bytes), and additional metadata (~100 bytes).

60
Q

What components of transaction data are used to calculate storage requirements?

A

Components include transactionId (8 bytes), fromAccountId and toAccountId (16 bytes), amount (8 bytes), timestamp (8 bytes), and isSuccessful status (1 byte).

61
Q

How much memory is required to cache account balances for 10% of accounts (2 million) if each balance requires 8 bytes?

A

16 MB (2 million accounts * 8 bytes per balance).

62
Q

How much memory is required to cache recent transactions for active accounts, assuming 10% of accounts (2 million) each have 60 recent transactions per month, with each transaction being 41 bytes?

A

Approximately 4.92 GB (2 million accounts * 60 transactions * 41 bytes per transaction).

62
Q

How much memory is required for the message queue (e.g., Kafka) if each transaction message is 100 bytes, peak TPS is about 278 (2 million peak transactions / 7,200 seconds), and messages are retained for 1 hour?

A

About 100 MB (278 TPS * 100 bytes * 3,600 seconds).

63
Q

What is the approximate total storage required for account and transaction data?

A

Around 900 GB (2.72 GB for accounts + 897 GB for transactions over 3 years).

64
Q

What is the approximate total memory required for caching and message queuing?

A

Around 5.04 GB (4.936 GB for caching + about 100 MB for the message queue, which is 278 peak TPS * 100 bytes * 3,600 seconds).
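
The estimates in these cards can be reproduced with a short script. It is a sketch that takes the inputs stated in the questions (10 million users, 2 accounts and 2 transactions per user, 10% of daily traffic in a 2-hour peak, 136-byte accounts, 41-byte transactions, 100-byte messages) and uses decimal units (1 GB = 10^9 bytes). The variable names are chosen for this example. With these inputs the peak rate works out to about 278 TPS and the one-hour queue buffer to about 100 MB.

```python
USERS = 10_000_000
ACCOUNTS_PER_USER = 2
TX_PER_USER_PER_DAY = 2
PEAK_SHARE_PERCENT = 10          # share of daily transactions in the peak window
PEAK_WINDOW_SECONDS = 2 * 3600
ACCOUNT_BYTES = 136
TRANSACTION_BYTES = 41
RETENTION_DAYS = 3 * 365
CACHED_ACCOUNT_PERCENT = 10      # share of accounts kept in cache
BALANCE_BYTES = 8
RECENT_TX_PER_ACCOUNT = 60
MESSAGE_BYTES = 100
QUEUE_RETENTION_SECONDS = 3600

MB = 10**6
GB = 10**9

accounts = USERS * ACCOUNTS_PER_USER
daily_tx = USERS * TX_PER_USER_PER_DAY
peak_tx = daily_tx * PEAK_SHARE_PERCENT // 100
peak_tps = peak_tx / PEAK_WINDOW_SECONDS

account_storage_gb = accounts * ACCOUNT_BYTES / GB
tx_storage_gb = daily_tx * TRANSACTION_BYTES * RETENTION_DAYS / GB

cached_accounts = accounts * CACHED_ACCOUNT_PERCENT // 100
balance_cache_mb = cached_accounts * BALANCE_BYTES / MB
tx_cache_gb = cached_accounts * RECENT_TX_PER_ACCOUNT * TRANSACTION_BYTES / GB
queue_mb = peak_tps * MESSAGE_BYTES * QUEUE_RETENTION_SECONDS / MB

total_storage_gb = account_storage_gb + tx_storage_gb
total_memory_gb = balance_cache_mb / 1000 + tx_cache_gb + queue_mb / 1000

print(f"accounts: {accounts:,}, daily transactions: {daily_tx:,}")
print(f"peak TPS: {peak_tps:.0f}")
print(f"storage: {total_storage_gb:.1f} GB, memory: {total_memory_gb:.2f} GB")
```

These figures exclude replication, indexes, and per-entry overhead in the cache and queue, so they are lower bounds rather than provisioning targets.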

65
Q

Why might partitioning be used for transaction data in this system design?

A

Partitioning can help manage and query large volumes of transaction data, possibly by year or account ID range, to improve performance and scalability.

66
Q

What additional considerations are needed for high availability in this system design?

A

Replication is necessary for each component (e.g., Redis, Kafka) to achieve high availability. Storage and memory requirements scale with the replication factor: one extra replica doubles them, and two extra replicas triple them.

67
Q

AWS Cognito

A

User Authentication and Management:

Cognito handles user sign-up, sign-in, password reset, and MFA (multi-factor authentication) out of the box.
It also supports federated authentication, so you can integrate external identity providers such as Google, Facebook, or corporate SSO (via SAML).

Token Issuance:

After a successful login, Cognito issues tokens for the user: an ID token and an access token (both JWTs), plus a refresh token.
These tokens enable stateless authentication and authorization across the application: the backend only needs to validate the token rather than manage credentials or sessions itself.

Authorization:

Cognito integrates with IAM (Identity and Access Management) to define roles and permissions. You can set up role-based access control (RBAC) where each user or group has specific permissions.
Group membership and other claims can be included in the JWT, and your API Gateway or backend services can check the token’s claims to validate authorization.

Scaling and Security:

Cognito is a fully managed service, so it scales automatically without you needing to manage the underlying infrastructure.
It comes with built-in security features like encryption and DDoS protection, and supports compliance with HIPAA, GDPR, and SOC 2.

68
Q

AWS Cognito flow

A

User Authentication (Cognito):

  1. The user logs in via Cognito (using a username/password, social login, or SSO).
     Upon successful login, Cognito issues JWT tokens for the user.

API Gateway and Token Validation:

  2. API Gateway receives requests with the JWT in the Authorization header.
     It validates the token by checking its signature against the user pool’s public keys, along with claims such as expiry and issuer.

Backend Services:

  3. Backend services (like Account Service, Transaction Service) extract the user identity and roles directly from the validated JWT and process the request based on that.

Session Management:

  4. If you need session persistence or additional caching (e.g., user profiles), you can use something like Redis to cache certain user information, but the core authentication is handled by Cognito.
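
The validation and claim-extraction steps can be sketched with the standard library alone. Note the loud assumption: this example signs and verifies with HS256 and a shared secret to stay self-contained, whereas Cognito signs its tokens with RS256 and verification uses the user pool's published public keys. The function names and the demo secret are hypothetical; `sub`, `exp`, and `cognito:groups` are claim names that appear in Cognito tokens.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def sign_token(claims, secret):
    """Build an HS256 JWT (demo only; stands in for the identity provider)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url_encode(signature)}"


def verify_token(token, secret, now=None):
    """Check algorithm, signature, and expiry, then return the claims."""
    now = time.time() if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    if claims.get("exp", 0) <= now:
        raise ValueError("token expired")
    return claims


SECRET = b"demo-secret"  # hypothetical shared secret for this sketch only
token = sign_token(
    {"sub": "user-123", "cognito:groups": ["customer"], "exp": 2000}, SECRET
)
claims = verify_token(token, SECRET, now=1000)
user_id, groups = claims["sub"], claims["cognito:groups"]
```

A backend service would run the equivalent of `verify_token` (or trust that the gateway already did) and then authorize the request from `user_id` and `groups`. In production a maintained JWT library is a safer choice than hand-rolled verification.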