Create long-running applications Flashcards
Designing reliable applications
1) Reliability is the probability that a system functions correctly during any given period of time.
2) Azure provides features that help you handle reliability challenges.
Fault domains and update domains
1) Hardware failures are unavoidable.
2) To help you to cope with such failures, Azure introduces two concepts: fault domain and update domain.
Fault domain
1) A fault domain is a group of resources that can fail at the same time.
2) You can distribute service instances evenly across multiple fault domains so that they won’t all fail at the same time because of a hardware failure.
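The even distribution described above can be sketched as a round-robin placement. This is only an illustration: the function name, the instance names, and the fault domain count are hypothetical, and Azure performs the actual placement for you.

```python
from collections import defaultdict

def assign_fault_domains(instances, fault_domain_count):
    """Spread instances round-robin so fault domain sizes differ by at most one."""
    placement = defaultdict(list)
    for index, instance in enumerate(instances):
        placement[index % fault_domain_count].append(instance)
    return dict(placement)

# Hypothetical example: five instances across two fault domains.
placement = assign_fault_domains(["web-0", "web-1", "web-2", "web-3", "web-4"], 2)
```

With this placement, losing either fault domain still leaves at least two instances running.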
Update domain
1) An update domain is a logical group of resources that can be updated simultaneously during system upgrades.
2) When Azure updates a service, it doesn’t bring down all instances at the same time.
3) Instead, it performs a rolling update via an update domain walk.
4) Service instances in different update domains are brought down group by group for updates.
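A minimal sketch of an update domain walk, assuming instances are already grouped by update domain number. The function and the upgrade callback are hypothetical stand-ins, not an Azure API.

```python
def update_domain_walk(instances_by_domain, upgrade):
    """Upgrade one update domain at a time, in domain order.

    Instances in other domains stay online while one group is upgraded.
    """
    walked = []
    for domain in sorted(instances_by_domain):
        group = instances_by_domain[domain]
        for instance in group:
            upgrade(instance)  # only this group is down at this point
        walked.append((domain, list(group)))
    return walked

# Hypothetical example: record the order in which instances are upgraded.
upgraded = []
walk = update_domain_walk({1: ["b"], 0: ["a", "c"]}, upgraded.append)
```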
Transient errors
Transient errors are caused by temporary conditions such as network fluctuations, service overload, and request throttling. They are elusive: they happen randomly and can’t be reliably reproduced. A typical way to handle a transient error is to retry the operation a few times.
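The retry approach can be sketched as follows. The TransientError class and the call_with_retry helper are hypothetical names, and the exponential backoff with jitter between attempts is a common refinement that the notes above do not mention.

```python
import random
import time

class TransientError(Exception):
    """Hypothetical marker for errors that are worth retrying."""

def call_with_retry(operation, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation, retrying on TransientError up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # give up: the error may not be transient after all
            # Assumption: back off exponentially, plus jitter to avoid retry storms.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, base_delay))
```

Errors that are not transient (for example, a rejected credential) are deliberately not caught, because retrying them only adds load.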
Loose coupling
1) Loosely coupled components enable dynamic scaling and load leveling.
2) Components don’t have direct dependencies on one another.
3) A failing component won’t produce a ripple effect across the entire system.
4) Components can be integrated through an Azure Service Bus queue.
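A rough sketch of queue-based integration. Python's in-process queue.Queue stands in for an Azure Service Bus queue here, and the function names are hypothetical; a real solution would use the Service Bus SDK instead.

```python
import queue

def enqueue_orders(order_queue, order_ids):
    """Producer: knows only about the queue, not about any consumer."""
    for order_id in order_ids:
        order_queue.put(order_id)

def drain_orders(order_queue, handle):
    """Consumer: processes waiting messages; one bad message does not stop the rest."""
    processed, failed = [], []
    while True:
        try:
            order_id = order_queue.get_nowait()
        except queue.Empty:
            return processed, failed
        try:
            handle(order_id)
            processed.append(order_id)
        except ValueError:
            failed.append(order_id)
```

Because the producer only writes to the queue, it keeps working while the consumer is slow or offline, and the queue absorbs bursts (load leveling).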
Health monitoring
1) Azure Diagnostics and Application Insights
Workload of an application
Two typical workload change patterns: gradual changes (scaling up or out) and spikes.
Dynamic scaling
1) Predictable (for example, weekend demand)
2) Unpredictable: a news website might experience unexpected spikes when breaking news occurs.
Two major autoscaling methods
1) Scheduled scaling: suitable for expected workload changes.
2) Reactive scaling: suitable for unexpected workload changes. It monitors certain system metrics and adjusts system capacity when those metrics reach certain thresholds.
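The two methods can be sketched side by side. All names, instance counts, and thresholds below are illustrative assumptions, not Azure autoscale settings.

```python
import datetime

def scheduled_capacity(day, weekday_instances=2, weekend_instances=4):
    """Scheduled scaling: capacity depends only on the calendar."""
    # date.weekday() returns 5 for Saturday and 6 for Sunday.
    return weekend_instances if day.weekday() >= 5 else weekday_instances

def reactive_capacity(current_instances, cpu_percent,
                      scale_out_at=75.0, scale_in_at=25.0,
                      min_instances=1, max_instances=10):
    """Reactive scaling: capacity follows a monitored metric."""
    if cpu_percent >= scale_out_at:
        return min(current_instances + 1, max_instances)
    if cpu_percent <= scale_in_at:
        return max(current_instances - 1, min_instances)
    return current_instances
```

Keeping a gap between the scale-out and scale-in thresholds helps avoid flapping when the metric hovers near a single value.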
A practical challenge of reactive scaling
is the latency in provisioning new resources. Provisioning a new VM and deploying a new service instance take time. So, when you design your reactive-scaling solution, you need to leave enough headroom for new resources to be brought online.
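One way to reason about that margin is a back-of-the-envelope calculation. The sketch below assumes load grows roughly linearly while new capacity is being provisioned, which is a simplification; the function name and all numbers are hypothetical.

```python
def scale_out_threshold(saturation_percent, growth_percent_per_minute,
                        provisioning_minutes):
    """Return the utilization at which scale-out must start so that new
    capacity is online before the system saturates.

    Assumption: utilization rises linearly during provisioning.
    """
    headroom = growth_percent_per_minute * provisioning_minutes
    return max(saturation_percent - headroom, 0.0)

# Hypothetical numbers: saturation at 90%, load rising 2% per minute,
# and 10 minutes to provision a VM and deploy an instance.
threshold = scale_out_threshold(90.0, 2.0, 10.0)
```

Under these assumed numbers, scale-out has to trigger at 70% utilization rather than at 90%.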
Containers to avoid latency
1) Container technologies such as Docker let you package workloads in lightweight images, which you can deploy and start very quickly.
2) Such agility opens new possibilities for reactive scaling by greatly reducing the provisioning latency you need to account for.
Workload partitioning
1) The total workload is sliced into small portions, and each portion is assigned to a number of designated instances.
2) Workload partitioning has several advantages over homogeneous instances, among them tenant isolation.
Tenant isolation
You can route workloads for a certain tenant to a designated group of instances instead of routing them randomly to any instance in a bigger instance pool.
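A minimal routing sketch under assumed names: the tenant IDs, instance names, and the fallback to a shared pool for tenants without a designated group are all hypothetical.

```python
import random

# Hypothetical mapping from tenant to its designated instance group.
TENANT_GROUPS = {
    "contoso": ["vm-a1", "vm-a2"],
    "fabrikam": ["vm-b1"],
}
# Assumption: tenants without a designated group share a common pool.
SHARED_POOL = ["vm-s1", "vm-s2", "vm-s3"]

def route(tenant_id, tenant_groups=TENANT_GROUPS, shared_pool=SHARED_POOL,
          choose=random.choice):
    """Pick an instance from the tenant's own group, else from the shared pool."""
    instances = tenant_groups.get(tenant_id, shared_pool)
    return choose(instances)
```

Requests for "contoso" only ever reach vm-a1 or vm-a2, so that group can be monitored, sized, or updated on its own.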
New scenarios enabled by tenant isolation
1) per-tenant monitoring
2) tiered service offerings
3) independent updates to different tenants.