From e934dc60f4fbcc2d8ddd9b3240227bebab908291 Mon Sep 17 00:00:00 2001
From: Kamran Ahmed
Date: Thu, 19 Jan 2023 20:50:14 +0400
Subject: [PATCH] Add content for reliability patterns

---
 .../103-reliability-patterns/100-availability/index.md | 5 ++---
 .../101-high-availability/index.md | 5 ++---
 .../103-reliability-patterns/102-resiliency/index.md | 8 ++++++--
 .../103-reliability-patterns/index.md | 5 ++---
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/100-availability/index.md b/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/100-availability/index.md
index 15d0864f7..2d7cff828 100644
--- a/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/100-availability/index.md
+++ b/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/100-availability/index.md
@@ -1,8 +1,7 @@
 # Availability
 
-Availability refers to the ability of a system to perform its intended function without interruption. High availability is desired as it means that the system is less likely to experience downtime, and when it does, it can quickly recover. To increase the availability of a system, several methods can be used such as Redundancy, Load balancing, Failover, Monitoring, and Automated recovery.
+Availability is measured as a percentage of uptime, and defines the proportion of time that a system is functional and working. Availability is affected by system errors, infrastructure problems, malicious attacks, and system load. Cloud applications typically provide users with a service level agreement (SLA), which means that applications must be designed and implemented to maximize availability.
 
 To learn more visit the following links:
 
-- [System Design: Availability](https://dev.to/karanpratapsingh/system-design-availability-38bd)
-- [Concept of Availability in system design](https://www.enjoyalgorithms.com/blog/availability-system-design-concept)
\ No newline at end of file
+- [Availability Patterns](https://learn.microsoft.com/en-us/azure/architecture/framework/resiliency/reliability-patterns#availability)
\ No newline at end of file
diff --git a/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/101-high-availability/index.md b/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/101-high-availability/index.md
index c371c4908..6f96b39d5 100644
--- a/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/101-high-availability/index.md
+++ b/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/101-high-availability/index.md
@@ -1,8 +1,7 @@
 # High availability
 
-High availability refers to the ability of a system to continue operating even in the event of a failure or outage. This is often achieved by designing the system to be redundant, meaning that multiple copies of the system are running at the same time, and if one copy fails, the others can take over. It can be achieved by using Redundancy, Load balancing, and Failover. It can be measured using metrics such as Mean Time Between Failures (MTBF), Mean Time To Recovery (MTTR) and Availability.
+Azure infrastructure is composed of geographies, regions, and Availability Zones, which limit the blast radius of a failure and therefore limit potential impact to customer applications and data. The Azure Availability Zones construct was developed to provide a software and networking solution to protect against datacenter failures and to provide increased high availability (HA) to our customers. With HA architecture there is a balance between high resilience, low latency, and cost.
 
 Learn more from the following links:
 
-- [What is High availability (HA)?](https://www.techtarget.com/searchdatacenter/definition/high-availability)
-- [Introduction to High Availability Architecture](https://www.filecloud.com/blog/an-introduction-to-high-availability-architecture/)
\ No newline at end of file
+- [High availability Patterns](https://learn.microsoft.com/en-us/azure/architecture/framework/resiliency/reliability-patterns#high-availability)
\ No newline at end of file
diff --git a/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/102-resiliency/index.md b/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/102-resiliency/index.md
index 213e1cc24..faa9de45a 100644
--- a/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/102-resiliency/index.md
+++ b/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/102-resiliency/index.md
@@ -1,7 +1,11 @@
 # Resilience
 
-Resilience refers to the ability of a system to withstand and recover from disruptions, failures or unexpected conditions. It means the system can continue to function and provide service even when faced with stressors such as high traffic, failures or unexpected changes. Resilience can be achieved by designing the system to be redundant, fault-tolerant, scalable, having automatic recovery, and monitoring and alerting mechanisms. It can be measured by Recovery Time Objective (RTO), Recovery Point Objective (RPO), Mean time to failure (MTTF), and Mean time to recovery (MTTR).
+Resiliency is the ability of a system to gracefully handle and recover from failures, both inadvertent and malicious.
+
+The nature of cloud hosting, where applications are often multi-tenant, use shared platform services, compete for resources and bandwidth, communicate over the Internet, and run on commodity hardware means there is an increased likelihood that both transient and more permanent faults will arise. The connected nature of the internet and the rise in sophistication and volume of attacks increase the likelihood of a security disruption.
+
+Detecting failures and recovering quickly and efficiently, is necessary to maintain resiliency.
 
 Learn more from the following links:
 
-- [System Resilience: What Exactly is it?](https://insights.sei.cmu.edu/blog/system-resilience-what-exactly-is-it/)
\ No newline at end of file
+- [Resiliency Patterns](https://learn.microsoft.com/en-us/azure/architecture/framework/resiliency/reliability-patterns#resiliency)
\ No newline at end of file
diff --git a/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/index.md b/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/index.md
index 9b05a3dcf..cf40739a0 100644
--- a/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/index.md
+++ b/src/roadmaps/system-design/content/118-cloud-design-patterns/103-reliability-patterns/index.md
@@ -1,8 +1,7 @@
 # Reliability Patterns
 
-Reliability patterns are solutions to common problems that arise when building systems that need to be highly available and fault-tolerant. These patterns provide a way to design and implement systems that can withstand failures, maintain high levels of performance, and recover quickly from disruptions. Some common reliability patterns include Failover, Circuit Breaker, Retry, Bulkhead, Backpressure, Cache-Aside, Idempotent Operations and Health Endpoint Monitoring.
+These patterns provide a way to design and implement systems that can withstand failures, maintain high levels of performance, and recover quickly from disruptions. Some common reliability patterns include Failover, Circuit Breaker, Retry, Bulkhead, Backpressure, Cache-Aside, Idempotent Operations and Health Endpoint Monitoring.
 
 Learn more from the following links:
 
-- [Reliability Patterns: A Survey](http://laccei.org/LACCEI2019-MontegoBay/full_papers/FP53.pdf)
-- [Get started with Reliability Patterns](https://learn.microsoft.com/en-us/azure/architecture/framework/resiliency/reliability-patterns)
\ No newline at end of file
+- [Reliability Patterns](https://learn.microsoft.com/en-us/azure/architecture/framework/resiliency/reliability-patterns)
\ No newline at end of file