Add content for consistency and background jobs

pull/3331/head
Kamran Ahmed 2 years ago
parent cab06b46da
commit ca35551e4f
  1. 2
      src/components/TopicOverlay.astro
  2. 3
      src/roadmaps/system-design/content/104-consistency-patterns/100-weak-consistency.md
  3. 3
      src/roadmaps/system-design/content/104-consistency-patterns/101-eventual-consistency.md
  4. 2
      src/roadmaps/system-design/content/104-consistency-patterns/102-strong-consistency.md
  5. 21
      src/roadmaps/system-design/content/105-availability-patterns/100-fail-over.md
  6. 31
      src/roadmaps/system-design/content/105-availability-patterns/101-replication.md
  7. 43
      src/roadmaps/system-design/content/105-availability-patterns/102-availability-in-numbers.md
  8. 31
      src/roadmaps/system-design/content/105-availability-patterns/index.md
  9. 10
      src/roadmaps/system-design/content/106-background-jobs/100-event-driven.md
  10. 14
      src/roadmaps/system-design/content/106-background-jobs/101-schedule-driven.md
  11. 4
      src/roadmaps/system-design/content/106-background-jobs/102-returning-results.md
  12. 2
      src/roadmaps/system-design/content/106-background-jobs/index.md

@ -48,7 +48,7 @@ const githubLink = `https://github.com/kamranahmedse/developer-roadmap/tree/mast
<div
id='topic-content'
class='prose prose-h1:mt-7 prose-h1:mb-2.5 prose-p:mt-0 prose-p:mb-2 prose-li:m-0 prose-li:mb-0.5 prose-h2:mb-3 prose-h2:mt-0'
class='prose prose-h1:mt-7 prose-h1:mb-2.5 prose-p:mt-0 prose-p:mb-2 prose-li:m-0 prose-li:mb-0.5 prose-h2:mb-3 prose-h2:mt-0 prose-h3:mt-[10px] prose-h3:mb-[5px]'
>
</div>

@ -1,6 +1,7 @@
# Weak Consistency
After a write, reads may or may not see it. A best effort approach is taken. This approach is seen in systems such as memcached. Weak consistency works well in real time use cases such as VoIP, video chat, and realtime multiplayer games. For example, if you are on a phone call and lose reception for a few seconds, when you regain connection you do not hear what was spoken during connection loss.
After an update is made to the data, it is not guaranteed that any subsequent read operation will immediately reflect the changes made. The read may or may not see the recent write.
To learn more, visit the following links:
- [Consistency Patterns in Distributed Systems](https://cs.fyi/guide/consistency-patterns-week-strong-eventual/)

@ -1,7 +1,6 @@
# Eventual Consistency
After a write, reads will eventually see it (typically within milliseconds). Data is replicated asynchronously. This approach is seen in systems such as DNS and email. Eventual consistency works well in highly available systems.
Eventual consistency is a form of Weak Consistency. After an update is made to the data, it will be eventually visible to any subsequent read operations. The data is replicated in an asynchronous manner, ensuring that all copies of the data are eventually updated.
To learn more, visit the following links:

@ -1,6 +1,6 @@
# Strong Consistency
After a write, reads will see it. Data is replicated synchronously. This approach is seen in file systems and RDBMSes. Strong consistency works well in systems that need transactions.
After an update is made to the data, it will be immediately visible to any subsequent read operations. The data is replicated in a synchronous manner, ensuring that all copies of the data are updated at the same time.
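The difference between synchronous and asynchronous replication can be sketched in a few lines. This is only an illustrative in-memory model, not how any particular database implements it; the `Replica` and `Primary` classes and the `sync` method are hypothetical names introduced for this example.

```python
class Replica:
    """An in-memory copy of the data (hypothetical, for illustration only)."""

    def __init__(self):
        self.data = {}


class Primary:
    """Accepts writes and forwards them to replicas."""

    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending = []  # writes not yet applied to the replicas

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Strong consistency: all replicas are updated before write() returns.
            for replica in self.replicas:
                replica.data[key] = value
        else:
            # Eventual consistency: the write is queued and applied later.
            self.pending.append((key, value))

    def sync(self):
        """Apply queued writes, standing in for asynchronous replication."""
        while self.pending:
            key, value = self.pending.pop(0)
            for replica in self.replicas:
                replica.data[key] = value
```

With `synchronous=True`, a read from any replica sees the write as soon as `write` returns. With `synchronous=False`, a replica read may miss the write until `sync` has run.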
To learn more, visit the following links:

@ -1,12 +1,26 @@
# Fail-Over
Failover is an availability pattern that is used to ensure that a system can continue to function in the event of a failure. It involves having a backup component or system that can take over in the event of a failure.
In a failover system, there is a primary component that is responsible for handling requests, and a secondary (or backup) component that is on standby. The primary component is monitored for failures, and if it fails, the secondary component is activated to take over its duties. This allows the system to continue functioning with minimal disruption.
Failover can be implemented in various ways, such as active-passive, active-active, and hot-standby.
## Active-passive
With active-passive fail-over, heartbeats are sent between the active and the passive server on standby. If the heartbeat is interrupted, the passive server takes over the active's IP address and resumes service.
The length of downtime is determined by whether the passive server is already running in 'hot' standby or whether it needs to start up from 'cold' standby. Only the active server handles traffic. Active-passive failover can also be referred to as master-slave failover.
The length of downtime is determined by whether the passive server is already running in 'hot' standby or whether it needs to start up from 'cold' standby. Only the active server handles traffic.
Active-passive failover can also be referred to as master-slave failover.
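As a rough sketch of the heartbeat logic described above, the passive server can track when it last heard from the active server and promote itself once a timeout passes. The `StandbyMonitor` class, its method names, and the timeout value are assumptions made for illustration; taking over the IP address is left out.

```python
import time


class StandbyMonitor:
    """Runs on the passive server and watches heartbeats from the active one."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # injectable so the logic can be tested without waiting
        self.last_heartbeat = clock()
        self.role = "passive"

    def record_heartbeat(self):
        """Call whenever a heartbeat arrives from the active server."""
        self.last_heartbeat = self.clock()

    def check(self):
        """Promote this server if the heartbeat has been silent for too long."""
        silent_for = self.clock() - self.last_heartbeat
        if self.role == "passive" and silent_for > self.timeout_seconds:
            self.role = "active"  # a real system would also take over the IP here
        return self.role
```

A shorter timeout reduces downtime but makes a false takeover more likely when the network is merely slow.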
## Active-active
In active-active, both servers are managing traffic, spreading the load between them. If the servers are public-facing, the DNS would need to know about the public IPs of both servers. If the servers are internal-facing, application logic would need to know about both servers. Active-active failover can also be referred to as master-master failover.
In active-active, both servers are managing traffic, spreading the load between them.
If the servers are public-facing, the DNS would need to know about the public IPs of both servers. If the servers are internal-facing, application logic would need to know about both servers.
Active-active failover can also be referred to as master-master failover.
## Disadvantages of Failover
@ -15,5 +29,4 @@ In active-active, both servers are managing traffic, spreading the load between
To learn more, visit the following links:
- [Getting started with Fail-Over in System Design](https://github.com/donnemartin/system-design-primer)
- [System Design: Availability Patterns](https://medium.com/must-know-computer-science/system-design-redundancy-and-replication-e9946aa335ba)
- [Fail Over Pattern - High Availability](https://www.filecloud.com/blog/2015/12/architectural-patterns-for-high-availability/)

@ -1,34 +1,11 @@
# Replication
Replication is further divided into two types:
Replication is an availability pattern that involves having multiple copies of the same data stored in different locations. In the event of a failure, the data can be retrieved from a different location. There are two main types of replication: Master-Master replication and Master-Slave replication.
- Master-Slave Replication
- Master-Master Replication
## Master-Slave Replication
The master serves reads and writes, replicating writes to one or more slaves, which serve only reads. Slaves can also replicate to additional slaves in a tree-like fashion. If the master goes offline, the system can continue to operate in read-only mode until a slave is promoted to a master or a new master is provisioned.
## Disadvantages of Master-Slave replication
Following are the disadvantages:
- Additional logic is needed to promote a slave to a master.
## Master-Master Replication
Both masters serve reads and writes and coordinate with each other on writes. If either master goes down, the system can continue to operate with both reads and writes.
## Disadvantages of Master-Master replication
Following are the disadvantages of master-master replication:
- You'll need a load balancer, or you'll need to change your application logic to determine where to write.
- Most master-master systems are either loosely consistent (violating ACID) or have increased write latency due to synchronization.
- Conflict resolution comes more into play as more write nodes are added and as latency increases.
- See Disadvantage(s): replication for points related to both master-slave and master-master.
- **Master-Master replication:** In this type of replication, multiple servers are configured as "masters," and each one can accept read and write operations. This allows for high availability and allows any of the servers to take over if one of them fails. However, this type of replication can lead to conflicts if multiple servers update the same data at the same time, so some conflict resolution mechanism is needed to handle this.
- **Master-Slave replication:** In this type of replication, one server is designated as the "master" and handles all write operations, while multiple "slave" servers handle read operations. If the master fails, one of the slaves can be promoted to take its place. This type of replication is simpler to set up and maintain compared to Master-Master replication.
Visit the following links for more resources:
- [Replication - Master-Slave](https://github.com/donnemartin/system-design-primer#master-slave-replication)
- [Master-Master Replication](https://github.com/donnemartin/system-design-primer#master-master-replication)
- [Replication: Availability Pattern](https://github.com/donnemartin/system-design-primer#replication)

@ -1,24 +1,28 @@
# Availability In Numbers
# Availability in Numbers
Availability is often quantified by uptime (or downtime) as a percentage of time the service is available. Availability is generally measured in number of 9s--a service with 99.99% availability is described as having four 9s.
## 99.9% Availability - Three 9s
| Duration | Acceptable downtime |
| ------------- | ------------- |
| Downtime per year | 8h 45min 57s |
| Downtime per month | 43m 49.7s |
| Downtime per week | 10m 4.8s |
| Downtime per day | 1m 26.4s |
```
Duration | Acceptable downtime
------------- | -------------
Downtime per year | 8h 45min 57s
Downtime per month | 43m 49.7s
Downtime per week | 10m 4.8s
Downtime per day | 1m 26.4s
```
## 99.99% Availability - Four 9s
| Duration | Acceptable downtime |
| ------------- | ------------- |
| Downtime per year | 52min 35.7s |
| Downtime per month | 4m 23s |
| Downtime per week | 1m 5s |
| Downtime per day | 8.6s |
```
Duration | Acceptable downtime
------------- | -------------
Downtime per year | 52min 35.7s
Downtime per month | 4m 23s
Downtime per week | 1m 5s
Downtime per day | 8.6s
```
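The downtime figures follow directly from the availability percentage: the allowed downtime is the unavailable fraction multiplied by the length of the period. A minimal sketch of that arithmetic, assuming an average year of 365.2425 days and a month of one twelfth of a year (the function and constant names are illustrative):

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Period lengths in seconds; the year and month lengths are assumptions.
PERIOD_SECONDS = {
    "year": 365.2425 * SECONDS_PER_DAY,
    "month": 365.2425 * SECONDS_PER_DAY / 12,
    "week": 7 * SECONDS_PER_DAY,
    "day": SECONDS_PER_DAY,
}


def allowed_downtime_seconds(availability_percent, period):
    """Return the downtime budget in seconds for the given period."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * PERIOD_SECONDS[period]
```

For example, `allowed_downtime_seconds(99.9, "day")` gives about 86.4 seconds (1m 26.4s), and `allowed_downtime_seconds(99.99, "day")` gives about 8.64 seconds.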
## Availability in parallel vs in sequence
@ -27,16 +31,23 @@ If a service consists of multiple components prone to failure, the service's ove
### In sequence
Overall availability decreases when two components with availability < 100% are in sequence:
```
Availability (Total) = Availability (Foo) * Availability (Bar)
If both Foo and Bar each had 99.9% availability, their total availability in sequence would be 99.8%.
```
If both `Foo` and `Bar` each had 99.9% availability, their total availability in sequence would be 99.8%.
### In parallel
Overall availability increases when two components with availability < 100% are in parallel:
```
Availability (Total) = 1 - (1 - Availability (Foo)) * (1 - Availability (Bar))
If both Foo and Bar each had 99.9% availability, their total availability in parallel would be 99.9999%.
```
If both `Foo` and `Bar` each had 99.9% availability, their total availability in parallel would be 99.9999%.
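Both formulas generalize to any number of components. The sketch below takes availabilities as fractions (0.999 for 99.9%) and assumes that components fail independently; the function names are illustrative.

```python
from math import prod


def availability_in_sequence(*components):
    """All components must be up, so the availabilities multiply."""
    return prod(components)


def availability_in_parallel(*components):
    """The service is down only if every component is down at once."""
    return 1 - prod(1 - availability for availability in components)
```

With two components at 0.999 each, `availability_in_sequence` yields 0.998001 (about 99.8%) and `availability_in_parallel` yields 0.999999 (99.9999%).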
To learn more, visit the following links:
- [Getting started with Availability in Numbers](https://github.com/donnemartin/system-design-primer)
- [Availability in System Design](https://www.enjoyalgorithms.com/blog/availability-system-design-concept/)

@ -1,32 +1,5 @@
# Availability Patterns
There are three Availability Patterns which are:
Availability is measured as a percentage of uptime, and defines the proportion of time that a system is functional and working. Availability is affected by system errors, infrastructure problems, malicious attacks, and system load. Cloud applications typically provide users with a service level agreement (SLA), which means that applications must be designed and implemented to maximize availability.
- Fail-Over
- Replication
- Availability in Numbers
## Fail-Over
### Active-passive
With active-passive fail-over, heartbeats are sent between the active and the passive server on standby. If the heartbeat is interrupted, the passive server takes over the active's IP address and resumes service.
## Active-active
In active-active, both servers are managing traffic, spreading the load between them. If the servers are public-facing, the DNS would need to know about the public IPs of both servers. If the servers are internal-facing, application logic would need to know about both servers.
## Replication
Replication is further divided into two types:
- Master-Slave Replication - The master serves reads and writes, replicating writes to one or more slaves, which serve only reads.
- Master-Master Replication - Both masters serve reads and writes and coordinate with each other on writes. If either master goes down, the system can continue to operate with both reads and writes.
## Availability In Numbers
Availability is often quantified by uptime (or downtime) as a percentage of time the service is available. Availability is generally measured in number of 9s--a service with 99.99% availability is described as having four 9s.
To learn more, visit the following links:
- [Getting started with Availability Patterns](https://github.com/donnemartin/system-design-primer)
- [Availability in System Design](https://www.enjoyalgorithms.com/blog/availability-system-design-concept)
- [System Design: Availability](https://dev.to/karanpratapsingh/system-design-availability-38bd)
- [Availability Patterns](https://learn.microsoft.com/en-us/azure/architecture/framework/resiliency/reliability-patterns#availability)

@ -1,11 +1,11 @@
# Event Driven
Event-driven architecture (EDA) is a design pattern that focuses on the flow of events through a system, rather than the flow of data or control. It is based on the idea that a system should respond to external events and trigger the appropriate actions.
Event-driven invocation uses a trigger to start the background task. Examples of using event-driven triggers include:
In an event-driven system, events are generated by external sources, such as user input, sensors, or other systems, and are passed through the system to be handled by the appropriate components. These events can trigger various actions, such as updating the state of the system, sending a message to another system, or triggering a computation.
- The UI or another job places a message in a queue. The message contains data about an action that has taken place, such as the user placing an order. The background task listens on this queue and detects the arrival of a new message. It reads the message and uses the data in it as the input to the background job. This pattern is known as asynchronous message-based communication.
- The UI or another job saves or updates a value in storage. The background task monitors the storage and detects changes. It reads the data and uses it as the input to the background job.
- The UI or another job makes a request to an endpoint, such as an HTTPS URI, or an API that is exposed as a web service. It passes the data that is required to complete the background task as part of the request. The endpoint or web service invokes the background task, which uses the data as its input.
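The queue-based trigger can be sketched with Python's standard library. This is a single-process stand-in for a real message broker; the `run_worker` function, the `None` shutdown sentinel, and the message shape are assumptions made for illustration.

```python
import queue
import threading


def run_worker(events, handle):
    """Block on the queue and run the handler for each message that arrives."""
    while True:
        message = events.get()
        if message is None:  # sentinel that tells the worker to stop
            events.task_done()
            break
        handle(message)
        events.task_done()


events = queue.Queue()
processed = []

# The background task listens on the queue in its own thread.
worker = threading.Thread(target=run_worker, args=(events, processed.append))
worker.start()

# The UI (or another job) publishes an event describing what happened.
events.put({"type": "order_placed", "order_id": 42})
events.put(None)
worker.join()
```

The publisher returns as soon as the message is queued; the work itself happens whenever the worker picks the message up.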
Learn more from the following links:
- [What is an Event-Driven Architecture?](https://aws.amazon.com/event-driven-architecture/)
- [Event-Driven Architecture - Everything You Need to Know](https://blog.hubspot.com/website/event-driven-architecture)
- [System Design: Event-Driven Architecture (EDA)](https://dev.to/karanpratapsingh/system-design-event-driven-architecture-eda-3m72)
- [Background Jobs - Event Driven Triggers](https://learn.microsoft.com/en-us/azure/architecture/best-practices/background-jobs#event-driven-triggers)

@ -1,15 +1,13 @@
# Schedule Driven
Schedule-driven systems are systems that are designed to perform specific tasks or actions at predetermined times or intervals. These schedules can be defined by the system itself or can be set by an external agent, such as a user or another system.
Schedule-driven invocation uses a timer to start the background task. Examples of using schedule-driven triggers include:
Examples of schedule-driven systems include:
- A timer that is running locally within the application or as part of the application's operating system invokes a background task on a regular basis.
- A timer that is running in a different application, such as Azure Logic Apps, sends a request to an API or web service on a regular basis. The API or web service invokes the background task.
- A separate process or application starts a timer that causes the background task to be invoked once after a specified time delay, or at a specific time.
- Cron jobs
- Scheduled batch jobs
- Recurring events
- Automated trading systems
Typical examples of tasks that are suited to schedule-driven invocation include batch-processing routines (such as updating related-products lists for users based on their recent behavior), routine data processing tasks (such as updating indexes or generating accumulated results), data analysis for daily reports, data retention cleanup, and data consistency checks.
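A recurring timer can be sketched with Python's standard `sched` module. The `run_periodically` helper, the short interval, and the fixed run count are assumptions chosen to keep the example brief; a production system would more likely rely on cron or a managed scheduler.

```python
import sched
import time


def run_periodically(scheduler, interval_seconds, task, remaining_runs):
    """Run the task now, then re-schedule it until the run budget is spent."""
    task()
    if remaining_runs > 1:
        scheduler.enter(
            interval_seconds,
            1,  # priority among events due at the same time
            run_periodically,
            argument=(scheduler, interval_seconds, task, remaining_runs - 1),
        )


runs = []
scheduler = sched.scheduler(time.monotonic, time.sleep)

# Schedule the first run immediately, repeating every 10 ms for 3 runs in total.
scheduler.enter(
    0, 1, run_periodically,
    argument=(scheduler, 0.01, lambda: runs.append(time.monotonic()), 3),
)
scheduler.run()  # blocks until no scheduled events remain
```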
Learn more from the following links:
- [System Design - Job Scheduling System?](https://aws.amazon.com/event-driven-architecture/)
- [Scheduler System Design](https://atul-agrawal.medium.com/scheduler-as-a-service-9c5d0414ec6d)
- [Schedule Driven - Background Jobs](https://learn.microsoft.com/en-us/azure/architecture/best-practices/background-jobs#schedule-driven-triggers)

@ -1,7 +1,7 @@
# Returning Results
Returning results in a system design refers to the process of providing the output or outcome of a specific task or action to the requesting entity. This can include providing a response to a user request, returning a result of a computation or analysis, or sending a notification or message to another system.
Background jobs execute asynchronously in a separate process, or even in a separate location, from the UI or the process that invoked the background task. Ideally, background tasks are "fire and forget" operations, and their execution progress has no impact on the UI or the calling process. This means that the calling process does not wait for completion of the tasks. Therefore, it cannot automatically detect when the task ends.
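Because the caller does not wait, one common way to surface results is for the job to record its status somewhere the caller can check later. The sketch below uses an in-memory store as a stand-in for shared storage; `JobStatusStore`, `start_job`, and the state names are hypothetical.

```python
import threading


class JobStatusStore:
    """Thread-safe status records, standing in for a shared database or cache."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}

    def set(self, job_id, state, result=None):
        with self._lock:
            self._status[job_id] = {"state": state, "result": result}

    def get(self, job_id):
        with self._lock:
            return dict(self._status.get(job_id, {"state": "unknown", "result": None}))


def start_job(store, job_id, work):
    """Start the work in the background and return without waiting for it."""
    store.set(job_id, "running")

    def run():
        try:
            store.set(job_id, "done", work())
        except Exception as exc:  # record the failure instead of losing it
            store.set(job_id, "failed", str(exc))

    thread = threading.Thread(target=run)
    thread.start()
    return thread
```

The caller can poll `store.get(job_id)` at its own pace, which keeps the UI responsive while still letting it learn the outcome.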
Learn more from the following links:
- [Overview of Return Statement](https://press.rebus.community/programmingfundamentals/chapter/return-statement/)
- [Returning Results - Background Jobs](https://learn.microsoft.com/en-us/azure/architecture/best-practices/background-jobs#returning-results)

@ -11,4 +11,4 @@ Background jobs can be used for a variety of purposes, such as:
Learn more from the following links:
- [Intro of Background job system](https://www.codementor.io/projects/tool/background-job-system-atx32exogo)
- [Background Jobs - Best Practices](https://learn.microsoft.com/en-us/azure/architecture/best-practices/background-jobs)