Add content to system-design (#3323)
parent
8f8e2f41d8
commit
a3b8b5653a
145 changed files with 1559 additions and 145 deletions

@@ -1 +1,13 @@

# Performance vs Scalability

A service is scalable if adding resources results in increased performance in a manner proportional to the resources added. Generally, increasing performance means serving more units of work, but it can also mean handling larger units of work, such as when datasets grow.

Another way to look at performance vs scalability:

- If you have a performance problem, your system is slow for a single user.
- If you have a scalability problem, your system is fast for a single user but slow under heavy load.

To learn more, visit the following links:

- [Scalability, Availability & Stability Patterns](https://www.slideshare.net/jboner/scalability-availability-stability-patterns/)
- [A Word on Scalability](https://www.allthingsdistributed.com/2006/03/a_word_on_scalability.html)

@@ -1 +1,7 @@

# Latency vs Throughput

Latency is the time to perform some action or to produce some result. Throughput is the number of such actions or results per unit of time. Generally, you should aim for maximal throughput with acceptable latency.

Learn more from the following links:

- [Understanding Latency versus Throughput](https://community.cadence.com/cadence_blogs_8/b/fv/posts/understanding-latency-vs-throughput)

@@ -1 +1,8 @@

# Weak Consistency

After a write, reads may or may not see it; a best-effort approach is taken. This approach is seen in systems such as memcached. Weak consistency works well in real-time use cases such as VoIP, video chat, and real-time multiplayer games. For example, if you are on a phone call and lose reception for a few seconds, when you regain connection you do not hear what was spoken during the connection loss.

To learn more, visit the following links:

- [Introduction to Weak Consistency](https://github.com/donnemartin/system-design-primer)
- [Guide to Weak Consistency](https://iq.opengenus.org/consistency-patterns-in-system-design/)

@@ -1 +1,9 @@

# Eventual Consistency

After a write, reads will eventually see it (typically within milliseconds). Data is replicated asynchronously. This approach is seen in systems such as DNS and email. Eventual consistency works well in highly available systems.

To learn more, visit the following links:

- [Eventual Consistency Patterns](https://github.com/donnemartin/system-design-primer)
- [System Design Concepts – Eventual Consistency](https://www.acodersjourney.com/eventual-consistency/)

@@ -1 +1,8 @@

# Strong Consistency

After a write, reads will see it. Data is replicated synchronously. This approach is seen in file systems and RDBMSes. Strong consistency works well in systems that need transactions.

To learn more, visit the following links:

- [Strong Consistency Patterns](https://github.com/donnemartin/system-design-primer)
- [Get started with Strong Consistency](https://www.geeksforgeeks.org/eventual-vs-strong-consistency-in-distributed-databases/)

@@ -1 +1,19 @@

# Fail-Over

## Active-passive

With active-passive fail-over, heartbeats are sent between the active server and the passive server on standby. If the heartbeat is interrupted, the passive server takes over the active's IP address and resumes service.

The length of downtime is determined by whether the passive server is already running in 'hot' standby or whether it needs to start up from 'cold' standby. Only the active server handles traffic. Active-passive fail-over can also be referred to as master-slave fail-over.
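
To make the heartbeat mechanism concrete, here is a minimal sketch of the check a passive node might run. The address, interval, and miss threshold are made-up values, and `promote_passive` is only a placeholder; real deployments typically rely on tools such as keepalived (VRRP) rather than a hand-rolled monitor.

```python
import socket
import time

ACTIVE_ADDR = ("10.0.0.1", 9000)   # hypothetical address of the active server
CHECK_INTERVAL = 2                 # seconds between heartbeat checks
MISSED_LIMIT = 3                   # consecutive misses before failing over

def heartbeat_ok(addr, timeout=1.0):
    """Return True if the active server still accepts a TCP connection."""
    try:
        with socket.create_connection(addr, timeout=timeout):
            return True
    except OSError:
        return False

def promote_passive():
    # Placeholder: a real setup would claim the shared/virtual IP and start serving.
    print("heartbeat lost -- promoting passive server to active")

def monitor():
    missed = 0
    while True:
        missed = 0 if heartbeat_ok(ACTIVE_ADDR) else missed + 1
        if missed >= MISSED_LIMIT:
            promote_passive()
            return
        time.sleep(CHECK_INTERVAL)

# monitor() would run continuously on the passive (standby) node.
```
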

## Active-active

In active-active, both servers manage traffic, spreading the load between them. If the servers are public-facing, the DNS would need to know about the public IPs of both servers. If the servers are internal-facing, application logic would need to know about both servers. Active-active fail-over can also be referred to as master-master fail-over.

## Disadvantages of Fail-Over

- Fail-over adds more hardware and additional complexity.
- There is a potential for loss of data if the active system fails before any newly written data can be replicated to the passive.

To learn more, visit the following links:

- [Getting started with Fail-Over in System Design](https://github.com/donnemartin/system-design-primer)
- [System Design — Availability Patterns](https://medium.com/must-know-computer-science/system-design-redundancy-and-replication-e9946aa335ba)

@@ -1 +1,34 @@

# Replication

Replication comes in two main forms:

- Master-Slave Replication
- Master-Master Replication

## Master-Slave Replication

The master serves reads and writes, replicating writes to one or more slaves, which serve only reads. Slaves can also replicate to additional slaves in a tree-like fashion. If the master goes offline, the system can continue to operate in read-only mode until a slave is promoted to a master or a new master is provisioned.

## Disadvantages of Master-Slave Replication

- Additional logic is needed to promote a slave to a master.

## Master-Master Replication

Both masters serve reads and writes and coordinate with each other on writes. If either master goes down, the system can continue to operate with both reads and writes.

## Disadvantages of Master-Master Replication

- You'll need a load balancer or changes to your application logic to determine where to write.
- Most master-master systems are either loosely consistent (violating ACID) or have increased write latency due to synchronization.
- Conflict resolution comes more into play as more write nodes are added and as latency increases.
- See the general disadvantages of replication for points that apply to both master-slave and master-master setups.

Visit the following links for more resources:

- [Replication - Master-Slave](https://github.com/donnemartin/system-design-primer#master-slave-replication)
- [Master-Master Replication](https://github.com/donnemartin/system-design-primer#master-master-replication)

@@ -1 +1,42 @@

# Availability In Numbers

Availability is often quantified by uptime (or downtime) as a percentage of time the service is available. Availability is generally measured in number of 9s -- a service with 99.99% availability is described as having four 9s.

## 99.9% Availability - Three 9s

| Period | Acceptable downtime |
| ------------- | ------------- |
| Downtime per year | 8h 45min 57s |
| Downtime per month | 43m 49.7s |
| Downtime per week | 10m 4.8s |
| Downtime per day | 1m 26.4s |

## 99.99% Availability - Four 9s

| Period | Acceptable downtime |
| ------------- | ------------- |
| Downtime per year | 52min 35.7s |
| Downtime per month | 4m 23s |
| Downtime per week | 1m 5s |
| Downtime per day | 8.6s |

## Availability in parallel vs in sequence

If a service consists of multiple components prone to failure, the service's overall availability depends on whether the components are in sequence or in parallel.

### In sequence

Overall availability decreases when two components with availability < 100% are in sequence:

Availability (Total) = Availability (Foo) * Availability (Bar)

If both Foo and Bar each had 99.9% availability, their total availability in sequence would be 99.8%.

### In parallel

Overall availability increases when two components with availability < 100% are in parallel:

Availability (Total) = 1 - (1 - Availability (Foo)) * (1 - Availability (Bar))

If both Foo and Bar each had 99.9% availability, their total availability in parallel would be 99.9999%.
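
These formulas are easy to sanity-check with a few lines of Python; the figures below reproduce the examples above and the four-9s yearly downtime from the table.

```python
def sequential(*availabilities):
    """Overall availability when every component must work (in sequence)."""
    total = 1.0
    for a in availabilities:
        total *= a
    return total

def parallel(*availabilities):
    """Overall availability when any one component is enough (in parallel)."""
    total_failure = 1.0
    for a in availabilities:
        total_failure *= (1.0 - a)
    return 1.0 - total_failure

print(sequential(0.999, 0.999))   # 0.998001 -> ~99.8% in sequence
print(parallel(0.999, 0.999))     # 0.999999 -> ~99.9999% in parallel

minutes_per_year = 365.25 * 24 * 60
print((1 - 0.9999) * minutes_per_year)   # ~52.6 minutes of downtime for four 9s
```
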

To learn more, visit the following links:

- [Getting started with Availability in Numbers](https://github.com/donnemartin/system-design-primer)
- [Availability in System Design](https://www.enjoyalgorithms.com/blog/availability-system-design-concept/)

@@ -1 +1,32 @@

# Availability Patterns

There are three availability patterns:

- Fail-Over
- Replication
- Availability in Numbers

## Fail-Over

### Active-passive

With active-passive fail-over, heartbeats are sent between the active server and the passive server on standby. If the heartbeat is interrupted, the passive server takes over the active's IP address and resumes service.

### Active-active

In active-active, both servers manage traffic, spreading the load between them. If the servers are public-facing, the DNS would need to know about the public IPs of both servers. If the servers are internal-facing, application logic would need to know about both servers.

## Replication

Replication comes in two main forms:

- Master-Slave Replication - The master serves reads and writes, replicating writes to one or more slaves, which serve only reads.
- Master-Master Replication - Both masters serve reads and writes and coordinate with each other on writes. If either master goes down, the system can continue to operate with both reads and writes.

## Availability In Numbers

Availability is often quantified by uptime (or downtime) as a percentage of time the service is available. Availability is generally measured in number of 9s -- a service with 99.99% availability is described as having four 9s.

To learn more, visit the following links:

- [Getting started with Availability Patterns](https://github.com/donnemartin/system-design-primer)
- [Availability in System Design](https://www.enjoyalgorithms.com/blog/availability-system-design-concept)
- [System Design: Availability](https://dev.to/karanpratapsingh/system-design-availability-38bd)

@@ -1 +1,11 @@

# Event Driven

Event-driven architecture (EDA) is a design pattern that focuses on the flow of events through a system, rather than the flow of data or control. It is based on the idea that a system should respond to external events and trigger the appropriate actions.

In an event-driven system, events are generated by external sources, such as user input, sensors, or other systems, and are passed through the system to be handled by the appropriate components. These events can trigger various actions, such as updating the state of the system, sending a message to another system, or triggering a computation.

Learn more from the following links:

- [What is an Event-Driven Architecture?](https://aws.amazon.com/event-driven-architecture/)
- [Event-Driven Architecture - Everything You Need to Know](https://blog.hubspot.com/website/event-driven-architecture)
- [System Design: Event-Driven Architecture (EDA)](https://dev.to/karanpratapsingh/system-design-event-driven-architecture-eda-3m72)

@@ -1 +1,15 @@

# Schedule Driven

Schedule-driven systems are designed to perform specific tasks or actions at predetermined times or intervals. These schedules can be defined by the system itself or can be set by an external agent, such as a user or another system.

Examples of schedule-driven systems include:

- Cron jobs
- Scheduled batch jobs
- Recurring events
- Automated trading systems

Learn more from the following links:

- [System Design - Job Scheduling System](https://aws.amazon.com/event-driven-architecture/)
- [Scheduler System Design](https://atul-agrawal.medium.com/scheduler-as-a-service-9c5d0414ec6d)

@@ -1 +1,7 @@

# Returning Results

Returning results in a system design refers to the process of providing the output or outcome of a specific task or action to the requesting entity. This can include providing a response to a user request, returning the result of a computation or analysis, or sending a notification or message to another system.

Learn more from the following links:

- [Overview of Return Statement](https://press.rebus.community/programmingfundamentals/chapter/return-statement/)

@@ -1 +1,14 @@

# Background Jobs

Background jobs in system design refer to tasks that are executed in the background, independently of the main execution flow of the system. These tasks are typically initiated by the system itself, rather than by a user or another external agent.

Background jobs can be used for a variety of purposes, such as:

- Performing maintenance tasks: cleaning up old data, generating reports, or backing up the database.
- Processing large volumes of data: data import, data export, or data transformation.
- Sending notifications or messages: email notifications or push notifications to users.
- Performing long-running computations: machine learning or data analysis.

Learn more from the following links:

- [Intro of Background job system](https://www.codementor.io/projects/tool/background-job-system-atx32exogo)

@@ -1 +1,16 @@

# Domain Name System

A Domain Name System (DNS) translates a domain name such as www.example.com to an IP address.

DNS is hierarchical, with a few authoritative servers at the top level. Your router or ISP provides information about which DNS server(s) to contact when doing a lookup. Lower-level DNS servers cache mappings, which could become stale due to DNS propagation delays. DNS results can also be cached by your browser or OS for a certain period of time, determined by the time to live (TTL). Common record types include:

- NS record (name server) - Specifies the DNS servers for your domain/subdomain.
- MX record (mail exchange) - Specifies the mail servers for accepting messages.
- A record (address) - Points a name to an IP address.
- CNAME (canonical) - Points a name to another name or CNAME (example.com to www.example.com) or to an A record.
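
As a small illustration, the standard library can ask the OS resolver (which applies its own caching and the records' TTLs) for the addresses behind a name; the hostname here is just an example.

```python
import socket

# Resolve a hostname to its IPv4/IPv6 addresses using the OS resolver,
# which may answer from cache until the record's TTL expires.
for family, _, _, _, sockaddr in socket.getaddrinfo("www.example.com", 443):
    print(family.name, sockaddr[0])
```
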

To learn more, visit the following links:

- [Getting started with Domain Name System](https://github.com/donnemartin/system-design-primer#domain-name-system)
- [Intro to DNS Architecture](https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2008-R2-and-2008/dd197427(v=ws.10)?redirectedfrom=MSDNs)
- [DNS articles](https://support.dnsimple.com/categories/dns/)

@@ -1 +1,10 @@

# Push CDNs

Push CDNs receive new content whenever changes occur on your server. You take full responsibility for providing content, uploading directly to the CDN and rewriting URLs to point to the CDN. You can configure when content expires and when it is updated. Content is uploaded only when it is new or changed, minimizing traffic but maximizing storage.

Sites with a small amount of traffic or sites with content that isn't often updated work well with push CDNs. Content is placed on the CDNs once, instead of being re-pulled at regular intervals.

To learn more, visit the following links:

- [Introduction on Push CDNs](https://github.com/donnemartin/system-design-primer#content-delivery-network)
- [Why use a CDN?](https://dev.to/karanpratapsingh/system-design-content-delivery-network-cdn-bof)

@@ -1 +1,11 @@

# Pull CDNs

Pull CDNs grab new content from your server when the first user requests the content. You leave the content on your server and rewrite URLs to point to the CDN. This results in a slower request until the content is cached on the CDN.

A time-to-live (TTL) determines how long content is cached. Pull CDNs minimize storage space on the CDN, but can create redundant traffic if files expire and are pulled before they have actually changed. Sites with heavy traffic work well with pull CDNs, as traffic is spread out more evenly with only recently-requested content remaining on the CDN.

To learn more, visit the following links:

- [The Differences Between Push And Pull CDNs](http://www.travelblogadvice.com/technical/the-differences-between-push-and-pull-cdns/)
- [Brief about Content delivery network](https://en.wikipedia.org/wiki/Content_delivery_network)
- [What is Globally distributed content delivery?](https://figshare.com/articles/journal_contribution/Globally_distributed_content_delivery/6605972)

@@ -1 +1,27 @@

# Content Delivery Networks

A content delivery network (CDN) is a globally distributed network of proxy servers, serving content from locations closer to the user. Generally, static files such as HTML/CSS/JS, photos, and videos are served from the CDN, although some CDNs such as Amazon's CloudFront support dynamic content. The site's DNS resolution will tell clients which server to contact.

Serving content from CDNs can significantly improve performance in two ways:

- Users receive content from data centers close to them
- Your servers do not have to serve requests that the CDN fulfills

## Push CDNs

Push CDNs receive new content whenever changes occur on your server. You take full responsibility for providing content, uploading directly to the CDN and rewriting URLs to point to the CDN. You can configure when content expires and when it is updated. Content is uploaded only when it is new or changed, minimizing traffic but maximizing storage.

Sites with a small amount of traffic or sites with content that isn't often updated work well with push CDNs. Content is placed on the CDNs once, instead of being re-pulled at regular intervals.

## Pull CDNs

Pull CDNs grab new content from your server when the first user requests the content. You leave the content on your server and rewrite URLs to point to the CDN. This results in a slower request until the content is cached on the CDN.

A time-to-live (TTL) determines how long content is cached. Pull CDNs minimize storage space on the CDN, but can create redundant traffic if files expire and are pulled before they have actually changed. Sites with heavy traffic work well with pull CDNs, as traffic is spread out more evenly with only recently-requested content remaining on the CDN.

## Disadvantages of CDN

- CDN costs could be significant depending on traffic, although this should be weighed against the additional costs you would incur not using a CDN.
- Content might be stale if it is updated before the TTL expires it.
- CDNs require changing URLs for static content to point to the CDN.

To learn more, visit the following links:

- [The Differences Between Push And Pull CDNs](http://www.travelblogadvice.com/technical/the-differences-between-push-and-pull-cdns/)
- [Brief about Content delivery network](https://en.wikipedia.org/wiki/Content_delivery_network)
- [What is Globally distributed content delivery?](https://figshare.com/articles/journal_contribution/Globally_distributed_content_delivery/6605972)

@@ -1 +1,15 @@

# Horizontal Scaling

Load balancers can also help with horizontal scaling, improving performance and availability. Scaling out using commodity machines is more cost-efficient and results in higher availability than scaling up a single server on more expensive hardware, known as vertical scaling. It is also easier to hire for talent working on commodity hardware than it is for specialized enterprise systems.

## Disadvantages of horizontal scaling

- Scaling horizontally introduces complexity and involves cloning servers.
- Servers should be stateless: they should not contain any user-related data like sessions or profile pictures.
- Sessions can be stored in a centralized data store such as a database (SQL, NoSQL) or a persistent cache (Redis, Memcached).
- Downstream servers such as caches and databases need to handle more simultaneous connections as upstream servers scale out.

To learn more, visit the following links:

- [Introduction to Horizontal Scaling](https://github.com/donnemartin/system-design-primer#horizontal-scaling)
- [System Design – Horizontal and Vertical Scaling](https://www.geeksforgeeks.org/system-design-horizontal-and-vertical-scaling/)
- [Getting started with Horizontal and Vertical Scaling](https://www.codingninjas.com/blog/2021/08/25/system-design-horizontal-and-vertical-scaling/)

@@ -1 +1,8 @@

# Layer 4 Load Balancing

Layer 4 load balancers look at info at the transport layer to decide how to distribute requests. Generally, this involves the source and destination IP addresses and ports in the header, but not the contents of the packet. Layer 4 load balancers forward network packets to and from the upstream server, performing Network Address Translation (NAT).

To learn more, visit the following links:

- [What is Layer 4 Load Balancing?](https://github.com/donnemartin/system-design-primer#communication)
- [Getting Started with Layer 4 Load Balancing](https://www.nginx.com/resources/glossary/layer-4-load-balancing/)

@@ -1 +1,10 @@

# Layer 7 Load Balancing

Layer 7 load balancers look at the application layer to decide how to distribute requests. This can involve the contents of the header, message, and cookies. A layer 7 load balancer terminates network traffic, reads the message, makes a load-balancing decision, then opens a connection to the selected server. For example, a layer 7 load balancer can direct video traffic to servers that host videos while directing more sensitive user billing traffic to security-hardened servers.

At the cost of flexibility, layer 4 load balancing requires less time and computing resources than layer 7, although the performance impact can be minimal on modern commodity hardware.

Learn more from the following links:

- [Introduction to Layer 7 Load Balancing](https://github.com/donnemartin/system-design-primer#layer-7-load-balancing)
- [A Brief of Layer 7 Balancing](https://github.com/donnemartin/system-design-primer#communication)

@@ -1 +1,8 @@

# Load Balancing Algorithms

Load balancing is the process of distributing incoming network traffic across multiple servers in order to optimize resource usage, minimize response time, and avoid overloading any single server. There are several algorithms that can be used to achieve this -- such as round robin, weighted round robin, least connections, and IP hash -- each with its own advantages and disadvantages.
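
Below is a minimal sketch of two of the simplest strategies, round robin and least connections; the backend addresses are made up, and the caller is assumed to decrement the in-flight counter when a request finishes.

```python
import itertools

backends = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]  # hypothetical servers

# Round robin: hand requests to each backend in turn.
rotation = itertools.cycle(backends)

def round_robin():
    return next(rotation)

# Least connections: pick the backend with the fewest in-flight requests.
in_flight = {b: 0 for b in backends}

def least_connections():
    backend = min(in_flight, key=in_flight.get)
    in_flight[backend] += 1   # decrement this when the request completes
    return backend

print([round_robin() for _ in range(4)])   # cycles through the backends
print(least_connections())
```
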

To learn more, visit the following links:

- [Concept of load balancing algorithms](https://www.enjoyalgorithms.com/blog/load-balancers-in-system-design)
- [Types of load balancing algorithms](https://www.cloudflare.com/learning/performance/types-of-load-balancing-algorithms/)

@@ -1 +1,14 @@

# Load Balancer vs Reverse Proxy

- Deploying a load balancer is useful when you have multiple servers. Often, load balancers route traffic to a set of servers serving the same function.
- Reverse proxies can be useful even with just one web server or application server, opening up the benefits described in the previous section.
- Solutions such as NGINX and HAProxy can support both layer 7 reverse proxying and load balancing.

## Disadvantages of reverse proxy

- Introducing a reverse proxy results in increased complexity.
- A single reverse proxy is a single point of failure; configuring multiple reverse proxies (i.e. a failover) further increases complexity.

To learn more, visit the following links:

- [What is a Reverse Proxy vs. Load Balancer?](https://www.nginx.com/resources/glossary/reverse-proxy-vs-load-balancer/)

@@ -1 +1,31 @@

# Load Balancers

Load balancers distribute incoming client requests to computing resources such as application servers and databases. In each case, the load balancer returns the response from the computing resource to the appropriate client. Load balancers are effective at:

- Preventing requests from going to unhealthy servers
- Preventing overloading resources
- Helping to eliminate a single point of failure

Load balancers can be implemented with hardware (expensive) or with software such as HAProxy. Additional benefits include:

- **SSL termination** - Decrypt incoming requests and encrypt server responses so backend servers do not have to perform these potentially expensive operations. This also removes the need to install X.509 certificates on each server.
- **Session persistence** - Issue cookies and route a specific client's requests to the same instance if the web apps do not keep track of sessions.

To protect against failures, it's common to set up multiple load balancers, either in active-passive or active-active mode. Load balancers can route traffic based on various metrics, including:

## Layer 4 load balancing

Layer 4 load balancers look at info at the transport layer to decide how to distribute requests. Generally, this involves the source and destination IP addresses and ports in the header, but not the contents of the packet. Layer 4 load balancers forward network packets to and from the upstream server, performing Network Address Translation (NAT).

## Layer 7 load balancing

Layer 7 load balancers look at the application layer to decide how to distribute requests. This can involve the contents of the header, message, and cookies. A layer 7 load balancer terminates network traffic, reads the message, makes a load-balancing decision, then opens a connection to the selected server. For example, a layer 7 load balancer can direct video traffic to servers that host videos while directing more sensitive user billing traffic to security-hardened servers.

## Disadvantages of load balancers

- The load balancer can become a performance bottleneck if it does not have enough resources or if it is not configured properly.
- Introducing a load balancer to help eliminate a single point of failure results in increased complexity.
- A single load balancer is a single point of failure; configuring multiple load balancers further increases complexity.

To learn more, visit the following links:

- [What is Load balancing (computing)?](https://en.wikipedia.org/wiki/Load_balancing_(computing))
- [Introduction to Load Balancing](https://github.com/donnemartin/system-design-primer#layer-7-load-balancing)

@@ -1 +1,10 @@

# Microservices

Related to this discussion are microservices, which can be described as a suite of independently deployable, small, modular services. Each service runs a unique process and communicates through a well-defined, lightweight mechanism to serve a business goal.

Pinterest, for example, could have the following microservices: user profile, follower, feed, search, photo upload, etc.

To learn more, visit the following links:

- [Intro to Microservices](https://github.com/donnemartin/system-design-primer#microservices)
- [Building Microservices](https://cloudncode.wordpress.com/2016/07/22/msa-getting-started/)

@@ -1 +1,8 @@

# Service Discovery

Systems such as Consul, Etcd, and Zookeeper can help services find each other by keeping track of registered names, addresses, and ports. Health checks help verify service integrity and are often done using an HTTP endpoint. Both Consul and Etcd have a built-in key-value store that can be useful for storing config values and other shared data.

Visit the following links to learn more:

- [What is Service-oriented architecture?](https://en.wikipedia.org/wiki/Service-oriented_architecture)
- [Intro to Service Discovery](https://github.com/donnemartin/system-design-primer#Service%20Discovery)

@@ -1 +1,7 @@

# Application Layer

Separating out the web layer from the application layer (also known as the platform layer) allows you to scale and configure both layers independently. Adding a new API results in adding application servers without necessarily adding additional web servers. The single responsibility principle advocates for small and autonomous services that work together. Small teams with small services can plan more aggressively for rapid growth.

For more resources, visit the following links:

- [Getting started with Application Layer](https://github.com/donnemartin/system-design-primer#Application%20layer)

@@ -1 +1,11 @@

# Replication

## Master-slave replication

The master serves reads and writes, replicating writes to one or more slaves, which serve only reads. Slaves can also replicate to additional slaves in a tree-like fashion. If the master goes offline, the system can continue to operate in read-only mode until a slave is promoted to a master or a new master is provisioned.

## Master-master replication

Both masters serve reads and writes and coordinate with each other on writes. If either master goes down, the system can continue to operate with both reads and writes.

To learn more, visit the following links:

- [Getting started with Replication](https://github.com/donnemartin/system-design-primer#replication)

@@ -1 +1,10 @@

# Sharding

Sharding distributes data across different databases such that each database can only manage a subset of the data. Taking a users database as an example, as the number of users increases, more shards are added to the cluster.

Similar to the advantages of federation, sharding results in less read and write traffic, less replication, and more cache hits. Index size is also reduced, which generally improves performance with faster queries. If one shard goes down, the other shards are still operational, although you'll want to add some form of replication to avoid data loss. Like federation, there is no single central master serializing writes, allowing you to write in parallel with increased throughput.
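
A common way to route a query to the right shard is to hash a stable key such as the user ID. This is a minimal sketch with an arbitrary shard count; note that a plain modulo scheme makes adding shards expensive (most keys move), which is why consistent hashing is often used instead.

```python
import hashlib

NUM_SHARDS = 4   # hypothetical number of user-database shards

def shard_for_user(user_id: str) -> int:
    """Map a user id to a shard index with a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# Every query for a given user is sent to the same shard.
print(shard_for_user("alice"), shard_for_user("bob"))
```
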

Learn more from the following links:

- [The coming of the Shard](http://highscalability.com/blog/2009/8/6/an-unorthodox-approach-to-database-design-the-coming-of-the.html)
- [Shard (database architecture)](https://en.wikipedia.org/wiki/Shard_(database_architecture))

@@ -1 +1,7 @@

# Federation

Federation (or functional partitioning) splits up databases by function. For example, instead of a single, monolithic database, you could have three databases: forums, users, and products, resulting in less read and write traffic to each database and therefore less replication lag. Smaller databases result in more data that can fit in memory, which in turn results in more cache hits due to improved cache locality. With no single central master serializing writes, you can write in parallel, increasing throughput.

Learn more from the following links:

- [Intro to Federation](https://github.com/donnemartin/system-design-primer#federation)

@@ -1 +1,10 @@

# Denormalization

Denormalization attempts to improve read performance at the expense of some write performance. Redundant copies of the data are written in multiple tables to avoid expensive joins. Some RDBMSs, such as PostgreSQL and Oracle, support materialized views, which handle the work of storing redundant information and keeping redundant copies consistent.

Once data becomes distributed with techniques such as federation and sharding, managing joins across data centers further increases complexity. Denormalization might circumvent the need for such complex joins.

To learn more, visit the following links:

- [Guide to Denormalization](https://github.com/donnemartin/system-design-primer#denormalization)
- [Denormalization](https://en.wikipedia.org/wiki/Denormalization)

@@ -1 +1,13 @@

# SQL Tuning

SQL tuning is a broad topic, and many books have been written as reference. It's important to benchmark and profile to simulate and uncover bottlenecks.

- Benchmark - Simulate high-load situations with tools such as ab (ApacheBench).
- Profile - Enable tools such as the slow query log to help track performance issues.

Benchmarking and profiling might then point you to the optimizations worth making, such as tightening up the schema, adding the right indices, or avoiding expensive joins.

To learn more, visit the following links:

- [What is SQL Tuning?](https://github.com/donnemartin/system-design-primer#sql-tuning)
- [Optimizing MySQL Queries](https://aiddroid.com/10-tips-optimizing-mysql-queries-dont-suck/)

@@ -1 +1,14 @@

# RDBMS

A relational database, such as one managed with SQL, is a collection of data items organized in tables. ACID is a set of properties of relational database transactions:

- **Atomicity** - Each transaction is all or nothing
- **Consistency** - Any transaction will bring the database from one valid state to another
- **Isolation** - Executing transactions concurrently has the same results as if the transactions were executed serially
- **Durability** - Once a transaction has been committed, it will remain so

There are many techniques to scale a relational database: master-slave replication, master-master replication, federation, sharding, denormalization, and SQL tuning.
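
Atomicity is the easiest of these properties to demonstrate: with the standard-library sqlite3 module, either both updates in the transfer below are committed or neither is (the table and amounts are made up).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])

try:
    with conn:  # opens a transaction: commit on success, rollback on any error
        conn.execute("UPDATE accounts SET balance = balance - 40 WHERE name = 'alice'")
        conn.execute("UPDATE accounts SET balance = balance + 40 WHERE name = 'bob'")
except sqlite3.Error:
    pass  # if anything failed, neither update was applied (atomicity)

print(conn.execute("SELECT * FROM accounts ORDER BY name").fetchall())
# [('alice', 60), ('bob', 40)]
```
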

To learn more, visit the following links:

- [Guide to RDBMS](https://github.com/donnemartin/system-design-primer#relational-database-management-system-rdbms)

@@ -1 +1,10 @@

# Key Value Store

A key-value store generally allows for O(1) reads and writes and is often backed by memory or SSD. Data stores can maintain keys in lexicographic order, allowing efficient retrieval of key ranges. Key-value stores can allow for storing of metadata with a value.

Key-value stores provide high performance and are often used for simple data models or for rapidly-changing data, such as an in-memory cache layer. Since they offer only a limited set of operations, complexity is shifted to the application layer if additional operations are needed.

To learn more, visit the following links:

- [Key–value database](https://en.wikipedia.org/wiki/Key%E2%80%93value_database)
- [What are the disadvantages of using a key/value table?](https://stackoverflow.com/questions/4056093/what-are-the-disadvantages-of-using-a-key-value-table-over-nullable-columns-or)

@@ -1 +1,10 @@

# Document Store

A document store is centered around documents (XML, JSON, binary, etc.), where a document stores all information for a given object. Document stores provide APIs or a query language to query based on the internal structure of the document itself. Note that many key-value stores include features for working with a value's metadata, blurring the lines between these two storage types.

Based on the underlying implementation, documents are organized by collections, tags, metadata, or directories. Although documents can be organized or grouped together, documents may have fields that are completely different from each other.

To learn more, visit the following links:

- [Getting started with Document Store](https://github.com/donnemartin/system-design-primer#document-store)
- [Document-oriented database](https://en.wikipedia.org/wiki/Document-oriented_database)

@@ -1 +1,10 @@

# Wide Column Store

A wide column store's basic unit of data is a column (name/value pair). A column can be grouped in column families (analogous to a SQL table). Super column families further group column families. You can access each column independently with a row key, and columns with the same row key form a row. Each value contains a timestamp for versioning and for conflict resolution.

Google introduced Bigtable as the first wide column store, which influenced the open-source HBase, often used in the Hadoop ecosystem, and Cassandra from Facebook. Stores such as Bigtable, HBase, and Cassandra maintain keys in lexicographic order, allowing efficient retrieval of selective key ranges.

Learn more from the following links:

- [A brief of Wide Column Store](https://github.com/donnemartin/system-design-primer#Wide%20column%20store)
- [Bigtable architecture](https://www.read.seas.harvard.edu/~kohler/class/cs239-w08/chang06bigtable.pdf)

@@ -1 +1,10 @@

# Graph Databases

In a graph database, each node is a record and each arc is a relationship between two nodes. Graph databases are optimized to represent complex relationships with many foreign keys or many-to-many relationships.

Graph databases offer high performance for data models with complex relationships, such as a social network. They are relatively new and are not yet widely used; it might be more difficult to find development tools and resources. Many graph databases can only be accessed with REST APIs.

Learn more from the following links:

- [Graph database](https://en.wikipedia.org/wiki/Graph_database)
- [Introduction to NoSQL](https://www.youtube.com/watch?v=qI_g07C_Q5I)

@@ -1 +1,15 @@

# NoSQL

NoSQL is a collection of data items represented in a key-value store, document store, wide column store, or a graph database. Data is denormalized, and joins are generally done in the application code. Most NoSQL stores lack true ACID transactions and favor eventual consistency.

BASE is often used to describe the properties of NoSQL databases. In comparison with the CAP theorem, BASE chooses availability over consistency.

- Basically available - the system guarantees availability.
- Soft state - the state of the system may change over time, even without input.
- Eventual consistency - the system will become consistent over a period of time, given that the system doesn't receive input during that period.

Learn more from the following links:

- [SQL or NoSQL?](https://github.com/donnemartin/system-design-primer#sql-or-nosql)
- [Brief of NoSQL Patterns](http://horicky.blogspot.com/2009/11/nosql-patterns.html)
- [Introduction to NoSQL](https://www.youtube.com/watch?v=qI_g07C_Q5I)

@@ -1 +1,32 @@

# SQL vs NoSQL

## Reasons for SQL

- Structured data
- Strict schema
- Relational data
- Need for complex joins
- Transactions
- Clear patterns for scaling
- More established: developers, community, code, tools, etc.
- Lookups by index are very fast

## Reasons for NoSQL

- Semi-structured data
- Dynamic or flexible schema
- Non-relational data
- No need for complex joins
- Store many TB (or PB) of data
- Very data-intensive workloads
- Very high throughput for IOPS

## Sample data well-suited for NoSQL

- Rapid ingest of clickstream and log data
- Leaderboard or scoring data
- Temporary data, such as a shopping cart
- Frequently accessed ('hot') tables
- Metadata/lookup tables

Learn more from the following links:

- [SQL vs NoSQL: The Differences](https://www.sitepoint.com/sql-vs-nosql-differences/)
- [Scaling up to your first 10 million users](https://www.youtube.com/watch?v=kKjm4ehYiMs)

@@ -1 +1,15 @@

# Databases

A database is a collection of data that is organized and stored in a structured way, allowing for efficient retrieval and manipulation of the data. Databases are used in many different types of systems to store and manage data, from small personal applications to large enterprise systems.

There are many different types of databases available, each with their own strengths and weaknesses. Some of the most common types of databases are:

- Relational databases
- NoSQL databases
- Graph databases
- Time-series databases

Learn more from the following links:

- [Intro to Databases](https://github.com/donnemartin/system-design-primer#database)
- [Database design](https://en.wikipedia.org/wiki/Database_design)

@@ -1 +1,8 @@

# Client caching

Caches can be located on the client side (OS or browser), server side, or in a distinct cache layer.

To learn more, visit the following links:

- [Intro to Client Caching](https://github.com/donnemartin/system-design-primer#client%20caching)
- [Server side Client Caching](https://github.com/donnemartin/system-design-primer#reverse-proxy-web-server)

@@ -1 +1,10 @@

# CDN caching

CDNs are considered a type of cache.

A content delivery network (CDN) is a globally distributed network of proxy servers, serving content from locations closer to the user. Generally, static files such as HTML/CSS/JS, photos, and videos are served from the CDN, although some CDNs such as Amazon's CloudFront support dynamic content. The site's DNS resolution will tell clients which server to contact.

To learn more, visit the following links:

- [What is CDN Cache?](https://github.com/donnemartin/system-design-primer#CDN%20cache)
- [CDN Caching](https://github.com/donnemartin/system-design-primer#content-delivery-network)

@@ -1 +1,7 @@

# Web Server Caching

Reverse proxies and caches such as Varnish can serve static and dynamic content directly. Web servers can also cache requests, returning responses without having to contact application servers.

To learn more, visit the following links:

- [Intro to Web Server Caching](https://github.com/donnemartin/system-design-primer#web-server-caching)

@@ -1 +1,7 @@

# Database Caching

Your database usually includes some level of caching in a default configuration, optimized for a generic use case. Tweaking these settings for specific usage patterns can further boost performance.

To learn more, visit the following links:

- [Intro to Database Caching](https://github.com/donnemartin/system-design-primer#database-caching)

@@ -1 +1,7 @@

# Application Caching

In-memory caches such as Memcached and Redis are key-value stores between your application and your data storage. Since the data is held in RAM, it is much faster than typical databases where data is stored on disk. RAM is more limited than disk, so cache invalidation algorithms such as least recently used (LRU) can help invalidate 'cold' entries and keep 'hot' data in RAM.
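
As a toy illustration of LRU eviction (a real deployment would use Memcached or Redis with a configured eviction policy), here is a minimal in-process version:

```python
from collections import OrderedDict

class LRUCache:
    """Tiny in-memory cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None                       # cache miss
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def set(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the 'coldest' entry

cache = LRUCache(capacity=2)
cache.set("a", 1); cache.set("b", 2); cache.set("c", 3)
print(cache.get("a"), cache.get("c"))         # None 3 -- 'a' was evicted
```
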

Visit the following links to learn more:

- [Intro to Application Caching](https://github.com/donnemartin/system-design-primer#application-caching)

@@ -1 +1,16 @@

# Cache-aside

The application is responsible for reading and writing from storage. The cache does not interact with storage directly. The application does the following:

- Look for the entry in the cache, resulting in a cache miss
- Load the entry from the database
- Add the entry to the cache
- Return the entry

Memcached is generally used in this manner. Subsequent reads of data added to the cache are fast. Cache-aside is also referred to as lazy loading. Only requested data is cached, which avoids filling up the cache with data that isn't requested.
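
A minimal sketch of the pattern, using plain dicts to stand in for the cache (e.g. Memcached) and the database:

```python
cache = {}                                   # stand-in for Memcached/Redis
database = {"user:1": {"name": "Alice"}}     # stand-in for the real data store

def get_user(key):
    """Cache-aside: the application itself talks to both the cache and the DB."""
    value = cache.get(key)         # 1. look for the entry in the cache
    if value is None:              #    cache miss
        value = database.get(key)  # 2. load the entry from the database
        cache[key] = value         # 3. add the entry to the cache
    return value                   # 4. return the entry

print(get_user("user:1"))   # miss: loaded from the database, then cached
print(get_user("user:1"))   # hit: served from the cache
```
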

Learn more from the following links:

- [Getting started with Cache-aside](https://github.com/donnemartin/system-design-primer#cache-aside)
- [What is Memcached?](https://memcached.org/)

@@ -1 +1,13 @@

# Write-through

The application uses the cache as the main data store, reading and writing data to it, while the cache is responsible for reading and writing to the database:

- Application adds/updates the entry in the cache
- Cache synchronously writes the entry to the data store
- Return

Write-through is a slow overall operation due to the write operation, but subsequent reads of just-written data are fast. Users are generally more tolerant of latency when updating data than reading data. Data in the cache is not stale.
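
A minimal sketch with dicts standing in for the cache and the data store; in a real write-through setup, the synchronous write to the database is performed by the cache layer itself rather than by application code.

```python
cache = {}      # stand-in for the cache the application writes to
database = {}   # stand-in for the backing data store

def write_through(key, value):
    cache[key] = value      # 1. application adds/updates the entry in the cache
    database[key] = value   # 2. the write goes synchronously to the data store
    return value            # 3. return only once both copies are consistent

write_through("user:1", {"name": "Alice"})
print(cache["user:1"], database["user:1"])   # both copies are already up to date
```
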

To learn more, visit the following links:

- [Getting started with Write-through](https://github.com/donnemartin/system-design-primer#Write-through)

@@ -1 +1,15 @@

# Write-behind

In write-behind, the application does the following:

- Add/update the entry in the cache
- Asynchronously write the entry to the data store, improving write performance
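
A minimal sketch of the asynchronous flush, with a queue and a background thread standing in for the cache's write-behind machinery:

```python
import queue
import threading

cache = {}                # stand-in for the cache
database = {}             # stand-in for the backing data store
pending = queue.Queue()   # writes waiting to be flushed asynchronously

def write_behind(key, value):
    cache[key] = value         # 1. update the cache and return immediately
    pending.put((key, value))  # 2. queue the write for the data store

def flush_worker():
    """Background worker that drains queued writes into the data store."""
    while True:
        key, value = pending.get()
        database[key] = value
        pending.task_done()

threading.Thread(target=flush_worker, daemon=True).start()
write_behind("user:1", {"name": "Alice"})
pending.join()                  # demo only: wait for the asynchronous write
print(database["user:1"])
```
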

## Disadvantages of write-behind

- There could be data loss if the cache goes down prior to its contents hitting the data store.
- It is more complex to implement write-behind than it is to implement cache-aside or write-through.

To learn more, visit the following links:

- [Getting started with Write-behind](https://github.com/donnemartin/system-design-primer#Write-behind)

@@ -1 +1,10 @@

# Refresh-ahead

You can configure the cache to automatically refresh any recently accessed cache entry prior to its expiration. Refresh-ahead can result in reduced latency vs read-through if the cache can accurately predict which items are likely to be needed in the future.

## Disadvantage of refresh-ahead

- Not accurately predicting which items are likely to be needed in the future can result in worse performance than without refresh-ahead.

To learn more, visit the following links:

- [Getting started with Refresh-ahead](https://github.com/donnemartin/system-design-primer#refresh-ahead)

@@ -1 +1,24 @@

# Caching Strategies

Caching improves page load times and can reduce the load on your servers and databases. In this model, the dispatcher will first look up whether the request has been made before and try to find the previous result to return, in order to save the actual execution.

Databases often benefit from a uniform distribution of reads and writes across their partitions. Popular items can skew the distribution, causing bottlenecks. Putting a cache in front of a database can help absorb uneven loads and spikes in traffic.

## Client caching

Caches can be located on the client side (OS or browser), server side, or in a distinct cache layer.

## CDN caching

CDNs are considered a type of cache.

## Web server caching

Reverse proxies and caches such as Varnish can serve static and dynamic content directly. Web servers can also cache requests, returning responses without having to contact application servers.

## Database caching

Your database usually includes some level of caching in a default configuration, optimized for a generic use case. Tweaking these settings for specific usage patterns can further boost performance.

## Application caching

In-memory caches such as Memcached and Redis are key-value stores between your application and your data storage. Since the data is held in RAM, it is much faster than typical databases where data is stored on disk. RAM is more limited than disk, so cache invalidation algorithms such as least recently used (LRU) can help invalidate 'cold' entries and keep 'hot' data in RAM.

To learn more, visit the following links:

- [Getting started with Cache](https://github.com/donnemartin/system-design-primer#cache)

@@ -1 +1,25 @@

# Caching

Caching improves page load times and can reduce the load on your servers and databases. In this model, the dispatcher will first look up whether the request has been made before and try to find the previous result to return, in order to save the actual execution.

Databases often benefit from a uniform distribution of reads and writes across their partitions. Popular items can skew the distribution, causing bottlenecks. Putting a cache in front of a database can help absorb uneven loads and spikes in traffic.

## Client caching

Caches can be located on the client side (OS or browser), server side, or in a distinct cache layer.

## CDN caching

CDNs are considered a type of cache.

## Web server caching

Reverse proxies and caches such as Varnish can serve static and dynamic content directly. Web servers can also cache requests, returning responses without having to contact application servers.

## Database caching

Your database usually includes some level of caching in a default configuration, optimized for a generic use case. Tweaking these settings for specific usage patterns can further boost performance.

## Application caching

In-memory caches such as Memcached and Redis are key-value stores between your application and your data storage. Since the data is held in RAM, it is much faster than typical databases where data is stored on disk. RAM is more limited than disk, so cache invalidation algorithms such as least recently used (LRU) can help invalidate 'cold' entries and keep 'hot' data in RAM.

To learn more, visit the following links:

- [Getting started with Cache](https://github.com/donnemartin/system-design-primer#cache)
- [From cache to in-memory data grid](https://www.slideshare.net/tmatyashovsky/from-cache-to-in-memory-data-grid-introduction-to-hazelcast)

@@ -1 +1,23 @@

# Message Queues

Message queues receive, hold, and deliver messages. If an operation is too slow to perform inline, you can use a message queue with the following workflow:

- An application publishes a job to the queue, then notifies the user of the job status
- A worker picks up the job from the queue, processes it, then signals that the job is complete

The user is not blocked and the job is processed in the background. During this time, the client might optionally do a small amount of processing to make it seem like the task has completed. For example, if posting a tweet, the tweet could be instantly posted to your timeline, but it could take some time before your tweet is actually delivered to all of your followers.
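
A minimal sketch of the publish/consume flow, using the standard-library queue as a stand-in for a broker such as RabbitMQ or SQS:

```python
import queue
import threading

jobs = queue.Queue()   # stand-in for the message broker

def publish(job):
    jobs.put(job)                    # the application publishes the job...
    print(f"queued: {job}")          # ...and can immediately tell the user it is pending

def worker():
    while True:
        job = jobs.get()             # a worker picks the job up...
        print(f"processed: {job}")   # ...does the slow work...
        jobs.task_done()             # ...and signals that the job is complete

threading.Thread(target=worker, daemon=True).start()
publish({"type": "fan_out_tweet", "tweet_id": 42})
jobs.join()   # demo only: a real user would not be blocked on this
```
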

## Redis

Redis is useful as a simple message broker, but messages can be lost.

## RabbitMQ

RabbitMQ is popular but requires you to adapt to the AMQP protocol and manage your own nodes.

## Amazon SQS

Amazon SQS is hosted but can have high latency and has the possibility of messages being delivered twice.

To learn more, visit the following links:

- [What is Redis?](https://redis.io/)
- [RabbitMQ in Message Queues](https://www.rabbitmq.com/)
- [Overview of Amazon SQS](https://aws.amazon.com/sqs/)

@@ -1 +1,10 @@

# Task Queues

Task queues receive tasks and their related data, run them, then deliver their results. They can support scheduling and can be used to run computationally-intensive jobs in the background.

Celery has support for scheduling and primarily has Python support.
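
For example, a minimal Celery setup looks roughly like this, assuming Celery is installed and a Redis broker is running at the URL shown:

```python
# tasks.py
from celery import Celery

app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def generate_report(user_id):
    # placeholder for a computationally intensive job
    return f"report for user {user_id}"

# From the web application, enqueue the task without blocking the request:
#   generate_report.delay(42)
# A worker started with `celery -A tasks worker` picks it up and runs it.
```
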

To learn more, visit the following links:

- [Overview of Task Queues](https://github.com/donnemartin/system-design-primer#task%20queues)
- [Celery - Distributed Task Queue](https://docs.celeryq.dev/en/stable/)

@@ -1 +1,7 @@

# Back Pressure

If queues start to grow significantly, the queue size can become larger than memory, resulting in cache misses, disk reads, and even slower performance. Back pressure can help by limiting the queue size, thereby maintaining a high throughput rate and good response times for jobs already in the queue. Once the queue fills up, clients get a server busy or HTTP 503 status code to try again later. Clients can retry the request at a later time, perhaps with exponential backoff.
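
A minimal sketch of applying back pressure with a bounded queue; the size limit and status codes are illustrative.

```python
import queue

MAX_PENDING = 100                       # hypothetical bound on queued work
jobs = queue.Queue(maxsize=MAX_PENDING)

def submit(job):
    """Accept work while there is room; otherwise push back on the client."""
    try:
        jobs.put_nowait(job)
        return 202   # accepted for background processing
    except queue.Full:
        return 503   # server busy: the client should retry, e.g. with exponential backoff

print(submit({"task": "resize-image"}))
```
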

To learn more, visit the following links:

- [Overview of Back Pressure](https://github.com/donnemartin/system-design-primer#back%20pressure)

@@ -1 +1,8 @@

# Asynchronism

Asynchronous workflows help reduce request times for expensive operations that would otherwise be performed in-line. They can also help by doing time-consuming work in advance, such as periodic aggregation of data.

To learn more, visit the following links:

- [Overview of Asynchronism](https://github.com/donnemartin/system-design-primer#Asynchronism)
- [What is the difference between a message queue and a task queue?](https://www.quora.com/What-is-the-difference-between-a-message-queue-and-a-task-queue-Why-would-a-task-queue-require-a-message-broker-like-RabbitMQ-Redis-Celery-or-IronMQ-to-function)

@@ -1 +1,8 @@

# Idempotent Operations

An idempotent operation is an operation, action, or request that can be applied multiple times without changing the result, i.e. the state of the system, beyond the initial application. In a web application, for example, an idempotent request can be made multiple times with the same effect as making it once, which makes retries safe.
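
A common way to make a naturally non-idempotent action (such as charging an account) safe to retry is to record an idempotency key along with its result; this is a minimal sketch with in-memory dicts standing in for durable storage.

```python
balances = {"alice": 100}
processed = {}   # idempotency key -> result of the original request

def charge(idempotency_key, account, amount):
    """Retrying with the same key does not charge the account twice."""
    if idempotency_key in processed:
        return processed[idempotency_key]   # replay the original result
    balances[account] -= amount             # perform the side effect exactly once
    result = {"status": "charged", "amount": amount}
    processed[idempotency_key] = result
    return result

charge("req-123", "alice", 30)
charge("req-123", "alice", 30)   # retried request: no double charge
print(balances["alice"])         # 70
```
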

To learn more, visit the following links:

- [What is an idempotent operation?](https://stackoverflow.com/questions/1077412/what-is-an-idempotent-operation)
- [Overview of Idempotent Operation](https://www.baeldung.com/cs/idempotent-operations)

@@ -1 +1,8 @@

# HTTP

HTTP is a method for encoding and transporting data between a client and a server. It is a request/response protocol: clients issue requests and servers issue responses with relevant content and completion status info about the request. HTTP is self-contained, allowing requests and responses to flow through many intermediate routers and servers that perform load balancing, caching, encryption, and compression.

To learn more, visit the following links:

- [What Is HTTP?](https://www.nginx.com/resources/glossary/http/)
- [What is the difference between HTTP protocol and TCP protocol?](https://www.quora.com/What-is-the-difference-between-HTTP-protocol-and-TCP-protocol)

@@ -1 +1,13 @@

# TCP

TCP is a connection-oriented protocol over an IP network. Connections are established and terminated using a handshake. All packets sent are guaranteed to reach the destination in the original order and without corruption through:

- Sequence numbers and checksum fields for each packet
- Acknowledgement packets and automatic retransmission

If the sender does not receive a correct response, it will resend the packets. If there are multiple timeouts, the connection is dropped. TCP also implements flow control and congestion control. These guarantees cause delays and generally result in less efficient transmission than UDP.
||||
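A minimal sketch of a TCP exchange using Python's standard `socket` module; the handshake, ordering, retransmission, and checksums described above are handled by the operating system's TCP implementation:

```python
import socket
import threading

# Bind to an ephemeral loopback port so the example does not clash with anything local.
srv = socket.create_server(("127.0.0.1", 0))
host, port = srv.getsockname()

def handle_one():
    conn, _ = srv.accept()            # the TCP handshake has completed by this point
    with conn:
        data = conn.recv(1024)        # bytes arrive intact and in order
        conn.sendall(data)            # echo them back

threading.Thread(target=handle_one, daemon=True).start()

with socket.create_connection((host, port)) as client:
    client.sendall(b"hello over tcp")
    print(client.recv(1024))          # b'hello over tcp'
srv.close()
```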
|
||||
To learn more, visit the following links: |
||||
|
||||
- [What Is TCP?](https://github.com/donnemartin/system-design-primer#TCP) |
||||
- [What is the difference between HTTP protocol and TCP protocol?](https://www.quora.com/What-is-the-difference-between-HTTP-protocol-and-TCP-protocol) |
@ -1 +1,18 @@ |
||||
# Udp |
||||
# UDP |
||||
|
||||
UDP is connectionless. Datagrams (analogous to packets) are guaranteed only at the datagram level. Datagrams might reach their destination out of order or not at all. UDP does not support congestion control. Without the guarantees that TCP provides, UDP is generally more efficient. |
||||
|
||||
UDP can broadcast, sending datagrams to all devices on the subnet. This is useful with DHCP because the client has not yet received an IP address, which TCP would require to establish a connection. |
||||
|
||||
UDP is less reliable but works well in real time use cases such as VoIP, video chat, streaming, and realtime multiplayer games. |
||||
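For contrast with the TCP example, a minimal UDP exchange with the standard `socket` module; there is no connection, no handshake, and no delivery guarantee (it arrives here only because it runs over loopback):

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))            # ephemeral port chosen for the example
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello over udp", addr)     # fire and forget: no handshake, no ACK

data, _ = receiver.recvfrom(1024)          # on a lossy network this might never arrive
print(data)                                # b'hello over udp'

sender.close()
receiver.close()
```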
|
||||
Use UDP over TCP when: |
||||
|
||||
- You need the lowest latency |
||||
- Late data is worse than loss of data |
||||
- You want to implement your own error correction |
||||
|
||||
To learn more, visit the following link: |
||||
|
||||
- [What Is UDP?](https://github.com/donnemartin/system-design-primer#UDP) |
||||
- [Difference between TCP and UDP?](https://stackoverflow.com/questions/5970383/difference-between-tcp-and-udp) |
@ -1 +1,7 @@ |
||||
# Rpc |
||||
# RPC |
||||
|
||||
In an RPC, a client causes a procedure to execute on a different address space, usually a remote server. The procedure is coded as if it were a local procedure call, abstracting away the details of how to communicate with the server from the client program. Remote calls are usually slower and less reliable than local calls so it is helpful to distinguish RPC calls from local calls. Popular RPC frameworks include Protobuf, Thrift, and Avro. |
||||
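To show the idea without framework code generation, here is a sketch using Python's standard-library XML-RPC modules; production systems more often use frameworks such as gRPC or Thrift, and the port and `add` function are assumptions for the example:

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# Server side: expose a function on an ephemeral local port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False, allow_none=True)
host, port = server.server_address
server.register_function(lambda x, y: x + y, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client side: the call looks local but executes on the server.
client = ServerProxy(f"http://{host}:{port}")
print(client.add(2, 3))   # -> 5
server.shutdown()
```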
|
||||
To learn more, visit the following links: |
||||
|
||||
- [What Is RPC?](https://github.com/donnemartin/system-design-primer#RPC) |
||||
|
@ -1 +1,15 @@ |
||||
# Rest |
||||
# REST |
||||
|
||||
REST is an architectural style enforcing a client/server model where the client acts on a set of resources managed by the server. The server provides a representation of resources and actions that can either manipulate or get a new representation of resources. All communication must be stateless and cacheable. |
||||
|
||||
There are four qualities of a RESTful interface: |
||||
|
||||
- Identify resources (URI in HTTP) - use the same URI regardless of any operation. |
||||
- Change with representations (Verbs in HTTP) - use verbs, headers, and body. |
||||
- Self-descriptive error message (status response in HTTP) - Use status codes, don't reinvent the wheel. |
||||
- HATEOAS (HTML interface for HTTP) - your web service should be fully accessible in a browser. |
||||
|
||||
To learn more, visit the following links: |
||||
|
||||
- [What Is REST?](https://github.com/donnemartin/system-design-primer#REST) |
||||
- [What are the drawbacks of using RESTful APIs?](https://www.quora.com/What-are-the-drawbacks-of-using-RESTful-APIs) |
@ -1 +1,7 @@ |
||||
# Grpc |
||||
# gRPC |
||||
|
||||
gRPC is a high-performance, open-source framework for building remote procedure call (RPC) APIs. It is based on the Protocol Buffers data serialization format and supports a variety of programming languages, including C#, Java, and Python. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [What Is gRPC?](https://www.wallarm.com/what/the-concept-of-grpc) |
@ -1 +1,8 @@ |
||||
# Graphql |
||||
# GraphQL |
||||
|
||||
GraphQL is a query language and runtime for building APIs. It allows clients to define the structure of the data they need and the server will return exactly that. This is in contrast to traditional REST APIs, where the server exposes a fixed set of endpoints and the client must work with the data as it is returned. |
||||
|
||||
To learn more, visit the following links: |
||||
|
||||
- [GraphQL Server](https://www.howtographql.com/basics/3-big-picture/) |
||||
- [What is GraphQL?](https://www.redhat.com/en/topics/api/what-is-graphql) |
@ -1 +1,28 @@ |
||||
# Communication |
||||
|
||||
## Hypertext transfer protocol (HTTP) |
||||
HTTP is a method for encoding and transporting data between a client and a server. It is a request/response protocol: clients issue requests and servers issue responses with relevant content and completion status info about the request. HTTP is self-contained, allowing requests and responses to flow through many intermediate routers and servers that perform load balancing, caching, encryption, and compression. |
||||
|
||||
|
||||
## Transmission control protocol (TCP) |
||||
TCP is a connection-oriented protocol over an IP network. Connection is established and terminated using a handshake. All packets sent are guaranteed to reach the destination in the original order and without corruption through: |
||||
|
||||
- Sequence numbers and checksum fields for each packet |
||||
- Acknowledgement packets and automatic retransmission |
||||
|
||||
|
||||
## User datagram protocol (UDP) |
||||
UDP is connectionless. Datagrams (analogous to packets) are guaranteed only at the datagram level. Datagrams might reach their destination out of order or not at all. UDP does not support congestion control. Without the guarantees that TCP provides, UDP is generally more efficient. |
||||
|
||||
UDP can broadcast, sending datagrams to all devices on the subnet. This is useful with DHCP because the client has not yet received an IP address, which TCP would require to establish a connection. |
||||
|
||||
|
||||
## Remote procedure call (RPC) |
||||
In an RPC, a client causes a procedure to execute on a different address space, usually a remote server. The procedure is coded as if it were a local procedure call, abstracting away the details of how to communicate with the server from the client program. Remote calls are usually slower and less reliable than local calls so it is helpful to distinguish RPC calls from local calls. Popular RPC frameworks include Protobuf, Thrift, and Avro. |
||||
|
||||
## Representational state transfer (REST) |
||||
REST is an architectural style enforcing a client/server model where the client acts on a set of resources managed by the server. The server provides a representation of resources and actions that can either manipulate or get a new representation of resources. All communication must be stateless and cacheable. |
||||
|
||||
To learn more, visit the following links: |
||||
|
||||
- [Getting started with Communication](https://github.com/donnemartin/system-design-primer) |
@ -1 +1,8 @@ |
||||
# Busy database |
||||
# Busy Database |
||||
|
||||
A busy database in system design refers to a database that is handling a high volume of requests or transactions. This can occur when a system is experiencing high traffic or when the database is not properly optimized for the workload it is handling, and it can lead to performance degradation, increased resource utilization, deadlocks and contention, and data inconsistencies. To address a busy database, a number of approaches can be taken, such as scaling out, optimizing the schema, caching, and indexing. |
||||
|
||||
To learn more, visit the following links: |
||||
|
||||
- [Busy Database antipattern](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/busy-database/) |
||||
- [Database Design](https://www.sciencedirect.com/topics/computer-science/database-design) |
@ -1 +1,8 @@ |
||||
# Busy frontend |
||||
# Busy Frontend |
||||
|
||||
A busy frontend in system design refers to a frontend that is handling a high volume of requests or traffic. This can occur when a system is experiencing high traffic or when the frontend is not properly optimized for the workload it is handling, and it can lead to performance degradation, increased resource utilization, increased error rates, and a poor user experience. To address a busy frontend, a number of approaches can be taken, such as scaling out, optimizing the code, caching, and load balancing. |
||||
|
||||
To learn more, visit the following link: |
||||
|
||||
- [Busy Front End antipattern](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/busy-front-end/) |
||||
- [What is Front end system design?](https://www.youtube.com/watch?v=XPNMiWyHBAU) |
@ -1 +1,8 @@ |
||||
# Chatty io |
||||
# Chatty I/O |
||||
|
||||
Chatty I/O in system design is an antipattern in which a component makes a large number of small I/O requests (to a database, the file system, or a network service) instead of fewer, larger ones. Each request carries its own overhead, such as network round trips, connection handling, and query parsing, so chatty I/O leads to higher latency, increased resource utilization, and poor scalability. To address it, batch related operations, retrieve the data needed in a single round trip, and cache frequently used data. |
||||
|
||||
To learn more, visit the following links: |
||||
|
||||
- [Chatty I/O antipattern](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/chatty-io/) |
@ -1 +1,15 @@ |
||||
# Extraneous fetching |
||||
# Extraneous Fetching |
||||
|
||||
Extraneous fetching in system design refers to the practice of retrieving more data than is needed for a specific task or operation. This can occur when a system is not optimized for the specific workload or when the system is not properly designed to handle the data requirements. |
||||
|
||||
Extraneous fetching can lead to a number of issues, such as: |
||||
|
||||
- Performance degradation |
||||
- Increased resource utilization |
||||
- Increased network traffic |
||||
- Poor user experience |
||||
|
||||
Visit the following links to learn more: |
||||
|
||||
- [Extraneous Fetching antipattern](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/extraneous-fetching/) |
||||
- [What’s the difference between extraneous and confounding variables?](https://www.scribbr.com/frequently-asked-questions/extraneous-vs-confounding-variables/) |
@ -1 +1,7 @@ |
||||
# Improper instantiation |
||||
# Improper Instantiation |
||||
|
||||
Improper instantiation in system design refers to the practice of creating unnecessary instances of an object, class or service, which can lead to performance and scalability issues. This can happen when the system is not properly designed, when the code is not written in an efficient way, or when the code is not optimized for the specific use case. |
||||
|
||||
Learn more from the following links: |
||||
- [Improper Instantiation antipattern](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/improper-instantiation/) |
||||
- [What is Instantiation?](https://www.techtarget.com/whatis/definition/instantiation) |
@ -1 +1,8 @@ |
||||
# Monolithic persistence |
||||
# Monolithic Persistence |
||||
|
||||
Monolithic Persistence in system design refers to the use of a single, monolithic database to store all of the data for an application or system. This approach can work for simple, small-scale systems, but as the system grows and evolves it can become a bottleneck, resulting in poor scalability, limited flexibility, and increased complexity. To address these limitations, a number of approaches can be taken, such as microservices, sharding, and NoSQL databases. |
||||
|
||||
To learn more, visit the following links: |
||||
|
||||
- [Monolithic Persistence antipattern](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/monolithic-persistence/) |
||||
- [System Design: Monoliths and Microservices](https://dev.to/karanpratapsingh/system-design-monoliths-and-microservices-24jn) |
@ -1 +1,15 @@ |
||||
# No caching |
||||
# No Caching |
||||
|
||||
The No Caching antipattern in system design refers to a system that repeatedly fetches the same data from a data store, or recomputes the same results, on every request instead of caching them. Because every request pays the full cost of the lookup or computation, the data store becomes a bottleneck as load grows. |
||||
 |
||||
A lack of caching can have several disadvantages: |
||||
 |
||||
- Increased latency for frequently requested data |
||||
- Higher load on the underlying data store |
||||
- Wasted CPU, network, and I/O resources |
||||
- Poor scalability under heavy traffic |
||||
|
||||
Learn from the following links: |
||||
|
||||
- [What is Caching in system design?](https://www.enjoyalgorithms.com/blog/caching-system-design-concept) |
||||
- [No Caching antipattern](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/no-caching/) |
@ -1 +1,14 @@ |
||||
# Noisy neighbor |
||||
# Noisy Neighbor |
||||
|
||||
Noisy neighbor in system design refers to a situation in which one or more components of a system are utilizing a disproportionate amount of shared resources, leading to resource contention and reduced performance for other components. This can occur when a system is not properly designed or configured to handle the workload, or when a component is behaving unexpectedly. |
||||
|
||||
Examples of noisy neighbor scenarios include: |
||||
|
||||
- One user on a shared server utilizing a large amount of CPU or memory, leading to reduced performance for other users on the same server. |
||||
- One process on a shared server utilizing a large amount of I/O, causing other processes to experience slow I/O and increased latency. |
||||
- One application consuming a large amount of network bandwidth, causing other applications to experience reduced throughput. |
||||
|
||||
Learn from the following links: |
||||
|
||||
- [Noisy Neighbor](https://docs.aws.amazon.com/wellarchitected/latest/saas-lens/noisy-neighbor.html) |
||||
- [Get started with Noisy Neighbor antipattern](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/noisy-neighbor/noisy-neighbor) |
@ -1 +1,8 @@ |
||||
# Retry storm |
||||
# Retry Storm |
||||
|
||||
Retry Storm in system design refers to a situation in which a large number of retries are triggered in a short period of time, leading to a significant increase in traffic and resource usage. This can occur when a system is not properly designed to handle failures or when a component is behaving unexpectedly, and it can lead to performance degradation, increased resource utilization, increased network traffic, and a poor user experience. To address retry storms, a number of approaches can be taken, such as exponential backoff, circuit breaking, and monitoring and alerting. |
||||
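A sketch of one of those mitigations, retrying with exponential backoff and jitter; the attempt limit, base delay, and the wrapped operation are assumptions for the example:

```python
import random
import time

def call_with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=5.0):
    """Call `operation`, retrying failures with exponentially growing, jittered delays."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise                                   # give up instead of retrying forever
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))        # jitter spreads retries out over time
```

Capping the delay and the number of attempts keeps a failing dependency from being hammered by synchronized retries from many clients at once.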
|
||||
To learn more, visit the following links: |
||||
|
||||
- [Retry Storm antipattern](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/retry-storm/) |
||||
- [How To Avoid Retry Storms In Distributed Systems](https://faun.pub/how-to-avoid-retry-storms-in-distributed-systems-91bf34f43c7f) |
@ -1 +1,11 @@ |
||||
# Synchronous io |
||||
# Synchronous IO |
||||
|
||||
In system design, synchronous IO refers to a type of input/output (IO) operation where the program execution is blocked or halted until the IO operation completes. This means that the program will wait for the IO operation to finish before it can continue executing the next instruction. Synchronous IO can be used in a variety of scenarios, such as: |
||||
|
||||
- **Reading and writing files:** When a program needs to read or write a file, it can use synchronous IO to ensure that the operation completes before continuing. |
||||
- **Communicating with a database:** When a program needs to query or update a database, it can use synchronous IO to ensure that the operation completes before continuing. |
||||
- **Networking:** When a program needs to send or receive data over a network, it can use synchronous IO to ensure that the operation completes before continuing. |
||||
|
||||
To learn more, visit the following links: |
||||
|
||||
- [What is Synchronous I/O antipattern?](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/synchronous-io/) |
@ -1 +1,16 @@ |
||||
# Performance antipatterns |
||||
# Performance Antipatterns |
||||
|
||||
Performance antipatterns in system design refer to common mistakes or suboptimal practices that can lead to poor performance in a system. These patterns can occur at different levels of the system and can be caused by a variety of factors such as poor design, lack of optimization, or lack of understanding of the workload. |
||||
|
||||
Examples of performance antipatterns include: |
||||
|
||||
- **N+1 queries:** This occurs when a system makes multiple queries to a database to retrieve related data, instead of using a single query to retrieve all the necessary data. |
||||
- **Chatty interfaces:** This occurs when a system makes too many small and frequent requests to an external service or API, instead of making fewer, larger requests. |
||||
- **Unbounded data:** This occurs when a system retrieves or processes more data than is necessary for the task at hand, leading to increased resource usage and reduced performance. |
||||
- **Inefficient algorithms:** This occurs when a system uses an algorithm that is not well suited to the task at hand, leading to increased resource usage and reduced performance. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [Performance antipatterns for cloud applications](https://learn.microsoft.com/en-us/azure/architecture/antipatterns/) |
||||
- [Guide to Software Performance Antipatterns](http://www.perfeng.com/papers/antipat.pdf) |
@ -1 +1,7 @@ |
||||
# Health monitoring |
||||
# Health Monitoring |
||||
|
||||
Health monitoring in system design refers to the practice of checking that a system, service, or application is running and able to process requests. It typically relies on instrumentation such as health-check endpoints, heartbeat messages, and logs, giving operators real-time insight into the state of the system so that failures can be detected and addressed quickly. |
||||
|
||||
Learn more from the following: |
||||
|
||||
- [Design of Wearable Health Monitoring Systems](https://link.springer.com/chapter/10.1007/978-3-319-23341-3_6) |
@ -1 +1,14 @@ |
||||
# Availability monitoring |
||||
# Availability Monitoring |
||||
|
||||
Availability monitoring in system design refers to the practice of monitoring the availability of a system, service or application, to ensure that it is functioning correctly and is accessible to users when they need it. This is an important aspect of ensuring that a system is reliable and performs well. |
||||
|
||||
Availability monitoring typically includes the following components: |
||||
|
||||
- Heartbeat monitoring |
||||
- Transaction monitoring |
||||
- Alerts and notifications |
||||
- Root cause analysis |
||||
|
||||
Learn more from the following: |
||||
|
||||
- [System Monitoring, Alerting and Availability](https://www.aits.uillinois.edu/services/network_and_desktop_services/system_monitoring__alerting_and_availability) |
@ -1 +1,7 @@ |
||||
# Performance monitoring |
||||
# Performance Monitoring |
||||
|
||||
Performance monitoring in system design refers to the practice of monitoring the performance of a system, service, or application, in order to ensure that it is performing well and meeting the needs of users. This is an important aspect of ensuring that a system is reliable and performs well. |
||||
|
||||
Learn more from following links: |
||||
|
||||
- [Get More on Performance Monitoring Systems](https://www.solarwinds.com/server-application-monitor/use-cases/performance-monitoring-system) |
@ -1 +1,15 @@ |
||||
# Security monitoring |
||||
# Security Monitoring |
||||
|
||||
Security monitoring in system design refers to the practice of monitoring the security of a system, service, or application, in order to detect and respond to security threats and vulnerabilities. This is an important aspect of ensuring that a system is secure and protected against unauthorized access, data breaches, and other security incidents. |
||||
|
||||
Security monitoring typically includes the following components: |
||||
|
||||
- Event collection |
||||
- Event analysis and correlation |
||||
- Alerts and notifications |
||||
- Incident response |
||||
- Compliance and audit |
||||
|
||||
Visit the following to learn more: |
||||
|
||||
- [Intro to Security Monitoring](https://www.sciencedirect.com/topics/computer-science/security-monitoring) |
@ -1 +1,14 @@ |
||||
# Usage monitoring |
||||
# Usage Monitoring |
||||
|
||||
Usage monitoring in system design refers to the practice of monitoring the usage of a system, service, or application, in order to understand how it is being used and identify any potential issues or areas for improvement. This is an important aspect of ensuring that a system is meeting the needs of users and providing value. |
||||
|
||||
Usage monitoring typically includes the following components: |
||||
|
||||
- Data collection |
||||
- Data analysis and visualization |
||||
- Alerts and notifications |
||||
- Trend analysis |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [What is Usage Monitoring?](https://patterns.arcitura.com/cloud-computing-patterns/design_patterns/usage_monitoring) |
@ -1 +1,7 @@ |
||||
# Instrumentation |
||||
|
||||
Instrumentation in system design refers to the process of adding monitoring and measurement capabilities to a system, service, or application, so that developers and operations teams can observe its behavior, measure its performance, and diagnose issues. Common forms of instrumentation include logging, metrics, event counters, and distributed tracing. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [Instrumentation System Docs](http://eolss.net/Sample-Chapters/C05/E6-39A-04-08.pdf) |
@ -1 +1,14 @@ |
||||
# Visualization and alerts |
||||
# Visualization and Alerts |
||||
|
||||
Visualization and alerts in system design refer to presenting the data collected through monitoring and instrumentation in a form that is easy to understand, such as dashboards, charts, and reports, and to raising alerts or notifications when metrics cross defined thresholds or anomalous behavior is detected, so that operators can respond quickly. |
||||
 |
||||
Dashboards and alerts commonly cover aspects of a system such as: |
||||
 |
||||
- Performance: response time, throughput, and resource utilization. |
||||
- Errors: exception rates, failed requests, and stack traces. |
||||
- Security: authentication attempts and suspicious network traffic. |
||||
- Usage: the number of users and requests. |
||||
|
||||
To learn more, visit the following links: |
||||
|
||||
- [Visualize Data and Raise Alerts](https://learn.microsoft.com/en-us/azure/architecture/framework/devops/monitor-visualize-data) |
@ -1 +1,8 @@ |
||||
# Monitoring |
||||
|
||||
System monitoring involves the continuous monitoring of an IT infrastructure by its operators. It includes the monitoring of CPU, server memory, routers, switches, bandwidth, and applications, as well as the performance and availability of important network devices. |
||||
|
||||
Visit the following to learn more: |
||||
|
||||
- [Design and implement a monitoring system](https://www.tdh.ch/sites/default/files/tdh_gmm_en_nouvelleversion_ang.pdf) |
||||
- [System Design — Design a Monitoring System](https://gongybable.medium.com/system-design-design-a-monitoring-system-f0f0cbafc895) |
@ -1 +1,8 @@ |
||||
# Asynchronous request reply |
||||
# Asynchronous Request Reply |
||||
|
||||
Asynchronous Request-Reply in system design refers to a pattern where a client sends a request to a server and the server responds asynchronously, allowing the client to continue processing other tasks or requests without waiting for the server's response. This can improve the performance and scalability of a system by allowing multiple requests to be processed concurrently. It can be implemented using callbacks, promises or event-based models. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [Asynchronous Request-Reply pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/async-request-reply) |
||||
- [Intro to Asynchronous Request-Response](https://codeopinion.com/asynchronous-request-response-pattern-for-non-blocking-workflows/) |
@ -1 +1,8 @@ |
||||
# Claim check |
||||
# Claim Check |
||||
|
||||
Claim check in system design is a pattern where large or complex data is replaced with a small token or reference, which is passed along with a message or request. This can help to reduce the size and complexity of messages, and improve the performance and scalability of a system. The large or complex data is stored in a separate location, and a token generator is used to create a unique token for the actual data. |
||||
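A compact sketch of the pattern; the in-memory payload store and the message shape are assumptions for the example, where a real system would use blob storage or a database plus a message broker:

```python
import uuid

payload_store = {}                         # stand-in for blob/object storage

def check_in(large_payload):
    token = str(uuid.uuid4())              # the "claim check"
    payload_store[token] = large_payload
    return token

def check_out(token):
    return payload_store[token]

token = check_in(b"x" * 10_000_000)        # only the small token travels with the message
message = {"type": "video-uploaded", "claim_check": token}
payload = check_out(message["claim_check"])   # consumer redeems the token for the data
```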
|
||||
Learn more from the following links: |
||||
|
||||
- [An Introduction to Claim-Check Pattern and Its Uses](https://aws.plainenglish.io/an-introduction-to-claim-check-pattern-and-its-uses-b018649a380d) |
||||
- [Claim Check - Cloud Design patterns](https://learn.microsoft.com/en-us/azure/architecture/patterns/) |
@ -1 +1,8 @@ |
||||
# Choreography |
||||
|
||||
Choreography in system design refers to the design and coordination of interactions between autonomous systems or services, without the use of a central controlling entity. Each system or service is responsible for its own behavior and communication with other systems or services, and there is no central point of control or coordination. Choreography can be used to improve the scalability, flexibility, and resilience of a system, by allowing each service to evolve and scale independently. It can be implemented using event-based, message-based or API-based models. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [Choreography pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/choreography) |
||||
- [Service choreography](https://en.wikipedia.org/wiki/Service_choreography) |
@ -1 +1,8 @@ |
||||
# Competing consumers |
||||
# Competing Consumers |
||||
|
||||
Competing Consumers in system design is a pattern that allows multiple consumers to process messages concurrently from a shared message queue. This approach can be used to improve the performance and scalability of a system by allowing multiple consumers to process messages in parallel. This pattern can be used in scenarios like load balancing and fault tolerance. It can be implemented using a variety of messaging technologies such as message queues, message brokers, and publish-subscribe systems. |
||||
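A minimal sketch of competing consumers with Python's standard library, where several worker threads pull from one shared queue; the worker count and message contents are arbitrary assumptions:

```python
import queue
import threading

messages = queue.Queue()

def worker(worker_id):
    while True:
        msg = messages.get()
        if msg is None:                     # sentinel value: shut this worker down
            break
        print(f"worker {worker_id} processed {msg}")
        messages.task_done()

workers = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for w in workers:
    w.start()

for i in range(10):
    messages.put(f"message {i}")            # producers never pick a specific consumer
messages.join()                             # wait until every message is handled

for _ in workers:
    messages.put(None)
for w in workers:
    w.join()
```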
|
||||
Learn more from the following links: |
||||
|
||||
- [Competing Consumers pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/competing-consumers) |
||||
- [Competing Consumers Pattern - Explained](https://medium.com/event-driven-utopia/competing-consumers-pattern-explained-b338d54eff2b) |
@ -1 +1,8 @@ |
||||
# Pipes and filters |
||||
# Pipes and Filters |
||||
|
||||
Pipes and Filters in system design is a pattern that separates the processing of a task into a series of smaller, independent components, connected together in a pipeline. Each component, or filter, performs a specific task, and the output of one filter is passed as the input to the next filter. This approach can be used to build modular and extensible systems, by allowing filters to be added, removed, or replaced easily. Pipes and Filters pattern can be used in scenarios like data processing and data transformation. It can be implemented using a variety of technologies such as streams, generators, and iterators. |
||||
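A small sketch of the pattern using generator functions, where each filter does one job and streams its output to the next stage; the specific filters are assumptions for the example:

```python
def read_lines(lines):
    for line in lines:
        yield line

def strip_blank(lines):
    for line in lines:
        if line.strip():          # drop empty lines
            yield line

def to_upper(lines):
    for line in lines:
        yield line.upper()

raw = ["hello", "", "pipes and filters", ""]
pipeline = to_upper(strip_blank(read_lines(raw)))   # filters composed into a pipe
print(list(pipeline))                               # ['HELLO', 'PIPES AND FILTERS']
```

Because each stage only depends on the shape of its input and output, filters can be added, removed, or reordered without touching the others.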
|
||||
Learn more from the following links: |
||||
|
||||
- [Pipe and Filter Architectural Style](https://cs.uwaterloo.ca/~m2nagapp/courses/CS446/1181/Arch_Design_Activity/PipeFilter.pdf) |
||||
- [Pipes and Filters pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/pipes-and-filters) |
@ -1 +1,7 @@ |
||||
# Priority queue |
||||
# Priority Queue |
||||
|
||||
A priority queue in system design is a data structure that stores items with a priority value, and allows for efficient retrieval and manipulation of the items based on their priority. The items with the highest priority are retrieved first. This pattern is useful in situations where certain items or tasks are more important than others and should be processed first. Priority Queue can be used in scenarios like scheduling and real-time systems. It can be implemented using various data structures such as heap, linked list, and array. |
||||
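A short sketch using the standard-library `heapq` module; in this example a lower number means a higher priority, which is an arbitrary convention chosen for the illustration:

```python
import heapq

tasks = []
heapq.heappush(tasks, (2, "send newsletter"))
heapq.heappush(tasks, (1, "process payment"))
heapq.heappush(tasks, (3, "rebuild search index"))

while tasks:
    priority, task = heapq.heappop(tasks)   # highest-priority item comes out first
    print(priority, task)
# 1 process payment
# 2 send newsletter
# 3 rebuild search index
```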
|
||||
Learn more from the following links: |
||||
|
||||
- [Priority Queue pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/priority-queue) |
@ -1 +1,8 @@ |
||||
# Publisher subscriber |
||||
# Publisher Subscriber |
||||
|
||||
Publisher-Subscriber in system design is a pattern that allows multiple subscribers to receive updates from a single publisher, without the publisher and subscribers being aware of each other's existence. This pattern allows for decoupling of the publisher and subscribers, and can be used to build scalable and flexible systems. It can be used in scenarios like event-driven architecture and data streaming. It can be implemented using a variety of technologies such as message queues, message brokers, and event buses. |
||||
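An in-memory sketch of publish/subscribe, where the publisher only knows the topic and never the subscribers; a real system would use a broker such as Kafka or RabbitMQ instead of a dictionary of callbacks:

```python
from collections import defaultdict

subscribers = defaultdict(list)

def subscribe(topic, callback):
    subscribers[topic].append(callback)

def publish(topic, message):
    for callback in subscribers[topic]:     # publisher is unaware of who is listening
        callback(message)

subscribe("user.signup", lambda m: print("send welcome email to", m))
subscribe("user.signup", lambda m: print("update analytics for", m))
publish("user.signup", "alice@example.com")  # both subscribers receive the event
```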
|
||||
Learn more from the following links: |
||||
|
||||
- [What is Pub/Sub Messaging?](https://aws.amazon.com/pub-sub-messaging/) |
||||
- [Publisher Subscriber - Pattern](https://www.enjoyalgorithms.com/blog/publisher-subscriber-pattern) |
@ -1 +1,8 @@ |
||||
# Queue based load leveling |
||||
# Queue Based Load Leveling |
||||
|
||||
Queue-based load leveling in system design is a pattern that allows for the buffering of incoming requests, and the processing of those requests at a controlled rate. This pattern can be used to prevent overloading of a system, and to ensure that the system can handle a variable rate of incoming requests. It can be used in scenarios like traffic spikes and variable workloads. It can be implemented using various data structures such as linked list, array, and heap. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [Queue-Based Load Leveling pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/queue-based-load-leveling) |
||||
- [Design Patterns: Queue-Based Load Leveling Pattern](https://blog.cdemi.io/design-patterns-queue-based-load-leveling-pattern/) |
@ -1 +1,7 @@ |
||||
# Scheduling agent supervisor |
||||
# Scheduling Agent Supervisor |
||||
|
||||
Scheduler Agent Supervisor in system design is a pattern that coordinates a distributed task using three roles: a Scheduler that arranges the steps of the task and orchestrates their execution, Agents that carry out the individual steps (often by calling remote services or resources), and a Supervisor that monitors progress and requests retries or compensating actions when steps fail or time out. This pattern can be used to build robust and fault-tolerant systems, by ensuring that tasks are executed as intended and that any errors or failures are handled appropriately. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [Scheduler Agent Supervisor pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/scheduler-agent-supervisor) |
@ -1 +1,8 @@ |
||||
# Sequential convoy |
||||
# Sequential Convoy |
||||
|
||||
Sequential Convoy in system design is a pattern that allows for the execution of a series of tasks, or convoy, in a specific order. This pattern can be used to ensure that a set of dependent tasks are executed in the correct order and to handle errors or failures during the execution of the tasks. It can be used in scenarios like workflow and transaction. It can be implemented using a variety of technologies such as state machines, workflows, and transactions. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [What is Sequential Convoy?](https://learn.microsoft.com/en-us/biztalk/core/sequential-convoys) |
||||
- [Overview - Sequential Convoy pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/sequential-convoy) |
@ -1 +1,8 @@ |
||||
# Messaging |
||||
|
||||
Messaging in system design is a pattern that allows for communication and coordination between different components or systems through technologies such as message queues, message brokers, and event buses. Decoupling the sender from the receiver in this way supports asynchronous communication, loose coupling, and scalability, and can be used to build flexible systems. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [System Design — Message Queues](https://medium.com/must-know-computer-science/system-design-message-queues-245612428a22) |
||||
- [Intro to System Design - Message Queues](https://dev.to/karanpratapsingh/system-design-message-queues-k9a) |
@ -1 +1,7 @@ |
||||
# Cache aside |
||||
# Cache Aside |
||||
|
||||
Cache-Aside in system design is a pattern that allows for the caching of data, in order to improve the performance and scalability of a system. This pattern is typically used in systems where data is read more frequently than it is written. It can be used to reduce the load on a primary data store, and to improve the responsiveness of a system by reducing the latency of data access. Cache-Aside pattern can be used in scenarios like read-heavy workloads and latency-sensitive workloads. It can be implemented using various caching technologies such as in-memory cache, distributed cache, and file-based cache. |
||||
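A small cache-aside sketch: check the cache first, fall back to the data store on a miss, then populate the cache. The dictionaries below are stand-ins for a real database and a cache such as Redis or Memcached:

```python
cache = {}
database = {"user:1": {"name": "Alice"}, "user:2": {"name": "Bob"}}

def get_user(key):
    if key in cache:               # cache hit
        return cache[key]
    value = database.get(key)      # cache miss: read from the data store
    if value is not None:
        cache[key] = value         # populate the cache for subsequent reads
    return value

def update_user(key, value):
    database[key] = value
    cache.pop(key, None)           # invalidate so the next read fetches fresh data

print(get_user("user:1"))   # miss: loaded from the database
print(get_user("user:1"))   # hit: served from the cache
```

Invalidating on write (rather than updating the cache in place) is one common choice; it keeps the cache and the data store from drifting apart at the cost of one extra miss.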
|
||||
Learn more from the following links: |
||||
|
||||
- [Cache-Aside pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/cache-aside) |
@ -1 +1,9 @@ |
||||
# Cqrs |
||||
# CQRS |
||||
|
||||
CQRS (Command Query Responsibility Segregation) in system design is a pattern that separates the responsibilities of handling read and write operations in a system. This pattern allows for the separation of concerns between the read and write operations, and can be used to improve the scalability, performance, and maintainability of a system. |
||||
|
||||
In this pattern, the read and write operations are handled by different components in the system. The write operations, known as commands, are handled by a Command component that updates the state of the system. The read operations, known as queries, are handled by a Query component that retrieves the current state of the system. |
||||
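A compact sketch of that separation; the in-memory to-do write and read models are assumptions made purely for illustration:

```python
class TodoReadModel:
    def __init__(self):
        self._todos = {}

    def apply_added(self, todo_id, title):      # keeps the read side up to date
        self._todos[todo_id] = title

    def list_todos(self):                       # query: returns data, no side effects
        return dict(self._todos)

class TodoWriteModel:
    def __init__(self, read_model):
        self._read_model = read_model
        self._next_id = 1

    def handle_add_todo(self, title):           # command: changes the state of the system
        todo_id = self._next_id
        self._next_id += 1
        self._read_model.apply_added(todo_id, title)
        return todo_id

read_model = TodoReadModel()
write_model = TodoWriteModel(read_model)
write_model.handle_add_todo("write docs")
print(read_model.list_todos())   # {1: 'write docs'}
```

In larger systems the two sides often use separate storage optimized for writing and reading respectively, with the read model updated asynchronously from the commands.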
|
||||
Learn more from the following links: |
||||
|
||||
- [CQRS pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/cqrs) |
@ -1 +1,8 @@ |
||||
# Event sourcing |
||||
# Event Sourcing |
||||
|
||||
Event Sourcing in system design is a pattern that stores the state of a system as a sequence of events, rather than the current state. Each change to the state of the system is recorded as an event, which is stored in an event store. The current state of the system can be derived from the events in the event store. Event sourcing can be used for various purposes such as tracking history, reconstructing state, recovering from failures, and auditing. It is often implemented in conjunction with the CQRS (Command Query Responsibility Segregation) pattern, which separates the responsibilities of handling read and write operations in a system. |
||||
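A minimal sketch of the idea; the bank-account events and the in-memory list standing in for an event store are assumptions for the example:

```python
events = []   # append-only event store (a database or log in practice)

def record(event_type, amount):
    events.append({"type": event_type, "amount": amount})

def current_balance():
    balance = 0
    for event in events:              # replay the history to rebuild the state
        if event["type"] == "deposited":
            balance += event["amount"]
        elif event["type"] == "withdrawn":
            balance -= event["amount"]
    return balance

record("deposited", 100)
record("withdrawn", 30)
record("deposited", 10)
print(current_balance())   # 80, derived entirely from the event log
```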
|
||||
Learn more from the following links: |
||||
|
||||
- [Event Sourcing pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/event-sourcing) |
||||
- [Overview of Event Sourcing](https://microservices.io/patterns/data/event-sourcing.html) |
@ -1 +1,8 @@ |
||||
# Index table |
||||
# Index Table |
||||
|
||||
An index table in system design is a data structure that allows for efficient lookup of data in a larger data set. It is used to improve the performance of searching, sorting, and retrieving data, by allowing for quick access to specific records or data elements. There are several types of index tables such as B-Tree, Hash table, and Trie each with its own strengths and weaknesses. Index tables can be used in a variety of scenarios such as searching, sorting, and retrieving. |
||||
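A tiny sketch of a hash-table style index over a list of records; the records and the indexed field are assumptions made for the example:

```python
# Without an index, finding a user by email means scanning every record.
users = [
    {"id": 1, "email": "alice@example.com"},
    {"id": 2, "email": "bob@example.com"},
]

# Building an index table keyed on email makes the lookup a single step.
email_index = {user["email"]: user for user in users}

print(email_index["bob@example.com"])   # {'id': 2, 'email': 'bob@example.com'}
```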
|
||||
Learn more from the following links: |
||||
|
||||
- [System Design — Indexes](https://medium.com/must-know-computer-science/system-design-indexes-f6ad3de9925d) |
||||
- [Overview of Index Table](https://dev.to/karanpratapsingh/system-design-indexes-2574) |
@ -1 +1,8 @@ |
||||
# Materialized view |
||||
# Materialized View |
||||
|
||||
A Materialized View in system design is a pre-computed and stored version of a query result, which is used to improve the performance of frequently executed queries. It can be used to improve the performance of read-heavy workloads, by providing a pre-computed version of the data that can be quickly accessed. Materialized views can be used in scenarios like complex queries, large datasets, and real-time analytics. A materialized view can be created by executing a query and storing the result in a table. The data in the materialized view is typically updated periodically, to ensure that it stays up to date with the underlying data. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [Materialized View pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/materialized-view) |
||||
- [Overview of Materialized View Pattern](https://medium.com/design-microservices-architecture-with-patterns/materialized-view-pattern-f29ea249f8f8) |
@ -1 +1,8 @@ |
||||
# Sharding |
||||
|
||||
Sharding in system design is a technique used to horizontally partition a large data set across multiple servers, in order to improve the performance, scalability, and availability of a system. This is done by breaking the data set into smaller chunks, called shards, and distributing the shards across multiple servers. Each shard is self-contained and can be managed and scaled independently of the other shards. Sharding can be used in scenarios like scalability, availability, and geo-distribution. Sharding can be implemented using several different algorithms such as range-based sharding, hash-based sharding, and directory-based sharding. |
||||
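A small sketch of hash-based sharding; the four in-memory dictionaries stand in for separate database servers, and the key format is an assumption for the example:

```python
SHARD_COUNT = 4
shards = [dict() for _ in range(SHARD_COUNT)]   # each dict stands in for one database server

def shard_for(key):
    # Python's built-in hash() is only stable within one process; real systems use a
    # stable hash (or range- or directory-based schemes, or consistent hashing).
    return hash(key) % SHARD_COUNT

def put(key, value):
    shards[shard_for(key)][key] = value

def get(key):
    return shards[shard_for(key)].get(key)

put("user:1", {"name": "Alice"})
put("user:42", {"name": "Bob"})
print(get("user:42"))   # looked up on the same shard it was written to
```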
|
||||
Learn more from the following links: |
||||
|
||||
- [Database Sharding: Concepts and Examples](https://www.mongodb.com/features/database-sharding-explained) |
||||
- [Database Sharding – System Design Interview Concept](https://www.geeksforgeeks.org/database-sharding-a-system-design-concept/) |
@ -1 +1,8 @@ |
||||
# Static content hosting |
||||
# Static Content Hosting |
||||
|
||||
Static Content Hosting in system design is a technique used to serve static resources such as images, stylesheets, and JavaScript files, from a dedicated server or service, rather than from the main application server. This approach can be used to improve the performance, scalability, and availability of a system. Static content hosting can be used in scenarios like performance, scalability, and availability. Static content hosting can be implemented using several different techniques such as Content Delivery Network (CDN), Object Storage and File Server. |
||||
|
||||
Learn more from the following links: |
||||
|
||||
- [The pros and cons of the Static Content Hosting](https://www.redhat.com/architect/pros-and-cons-static-content-hosting-architecture-pattern) |
||||
- [Static Content Hosting pattern](https://learn.microsoft.com/en-us/azure/architecture/patterns/static-content-hosting) |