
System Design Concepts Course and Interview Prep

By freeCodeCamp.org

Summary

Key takeaways

  • System design interviews focus on architecture, not code. They assess your ability to connect components and build systems, rather than your coding proficiency. [00:25], [00:30]
  • Computer architecture: disk, RAM, cache, and CPU interplay. Understanding how disk storage, RAM, cache, and CPU work together is fundamental. Disk storage is non-volatile, RAM is volatile with fast access, cache is smaller but even faster for frequently used data, and the CPU executes instructions. [00:45], [03:40]
  • CAP theorem: consistency, availability, and partition tolerance trade-offs. A distributed system can only guarantee two out of three properties: consistency, availability, and partition tolerance. Design decisions involve making informed compromises based on specific use case requirements. [08:52], [10:02]
  • Protocols: HTTP vs. WebSockets for communication. HTTP is a stateless request-response protocol, while WebSockets provide a persistent, two-way communication channel for real-time updates, essential for applications like chat or live feeds. [19:14], [20:41]
  • API design: REST, GraphQL, and gRPC paradigms. REST APIs use standard HTTP methods but can lead to over- or under-fetching, GraphQL allows clients to request exactly the data they need, and gRPC, built on HTTP/2, is efficient for microservices. [25:44], [26:27]
  • Databases scale via sharding and replication. Horizontal scaling involves distributing data across multiple servers through sharding, or keeping copies via replication for high availability, offering greater scalability than vertical scaling. [51:30], [52:06]

Topics Covered

  • Mastering Trade-offs: The CAP Theorem's Inflexible Rule
  • The Dramatic Difference of 'Five Nines' Availability
  • Why System Design is Harder to Refactor Than Code
  • Reverse Proxies: The Unsung Heroes of Modern Systems
  • Beyond Database Choice: Accelerating Data Access

Full Transcript

This complete system design tutorial covers scalability, reliability, data handling, and high-level architecture with clear explanations, real-world examples, and practical strategies. Hayk will teach you the core concepts you need to know for a system design interview. This is a complete crash course on the system design interview concepts you need to know to ace your job interview. The system design interview doesn't have much to do with coding, and people don't want to see you write actual code; they want to see how you glue an entire system together, and that is exactly what we're going to cover in this tutorial. We'll go through all of the concepts you need to know to ace your job interview.

Before designing large-scale distributed systems, it's important to understand the high-level architecture of an individual computer. Let's see how different parts of the computer work together to execute our code.

Computers function through a layered system, with each layer optimized for different tasks. At the core, computers understand only binary: zeros and ones, represented as bits. One bit is the smallest data unit in computing; it can be either zero or one. One byte consists of eight bits and is used to represent a single character like "a" or a number like 1. Expanding from here we have kilobytes, megabytes, gigabytes, and terabytes.

To store this data we have the computer's disk storage, which holds the primary data. It can be either an HDD or an SSD. Disk storage is non-volatile: it maintains data without power, meaning that if you turn off or restart the computer, the data will still be there. It contains the OS, applications, and all user files. In terms of size, disks typically range from hundreds of gigabytes to multiple terabytes. While SSDs are more expensive, they offer significantly faster data retrieval than HDDs; for instance, an SSD may have a read speed of 500 to 3,500 MB per second, while an HDD might offer 80 to 160 MB per second.

The next immediate access point after disk is RAM, or random access memory. RAM serves as the primary active data holder; it holds data structures, variables, and application data that are currently in use or being processed. When a program runs, its variables, intermediate computations, runtime stack, and more are stored in RAM, because it allows quick read and write access. RAM is volatile memory, which means it requires power to retain its contents; after you restart the computer, the data may not be persisted. In terms of size, RAM ranges from a few gigabytes in consumer devices to hundreds of gigabytes in high-end servers. Its read/write speed often surpasses 5,000 MB per second, which is faster than even the fastest SSDs.

But sometimes even this speed isn't enough, which brings us to the cache. The cache is smaller than RAM, typically measured in megabytes, but access times for cache memory are even faster than RAM: just a few nanoseconds for the L1 cache. The CPU first checks the L1 cache for the data; if it's not found, it checks the L2 and L3 caches, and then finally it checks the RAM. The purpose of a cache is to reduce the average time to access data, which is why we store frequently used data there to optimize CPU performance.

And what about the CPU? The CPU is the brain of the computer: it fetches, decodes, and executes instructions. When you run your code, it's the CPU that processes the operations defined in that program. But before it can run our code, which is written in high-level languages like Java, C++, Python, or others, the code first needs to be compiled into machine code. A compiler performs this translation, and once the code is compiled into machine code, the CPU can execute it. It can read from and write to our RAM, disk, and cache.

Finally, we have the motherboard, or main board, which is what you might think of as the component that connects everything. It provides the pathways that allow data to flow between these components.

Now let's have a look at the very high-level architecture of a production-ready app. Our first key area is the CI/CD pipeline: continuous integration and continuous deployment. This ensures that our code goes from the repository, through a series of tests and pipeline checks, and onto the production server without any manual intervention. It's configured with platforms like Jenkins or GitHub Actions for automating our deployment processes.

Once our app is in production, it has to handle lots of user requests. This is managed by our load balancers and reverse proxies, like NGINX. They ensure that user requests are evenly distributed across multiple servers, maintaining a smooth user experience even during traffic spikes. Our server is also going to need to store data, and for that we have an external storage server that is not running on the same production server; instead, it's connected over a network. Our servers might also be communicating with other servers, and we can have many such services, not just one.

To ensure everything runs smoothly, we have logging and monitoring systems keeping a keen eye on every micro-interaction, storing logs and analyzing data. It's standard practice to store logs on external services, often outside of our primary production server. For the back end, tools like PM2 can be used for logging and monitoring; on the front end, platforms like Sentry can be used to capture and report errors in real time.

And when things don't go as planned, meaning our logging systems detect failing requests or anomalies, they first inform our alerting service. After that, push notifications are sent to keep users informed, from the generic "something went wrong" to the specific "payment failed". A modern practice is to integrate these alerts directly into platforms we commonly use, like Slack. Imagine a dedicated Slack channel where alerts pop up the moment an issue arises. This allows developers to jump into action almost instantly, addressing the root cause before it escalates.

After that, developers have to debug the issue. First and foremost, the issue needs to be identified. Those logs we spoke about earlier are our first port of call: developers go through them searching for patterns or anomalies that could point to the source of the problem. After that, the issue needs to be replicated in a safe environment. The golden rule is to never debug directly in the production environment; instead, developers recreate the issue in a staging or test environment. This ensures users don't get affected by the debugging process. Then developers use tools to peer into the running application and start debugging. Once the bug is fixed, a hotfix is rolled out. This is a quick, temporary fix designed to get things running again, like a patch, before a more permanent solution can be implemented.

In this section, let's understand the pillars of system design and what it really takes to create a robust and resilient application. Before we jump into the technicalities, let's talk about what actually makes a good design. When we talk about good design in system architecture, we are really focusing on a few key principles: scalability, which is how our system grows with its user base; maintainability, which is ensuring future developers can understand and improve our system; and efficiency, which is making the best use of our resources. But good design also means planning for failure: building a system that not only performs well when everything is running smoothly, but also maintains its composure when things go wrong.

At the heart of system design are three key elements: moving data, storing data, and transforming data. Moving data is about ensuring that data can flow seamlessly from one part of our system to another, whether it's user requests reaching our servers or data transfers between databases, and we need to optimize for speed and security. Storing data isn't just about choosing between SQL or NoSQL databases; it's about understanding access patterns, indexing strategies, and backup solutions. We need to ensure that our data is not only stored securely but is also readily available when needed. And data transformation is about taking raw data and turning it into meaningful information, whether it's aggregating log files for analysis or converting user input into a different format.

Now let's take a moment to understand a crucial concept in system design: the CAP theorem, also known as Brewer's theorem, named after computer scientist Eric Brewer. This theorem is a set of principles that guide us in making informed trade-offs between three key properties of a distributed system: consistency, availability, and partition tolerance.

Consistency ensures that all nodes in the distributed system have the same data at the same time. If you make a change to one node, that change should also be reflected across all nodes. Think of it like updating a Google Doc: if one person makes an edit, everyone else sees that edit immediately. Availability means that the system is always operational and responsive to requests, regardless of what might be happening behind the scenes, like a reliable online store: no matter when you visit, it's always open and ready to take your order. And partition tolerance refers to the system's ability to continue functioning even when a network partition occurs, meaning that if there is a disruption in communication between nodes, the system still works. It's like having a group chat where, even if one person loses connection, the rest of the group can continue chatting.

According to the CAP theorem, a distributed system can only achieve two out of these three properties at the same time. If you prioritize consistency and partition tolerance, you might have to compromise on availability, and vice versa. For example, a banking system needs to be consistent and partition tolerant to ensure financial accuracy, even if it means some transactions take longer to process, temporarily compromising availability.

So every design decision comes with trade-offs. For example, a system optimized for read operations might perform poorly on write operations, or in order to gain performance we might have to accept a bit more complexity. It's not about finding the perfect solution; it's about finding the best solution for our specific use case, and that means making informed decisions about where we can afford to compromise.

One important measurement of a system is availability: the measure of the system's operational performance and reliability. When we talk about availability, we are essentially asking: is our system up and running when our users need it? This is often measured as a percentage, aiming for that golden "five nines" availability. Let's say we are running a critical service with 99.9% availability; that allows for around 8.76 hours of downtime per year. But if we add two more nines, we are talking about just around 5 minutes of downtime per year, and that's a massive difference, especially for services where every second counts.
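
To make those numbers concrete, here is a quick sketch of the arithmetic behind them; the availability figures are the ones mentioned above, and the rest is just unit conversion:

```python
# Downtime allowed per year for a given availability percentage.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours

def downtime_per_year(availability_pct: float) -> float:
    """Return the allowed downtime in hours per year."""
    return (1 - availability_pct / 100) * HOURS_PER_YEAR

print(downtime_per_year(99.9))          # ~8.76 hours  ("three nines")
print(downtime_per_year(99.999) * 60)   # ~5.26 minutes ("five nines")
```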

We often measure this in terms of uptime and downtime, and here is where service level objectives and service level agreements come into play. SLOs are like setting goals for our system's performance and availability; for example, we might set an SLO stating that our web service should respond to requests within 300 milliseconds 99.9% of the time. SLAs, on the other hand, are like formal contracts with our users or customers: they define the minimum level of service we are committing to provide. So if our SLA guarantees 99.99% availability and we drop below that, we might have to provide refunds or other compensation to our customers.

Building resilience into our system means expecting the unexpected. This could mean implementing redundant systems, ensuring there is always a backup ready to take over in case of failure, or it could mean designing our system to degrade gracefully, so that even if certain features are unavailable, the core functionality remains intact. To measure this aspect we use reliability, fault tolerance, and redundancy. Reliability means ensuring that our system works correctly and consistently. Fault tolerance is about preparing for when things go wrong: how does our system handle unexpected failures or attacks? And redundancy is about having backups, ensuring that if one part of our system fails, there is another ready to take its place.

We also need to measure the speed of our system, and for that we have throughput and latency.

Throughput measures how much data our system can handle over a certain period of time. We have server throughput, which is measured in requests per second (RPS). This metric provides an indication of how many client requests a server can handle in a given time frame; a higher RPS value typically indicates better performance and the ability to handle more concurrent users. We have database throughput, which is measured in queries per second (QPS). This quantifies the number of queries a database can process in a second; like server throughput, a higher QPS value usually signifies better performance. And we also have data throughput, which is measured in bytes per second. This reflects the amount of data transferred over a network or processed by a system in a given period of time.

On the other hand, latency measures how long it takes to handle a single request: the time it takes for a request to get a response. Optimizing for one can often lead to sacrifices in the other; for example, batching operations can increase throughput but might also increase latency.

Designing a system poorly can lead to a lot of issues down the line, from performance bottlenecks to security vulnerabilities. And unlike code, which can be refactored easily, redesigning a system can be a monumental task. That's why it's crucial to invest time and resources into getting the design right from the start, laying a solid foundation that can support the weight of future features and user growth.

Now let's talk about networking basics. When we talk about networking basics, we are essentially discussing how computers communicate with each other. At the heart of this communication is the IP address, a unique identifier for each device on a network. IPv4 addresses are 32-bit, which allows for approximately 4 billion unique addresses; however, with the increasing number of devices, we are moving to IPv6, which uses 128-bit addresses, significantly increasing the number of available unique addresses.

When two computers communicate over a network, they send and receive packets of data. Each packet contains an IP header, which carries essential information like the sender's and receiver's IP addresses, ensuring that the data reaches the correct destination. This process is governed by the Internet Protocol, which is a set of rules that defines how data is sent and received. Besides the IP layer, we also have the application layer, where data specific to the application protocol is stored. The data in these packets is formatted according to a specific application protocol, like HTTP for web browsing, so that it is interpreted correctly by the receiving device.

Once we understand the basics of IP addressing and data packets, we can dive into the transport layer, where TCP and UDP come into play. TCP operates at the transport layer and ensures reliable communication. It's like a delivery guy who makes sure that your package not only arrives but also checks that nothing is missing. Each data packet also includes a TCP header, which carries essential information like port numbers and control flags necessary for managing the connection and data flow. TCP is known for its reliability: it ensures the complete and correct delivery of data packets. It accomplishes this through features like sequence numbers, which keep track of the order of packets, and a process known as the three-way handshake, which establishes a stable connection between two devices. In contrast, UDP is faster but less reliable than TCP. It doesn't establish a connection before sending data and doesn't guarantee the delivery or order of the packets, but this makes UDP preferable for time-sensitive communications like video calls or live streaming, where speed is crucial and some data loss is acceptable.
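
As a rough sketch of that difference, here is what the two transport protocols look like from Python's socket API; the hosts and ports are placeholders, and error handling is omitted:

```python
import socket

# TCP: connection-oriented; the three-way handshake happens inside connect().
tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.connect(("example.com", 80))
tcp.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n")
print(tcp.recv(1024))        # delivery and ordering are guaranteed by TCP
tcp.close()

# UDP: connectionless; each datagram is sent without a handshake and
# may be lost or arrive out of order.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.sendto(b"ping", ("example.com", 9999))   # no delivery guarantee
udp.close()
```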

To tie all these concepts together, let's talk about DNS, the domain name system. DNS acts like the internet's phone book, translating human-friendly domain names into IP addresses. When you enter a URL in your browser, the browser sends a DNS query to find the corresponding IP address, allowing it to establish a connection to the server and retrieve the web page. The functioning of DNS is overseen by ICANN, which coordinates the global IP address space and the domain name system, and domain name registrars like Namecheap or GoDaddy are accredited by ICANN to sell domain names to the public. DNS uses different types of records, like A records, which map a domain to its corresponding IPv4 address, ensuring that your request reaches the correct server, or AAAA records, which map a domain name to an IPv6 address.
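
As a small illustration of a resolver at work, here is a sketch of looking up those records from Python; the domain is just an example:

```python
import socket

# A-record style lookup: domain name -> IPv4 address.
print(socket.gethostbyname("www.example.com"))

# getaddrinfo can also surface IPv6 (AAAA) results where available.
for family, *_, sockaddr in socket.getaddrinfo("www.example.com", 443):
    print(family.name, sockaddr[0])
```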

Finally, let's talk about the networking infrastructure that supports all this communication. Devices on a network have either public or private IP addresses: public IP addresses are unique across the internet, while private IP addresses are unique within a local network. An IP address can be static, permanently assigned to a device, or dynamic, changing over time. Dynamic IP addresses are commonly used for residential internet connections. Devices connected in a local area network can communicate with each other directly, and to protect these networks we use firewalls, which monitor and control incoming and outgoing network traffic. Within a device, specific processes or services are identified by ports, which, when combined with an IP address, create a unique identifier for a network service. Some ports are reserved for specific protocols, like 80 for HTTP or 22 for SSH.

Now let's cover the essential application layer protocols. The most common of these is HTTP, which stands for HyperText Transfer Protocol and is built on TCP/IP. It's a request-response protocol, but imagine it as a conversation with no memory: each interaction is separate, with no recollection of the past. This means the server doesn't have to store any context between requests; instead, each request contains all the necessary information. Notice how the headers include details like the URL and method, while the body carries the substance of the request or response.

Each response also includes a status code, which provides feedback about the result of a client's request on the server. For instance, the 200 series are success codes: these indicate that the request was successfully received and processed. The 300 series are redirection codes: these signify that further action needs to be taken by the user agent in order to fulfill the request. The 400 series are client error codes, used when the request contains bad syntax or cannot be fulfilled. And the 500 series are server error codes, which indicate that something went wrong on the server.

We also have a method on each request. The most common methods are GET, POST, PUT, PATCH, and DELETE. GET is used for fetching data, POST is usually for creating data on the server, PUT and PATCH are for updating a record, and DELETE is for removing a record from the database.
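
As a quick, hedged illustration of these methods in practice, here is a sketch using Python's requests library against a hypothetical /api/products endpoint; the URL and fields are made up for illustration:

```python
import requests

BASE = "https://example.com/api"

# GET: fetch data (should be idempotent and never mutate state).
resp = requests.get(f"{BASE}/products")
print(resp.status_code)   # e.g. 200 on success
print(resp.json())        # response body parsed from JSON

# POST: create a new record on the server.
requests.post(f"{BASE}/products", json={"name": "Mug", "price": 9.99})

# PUT/PATCH: update an existing record; DELETE: remove it.
requests.patch(f"{BASE}/products/42", json={"price": 7.99})
requests.delete(f"{BASE}/products/42")
```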

HTTP is a one-way request-response connection, but for real-time updates we use WebSockets, which provide a two-way communication channel over a single long-lived connection, allowing servers to push real-time updates to clients. This is very important for applications requiring constant data updates without the overhead of repeated HTTP request-response cycles. It is commonly used for chat applications, live sports updates, or stock market feeds, where the action never stops and neither does the conversation.
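
Here is a minimal sketch of that persistent, two-way channel using the third-party websockets library for Python (an assumption; depending on the library version the handler may also take a path argument):

```python
# Minimal WebSocket echo server: one long-lived connection per client,
# over which the server can push messages at any time.
import asyncio
import websockets

async def handler(websocket):
    async for message in websocket:          # messages arrive as clients send them
        await websocket.send(f"echo: {message}")

async def main():
    async with websockets.serve(handler, "localhost", 8765):
        await asyncio.Future()               # keep the server running forever

asyncio.run(main())
```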

Moving on to email-related protocols: SMTP is the standard for email transmission over the internet. It is the protocol for sending email messages between servers; most email clients use SMTP for sending emails, and either IMAP or POP3 for retrieving them. IMAP is used to retrieve emails from a server, allowing a client to access and manipulate messages, which is ideal for users who need to access their emails from multiple devices. POP3 is used for downloading emails from a server to a local client, typically when emails are managed from a single device.

Moving on to file transfer and management protocols: the traditional protocol for transferring files over the internet is FTP, which is often used in website maintenance and large data transfers. It is used for the transfer of files between a client and a server, useful for uploading files to a server or backing up files. We also have SSH, or Secure Shell, which is for operating network services securely over an unsecured network. It's commonly used for logging into a remote machine and executing commands or transferring files.

There are also real-time communication protocols, like WebRTC, which enables browser-to-browser applications for voice calling, video chat, and file sharing without internal or external plugins. This is essential for applications like video conferencing and live streaming. Another one is MQTT, a lightweight messaging protocol ideal for devices with limited processing power and for scenarios requiring low bandwidth, such as IoT devices. And AMQP is a protocol for message-oriented middleware, providing robustness and security for enterprise-level message communication; for example, it is used in tools like RabbitMQ.

Let's also talk about RPC, which is a protocol that allows a program on one computer to execute code on a server or another computer. It's a method used to invoke a function as if it were a local call, when in reality the function is executed on a remote machine. It abstracts the details of the network communication, allowing the developer to interact with remote functions seamlessly, as if they were local to the application. Many application layer protocols use RPC mechanisms to perform their operations; for example, in web services, HTTP requests can result in RPC calls being made on the back end to process data or perform actions on behalf of the client, or SMTP servers might use RPC calls internally to process email messages or interact with databases.

Of course, there are numerous other application layer protocols, but the ones covered here are among the most commonly used and essential for web development.

In this section, let's go through API design, starting from the basics and advancing toward the best practices that define exceptional APIs. Let's consider an API for an e-commerce platform like Shopify, which, if you're not familiar with it, is a well-known e-commerce platform that allows businesses to set up online stores. In API design, we are concerned with defining the inputs, like the product details for a new product provided by a seller, and the outputs, like the information returned when someone queries a product through the API. So the focus is mainly on defining how the CRUD operations are exposed to the user interface. CRUD stands for create, read, update, and delete, which are the basic operations of any data-driven application.

For example, to add a new product we send a POST request to /api/products, where the product details are sent in the request body. To retrieve products, we send a GET request to /api/products. For updating, we use PUT or PATCH requests to /api/products/:id, where :id is the ID of that product. Removing is similar to updating: it's a DELETE request to /api/products/:id for the product we need to remove. And similarly, we might also have another GET request to /api/products/:id, which fetches a single product.
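
Here is a minimal, hedged sketch of those CRUD endpoints using Flask (an assumption; any web framework works), with an in-memory dictionary standing in for a real database:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)
products = {}        # product_id -> product details (stand-in for a database)
next_id = 1

@app.route("/api/products", methods=["POST"])
def create_product():
    global next_id
    products[next_id] = request.get_json()   # product details from the request body
    next_id += 1
    return jsonify({"id": next_id - 1}), 201

@app.route("/api/products", methods=["GET"])
def list_products():
    return jsonify(products)

@app.route("/api/products/<int:pid>", methods=["GET"])
def get_product(pid):
    return jsonify(products.get(pid, {}))

@app.route("/api/products/<int:pid>", methods=["PUT", "PATCH"])
def update_product(pid):
    products[pid] = {**products.get(pid, {}), **request.get_json()}
    return jsonify(products[pid])

@app.route("/api/products/<int:pid>", methods=["DELETE"])
def delete_product(pid):
    products.pop(pid, None)
    return "", 204
```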

Another part is deciding on the communication protocol that will be used, like HTTP, WebSockets, or other protocols, and the data transport mechanism, which can be JSON, XML, or protocol buffers. This is usually the case for RESTful APIs, but we also have the GraphQL and gRPC paradigms. So APIs come in different paradigms, each with its own set of protocols and standards.

The most common one is REST, which stands for Representational State Transfer. It is stateless, which means that each request from a client to a server must contain all the information needed to understand and complete the request. It uses standard HTTP methods (GET, POST, PUT, and DELETE), and it's easily consumable by different clients, whether browsers or mobile apps. The downside of RESTful APIs is that they can lead to over-fetching or under-fetching of data, because more endpoints may be required to access specific data; usually RESTful APIs use JSON for data exchange.

On the other hand, GraphQL APIs allow clients to request exactly what they need, avoiding over-fetching and under-fetching of data. They have strongly typed queries, but complex queries can impact server performance. All requests are sent as POST requests, and a GraphQL API typically responds with an HTTP 200 status code even in case of errors, with error details in the response body.

gRPC stands for Google Remote Procedure Call. It is built on HTTP/2, which provides advanced features like multiplexing and server push. It uses protocol buffers, which are a way of serializing structured data, and because of that it's efficient in terms of bandwidth and resources, making it especially suitable for microservices. The downside is that it's less human-readable compared to JSON, and it requires HTTP/2 support to operate.

In an e-commerce setting, you might have relationships like user-to-orders or orders-to-products, and you need to design endpoints to reflect these relationships. For example, to fetch the orders for a specific user, you would query GET /users/:id/orders. Common query parameters also include limit and offset for pagination, or start and end dates for filtering products within a certain date range. This allows users, or the client, to retrieve specific sets of data without overwhelming the system.

A well-designed GET request should be idempotent, meaning calling it multiple times doesn't change the result, and it should always return the same result. GET requests should never mutate data; they are meant only for retrieval. If you need to update or create data, you need to use a PUT or POST request.

When modifying endpoints, it's important to maintain backward compatibility. This means we need to ensure that changes don't break existing clients. A common practice is to introduce new versions, like /v2/products, so that the version 1 API can still serve the old clients and the version 2 API serves the current clients. This is the case for RESTful APIs; in the case of GraphQL APIs, adding new fields (for example, v2 fields) without removing the old ones helps evolve the API without breaking existing clients.

Another best practice is to set rate limits. This can protect the API from denial-of-service attacks. Rate limiting is used to control the number of requests a user can make in a certain time frame, and it prevents a single user from sending too many requests to your API.
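
As a simple, hedged illustration, here is a minimal fixed-window rate limiter sketch; real APIs typically enforce this at a gateway or with a shared store like Redis, and the limits below are arbitrary:

```python
import time
from collections import defaultdict

WINDOW_SECONDS = 60
MAX_REQUESTS = 100          # allowed requests per user per window

windows = defaultdict(lambda: [0, 0.0])   # user_id -> [count, window_start]

def allow_request(user_id: str) -> bool:
    count, start = windows[user_id]
    now = time.time()
    if now - start >= WINDOW_SECONDS:      # new window: reset the counter
        windows[user_id] = [1, now]
        return True
    if count < MAX_REQUESTS:               # still within the limit
        windows[user_id][0] += 1
        return True
    return False                           # over the limit: reject (e.g. HTTP 429)
```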

A common practice is to also configure CORS settings, which stands for cross-origin resource sharing. With CORS settings you can control which domains can access your API, preventing unwanted cross-site interactions.
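
For example, here is a hedged Flask sketch of the response header behind CORS (a hypothetical setup; many apps use the flask-cors extension instead):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/products")
def list_products():
    resp = jsonify([])
    # Browsers will only let pages served from this origin read the response.
    resp.headers["Access-Control-Allow-Origin"] = "https://shop.example.com"
    return resp
```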

Now imagine a company is hosting a website on a server in a Google Cloud data center in Finland. It may take around 100 milliseconds to load for users in Europe, but 3 to 5 seconds to load for users in Mexico. Fortunately, there are strategies to minimize this request latency for users who are far away. These strategies are caching and content delivery networks, which are two important concepts in modern web development and system design.

Caching is a technique used to improve the performance and efficiency of a system. It involves storing a copy of certain data in temporary storage so that future requests for that data can be served faster. There are four common places where a cache can be stored.

The first one is browser caching, where we store website resources on a user's local computer, so that when a user revisits a site, the browser can load the site from the local cache rather than fetching everything from the server again. Users can disable caching by adjusting the browser settings, and in most browsers, developers can disable the cache from the developer tools; for instance, in Chrome there is a "Disable cache" option in the developer tools Network tab. The cache is stored in a directory on the client's hard drive, managed by the browser, and browser caches store HTML, CSS, and JS bundle files on the user's local machine, typically in a dedicated cache directory. We use the Cache-Control header to tell the browser how long this content should be cached; for example, here the Cache-Control is set to 7,200 seconds, which is equivalent to 2 hours.
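
In other words, the response carries a header along these lines, shown here as a hedged Flask sketch in which the route and content are placeholders:

```python
from flask import Flask, Response

app = Flask(__name__)

@app.route("/assets/app.css")
def stylesheet():
    resp = Response("body { margin: 0; }", mimetype="text/css")
    resp.headers["Cache-Control"] = "max-age=7200"   # 7,200 s = 2 hours
    return resp
```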

When the requested data is found in the cache, we call that a cache hit; on the other hand, we have a cache miss, which happens when the requested data is not in the cache, necessitating a fetch from the original source. The cache hit ratio is the percentage of requests that are served from the cache compared to all requests, and a higher ratio indicates a more effective cache. You can check whether the cache was hit or missed from the X-Cache header; for example, in this case it says MISS, so the cache was missed, and in case the cache is found, we would see HIT here.

We also have server caching, which involves storing frequently accessed data on the server side, reducing the need to perform expensive operations like database queries. Server-side caches are stored on a server or on a separate cache server, either in memory (like Redis) or on disk. Typically, the server checks the cache for the data before querying the database: if the data is in the cache, it is returned directly; otherwise, the server queries the database, retrieves the data, returns it to the user, and then stores it in the cache for future requests.
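
That read path is often called cache-aside; here is a minimal, hedged sketch of it with the redis-py client, where query_database is a hypothetical stand-in for the real database call:

```python
import json
import redis

cache = redis.Redis(host="localhost", port=6379)

def query_database(user_id: int) -> dict:
    return {"id": user_id, "name": "example"}     # placeholder for a real DB query

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:                        # cache hit: skip the database
        return json.loads(cached)
    user = query_database(user_id)                # cache miss: query the database
    cache.set(key, json.dumps(user), ex=7200)     # store for future requests (2 h TTL)
    return user
```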

This is the case of a write-around cache, where data is written directly to permanent storage, bypassing the cache; it is used when write performance is less critical. We also have a write-through cache, where data is simultaneously written to the cache and to permanent storage; it ensures data consistency but can be slower than write-around caching. And we also have a write-back cache, where data is first written to the cache and then to permanent storage at a later time; this improves write performance, but you risk losing that data if the server crashes.

But what happens if the cache is full and we need to free up some space to use our cache again? For that we have eviction policies, which are rules that determine which items to remove from the cache when it's full. Common policies are to remove the least recently used (LRU) items, first in first out (FIFO), where we remove the items that were added first, or to remove the least frequently used (LFU) items.

the least frequently used ones database

caching is another crucial aspect and it

refers to the practice of caching

database query results to improve the

performance of database driven

applications it is often done either

within the database system itself or via

an external caching layer like redies or

M cache when a query is made we first

check the cache to see if the result of

that query has been stored if it is we

return the cach state avoiding the need

to execute the query against the

database but if the data is not found in

the cache the query is executed against

the database and the result is stored in

the cache for future requests this is

beneficial for read heavy applications

where some queries are executed

frequently and we use the same eviction

policies as we have for server side

Another type of caching is a CDN, a network of servers distributed geographically. CDNs are generally used to serve static content such as JavaScript, HTML, CSS, image, and video files. They cache the content from the original server and deliver it to users from the nearest CDN server. When a user requests a file, like an image or a web page, the request is redirected to the nearest CDN server. If the CDN server has the cached content, it delivers it to the user; if not, it fetches the content from the origin server, caches it, and then forwards it to the user.

This is the pull-based type of CDN, where the CDN automatically pulls the content from the origin server when it's first requested by a user. It's ideal for websites with a lot of static content that is updated regularly, and it requires less active management because the CDN automatically keeps the content up to date. Another type is the push-based CDN, where you upload the content to the origin server and it then distributes those files to the CDN. This is useful when you have large files that are infrequently updated but need to be quickly distributed when they are; it requires more active management of what content is stored on the CDN.

We again use the Cache-Control header to tell the browser how long it should cache the content from the CDN. CDNs are usually used for delivering static assets like images, CSS files, JavaScript bundles, or video content, and they can be useful if you need to ensure high availability and performance for users. They can also reduce the load on the origin server. But there are some instances where we still need to hit our origin server, for example when serving dynamic content that changes frequently, when handling tasks that require real-time processing, or in cases where the application requires complex server-side logic that cannot be done in the CDN.

Some of the benefits that we get from a CDN are reduced latency: by serving content from locations closer to the user, CDNs significantly reduce latency. They also add high availability and scalability: CDNs can handle high traffic loads and are resilient against hardware failures. And they add improved security, because many CDNs offer security features like DDoS protection and traffic encryption. The benefits of caching are similar: reduced latency, because data is fetched from a nearby cache rather than a remote server; lower server load, by reducing the number of requests to the primary data source; and, overall, faster load times that lead to a better user experience.

Now let's talk about proxy servers, which act as an intermediary between a client requesting a resource and the server providing that resource. A proxy can serve various purposes, like caching resources for faster access, anonymizing requests, and load balancing among multiple servers. Essentially, it receives requests from clients, forwards them to the relevant servers, and then returns the server's response back to the client.

There are several types of proxy servers, each serving different purposes. Here are some of the main types. The first one is the forward proxy, which sits in front of clients and is used to send requests to other servers on the internet; it's often used within internal networks to control internet access. The next one is the reverse proxy, which sits in front of one or more web servers, intercepting requests from the internet; it is used for load balancing, web acceleration, and as a security layer. Another type is the open proxy, which allows any user to connect and use the proxy server; it is often used to anonymize web browsing and bypass content restrictions. We also have transparent proxies, which pass along requests and resources without modifying them but are visible to the client; they're often used for caching and content filtering. The next type is the anonymous proxy, which is identifiable as a proxy server but does not make the original IP address available; this type is used for anonymous browsing. We also have distorting proxies, which provide an incorrect original IP to the destination server; this is similar to an anonymous proxy but with purposeful IP misinformation. And the last popular type is the high-anonymity or elite proxy, which makes detecting the proxy use very difficult: these proxies do not send the X-Forwarded-For or other identifying headers, and they ensure maximum anonymity.

The most commonly used proxy servers are forward and reverse proxies. A forward proxy acts as a middle layer between the client and the server: it sits between the client, which can be a computer on an internal network, and the external servers, which can be websites on the internet. When the client makes a request, it is first sent to the forward proxy. The proxy then evaluates the request and decides, based on its configuration and rules, whether to allow the request, modify it, or block it. One of the primary functions of a forward proxy is to hide the client's IP address: when it forwards the request to the target server, it appears as if the request is coming from the proxy server itself.

Let's look at some example use cases of forward proxies. One popular example is Instagram proxies. These are a specific type of forward proxy used to manage multiple Instagram accounts without triggering bans or restrictions. Marketers and social media managers use Instagram proxies to appear as if they are located in different areas or are different users, which allows them to manage multiple accounts, automate tasks, or gather data without being flagged for suspicious activity. The next example is internet use control and monitoring proxies. Some organizations use forward proxies to monitor and control employee internet usage; they can block access to non-work-related sites, protect against web-based threats, and scan for viruses and malware in incoming content. The next common use case is caching frequently accessed content: forward proxies can cache popular websites or content, reducing bandwidth usage and speeding up access for users within the network. This is especially beneficial in networks where bandwidth is costly or limited. And they can also be used for anonymizing web access: people who are concerned about privacy can use forward proxies to hide their IP address and other identifying information from the websites they visit, making it difficult to track their web browsing activities.

On the other hand, a reverse proxy is a type of proxy server that sits in front of one or more web servers, intercepting requests from clients before they reach the servers. While a forward proxy hides the client's identity, a reverse proxy essentially hides the server's identity, or the existence of multiple servers behind it. The client interacts only with the reverse proxy and may not know about the servers behind it. It also distributes client requests across multiple servers, balancing the load and ensuring no single server becomes overwhelmed. A reverse proxy can also compress inbound and outbound data, cache files, and manage SSL encryption, thereby speeding up load times and reducing server load.

Some common use cases of reverse proxies are load balancers: these distribute incoming network traffic across multiple servers, ensuring no single server gets too much load; by distributing traffic, we prevent any single server from becoming a bottleneck, maintaining optimal service speed and reliability. CDNs are also a type of reverse proxy: they are a network of servers that deliver cached static content from websites to users based on the geographical location of the user, and they act as reverse proxies by retrieving content from the origin server and caching it so that it's closer to the user for faster delivery. Another example is web application firewalls, which are positioned in front of web applications; they inspect incoming traffic to block hacking attempts and filter out unwanted traffic, protecting the application from common web exploits. And another example is SSL offloading, or acceleration: some reverse proxies handle the encryption and decryption of SSL/TLS traffic, offloading that task from web servers to optimize their performance.

Load balancers are perhaps the most popular use case of proxy servers. They distribute incoming traffic across multiple servers to make sure that no server bears too much load; by spreading the requests effectively, they increase the capacity and reliability of applications. Here are some common strategies and algorithms used in load balancing.

The first one is round robin, which is the simplest form of load balancing: each server in the pool gets a request in sequential, rotating order, and when the last server is reached, it loops back to the first one. This type works well for servers with similar specifications and when the load is uniformly distributable.
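
Here is a minimal sketch of that rotation; this is an illustration only, since in practice a reverse proxy such as NGINX or HAProxy does this for you:

```python
# Round-robin server selection: each request goes to the next server in order.
from itertools import cycle

servers = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
rotation = cycle(servers)                 # wraps back to the first server

def pick_server() -> str:
    return next(rotation)

print([pick_server() for _ in range(6)])  # .1, .2, .3, .1, .2, .3
```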

The next one is the least-connections algorithm, which directs traffic to the server with the fewest active connections. It's ideal for longer tasks, or when the server load is not evenly distributed. Next, we have the least-response-time algorithm, which chooses the server with the lowest response time and fewest active connections; this is effective when the goal is to provide the fastest response to requests. The next algorithm is IP hashing, which determines which server receives the request based on a hash of the client's IP address. This ensures a client consistently connects to the same server, and it's useful for session persistence in applications where that consistency matters.

Variants of these methods can also be weighted, which brings us to the weighted algorithms. For example, in weighted round robin or weighted least connections, servers are assigned weights, typically based on their capacity or performance metrics, and the more capable servers handle more of the requests. This is effective when the servers in the pool have different capabilities, like different CPUs or different amounts of RAM. We also have geographical algorithms, which direct requests to the server geographically closest to the user, or based on specific regional requirements; this is useful for global services where latency reduction is a priority.

And the last common algorithm is consistent hashing, which uses a hash function to distribute data across various nodes. Imagine a hash space that forms a circle, where the end wraps around to the beginning, often referred to as a hash ring. Both the nodes and the data (keys or stored values) are hashed onto this ring. This makes sure that the client consistently connects to the same server every time.
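
Here is a minimal, hedged sketch of a hash ring; real implementations add virtual nodes per server for smoother balancing, and the hash function chosen here is arbitrary:

```python
# Nodes and keys are hashed onto a ring; a key is served by the first node
# clockwise from its position.
import bisect
import hashlib

def ring_hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes):
        self.ring = sorted((ring_hash(n), n) for n in nodes)

    def node_for(self, key: str) -> str:
        positions = [pos for pos, _ in self.ring]
        idx = bisect.bisect(positions, ring_hash(key)) % len(self.ring)  # wrap around
        return self.ring[idx][1]

ring = ConsistentHashRing(["server-a", "server-b", "server-c"])
print(ring.node_for("client-42"))   # the same client always maps to the same server
```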

An essential feature of load balancers is continuous health checking of servers, to ensure traffic is only directed to servers that are online and responsive. If a server fails, the load balancer will stop sending traffic to it until it is back online.

Load balancers come in different forms, including hardware appliances, software solutions, and cloud-based services. Some of the popular hardware load balancers are F5 BIG-IP, a widely used hardware load balancer known for its high performance and extensive feature set, offering local traffic management, global server load balancing, and application security; another example is Citrix ADC, formerly known as NetScaler, which provides load balancing, content switching, and application acceleration. Some popular software load balancers are HAProxy, a popular open-source software load balancer and proxy server for TCP- and HTTP-based applications, and of course NGINX, which is often used as a web server but also functions as a load balancer and reverse proxy for HTTP and other network protocols. Some popular cloud-based load balancers are AWS's Elastic Load Balancing, Microsoft's Azure Load Balancer, and Google Cloud's load balancer. There are even virtual load balancers, like VMware's Advanced Load Balancer, which offers a software-defined application delivery controller that can be deployed on premises or in the cloud.

Now let's see what happens when a load balancer goes down. When the load balancer goes down, it can impact the availability and performance of the whole application or the services it manages: it's basically a single point of failure, and if it goes down, all of the servers become unavailable to clients. To avoid or minimize the impact of a load balancer failure, there are several strategies we can employ. The first one is implementing redundant load balancing by using more than one load balancer, often in pairs, which is a common approach; if one of them fails, the other one takes over, a method known as failover. The next strategy is to continuously monitor and run health checks on the load balancer itself; this ensures that any issues are detected early and can be addressed before causing significant disruption. We can also implement auto-scaling and self-healing systems: some modern infrastructures are designed to automatically detect the failure of a load balancer and replace it with a new instance without manual intervention. And in some configurations, DNS failover can reroute traffic away from an IP address that is no longer accepting connections, like a failed load balancer, to a preconfigured standby IP, which is our new load balancer.

System design interviews are incomplete without a deep dive into databases. In the next few minutes, I'll take you through the database essentials you need to understand to ace that interview. We'll explore the role of databases in system design, sharding and replication techniques, and the key ACID properties. We'll also discuss different types of databases, vertical and horizontal scaling options, and database performance techniques.

We have different types of databases, each designed for specific tasks and challenges. Let's explore them. The first type is relational databases. Think of a relational database like a well-organized filing cabinet, where all the files are neatly sorted into different drawers and folders. Some popular examples of SQL databases are PostgreSQL, MySQL, and SQLite. All of the SQL databases use tables for data storage, and they use SQL as the query language. They are great for transactions, complex queries, and integrity. Relational databases are also ACID compliant, meaning they maintain the ACID properties: A stands for atomicity, which means that transactions are all or nothing; C stands for consistency, which means that after a transaction your database should be in a consistent state; I is isolation, which means that transactions should be independent; and D is for durability, which means that once a transaction is committed, the data is there to stay.

We also have NoSQL databases, which drop the consistency property from ACID. Imagine a NoSQL database as a brainstorming board with sticky notes: you can add or remove notes in any shape or form; it's flexible. Some popular examples are MongoDB, Cassandra, and Redis. There are different types of NoSQL databases, such as key-value stores like Redis, document-based databases like MongoDB, or graph-based databases like Neo4j. NoSQL databases are schemaless, meaning they don't have foreign keys between tables linking the data together. They are good for unstructured data, and ideal for scalability, quick iteration, and simple queries.

There are also in-memory databases. This is like having a whiteboard for quick calculations and temporary sketches: it's fast because everything is in memory. Some examples are Redis and Memcached. They have lightning-fast data retrieval and are used primarily for caching and session storage.

Now let's see how we can scale databases. The first option is vertical scaling, or scaling up. In vertical scaling, you improve the performance of your database by enhancing the capabilities of the individual server where the database is running. This could involve increasing CPU power, adding more RAM, adding faster or more disk storage, or upgrading the network. But there is a maximum limit to the resources you can add to a single machine, and because of that it's very limited.

The next option is horizontal scaling, or scaling out, which involves adding more machines to the existing pool of resources rather than upgrading a single unit. Databases that support horizontal scaling distribute data across a cluster of machines; this could involve database sharding or data replication. The first option is database sharding, which is distributing different portions (shards) of the data set across multiple servers. This means you split the data into smaller chunks and distribute it across multiple servers. Some of the sharding strategies include range-based sharding, where you distribute data based on the range of a given key; directory-based sharding, which uses a lookup service to direct traffic to the correct database; and geographical sharding, which is splitting databases based on geographical locations.
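
As a small, hedged sketch of the first strategy, here is range-based routing of user IDs to shards; the ranges and shard names are made up for illustration:

```python
SHARDS = [
    (0,         999_999,      "db-shard-1"),
    (1_000_000, 1_999_999,    "db-shard-2"),
    (2_000_000, float("inf"), "db-shard-3"),
]

def shard_for(user_id: int) -> str:
    """Return the shard responsible for the key range containing user_id."""
    for low, high, shard in SHARDS:
        if low <= user_id <= high:
            return shard
    raise ValueError("no shard configured for this ID")

print(shard_for(1_250_000))   # -> "db-shard-2"
```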

The next horizontal scaling option is data replication. This is keeping copies of data on multiple servers for high availability. We have master-slave replication, where you have one master database and several read-only slave databases, or you can have master-master replication, with multiple databases that can both read and write.

Scaling your database is one thing, but you also want to access it faster, so let's talk about different performance techniques that can help you access your data faster. The most obvious one is caching. Caching isn't just for web servers: database caching can be done through in-memory databases like Redis, and you can use it to cache frequent queries and boost your performance. The next technique is indexing. Indexes are another way to boost the performance of your database: creating an index on a frequently accessed column will significantly speed up retrieval times. And the next technique is query optimization. You can also consider optimizing queries for fast data access; this includes minimizing joins and using tools like a SQL query analyzer or EXPLAIN plans to understand your query's performance.
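
Here is a small, hedged illustration of both techniques with SQLite from Python; the table and column names are assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)")

# Create an index on a frequently queried column to speed up lookups.
conn.execute("CREATE INDEX idx_products_name ON products (name)")

# EXPLAIN QUERY PLAN shows whether the query uses the index or scans the table.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM products WHERE name = ?", ("Mug",)
).fetchall()
print(plan)   # e.g. "... USING INDEX idx_products_name (name=?)"
```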

In all cases, you should remember the CAP theorem, which states that you can only have two of these three: consistency, availability, and partition tolerance. When designing a system, you should prioritize two of these based on the requirements you are given in the interview.

If you enjoyed this crash course, then consider watching my other videos about system design concepts and interviews. See you next time.
