00:00:00
design a social media platform that can
00:00:02
handle millions of user requests where
00:00:04
do you even start from here today we'll
00:00:06
walk through core fundamentals that you
00:00:08
need to get started we can start with a
00:00:10
simple web server and one database to
00:00:12
store your user data however this will
00:00:14
not scale as your user base grows so
00:00:17
distributed systems are the go-to
00:00:19
solution these are networks of
00:00:20
independent computers working as one
00:00:22
coherent system when we talk about
00:00:24
distributed systems we need to understand
00:00:26
the key characteristics scalability this
00:00:29
is the system's ability to handle growing
00:00:31
demands there are two ways to scale
00:00:33
horizontal scaling by adding more
00:00:35
servers and vertical scaling by
00:00:37
upgrading existing hardware reliability
00:00:40
a reliable system continues to function
00:00:42
correctly even when components fail
00:00:45
availability is the percentage of time a
00:00:48
system remains operational this is often
00:00:50
expressed in nines for example 99.9%
00:00:53
availability means the system is down
00:00:55
for no more than 8.76 hours per year
00:00:58
while 99.99% would only be down for
00:01:00
52.6 minutes per year efficiency
00:01:03
is measured by two main factors latency
00:01:06
which is the delay in getting the first
00:01:07
response and throughput which is the
00:01:09
number of operations handled in a given
00:01:12
time these characteristics often involve
00:01:14
trade-offs your goal is to balance these
00:01:16
factors based on the given requirements
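To see where the downtime figures quoted above for 99.9% and 99.99% availability come from, here is a minimal sketch of the arithmetic in Python. It assumes a 365-day year, and the function name max_downtime_hours is just a label picked for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (an assumption)


def max_downtime_hours(availability_percent: float) -> float:
    """Return the yearly downtime budget, in hours, for an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return HOURS_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for target in (99.9, 99.99):
        hours = max_downtime_hours(target)
        print(f"{target}% -> {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

Each extra nine cuts the allowed downtime by a factor of ten, which is why going from three nines to four is a big operational step.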
00:01:18
although ideal distributed systems face
00:01:20
inherent limitations the CAP theorem
00:01:23
states that a distributed system can
00:01:24
only guarantee two out of the three
00:01:26
properties consistency all nodes see
00:01:29
identical data guaranteeing that reads
00:01:31
always reflect the most recent write
00:01:33
availability every request receives a
00:01:35
response without guaranteeing that the
00:01:38
data is the most recent partition
00:01:40
tolerance the system continues to
00:01:41
function despite network failures
00:01:43
between nodes this trade-off is crucial
00:01:45
in designing distributed systems
00:01:47
influencing how systems handle data
00:01:49
updates and respond to failures now our
00:01:52
architecture uses multiple web servers
00:01:54
which is amazing because we can handle
00:01:55
more load by adding more servers but
00:01:58
what happens if one server ends up
00:02:00
receiving more requests than others to
00:02:02
manage distributed system load
00:02:03
efficiently we need a load balancer
00:02:06
which distributes incoming requests
00:02:07
across multiple servers to ensure that
00:02:10
no single server becomes overwhelmed if
00:02:12
one server goes down the load balancer
00:02:14
will only redirect traffic to healthy
00:02:16
servers a load balancer can be placed at
00:02:18
various levels between the users and web
00:02:20
servers between web servers and
00:02:22
application servers and between the
00:02:24
application servers and databases there
00:02:26
are several algorithms load balancers
00:02:28
use to distribute traffic such as the least
00:02:30
connections method which sends requests to the
00:02:32
server with the fewest active
00:02:34
connections round robin which cycles through a
00:02:36
list of servers sequentially IP hash
00:02:39
uses the client's IP address to
00:02:40
determine which server receives the
00:02:42
request which one to use really depends
00:02:44
on the specific needs it's also worth
00:02:47
noting that the load balancer itself could
00:02:49
become a single point of failure to
00:02:51
prevent this we can add another load
00:02:53
balancer on standby if the primary one
00:02:55
fails the second one takes over
00:02:57
immediately things are going great so
00:03:00
far but we start to notice that these
00:03:01
servers often request the same data from
00:03:04
our database that's where caching comes
00:03:06
into play caching takes advantage of the
00:03:08
principle that recently requested data
00:03:10
is likely to be requested again
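One way to act on that principle is to keep only the most recently requested items in memory and fall back to the database on a miss. The sketch below is a minimal illustration, not a production cache: LRUCache, load_from_db and the capacity value are hypothetical names chosen for this example.

```python
from collections import OrderedDict
from typing import Callable


class LRUCache:
    """Small in-memory cache that evicts the least recently used entry."""

    def __init__(self, capacity: int, load_from_db: Callable[[str], str]):
        self.capacity = capacity
        self.load_from_db = load_from_db  # hypothetical slow lookup
        self.entries = OrderedDict()      # key -> value, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        self.misses += 1
        value = self.load_from_db(key)     # cache miss: go to the source
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the least recently used
        return value


if __name__ == "__main__":
    profiles = {"u1": "alice", "u2": "bob", "u3": "carol"}  # stand-in database
    cache = LRUCache(capacity=2, load_from_db=profiles.__getitem__)
    for user_id in ["u1", "u1", "u2", "u3", "u1"]:
        cache.get(user_id)
    print(f"hits={cache.hits} misses={cache.misses}")
```

The repeated request for the same key is served from memory, and once the cache is full the entry that has gone unused the longest is the one that gets dropped.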
00:03:12
retrieving data from cache is typically
00:03:14
way faster than from the original
00:03:16
database aside from application cache
00:03:18
there is also the content delivery network
00:03:20
or CDN which is ideal for serving static
00:03:23
media CDNs cache content closer to the
00:03:26
user to reduce latency however caching
00:03:28
does come with its own set of challenges
00:03:30
such as maintaining data consistency
00:03:32
and making sure that the data is in sync
00:03:34
with the source of truth we don't want
00:03:36
to serve the data from cache if it's not
00:03:38
up to date this leads us to cache
00:03:40
invalidation strategies write through data
00:03:42
is written to both cache and storage at
00:03:44
the same time ensuring consistency but
00:03:47
increasing write latency write around
00:03:50
data bypasses the cache and goes directly
00:03:52
to the storage preventing cache flooding
00:03:54
but potentially increasing read latency
00:03:56
for new data write back data is written
00:03:59
to cache first and later to storage
00:04:01
offering low latency but risking data
00:04:04
loss in case of system failures when a
00:04:06
cache reaches capacity we need an eviction
00:04:09
policy to make room for new data some
00:04:11
common ones are least recently used
00:04:13
removes the least recently accessed
00:04:16
data first in first out removes the
00:04:19
oldest item
00:04:21
first and least frequently used removes
00:04:23
the least often accessed items as our
00:04:26
platform grows we need to think about
00:04:28
storage strategy should we stick with a
00:04:30
traditional SQL database or go with
00:04:32
NoSQL SQL stores data in tables with
00:04:35
predefined schemas each row contains all
00:04:37
the information about a single record
00:04:40
if you want to add a new column the
00:04:41
changes would need to be applied to all
00:04:43
the records in the table popular SQL
00:04:45
databases include MySQL Oracle and
00:04:47
PostgreSQL NoSQL on the other hand covers
00:04:50
non-relational databases that have a
00:04:52
more flexible data structure they come
00:04:54
in four main types key value stores like
00:04:57
Redis document databases like MongoDB wide
00:05:00
column stores like Cassandra and graph databases
00:05:03
like Neo4j when comparing SQL versus
00:05:06
NoSQL we often look at the structure SQL
00:05:08
has a rigid schema while NoSQL has a
00:05:11
more flexible schema querying SQL
00:05:14
databases use standard structured query
00:05:17
language while NoSQL databases use
00:05:19
their own query APIs often over collections of
00:05:21
documents in terms of scalability SQL
00:05:24
typically scales vertically although it can
00:05:26
scale horizontally through sharding
00:05:28
while NoSQL scales horizontally
00:05:30
reliability SQL is ACID compliant while
00:05:33
NoSQL often sacrifices this for
00:05:35
performance and scalability ACID refers
00:05:38
to a set of principles where atomicity
00:05:40
ensures that a transaction is fully
00:05:42
completed or not at all consistency
00:05:44
guarantees that a transaction takes a
00:05:46
database from one valid state to another
00:05:49
enforcing all defined rules isolation
00:05:51
keeps transactions separate so their
00:05:53
operations don't interfere with each
00:05:55
other durability ensures that once a
00:05:57
transaction is committed it remains
00:05:59
permanent even in case of failure so
00:06:01
which one to use we want to use SQL when
00:06:04
we need ACID compliance such as in financial
00:06:06
applications and when our data structure
00:06:08
doesn't change often NoSQL would be a
00:06:10
good option if we're dealing with large
00:06:12
volumes of unstructured data or if we're
00:06:14
in need of rapid development that
00:06:16
requires a lot of flexibility after
00:06:18
choosing our database we notice that
00:06:20
queries are really slow and we need to
00:06:22
fix this ASAP we notice that data that
00:06:24
we're querying doesn't have an index so
00:06:26
we're constantly having to search
00:06:28
through the entire user table every
00:06:30
single time indexes work by creating a
00:06:32
separate data structure that points to
00:06:34
the location of the actual data speeding
00:06:36
up search operations the most common
00:06:38
types of indexes are primary key the
00:06:40
unique identifier for each record in the
00:06:43
table secondary index an additional
00:06:45
index on non-primary-key columns for
00:06:47
faster search queries such as searching
00:06:49
for a user's first name composite index
00:06:52
which is created on multiple columns useful
00:06:54
for queries involving those columns
00:06:56
together such as first name and last
00:06:58
name not an index but worth mentioning
00:07:01
foreign key which is a constraint that
00:07:03
enforces a relationship between columns
00:07:05
in different tables while indexes
00:07:07
dramatically improve read performance
00:07:09
they can also slow down write operations
00:07:11
this is because every time you insert
00:07:13
update or delete data the index must
00:07:16
also be updated that's why it's very
00:07:18
important that we are decisive and
00:07:20
intentional when creating indexes
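To make the effect of an index a bit more concrete, here is a small sketch using Python's built-in sqlite3 module. The users table, its columns and the index name idx_users_name are made up for illustration. The script asks SQLite for its query plan before and after creating a composite index on first and last name. The exact wording of the plan differs between SQLite versions, so no output is shown here.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple) -> str:
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


def build_demo() -> tuple:
    """Return the query plan before and after adding a composite index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO users (first_name, last_name) VALUES (?, ?)",
        [("Ada", "Lovelace"), ("Alan", "Turing"), ("Grace", "Hopper")],
    )
    sql = "SELECT id FROM users WHERE first_name = ? AND last_name = ?"
    params = ("Grace", "Hopper")

    before = query_plan(conn, sql, params)  # no index yet: full table scan
    conn.execute("CREATE INDEX idx_users_name ON users (first_name, last_name)")
    after = query_plan(conn, sql, params)   # planner can now use the index
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = build_demo()
    print("before:", before)
    print("after: ", after)
```

The first plan should describe a scan of the whole table, while the second should mention the new index. The cost is that every insert, update or delete on users now has to maintain that index as well.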
00:07:22
because we designed such an amazing
00:07:23
platform so many users decide to sign up
00:07:26
for this app that we're now facing
00:07:28
challenges with the sheer volume of data
00:07:30
our database is literally going to
00:07:32
explode so you try to beef up your
00:07:34
database by adding more hardware but the
00:07:36
growth continues and it's just not
00:07:38
enough when our database can no longer
00:07:40
scale vertically we can look into data
00:07:42
partitioning which is a technique for
00:07:44
breaking large databases into smaller
00:07:47
more manageable parts this improves
00:07:49
performance availability and load
00:07:50
balancing as your application scales
00:07:52
there are three main partitioning
00:07:54
methods horizontal partitioning which
00:07:56
divides rows of a table across multiple
00:07:59
databases vertical partitioning which
00:08:01
separates entire features or columns
00:08:03
into different databases and
00:08:05
directory based partitioning which uses a
00:08:07
lookup service to abstract the
00:08:09
partitioning scheme partitioning can be
00:08:11
done on various criteria key or hash
00:08:13
based partitioning applies a hash function to a key
00:08:16
attribute to determine which partition
00:08:18
the data belongs to a notable approach
00:08:20
here is consistent hashing which is a
00:08:23
technique that minimizes data
00:08:24
redistribution when scaling the number
00:08:26
of servers at a very high level it works
00:08:29
by distributing data across some number
00:08:31
of servers around a hash ring each data item
00:08:34
is hashed to determine which server it
00:08:36
belongs to each server is also only
00:08:39
responsible for a portion of the hash
00:08:41
range when adding or removing servers
00:08:43
only a small fraction of data needs to
00:08:45
be remapped this makes it very easy to
00:08:47
scale dynamically and reduces the impact
00:08:50
of server changes list partitioning
00:08:52
assigns each partition a list of values
00:08:54
storing each record based on which list
00:08:57
its key belongs to round robin
00:08:59
distributes data evenly across partitions
00:09:01
in a circular order composite
00:09:03
partitioning combines two or more
00:09:05
partitioning methods while partitioning
00:09:07
solves scaling issues it also introduces
00:09:10
its own challenges like difficulty in
00:09:12
joining data across multiple partitions
00:09:14
and potentially tricky data
00:09:17
rebalancing we've taken our social media
00:09:20
platform from a simple single server
00:09:22
setup to a robust scalable architecture
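Looking back at the partitioning discussion, the consistent hashing idea is easier to picture with code. This is a minimal sketch under a few assumptions that were not covered above: MD5 is used as the hash function, each server is placed on the ring at several virtual positions (the replicas parameter) to even out the load, and the names HashRing and server_for are hypothetical.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an assumption)."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing:
    def __init__(self, servers, replicas: int = 100):
        self.replicas = replicas  # virtual positions per server (assumption)
        self._points = []         # sorted positions on the ring
        self._owner = {}          # position -> server name
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        for i in range(self.replicas):
            point = ring_position(f"{server}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove(self, server: str) -> None:
        for i in range(self.replicas):
            point = ring_position(f"{server}#{i}")
            self._points.remove(point)
            del self._owner[point]

    def server_for(self, key: str) -> str:
        # Walk clockwise to the first server position at or after the key,
        # wrapping around to the start of the ring if needed.
        index = bisect.bisect(self._points, ring_position(key)) % len(self._points)
        return self._owner[self._points[index]]


if __name__ == "__main__":
    ring = HashRing(["db-1", "db-2", "db-3"])
    keys = [f"user:{n}" for n in range(1000)]
    before = {key: ring.server_for(key) for key in keys}
    ring.add("db-4")
    moved = sum(1 for key in keys if ring.server_for(key) != before[key])
    print(f"{moved} of {len(keys)} keys moved after adding a server")
```

With a plain hash-modulo scheme most keys would change servers when a fourth server is added. Here only the keys that now fall on the new server's positions are remapped.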
00:09:25
I couldn't cover everything in this
00:09:27
introductory overview to system design
00:09:29
and there's just so much more to cover
00:09:31
if you're interested let me know if you
00:09:32
want to see more of this but I hope that
00:09:34
you were able to learn something new
00:09:35
today as always thank you so much for
00:09:37
watching and see you in the next one