00:00:00
[Music]
00:00:08
all right
00:00:09
that brings us to what is apache
00:00:11
cassandra apache cassandra is a fast
00:00:13
distributed database
00:00:15
built for high availability and linear
00:00:16
scalability we want to be able to get
00:00:18
predictable performance
00:00:19
out of our database that means that we
00:00:21
can guarantee sla
00:00:23
a very very low latency we know that as
00:00:26
our cluster scales up
00:00:27
we're going to have the same performance
00:00:28
whether it's five nodes or ten nodes or
00:00:30
a thousand nodes
00:00:31
we know that we're not going to have any
00:00:34
single points of failure one of the
00:00:35
great things about cassandra
00:00:37
is that it's a peer-to-peer technology
00:00:39
so there's no master slave there's no
00:00:41
failover there's no leader election
00:00:42
there's none of this funny business that
00:00:44
we really have to worry about anymore
00:00:46
out of the box with open source we can
00:00:48
do multi-dc
00:00:49
when we're talking high availability we
00:00:51
want to be able to withstand failure of
00:00:52
an
00:00:53
entire data center cassandra can do that
00:00:55
you absolutely are not going to get that
00:00:57
out of a relational database we also
00:00:59
want to deploy everything commodity
00:01:00
hardware we talked about how expensive
00:01:02
it is to scale things up vertically
00:01:04
with cassandra you're going to put it on
00:01:05
cheap hardware and you're just going to
00:01:07
use a whole bunch of it
00:01:08
it's really really easy to manage and
00:01:10
speaking of management it's
00:01:12
extremely easy to manage operationally
00:01:14
you can take the same three-man team and
00:01:16
have them
00:01:17
manage a three-node cluster or 30-node
00:01:19
cluster
00:01:20
or a 100 node cluster i have put this
00:01:22
into production with that size
00:01:24
team and it absolutely works one thing
00:01:26
to keep in mind
00:01:27
it is not a drop-in replacement for a
00:01:29
relational database
00:01:30
so you're not going to take the same
00:01:31
data model and just throw it in
00:01:32
cassandra and hope it works
00:01:34
you are going to have to design your
00:01:35
application around cassandra's data
00:01:37
modeling rules but the net result is an
00:01:39
application that will never go down
00:01:41
if you want to think about cassandra
00:01:42
conceptually you can think about it as a
00:01:44
giant hash ring
00:01:45
where all nodes in the cluster are equal
00:01:47
so when i say nodes i basically mean
00:01:48
virtual machines or machines could
00:01:50
actually be physical computers
00:01:52
all participating in a cluster all equal
00:01:54
and each node owns a
00:01:56
range of hashes so like a bucket of
00:01:58
hashes and when you define a data model
00:02:00
in cassandra and
00:02:02
we won't be talking too much about cql
00:02:04
in this video but when you define a data
00:02:05
model and cassandra when you create a
00:02:07
table one of the things that you specify
00:02:09
is a primary key and part of that
00:02:11
primary key is something called the
00:02:12
partition key
00:02:13
and the partition key is what's actually
00:02:16
used when you insert data intake sander
00:02:18
the value of that
00:02:19
partition key is run through a
00:02:20
consistent hashing function
00:02:22
and depending on the output we can
00:02:23
figure out which bucket or which range
00:02:25
of hashes
00:02:26
that value fits into and thus which node
00:02:29
we need to go talk to
00:02:30
to actually distribute the data around
00:02:33
the cluster
00:02:34
the other cool thing about cassandra is
00:02:35
that data is replicated
00:02:37
to multiple servers and that all of
00:02:39
those servers all of them are equal so
00:02:41
like john mentioned
00:02:42
when we were just talking on the
00:02:43
previous slide there's none of this
00:02:45
master slave there's no zookeeper
00:02:46
there's no config servers
00:02:48
all nodes are equal any node in the
00:02:50
cluster can service any given
00:02:52
read or write request for the cluster
00:02:54
okay let's talk about trade-offs one of
00:02:56
the things that's really important to
00:02:57
understand when looking at a database
00:02:59
is how the cap theorem works so the cap
00:03:01
theorem says that during a network
00:03:02
partition which basically means when
00:03:04
computers can't talk to each other
00:03:06
either between data centers or on a
00:03:07
single network that you can either
00:03:09
choose consistency
00:03:10
or actually you can't get consistency or
00:03:12
you can get high availability
00:03:14
so really what happens here is if two
00:03:16
machines can't talk and you do a right
00:03:18
to them and you have to be completely
00:03:19
consistent
00:03:20
they can't talk to each other the system
00:03:22
is going to appear as if it's down
00:03:24
so if we give up consistency that means
00:03:26
we can be highly available
00:03:27
so that's what cassandra chooses it
00:03:29
chooses to be highly available
00:03:31
in a network partition as opposed to
00:03:33
being down
00:03:34
in a lot of applications this is
00:03:35
definitely way better than downtime
00:03:38
another thing that we need to know is
00:03:41
from data center to data center let's
00:03:42
say we were to take
00:03:43
three data centers around the world
00:03:44
maybe one in the u.s one in europe
00:03:47
one in asia it's completely impractical
00:03:49
to try and be consistent across data
00:03:51
centers
00:03:52
we want to asynchronously replicate our
00:03:54
data from one dc to another
00:03:56
because it just takes way too long for
00:03:58
data to travel from the u.s to asia
00:04:00
we're limited by the speed of light here
00:04:01
this is something that we are never
00:04:03
going to get around
00:04:04
it's just not going to happen that's why
00:04:05
we choose availability and that's how
00:04:08
consistency is affected now we want to
00:04:09
talk about some of the dials that
00:04:11
cassandra puts
00:04:12
into your hands as a developer to kind
00:04:14
of control this idea of
00:04:16
fault tolerance so john talked about the
00:04:18
cap theorem earlier this idea of
00:04:20
being maybe more consistent or being
00:04:22
more available it's kind of a sliding
00:04:23
scale and cassandra doesn't impose
00:04:25
one model on you you get a couple of
00:04:27
dials that you get to turn
00:04:29
to configure this like this idea of
00:04:30
being more consistent or more available
00:04:32
so first one we want to talk about is
00:04:34
replication so replication
00:04:36
usually this is called replication
00:04:37
factor it's abbreviated as rf you'll see
00:04:39
that a lot
00:04:40
when you're looking at cassandra
00:04:41
documentation and whatnot and a very
00:04:43
typical
00:04:44
replication factor for people running in
00:04:46
production is a replication factor of
00:04:47
three
00:04:48
and essentially all this means is how
00:04:50
many copies of each piece of data should
00:04:52
there be
00:04:53
in your cluster so when i do a write to
00:04:54
cassandra you can see in this example on
00:04:56
the slide here client's writing a
00:04:58
the a node gets a copy the b node gets a
00:05:00
copy and the c node there also gets a
00:05:01
copy so three copies total for an rf of
00:05:03
three
00:05:04
data is always replicated in cassandra
00:05:06
so we're going to talk on the next slide
00:05:08
about consistency level
00:05:09
data is always replicated in cassandra
00:05:11
you set this replication factor when you
00:05:13
configure a key space
00:05:14
which in cassandra is essentially a
00:05:16
collection of tables it's very similar
00:05:18
to
00:05:18
a schema in oracle or a database in
00:05:20
mysql or microsoft sql server this
00:05:23
replication
00:05:24
happens asynchronously and if a machine
00:05:26
is down while
00:05:27
while this replication is supposed to go
00:05:28
on whatever node you happen to be
00:05:30
talking to
00:05:31
is going to save what's called a hint
00:05:33
and cassandra uses something called
00:05:34
hinted handoffs
00:05:35
to be able to replay when that node
00:05:37
comes back up
00:05:38
and rejoins the cluster to be able to
00:05:40
replay all of the writes
00:05:42
that that node that was down missed the
00:05:44
other dial that cassandra gives you as a
00:05:46
developer is something called
00:05:47
consistency level
00:05:48
so you get to set this on any given read
00:05:50
or write request that you do
00:05:52
as a developer from your application
00:05:54
talking to cassandra
00:05:56
so i'm going to show you an example of
00:05:57
two of the most popular consistency
00:05:59
levels with the hope that that'll kind
00:06:00
of illustrate
00:06:01
exactly what consistency level means but
00:06:03
basically a consistency level means
00:06:05
how many replicas do i need to hear when
00:06:07
i do a read or write
00:06:09
before that read or that write is
00:06:10
considered successful so if i'm doing a
00:06:12
read
00:06:12
how many replicas do i need to hear from
00:06:14
before cassandra gives
00:06:15
the data back to the client or if i'm
00:06:18
doing a write
00:06:19
how many replicas need to say yep we got
00:06:21
your data we've written it to disk
00:06:23
before cassandra replies to the client
00:06:25
and says yep got your data
00:06:27
so the two most popular consistency
00:06:29
levels are consistency level of one
00:06:31
which like the name sort of implies just
00:06:33
means one replica so
00:06:35
you can see that first example there
00:06:36
client is writing a to the cluster
00:06:39
the a node gets its copy and since we're
00:06:41
writing at consistency level of one we
00:06:42
can acknowledge right back to the client
00:06:44
immediately
00:06:44
yep we got the data the dashed lines on
00:06:46
that diagram there are there
00:06:48
to indicate to you that just because you
00:06:49
write with a consistency level of one
00:06:51
doesn't mean that cassandra is not going
00:06:53
to honor your replication factor still
00:06:55
now the other most popular consistency
00:06:57
level that people use a lot
00:06:58
is quorum so quorum if you've never
00:07:00
heard the term before essentially
00:07:02
means a majority of replicas so in the
00:07:04
case of replication factor of three
00:07:06
uh this is two out of three uh for 51
00:07:09
or greater so in this example again we
00:07:12
got a client writing a
00:07:13
again and you can see the anode gets a
00:07:15
copy the b node gets a copy and responds
00:07:17
and at that point once we've got two out
00:07:19
of the three replicas that have
00:07:20
acknowledged we can acknowledge back to
00:07:22
the client
00:07:22
yes we've got your data now again dashed
00:07:25
line indicates
00:07:26
that uh you know we're still going to
00:07:27
honor your replication factor we're just
00:07:28
not waiting on that extra node to reply
00:07:31
before we acknowledge back to the client
00:07:33
you might say well why would i pick one
00:07:34
consistency level versus another you
00:07:36
know what kind of
00:07:37
impact is it going to have well one
00:07:38
pretty obvious one if you if you think
00:07:40
about it
00:07:41
is how fast you can read and write data
00:07:43
is definitely going to be impacted by
00:07:45
what consistency level you use so if i'm
00:07:47
using a lower consistency level
00:07:49
like say one if i'm only waiting on a
00:07:51
single server i'm going to be able to
00:07:52
read and write data really really
00:07:54
quickly whereas if i'm using a higher
00:07:55
consistency level
00:07:56
gonna be much slower to read and write
00:07:59
data
00:07:59
now the other thing that you kind of
00:08:01
have to keep in mind with consistency
00:08:02
level is this is also going to impact
00:08:04
your availability
00:08:05
so john talked about the cap theorem and
00:08:07
talked about c versus a
00:08:08
and how you know it's kind of a sliding
00:08:10
scale well if i choose a higher
00:08:12
consistency level
00:08:14
where i have to hear from more nodes
00:08:15
where more nodes have to be online to be
00:08:17
able to acknowledge reads and writes
00:08:19
then i'm going to be less available i'm
00:08:20
going to be less tolerant
00:08:22
to nodes going down whereas if i choose
00:08:24
a lower consistency level like
00:08:26
1 i'm going to be much more highly
00:08:28
available i'm going to be able to
00:08:29
withstand
00:08:30
in this example here with an rf3 i'm
00:08:32
going to be able to withstand two nodes
00:08:33
going down
00:08:34
and still be able to do reads and writes
00:08:36
in my cluster
00:08:37
and cassandra lets you pick this for
00:08:38
every query you do so it's not going to
00:08:40
impose one model
00:08:41
you as a developer get to choose which
00:08:42
consistency level is appropriate for
00:08:44
which parts of your application
00:08:45
one of the great things about cassandra
00:08:47
because we get asynchronous replication
00:08:49
is that it's really easy to do multiple
00:08:51
data centers so when we do a write to a
00:08:53
single data center we can specify our
00:08:55
consistency level
00:08:56
maybe we say one or quorum if we're
00:08:58
doing multiple data centers we're going
00:08:59
to say local one or local quorum
00:09:01
now when that happens you get your write
00:09:03
and it happens in your local data center
00:09:05
and then it returns to the client
00:09:07
and then let's say you had five data
00:09:08
centers the information that you wrote
00:09:10
to your first data center is going to be
00:09:12
asynchronously replicated
00:09:13
to the other data centers around the
00:09:15
world this is very very powerful
00:09:17
this is how you can get super super high
00:09:19
availability even when an entire data
00:09:21
center goes down
00:09:22
so you can specify the replication
00:09:24
factor per key space
00:09:26
so you can have one key space that has
00:09:28
five replicas one that has three one
00:09:30
that has one
00:09:31
it's completely up to you and completely
00:09:33
configurable
00:09:34
it's important to understand that a data
00:09:36
center can be logical or physical
00:09:37
one of the things that's great about
00:09:38
cassandra is that you can run it with a
00:09:40
tool like spark
00:09:41
and if you're doing that you may want to
00:09:43
have one data center which is your oltp
00:09:46
you're serving your fast reads to your
00:09:48
application and then one data center
00:09:50
virtually that's serving your olap
00:09:53
queries
00:09:53
and then doing that you can make sure
00:09:55
that your olap queries don't impact your
00:09:57
oltp stuff
00:10:04
[Music]
00:10:07
you