00:00:00
(gentle music jingle)
00:00:03
(audience applauding)
00:00:12
- Whoa, so many of you.
00:00:14
Good, okay, thank you for
that lovely introduction.
00:00:19
Right, so, what is generative
artificial intelligence?
00:00:24
So I'm gonna explain what
artificial intelligence is
00:00:27
and I want this to be a bit interactive
00:00:30
so there will be some
audience participation.
00:00:33
The people here who hold
this lecture said to me,
00:00:36
"Oh, you are very low-tech
for somebody working on AI."
00:00:40
I don't have any explosions
or any experiments,
00:00:42
so I'm afraid you'll have to participate,
00:00:45
I hope that's okay.
00:00:46
All right, so, what is generative
artificial intelligence?
00:00:50
So the term is made up of two things,
00:00:55
artificial intelligence and generative.
00:00:57
So artificial intelligence
is a fancy term for saying
00:01:02
we get a computer program to do the job
00:01:05
that a human would otherwise do.
00:01:07
And generative, this is the fun bit,
00:01:09
we are creating new content
00:01:12
that the computer has
not necessarily seen,
00:01:15
it has seen parts of it,
00:01:17
and it's able to synthesise
it and give us new things.
00:01:21
So what would this new content be?
00:01:23
It could be audio,
00:01:25
it could be computer code
00:01:27
so that it writes a program for us,
00:01:29
it could be a new image,
00:01:31
it could be a text,
00:01:32
like an email or an essay,
as you've heard, or video.
00:01:36
Now in this lecture
00:01:37
I'm mostly gonna be focusing on text
00:01:41
because I do natural language processing
00:01:42
and this is what I know about,
00:01:44
and we'll see how the technology works
00:01:48
and hopefully leaving the lecture
you'll know how it works.
00:01:53
There's a lot of myth around it,
00:01:57
but you'll see what it does
and it's just a tool, okay?
00:02:02
Right, so the outline of the talk,
00:02:03
there's three parts and
it's kind of boring.
00:02:05
This is Alice Morse Earle.
00:02:08
I do not expect that you know the lady.
00:02:11
She was an American writer
00:02:13
and she wrote about
memorabilia and customs,
00:02:18
but she's famous for her quotes.
00:02:21
So she's given us this
quote here that says,
00:02:23
"Yesterday's history,
tomorrow is a mystery,
00:02:25
today is a gift, and that's
why it's called the present."
00:02:28
It's a very optimistic quote.
00:02:29
And the lecture is basically
00:02:32
the past, the present,
and the future of AI.
00:02:37
Okay, so what I want to
say right at the front
00:02:41
is that generative AI
is not a new concept.
00:02:46
It's been around for a while.
00:02:49
So how many of you have
used or are familiar
00:02:54
with Google Translate?
00:02:56
Can I see a show of hands?
00:02:58
Right, who can tell me when
Google Translate launched
00:03:02
for the first time?
00:03:05
- 1995?
- Oh, that would've been good.
00:03:08
2006, so it's been around for 17 years
00:03:14
and we've all been using it.
00:03:16
And this is an example of generative AI.
00:03:18
Greek text comes in,
I'm Greek, so you know,
00:03:21
pay some juice to the... (laughs)
00:03:24
Right, so Greek text comes in,
00:03:27
English text comes out.
00:03:29
And Google Translate
has served us very well
00:03:31
for all these years
00:03:32
and nobody was making a fuss.
00:03:35
Another example is Siri on the phone.
00:03:40
Again, Siri launched in 2011,
00:03:46
12 years ago,
00:03:48
and it was a sensation back then.
00:03:51
It is another example of generative AI.
00:03:53
We can ask Siri to set
alarms and Siri talks back
00:03:58
and oh how great it is
00:04:00
and then you can ask about
your alarms and whatnot.
00:04:02
This is generative AI.
00:04:03
Again, it's not as
sophisticated as ChatGPT,
00:04:06
but it was there.
00:04:07
And I don't know, how many of you have an iPhone?
00:04:11
See, iPhones are quite
popular, I don't know why.
00:04:15
Okay, so, we are all familiar with that.
00:04:19
And of course later on there
was Amazon Alexa and so on.
00:04:23
Okay, again, generative
AI is not a new concept,
00:04:27
it is everywhere, it
is part of your phone.
00:04:31
The completion when
you're sending an email
00:04:34
or when you're sending a text.
00:04:36
The phone attempts to
complete your sentences,
00:04:40
attempts to think like you
and it saves you time, right?
00:04:44
Because some of the completions are there.
00:04:46
The same with Google,
00:04:47
when you're trying to
type it tries to guess
00:04:49
what your search term is.
00:04:51
This is an example of language modelling,
00:04:53
we'll hear a lot about language
modelling in this talk.
00:04:56
So basically we're making predictions
00:04:59
of what the continuations are going to be.
00:05:02
So what I'm telling you
00:05:04
is that generative AI is not that new.
00:05:07
So the question is, what
is the fuss, what happened?
00:05:12
So in 2023, OpenAI,
00:05:15
which is a company in California,
00:05:18
in fact, in San Francisco.
00:05:19
If you go to San Francisco,
00:05:20
you can even see the lights
of their building at night.
00:05:24
It announced GPT-4
00:05:27
and it claimed that it can
beat 90% of humans on the SAT.
00:05:33
For those of you who don't know,
00:05:34
SAT is a standardised test
00:05:37
that American school children have to take
00:05:40
to enter university,
00:05:41
it's an admissions test,
00:05:42
and it's multiple choice and
it's considered not so easy.
00:05:46
So GPT-4 can do it.
00:05:49
They also claimed that it
can get top marks in law,
00:05:53
medical exams and other exams,
00:05:55
they have a whole suite
of things that they claim,
00:05:59
well, not they claim, they
show that GPT-4 can do it.
00:06:03
Okay, aside from that, it can pass exams,
00:06:07
we can ask it to do other things.
00:06:09
So you can ask it to write text for you.
00:06:14
For example, you can have a prompt,
00:06:17
this little thing that you
see up there, it's a prompt.
00:06:20
It's what the human wants
the tool to do for them.
00:06:23
And a potential prompt could be,
00:06:25
"I'm writing an essay
00:06:27
about the use of mobile
phones during driving.
00:06:29
Can you gimme three arguments in favour?"
00:06:32
This is quite sophisticated.
00:06:34
If you asked me,
00:06:35
I'm not sure I can come
up with three arguments.
00:06:38
You can also do,
00:06:38
and these are real prompts
that the tool can actually do.
00:06:42
You tell ChatGPT or GPT in general,
00:06:45
"Act as a JavaScript developer.
00:06:47
Write a program that checks
the information on a form.
00:06:50
Name and email are required,
but address and age are not."
00:06:53
So I'm just writing this
00:06:55
and the tool will spit out a program.
00:06:58
And this is the best one.
00:07:00
"Create an About Me page for a website.
00:07:03
I like rock climbing, outdoor
sports, and I like to program.
00:07:07
I started my career as a quality
engineer in the industry,
00:07:10
blah, blah, blah."
00:07:11
So I give this version of
what I want the website to be
00:07:16
and it will create it for me.
00:07:19
So, you see, we've gone from
Google Translate and Siri
00:07:24
and the auto-completion
00:07:25
to something which is a
lot more sophisticated
00:07:27
and can do a lot more things.
00:07:31
Another fun fact.
00:07:33
So this is a graph that shows
00:07:36
the time it took for ChatGPT
00:07:40
to reach 100 million users
00:07:43
compared with other tools
00:07:45
that have been launched in the past.
00:07:47
And you see our beloved Google Translate,
00:07:50
it took 78 months
00:07:53
to reach 100 million users,
00:07:56
a long time.
00:07:58
TikTok took nine months and ChatGPT, two.
00:08:03
So within two months they
had 100 million users
00:08:08
and these users pay a little
bit to use the system,
00:08:13
so you can do the multiplication
00:08:15
and figure out how much money they make.
00:08:17
Okay, so this is the history part.
00:08:22
So how did we make ChatGPT?
00:08:28
What is the technology behind this?
00:08:30
The technology it turns
out is not extremely new
00:08:33
or extremely innovative
00:08:35
or extremely difficult to comprehend.
00:08:39
So we'll talk about that today now.
00:08:42
So we'll address three questions.
00:08:45
First of all, how did we get
from the single-purpose systems
00:08:48
like Google Translate to ChatGPT,
00:08:51
which is more sophisticated
and does a lot more things?
00:08:54
And in particular,
00:08:55
what is the core technology behind ChatGPT
00:08:58
and what are the risks, if there are any?
00:09:01
And finally, I will just show you
00:09:03
a little glimpse of the future
and what it's gonna look like
00:09:07
and whether we should be worried or not
00:09:09
and you know, I won't leave you hanging,
00:09:13
please don't worry, okay?
00:09:17
Right, so, all these GPT model variants,
00:09:22
and there is a cottage industry out there,
00:09:24
I'm just using GPT as an
example because the public knows it
00:09:29
and there have been a lot of, you know,
00:09:32
news articles about it,
00:09:33
but there's other models,
00:09:34
other variants of models
that we use in academia.
00:09:38
And they all work on the same principle,
00:09:40
and this principle is
called language modelling.
00:09:43
What does language modelling do?
00:09:45
It assumes we have a sequence of words.
00:09:49
The context so far.
00:09:51
And we saw this context in the completion,
00:09:53
and I have an example here.
00:09:55
Assuming my context is
the phrase "I want to,"
00:10:01
the language modelling tool
will predict what comes next.
00:10:05
So if I tell you "I want to,"
00:10:07
there are several predictions.
00:10:09
I want to shovel, I want to play,
00:10:11
I want to swim, I want to eat.
00:10:13
And depending on what we choose,
00:10:15
whether it's shovel or play or swim,
00:10:18
there is more continuations.
00:10:20
So for shovel, it will be snow,
00:10:24
for play, it can be tennis or video,
00:10:26
swim doesn't have a continuation,
00:10:29
and for eat, it will be lunch and fruit.
00:10:31
Now this is a toy example,
00:10:33
but imagine now that the
computer has seen a lot of text
00:10:37
and it knows what words
follow which other words.
00:10:43
We used to count these things.
00:10:46
So I would go, I would
download a lot of data
00:10:49
and I would count, "I want to shovel,"
00:10:52
how many times does it appear
00:10:53
and what are the continuations?
00:10:55
And we would have counts of these things.
00:10:57
And all of this has gone
out of the window right now
00:11:00
and we use neural networks that
don't exactly count things,
00:11:04
but predict, learn things
in a more sophisticated way,
00:11:09
and I'll show you in a
moment how it's done.
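(For readers who want to try the counting approach just described: a minimal sketch in Python. The five-sentence corpus and the three-word context are invented for illustration.)

```python
from collections import Counter, defaultdict

# Count which words follow the context "I want to" in a tiny invented corpus,
# then turn the counts into probabilities: the old count-based approach.
corpus = [
    "i want to play tennis",
    "i want to play video games",
    "i want to swim",
    "i want to eat fruit",
    "i want to eat lunch",
]

continuations = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(3, len(words)):
        context = " ".join(words[i - 3:i])  # the three preceding words
        continuations[context][words[i]] += 1

counts = continuations["i want to"]
total = sum(counts.values())
for word, count in counts.most_common():
    print(f"P({word} | 'i want to') = {count / total:.2f}")
# "play" and "eat" each appear twice, "swim" once, so they get more probability
```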
00:11:12
So ChatGPT and GPT variants
00:11:17
are based on this principle
00:11:19
of I have some context, I
will predict what comes next.
00:11:23
And that's the prompt,
00:11:25
the prompt that I gave
you, these things here,
00:11:28
these are prompts,
00:11:29
this is the context,
00:11:31
and then it needs to do the task.
00:11:33
What would come next?
00:11:35
In some cases it would
be the three arguments.
00:11:37
In the case of the web
developer, it would be a webpage.
00:11:42
Okay, the task of language
modelling is we have the context,
00:11:47
and I've changed the example now.
00:11:48
It says "The colour of the sky is."
00:11:51
And we have a neural language model,
00:11:54
this is just an algorithm,
00:11:57
that will predict what is
the most likely continuation,
00:12:03
and likelihood matters.
00:12:05
These are all predicated
on actually making guesses
00:12:09
about what's gonna come next.
00:12:11
And that's why sometimes they fail,
00:12:13
because they predict
the most likely answer
00:12:15
whereas you want a less likely one.
00:12:18
But this is how they're trained,
00:12:19
they're trained to come up
with what is most likely.
00:12:22
Okay, so we don't count these things,
00:12:25
we try to predict them
using this language model.
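(A minimal sketch of what "predict the most likely continuation" means. The scores are made up, standing in for the network's output for the context "The colour of the sky is.")

```python
import math

# The network scores every vocabulary word; softmax turns the scores into
# probabilities; we pick the likeliest word.
vocabulary = ["blue", "grey", "green", "banana"]
scores = [4.0, 2.5, 1.5, -3.0]  # hypothetical network outputs

exp_scores = [math.exp(s) for s in scores]
total = sum(exp_scores)
probabilities = [e / total for e in exp_scores]

for word, p in zip(vocabulary, probabilities):
    print(f"P({word}) = {p:.3f}")

best = vocabulary[probabilities.index(max(probabilities))]
print("Most likely continuation:", best)  # "blue": the most likely answer wins
```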
00:12:29
So how would you build
your own language model?
00:12:34
This is a recipe, this is
how everybody does this.
00:12:37
So, step one, we need a lot of data.
00:12:41
We need to collect a ginormous corpus.
00:12:45
So these are words.
00:12:47
And where will we find
such a ginormous corpus?
00:12:50
I mean, we go to the web, right?
00:12:52
And we download the whole of Wikipedia,
00:12:56
Stack Overflow pages,
00:12:58
Quora, social media, GitHub, Reddit,
00:13:01
whatever you can find out there.
00:13:03
I mean, work out the
permissions, it has to be legal.
00:13:06
You download all this corpus.
00:13:09
And then what do you do?
00:13:10
Then you have this language model.
00:13:11
I haven't told you what
exactly this language model is,
00:13:14
there is an example,
00:13:15
and I haven't told you
what the neural network
00:13:17
that does the prediction is,
00:13:18
but assuming you have it.
00:13:20
So you have this machinery
00:13:22
that will do the learning for you
00:13:24
and the task now is to
predict the next word,
00:13:28
but how do we do it?
00:13:30
And this is the genius part.
00:13:33
We have the sentences in the corpus.
00:13:36
We can remove parts of them
00:13:38
and we can have the language model
00:13:40
predict the parts we have removed.
00:13:43
This is dead cheap.
00:13:46
I just remove things,
00:13:47
I pretend they're not there,
00:13:49
and I get the language
model to predict them.
00:13:52
So I will randomly truncate,
00:13:55
truncate means remove,
00:13:56
the last part of the input sentence.
00:13:59
I will calculate with this neural network
00:14:01
the probability of the missing words.
00:14:04
If I get it right, I'm good.
00:14:05
If I'm not right,
00:14:06
I have to go back and
re-estimate some things
00:14:09
because obviously I made a mistake,
00:14:11
and I keep going.
00:14:12
I will adjust and feed back to the model
00:14:14
and then I will compare
what the model predicted
00:14:16
to the ground truth
00:14:17
because I've removed the
words in the first place
00:14:19
so I actually know what the real truth is.
00:14:22
And we keep going
00:14:24
for some months or maybe years.
00:14:28
No, months, let's say.
00:14:30
So it will take some
time to do this process
00:14:32
because as you can appreciate
00:14:33
I have a very large corpus
and I have many sentences
00:14:36
and I have to do the prediction
00:14:38
and then go back and correct
my mistake and so on.
00:14:42
But in the end,
00:14:43
the thing will converge
and I will get my answer.
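(A heavily simplified sketch of the recipe she describes: hide the next word, predict it, compare with the ground truth, adjust the weights, repeat. The two-sentence corpus and the tiny model are stand-ins, not GPT.)

```python
import torch
import torch.nn as nn

# Toy self-supervised training loop: truncate, predict, compare, adjust.
corpus = ["the colour of the sky is blue", "the colour of grass is green"]
vocab = sorted({w for s in corpus for w in s.split()})
stoi = {w: i for i, w in enumerate(vocab)}

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, context_ids):
        # average the context embeddings, then score every vocabulary word
        return self.out(self.embed(context_ids).mean(dim=0))

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(50):  # real systems train for months, not 50 steps
    for sentence in corpus:
        ids = [stoi[w] for w in sentence.split()]
        for t in range(1, len(ids)):
            context = torch.tensor(ids[:t])   # the words we keep
            target = torch.tensor([ids[t]])   # the word we "removed"
            logits = model(context).unsqueeze(0)
            loss = loss_fn(logits, target)    # compare with the ground truth
            optimizer.zero_grad()
            loss.backward()                   # feed the mistake back
            optimizer.step()                  # adjust the weights
```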
00:14:46
So the tool in the middle that I've shown,
00:14:50
this tool here, this language model,
00:14:54
a very simple language
model looks a bit like this.
00:14:58
And maybe the audience has seen these,
00:15:01
this is a very naive graph,
00:15:04
but it helps to illustrate
the point of what it does.
00:15:07
So this neural network language
model will have some input
00:15:12
which is these nodes in
the, as we look at it,
00:15:16
well, my right and your right, okay.
00:15:18
So the nodes here on
the right are the input
00:15:23
and the nodes at the
very left are the output.
00:15:27
So we will present this neural
network with five inputs,
00:15:34
the five circles,
00:15:36
and we have three outputs,
00:15:38
the three circles.
00:15:39
And there is stuff in the middle
00:15:41
that I didn't say anything about.
00:15:43
These are layers.
00:15:45
These are more nodes
00:15:47
that are supposed to be
abstractions of my input.
00:15:51
So they generalise.
00:15:52
The idea is if I put more
layers on top of layers,
00:15:57
the middle layers will
generalise the input
00:16:00
and will be able to see
patterns that are not obvious.
00:16:04
So you have these nodes
00:16:05
and the input to the nodes
are not exactly words,
00:16:08
they're vectors, so series of numbers,
00:16:11
but forget that for now.
00:16:13
So we have some input, we have
some layers in the middle,
00:16:16
we have some output.
00:16:17
And this now has these
connections, these edges,
00:16:20
which are the weights,
00:16:22
this is what the network will learn.
00:16:25
And these weights are basically numbers,
00:16:27
and here it's all fully connected,
00:16:30
so I have very many connections.
00:16:32
Why am I going through this process
00:16:35
of actually telling you all of that?
00:16:37
You will see in a minute.
00:16:38
So you can work out
00:16:42
how big or how small
this neural network is
00:16:46
depending on the numbers
of connections it has.
00:16:51
So for this toy neural
network we have here,
00:16:54
I have worked out the number of weights,
00:16:58
we also call them parameters,
00:17:01
that this neural network has
00:17:02
and that the model needs to learn.
00:17:05
So the parameters are the
number of input units,
00:17:09
in this case it's 5,
00:17:12
times the units in the next layer, 8.
00:17:16
Plus 8, this plus 8 is a bias,
00:17:19
it's a cheating thing that
these neural networks have.
00:17:23
Again, you need to learn it
00:17:25
and it sort of corrects a
little bit the neural network
00:17:28
if it's off.
00:17:29
It's actually genius.
00:17:30
If the prediction is not right,
00:17:32
it tries to correct it a little bit.
00:17:33
So for the purposes of this talk,
00:17:35
I'm not going to go into the details,
00:17:38
all I want you to see
00:17:39
is that there is a way of
working out the parameters,
00:17:41
which is basically the
number of input units
00:17:45
times the units my input is going to,
00:17:49
and for this fully connected network,
00:17:51
if we add up everything,
00:17:53
we come up with 99
trainable parameters, 99.
00:17:58
This is a small network
for all purposes, right?
00:18:02
But I want you to remember this,
00:18:03
this small network is 99 parameters.
00:18:05
When you hear that a network
has a billion parameters,
00:18:10
I want you to imagine how
big this will be, okay?
00:18:14
So 99 only for this toy neural network.
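(The arithmetic, written out. The 5 inputs and 3 outputs are on the slide; the hidden-layer sizes of 8 and 4 are an assumption, chosen because they reproduce the stated total of 99.)

```python
# Count trainable parameters in a fully connected network:
# one weight per connection, plus one bias per unit in each layer.
layer_sizes = [5, 8, 4, 3]  # hidden sizes 8 and 4 are assumed, not from the slide

total = 0
for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
    weights = n_in * n_out  # e.g. 5 input units times 8 units in the next layer
    biases = n_out          # the "plus 8"-style bias terms
    total += weights + biases
    print(f"{n_in} -> {n_out}: {weights} weights + {biases} biases")

print("Trainable parameters:", total)  # 48 + 36 + 15 = 99
```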
00:18:17
And this is how we judge
how big the model is,
00:18:21
how long it took and how much it cost,
00:18:24
it's the number of parameters.
00:18:27
In reality, though,
00:18:29
no one is using this network.
00:18:31
Maybe in my class,
00:18:33
if I have a first year undergraduate class
00:18:36
and I introduce neural networks,
00:18:37
I will use this as an example.
00:18:39
In reality, what people
use is these monsters
00:18:42
that are made of blocks,
00:18:47
and by blocks I mean they're
made of other neural networks.
00:18:52
So I don't know how many people
have heard of transformers.
00:18:57
I hope no one.
00:18:57
Oh wow, okay.
00:18:59
So transformers are these neural networks
00:19:03
that we use to build ChatGPT.
00:19:06
And in fact GPT stands for
00:19:09
generative pre-trained transformer.
00:19:12
So transformer is even in the title.
00:19:15
So this is a sketch of a transformer.
00:19:19
So you have your input
00:19:21
and the input is not words, like I said,
00:19:24
here it says embeddings,
00:19:25
embeddings is another word for vectors.
00:19:28
And then you will have this,
00:19:32
a bigger version of this network,
00:19:34
multiplied into these blocks.
00:19:38
And each block is this complicated system
00:19:42
that has some neural networks inside it.
00:19:46
We're not gonna go into
the detail, I don't want,
00:19:48
please don't go,
00:19:50
all I'm trying,
(audience laughs)
00:19:51
all I'm trying to say is that, you know,
00:19:55
we have these blocks stacked
on top of each other,
00:20:00
the transformer has eight of those,
00:20:02
which are mini neural networks,
00:20:04
and this task remains the same.
00:20:06
That's what I want you
to take out of this.
00:20:08
Input goes in, the context
"the chicken walked,"
00:20:12
we're doing some processing,
00:20:13
and our task is to
predict the continuation,
00:20:17
which is "across the road."
00:20:18
And this EOS means end of sentence
00:20:21
because we need to tell the neural network
00:20:23
that our sentence finished.
00:20:24
I mean they're kind of dumb, right?
00:20:26
We need to tell them everything.
00:20:27
When I hear like AI will take
over the world, I go like,
00:20:30
Really? We have to actually spell it out.
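(If you want to see the context-in, continuation-out task yourself, a small public transformer will do it. A sketch using the Hugging Face transformers library and GPT-2; the model choice is illustrative, not the lecture's.)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a small pre-trained transformer and ask it to continue a context.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The chicken walked", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)
print(tokenizer.decode(outputs[0]))  # greedy decoding: the most likely continuation
```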
00:20:33
Okay, so, this is the transformer,
00:20:37
the king of architectures,
00:20:38
the transformer came in 2017.
00:20:42
Nobody's working on new
architectures right now.
00:20:45
It is a bit sad, like
everybody's using these things.
00:20:48
There used to be like some
pluralism but now no,
00:20:50
everybody's using transformers,
we've decided they're great.
00:20:54
Okay, so, what we're gonna do with this,
00:20:58
and this is kind of important
and the amazing thing,
00:21:01
is we're gonna do
self-supervised learning.
00:21:03
And this is what I said,
00:21:04
we have the sentence,
we truncate, we predict,
00:21:08
and we keep going till we
learn these probabilities.
00:21:12
Okay? You're with me so far?
00:21:15
Good, okay, so,
00:21:18
once we have our transformer
00:21:21
and we've given it all this
data that there is in the world,
00:21:26
then we have a pre-trained model.
00:21:28
That's why GPT is called
00:21:30
the generative pre-trained transformer.
00:21:32
This is a baseline model that we have
00:21:35
and has seen a lot of
things about the world
00:21:39
in the form of text.
00:21:40
And then what we normally do,
00:21:42
we have this general purpose model
00:21:44
and we need to specialise it somehow
00:21:46
for a specific task.
00:21:48
And this is what is called fine-tuning.
00:21:50
So that means that the
network has some weights
00:21:54
and we have to specialise the network.
00:21:57
We'll initialise the weights
00:21:59
with what we know from the pre-training,
00:22:01
and then on the specific
task we will learn
00:22:03
a new set of weights.
00:22:05
So for example, if I have medical data,
00:22:09
I will take my pre-trained model,
00:22:11
I will specialise it to this medical data,
00:22:14
and then I can do something
that is specific for this task,
00:22:18
which is, for example, write
a diagnosis from a report.
00:22:22
Okay, so this notion of
fine-tuning is very important
00:22:27
because it allows us to do
special-purpose applications
00:22:31
for these generic pre-trained models.
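(A minimal fine-tuning sketch: start from pre-trained weights, then keep training on task-specific examples. The model choice and the two medical snippets are invented placeholders, not the pipeline of any real system.)

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Initialise from a pre-trained model, then specialise it on labelled task data.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2  # e.g. diagnosis present / absent
)

texts = ["patient reports chest pain", "routine check-up, no complaints"]
labels = torch.tensor([1, 0])  # invented labels for illustration
batch = tokenizer(texts, padding=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
for step in range(3):  # a real run loops over a large labelled dataset
    outputs = model(**batch, labels=labels)
    outputs.loss.backward()  # nudge the pre-trained weights towards our task
    optimizer.step()
    optimizer.zero_grad()
```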
00:22:35
Now, people think that
GPT and all of these things
00:22:37
are general purpose,
00:22:39
but they are fine-tuned
to be general purpose
00:22:42
and we'll see how.
00:22:45
Okay, so, here's the question now.
00:22:49
We have this basic technology
to do this pre-training
00:22:52
and I told you how to do it,
if you download all of the web.
00:22:56
How good can a language
model become, right?
00:22:59
How does it become great?
00:23:01
Because when GPT came
out, as GPT-1 and GPT-2,
00:23:06
they were not amazing.
00:23:09
So the bigger, the better.
00:23:13
Size is all that matters, I'm afraid.
00:23:15
This is very bad because
we used to, you know,
00:23:18
people didn't believe in scale
00:23:19
and now we see that
scale is very important.
00:23:22
So, since 2018,
00:23:25
we've witnessed an
absolutely extreme increase
00:23:32
in model sizes.
00:23:34
And I have some graphs to show this.
00:23:36
Okay, I hope people at the
back can see this graph.
00:23:39
Yeah, you should be all right.
00:23:40
So this graph shows
00:23:45
the number of parameters.
00:23:47
Remember, the toy neural network had 99.
00:23:50
The number of parameters
that these models have.
00:23:54
And we start with a normal amount.
00:23:57
Well, normal for GPT-1.
00:23:58
And we go up to GPT-4,
00:24:01
which has one trillion parameters.
00:24:07
Huge, one trillion.
00:24:10
This is a very, very, very big model.
00:24:12
And you can see here the
ant brain and the rat brain
00:24:16
and we go up to the human brain.
00:24:19
The human brain has,
00:24:23
not a trillion,
00:24:24
100 trillion parameters.
00:24:27
So we are a bit off,
00:24:30
we're not at the human brain level yet
00:24:32
and maybe we'll never get there
00:24:34
and we can't compare
GPT to the human brain
00:24:37
but I'm just giving you an
idea of how big this model is.
00:24:42
Now what about the words it's seen?
00:24:46
So this graph shows us the number of words
00:24:48
processed by these language
models during their training
00:24:52
and you will see that
there has been an increase,
00:24:55
but the increase has not been
as big as the parameters.
00:25:00
So the community started focusing
00:25:04
on the parameter size of these models,
00:25:06
whereas in fact we now know
00:25:08
that it needs to see
a lot of text as well.
00:25:11
So GPT-4 has seen approximately,
00:25:16
I don't know, a few billion words.
00:25:19
All the human written text
is I think 100 billion,
00:25:24
so it's sort of approaching this.
00:25:28
You can also see what a human
reads in their lifetime,
00:25:32
it's a lot less.
00:25:34
Even if they read, you know,
00:25:35
because people nowadays, you know,
00:25:37
they read but they don't read fiction,
00:25:39
they read their phones, anyway.
00:25:41
You see the English Wikipedia,
00:25:42
so we are approaching the level of
00:25:46
the text that is out
there that we can get.
00:25:49
And in fact, one may
say, well, GPT is great,
00:25:52
you can actually use it
to generate more text
00:25:54
and then use this text
that GPT has generated
00:25:56
and then retrain the model.
00:25:58
But we know this text is not exactly right
00:26:00
and in fact there are diminishing returns,
00:26:03
so we're gonna plateau at some point.
00:26:06
Okay, how much does it cost?
00:26:10
Now, okay, so GPT-4 cost
00:26:16
$100 million, okay?
00:26:21
So when should they start doing it again?
00:26:25
So obviously this is not
a process you have to do
00:26:28
over and over again.
00:26:29
You have to think very carefully
00:26:31
because if you make a mistake
you've lost like $50 million.
00:26:38
You can't start again so you
have to be very sophisticated
00:26:41
as to how you engineer the training
00:26:43
because a mistake costs money.
00:26:47
And of course not everybody can do this,
00:26:48
not everybody has $100 million.
00:26:51
They can do it because they
have Microsoft backing them,
00:26:54
not everybody, okay.
00:26:58
Now this is a video that is
supposed to play and illustrate,
00:27:01
let's see if it will work,
00:27:03
the effects of scaling, okay.
00:27:06
So I will play it one more time.
00:27:09
So these are tasks that you can do
00:27:12
and it's the number of tasks
00:27:15
against the number of parameters.
00:27:18
So we start with 8 billion parameters
00:27:20
and we can do a few tasks.
00:27:23
And then the tasks
increase, so summarization,
00:27:27
question answering, translation.
00:27:30
And once we move to
540 billion parameters,
00:27:35
we have more tasks.
00:27:36
We start with very simple ones,
00:27:39
like code completion.
00:27:42
And then we can do reading comprehension
00:27:45
and language understanding
and translation.
00:27:47
So you get the picture,
the tree flourishes.
00:27:51
So this is what people
discovered with scaling.
00:27:54
If you scale the language
model, you can do more tasks.
00:27:58
Okay, so now.
00:28:04
Maybe we are done.
00:28:07
But what people discovered
is if you actually take GPT
00:28:12
and you put it out there,
00:28:14
it actually doesn't behave
like people want it to behave
00:28:18
because this is a language
model trained to predict
00:28:21
and complete sentences
00:28:22
and humans want to use
GPT for other things
00:28:26
because they have their own tasks
00:28:29
that the developers hadn't thought of.
00:28:31
So then the notion of
fine-tuning comes in,
00:28:35
it never left us.
00:28:37
So now what we're gonna do
00:28:39
is we're gonna collect
a lot of instructions.
00:28:42
So instructions are examples
00:28:44
of what people want
ChatGPT to do for them,
00:28:47
such as answer the following question,
00:28:50
or answer the question step by step.
00:28:54
And so we're gonna give these
demonstrations to the model,
00:28:58
and in fact, almost
2,000 such examples,
00:29:03
and we're gonna fine-tune.
00:29:05
So we're gonna tell this language model,
00:29:07
look, these are the
tasks that people want,
00:29:09
try to learn them.
00:29:12
And then an interesting thing happens,
00:29:14
is that we can actually then generalise
00:29:17
to unseen tasks, unseen instructions,
00:29:20
because you and I may have
different usage purposes
00:29:23
for these language models.
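(What such instruction demonstrations roughly look like. These two records are invented, loosely in the style of public instruction-tuning datasets; the exact format OpenAI used is not public.)

```python
# Instruction fine-tuning data: a prompt a user might write, paired with
# the response we want the model to learn. Training is still next-word
# prediction, just on prompt -> response pairs like these.
demonstrations = [
    {
        "instruction": "Answer the following question step by step.",
        "input": "What causes the seasons to change?",
        "output": "Step 1: The Earth's axis is tilted by about 23.5 degrees...",
    },
    {
        "instruction": "Act as a JavaScript developer.",
        "input": "Write a program that checks the information on a form.",
        "output": "function validateForm(form) { /* ... */ }",
    },
]

for d in demonstrations:
    prompt = d["instruction"] + "\n" + d["input"]
    print(prompt, "->", d["output"])
```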
00:29:27
Okay, but here's the problem.
00:29:33
We have an alignment problem
00:29:34
and this is actually very important
00:29:36
and something that will not
leave us for the future.
00:29:42
And the question is,
00:29:43
how do we create an agent
00:29:45
that behaves in accordance
with what a human wants?
00:29:49
And I know there's many
words and questions here.
00:29:53
But the real question is,
00:29:54
if we have AI systems with skills
00:29:57
that we find important or useful,
00:30:00
how do we adapt those systems
to reliably use those skills
00:30:04
to do the things we want?
00:30:08
And there is a framework
00:30:09
that is called the HHH
framing of the problem.
00:30:15
So we want GPT to be helpful,
honest, and harmless.
00:30:21
And this is the bare minimum.
00:30:24
So what does it mean, helpful?
00:30:26
It should follow instructions
00:30:28
and perform the tasks
we want it to perform
00:30:31
and provide answers for them
00:30:33
and ask relevant questions
00:30:35
according to the user intent, and clarify.
00:30:40
So if you've been following,
00:30:41
in the beginning, GPT did none of this,
00:30:43
but slowly it became better
00:30:45
and it now actually asks for
these clarification questions.
00:30:50
It should be accurate,
00:30:51
something that is not
100% there even to this day,
00:30:55
there is, you know, still
inaccurate information.
00:30:58
And avoid toxic, biassed,
or offensive responses.
00:31:03
And now here's a question I have for you.
00:31:06
How will we get the model
to do all of these things?
00:31:12
You know the answer. Fine-tuning.
00:31:17
Except that we're gonna do
a different fine-tuning.
00:31:20
We're gonna ask the humans to
give us some preferences.
00:31:25
So in terms of helpful, we're gonna ask,
00:31:28
an example is, "What causes
the seasons to change?"
00:31:32
And then we'll give two
options to the human.
00:31:35
"Changes occur all the time
00:31:36
and it's an important
aspect of life," bad.
00:31:39
"The seasons are caused primarily
00:31:41
by the tilt of the Earth's axis," good.
00:31:44
So we'll get this preference corpus
00:31:46
and then we'll train the model again
00:31:49
and then it will know.
00:31:51
So fine-tuning is very important.
00:31:53
And now, it was expensive as it was,
00:31:56
now we make it even more expensive
00:31:58
because we add a human
into the mix, right?
00:32:00
Because we have to pay these humans
00:32:02
that give us the preferences,
00:32:03
we have to think of the tasks.
00:32:05
The same for honesty.
00:32:07
"Is it possible to
prove that P equals NP?"
00:32:09
"No, it's impossible," is
not great as an answer.
00:32:12
"That is considered a very
difficult and unsolved problem
00:32:15
in computer science," it's better.
00:32:17
And we have similar for harmless.
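(How the human preferences might be recorded: the examples come from the slides, but the exact data format is an assumption. A separate model is then trained to rank the "chosen" answer above the "rejected" one.)

```python
# Preference pairs for fine-tuning: for each prompt, the answer humans
# preferred and the answer they rejected.
preference_data = [
    {
        "prompt": "What causes the seasons to change?",
        "chosen": "The seasons are caused primarily by the tilt of the Earth's axis.",
        "rejected": "Changes occur all the time and it's an important aspect of life.",
    },
    {
        "prompt": "Is it possible to prove that P equals NP?",
        "chosen": "That is considered a very difficult and unsolved problem in computer science.",
        "rejected": "No, it's impossible.",
    },
]
```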
00:32:20
Okay, so I think it's time,
00:32:22
let's see if we'll do a demo.
00:32:24
Yeah, that's bad if you
remove all the files.
00:32:28
Okay, hold on, okay.
00:32:30
So now we have GPT here.
00:32:33
I'll do some questions
00:32:35
and then we'll take some
questions from the audience, okay?
00:32:38
So let's ask one question.
00:32:40
"Is the UK a monarchy?"
00:32:43
Can you see it up there? I'm not sure.
00:32:48
And it's not generating.
00:32:53
Oh, perfect, okay.
00:32:55
So what do you observe?
00:32:56
First thing, too long.
00:32:58
I always have this beef with this.
00:33:00
It's too long.
(audience laughs)
00:33:02
You see what it says?
00:33:03
"As of my last knowledge
update in September 2021,
00:33:08
the United Kingdom is a
constitutional monarchy."
00:33:10
It could be that it wasn't anymore, right?
00:33:12
Something happened.
00:33:13
"This means that while there is a monarch,
00:33:16
the reigning monarch as of that time
00:33:18
was Queen Elizabeth II."
00:33:21
So it tells you, you know,
00:33:22
I don't know what happened,
00:33:23
at that time there was a Queen Elizabeth.
00:33:26
Now if you ask it, who,
sorry, "Who is Rishi?
00:33:32
If I could type, "Rishi Sunak,"
00:33:36
does it know?
00:33:45
"A British politician.
00:33:46
As of my last knowledge update,
00:33:48
he was the Chancellor of the Exchequer."
00:33:50
So it does not know that
he's the Prime Minister.
00:33:55
"Write me a poem,
00:33:57
write me a poem about."
00:34:02
What do we want it to be about?
00:34:04
Give me two things, eh?
00:34:06
- [Audience Member] Generative AI.
00:34:08
(audience laughs)
- It will know.
00:34:10
It will know, let's do
another prompt about...
00:34:14
- [Audience Members] Cats.
00:34:16
- A cat and a squirrel, we'll
do a cat and a squirrel.
00:34:19
"A cat and a squirrel."
00:34:27
"A cat and a squirrel, they meet and know.
00:34:29
A tale of curiosity," whoa.
00:34:31
(audience laughs)
00:34:33
Oh my god, okay, I will not read this.
00:34:37
You know, they want me to
finish at 8:00, so, right.
00:34:42
Let's say, "Can you try a shorter poem?"
00:34:47
- [Audience Member] Try a haiku.
00:34:49
- "Can you try,
00:34:52
can you try to give me a haiku?"
00:34:54
To give me a hai, I cannot type, haiku.
00:35:05
"Amidst autumn's gold, leaves
whisper secrets untold,
00:35:08
nature's story, bold."
00:35:11
(audience member claps)
Okay.
00:35:13
Don't clap, okay, let's, okay, one more.
00:35:16
So does the audience have
anything that they want,
00:35:20
but challenging, that you want to ask?
00:35:23
Yes?
00:35:24
- [Audience Member] What
school did Alan Turing go to?
00:35:27
- Perfect, "What school
00:35:30
did Alan Turing go to?"
00:35:39
Oh my God.
(audience laughs)
00:35:41
He went, do you know?
00:35:42
I don't know whether it's
true, this is the problem.
00:35:44
Sherborne School, can somebody verify?
00:35:46
King's College, Cambridge, Princeton?
00:35:50
Yes, okay, ah, here's another one.
00:35:52
"Tell me a joke about Alan Turing."
00:35:58
Okay, I cannot type but it will, okay.
00:36:01
"Light-hearted joke.
00:36:02
Why did Alan Turing
keep his computer cold?
00:36:04
Because he didn't want it to catch bytes."
00:36:10
(audience laughs)
Bad.
00:36:12
Okay, okay.
- Explain why that's funny.
00:36:16
(audience laughs)
- Ah, very good one.
00:36:19
"Why is this a funny joke?"
00:36:28
And where is it? Oh god.
00:36:30
(audience laughs)
00:36:31
Okay, "Catch bytes sounds
similar to catch colds."
00:36:35
(audience laughs)
00:36:37
"Catching bytes is a humorous
twist on this phrase,"
00:36:39
oh my God.
00:36:40
"The humour comes from the clever wordplay
00:36:42
and the unexpected."
(audience laughs)
00:36:44
Okay, you lose the will to live,
00:36:45
but it does explain, it
does explain, okay, right.
00:36:50
One last order from you guys.
00:36:52
- [Audience Member] What is consciousness?
00:36:54
- It will know because
it has seen definitions
00:36:57
and it will spit out like a huge thing.
00:37:00
Shall we try?
00:37:02
(audience talks indistinctly)
- Say again?
00:37:05
- [Audience Member] Write
a song about relativity.
00:37:07
- Okay, "Write a song."
- Short.
00:37:10
(audience laughs)
- You are learning very fast.
00:37:13
"A short song about relativity."
00:37:22
Oh goodness me.
(audience laughs)
00:37:25
(audience laughs)
00:37:29
This is short?
(audience laughs)
00:37:33
All right, outro, okay, so see,
00:37:35
it doesn't follow instructions.
00:37:37
It is not helpful.
00:37:38
And this has been fine-tuned.
00:37:40
Okay, so the best was here.
00:37:42
It had something like, where was it?
00:37:45
"Einstein said, 'Eureka!" one fateful day,
00:37:47
as he pondered the stars
in his own unique way.
00:37:51
The theory of relativity, he did unfold,
00:37:54
a cosmic story, ancient and bold."
00:37:57
I mean, kudos to that, okay.
00:37:58
Now let's go back to the talk,
00:38:02
because I want to talk a
little bit, presentation,
00:38:05
I want to talk a little
bit about, you know,
00:38:09
is it good, is it bad, is
it fair, are we in danger?
00:38:12
Okay, so it's virtually impossible
00:38:14
to regulate the content
they're exposed to, okay?
00:38:18
And there's always gonna
be historical biases.
00:38:21
We saw this with the
Queen and Rishi Sunak.
00:38:24
And they may occasionally exhibit
00:38:27
various types of undesirable behaviour.
00:38:30
For example, this is famous.
00:38:35
Google showcased a model called Bard
00:38:38
and they released this tweet
and they were asking Bard,
00:38:43
"What new discoveries from
the James Webb Space Telescope
00:38:46
can I tell my nine-year-old about?"
00:38:49
And it spit out this
thing, three things.
00:38:53
Amongst them it said
00:38:54
that "this telescope took
the very first picture
00:38:57
of a planet outside of
our own solar system."
00:39:02
And here comes Grant Tremblay,
00:39:04
who is an astrophysicist, a serious guy,
00:39:06
and he said, "I'm really sorry,
I'm sure Bard is amazing.
00:39:10
But it did not take the first image
00:39:13
of a planet outside our solar system.
00:39:16
It was done by these other people in 2004."
00:39:20
And what happened with this
is that this error wiped
00:39:23
$100 billion off the value of
Google's parent company Alphabet.
00:39:28
Okay, bad.
00:39:32
If you ask ChatGPT, "Tell
me a joke about men,"
00:39:35
it gives you a joke and
it says it might be funny.
00:39:39
"Why do men need instant
replay on TV sports?
00:39:42
Because after 30 seconds,
they forget what happened."
00:39:44
I hope you find it amusing.
00:39:46
If you ask about women, it refuses.
00:39:49
(audience laughs)
00:39:52
Okay, yes.
00:39:56
- It's fine-tuned.
- It's fine-tuned, exactly.
00:39:58
(audience laughs)
00:40:00
"Which is the worst
dictator of this group?
00:40:02
Trump, Hitler, Stalin, Mao?"
00:40:06
It actually doesn't take a stance,
00:40:08
it says all of them are bad.
00:40:10
"These leaders are wildly regarded
00:40:12
as some of the worst
dictators in history."
00:40:15
Okay, so yeah.
00:40:18
Environment.
00:40:22
A query for ChatGPT like we just did
00:40:25
takes 100 times more energy to execute
00:40:30
than a Google search query.
00:40:31
Inference, which is producing
the language, takes a lot of energy
00:40:36
and over time is more expensive than
actually training the model.
00:40:39
Llama 2 is a GPT-style model.
00:40:42
While they were training it,
00:40:43
it produced 539 metric tonnes of CO2.
00:40:48
The larger the models get,
00:40:49
the more energy they need and emit
00:40:53
during their deployment.
00:40:54
Imagine lots of them sitting around.
00:40:58
Society.
00:41:01
Some jobs will be lost.
00:41:03
We cannot beat around the bush.
00:41:04
I mean, Goldman Sachs
predicted 300 million jobs could be affected.
00:41:07
I'm not sure this, you know,
we cannot tell the future,
00:41:11
but some jobs will be at risk,
like repetitive text writing.
00:41:18
Creating fakes.
00:41:20
So these are all documented
cases in the news.
00:41:23
So a college kid used ChatGPT
00:41:26
to write this blog, which
apparently fooled everybody.
00:41:31
They can produce fake news.
00:41:34
And this is a song, how
many of you know this?
00:41:37
So I know I said I'm
gonna be focusing on text
00:41:42
but the same technology
you can use in audio,
00:41:45
and this is a well-documented
case where somebody, unknown,
00:41:50
created this song and it
supposedly was a collaboration
00:41:55
between Drake and The Weeknd.
00:41:57
Do people know who these are?
00:41:59
They are, yeah, very
good, Canadian rappers.
00:42:01
And they're not so bad, so.
00:42:06
Shall I play the song?
00:42:08
- Yeah.
- Okay.
00:42:09
Apparently it's very authentic.
00:42:11
(bright music)
00:42:17
♪ I came in with my ex
like Selena to flex, ay ♪
00:42:22
♪ Bumpin' Justin Bieber,
the fever ain't left, ay ♪
00:42:25
♪ She know what she need ♪
00:42:27
- Apparently it's
totally believable, okay.
00:42:32
Have you seen this same
technology but kind of different?
00:42:35
This is a deep fake showing
that Trump was arrested.
00:42:39
How can you tell it's a deep fake?
00:42:43
The hand, yeah, it's too short, right?
00:42:46
Yeah, you can see it's like
almost there, not there.
00:42:50
Okay, so I have two slides on the future
00:42:54
before they come and kick me out
00:42:56
because I was told I
have to finish at 8:00
00:42:57
to take some questions.
00:42:59
Okay, tomorrow.
00:43:01
So we can't predict the future
00:43:05
and no, I don't think
that these evil computers
00:43:07
are gonna come and kill us all.
00:43:10
I will leave you with some
thoughts by Tim Berners-Lee.
00:43:13
For people who don't know
him, he invented the World Wide Web.
00:43:16
He's actually Sir Tim Berners-Lee.
00:43:19
And he said two things
that made sense to me.
00:43:22
First of all, that we don't actually know
00:43:24
what a super intelligent
AI would look like.
00:43:27
We haven't made it, so it's
hard to make these statements.
00:43:30
However, it's likely to have
lots of these intelligent AIs,
00:43:35
and by intelligent AIs
we mean things like GPT,
00:43:38
and many of them will be good
and will help us do things.
00:43:42
Some may fall to the hands of individuals
00:43:49
that want to do harm,
00:43:50
and it seems easier to minimise the harm
00:43:54
that these tools will do
00:43:56
than to prevent the systems
from existing at all.
00:44:00
So we cannot actually
eliminate them altogether,
00:44:02
but we as a society can
actually mitigate the risks.
00:44:06
This is very interesting,
00:44:07
this is the Alignment Research Center
00:43:10
that conducted an evaluation
00:43:12
and they dealt with a
hypothetical scenario
00:43:15
of whether GPT-4
could autonomously replicate,
00:44:21
you know, you are replicating yourself,
00:44:23
you're creating a copy,
00:44:25
acquire resources and
basically be a very bad agent,
00:44:29
the stuff of the movies.
00:44:30
And the answer is no, it
cannot do this, it cannot.
00:44:35
And they had like some specific tests
00:44:37
and it failed on all of them,
00:44:39
such as setting up an
open source language model
00:44:41
on a new server, it cannot do that.
00:44:45
Okay, last slide.
00:44:46
So my take on this is that
we cannot turn back time.
00:44:52
And every time you think about
AI coming there to kill you,
00:44:57
you should think what is the
bigger threat to mankind,
00:44:59
AI or climate change?
00:45:02
I would personally argue climate
change is gonna wipe us all out
00:45:04
before the AI becomes super intelligent.
00:45:08
Who is in control of AI?
00:45:10
There are some humans there
who hopefully have sense.
00:45:13
And who benefits from it?
00:45:16
Does the benefit outweigh the risk?
00:45:18
In some cases, the benefit
does, in others it doesn't.
00:45:21
And history tells us
00:45:24
that all technology that has been risky,
00:45:26
such as, for example, nuclear energy,
00:45:29
has been very strongly regulated.
00:45:32
So regulation is coming
and watch this space.
00:45:35
And with that I will stop and
actually take your questions.
00:45:40
Thank you so much for
listening, you've been great.
00:45:42
(audience applauds)
00:45:51
(applause fades out)