00:00:00
2024 will be the year of AI agents.
00:00:04
So what are AI agents?
00:00:05
And to start explaining that,
00:00:07
we have to look at the various shifts that
we're seeing in the field of generative AI.
00:00:10
And the first shift I would like to talk
to you about
00:00:13
is this move from monolithic models to compound AI systems.
00:00:26
So models on their own are limited by the data they've been trained on.
00:00:31
So that impacts what they know about the world
00:00:34
and what sort of tasks they can solve.
00:00:40
They are also hard to adapt.
00:00:42
So you could tune a model, but it would take
an investment in data,
00:00:46
and in resources.
00:00:51
So let's take a concrete example
to illustrate this point.
00:00:55
I want to plan a vacation for this summer,
00:00:58
and I want to know how many vacation days are at my disposal.
00:01:06
What I can do is take my query,
00:01:10
feed that into a model that can generate a response.
00:01:19
I think we can all expect that this answer will be incorrect,
00:01:23
because the model doesn't know who I am
00:01:26
and does not have access
to this sensitive information about me.
00:01:30
So models on their own could be useful for a
number of tasks, as we've seen in other videos.
00:01:35
So they can help with summarizing documents,
00:01:38
they can help me with creating first drafts for emails
00:01:41
and different reports I'm trying to do.
00:01:43
But the magic gets unlocked when I start building systems
00:01:47
around the model and actually take the model and
integrate it into the existing processes I have.
00:01:52
So if we were to design a system to solve this,
00:01:56
I would have to give the model access to the
database where my vacation data is stored.
00:02:03
So that same query would get
fed into the language model.
00:02:07
The difference now is the model would
be prompted to create a search query,
00:02:13
and that would be a search query that
can go into the database that I have.
00:02:18
So that would go and fetch the information
from the database, output an answer,
00:02:23
and then that would go back into the
model that can generate a sentence
00:02:28
to answer, so, "Maya, you have ten days
left in your vacation database."
00:02:33
So the answer that I would get here would be correct.
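To make that flow concrete, here is a minimal sketch of what such a system could look like in Python. Everything in it is hypothetical: the llm helper stands in for whatever model you call, and the vacation_days table is an assumed schema, not a real product or API.

```python
import sqlite3

def llm(prompt: str) -> str:
    """Stand-in for a call to whatever language model you're using."""
    raise NotImplementedError

def vacation_days_left(user_query: str, db_path: str = "hr.db") -> str:
    # 1. Prompt the model to turn the user's question into a search query.
    sql = llm(
        "Write one SQL query against vacation_days(employee, days_left) "
        f"that answers: {user_query}"
    )
    # 2. Run that query against the database where the vacation data lives.
    with sqlite3.connect(db_path) as conn:
        rows = conn.execute(sql).fetchall()
    # 3. Feed the result back into the model so it can phrase the final answer.
    return llm(
        f"Question: {user_query}\nDatabase result: {rows}\n"
        "Answer in one short sentence."
    )
```

The point of the sketch is that the model appears twice: once to write the search query, and once to turn the retrieved rows into a sentence.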
00:02:42
This is an example of a compound AI system,
00:02:45
and it recognizes that certain problems are better solved
00:02:48
when you apply the principles of system design.
00:02:55
So what does that mean?
00:02:58
By the term "system", you can understand there are multiple components.
00:03:02
So systems are inherently modular.
00:03:04
I can have a model, I can choose between tuned models,
00:03:08
large language models, image generation models,
00:03:11
but also I have programmatic components that can come around it.
00:03:15
So I can have output verifiers.
00:03:18
I can have programs that can take
a query and then break it down
00:03:21
to increase the chances of the answer being correct.
00:03:25
I can combine that with searching databases.
00:03:27
I can combine that with different tools.
00:03:30
So when we're talking about a system approach,
00:03:33
I can break down what I desire my program to do
00:03:36
and pick the right components to be able to solve that.
00:03:40
And this is inherently easier to solve for than tuning a model.
00:03:45
So that makes this much quicker to adapt.
00:03:54
Okay, so the example I used earlier
00:03:58
is an example of a compound AI system.
00:04:00
You might also be familiar with retrieval-augmented generation (RAG),
00:04:05
which is one of the most popular
and commonly used compound AI systems out there.
00:04:11
Most RAG systems, and the example I
used earlier, are defined in a certain way.
00:04:18
So if I bring a very different query, let's
ask about the weather in this example here.
00:04:23
It's going to fail because the path
that this program has to follow
00:04:28
is to always search my vacation policy database.
00:04:32
And that has nothing to do with the weather.
00:04:34
So when we say the path to answer a query,
00:04:37
we are talking about something called
the control logic of a program.
00:04:43
So compound AI systems, we said
most of them have programmatic control logic.
00:04:49
So that was something that I defined myself as the human.
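As a sketch, that human-defined control logic is just a fixed sequence of steps. Reusing the hypothetical llm helper and vacation database from the earlier sketch (run_vacation_db here is another made-up helper), it might look like this:

```python
def handle_query(user_query: str) -> str:
    # Control logic defined by the human: every query, no matter what it
    # asks, follows this one path through the vacation database.
    sql = llm(
        "Write a SQL query over vacation_days(employee, days_left) "
        f"for: {user_query}"
    )
    rows = run_vacation_db(sql)  # hypothetical helper around the database
    return llm(f"Answer '{user_query}' using only this data: {rows}")

# handle_query("What will the weather be in Florida?") still searches the
# vacation database, so the program can't help with that question.
```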
00:04:55
Now let's talk about, where do agents come in?
00:05:00
One other way of controlling the logic
of a compound AI system
00:05:04
is to put a large language model in charge,
00:05:07
and this is only possible because
we're seeing tremendous improvements
00:05:11
in the capabilities of reasoning
of large language models.
00:05:15
So large language models, you
can feed them complex problems
00:05:18
and you can prompt them to break them down
and come up with a plan on how to tackle it.
00:05:23
Another way to think about it is,
00:05:25
on one end of the spectrum,
I'm telling my system to think fast,
00:05:30
act as programmed, and don't deviate
from the instructions I've given you.
00:05:34
And on the other end of the spectrum,
00:05:36
you're designing your system to think slow.
00:05:40
So, create a plan, attack each part of the plan,
00:05:44
see where you get stuck, see if you need to readjust the plan.
00:05:47
So I might give you a complex question,
00:05:49
and if you would just give me the
first answer that pops into your head,
00:05:53
very likely the answer might be wrong,
00:05:55
but you have higher chances of success
if you break it down,
00:05:59
understand where you need external help to
solve some parts of the problem,
00:06:02
and maybe take an afternoon to solve it.
00:06:05
And when we put an LLM in charge of the logic,
00:06:08
this is when we're talking
about an agentic approach.
00:06:13
So let's break down the components of LLM agents.
00:06:19
The first capability is the ability to reason, which we talked about.
00:06:24
So this is putting the model at the core of how problems are being solved.
00:06:29
The model will be prompted to come up with a plan
and to reason about each step of the process along the way.
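For instance, that prompting can be as simple as a template like the one below. The wording is just an illustration, not a canonical prompt.

```python
PLANNING_PROMPT = (
    "You are solving the following problem: {question}\n"
    "First, write a short numbered plan.\n"
    "Then work through each step, noting where you need outside information.\n"
    "Only give a final answer once every step has been checked."
)

# Filled in at runtime, e.g. PLANNING_PROMPT.format(question=user_query)
```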
00:06:35
Another capability of agents is the ability to act.
00:06:39
And this is done by external programs
that are known in the industry as tools.
00:06:45
So tools are external programs,
00:06:48
and the model can define when to call them
and how to call them
00:06:52
in order to best execute the
solution to the question it's been asked.
00:06:56
So an example of a tool can be search,
00:06:59
searching the web, or searching a database at its disposal.
00:07:03
Another example can be a
calculator to do some math.
00:07:08
This could be a piece of program code
that might manipulate a database.
00:07:13
This can also be another language model: maybe
you're trying to do a translation task,
00:07:18
and you want a model that specializes in that.
00:07:21
And there are so many other possibilities of what can go here.
00:07:23
So these can be APIs.
00:07:25
Basically any piece of external program
you want to give your model access to.
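One common way to wire this up, sketched here with hypothetical functions rather than any specific framework, is to register each tool with a name and a description the model can read when deciding what to call:

```python
def web_search(query: str) -> str: ...       # search the web or a database
def calculator(expression: str) -> str: ...  # do some math
def run_sql(statement: str) -> str: ...      # code that manipulates a database
def translate(text: str) -> str: ...         # another model, e.g. for translation

# The model sees only the names and descriptions; the surrounding program
# executes the actual call when the model decides a tool is needed.
TOOLS = {
    "web_search": (web_search, "Search the web or a database."),
    "calculator": (calculator, "Evaluate a math expression."),
    "run_sql":    (run_sql,    "Run a query that reads or updates a database."),
    "translate":  (translate,  "Translate text using another language model."),
}
```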
00:07:30
The third capability is
the ability to access memory.
00:07:35
And the term "memory" can mean a couple of things.
00:07:37
So we talked about the model thinking through the problem,
00:07:41
kind of how you think out loud
when you're trying to work through a problem.
00:07:45
So those inner logs can be stored and can be
useful to retrieve at different points in time.
00:07:51
But also this could be the history of
conversations that you as a human had
00:07:56
when interacting with the agent.
00:07:57
And that would allow the agent to make the experience
much more personalized.
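Memory doesn't have to be anything exotic. A minimal sketch, again with made-up names, is just two lists: one for the model's inner reasoning logs and one for the conversation history with the user.

```python
class AgentMemory:
    def __init__(self):
        self.inner_logs = []     # the model's step-by-step "thinking out loud"
        self.conversation = []   # past exchanges with this particular user

    def remember_thought(self, thought: str) -> None:
        self.inner_logs.append(thought)

    def remember_turn(self, user_msg: str, agent_msg: str) -> None:
        self.conversation.append((user_msg, agent_msg))

    def recall(self) -> dict:
        # Retrieved later so answers can build on earlier questions,
        # e.g. the vacation days I already asked about.
        return {"thoughts": self.inner_logs, "history": self.conversation}
```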
00:08:01
So as for configuring agents,
there are many ways to approach it.
00:08:05
One of the most popular ways of going about it is through something called ReAct,
00:08:11
which, as you can tell by the name,
00:08:13
combines the reasoning and act components of LLM agents.
00:08:18
So let's make this very concrete.
00:08:21
What happens when I configure a ReAct agent?
00:08:23
You have your user query that gets fed into a model.
So the LLM is given a prompt.
00:08:31
So the instructions given are: don't
give me the first answer that pops into your head.
00:08:37
Think slow. Plan your work.
And then try to execute something.
00:08:44
Try to act.
And when you want to act, you can define whether
00:08:49
you want to use external tools to
help you come up with the solution.
00:08:53
Once you call a
tool, you get an answer.
00:08:56
Maybe it gave you the wrong answer
or it came up with an error.
00:09:00
You can observe that.
So the LLM would observe
00:09:02
the answer and determine if it does answer the
question at hand, or whether it needs to iterate
00:09:08
on the plan and tackle it differently,
up until it gets to a final answer.
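Here is a minimal sketch of that loop in Python. It is not any particular framework's API; the llm helper, the TOOLS registry, and the ACTION/FINAL reply format are all assumptions carried over from the earlier sketches.

```python
def react_agent(user_query: str, max_steps: int = 5) -> str:
    scratchpad = ""  # accumulated thoughts, actions, and observations
    for _ in range(max_steps):
        # Think slow: plan, then either pick a tool or give a final answer.
        step = llm(
            "Reason step by step before answering.\n"
            f"Question: {user_query}\n{scratchpad}\n"
            "Reply with 'ACTION: <tool> <input>' or 'FINAL: <answer>'."
        )
        if step.startswith("FINAL:"):
            return step[len("FINAL:"):].strip()

        # Act: call the requested tool, then observe what came back.
        _, tool_name, tool_input = step.split(maxsplit=2)
        tool_fn, _description = TOOLS[tool_name]
        observation = tool_fn(tool_input)
        scratchpad += f"\n{step}\nOBSERVATION: {observation}"
    return "No confident answer within the step budget."
```

Each pass through the loop is one think-act-observe cycle; the observation goes back into the prompt so the model can revise its plan.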
00:09:17
So let's go back and make
this very concrete again.
00:09:20
Let's talk about my vacation example.
And as you can tell, I'm really excited
00:09:25
to go on one, so I want to take
the rest of my vacation days.
00:09:29
I'm planning to go to Florida next month.
00:09:32
I'm planning on being outdoors
a lot and I'm prone to burning.
00:09:35
So I want to know how many two-ounce
sunscreen bottles I should bring with me.
00:09:43
And this is a complex problem.
There are a number of things to plan.
00:09:45
One is how many vacation days
00:09:49
am I planning to take?
And maybe that is information
00:09:52
the system can retrieve from its memory.
Because I asked that question before.
00:09:56
Two is how many hours do I plan to be in the sun?
I said I plan to be outdoors a lot,
00:10:01
so maybe that would mean looking into the weather
forecast for next month in Florida and seeing
00:10:06
what the average expected sun hours are.
Three is maybe going to a public health
00:10:13
website to understand what is the recommended
dosage of sunscreen per hour in the sun.
00:10:17
And then four is doing some math to be able
to determine how much of that sunscreen
00:10:22
fits into two-ounce bottles.
So that's quite complicated.
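Just to show the kind of math that last step involves, here is a back-of-the-envelope calculation with made-up numbers; in the real system those values would come from memory, the weather lookup, and the public health website.

```python
vacation_days = 10        # retrieved from memory (asked about this before)
sun_hours_per_day = 6     # from the weather forecast lookup (assumed)
oz_per_sun_hour = 0.5     # from the public health recommendation (assumed)
bottle_size_oz = 2.0

total_oz = vacation_days * sun_hours_per_day * oz_per_sun_hour   # 30.0 oz
bottles = -(-total_oz // bottle_size_oz)                         # ceiling: 15 bottles
print(f"Bring {int(bottles)} two-ounce bottles of sunscreen.")
```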
00:10:25
But what's really powerful here is
that there are so many paths that can be
00:10:29
explored in order to solve a problem.
So this makes the system quite modular.
00:10:33
And I can hit it with much more complex problems.
So going back to the concept of compound AI
00:10:40
systems, compound AI systems are here to stay.
What we're going to observe this year is that
00:10:44
they're going to become more agentic.
The way I like to think about it is
00:10:49
you have a sliding scale of AI autonomy.
And the person defining the system
00:11:02
would examine what trade-offs they want in terms
of autonomy in the system for certain problems,
00:11:09
especially problems that are narrow, well-defined.
So you don't expect someone to ask it about the
00:11:14
weather when they need to ask about vacations.
So a narrow problem set.
00:11:19
You can define a narrow system like this one.
It's more efficient to go the programmatic
00:11:24
route because every single query
will be answered the same way.
00:11:27
If I were to apply the agentic approach here,
there might be unnecessary
00:11:32
looping and iteration.
So for narrow problems, a programmatic approach can
00:11:36
be more efficient than going the agentic route.
But if I expect a system to accomplish very
00:11:43
complex tasks, like, say, trying to solve
GitHub issues independently, and to handle
00:11:50
a variety of queries, a spectrum of queries,
this is where an agentic route can be helpful,
00:11:54
because it would take you too much effort to
configure every single path in the system.
00:11:59
And we're still in the early days of agentic systems.
00:12:02
We're seeing rapid progress when you combine the
effects of system design with agentic behavior.
00:12:08
And of course, you will have a human in the
loop in most cases as the accuracy is improving.
00:12:13
I hope you found this video very useful, and
please subscribe to the channel to learn more.