What are AI Agents?

00:12:28
https://www.youtube.com/watch?v=F8NKVhkZZWI

Summary

TL;DR: In 2024, AI agents are set to transform the landscape of problem-solving through their advanced capabilities. Unlike traditional AI models limited by their pre-trained data, AI agents operate as part of compound AI systems. These systems leverage the reasoning abilities of large language models to tackle complex queries by forming plans, utilizing external tools, and accessing memory for personalized, effective solutions. AI agents are characterized by their modular components, allowing for integration with databases and other tools to provide correct responses tailored to user-specific needs. This approach is advantageous over the conventional tuning of models due to faster adaptability and the ability to handle a spectrum of tasks. The introduction of agentic processes like ReAct illustrates the synthesis of reasoning and action components, paving the way for more autonomous systems. While human involvement remains crucial, the evolution of compound AI systems into more agentic models promises efficiency in addressing both simple and intricate problems across various domains.

Key takeaways

  • 🤖 AI agents will be prominent in 2024, enhancing problem-solving capabilities.
  • 📊 Compound AI systems integrate models into processes for efficient problem-solving.
  • 🧩 Modular systems allow for versatile use of models, tools, and databases.
  • 🧠 Reasoning in AI agents helps in planning and solving complex queries.
  • 🔧 Tools in AI agents enable acting and execution of solutions.
  • 💾 Memory access in AI agents aids in personalizing interactions.
  • ⚙️ System control logic defines paths and influences task execution.
  • 📈 Large language models boost agentic AI through improved reasoning.
  • 🔍 ReAct combines reasoning and acting for dynamic AI problem-solving.
  • 🔄 Human oversight remains crucial in developing AI accuracy and effectiveness.

Timeline

  • 00:00:00 - 00:05:00

    2024 is highlighted as the year of AI agents, with an initial focus on shifts in generative AI, especially moving from monolithic models to compound AI systems. Compound AI systems enhance models by integrating them with existing processes. A key example involves querying personal vacation data, demonstrating a system that links a language model to a database for accurate responses. This approach emphasizes modularity, involving models, programmatic components, and potentially multiple tools, adapting efficiently compared to tuning a single model. Retrieval augmented generation (RAG) is mentioned as common in such compound systems, although it may fail in unrelated queries due to predefined control logic.

  • 00:05:00 - 00:12:28

    Control logic in AI systems traditionally requires human-defined programmatic paths, but a new approach uses large language models (LLMs) to manage this logic, capitalizing on improved reasoning capabilities. AI agents, enabled by LLMs, can design comprehensive multi-step plans to solve complex problems, drawing on external tools and historical interaction data. They operate on a spectrum of autonomy, from executing pre-defined actions ('think fast') to dynamically planning ('think slow'). The ReAct framework exemplifies this, encouraging models to plan and adjust responses using tools as needed. Applications of agent systems, such as vacation planning in complex scenarios, demonstrate adaptability and modularity, supporting more intricate tasks and reducing human configuration effort. The video concludes by discussing efficiency trade-offs between agent approaches and traditional programmatic methods.

Frequently asked questions

  • What are compound AI systems?

    Compound AI systems integrate models into existing processes with multiple components, making them versatile and adaptable for various tasks.

  • How do AI agents differ from traditional AI systems?

    AI agents use reasoning and acting capabilities to devise plans and employ tools for problem-solving, unlike traditional systems with fixed programming.

  • What capabilities make AI agents innovative?

    AI agents have the ability to act using external tools, reason through complex problems, and access memory for personalized experiences.

  • How do large language models contribute to AI agent behavior?

    Large language models use reasoning to break down complex problems, plan solutions, and adjust plans when necessary.

  • How do AI agents use memory?

    AI agents can access history logs and previous interactions to create personalized and context-aware responses.

  • What is ReAct in the context of AI agents?

    ReAct agents combine large language models' reasoning with the ability to perform tasks using external tools.

  • What is control logic in compound AI systems?

    The control logic dictates the path a system follows to answer a query, influencing its ability to solve different types of problems.

  • What is the future role of AI agents?

    AI agents will become prevalent in handling complex and varied tasks, with improvements in autonomy and system design.

  • Are AI agents fully pre-programmed?

    No, AI agents use an agentic approach allowing for dynamic problem-solving unlike fixed, programmatic systems.

  • Is human intervention still necessary in AI agent systems?

    Yes, human oversight is crucial as AI systems' accuracy continues to improve.

Transcript
  • 00:00:00
    2024 will be the year of AI agents.
  • 00:00:04
    So what are AI agents?
  • 00:00:05
    And to start explaining that,
  • 00:00:07
    we have to look at the various shifts that  we're seeing in the field of generative AI.
  • 00:00:10
    And the first shift I would like to talk  to you about
  • 00:00:13
    is this move from monolithic models to compound AI systems.
  • 00:00:26
    So models on their own are limited by the data they've been trained on.
  • 00:00:31
    So that impacts what they know about the world
  • 00:00:34
    and what sort of tasks they can solve.
  • 00:00:40
    They are also hard to adapt.
  • 00:00:42
    So you could tune a model, but it would take  an investment in data,
  • 00:00:46
    and in resources.
  • 00:00:51
    So let's take a concrete example  to illustrate this point.
  • 00:00:55
    I want to plan a vacation for this summer,
  • 00:00:58
    and I want to know how many vacation days are at my disposal.
  • 00:01:06
    What I can do is take my query,
  • 00:01:10
    feed that into a model that can generate a response.
  • 00:01:19
    I think we can all expect that this answer will be incorrect,
  • 00:01:23
    because the model doesn't know who I am
  • 00:01:26
    and does not have access  to this sensitive information about me.
  • 00:01:30
    So models on their own could be useful for a  number of tasks, as we've seen in other videos.
  • 00:01:35
    So they can help with summarizing documents,
  • 00:01:38
    they can help me with creating first drafts for emails
  • 00:01:41
    and different reports I'm trying to do.
  • 00:01:43
    But the magic gets unlocked when I start building systems
  • 00:01:47
    around the model and actually take the model and integrate it into the existing processes I have.
  • 00:01:52
    So if we were to design a system to solve this,
  • 00:01:56
    I would have to give the model access to the  database where my vacation data is stored.
  • 00:02:03
    So that same query would get  fed into the language model.
  • 00:02:07
    The difference now is the model would  be prompted to create a search query,
  • 00:02:13
    and that would be a search query that  can go into the database that I have.
  • 00:02:18
    So that would go and fetch the information  from the database, output an answer,
  • 00:02:23
    and then that would go back into the  model that can generate a sentence
  • 00:02:28
    to answer, so, "Maya, you have ten days  left in your vacation database."
  • 00:02:33
    So the answer that I would get here would be correct.
  • 00:02:42
    This is an example of a compound AI system,
  • 00:02:45
    and it recognizes that certain problems are better solved
  • 00:02:48
    when you apply the principles of system design.
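The vacation-days flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the two `fake_llm_*` functions are hypothetical stand-ins for real model calls, and the `vacation` table, its columns, and the stored value are invented for the demo.

```python
import sqlite3

def fake_llm_make_query(user_name):
    """Hypothetical stand-in for the model step that writes a search query."""
    return "SELECT days_left FROM vacation WHERE employee = ?", (user_name,)

def fake_llm_write_answer(user_name, days_left):
    """Hypothetical stand-in for the model step that phrases the final answer."""
    return f"{user_name}, you have {days_left} vacation days left."

def build_demo_db():
    """Create an in-memory database with one made-up record."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vacation (employee TEXT PRIMARY KEY, days_left INTEGER)"
    )
    conn.execute("INSERT INTO vacation VALUES (?, ?)", ("Maya", 10))
    return conn

def answer_vacation_question(conn, user_name):
    """Model writes a query, the database answers it, the model phrases the reply."""
    sql, params = fake_llm_make_query(user_name)
    row = conn.execute(sql, params).fetchone()
    if row is None:
        return f"No vacation record found for {user_name}."
    return fake_llm_write_answer(user_name, row[0])

if __name__ == "__main__":
    print(answer_vacation_question(build_demo_db(), "Maya"))
```

The model never stores the vacation balance itself. It only produces the query and the wording, while the database supplies the fact.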
  • 00:02:55
    So what does that mean?
  • 00:02:58
    By the term "system", you can understand there's multiple components.
  • 00:03:02
    So systems are inherently modular.
  • 00:03:04
    I can have a model, I can choose between tuned models,
  • 00:03:08
    large language models, image generation models,
  • 00:03:11
    but also I have programmatic components that can come around it.
  • 00:03:15
    So I can have output verifiers.
  • 00:03:18
    I can have programs that can take a query and then break it down
  • 00:03:21
    to increase the chances of the answer being correct.
  • 00:03:25
    I can combine that with searching databases.
  • 00:03:27
    I can combine that with different tools.
  • 00:03:30
    So when we're talking about a systems approach,
  • 00:03:33
    I can break down what I desire my program to do
  • 00:03:36
    and pick the right components to be able to solve that.
  • 00:03:40
    And this is inherently easier to solve for than tuning a model.
  • 00:03:45
    So that makes this much faster and quicker to adapt.
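One way to picture this modularity is a system whose parts are plain functions that can be swapped independently. The sketch below is an assumption-laden toy: `split_query`, `stub_model`, and `verify_output` are hypothetical components invented here, not anything named in the video.

```python
def split_query(query):
    """Toy query decomposer: split a compound question on the word 'and'."""
    return [part.strip() for part in query.split(" and ") if part.strip()]

def stub_model(sub_query):
    """Hypothetical stand-in for a model call."""
    return f"answer to: {sub_query}"

def verify_output(text):
    """Toy output verifier: accept non-empty answers under 200 characters."""
    return bool(text) and len(text) < 200

def run_system(query, decompose=split_query, model=stub_model, verifier=verify_output):
    """Run each sub-query through the model and keep only verified outputs."""
    answers = []
    for part in decompose(query):
        output = model(part)
        answers.append(output if verifier(output) else "unverified")
    return answers

if __name__ == "__main__":
    print(run_system("how many vacation days do I have and what is the policy"))
    # Swapping one component does not require touching the others.
    print(run_system("a and b", model=str.upper))
```

Replacing the model, the decomposer, or the verifier is a one-argument change, which is the sense in which a system is quicker to adapt than a tuned model.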
  • 00:03:54
    Okay, so the example I use below,
  • 00:03:58
    is an example of a compound AI system.
  • 00:04:00
    You also might be familiar with retrieval augmented generation (RAG),
  • 00:04:05
    which is one of the most popular  and commonly used compound AI systems out there.
  • 00:04:11
    Most RAG systems and the example I  use below are defined in a certain way.
  • 00:04:18
    So if I bring a very different query, let's  ask about the weather in this example here.
  • 00:04:23
    It's going to fail because the path that this program has to follow
  • 00:04:28
    is to always search my vacation policy database.
  • 00:04:32
    And that has nothing to do with the weather.
  • 00:04:34
    So when we say the path to answer a query,
  • 00:04:37
    we are talking about something called  the control logic of a program.
  • 00:04:43
    So compound AI systems, we said, most of them have programmatic control logic.
  • 00:04:49
    So that was something that I defined myself as the human.
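A fixed, human-defined path can be sketched as a function that always performs the same lookup. The policy text and function names below are made up for illustration. The point is only that a weather question has nowhere to go.

```python
# Made-up policy snippet; a real system would search an actual policy database.
VACATION_POLICY = {
    "vacation days": "Employees accrue 20 vacation days per year.",
}

def search_vacation_policy(query):
    """Return policy entries whose key phrase appears in the query."""
    lowered = query.lower()
    return [text for key, text in VACATION_POLICY.items() if key in lowered]

def fixed_pipeline(query):
    """Programmatic control logic: every query takes the same path."""
    hits = search_vacation_policy(query)
    if not hits:
        return "Sorry, I could not find an answer."
    return hits[0]

if __name__ == "__main__":
    print(fixed_pipeline("How many vacation days do I get?"))
    print(fixed_pipeline("What is the weather in Florida?"))
```

The second query fails not because the lookup is broken but because the control logic never considers any other path.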
  • 00:04:55
    Now let's talk about, where do agents come in?
  • 00:05:00
    One other way of controlling the logic  of a compound AI system
  • 00:05:04
    is to put a large language model in charge,
  • 00:05:07
    and this is only possible because   we're seeing tremendous improvements
  • 00:05:11
    in the capabilities of reasoning   of large language models.
  • 00:05:15
    So large language models, you  can feed them complex problems
  • 00:05:18
    and you can prompt them to break them down  and come up with a plan on how to tackle it.
  • 00:05:23
    Another way to think about it is,
  • 00:05:25
    on one end of the spectrum,  I'm telling my system to think fast,
  • 00:05:30
    act as programmed, and don't deviate  from the instructions I've given you.
  • 00:05:34
    And on the other end of the spectrum,
  • 00:05:36
    you're designing your system to think slow.
  • 00:05:40
    So, create a plan, attack each part of the plan,
  • 00:05:44
    see where you get stuck, see if you need to readjust the plan.
  • 00:05:47
    So I might give you a complex question,
  • 00:05:49
    and if you would just give me the  first answer that pops into your head,
  • 00:05:53
    very likely the answer might be wrong,
  • 00:05:55
    but you have higher chances of success  if you break it down,
  • 00:05:59
    understand where you need external help to  solve some parts of the problem,
  • 00:06:02
    and maybe take an afternoon to solve it.
  • 00:06:05
    And when we put LLMs in charge of the logic,
  • 00:06:08
    this is when we're talking  about an agentic approach.
  • 00:06:13
    So let's break down the components of LLM agents.
  • 00:06:19
    The first capability is the ability to reason, which we talked about.
  • 00:06:24
    So this is putting the model at the core of how problems are being solved.
  • 00:06:29
    The model will be prompted to come up with a plan  and to reason about each step of the process along the way.
  • 00:06:35
    Another capability of agents is the ability to act.
  • 00:06:39
    And this is done by external programs  that are known in the industry as tools.
  • 00:06:45
    So tools are external pieces of the program,
  • 00:06:48
    and the model can define when to call them  and how to call them
  • 00:06:52
    in order to best execute the  solution to the question they've been asked.
  • 00:06:56
    So an example of a tool can be search,
  • 00:06:59
    searching the web, searching a database at their disposal.
  • 00:07:03
    Another example can be a  calculator to do some math.
  • 00:07:08
    This could be a piece of program code  that maybe might manipulate the database.
  • 00:07:13
    This can also be another language model that  maybe you're trying to do a translation task,
  • 00:07:18
    and you want a model that can be able to do that.
  • 00:07:21
    And there are so many other possibilities of what you can do here.
  • 00:07:23
    So these can be APIs.
  • 00:07:25
    Basically any piece of external program  you want to give your model access to.
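A minimal sketch of the tool idea, assuming tools are ordinary functions kept in a registry and the model's only job is to name a tool and supply an argument. The tool names, the toy calculator syntax, and the stored record are all hypothetical.

```python
import operator

# Toy calculator: supports "<number> <op> <number>" only.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def calculator(expression):
    left, op, right = expression.split()
    return str(OPS[op](float(left), float(right)))

def search_database(term):
    """Toy database search over a made-up record set."""
    records = {"vacation days": "10"}
    return records.get(term, "no result")

# Registry of tools the model is allowed to call (names are assumptions).
TOOLS = {"calculator": calculator, "search_database": search_database}

def call_tool(name, argument):
    """Dispatch a tool call chosen by the model; unknown names return an error string."""
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return TOOLS[name](argument)

if __name__ == "__main__":
    print(call_tool("calculator", "6 * 7"))
    print(call_tool("search_database", "vacation days"))
    print(call_tool("translator", "hello"))
```

Returning an error string instead of raising lets the model see the failure as an observation and choose a different tool.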
  • 00:07:30
    Third capability, that is  the ability to access memory.
  • 00:07:35
    And the term "memory" can mean a couple of things.
  • 00:07:37
    So we talked about the models thinking through the program
  • 00:07:41
    kind of how you think out loud  when you're trying to solve through a problem.
  • 00:07:45
    So those inner logs can be stored and can be  useful to retrieve at different points in time.
  • 00:07:51
    But also this could be the history of  conversations that you as a human had
  • 00:07:56
    when interacting with the agent.
  • 00:07:57
    And that would allow you to make the experience much more personalized.
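The two kinds of memory mentioned here, the agent's own reasoning log and the conversation history, could be held in a small structure like the one below. The class and method names are assumptions made for this sketch, and keyword matching stands in for whatever retrieval a real agent would use.

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    """Toy memory: inner reasoning logs plus the history of the conversation."""
    reasoning_log: list = field(default_factory=list)
    conversation: list = field(default_factory=list)

    def log_thought(self, thought):
        """Store one step of the agent's 'thinking out loud'."""
        self.reasoning_log.append(thought)

    def remember_turn(self, speaker, text):
        """Store one turn of the conversation as a (speaker, text) pair."""
        self.conversation.append((speaker, text))

    def recall(self, keyword):
        """Return past conversation texts that mention the keyword."""
        needle = keyword.lower()
        return [text for _, text in self.conversation if needle in text.lower()]

if __name__ == "__main__":
    memory = AgentMemory()
    memory.remember_turn("user", "How many vacation days do I have?")
    memory.remember_turn("agent", "Maya, you have 10 vacation days left.")
    memory.log_thought("The balance came from the vacation database.")
    print(memory.recall("vacation"))
```

With the earlier answer stored, a later question about the same trip can reuse it instead of querying the database again.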
  • 00:08:01
    So the way of configuring agents, there are many ways to approach it.
  • 00:08:05
    One of the most popular ways of going about it is through something called ReAct,
  • 00:08:11
    which, as you can tell by the name,
  • 00:08:13
    combines the reasoning and act components of LLM agents.
  • 00:08:18
    So let's make this very concrete.
  • 00:08:21
    What happens when I configure a ReAct agent?
  • 00:08:23
    You have your user query that gets fed into a model. So the LLM is given a prompt.
  • 00:08:31
    So the instruction that's given is: don't give me the first answer that pops into your head.
  • 00:08:37
    Think slow, plan your work, and then try to execute something.
  • 00:08:44
    Try to act. And when you want to act, you can define whether
  • 00:08:49
    you want to use external tools to help you come up with the solution.
  • 00:08:53
    Once you call a tool and you get an answer,
  • 00:08:56
    maybe it gave you the wrong answer or it came up with an error.
  • 00:09:00
    You can observe that. So the LLM would observe
  • 00:09:02
    the answer and determine whether it answers the question at hand, or whether it needs to iterate
  • 00:09:08
    on the plan and tackle it differently, up until it gets to a final answer.
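The think, act, observe cycle described above can be sketched as a short loop. Everything here is an assumption for illustration: `scripted_model` is a hard-coded stand-in for the LLM, the `lookup_days` tool and its data are invented, and a real agent would produce its thoughts and actions from a prompt rather than from `if` statements.

```python
VACATION_DB = {"Maya": 10}  # made-up data

def lookup_days(name):
    """Toy tool: return the vacation balance, or an error string if missing."""
    if name not in VACATION_DB:
        return f"error: no record for {name}"
    return str(VACATION_DB[name])

TOOLS = {"lookup_days": lookup_days}

def scripted_model(question, observations):
    """Stand-in for the LLM: returns (thought, action, argument)."""
    if not observations:
        return "Look up the vacation balance.", "lookup_days", "maya"
    last = observations[-1]
    if last.startswith("error"):
        return "The lookup failed; retry with the capitalized name.", "lookup_days", "Maya"
    return "The balance answers the question.", "finish", f"You have {last} vacation days left."

def react_loop(question, model, tools, max_steps=5):
    """Alternate reasoning and tool calls until the model finishes or steps run out."""
    observations, trace = [], []
    for _ in range(max_steps):
        thought, action, argument = model(question, observations)
        trace.append((thought, action, argument))
        if action == "finish":
            return argument, trace
        observations.append(tools[action](argument))  # act, then observe
    return "Stopped: step limit reached.", trace

if __name__ == "__main__":
    answer, trace = react_loop("How many vacation days do I have?", scripted_model, TOOLS)
    for step in trace:
        print(step)
    print(answer)
```

The first tool call deliberately fails so the loop shows an observed error followed by a revised action. The `max_steps` cap is a design choice added here to keep a confused agent from looping forever.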
  • 00:09:17
    So let's go back and make  this very concrete again.
  • 00:09:20
    Let's talk about my vacation example. And as you can tell, I'm really excited
  • 00:09:25
    to go on one, so I want to take  the rest of my vacation days.
  • 00:09:29
    I'm planning to go to Florida next month.
  • 00:09:32
    I'm planning on being outdoors  a lot and I'm prone to burning.
  • 00:09:35
    So I want to know what is the number of two-ounce sunscreen bottles that I should bring with me?
  • 00:09:43
    And this is a complex problem. So there's a first thing.
  • 00:09:45
    There's a number of things to plan. One is how many vacation days
  • 00:09:49
    am I planning to take? And maybe that is information
  • 00:09:52
    the system can retrieve from its memory. Because I asked that question before.
  • 00:09:56
    Two is how many hours do I plan to be in the sun? I said, I plan to be in there a lot,
  • 00:10:01
    so maybe that would mean looking into the weather  forecast, for next month in Florida and seeing
  • 00:10:06
    what is the average sun hours that are expected. Three is maybe going to a public health
  • 00:10:13
    website to understand what is the recommended  dosage of sunscreen per hour in the sun.
  • 00:10:17
    And then four, doing some math, to be able to determine how much of that sunscreen
  • 00:10:22
    fits into two-ounce bottles. So that's quite complicated.
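The final math step of the four-part plan can be written down directly. The video gives no numbers, so every value in the usage line is a placeholder chosen only to make the arithmetic visible. In an agent, the three inputs would come from memory, a weather lookup, and a public health source respectively.

```python
import math

def sunscreen_bottles(days, sun_hours_per_day, ounces_per_hour, bottle_size_oz=2.0):
    """Number of bottles needed, rounded up to whole bottles."""
    total_ounces = days * sun_hours_per_day * ounces_per_hour
    return math.ceil(total_ounces / bottle_size_oz)

if __name__ == "__main__":
    # Placeholder inputs, not figures from the video or from any health guidance:
    # 10 days * 6 hours/day * 0.5 oz/hour = 30 oz, and 30 oz / 2 oz per bottle = 15.
    print(sunscreen_bottles(days=10, sun_hours_per_day=6, ounces_per_hour=0.5))  # 15
```

Rounding up matters because a partial bottle still has to be packed as a whole one.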
  • 00:10:25
    But what's really powerful here is  there's so many paths that can be
  • 00:10:29
    explored in order to solve a problem. So this makes the system quite modular.
  • 00:10:33
    And I can hit it with much more complex problems. So going back to the concept of compound AI
  • 00:10:40
    systems, compound AI systems are here to stay. What we're going to observe this year is that
  • 00:10:44
    they're going to become more agentic. The way I like to think about it is
  • 00:10:49
    you have a sliding scale of AI autonomy. And the person defining the system
  • 00:11:02
    would examine what trade-offs they want in terms of autonomy in the system. For certain problems,
  • 00:11:09
    especially problems that are narrow, well-defined. So you don't expect someone to ask them about the
  • 00:11:14
    weather when they need to ask about vacations. So a narrow problem set.
  • 00:11:19
    You can define a narrow system like this one. It's more efficient to go the programmatic
  • 00:11:24
    route because every single query  will be answered the same way.
  • 00:11:27
    If I were to apply the agentic approach here, there might be unnecessary
  • 00:11:32
    looping and iteration. So for narrow problems, a programmatic approach can
  • 00:11:36
    be more efficient than going the agentic route. But if I expect to have a system accomplish very
  • 00:11:43
    complex tasks like, say, trying to solve  GitHub issues independently, and handle
  • 00:11:50
    a variety of queries, a spectrum of queries, this is where an agentic route can be helpful,
  • 00:11:54
    because it would take you too much effort to  configure every single path in the system.
  • 00:11:59
    And we're still in the early days of agent systems.
  • 00:12:02
    We're seeing rapid progress when you combine the effects of system design with agentic behavior.
  • 00:12:08
    And of course, you will have a human in the  loop in most cases as the accuracy is improving.
  • 00:12:13
    I hope you found this video very useful, and  please subscribe to the channel to learn more.
Tags
  • AI agents
  • compound AI systems
  • large language models
  • ReAct agents
  • system design
  • autonomy
  • problem-solving
  • reasoning
  • modular systems
  • tools