AWS Summit Berlin 2024 - Prompt engineering best practices for LLMs on Amazon Bedrock (AIM302)

00:39:35
https://www.youtube.com/watch?v=L77pbuKymEU

Summary

TL;DR: AWS experts detailed best practices for prompt engineering with Amazon Bedrock and Anthropic Claude. They emphasized clarity, role prompting, few-shot prompting, and chain-of-thought prompting to guide LLM responses effectively. Techniques like XML tags improve prompt structure, while retrieval-augmented generation grounds model responses in external knowledge bases. Amazon Bedrock simplifies deployment, supporting retrieval-augmented generation, guardrails, and agents for function integration. Examples demonstrated improving LLM behavior for tasks like generating JSON files, classifying tweets, and solving puzzles. Guidance for mitigating hallucinations and malicious prompts, alongside prompt templates, supported responsible and efficient LLM use. Tools like Amazon Bedrock's Knowledge Bases and pre-designed agents streamline workflows.
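
As a rough sketch of the workflow the session describes, a prompt can be packaged as a Messages-API-style request body for Claude on Amazon Bedrock. The model ID below is an assumption to verify against the Bedrock console; the request shape follows the Anthropic Messages API as documented for Bedrock:

```python
import json

# Hypothetical model ID -- check the Bedrock console for the IDs available
# in your region; the request body follows the Anthropic Messages API.
MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"

def build_messages_request(system_prompt: str, user_prompt: str) -> dict:
    """Build a Messages-API-style request body for Claude on Bedrock."""
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 512,
        "system": system_prompt,
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": user_prompt}]},
        ],
    }

body = build_messages_request(
    "You are a high school physics teacher. Answer questions in one sentence.",
    "Explain quantum entanglement.",
)

# With AWS credentials configured, the request could be sent like this:
# import boto3
# client = boto3.client("bedrock-runtime")
# response = client.invoke_model(modelId=MODEL_ID, body=json.dumps(body))
```

Changing only the `system` string (physics teacher vs. excited three-year-old) is what produces the two very different answers shown in the talk.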

Takeaways

  • 🧠 Prompt engineering requires clarity and creativity for effective guidance.
  • 📘 Role prompting improves context relevance for models.
  • 💡 Few-shot and chain-of-thought techniques enhance response accuracy.
  • 📜 XML tags organize structured prompt formats.
  • 📈 Amazon Bedrock simplifies deploying retrieval-augmented generation solutions.
  • 🔍 Retrieval-augmented generation provides dynamically updated knowledge.
  • ⚙️ Agents for Amazon Bedrock enable API integrations via modeled tools.
  • 🔒 Guardrails protect against malicious or harmful prompts.
  • 🛠️ LLM hallucinations can be reduced by context control and retrieval techniques.
  • 📲 Amazon Bedrock accelerates the adoption of structured LLM solutions.
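
Several of these takeaways (clarity, role prompting, chain of thought) can be combined in a single prompt string. A minimal sketch using the juggler puzzle from the session; the helper function is hypothetical, not code shown in the talk:

```python
def role_prompt(role: str, instructions: str, question: str) -> str:
    """Prepend a role and clear, specific instructions to a question."""
    return f"You are {role}. {instructions}\n\n{question}"

prompt = role_prompt(
    "a master solver of all puzzles in the world",
    "Think step by step and answer in one sentence.",
    "A juggler can juggle 16 balls. Half of the balls are golf balls, "
    "and half of the golf balls are blue. How many blue golf balls are there?",
)
```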

Timeline

  • 00:00:00 - 00:05:00

    Introduction to the session on prompt engineering best practices for large language models on Amazon Bedrock, highlighting the creative aspect of prompt engineering and its role compared to traditional engineering disciplines. A simple example is provided to illustrate how different prompts can lead to varied responses from a language model.

  • 00:05:00 - 00:10:00

    Exploration of techniques like 'one-shot prompting', where giving a clear example helps guide the model to the desired response. The session explains this technique with practical examples, demonstrating how setting initial conditions can influence the model's output.

  • 00:10:00 - 00:15:00

    Further discussion on 'few-shot prompting' and 'Chain of Thought prompting,' where providing several examples and encouraging a step-by-step thought process can significantly enhance the model's accuracy and reliability. The combination of these techniques to yield precise outputs is illustrated.

  • 00:15:00 - 00:20:00

    Introduction to more advanced prompting strategies, such as role prompting and the use of XML tags in structuring prompts, especially for complex outputs like formatted email responses. The focus is on achieving clarity and specificity to direct the model's behavior effectively.

  • 00:20:00 - 00:25:00

    Explanation of using structured prompt templates and large context handling in models like Claude with up to 200,000 tokens. The segment emphasizes the importance of pre-filling expected output and using XML tags to manage prompt complexity and improve response accuracy.

  • 00:25:00 - 00:30:00

    Coverage of advanced concepts like 'retrieval augmented generation', where additional context is dynamically integrated into the model’s response process, and the system prompt template for setting initial conversational context for different use cases. These are tied into practical applications like building a career coach assistant.

  • 00:30:00 - 00:39:35

    Insight into implementing function calls and agent frameworks within LLMs to manage user input and extend functionalities. Examples are given on setting up API-like interactions and safeguards against malicious input, reinforcing the robustness of prompt engineering in creating responsible AI applications.
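
The few-shot plus chain-of-thought combination described in the 00:10:00–00:15:00 segment can be sketched as a small prompt builder. The helper and its format are an illustrative assumption; the tennis-ball and cafeteria examples are the ones used in the talk:

```python
def few_shot_prompt(examples, question):
    """Build a few-shot prompt whose example answers spell out the
    reasoning, so the model imitates the thinking process too."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    # End with an open answer slot for the model to complete.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

prompt = few_shot_prompt(
    [(
        "Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
        "How many tennis balls does he have now?",
        "Roger started with 5 balls. 2 cans of 3 tennis balls each is "
        "6 tennis balls. 5 + 6 = 11. The answer is 11.",
    )],
    "The cafeteria had 23 apples. If they used 20 to make lunch and "
    "bought 6 more, how many apples do they have?",
)
```

Because the worked example shows the reasoning, the model's reply tends to include its own reasoning steps, which also makes wrong answers easier to debug.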

Video Q&A

  • What is prompt engineering?

    Prompt engineering involves crafting input instructions to guide large language models (LLMs) effectively for desired responses.

  • What are some key best practices for prompt engineering?

    Best practices include clarity and specificity, role prompting, few-shot prompting, chain-of-thought prompting, and leveraging XML tags for structure.

  • What is Few-Shot Prompting?

    Few-shot prompting provides multiple examples to the model to guide its behavior and improve accuracy.

  • What is Chain-of-Thought Prompting?

    Chain-of-thought prompting involves instructing the LLM to think step by step for better reasoning capabilities.

  • How does Amazon Bedrock assist with LLM tasks?

    Amazon Bedrock simplifies the use of multiple LLMs while offering features like Knowledge Bases for retrieval-augmented generation and agents for external API interactions.

  • What are XML tags used for in prompting?

    XML tags are used to structure and organize the input prompts provided to models like Anthropic Claude.

  • What is retrieval-augmented generation?

    This method combines external knowledge bases with LLMs by dynamically injecting relevant information into prompts based on user queries.

  • What are agents in Amazon Bedrock?

    Agents enable the integration of external tools and APIs in conjunction with LLMs using function-based instructions.

  • How do you reduce LLM hallucinations?

    Encourage LLMs to say 'I don't know' when uncertain, limit outputs to pre-defined contexts, and use retrieval-augmented generation for reliable context.

  • How can malicious prompts be mitigated?

    Use harmlessness screens or guardrails like Amazon Bedrock's built-in features to filter harmful prompts and responses.
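
Two of the answers above, XML tags and structured (JSON) output, combine naturally with the assistant-prefill technique from the talk. A sketch with a hypothetical completion standing in for the model's actual response:

```python
import json

# Separate instructions from content with XML tags, and prefill the
# assistant turn with "{" to nudge the model toward raw JSON output.
document = "Team, please show up at 6:00 AM tomorrow."
messages = [
    {
        "role": "user",
        "content": (
            "<email>\n" + document + "\n</email>\n"
            "Extract the requested meeting time from the email above and "
            'answer as JSON with a single key "time".'
        ),
    },
    {"role": "assistant", "content": "{"},  # prefill
]

# The model's completion continues the prefilled "{"; for illustration,
# assume it returns the rest of the object:
completion = '"time": "6:00 AM"}'
parsed = json.loads(messages[-1]["content"] + completion)
```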

Subtitles

  • 00:00:03
    so hello and welcome to AWS Summit Berlin
  • 00:00:06
    and I hope you had some great session so
  • 00:00:08
    far my name is Constantin Gonzalez I'm a
  • 00:00:10
    principal Solutions architect with AWS
  • 00:00:13
    and my name is Elina Lesyk and I'm an
  • 00:00:15
    associate Solutions architect with AWS
  • 00:00:18
    right so today we're going to talk about
  • 00:00:20
    prompt engineering best practices for
  • 00:00:22
    large language models on Amazon Bedrock
  • 00:00:25
    so let's dive in we are going to start
  • 00:00:27
    by setting the scene a little bit then
  • 00:00:30
    then we are going to look at some useful
  • 00:00:31
    techniques that Elina will then give you
  • 00:00:35
    some examples for on how to apply to
  • 00:00:37
    Anthropic Claude which is one of our
  • 00:00:39
    favorite models we'll take a quick look
  • 00:00:42
    Beyond prompting and finally we'll give
  • 00:00:44
    you some resources that you can use when
  • 00:00:46
    you start your own prompt engineering
  • 00:00:49
    Journey so prompt engineering really is
  • 00:00:52
    a bit of an art form and um to
  • 00:00:55
    understand this let's take a look at
  • 00:00:56
    what a prompt really means a prompt is
  • 00:00:59
    the information you pass into a large
  • 00:01:00
    language model to get some kind of
  • 00:01:03
    response so you write some text which is
  • 00:01:05
    the prompt you put it into the large
  • 00:01:07
    language model and you get a res
  • 00:01:10
    response back and um this is a bit of an
  • 00:01:14
    uncertain science so far right if we
  • 00:01:16
    look at a traditional job like an
  • 00:01:19
    engineer Engineers they can depend on
  • 00:01:22
    the rules of physics um so they always
  • 00:01:25
    know what they're getting similarly
  • 00:01:27
    software Engineers as a software
  • 00:01:29
    engineer you can depend on some
  • 00:01:31
    syntactical and semantic rules and
  • 00:01:33
    everything is very precise but as a
  • 00:01:35
    prompt engineer you're a bit more like
  • 00:01:37
    an artist and you have to try out
  • 00:01:41
    different things find your way around be
  • 00:01:43
    creative which is a good thing so why am
  • 00:01:47
    I saying that let's take a look at a
  • 00:01:48
    simple example if the prompt is what is
  • 00:01:50
    10 + 10 you might get as a response 20
  • 00:01:54
    if you think this is a math problem but
  • 00:01:57
    you might also get something like 10
  • 00:01:59
    plus 10 is an addition problem so this
  • 00:02:02
    is more like a
  • 00:02:04
    classification answer right or you can
  • 00:02:06
    get something like Quant SDS Ms which is
  • 00:02:10
    it would be a translation use case so
  • 00:02:12
    one prompt three different
  • 00:02:15
    responses uh there are other ways you
  • 00:02:17
    can have fun with prompting for example
  • 00:02:19
    in this case we're instructing the model
  • 00:02:21
    to uh with uh the beginning phrase of
  • 00:02:24
    you are a high school physics teacher
  • 00:02:26
    answer questions in one sentence and
  • 00:02:29
    then we give in a question like explain
  • 00:02:32
    quantum entanglement and we get
  • 00:02:34
    something that makes sense from a point
  • 00:02:36
    of view of a high school teacher quantum
  • 00:02:37
    entanglement is a phenomenon in which
  • 00:02:39
    two or more particles y y y that sort of
  • 00:02:41
    thing but if we start slightly
  • 00:02:43
    differently by telling the model you are
  • 00:02:45
    an excited three-year-old who ate a lot
  • 00:02:47
    of sugar answer questions in one
  • 00:02:49
    sentence and give it the same question
  • 00:02:52
    explain quantum entanglement we get a
  • 00:02:54
    very different response woo woo bash bam
  • 00:02:58
    particles go Zippy zappy when they're
  • 00:02:59
    together together and even if you take
  • 00:03:00
    them really far apart they still know
  • 00:03:02
    what their friend is doing cuz they're
  • 00:03:04
    magic connected same question different
  • 00:03:08
    answers and the difference really comes
  • 00:03:10
    down to prompt
  • 00:03:12
    engineering so instructions matter and
  • 00:03:16
    um when you put together your own
  • 00:03:19
    prompts think about Clarity and
  • 00:03:22
    specificity here are two ways of
  • 00:03:25
    prompting one way is tell me a story
  • 00:03:26
    about chaos but there are so many ways
  • 00:03:29
    you could tell a story about cows or you
  • 00:03:31
    can be much more specific and clear tell
  • 00:03:33
    me a story about cows it should be
  • 00:03:35
    roughly 2,000 words long and appropriate
  • 00:03:37
    for 5th graders it should be
  • 00:03:38
    entertaining but with a moral message
  • 00:03:40
    about the value of loyalty make it
  • 00:03:42
    amazing and
  • 00:03:44
    memorable so the key Point here is just
  • 00:03:47
    like humans llms cannot read your mind
  • 00:03:50
    as prompt Engineers it is our job to
  • 00:03:54
    bring our mind into a text prompt that
  • 00:03:56
    tells the model how to behave so here
  • 00:03:59
    are some useful techniques that you can
  • 00:04:01
    use one of the earliest discovered
  • 00:04:04
    techniques is called one-shot prompting
  • 00:04:07
    which essentially boils down to giving
  • 00:04:08
    the model an example of what you're
  • 00:04:10
    looking for for example here we want to
  • 00:04:13
    generate airport codes out of text so we
  • 00:04:17
    might start with an example of what we
  • 00:04:19
    are really looking for and the example
  • 00:04:20
    is I want to fly from Los Angeles to
  • 00:04:22
    Miami provide the airport codes only and
  • 00:04:25
    we are already giving the expected
  • 00:04:27
    results for this particular example
  • 00:04:28
    which is LAX and MIA for those two um
  • 00:04:32
    airport codes and then we follow up with
  • 00:04:34
    the actual question we want some answer
  • 00:04:36
    for I want to fly from Dallas to San
  • 00:04:38
    Francisco we start the assistance
  • 00:04:40
    response with airport codes and a square
  • 00:04:43
    bracket and now the model knows exactly
  • 00:04:45
    what we're expecting and it completes
  • 00:04:47
    our sentence with DFW and SFO so this is
  • 00:04:50
    an example of one-shot prompting we give
  • 00:04:53
    the model one example and use that to
  • 00:04:57
    illustrate what we are looking for so
  • 00:04:59
    that the model is kind of guided into
  • 00:05:02
    giving us the the right kind of response
  • 00:05:04
    we uh response we're looking for you can
  • 00:05:07
    also do few-shot prompting and few-shot
  • 00:05:09
    prompting is what you would expect we
  • 00:05:11
    give it multiple examples for example in
  • 00:05:13
    this case we want to use a large
  • 00:05:15
    language model to classify tweets we
  • 00:05:18
    give it three different examples of
  • 00:05:20
    three different classifications that we
  • 00:05:21
    are looking for and then the fourth one
  • 00:05:23
    is the actual thing we wanted to do
  • 00:05:25
    where we are kind of pasting in some
  • 00:05:27
    tweet and then we get uh hopefully what
  • 00:05:29
    we are looking for because now we gave
  • 00:05:31
    it some really good examples that tell
  • 00:05:33
    the model what we
  • 00:05:36
    want you can
  • 00:05:38
    also use a technique called Chain of
  • 00:05:41
    Thought prompting so with Chain of
  • 00:05:44
    Thought prompting we want to make sure
  • 00:05:46
    that the model
  • 00:05:48
    really puts some energy into thinking
  • 00:05:51
    clearly and we can actually do this
  • 00:05:54
    really easily by telling it let's think
  • 00:05:56
    step by step so on the left hand side
  • 00:05:58
    you see an example where we're
  • 00:06:00
    asking the model to solve a puzzle for
  • 00:06:02
    us a juggler can juggle 16 balls half of
  • 00:06:05
    the balls are golf balls and half of
  • 00:06:06
    them are blue blah blah blah and we're
  • 00:06:08
    getting a wrong answer because the model
  • 00:06:11
    gets kind of confused right but when we
  • 00:06:13
    are adding the simple sentence let's
  • 00:06:15
    think step by step now we're getting a
  • 00:06:17
    good answer out of the same model this
  • 00:06:20
    is a surprising trick that researchers
  • 00:06:22
    found out in the early days of llms and
  • 00:06:25
    it still works so think about when
  • 00:06:27
    you're not getting the right results ask
  • 00:06:29
    simply ask it to think step by
  • 00:06:32
    step uh which is called Chain of Thought
  • 00:06:35
    prompting and now we can combine these
  • 00:06:38
    two things we can use examples and use
  • 00:06:41
    that example to teach the model how to
  • 00:06:44
    think step by step so left hand side
  • 00:06:47
    again uh an example that doesn't work we
  • 00:06:49
    are using Roger has five tennis balls
  • 00:06:52
    that's the first puzzle and
  • 00:06:54
    we're giving it the answer we want like
  • 00:06:56
    the answer is 11 but the model doesn't
  • 00:06:58
    quite understand how to get to that
  • 00:07:00
    answer right on the right hand side same
  • 00:07:04
    example but now in our example answer
  • 00:07:07
    we're kind of telling we're guiding the
  • 00:07:09
    model along our thinking process for
  • 00:07:12
    example in this tennis ball example the
  • 00:07:13
    thinking process is Roger started with
  • 00:07:16
    five balls two cans of three tennis
  • 00:07:17
    balls each is six tennis balls 5 + 6
  • 00:07:20
    equal 11 the answer is 11 so we adding
  • 00:07:23
    the thinking process which is implicit
  • 00:07:26
    Chain of Thought prompting and now we we
  • 00:07:28
    can get when we paste the actual
  • 00:07:30
    question that we want some answer for
  • 00:07:32
    like the cafeteria had 23 apples and so
  • 00:07:34
    on now we're getting the right model
  • 00:07:36
    output including the thinking process
  • 00:07:38
    which also has the capability for us to
  • 00:07:40
    debug what the model was thinking so it
  • 00:07:43
    tells us how it arrived at the answer
  • 00:07:45
    and surprise it gets to the right answer
  • 00:07:48
    because it knows how to think so when
  • 00:07:51
    you craft examples think about using
  • 00:07:54
    examples where you're actually telling
  • 00:07:56
    the model how exactly to
  • 00:07:58
    think so so with those initial examples
  • 00:08:02
    out of the way let's get way more
  • 00:08:04
    practical and I'd like to introduce you
  • 00:08:07
    to Elina who is going to show you how to
  • 00:08:09
    prompt Anthropic Claude which is my
  • 00:08:11
    personal favorite model
  • 00:08:13
    here so are you looking for some
  • 00:08:16
    concrete examples of the best practices
  • 00:08:19
    that we recommend in prompt
  • 00:08:21
    engineering this section will uh cover
  • 00:08:25
    nine best practices and I would actually
  • 00:08:29
    start was introducing myself so my name
  • 00:08:31
    is Elina Lesyk and I joined AWS two
  • 00:08:34
    years ago and well I wanted to follow my
  • 00:08:37
    passion in scaling ideas as well as
  • 00:08:39
    built machine learning products I have
  • 00:08:42
    had an honor to work with customers from
  • 00:08:44
    different segments of different company
  • 00:08:46
    sizes and they all have had one
  • 00:08:51
    question that question was what is that
  • 00:08:55
    model that I should start experimenting
  • 00:08:58
    with in many cases the answer was indeed
  • 00:09:01
    Anthropic models that is exactly why we
  • 00:09:04
    have decided to bring concrete examples
  • 00:09:06
    of prompt engineering with Anthropic
  • 00:09:10
    Claude I would like to start
  • 00:09:15
    with introducing you to the latest
  • 00:09:17
    edition of the Anthropic Claude model family
  • 00:09:21
    namely Claude 3 with the corresponding API
  • 00:09:25
    namely messages API an important hint to
  • 00:09:28
    prompt engineering is to think a
  • 00:09:31
    step back and consider how the llms were
  • 00:09:35
    trained in the case of Claude we have the
  • 00:09:38
    alternating dialogue between user and
  • 00:09:42
    assistant that is built in the form of
  • 00:09:45
    history of the conversation which in
  • 00:09:47
    this case would be
  • 00:09:49
    messages and that is an important hint
  • 00:09:51
    for us to also consider once we are
  • 00:09:53
    doing prompt
  • 00:09:55
    engineering so if you have previously
  • 00:09:58
    used Claude models below version 3 then you
  • 00:10:01
    would uh see something like completion
  • 00:10:03
    API and if you would like to use Claude
  • 00:10:06
    three models you would need to use
  • 00:10:08
    messages API and on the right hand side
  • 00:10:11
    right now you can see the example of how
  • 00:10:13
    the input into messages API can look
  • 00:10:15
    like and that is following this idea of
  • 00:10:18
    the alternating dialogue between user
  • 00:10:20
    and assistant that are given as roles
  • 00:10:23
    within messages and we also have
  • 00:10:25
    something called system prompt that we
  • 00:10:27
    will cover in several minutes in
  • 00:10:29
    more
  • 00:10:31
    detail now I would like to start with
  • 00:10:34
    the best practice number one and you
  • 00:10:36
    won't be surprised by it because it is
  • 00:10:40
    the tip on indeed being very clear and
  • 00:10:43
    Direct in our
  • 00:10:45
    prompts clot definitely responds better
  • 00:10:48
    to clear and direct
  • 00:10:50
    instructions at the same time I am
  • 00:10:53
    personally still a big victim of using
  • 00:10:55
    you know these polite words like could
  • 00:10:57
    you please
  • 00:10:59
    Claude can we consider and uh similar
  • 00:11:03
    things that are rather weak in case of
  • 00:11:06
    prompt engineering so it is helpful for
  • 00:11:09
    me to always recall the principle of
  • 00:11:11
    clarity here as well as something that
  • 00:11:14
    we could consider as rubber duck
  • 00:11:16
    principle in software development but in
  • 00:11:19
    case of building LLM applications
  • 00:11:23
    we can think about Golden Rule of clear
  • 00:11:26
    prompting the idea here is that once we
  • 00:11:28
    are in doubt we can go with our prompt
  • 00:11:31
    to our friend or to our colleague and
  • 00:11:33
    indeed ask them to follow something that
  • 00:11:37
    we are pretty much prompting for if they
  • 00:11:40
    are able to follow llm is also likely to
  • 00:11:43
    follow on the right right uh hand side
  • 00:11:46
    of the slide you can see our example and
  • 00:11:49
    well if we are willing to have a haiku on
  • 00:11:52
    a certain topic and we would like to
  • 00:11:54
    skip the Preamble that is in the form of
  • 00:11:56
    well here is the haiku the idea would be
  • 00:11:59
    to indeed be very clear in that and
  • 00:12:01
    instruct the model to skip this
  • 00:12:04
    Preamble so having the clarity in mind
  • 00:12:07
    we would proceed and do you recall
  • 00:12:10
    Constantine's example on um the
  • 00:12:12
    difference between explanations of
  • 00:12:16
    quantum entanglement between PhD student
  • 00:12:19
    and this over-sugared
  • 00:12:21
    kid well the idea here is indeed to
  • 00:12:25
    utilize
  • 00:12:26
    roles uh also known as role prompting
  • 00:12:30
    and it is helping us to give the context
  • 00:12:33
    to Claude to understand what role it
  • 00:12:35
    should
  • 00:12:37
    inhabit it won't be a big secret that LLMs
  • 00:12:40
    are not the best ones when it comes to
  • 00:12:42
    math they generally don't calculate
  • 00:12:45
    responses they generate responses at the
  • 00:12:48
    same time when we're using Ro prompting
  • 00:12:51
    we can help Claude to be closer to the math
  • 00:12:54
    world or the world of solving
  • 00:12:57
    puzzles thus improving
  • 00:13:01
    accuracy at the same time how I'm
  • 00:13:04
    thinking about role prompting is the
  • 00:13:05
    sort of dress code so if we are thinking
  • 00:13:08
    about the parties that we should join we
  • 00:13:09
    are checking what is the style that is
  • 00:13:11
    expected there should it be casual
  • 00:13:13
    should it be formal that is also the
  • 00:13:15
    concept that we can utilize here and
  • 00:13:18
    think about this way of speaking to llms
  • 00:13:22
    that can change the tone and potential
  • 00:13:25
    style of the conversation that we will
  • 00:13:27
    be going through and on on the right
  • 00:13:29
    hand side of the slide you can see the
  • 00:13:31
    example of the puzzle that without any
  • 00:13:34
    additional role is likely to be solved
  • 00:13:37
    in a wrong way however the simple trick
  • 00:13:40
    of saying you are a master solver of all
  • 00:13:43
    puzzles in the world can help to improve
  • 00:13:45
    the
  • 00:13:48
    accuracy Constantine has introduced us
  • 00:13:50
    to the concept of few-shot prompting and
  • 00:13:54
    in the context of Anthropic models how this
  • 00:13:57
    can be put into action
  • 00:13:59
    well we can use examples and we see that
  • 00:14:03
    examples are indeed very effective in
  • 00:14:06
    generally speaking it is also very
  • 00:14:08
    helpful to consider examples that are
  • 00:14:11
    coming from different edge
  • 00:14:14
    cases we know that the more examples the
  • 00:14:18
    better the response is of course with
  • 00:14:20
    certain trade-offs that we can see when
  • 00:14:22
    it comes to the number of tokens and on
  • 00:14:25
    the slide you can see the ex you can see
  • 00:14:27
    the example and if our goal is indeed to
  • 00:14:30
    get the concrete name and surname
  • 00:14:34
    of the author of a certain passage
  • 00:14:35
    without anything extra we can pretty
  • 00:14:38
    much prompt llm to give us exactly this
  • 00:14:43
    by using
  • 00:14:46
    example we have also thought about uh
  • 00:14:48
    the concept of Chain of Thought and how
  • 00:14:50
    to use it with Claude and it works in a
  • 00:14:54
    way that we can just give Claude time to
  • 00:14:57
    think if we would look into the example
  • 00:15:00
    of the input that we are passing into
  • 00:15:03
    llm we are using thinking as our XML tag
  • 00:15:08
    and we are letting the model to put
  • 00:15:11
    chain of thought into action and
  • 00:15:13
    consider this thinking step-by-step
  • 00:15:15
    process within thinking
  • 00:15:17
    steps how the output looks like
  • 00:15:22
    Well normally we would get the response
  • 00:15:25
    from the assistant that would start with
  • 00:15:27
    thinking and then with answering and
  • 00:15:29
    what is interesting is that exactly
  • 00:15:31
    something that we would find within
  • 00:15:33
    thinking XML tags is helpful for us to
  • 00:15:36
    debug or to
  • 00:15:39
    troubleshoot you can already see here
  • 00:15:42
    some peculiar feature of Claude models what
  • 00:15:46
    is it well it is indeed using XML tags
  • 00:15:49
    and that is the tip number five that we
  • 00:15:51
    would like to
  • 00:15:53
    give XML tags are helpful to structure
  • 00:15:57
    the prompts that we are sending to
  • 00:15:59
    LLMs
  • 00:16:01
    and once we would like to have certain
  • 00:16:03
    sections within our prompts that is
  • 00:16:05
    where XML tags can help us to have the
  • 00:16:08
    structure if we would look into the
  • 00:16:11
    example of us willing to have the email
  • 00:16:14
    of well if we are the person that should
  • 00:16:16
    ask our team to show up at some early uh
  • 00:16:20
    time of the day as uh 6:00 a.m. and
  • 00:16:23
    draft this email in some formal response
  • 00:16:25
    if we will just throw this into model in
  • 00:16:27
    this form of well just show up at 6:00
  • 00:16:29
    a.m. because I say so the model can
  • 00:16:33
    sometimes misjudge and not understand
  • 00:16:36
    that we are actually speaking about the
  • 00:16:38
    email that we would like to send to uh
  • 00:16:40
    to our team and it will start responding
  • 00:16:42
    as if it was sent to
  • 00:16:45
    it and once we are introducing XML tags
  • 00:16:49
    of well here is the context of the email
  • 00:16:52
    we can get the response that we were
  • 00:16:53
    looking
  • 00:16:55
    for we can already see here that we are
  • 00:16:59
    explicitly talking about the output that
  • 00:17:01
    we are willing to see and indeed we can
  • 00:17:04
    help Claude with this output by
  • 00:17:08
    prefilling the response of our assistant
  • 00:17:12
    for getting the output that we are
  • 00:17:16
    expecting how can it be done well it is
  • 00:17:18
    done very often in the form of prefilling
  • 00:17:22
    within assistant so very common use case
  • 00:17:25
    that we observe from our customers is
  • 00:17:27
    some JSON files or some YAML files that are
  • 00:17:30
    having certain structure and we would
  • 00:17:31
    like to get exactly files in this
  • 00:17:33
    structure once we are getting the
  • 00:17:35
    response from llms and we are here in
  • 00:17:38
    the example prefilling for JSON files just
  • 00:17:41
    by using a first curly bracket which is
  • 00:17:44
    of course increasing the
  • 00:17:46
    likelihood of us getting a JSON file as
  • 00:17:48
    the
  • 00:17:51
    response another very popular feature of
  • 00:17:54
    Claude models is a big context window of
  • 00:17:59
    200,000
  • 00:18:01
    tokens we would really encourage you to
  • 00:18:03
    use this context window efficiently
  • 00:18:06
    because well what is 200,000 tokens it
  • 00:18:10
    can be a
  • 00:18:11
    book it can be a book that we will just
  • 00:18:14
    pass to llm
  • 00:18:17
    directly and to utilize this way of uh
  • 00:18:22
    using the context window effectively what we
  • 00:18:24
    recommend is again to consider XML tags
  • 00:18:27
    to separate what we are passing and
  • 00:18:31
    instructions we would also think about
  • 00:18:34
    using quotes for us to have the response
  • 00:18:37
    that is closer to the content that we
  • 00:18:39
    are passing as well as well as it also
  • 00:18:42
    works with us as humans we can prompt
  • 00:18:45
    for Claude to read the document
  • 00:18:48
    carefully because it will be asked
  • 00:18:50
    questions
  • 00:18:51
    later and the last but not the least tip
  • 00:18:54
    here is again to reconsider few-shot
  • 00:18:57
    prompting and to use the
  • 00:18:59
    questions and answer pairs in the form
  • 00:19:01
    of
  • 00:19:05
    examples when it comes to the prompts
  • 00:19:07
    that are having certain steps in between
  • 00:19:11
    what is also helpful is to consider
  • 00:19:13
    prompt
  • 00:19:14
    chaining I will uh uh jump straight into
  • 00:19:17
    the example so if we would be having the
  • 00:19:20
    task of extracting the names from
  • 00:19:23
    certain text and we would like to have
  • 00:19:25
    them in alphabetical order we can
  • 00:19:27
    definitely start with saying well
  • 00:19:29
    just extract those names in alphabetical
  • 00:19:31
    order or we can consider splitting this
  • 00:19:34
    into two
  • 00:19:35
    prompts so that in the first one we
  • 00:19:38
    would have the extraction of names first
  • 00:19:42
    and in the second prompt we would have
  • 00:19:45
    them
  • 00:19:47
    alphabetized with this way of chaining
  • 00:19:51
    prompts we are increasing the likelihood
  • 00:19:53
    of getting the output that we are
  • 00:19:55
    looking for and you can also recall here
  • 00:19:58
    the example of prefilling the assistant
  • 00:20:00
    with this names XML
  • 00:20:03
    tag so the last but not the least tip
  • 00:20:07
    from us today would be on using
  • 00:20:09
    structured prom templates if we are
  • 00:20:12
    considering the example with the long
  • 00:20:14
    documents that can be as I mentioned
  • 00:20:17
    books it is very helpful to put this
  • 00:20:20
    input Data before our
  • 00:20:24
    instruction if we are looking into a
  • 00:20:26
    different example of using certain input
  • 00:20:30
    data that is having the effect of
  • 00:20:33
    iteration or certain um dictionary or
  • 00:20:36
    List uh and in this case in the example
  • 00:20:38
    it would be the um types of different
  • 00:20:42
    animals the idea here would be to also
  • 00:20:44
    structure our uh prompt in a way that we
  • 00:20:47
    would have the iteration uh within this
  • 00:20:49
    input data and thus our output would be
  • 00:20:53
    having also different types of animals
  • 00:20:57
    put into it
  • 00:21:00
So far we have been talking about structures that we give to our prompts; now we would like to cover something different: system prompts, the prompts that go into the LLM first and set the scene for the conversation. You have already seen bits and pieces of them within our previous best practices; now we would like to bring them all together into one system prompt template, so to say. You will see nine sections here, and you can certainly skip some of them if they don't fit your use case. At the same time, what we strongly recommend is that you follow the order we present
  • 00:21:52
I will jump straight into an example of how this system prompt can be built. Our use case is a career coach that helps users with their career aspirations. We start with the first element of the system prompt template: giving the role. First things first, we explicitly say "you will be acting as an AI career coach" with a certain name, and we would probably like to keep that same name throughout the conversation. At the same time, what is your goal? Your goal is to give career advice to users. The second element is setting a certain tone, context, or style for the conversation: "you should maintain a friendly customer service
  • 00:22:45
The third element is utilizing the context: passing certain background data for our career coach to process, which again can be enclosed in XML tags, in this case `<guide>`. The next element is giving a more detailed task description as well as certain rules. Pay attention to how we let Claude say "I don't know" when it doesn't know, and how, if someone asks something that is not relevant to the topic, it can also say "I am here only to give career advice."
  • 00:23:28
The next step is to use examples covering common cases as well as edge cases, and then to give the immediate data, which in this case could be the history of the conversation or the profile of the user our career coach is talking to. The seventh step is to reiterate what the immediate task is and to allow Claude, or another LLM of your choice, to think step by step, and even to "take a deep breath", because we've heard from scientific papers that this works efficiently. The last, but not least, element is specifying the output formatting, and again we are utilizing XML tags here. Those are the suggested elements of our system prompt template; let me bring them all together for you in one overview (and I see that some of you are ready to take a picture). So now that we have the system prompt template, how can we move even further?
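The nine sections above can be assembled in the recommended order with a small helper; the section texts paraphrase the career-coach example from the talk, and the coach's name and tag choices are illustrative assumptions:

```python
# Sketch: assembling the nine system-prompt sections in the recommended
# order. Section texts paraphrase the talk's career-coach example; the
# name "Joe" and the tag names are illustrative assumptions.

SECTIONS = [
    ("role", 'You will be acting as an AI career coach named "Joe". '
             "Your goal is to give career advice to users."),
    ("tone", "You should maintain a friendly customer service tone."),
    ("background", "<guide>...background data for the coach...</guide>"),
    ("rules", "If you don't know the answer, say so. If asked about "
              "anything off topic, say you only give career advice."),
    ("examples", "<example>H: ... A: ...</example>"),
    ("immediate_data", "<history>...conversation so far...</history>"),
    ("immediate_task", "Answer the user's latest question."),
    ("thinking", "Think about your answer step by step before replying."),
    ("output", "Put your reply in <response></response> tags."),
]

def build_system_prompt(sections=SECTIONS) -> str:
    # Keep the recommended order; sections that don't fit a use case can
    # simply be left empty and are then omitted.
    return "\n\n".join(text for _, text in sections if text)

print(build_system_prompt())
```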
  • 00:24:41
And how can we get even more professional and consider what can be done after we have gone through prompting? Well, we do consider prompt engineering to be an art form; at the same time, Constantine and I have decided to try to add science to this art form and, with simple steps, help make that art work for your use cases. The first step in this journey is, of course, to understand the use case and develop certain test
  • 00:25:17
cases. I live in Munich, so I will walk you through an example that would help me with understanding Bavarian culture, or the culture of the surrounding regions. Let's think about this use case and develop a test case: maybe I will be asking about certain dishes that are typical for a certain area of Germany. Thus my preliminary prompt would be about cooking a certain dish: "How do I make Käsespätzle?" or "How do I cook Obazda?". Of course, afterwards I go into the iterative phases with evaluations, which then turn into loops. The first loop could be, for example, to consider adding a role, such as "you are an experienced ethnographer" or "you are an experienced cook". Thus we keep refining the prompt until it is polished to the extent that it fulfills our
  • 00:26:23
goals. A tricky part here is indeed setting up the evaluation. If we have test cases that are closed-ended questions with a yes-or-no answer, it is possible to evaluate them without many issues; but what if it is an open-ended question? Let's again look at the Käsespätzle example: with the first prompt, "How do I make Käsespätzle?", we would of course get an LLM response telling us the steps and the ingredients of the dish. What is interesting here is that we can utilize rubrics and define what we would like the response to include for it to be evaluated as positive or negative. For instance, I would require certain ingredients to appear; once a response fulfills those criteria, we have a positive result, and we know that our prompt engineering worked and the LLM response was indeed correct for
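A rubric check like the one described can be a very small function; the rubric and the sample response below are illustrative:

```python
# Sketch of rubric-based evaluation for open-ended answers: the test passes
# only if the response mentions every required item from the rubric.
# The rubric and the sample response are illustrative.

def grade(response: str, required_ingredients: list[str]) -> bool:
    text = response.lower()
    return all(item.lower() in text for item in required_ingredients)

rubric = ["spaetzle", "cheese", "onions"]   # what a good answer must cover
llm_response = "Melt cheese over fresh Spaetzle and top with fried onions."

print("PASS" if grade(llm_response, rubric) else "FAIL")
```

Running the same rubric over every prompt variant in the refinement loop gives a simple, repeatable pass/fail signal instead of eyeballing each answer.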
  • 00:27:31
Many customers come to us and ask what can be done to reduce hallucinations, and here is what we encourage you to do when dealing with hallucinations: give the LLM permission to say "I don't know", and instruct it to answer only if it is very confident in its response. It is also possible to ask for relevant quotes from the context that we have passed. And once our context keeps growing, with ongoing iterations and changes to the content we would still like to pass to the LLM, we can consider using something called retrieval-augmented
  • 00:28:23
generation. Retrieval-augmented generation changes the typical flow: instead of the user simply asking the LLM a question and getting a response, we also pass additional context in the form of dynamically changing knowledge bases or certain internal sources, for example product descriptions or FAQ pages. The flow changes so that all the documents are first ingested into a vector database; once the user sends a question, we first retrieve everything relevant from that context (say, the three most relevant FAQ passages), and then pass it as context together with the question to the LLM, which creates the answer for
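The retrieve-then-prompt flow can be sketched end to end with a toy bag-of-words "embedding" standing in for a real embedding model and vector database; the documents and question are made up for illustration:

```python
# Minimal sketch of the RAG flow: "embed" documents (toy bag-of-words
# vectors instead of a real embedding model), retrieve the most relevant
# ones for a question, and assemble them into the prompt context.
from collections import Counter
import math
import re

def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "FAQ: You can return a purchase within 30 days.",
    "Product description: The X100 camera has a 23mm lens.",
    "FAQ: Shipping takes 3 to 5 business days.",
]
index = [(d, embed(d)) for d in docs]   # stand-in for a vector database

def retrieve(question: str, k: int = 2) -> list[str]:
    q = embed(question)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [d for d, _ in ranked[:k]]

question = "How many days do I have to return a purchase?"
context = "\n".join(retrieve(question))
prompt = f"<context>\n{context}\n</context>\n\nQuestion: {question}"
print(prompt)
```

A production system would swap the toy vectors for real embeddings and a proper vector store, but the retrieve, inject, then generate shape stays the same.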
  • 00:29:26
There are different moving blocks in this architecture, so, Constantine, shall we help the audience consider different tools for building retrieval-augmented generation? Sure, thank you, Elina. I like to think of retrieval-augmented generation as giving the LLM a cheat sheet. As you probably know from school, when you don't know the answer you tend to make something up, but if you have a cheat sheet, it's very easy to answer the question. What we're really doing here is using a knowledge base and some way of searching it, like a vector database, a search engine, or anything that can surface the relevant information, to generate a cheat sheet that we inject into the prompt as part of the context. That increases the likelihood of the LLM giving the right answer, because now it has all the data it needs.
  • 00:30:22
You can set this up on your own by putting together the components, like the knowledge base, and setting up the orchestration and all of that. That is fun, but it can also get old quickly, especially when your manager is breathing down your neck asking, "Hey, when is my solution ready?" So our job at AWS is to make it easy for you. A while ago we introduced Amazon Bedrock, which makes it super easy to use a selection of different LLMs through an easy-to-use API in a secure way, so that you stay in control of your data. We then added an additional feature called Knowledge Bases for Amazon Bedrock, which essentially gives you the retrieval-augmented generation architecture we looked at before as a ready-to-use feature of Bedrock.
  • 00:31:09
All you need to do is bring your knowledge base in the form of documents: you import them into Knowledge Bases for Amazon Bedrock, choose which of the supported vector search engines or databases you would like to use, and choose which LLM you want as part of your architecture. Bedrock then does everything else automatically for you, including the prompt engineering bit; if you want to control the prompt, you can add your own variation, or you can simply use what ships inside Amazon Bedrock. Bedrock and Knowledge Bases make it really easy to build your own retrieval-augmented generation architecture, and retrieval-augmented generation is one of the most popular use cases we see with customers, because it generates immediate value for your business: no more searching and hunting through long product manuals and boring documentation; you simply ask questions and get relevant answers drawn straight from your documentation, including citations, so that you know you are getting good answers.
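A call to Knowledge Bases for Amazon Bedrock goes through the Bedrock Agent Runtime `RetrieveAndGenerate` API; the sketch below only constructs the request (the knowledge base ID and model ARN are placeholders, and the network call itself is wrapped in a function so nothing here requires an AWS account):

```python
# Sketch of querying Knowledge Bases for Amazon Bedrock via the
# RetrieveAndGenerate API (boto3 "bedrock-agent-runtime" client).
# The knowledgeBaseId and modelArn are placeholders; the actual network
# call is shown but not executed here.

params = {
    "input": {"text": "What is our return policy?"},
    "retrieveAndGenerateConfiguration": {
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "KB12345678",          # placeholder ID
            "modelArn": "arn:aws:bedrock:eu-central-1::foundation-model/"
                        "anthropic.claude-3-sonnet-20240229-v1:0",
        },
    },
}

def ask_knowledge_base(params: dict):
    # Requires boto3 and AWS credentials; kept inside a function so the
    # sketch stays runnable without an AWS account.
    import boto3
    client = boto3.client("bedrock-agent-runtime")
    resp = client.retrieve_and_generate(**params)
    # The response carries the generated text plus source citations.
    return resp["output"]["text"], resp.get("citations", [])

print(params["input"]["text"])
```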
  • 00:32:18
Another thing our customers like to use with LLMs is agents, that is, agents and function calling. How does it work? Instead of just injecting documents from a search engine, as with retrieval-augmented generation, you can go one step further and tell the model (Claude, in this case): "You actually have access to some tools; let me explain how those tools work." You then give it a prompt-engineered version of an API specification: you tell the model it has access to a weather service, to Wikipedia, or to some other tool of your choosing, and how each one works. Claude does not actually call those tools on its own; rather, it will tell you, "I would like to use the weather service now, with these parameters," and then you set up the actual function call, perform the operation, and give the result back as part of the prompt. So how does it work in detail?
  • 00:33:16
First of all, you start by putting together a prompt where you describe to the model the tools it has access to; think of it as assembling your tool descriptions and putting them into Claude as part of the prompt. Claude then decides whether it can answer the question right away or whether it would like to use one of the functions you gave it. If not, it will probably answer with something definitive like "I don't know"; otherwise it says, "OK, I want to use those tools," and Claude will output the function call in exactly the specification you defined. So if you told it to emit XML tags called `<function_calls>` with an `<invoke>` and the parameters here and there, it will give you exactly the kind of XML you expect, and now you can go and execute it.
  • 00:34:02
The execution step looks like this in more detail: you get the function-call XML from Claude as a response, and you can detect it with your traditional code, like a Lambda function ("hey, Claude wants to call something"), take the XML, maybe validate it, and then implement the actual call in your own code, whether that's a Lambda function, a container, or whatever performs the function call; in this example it would call the weather function. You then inject the results back inside their own XML tags, such as `<function_results>`, and send the whole thing back to Claude: the system prompt, the user question, the function call, and the function results. Then you let Claude decide what to do next; Claude sees, "I have everything I need, I have the weather data, I can now give a great answer," and you get your answer. That is how you implement your own functions in the context of a large language model: you tell the model which functions it can use, give it an API specification in an easy-to-use notation such as XML tags, and let the model decide when to use which tool. You stay in control of how you implement those function calls, and then you give everything back for the model to process and produce a definitive answer.
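The detect-execute-inject step can be sketched as follows; the XML dialect mirrors the `<function_calls>`/`<invoke>` style described above, and the weather function is a local stub rather than a real service:

```python
# Sketch of the agent execution step: detect Claude's <function_calls> XML,
# run the matching local function, and wrap the result in
# <function_results> to send back with the rest of the conversation.
# The weather data is faked; a real system would call an actual service.
import re

def get_weather(location: str) -> str:
    return f"18 degrees C and sunny in {location}"   # stub tool

TOOLS = {"get_weather": get_weather}

# What the model might emit after being shown the tool specification:
model_output = (
    "<function_calls><invoke><tool_name>get_weather</tool_name>"
    "<parameters><location>Berlin</location></parameters>"
    "</invoke></function_calls>"
)

def run_tool_call(output: str) -> str:
    tool = re.search(r"<tool_name>(\w+)</tool_name>", output).group(1)
    inner = re.search(r"<parameters>(.*)</parameters>", output).group(1)
    params = dict(re.findall(r"<(\w+)>([^<]*)</\1>", inner))
    result = TOOLS[tool](**params)   # validate before calling in real code
    # Inject the result back for the model's next turn:
    return f"<function_results><result>{result}</result></function_results>"

print(run_tool_call(model_output))
```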
  • 00:35:19
Now, when you implement something like this, explain the function capabilities in great detail; this is the same as explaining to a human colleague how an API works. You can also provide a diverse set of examples: here is an example of using this call to do X, here is how the parameters might look for Y, and so on. And you can use the closing tag of your function specification as a stop sequence, which tells the Bedrock service to stop after that XML tag, because at that point the XML part is over and you have a definitive stop condition. If you're not getting reliable results, think back to the prompt-chaining tip from the beginning: don't make the tasks too complicated; break them down into simple function calls and do them one by
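The stop-sequence tip maps to the `stop_sequences` field of the Anthropic Messages request body used on Bedrock; the sketch below only constructs the body (the model ID in the comment is a placeholder, and nothing is sent):

```python
# Sketch of using the function-spec closing tag as a stop sequence in the
# Anthropic Messages request body for Bedrock, so generation halts right
# after the tool-call XML closes. The body is constructed, not sent.
import json

body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "stop_sequences": ["</function_calls>"],   # stop once the call XML closes
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin?"}
    ],
})

# A real invocation (requires AWS credentials) would look roughly like:
#   import boto3
#   rt = boto3.client("bedrock-runtime")
#   resp = rt.invoke_model(
#       modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # placeholder
#       body=body)
print(body)
```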
  • 00:36:17
Here's an example of how it looks in practice. This is how you would describe a tool: the tool name is `get_weather`, here is its description, and these are the parameters, for example a `location` of type string; you can add type declarations there as well, and you can use all of that as part of your system prompt. Again, you can do this all on your own, and it's fun the first time, but you can also use a feature called Agents for Amazon Bedrock, which lets you either programmatically set up these functions with your own function code in Lambda functions, or go through the console and click together your own chatbot that uses functions, and thereby reduce the development time and the time to results to just a day or so, instead of weeks of trying things out, figuring out prompt engineering,
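The `get_weather` description can be generated programmatically; the exact tag names below are an illustrative XML style in the spirit of the slide, not a fixed schema:

```python
# Sketch of building a tool description block for the system prompt, in an
# XML style like the talk's get_weather example; tag names are illustrative.

def describe_tool(name: str, description: str, params: dict[str, str]) -> str:
    plist = "\n".join(
        f"  <parameter><name>{p}</name><type>{t}</type></parameter>"
        for p, t in params.items()
    )
    return (
        "<tool_description>\n"
        f"  <tool_name>{name}</tool_name>\n"
        f"  <description>{description}</description>\n"
        f"  <parameters>\n{plist}\n  </parameters>\n"
        "</tool_description>"
    )

system_prompt = (
    "You have access to the following tools:\n<tools>\n"
    + describe_tool("get_weather",
                    "Returns the current weather for a location.",
                    {"location": "string"})
    + "\n</tools>"
)
print(system_prompt)
```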
  • 00:37:11
Lastly, let's take a look at a different problem: what to do about malicious users who want to inject something into the prompt to make it do something you don't want, or other bad user behavior you want to mitigate against. The good news is that Anthropic Claude is already very resistant to jailbreaks and other bad behavior, which is one reason why we at Amazon like partnering with Anthropic so much: they focus a lot on responsible use of AI. But again, you can go one step further by adding a harmlessness screen to evaluate the appropriateness of the input prompt or of the results. Think of it like a firewall that you put in front of the LLM, checking whether input and output really comply with your own company rules; if a harmful prompt is detected, you can filter it out. And, surprise: you can use a different LLM to do that screening for you. Here is how a prompt for a harmlessness screen might look: "A human user would like you to continue a piece of content. Here is the content so far. If the content refers to harmful, graphic, or illegal activities, reply with (Y)." You are essentially using an LLM as a classifier to decide whether a prompt is malicious, and then using that decision to filter it out. Again, you can build this on your own, or you can use a feature called Guardrails for Amazon Bedrock, which lets you set up those guardrails either from predefined rules or from your own rules that you bring into your application.
  • 00:38:49
    that you bring into your
  • 00:38:53
    application so we hope this was useful
  • 00:38:55
    to you we hope you learned a lot
  • 00:38:58
    um no need to take so many photos you
  • 00:39:00
    can actually go to our helpful prompting
  • 00:39:02
    resources page that we prepared for you
  • 00:39:04
    U maybe take one more photo from this QR
  • 00:39:07
    code here and that'll guide you to a
  • 00:39:09
    page that we prepared for you with a
  • 00:39:11
    white paper with some prompting
  • 00:39:13
    resources some links to useful
  • 00:39:14
    documentation and even a workshop that
  • 00:39:16
    you can use to try out some things and
  • 00:39:19
    learn in your own pace and build your
  • 00:39:22
    own applications so with that thank you
  • 00:39:25
    very much for coming and enjoy the
  • 00:39:27
    evening of the summit thank you
Tags
  • Prompt Engineering
  • Large Language Models
  • Amazon Bedrock
  • Anthropic Claude
  • Few-Shot Prompting
  • Chain-of-Thought
  • XML Tags
  • Context Augmentation
  • Agents
  • Hallucination Mitigation