AWS Summit Berlin 2024 - Prompt engineering best practices for LLMs on Amazon Bedrock (AIM302)
Overview
TL;DR: AWS experts detailed best practices for prompt engineering with Amazon Bedrock and Anthropic Claude. They emphasized clarity, role prompting, few-shot prompting, and chain-of-thought prompting to guide LLM responses effectively. Techniques like XML tags give prompts structure, while retrieval-augmented generation improves model accuracy using external knowledge bases. Amazon Bedrock simplifies deployment, supporting retrieval-augmented generation, guardrails, and agents for function integration. Examples demonstrated improving LLM behavior for tasks like generating JSON files, classifying tweets, and solving puzzles. Guidance on mitigating hallucinations and malicious prompts, alongside prompt templates, supported responsible and efficient LLM use. Tools like Amazon Bedrock's Knowledge Bases and pre-built agents streamline workflows.
Key Takeaways
- 🧠 Prompt engineering requires clarity and creativity for effective guidance.
- 📘 Role prompting improves context relevance for models.
- 💡 Few-shot and chain-of-thought techniques enhance response accuracy.
- 📜 XML tags organize structured prompt formats.
- 📈 Amazon Bedrock simplifies deploying retrieval-augmented generation solutions.
- 🔍 Retrieval-augmented generation provides dynamically updated knowledge.
- ⚙️ Agents for Amazon Bedrock enable API integrations via modeled tools.
- 🔒 Guardrails protect against malicious or harmful prompts.
- 🛠️ LLM hallucinations can be reduced by context control and retrieval techniques.
- 📲 Amazon Bedrock accelerates the adoption of structured LLM solutions.
Timeline
- 00:00:00 - 00:05:00
Introduction to the session on prompt engineering best practices for large language models on Amazon Bedrock, highlighting the creative aspect of prompt engineering and its role compared to traditional engineering disciplines. A simple example is provided to illustrate how different prompts can lead to varied responses from a language model.
- 00:05:00 - 00:10:00
Exploration of techniques like 'one-shot prompting', where giving a clear example helps guide the model to the desired response. The session explains this technique with practical examples, demonstrating how setting initial conditions can influence the model's output.
- 00:10:00 - 00:15:00
Further discussion on 'few-shot prompting' and 'Chain of Thought prompting,' where providing several examples and encouraging a step-by-step thought process can significantly enhance the model's accuracy and reliability. The combination of these techniques to yield precise outputs is illustrated.
- 00:15:00 - 00:20:00
Introduction to more advanced prompting strategies, such as role prompting and the use of XML tags in structuring prompts, especially for complex outputs like formatted email responses. The focus is on achieving clarity and specificity to direct the model's behavior effectively.
- 00:20:00 - 00:25:00
Explanation of using structured prompt templates and large-context handling in models like Claude with up to 200,000 tokens. The segment emphasizes the importance of pre-filling the expected output and using XML tags to manage prompt complexity and improve response accuracy.
- 00:25:00 - 00:30:00
Coverage of advanced concepts like 'retrieval augmented generation', where additional context is dynamically integrated into the model’s response process, and the system prompt template for setting initial conversational context for different use cases. These are tied into practical applications like building a career coach assistant.
- 00:30:00 - 00:39:35
Insight into implementing function calls and agent frameworks within LLMs to manage user input and extend functionalities. Examples are given on setting up API-like interactions and safeguards against malicious input, reinforcing the robustness of prompt engineering in creating responsible AI applications.
Video Q&A
What is prompt engineering?
Prompt engineering involves crafting input instructions to guide large language models (LLMs) effectively for desired responses.
What are some key best practices for prompt engineering?
Best practices include clarity and specificity, role prompting, few-shot prompting, chain-of-thought prompting, and leveraging XML tags for structure.
What is Few-Shot Prompting?
Few-shot prompting provides multiple examples to the model to guide its behavior and improve accuracy.
What is Chain-of-Thought Prompting?
Chain-of-thought prompting involves instructing the LLM to think step by step for better reasoning capabilities.
How does Amazon Bedrock assist with LLM tasks?
Amazon Bedrock simplifies the use of multiple LLMs while offering features like Knowledge Bases for retrieval-augmented generation and agents for external API interactions.
What are XML tags used for in prompting?
XML tags are used to structure and organize the input prompts provided to models like Anthropic Claude.
What is retrieval-augmented generation?
This method combines external knowledge bases with LLMs by dynamically injecting relevant information into prompts based on user queries.
What are agents in Amazon Bedrock?
Agents enable the integration of external tools and APIs in conjunction with LLMs using function-based instructions.
How do you reduce LLM hallucinations?
Encourage LLMs to say 'I don't know' when uncertain, limit outputs to pre-defined contexts, and use retrieval-augmented generation for reliable context.
How can malicious prompts be mitigated?
Use harmlessness screens or guardrails like Amazon Bedrock's built-in features to filter harmful prompts and responses.
- 00:00:03so hello and welcome to AWS Summit Berlin
- 00:00:06and I hope you had some great sessions so
- 00:00:08far my name is Constantine Gonzalez I'm a
- 00:00:10principal Solutions Architect with AWS
- 00:00:13and my name is Elina Lassik and I'm an
- 00:00:15associate Solutions Architect with AWS
- 00:00:18right so today we're going to talk about
- 00:00:20prompt engineering best practices for
- 00:00:22large language models on Amazon Bedrock
- 00:00:25so let's dive in we are going to start
- 00:00:27by setting the scene a little bit then
- 00:00:30then we are going to look at some useful
- 00:00:31techniques that Elina will then give you
- 00:00:35some examples for on how to apply to
- 00:00:37Anthropic Claude which is one of our
- 00:00:39favorite models we'll take a quick look
- 00:00:42Beyond prompting and finally we'll give
- 00:00:44you some resources that you can use when
- 00:00:46you start your own prompt engineering
- 00:00:49Journey so prompt engineering really is
- 00:00:52a bit of an art form and um to
- 00:00:55understand this let's take a look at
- 00:00:56what a prompt really means a prompt is
- 00:00:59the information you pass into a large
- 00:01:00language model to get some kind of
- 00:01:03response so you write some text which is
- 00:01:05the prompt you put it into the large
- 00:01:07language model and you get a
- 00:01:10response back and um this is a bit of an
- 00:01:14uncertain science so far right if we
- 00:01:16look at a traditional job like an
- 00:01:19engineer Engineers they can depend on
- 00:01:22the rules of physics um so they always
- 00:01:25know what they're getting similarly
- 00:01:27software Engineers as a software
- 00:01:29engineer you can depend on some
- 00:01:31syntactical and semantic rules and
- 00:01:33everything is very precise but as a
- 00:01:35prompt engineer you're a bit more like
- 00:01:37an artist and you have to try out
- 00:01:41different things find your way around be
- 00:01:43creative which is a good thing so why am
- 00:01:47I saying that let's take a look at a
- 00:01:48simple example if the prompt is what is
- 00:01:5010 + 10 you might get as a response 20
- 00:01:54if you think this is a math problem but
- 00:01:57you might also get something like 10
- 00:01:59plus 10 is an addition problem so this
- 00:02:02is more like a
- 00:02:04classification answer right or you can
- 00:02:06get the same phrase in another language which
- 00:02:10would be a translation use case so
- 00:02:12one prompt three different
- 00:02:15responses uh there are other ways you
- 00:02:17can have fun with prompting for example
- 00:02:19in this case we're instructing the model
- 00:02:21to uh with uh the beginning phrase of
- 00:02:24you are a high school physics teacher
- 00:02:26answer questions in one sentence and
- 00:02:29then we give in a question like explain
- 00:02:32quantum entanglement and we get
- 00:02:34something that makes sense from a point
- 00:02:36of view of a high school teacher quantum
- 00:02:37entanglement is a phenomenon in which
- 00:02:39two or more particles yada yada that sort of
- 00:02:41thing but if we start slightly
- 00:02:43differently by telling the model you are
- 00:02:45an excited three-year-old who ate a lot
- 00:02:47of sugar answer questions in one
- 00:02:49sentence and give it the same question
- 00:02:52explain quantum entanglement we get a
- 00:02:54very different response woo woo bash bam
- 00:02:58particles go Zippy zappy when they're
- 00:02:59together together and even if you take
- 00:03:00them really far apart they still know
- 00:03:02what their friend is doing cuz they're
- 00:03:04magic connected same question different
- 00:03:08answers and the difference really comes
- 00:03:10down to prompt
- 00:03:12engineering so instructions matter and
- 00:03:16um when you put together your own
- 00:03:19prompts think about Clarity and
- 00:03:22specificity here are two ways of
- 00:03:25prompting one way is tell me a story
- 00:03:26about cows but there are so many ways
- 00:03:29you could tell a story about cows or you
- 00:03:31can be much more specific and clear tell
- 00:03:33me a story about cows it should be
- 00:03:35roughly 2,000 words long and appropriate
- 00:03:37for 5th graders it should be
- 00:03:38entertaining but with a moral message
- 00:03:40about the value of loyalty make it
- 00:03:42amazing and
- 00:03:44memorable so the key Point here is just
- 00:03:47like humans llms cannot read your mind
- 00:03:50as prompt Engineers it is our job to
- 00:03:54bring our mind into a text prompt that
- 00:03:56tells the model how to behave so here
- 00:03:59are some useful techniques that you can
- 00:04:01use one of the earliest discovered
- 00:04:04techniques is called one-shot prompting
- 00:04:07which essentially boils down to giving
- 00:04:08the model an example of what you're
- 00:04:10looking for for example here we want to
- 00:04:13generate airport codes out of text so we
- 00:04:17might start with an example of what we
- 00:04:19are really looking for and the example
- 00:04:20is I want to fly from Los Angeles to
- 00:04:22Miami provide the airport codes only and
- 00:04:25we are already giving the expected
- 00:04:27results for this particular example
- 00:04:28which is LAX and MIA for those two um
- 00:04:32airport codes and then we follow up with
- 00:04:34the actual question we want some answer
- 00:04:36for I want to fly from Dallas to San
- 00:04:38Francisco we start the assistant's
- 00:04:40response with airport codes and a square
- 00:04:43bracket and now the model knows exactly
- 00:04:45what we're expecting and it completes
- 00:04:47our sentence with DFW and SFO so this is
- 00:04:50an example of one-shot prompting we give
- 00:04:53the model one example and use that to
- 00:04:57illustrate what we are looking for so
- 00:04:59that the model is kind of guided into
- 00:05:02giving us the right kind of
- 00:05:04response we're looking for
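As a concrete sketch, that one-shot prompt could be laid out like this (wording is illustrative, following the airport-code example above; the final bracket is the prefilled start of the assistant's answer):

```
User:      I want to fly from Los Angeles to Miami. Provide the airport codes only.
Assistant: Airport codes: [LAX, MIA]
User:      I want to fly from Dallas to San Francisco.
Assistant: Airport codes: [
```

The model then completes the prefilled line with DFW, SFO].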
- 00:05:07you can also do few-shot prompting and few-shot
- 00:05:09prompting is what you would expect we
- 00:05:11give it multiple examples for example in
- 00:05:13this case we want to use a large
- 00:05:15language model to classify tweets we
- 00:05:18give it three different examples of
- 00:05:20three different classifications that we
- 00:05:21are looking for and then the fourth one
- 00:05:23is the actual thing we wanted to do
- 00:05:25where we are kind of pasting in some
- 00:05:27tweet and then we get uh hopefully what
- 00:05:29we are looking for because now we gave
- 00:05:31it some really good examples that tell
- 00:05:33the model what we
- 00:05:36want
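A sketch of such a few-shot classification prompt (the tweets and label set here are invented for illustration):

```
Tweet: "My new phone's battery died after an hour." -> Category: Complaint
Tweet: "Does this model support wireless charging?" -> Category: Question
Tweet: "Best purchase I've made all year!"          -> Category: Praise
Tweet: "<the tweet you actually want classified>"   -> Category:
```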
- 00:05:38you can also use a technique called Chain of
- 00:05:41Thought prompting so with Chain of
- 00:05:44Thought prompting we want to make sure
- 00:05:46that the model
- 00:05:48really puts some energy into thinking
- 00:05:51clearly and we can actually do this
- 00:05:54really easily by telling it let's think
- 00:05:56step by step so on the left hand side
- 00:06:00you see an example where we're
- 00:06:00asking the model to solve a puzzle for
- 00:06:02us a juggler can juggle 16 balls half of
- 00:06:05the balls are golf balls and half of
- 00:06:06them are blue blah blah blah and we're
- 00:06:08getting a wrong answer because the model
- 00:06:11gets kind of confused right but when we
- 00:06:13are adding the simple sentence let's
- 00:06:15think step by step now we're getting a
- 00:06:17good answer out of the same model this
- 00:06:20is a surprising trick that researchers
- 00:06:22found out in the early days of llms and
- 00:06:25it still works so think about when
- 00:06:27you're not getting the right results ask
- 00:06:29simply ask it to to uh think step by
- 00:06:32step uh which is called Chain of Thought
- 00:06:35prompting and now we can combine these
- 00:06:38two things we can use examples and use
- 00:06:41that example to teach the model how to
- 00:06:44think step by step so left hand side
- 00:06:47again uh an example that doesn't work we
- 00:06:49are using Roger has five tennis balls
- 00:06:52that's the first puzzle and
- 00:06:54we're giving it the answer we want like
- 00:06:56the answer is 11 but the model doesn't
- 00:06:58quite understand how to get to that
- 00:07:00answer right on the right hand side same
- 00:07:04example but now in our example answer
- 00:07:07we're kind of telling we're guiding the
- 00:07:09model along our thinking process for
- 00:07:12example in this tennis ball example the
- 00:07:13thinking process is Roger started with
- 00:07:16five balls two cans of three tennis
- 00:07:17balls each is six tennis balls 5 + 6
- 00:07:20equals 11 the answer is 11 so we're adding
- 00:07:23the thinking process which is implicit
- 00:07:26Chain of Thought prompting and now we
- 00:07:28can get when we paste the actual
- 00:07:30question that we want some answer for
- 00:07:32like the cafeteria had 23 apples and so
- 00:07:34on now we're getting the right model
- 00:07:36output including the thinking process
- 00:07:38which also has the capability for us to
- 00:07:40debug what the model was thinking so it
- 00:07:43tells us how it arrived at the answer
- 00:07:45and surprise it gets to the right answer
- 00:07:48because it knows how to think so when
- 00:07:51you craft examples think about using
- 00:07:54examples where you're actually telling
- 00:07:56the model how exactly to think
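Here is a sketch of that combined few-shot chain-of-thought prompt, using the two puzzles quoted in the talk:

```
User: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each.
      How many tennis balls does he have now?
Assistant: Roger started with 5 balls. 2 cans of 3 tennis balls each is
      6 tennis balls. 5 + 6 = 11. The answer is 11.
User: The cafeteria had 23 apples. They used 20 to make lunch and bought
      6 more. How many apples do they have?
```

Because the example answer spells out the reasoning, the model tends to reply in the same style: 23 - 20 = 3, then 3 + 6 = 9, the answer is 9, and the intermediate steps let you debug how it got there.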
- 00:07:58so with those initial examples
- 00:08:02out of the way let's get way more
- 00:08:04practical and I'd like to introduce you
- 00:08:07to Elina who is going to show you how to
- 00:08:09prompt Anthropic Claude which is my
- 00:08:11personal favorite model
- 00:08:13here so are you looking for some
- 00:08:16concrete examples of the best practices
- 00:08:19that we recommend in prompt
- 00:08:21engineering this section will uh cover
- 00:08:25nine best practices and I would actually
- 00:08:29start with introducing myself so my name
- 00:08:31is Elina Lassik and I joined AWS two
- 00:08:34years ago and well I wanted to follow my
- 00:08:37passion in scaling ideas as well as
- 00:08:39build machine learning products I have
- 00:08:42had an honor to work with customers from
- 00:08:44different segments of different company
- 00:08:46sizes and they all have had one
- 00:08:51question that question was what is that
- 00:08:55model that I should start experimenting
- 00:08:58with in many cases the answer was indeed
- 00:09:01entropic models that is exactly why we
- 00:09:04have decided to bring concrete examples
- 00:09:06of prompt engineering with entropic
- 00:09:10clot I would like to start
- 00:09:15with introducing you to the latest
- 00:09:17edition of entropic CLA model family
- 00:09:21namely clae 3 with the corresponding API
- 00:09:25namely messages API an important hint to
- 00:09:28prompt engine engering is to think a
- 00:09:31step back and consider how the llms were
- 00:09:35trained in the case of CLA we have the
- 00:09:38alternating dialogue between user and
- 00:09:42assistant that is built in the form of
- 00:09:45history of the conversation which in
- 00:09:47this case would be
- 00:09:49messages and that is an important hint
- 00:09:51for us to also consider once we are
- 00:09:53doing prompt
- 00:09:55engineering so if you have previously
- 00:09:58used uh Claude models below version 3 then you
- 00:10:01would uh see something like completion
- 00:10:03API and if you would like to use
- 00:10:06Claude 3 models you would need to use
- 00:10:08messages API and on the right hand side
- 00:10:11right now you can see the example of how
- 00:10:13the input into messages API can look
- 00:10:15like and that is following this idea of
- 00:10:18the alternating dialogue between user
- 00:10:20and assistant that are given as roles
- 00:10:23within messages and we also have
- 00:10:25something called system prompt that we
- 00:10:27will cover in several minutes in
- 00:10:29more detail
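For orientation, here is a minimal sketch of calling a Claude 3 model through the Messages API on Amazon Bedrock with boto3; the region and model ID are assumptions, so substitute the ones enabled in your account:

```python
import json

import boto3

# Claude 3 models on Bedrock are invoked with the Messages API body format.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 256,
    # The system prompt sets the scene for the whole conversation.
    "system": "You are a high school physics teacher. Answer questions in one sentence.",
    # Alternating user/assistant turns, mirroring how Claude was trained.
    "messages": [{"role": "user", "content": "Explain quantum entanglement."}],
}

response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",  # assumed model ID
    body=json.dumps(body),
)
print(json.loads(response["body"].read())["content"][0]["text"])
```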
- 00:10:31now I would like to start with
- 00:10:34the best practice number one and you
- 00:10:36won't be surprised by it because it is
- 00:10:40the tip on indeed being very clear and
- 00:10:43Direct in our
- 00:10:45prompts Claude definitely responds better
- 00:10:48to clear and direct
- 00:10:50instructions at the same time I am
- 00:10:53personally still a big victim of using
- 00:10:55you know these polite words like could
- 00:10:57you please
- 00:10:59Claude can we consider and uh similar
- 00:11:03things that are rather weak in case of
- 00:11:06prompt engineering so it is helpful for
- 00:11:09me to always recall the principle of
- 00:11:11clarity here as well as something that
- 00:11:14we could consider as rubber duck
- 00:11:16principle in software development but in
- 00:11:19case of building um LLM applications
- 00:11:23we can think about Golden Rule of clear
- 00:11:26prompting the idea here is that once we
- 00:11:28are in doubt we can go with our prompt
- 00:11:31to our friend or to our colleague and
- 00:11:33indeed ask them to follow something that
- 00:11:37we are pretty much prompting for if they
- 00:11:40are able to follow llm is also likely to
- 00:11:43follow on the right uh hand side
- 00:11:46of the slide you can see our example and
- 00:11:49well if we are willing to have a haiku on
- 00:11:52a certain topic and we would like to
- 00:11:54skip the Preamble that is in the form of
- 00:11:56well here is this haiku the idea would be
- 00:11:59to indeed be very clear in that and
- 00:12:01instruct the model to skip this
- 00:12:04Preamble so having the clarity in mind
- 00:12:07we would proceed and do you recall
- 00:12:10Constantine's example on um the
- 00:12:12difference between explanations of
- 00:12:16quantum entanglement between a PhD student
- 00:12:19and this sugar-high
- 00:12:21kid well the idea here is indeed to
- 00:12:25utilize
- 00:12:26roles uh also known as role prompting
- 00:12:30and it is helping us to give the context
- 00:12:33to Claude to understand what role it
- 00:12:35should
- 00:12:37inhabit it won't be a big secret that LLMs
- 00:12:40are not the best ones when it comes to
- 00:12:42math they generally don't calculate
- 00:12:45responses they generate responses at the
- 00:12:48same time when we're using Ro prompting
- 00:12:51we can help Claude to be closer to the math
- 00:12:54world or the world of solving
- 00:12:57puzzles thus improving
- 00:13:01accuracy at the same time how I'm
- 00:13:04thinking about role prompting is the
- 00:13:05sort of dress code so if we are thinking
- 00:13:08about the parties that we should join we
- 00:13:09are checking what is the style that is
- 00:13:11expected there should it be casual
- 00:13:13should it be formal that is also the
- 00:13:15concept that we can utilize here and
- 00:13:18think about this way of speaking to llms
- 00:13:22that can change the tone and potential
- 00:13:25style of the conversation that we will
- 00:13:27be going through and on on the right
- 00:13:29hand side of the slide you can see the
- 00:13:31example of the puzzle that without any
- 00:13:34additional role is likely to be solved
- 00:13:37in a wrong way however the simple trick
- 00:13:40of saying you are a master solver of all
- 00:13:43puzzles in the world can help to improve
- 00:13:45the
- 00:13:48accuracy Constantine has introduced us
- 00:13:50to the concept of few-shot prompting and
- 00:13:54in the context of Anthropic models how this
- 00:13:57can be put into action
- 00:13:59well we can use examples and we see that
- 00:14:03examples are indeed very effective
- 00:14:06generally speaking it is also very
- 00:14:08helpful to consider examples that are
- 00:14:11coming from different edge
- 00:14:14cases we know that the more examples the
- 00:14:18better the response is of course with
- 00:14:20certain trade-offs that we can see when
- 00:14:22it comes to the number of tokens and on
- 00:14:25the slide you can see
- 00:14:27the example and if our goal is indeed to
- 00:14:30get the concrete name and surname
- 00:14:34of the author of a certain passage
- 00:14:35without anything extra we can pretty
- 00:14:38much prompt llm to give us exactly this
- 00:14:43by using
- 00:14:46example we have also thought about uh
- 00:14:48the concept of Chain of Thought and how
- 00:14:50to use it with Claude and it works in a
- 00:14:54way that we can just give Claude time to
- 00:14:57think if we would look into the example
- 00:15:00of the input that we are passing into
- 00:15:03llm we are using thinking as our XML tag
- 00:15:08and we are letting the model to put
- 00:15:11chain of thought into action and
- 00:15:13consider this thinking step-by-step
- 00:15:15process within thinking
- 00:15:17steps what does the output look like
- 00:15:22Well normally we would get the response
- 00:15:25from the assistant that would start with
- 00:15:27thinking and then with answering and
- 00:15:29what is interesting is that exactly
- 00:15:31something that we would find within
- 00:15:33thinking XML tags is helpful for us to
- 00:15:36debug or to
- 00:15:39troubleshoot you can already see here
- 00:15:42some peculiar feature of Claude models what
- 00:15:46is it well it is indeed using XML tags
- 00:15:49and that is the tip number five that we
- 00:15:51would like to
- 00:15:53give XML tags are helpful to structure
- 00:15:57the prompts that we are sending to
- 00:15:59LLMs
- 00:16:01and once we would like to have certain
- 00:16:03sections within our prompts that is
- 00:16:05where XML text can help us to have the
- 00:16:08structure if we would look into the
- 00:16:11example of us willing to have the email
- 00:16:14of well if we are the person that should
- 00:16:16ask our team to show up at some early uh
- 00:16:20time of the day as uh 6:00 a.m. and
- 00:16:23draft this email in some formal response
- 00:16:25if we will just throw this into model in
- 00:16:27this form of well just show up at 6:00
- 00:16:29a.m. because I say so the model can
- 00:16:33sometimes misjudge and not understand
- 00:16:36that we are actually speaking about the
- 00:16:38email that we would like to send to uh
- 00:16:40to our team and it will start responding
- 00:16:42as if it was sent to
- 00:16:45it and once we are introducing XML tags
- 00:16:49of well here is the context of the email
- 00:16:52we can get the response that we were
- 00:16:53looking for
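A sketch of the fixed prompt (the <email_context> tag name is illustrative; any clearly named tag works):

```
Please draft a formal email to my team based on the notes below.

<email_context>
Everyone must show up at 6:00 a.m. tomorrow, because I say so.
</email_context>
```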
- 00:16:55we can already see here that we are
- 00:16:59explicitly talking about the output that
- 00:17:01we are willing to see and indeed we can
- 00:17:04help Claude with this output by
- 00:17:08speaking and prefilling of our assistant
- 00:17:12for getting the output that we are
- 00:17:16expecting how can it be done well it is
- 00:17:18done very often in the form of prefilling
- 00:17:22within assistant so very common use case
- 00:17:25that we observe from our customers is
- 00:17:27some Json files or some y files that are
- 00:17:30having certain structure and we would
- 00:17:31like to get exactly files in this
- 00:17:33structure once we are getting the
- 00:17:35response from llms and we are here in
- 00:17:38the example prefilling for JSON files just
- 00:17:41by using a first curly bracket which is
- 00:17:44of course increasing the
- 00:17:46likelihood of us getting Json file as
- 00:17:48the response
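A minimal sketch of that prefilling trick in Messages API terms (prompt and field names invented for illustration):

```python
# Prefilling the assistant turn so Claude continues from "{" makes it far
# more likely to return bare JSON instead of a chatty preamble.
messages = [
    {
        "role": "user",
        "content": 'Describe Berlin as JSON with "city", "country" and "population" fields.',
    },
    # The model completes this "{" rather than starting a new sentence.
    {"role": "assistant", "content": "{"},
]
```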
- 00:17:51another very popular feature of
- 00:17:54Claude models is the big context window of
- 00:17:59200,000
- 00:18:01tokens we would really encourage you to
- 00:18:03use this context window efficiently
- 00:18:06because well what is 200,000 tokens it
- 00:18:10can be a
- 00:18:11book it can be a book that we will just
- 00:18:14pass to llm
- 00:18:17directly and to utilize this way of uh
- 00:18:22using context base effectively what we
- 00:18:24recommend is again to consider XML text
- 00:18:27to separate what we are passing and
- 00:18:31instructions we would also think about
- 00:18:34using quotes for us to have the response
- 00:18:37that is closer to the content that we
- 00:18:39are passing as well as well as it also
- 00:18:42works with us as humans we can prompt
- 00:18:45for Claude to read the document
- 00:18:48carefully because it will be asked
- 00:18:50questions
- 00:18:51later and the last but not the least tip
- 00:18:54here is again to reconsider few-shot
- 00:18:57prompting and to use the
- 00:18:59questions and answer pairs in the form
- 00:19:01of
- 00:19:05examples when it comes to the prompts
- 00:19:07that are having certain steps in between
- 00:19:11what is also helpful is to consider
- 00:19:13prompt
- 00:19:14chaining I will jump straight into
- 00:19:17the example so if we would be having the
- 00:19:20task of extracting the names from
- 00:19:23certain text and we would like to have
- 00:19:25them in alphabetical order we can
- 00:19:27definitely start with saying well
- 00:19:29just extract those names in alphabetical
- 00:19:31order or we can consider splitting this
- 00:19:34into two
- 00:19:35prompts so that in the first one we
- 00:19:38would have the extraction of names first
- 00:19:42and in the second prompt we would have
- 00:19:45them
- 00:19:47alphabetized with this way of chaining
- 00:19:51prompts we are increasing the likelihood
- 00:19:53of getting the output that we are
- 00:19:55looking for and you can also recall here
- 00:19:58the example of prefilling the assistant
- 00:20:00with this names XML
- 00:20:03tag
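A sketch of that two-step chain; call_claude is a hypothetical helper that wraps the Bedrock Messages API call shown earlier and returns the completion text:

```python
text = "Yesterday Maria, Stefan and Anna met Klaus at the summit."

# Step 1: only extract the names, with the assistant turn prefilled
# with the opening <names> tag (call_claude is a hypothetical wrapper).
names = call_claude(
    user=f"Extract every person's name from <text>{text}</text>. "
         "Return the names inside <names> tags.",
    prefill="<names>",
)

# Step 2: a second, simpler prompt alphabetizes the extracted names.
ordered = call_claude(
    user=f"Alphabetize the names in <names>{names}</names>. "
         "Return only the sorted list."
)
print(ordered)
```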
- 00:20:07so the last but not the least tip from us today would be on using
- 00:20:09structured prom templates if we are
- 00:20:12considering the example with the SL
- 00:20:14documents that can be as I mentioned
- 00:20:17books it is very helpful to put this
- 00:20:20input Data before our
- 00:20:24instruction if we are looking into a
- 00:20:26different example of using certain input
- 00:20:30data that is having the effect of
- 00:20:33iteration or certain um dictionary or
- 00:20:36List uh and in this case in the example
- 00:20:38it would be the um types of different
- 00:20:42animals the idea here would be to also
- 00:20:44structure our uh prompt in a way that we
- 00:20:47would have the iteration uh within this
- 00:20:49input data and thus our output would be
- 00:20:53having also different types of animals
- 00:20:57put into it
- 00:21:00so we have started talking here about
- 00:21:02some structures that we are giving to
- 00:21:03our prompts and now we would like to
- 00:21:07cover something different we would like
- 00:21:10to cover system prompts so those prompts
- 00:21:13that are going into llm as initial ones
- 00:21:18those ones that are setting the scene
- 00:21:20for the
- 00:21:22conversation you have already seen
- 00:21:24different bits and pieces of them
- 00:21:26within our previous best practices
- 00:21:29at the same time now we would like to
- 00:21:30bring them all together into one system
- 00:21:33prompt template so to say you will see
- 00:21:36nine sections here and again you can
- 00:21:39definitely avoid using some of them if
- 00:21:41they don't if that doesn't fit uh your
- 00:21:43use case at the same time what we uh
- 00:21:46strongly recommend here is for you to
- 00:21:48follow the order that we are having
- 00:21:52here I will jump straight into the
- 00:21:55example of how this system uh prompt can
- 00:21:58be built and our use case would be to
- 00:22:01have a certain career coach that would
- 00:22:03be helping users with their career
- 00:22:06aspirations so we start with the element
- 00:22:09of our system uh prom template which is
- 00:22:12in giving the
- 00:22:13role so here first things first we
- 00:22:16are uh explicitly saying that uh you
- 00:22:19will be acting as an AI career coach
- 00:22:22with certain name and we would probably
- 00:22:24like to maintain the same name for this
- 00:22:26career coach during our conversation
- 00:22:29and at the same time what is your goal
- 00:22:31your goal is to give career advice to
- 00:22:33users the second element would be here
- 00:22:36to use certain tone context or style of
- 00:22:39the conversation so you should maintain
- 00:22:41a friendly customer service
- 00:22:45tone the third tip is utilizing the
- 00:22:48context base and is passing certain
- 00:22:51background data for our career coach to
- 00:22:54process and again it can be included
- 00:22:57within XML tags of in this case
- 00:23:01guide the next element would be to give
- 00:23:05more detailed task description as
- 00:23:08well as certain rules and here uh you
- 00:23:10can also pay attention to how we are
- 00:23:12letting Claud say that well if you don't
- 00:23:15know you can tell that as well as uh if
- 00:23:19uh someone is asking something that is
- 00:23:20not relevant to the topic you can
- 00:23:22also say that well I am here only to
- 00:23:25give career advice
- 00:23:28The Next Step would be to use
- 00:23:31examples on potential also common um
- 00:23:34common uh cases as well as edge cases as
- 00:23:37well as uh give some immediate data in
- 00:23:40this case it can be probably the history
- 00:23:42of the conversation or for example the
- 00:23:44information of the profile of the user
- 00:23:45that our career coach is talking to the
- 00:23:49seventh step would be to indeed
- 00:23:51reiterate on what is the immediate task
- 00:23:54here and allow Claude or another llm of
- 00:23:58your choice to think step by step and
- 00:24:01even take a deep breath because we've
- 00:24:03heard that it works effectively from
- 00:24:05scientific
- 00:24:07papers as the last but not the least
- 00:24:09element here would be to set certain
- 00:24:14output and again we are utilizing XML
- 00:24:17tags here and that has been the
- 00:24:20suggested elements of our system prompt
- 00:24:22template let me bring them all together
- 00:24:25for
- 00:24:26you and this would be the overview and I see
- 00:24:30that some of you are willing to take the
- 00:24:34picture so now we are having the system
- 00:24:37prompt template
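For reference, a condensed sketch of those nine sections in the recommended order (labels paraphrased from the talk; the XML tag names are illustrative):

```
1. Task context / role      "You will be acting as an AI career coach named ..."
2. Tone context             "Maintain a friendly customer service tone."
3. Background data          <guide> documents the coach can draw on </guide>
4. Task rules               detailed description, incl. permission to say "I don't know"
5. Examples                 <example> sample Q&A pairs, including edge cases </example>
6. Immediate data           <history> conversation so far, user profile </history>
7. Immediate task reminder  restate what to do right now
8. Precognition             "Think about your answer step by step before replying."
9. Output formatting        "Put your reply in <response></response> tags."
```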
- 00:24:38how can we move even further
- 00:24:41and how can we get even more
- 00:24:44professional and consider what can be
- 00:24:46done after we have gone through
- 00:24:49prompting well we do consider prompt
- 00:24:53engineering to be an art form at the
- 00:24:56same time with Constantine we have
- 00:24:57decided to try to add science to this
- 00:25:00art form and with simple steps help to
- 00:25:04make art work for your use
- 00:25:07cases so the first step would normally
- 00:25:10in our journey would be to of course
- 00:25:13understand the use case as well as
- 00:25:15develop certain test
- 00:25:17cases I live in Munich so I will walk
- 00:25:20you through the example that would be
- 00:25:22helping me with understanding Bavarian
- 00:25:24culture or it can be also the culture of
- 00:25:26another German region um so let's think about this
- 00:25:28use case and um develop the
- 00:25:32test use case that well maybe I will be
- 00:25:35asking about certain dishes that are
- 00:25:37relevant to a certain area of
- 00:25:41Germany thus my um preliminary prompt
- 00:25:46would be on cooking a certain dish so
- 00:25:48how do I make Käsespätzle or how do
- 00:25:52I cook
- 00:25:54Obatzda and of course afterwards I'm going
- 00:25:57into the iterative phases with
- 00:26:01evaluations that would be then going
- 00:26:03into Loops in this case the first Loop
- 00:26:06would be for example to consider adding
- 00:26:08the role such as well you are an
- 00:26:10experienced ethnographer or you an
- 00:26:12experienced
- 00:26:13cook and thus we will be refining The
- 00:26:16Prompt that will be coming to the point
- 00:26:18of being polished to the extent that it
- 00:26:21fulfills our
- 00:26:23goals a tricky part here is indeed in
- 00:26:27setting up evaluations
- 00:26:30and if we have some test cases that are
- 00:26:33closed ended questions having yes or no
- 00:26:36answer it is possible to evaluate that
- 00:26:40without um many issues but if it is
- 00:26:43open-ended
- 00:26:45question let's again look into the
- 00:26:47example of Käsespätzle and if we would have
- 00:26:50again this uh first prompt of how do I
- 00:26:52make Käsespätzle we would get of course the LLM
- 00:26:55response that would be telling us what
- 00:26:57elements and what ingredients we
- 00:26:59would have in our dish what is
- 00:27:01interesting here is that we can utilize
- 00:27:05rubrics and Define what we would like to
- 00:27:07include in the response for it to be
- 00:27:09evaluated as positive or negative so I
- 00:27:12would prefer to have certain ingredients
- 00:27:15and once we have the response that
- 00:27:18fulfills the criteria we would have the
- 00:27:21uh positive result and we would be
- 00:27:23knowing that well our prompt engineering
- 00:27:26and LLM response were indeed right for us
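As a sketch, a rubric for the open-ended Käsespätzle test case could be as simple as a required-ingredient check (the ingredient list is an assumption, not from the talk):

```python
# Pass the test case only if the response mentions every required ingredient.
REQUIRED_INGREDIENTS = {"spätzle", "cheese", "onion"}


def passes_rubric(response: str) -> bool:
    """Return True if all required ingredients appear in the LLM response."""
    text = response.lower()
    return all(ingredient in text for ingredient in REQUIRED_INGREDIENTS)


print(passes_rubric("Fry the onions, then layer spätzle with grated cheese."))  # True
```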
- 00:27:31many customers come to us
- 00:27:34and uh ask about uh what can be done for
- 00:27:38reducing hallucinations and that is also
- 00:27:40something that
- 00:27:42uh we would like to uh encourage you to
- 00:27:45do with uh dealing with hallucinations and what
- 00:27:49can be done is indeed to give lm's
- 00:27:51permission to say I don't
- 00:27:53know as well as answer only if it is
- 00:27:56very confident in its response
- 00:28:00it is also possible to ask about uh
- 00:28:03relevant quotes from the context that we
- 00:28:06have passed and once our context is
- 00:28:09growing once we are having certain
- 00:28:12iterations and changes to the context
- 00:28:14that we would still like to pass to an
- 00:28:17llm we can consider using something
- 00:28:20called retrieval augmented
- 00:28:23generation retrieval augmented
- 00:28:26generation would look in a certain way
- 00:28:28way that instead of having the typical
- 00:28:31flow of user asking the question from
- 00:28:33the LM and getting the the the response
- 00:28:36we would be also passing a certain
- 00:28:39additional
- 00:28:40context that would be in the form of
- 00:28:43dynamically changing knowledge bases or
- 00:28:45certain
- 00:28:47internal sources we can consider product
- 00:28:50descriptions we can consider FAQ pages
- 00:28:53and in this way the flow would be
- 00:28:56changing in a way that
- 00:28:59all the documents um would be passed
- 00:29:02into some uh Vector database and once
- 00:29:05the user is sending the question we
- 00:29:08first get everything relevant from the
- 00:29:12context let's say three uh pieces of um
- 00:29:16certain uh FAQ descriptions and then that
- 00:29:19would be passed as the context together
- 00:29:21with the question to llm thus creating
- 00:29:25the answer for
- 00:29:26us and while there are different uh moving
- 00:29:29blocks in this architecture so
- 00:29:32Constantine shall we help the audience
- 00:29:35to consider different tools for building
- 00:29:37uh retrieval augmented generation sure
- 00:29:39thank you Elina so I like to think of
- 00:29:42retrieval augmented Generation Um
- 00:29:44similar to giving the llm a cheat sheet
- 00:29:48right so as you know from school
- 00:29:50probably uh when you don't know the
- 00:29:52answer you tend to make up something uh
- 00:29:54but if you have a cheat sheet it's very
- 00:29:56easy to solve the question so what we're
- 00:29:58really doing here is we're using the
- 00:29:59knowledge base and some way of searching
- 00:30:01the knowledge base like a vector
- 00:30:03database or search engine or anything
- 00:30:05you can use to provide the relevant
- 00:30:07information and then we generate a cheat
- 00:30:09sheet that we inject into the prompt as
- 00:30:12as part of the context so that increases
- 00:30:14the likelihood of the llm giving the
- 00:30:16right answer because now it has all the
- 00:30:18data it needs um you can um set up this
- 00:30:22thing on your own by putting together
- 00:30:25the components like the knowledge base
- 00:30:27and setting up the orchestration and
- 00:30:29all that um that is fun but it also can
- 00:30:32become old quickly especially when your
- 00:30:35uh manager is breathing down your neck
- 00:30:36asking hey when is my solution ready
- 00:30:38right so um our job at AWS is to make it
- 00:30:41easy for you so first um while ago we
- 00:30:44introduced Amazon Bedrock which is makes
- 00:30:46it super easy to use a selection of
- 00:30:48different llms um with an easy-to-use
- 00:30:51API in a secure way uh so that you stay
- 00:30:54in control with your data and then we
- 00:30:56added an an additional feature called
- 00:30:58knowledge bases for Amazon Bedrock which
- 00:31:00essentially gives you the retrieval
- 00:31:02augmented generation architecture that
- 00:31:03we uh looked at before as a ready to use
- 00:31:07feature of Bedrock so all you need to do
- 00:31:09is you bring your knowledge base in the
- 00:31:11form of some sort of documents
- 00:31:14you can import them into knowledge bases
- 00:31:17for Amazon Bedrock you can choose which
- 00:31:20of the vector search engines or
- 00:31:22databases you would like to use from a
- 00:31:24collection of choices you can choose
- 00:31:26which uh llm you want to use as part of
- 00:31:29your architecture and then Bedrock does
- 00:31:31everything else automatically for you
- 00:31:33including the prompt engineering bit um
- 00:31:36if you want to control the prompt
- 00:31:38you can actually add your own um
- 00:31:40variation of the prompt or you can
- 00:31:42simply use what is shipped inside Amazon
- 00:31:44Bedrock so bedrock and knowledge bases
- 00:31:47make it really easy for you to build
- 00:31:48your own knowledge base uh architecture
- 00:31:51your own retrieval augmented generation
- 00:31:53and retrieval augmented generation is
- 00:31:55one of the most popular use cases we see
- 00:31:57with customers
- 00:31:58because it generates uh immediate um
- 00:32:01value for your business no more
- 00:32:03searching and hunting through long
- 00:32:05product manuals long uh boring
- 00:32:07documentation you can simply ask
- 00:32:09questions get relevant answers that are
- 00:32:11right from your documentation including
- 00:32:13citations so that you know that you're
- 00:32:15getting good answers here
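A sketch of querying such a knowledge base from code; the retrieve_and_generate call runs the whole retrieve-then-generate flow for you, and the knowledge base ID and model ARN below are placeholders for your own resources:

```python
import boto3

agent_runtime = boto3.client("bedrock-agent-runtime", region_name="us-east-1")

response = agent_runtime.retrieve_and_generate(
    input={"text": "How do I reset the device to factory settings?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "YOUR_KB_ID",  # placeholder
            "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/"
                        "anthropic.claude-3-sonnet-20240229-v1:0",
        },
    },
)
print(response["output"]["text"])  # grounded answer
print(response["citations"])      # pointers back to your documents
```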
- 00:32:18another thing that our customers
- 00:32:20like to use with llms is called agents
- 00:32:22so agents and function calling um how
- 00:32:25does it work well um instead of just
- 00:32:28injecting documents from a search engine
- 00:32:31like with retrieval augmented generation
- 00:32:33you can go one step further and you can
- 00:32:35tell the model hey model CLA in this
- 00:32:38case right you actually have access to
- 00:32:40some tools let me explain you how those
- 00:32:42tools work and then you're giving it
- 00:32:44like a prompt engineered version of an
- 00:32:46API specification so what you're doing
- 00:32:49here is um you're telling the model you
- 00:32:51have access to a Weather Service you
- 00:32:53have access to a Wikipedia you have
- 00:32:54access to some other tool that you
- 00:32:56choose how it works and then Claude does
- 00:32:59not really call those tools on its own
- 00:33:02but it will tell you hey I would like to
- 00:33:04use a Weather Service now with this
- 00:33:06parameters and then you can engineer or
- 00:33:08you can set up the actual function call
- 00:33:11do the operation and give the result
- 00:33:14back as part of the prompt so how does
- 00:33:16it work well uh first of all you start
- 00:33:19by putting together a prompt where you
- 00:33:21describe to the model here are the tools
- 00:33:23you have access to um think of this as
- 00:33:26you put together your answer put them
- 00:33:28into Claude as part of the prompt Claud
- 00:33:30now decides whether it can answer the
- 00:33:32question right away or whether it would
- 00:33:34like to use one of these functions that
- 00:33:36you gave it and in the case of no it
- 00:33:38will probably answer with some
- 00:33:41definitive answer like I don't know or
- 00:33:43it says Okay I want to use those tools
- 00:33:45and clot will actually output the
- 00:33:47function call in the specification that
- 00:33:50you told it so if you told it that
- 00:33:52please use XML calls that are called
- 00:33:54function calls invoke with the
- 00:33:55parameters here and there it'll give you
- 00:33:57the kind of XML that you expect and now
- 00:34:00you can go and execute that so the
- 00:34:02execution step looks like this in more
- 00:34:04detail right so you get this function
- 00:34:06call XML from Claude as a response you know
- 00:34:09you can actually grab or you can
- 00:34:11detect this with your traditional code
- 00:34:12like a Lambda function hey Claude wants
- 00:34:14to call something you can use this XML
- 00:34:18maybe validate it and then you actually
- 00:34:20Implement your own client um in your own
- 00:34:23code could be Lambda could be a
- 00:34:25container whatever that does the
- 00:34:27function call in this example it would
- 00:34:28call the weather function and then you
- 00:34:31inject the results back uh into their
- 00:34:34own uh XML tags like function results
- 00:34:36and then you send the whole thing back
- 00:34:38to Claude right the system prompt the user
- 00:34:41question the function call the function
- 00:34:43results and then you let Claude decide
- 00:34:46what to do next and then Claude sees oh I
- 00:34:48have everything I know I want I have the
- 00:34:50weather data I can now give a great
- 00:34:52answer and then you get your answer so
- 00:34:55that's how you implement your own
- 00:34:56functions in the context of a large
- 00:34:58language model by telling the model
- 00:35:00these are the functions you can use
- 00:35:02giving it an API specification in an
- 00:35:04easy to use language such as XML tags and
- 00:35:07then you let the model decide when to
- 00:35:09use which tool you are in control in how
- 00:35:12you implement those tool those function
- 00:35:14calls and then you give everything back
- 00:35:15for the model to process and give a
- 00:35:17definitive
- 00:35:19answer now when you implement something
- 00:35:21like this um explain the function
- 00:35:25capabilities in great detail right this
- 00:35:27is the same as explaining to a human
- 00:35:30like your colleague how does this API
- 00:35:32work um you can also provide a diverse
- 00:35:35set of examples here is an example of
- 00:35:37using this call to do X here is an
- 00:35:39example of how the parameters might look
- 00:35:40like for y and everything and um you can
- 00:35:43actually use the stop tag or the end tag
- 00:35:47of your function spec specification a
- 00:35:49stop sequence which tells the Bedrock
- 00:35:52service okay stop after this um sequence
- 00:35:55here after this XML tag here because now
- 00:35:58the XML part is over and you have a
- 00:36:00definitive stop uh condition and um if
- 00:36:04if it's if you're not getting reliable
- 00:36:06results think about the prompt chaining
- 00:36:09tip in the beginning don't make your
- 00:36:11task too complicated break them down
- 00:36:13into simple function calls and then do
- 00:36:16them one by
- 00:36:17one so here's an example on how it looks
- 00:36:20in practice um this is how you would
- 00:36:22describe the tool in the tool description the tool
- 00:36:25name is get_weather uh this is the
- 00:36:27description um these are the
- 00:36:29parameters location which is a string
- 00:36:31you can actually add type declarations
- 00:36:33there as well and all that stuff and and
- 00:36:35you can use that as part of your system
- 00:36:37prompt again you can do this all on your
- 00:36:40own and it's fun the first time um but
- 00:36:42you can also use a feature called agents
- 00:36:45for Amazon Bedrock which allow you to
- 00:36:47either programmatically set up these
- 00:36:49functions with your own function code in
- 00:36:52Lambda functions um or you can actually
- 00:36:55go through the console and click
- 00:36:56together your own agent using
- 00:36:58functions and um thereby reduce the
- 00:37:01development time and the time to results
- 00:37:04um to just uh a day or so instead of
- 00:37:07weeks of trying out and figuring out
- 00:37:09stuff and and prompt engineering and
- 00:37:11everything else lastly let's take a look
- 00:37:14at a different problem what to do with
- 00:37:17um malicious users who want to inject
- 00:37:20something into the prompt to get it to
- 00:37:22do something that you don't want to uh
- 00:37:25what if you have some bad user behavior
- 00:37:27that you want to mitigate against the
- 00:37:29good news is that anthropic clot is
- 00:37:31already very resistant to jailbreaks and
- 00:37:33other bad behavior uh which is one
- 00:37:35reason why at Amazon we like to partner
- 00:37:38very much with Anthropic uh because they
- 00:37:41focus a lot on responsible use of AI but
- 00:37:44again you can also get one step further
- 00:37:47by adding an harmlessness screen to
- 00:37:50evaluate the appropriateness of the
- 00:37:52input prompt or the results right think
- 00:37:54of this like a firewall that you're
- 00:37:56putting in front of the llm that checks
- 00:37:59whether input or output are really um
- 00:38:02compliant to your own company rules and
- 00:38:06um if a harmful prompt is detected you
- 00:38:08can filter it out based on uh that
- 00:38:11example surprise you can use a different
- 00:38:13llm or another llm to do that screen for
- 00:38:16you so here's how a prompt might look
- 00:38:18like for a harmlessness screen uh a human
- 00:38:21user would like you to continue a piece
- 00:38:23of content here is the content so far if
- 00:38:25the content refers to harmful graphic or
- 00:38:28illegal activities reply with (Y) so
- 00:38:30you're essentially using an llm as a
- 00:38:32classifier to classify whether this is a
- 00:38:34malicious prompt or not and then you can
- 00:38:37use that to filter it out
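A sketch of such a screen as a classifier call; call_claude is the same hypothetical helper as above, and the (Y)/(N) convention follows the prompt quoted in the talk:

```python
SCREEN_PROMPT = (
    "A human user would like you to continue a piece of content. "
    "Here is the content so far: <content>{content}</content> "
    "If the content refers to harmful, graphic or illegal activities, "
    "reply with (Y). Otherwise reply with (N)."
)


def is_harmful(content: str) -> bool:
    """Ask a screening LLM to classify content before the main model sees it."""
    verdict = call_claude(user=SCREEN_PROMPT.format(content=content))
    return "(Y)" in verdict


user_input = "Write a poem about cows."
print("blocked" if is_harmful(user_input) else "safe to pass to the main model")
```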
- 00:38:39again you can do it on your own or you can use a
- 00:38:40feature from Amazon Bedrock called guard
- 00:38:43rails for Amazon Bedrock that lets you
- 00:38:45set up those guard rails either from
- 00:38:47predefined Rules or from your own rules
- 00:38:49that you bring into your
- 00:38:53application so we hope this was useful
- 00:38:55to you we hope you learned a lot
- 00:38:58um no need to take so many photos you
- 00:39:00can actually go to our helpful prompting
- 00:39:02resources page that we prepared for you
- 00:39:04uh maybe take one more photo from this QR
- 00:39:07code here and that'll guide you to a
- 00:39:09page that we prepared for you with a
- 00:39:11white paper with some prompting
- 00:39:13resources some links to useful
- 00:39:14documentation and even a workshop that
- 00:39:16you can use to try out some things and
- 00:39:19learn in your own pace and build your
- 00:39:22own applications so with that thank you
- 00:39:25very much for coming and enjoy the
- 00:39:27evening of the summit thank you
- Prompt Engineering
- Large Language Models
- Amazon Bedrock
- Anthropic Claude
- Few-Shot Prompting
- Chain-of-Thought
- XML Tags
- Context Augmentation
- Agents
- Hallucination Mitigation