Generative AI Challenge - Implement RAG with Azure OpenAI Service

00:35:13
https://www.youtube.com/watch?v=2qoHkHRR958

Summary

TL;DR: The session provides insights into AI development with .NET and Azure, focusing on how to use the Retrieval-Augmented Generation (RAG) pattern with Azure OpenAI Service. Hosted by Microsoft experts Aaron and Justin, it covers how RAG enriches large language models (LLMs) with business-specific data so they produce more contextually relevant responses. The session compares fine-tuning AI models with the RAG pattern, and demonstrates through practical coding how to use Azure AI Search to retrieve data relevant to a user's query. The demos show how little code is needed to attach additional data sources to a model, and the hosts recommend managed identities for secure access. Additional resources such as Microsoft Learn modules and recordings of past sessions are also highlighted.

Takeaways

  • 🌐 Understand the Retrieval-Augmented Generation pattern with Azure OpenAI.
  • 👥 Hosted by Microsoft’s Aaron and Justin.
  • 🔍 Learn differences between RAG and fine-tuning.
  • 📄 Use Azure AI Search to enrich AI models with PDFs and documents.
  • 🤖 Demonstration on integrating contextual business data into AI.
  • 🔑 Use managed identities for secure access to Azure resources.
  • 📊 Emphasis on using AI responsibly in applications.
  • 🔗 Access to related past sessions on YouTube.
  • 🏆 Earn digital badges through learning modules.
  • 💡 Key Vault recommended for secure API key storage.

Timeline

  • 00:00:00 - 00:05:00

    The session, introduced by Aaron and Justin, focuses on learning about AI and Azure and applying them with .NET, using the Microsoft Learn environment. Participants are encouraged to work through the Learn modules to continue their AI development journey, aiming for a certificate of completion. During this session they will explore the retrieval-augmented generation (RAG) pattern with Azure OpenAI Service.

  • 00:05:00 - 00:10:00

    The concept of retrieval-augmented generation (RAG) is explained as a method of enriching large language models with additional data so they give more contextual responses for specific business needs. Instead of costly retraining, RAG attaches contextual information from data sources to prompts, improving the relevance of the model's output to the business context. The session continues the series' previous modules, with one more session upcoming; the hosts remind non-US participants to convert the posted times to their local time zones.

  • 00:10:00 - 00:15:00

    The hosts discuss the differences between fine-tuning and RAG in AI development. Fine-tuning adjusts the underlying model to incorporate specific knowledge, but any update requires retraining, which is time-consuming; a product catalog that changes weekly, for example, would force repeated fine-tuning runs, whereas a RAG data source only needs re-indexing. RAG, in contrast, leaves the base model untouched and supplements it with contextual information at query time, offering a more flexible and adaptive solution without retraining. The conversation touches on when each approach is appropriate, though detailed scenarios are beyond the session's scope.

  • 00:15:00 - 00:20:00

    The RAG approach involves a handful of steps: start with a user prompt, process it to determine what context is relevant, consult a data source to retrieve that contextual information, and send the enriched query to Azure OpenAI Service, which uses the additional data to produce a more accurate response (see the sketch below). The demo showcases this workflow, highlighting its simplicity and effectiveness.
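
To make those steps concrete, here is a minimal, self-contained C# sketch of the flow. The `ExtractCity` and `SearchIndex` helpers are hypothetical stand-ins for the query-processing and data-retrieval steps; in the session, Azure AI Search plays the retrieval role.

```csharp
using System;

// Hypothetical helpers standing in for steps 2 and 3 of the workflow;
// in the session, Azure AI Search plays the role of SearchIndex.
static string ExtractCity(string prompt) =>
    prompt.Contains("London", StringComparison.OrdinalIgnoreCase) ? "London" : "unknown";

static string SearchIndex(string city) =>
    city == "London"
        ? "Travel brochure: Buckingham Hotel near Buckingham Palace; River Thames cruises."
        : string.Empty;

string userPrompt = "Tell me about London";   // 1. the user's question
string city = ExtractCity(userPrompt);        // 2. work out what context is relevant
string context = SearchIndex(city);           // 3. retrieve only the matching documents
string grounded =                             // 4. attach that context to the prompt
    $"Use only this context to answer:\n{context}\n\nQuestion: {userPrompt}";

// 5-6. `grounded` is what gets sent to Azure OpenAI; the completion that
// comes back now reflects the business data, not just the training data.
Console.WriteLine(grounded);
```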

  • 00:20:00 - 00:25:00

    Aaron introduces Polyglot Notebooks, a VS Code extension for scripting and prototyping that is especially convenient for AI experiments because it avoids building a full application. He demonstrates calling Azure OpenAI Service from a notebook, starting with setup: pulling in the NuGet package and supplying the Azure OpenAI endpoint, key, and deployment name (a sketch follows below). Querying the model without any context yields only general information about London, illustrating why RAG is needed for business-specific insights.
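
The exact notebook code isn't captured in this summary, but a minimal sketch of that first notebook, assuming the pre-release Azure.AI.OpenAI 2.x package used in the session (placeholders mark values read from the Azure portal), looks roughly like this:

```csharp
// Polyglot Notebooks cell: pull the pre-release v2 SDK from NuGet.
// The exact beta version is an assumption; use the current 2.x pre-release.
#r "nuget: Azure.AI.OpenAI, 2.0.0-beta.2"

using System;
using Azure;
using Azure.AI.OpenAI;
using OpenAI.Chat;

// Endpoint, key, and deployment name come from the Azure portal; in the
// session the key is read via a password prompt so it is never displayed.
var client = new AzureOpenAIClient(
    new Uri("https://<your-resource>.openai.azure.com/"),
    new AzureKeyCredential("<your-key>"));
var chatClient = client.GetChatClient("gpt-35-turbo-16k"); // deployment name

// No system prompt and no grounding data: the model can only answer from
// its training, so it returns generic facts about London.
ChatCompletion completion = chatClient.CompleteChat(
    new UserChatMessage("Tell me about London"));
Console.WriteLine(completion.Content[0].Text);
```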

  • 00:25:00 - 00:30:00

    Justin demonstrates using Azure AI Search to index uploaded PDF documents containing travel information. Azure AI Search extracts and vectorizes the text, making it usable as a contextual source for queries. Aaron then adapts the earlier code to point at this search index, linking queries with the indexed business-specific documents (see the sketch below). The result is more relevant output with citations from internal data sources, showing how minimal code changes significantly improve the response.
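
The change Aaron describes amounts to adding a data source to the chat options. Here is a hedged sketch using the preview "On Your Data" surface of the 2.x SDK; the AOAI001 pragma suppresses the preview-feature warning mentioned in the session, the endpoint, key, and index values are placeholders, and the message-context accessor has been renamed across the 2.x betas:

```csharp
#pragma warning disable AOAI001 // data sources are a preview feature of the v2 SDK

using System;
using Azure;
using Azure.AI.OpenAI;
using Azure.AI.OpenAI.Chat;
using OpenAI.Chat;

var client = new AzureOpenAIClient(
    new Uri("https://<your-openai>.openai.azure.com/"),
    new AzureKeyCredential("<openai-key>"));
var chatClient = client.GetChatClient("gpt-35-turbo-16k");

// Point the completion at the Azure AI Search index that holds the
// vectorized travel PDFs; a managed identity could replace the key.
var options = new ChatCompletionOptions();
options.AddDataSource(new AzureSearchChatDataSource
{
    Endpoint = new Uri("https://<your-search>.search.windows.net"),
    IndexName = "<travel-index>",
    Authentication = DataSourceAuthentication.FromApiKey("<search-key>"),
});

ChatCompletion completion = chatClient.CompleteChat(
    new ChatMessage[] { new UserChatMessage("Tell me about London") }, options);
Console.WriteLine(completion.Content[0].Text); // now cites [doc1], [doc2] from the PDFs

// Unpack the RAG context (intent plus citations) attached to the response.
// Accessor name as of the early 2.x betas; later previews rename it GetMessageContext().
var messageContext = completion.GetAzureMessageContext();
foreach (var citation in messageContext.Citations)
    Console.WriteLine($"{citation.Title}: {citation.Url}");
```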

  • 00:30:00 - 00:35:13

    The session concludes with encouragement to continue through Microsoft's Learn modules, with the next session focusing on responsible AI. The hosts cover access-management best practices, recommending managed identities and Azure Key Vault for security (a sketch of the keyless setup follows below). Audience questions about authentication mechanisms are addressed before Aaron and Justin sign off, thanking participants and reminding them that the next session is in two days.
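
As a companion to that advice, a minimal sketch of the keyless setup the hosts recommend, assuming Azure.Identity's DefaultAzureCredential and placeholder resource names:

```csharp
using System;
using Azure.AI.OpenAI;
using Azure.Identity;
using OpenAI.Chat;

// DefaultAzureCredential resolves to a managed identity when the app runs
// in Azure, and to your developer sign-in locally, so no API key is needed.
// The identity still needs an appropriate Azure OpenAI role assignment.
var client = new AzureOpenAIClient(
    new Uri("https://<your-resource>.openai.azure.com/"),
    new DefaultAzureCredential());
var chatClient = client.GetChatClient("<deployment-name>");

ChatCompletion completion = chatClient.CompleteChat(
    new UserChatMessage("Tell me about London"));
Console.WriteLine(completion.Content[0].Text);
```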

Video Q&A

  • What was the main focus of the session?

    The session primarily focused on the RAG (Retrieval-Augmented Generation) pattern using Azure OpenAI services in AI development with .NET.

  • Who hosted the session?

    The session was hosted by Aaron, a Principal Cloud Advocate at Microsoft, with his colleague Justin.

  • What is the RAG pattern?

    RAG stands for Retrieval-Augmented Generation, a method to enrich large language models with additional data to enhance contextual responses.

  • How can participants utilize Azure AI search in their AI program?

    Participants can upload structured or unstructured data, such as PDFs, into Azure AI Search, which vectorizes the content so it can be searched for contextually relevant information.

  • What is the benefit of using the RAG pattern in AI?

    The RAG pattern lets AI models give more contextually accurate responses by integrating specific business data at query time, without creating or retraining models.

  • What is the main difference between fine-tuning and using the RAG pattern?

    Fine-tuning adjusts a model with specific data, requiring retraining for updates, while RAG adds context through data at runtime without retraining.

  • How were the demonstrations executed?

    The demonstrations showed how to retrieve contextual information from Azure AI Search and incorporate it into responses via Azure OpenAI, using short code examples in Polyglot Notebooks inside VS Code.

  • How can participants access past sessions?

    Past sessions can be accessed through the provided YouTube channel links shared during the session.

  • What additional learning resources were mentioned?

    Microsoft Learn modules and a digital badge for completing the challenge and assessment were mentioned as further learning avenues.

  • How should users handle identity and access management?

    It is recommended to use managed identities wherever possible; where keys are unavoidable, store them in Azure Key Vault rather than in code or configuration (see the sketch below).
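
Where keys cannot be avoided, a small sketch of reading them from Key Vault with the Azure.Security.KeyVault.Secrets client; the vault URL and secret names here are assumptions:

```csharp
using System;
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

// The app authenticates to Key Vault itself with a managed identity,
// then reads the OpenAI endpoint and key stored there as secrets.
var secrets = new SecretClient(
    new Uri("https://<your-vault>.vault.azure.net/"),
    new DefaultAzureCredential());

// Hypothetical secret names; use whatever your vault defines.
string endpoint = secrets.GetSecret("AzureOpenAIEndpoint").Value.Value;
string apiKey   = secrets.GetSecret("AzureOpenAIKey").Value.Value;
```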

Subtitles
  • 00:01:19
    Aaron: Good morning, good afternoon, or good evening, depending on where you are in the world. Welcome, everyone, to our session today. I hope you're ready to learn a bunch more about AI, about Azure, and how we can do that with .NET. We're going to be covering a bunch of different stuff today: we'll be going through one of our Learn modules in the Microsoft Learn environment, to help you keep going on your journey as you learn more about AI development, and show how you can get that award at the end of this series we've been running. My name is Aaron; I'm a Principal Cloud Advocate on the .NET on Azure advocacy team here at Microsoft, and I'm joined by my colleague Justin. Justin, do you want to introduce yourself to the audience?
  • 00:02:08
    Justin: Sure. Hello everyone, I'm Justin, on the same team as Aaron. Today we're going to look at the Azure OpenAI Service with the RAG pattern. I'm excited to share this information with you, and I hope you are too.
  • 00:02:30
    Aaron: All right, let's kick this session off.
  • 00:02:34
    Aaron: Whoops, forgot that we've got a producer backstage bringing screens up for us, and I clicked the button we'd just discussed they were handling. You can tell this is being done live. Yes, as Justin said, we're going to be talking about the RAG pattern with Azure OpenAI Service, with RAG standing for retrieval-augmented generation. It's an approach to enriching our large language models, our LLMs, with additional data to make them a bit more contextual about what we're doing.
  • 00:03:03
    We've been running a series over the last week or so on getting started with .NET and AI. If you want to go back and catch any of the sessions you've missed, you can get to those on our YouTube channel via the links on screen. And remember, at the end of this, if you complete the challenge and take the assessment, there's a virtual certificate you can get for all the hard work you've put in. There's the list of the previous sessions; each of them builds on the knowledge we'll be touching on today, and they run in conjunction with our Learn modules. We've got one more session coming up in a couple of days' time. Those times and dates are US-centric, so if you are, like Justin and myself, not in the US, just make sure you translate them into your local time and local day of the week.
  • 00:04:04
    Aaron: So, what are we going to be doing today? Justin, do you want to tell the audience a bit more about the module we're going to be covering?
  • 00:04:09
    Justin: Sure. When you visit the aka.ms link on screen for the Gen AI RAG module, you'll see the Microsoft Learn module page. Let me share my screen... okay, here we go. This is the Microsoft Learn module we're going to work through today: the retrieval-augmented generation pattern with Azure OpenAI Service. It takes about half an hour, so you can easily follow the material along with a few exercises; it'll be a really good example for us. So that's what we're going to do today, this Microsoft Learn module. If we go to the next slide... then, Aaron, what is RAG?
  • 00:05:15
    Aaron: Sorry, I put myself on mute; we had a bit of noise in the background. So if we think about it, and we'll go through this in a demo in a moment, we can see the workflow you might go through. When we're talking about working with large language models, they've been trained on a whole bunch of information; that's what makes them work the way they do. But that information isn't really contextualized around your business problems or your business's specific bits of information, and that can mean that when we ask the models for completions, they just give us answers based on what they've been trained on. So how do I make sure the answers are accurate and consistent with the sorts of things I want the users of my system to know about?
  • 00:06:02
    Well, this is where the retrieval-augmented generation pattern comes in. Rather than building our own large language models with our organizational knowledge embedded, which is costly, time-consuming to produce, and leaves us something we have to keep maintaining, when we send a prompt, the question from the user, up to our large language model, we can attach some additional contextualized information. We might go out to a database and pull back product records and product information relevant to the query they've asked, and have the large language model use that in producing its output, rather than having it fabricate something. Or, as we're going to see through this Learn module and the demos Justin and I will tackle shortly, we can leverage something like Azure AI Search as a more capable search index sitting behind this, and bring in, say, PDFs, Word documents, and the like, not just structured data inside a database.
  • 00:07:06
    Justin: I have a question about that.
  • 00:07:09
    Aaron: Oh, yes, go ahead.
  • 00:07:13
    Justin: People often want to know the differences between fine-tuning and RAG. What are they?
  • 00:07:22
    Aaron: That's a good one, and it's a question you're going to end up asking yourself as you build out a system that might use some of these technologies. With a fine-tuned model, we take an existing model, say a GPT model from OpenAI, and create our own customized version on top of it: we provide it with that information and retrain the model so that it knows about the things that are contextually relevant to our business or our customers' needs. That can be a good way to embed that knowledge inside the large language model, but the downside is that we've now created a new static point in time. What happens if we've trained it on a product database and we then add some products, deprecate others, or want to update a product description? We have to retrain the model, tune it again on the new information.
  • 00:08:21
    RAG, on the other hand, doesn't train the model. Instead, we provide it with that information at the point in time when we ask it for a completion. It's not the same as training; it's attaching additional contextual knowledge to the request going to the model, and the completion it produces can use the information we've given it. This means we take nothing other than a base model, like a GPT-3.5 or GPT-4 model we can get out of Azure OpenAI Service, and just supply additional point-in-time knowledge as we make requests. We're not doing any additional training of the model, but we still get the value it represents. Now, there are pros and cons to which one you should use and when, and they're beyond the scope of what we've got time to cover here; we could spend the entire hour just on the differences and use cases of fine-tuning versus RAG. But at least that should give you a starting point for making that decision yourselves and testing which one is right for the problems you're working with.
  • 00:09:33
    Justin: Thanks, Aaron, for the explanation.
  • 00:09:39
    Aaron: So, the basic steps when working with RAG: we start with the prompt that's come from the user, the question they've asked, the thing you want the LLM to answer. We then do some processing of that; this might even mean asking an LLM to give us some insight into the question, to determine what might be relevant to what they're looking for. We don't want to upload an entire product database every single time the user makes a request. Say they want to know about a particular city, and we're representing a travel company: we don't want to upload every single possible city on every request, because we have a finite number of tokens we can provide in an LLM request. So instead we work out the right city. Are they only asking about London, for example? Then we find the information we've tagged as relevant to London and attach it to the original prompt the user provided, maybe with some other contextual information for our system, so we can guide the model to give us the kinds of answers we want. Then we send all of that up to Azure OpenAI Service, and the response will take that additional information into account before going back to the user. Sound right?
  • 00:10:57
    Justin: Yeah, those six steps are the usual workflow for using the RAG pattern with a large language model.
  • 00:11:12
    Aaron: Precisely. And like I said, it's just as simple as those six steps; we'll show you in the demos in a moment just how easy it is.
  • 00:11:19
    Justin: Sure. Speaking of which, shall we jump over to the demos?
  • 00:11:23
    Aaron: Let's get started. Before we actually dive into how we can implement RAG with Azure OpenAI Service and Azure AI Search, I want to show you why we would want to use RAG in this kind of situation. What I've got here is a Polyglot notebook. It's an extension for VS Code; if you haven't used it, I recommend checking it out. It's a great way to create little scripts you can check into source control, just steps you want to execute. I use it a lot for prototyping, particularly for this AI stuff, because it means I don't have to spin up whole console applications; I can just execute a bunch of sequential steps.
  • 00:12:04
    The first thing I'm going to need is the NuGet package for Azure OpenAI Service, so I'll require that here. I'm using the pre-release version 2 NuGet package, and we'll execute that step, which downloads the package from NuGet. There we go, the package is installed. Then I can add some using statements: I need .NET Interactive so we can do a few things in a moment, and I also want the Azure OpenAI namespace and the OpenAI chat namespace, so I'll bring those in.
  • 00:12:39
    Now I need some information from, well, me as the user, about how to connect to my OpenAI service: the OpenAI endpoint and the key we'll be accessing it with. Then I'll create an Azure OpenAI client using that endpoint and key, and get the deployment name inside Azure OpenAI that we'll be working with. So let's get our key, which I'll copy from my Azure portal, and then my endpoint. You'll see that I've created a password prompt inside the Polyglot notebook, because I didn't want to publish my OpenAI key to everyone watching online today or anyone watching the recording later; the endpoint is just a normal string. Now I need my deployment name, so I'll come over here and copy that one in. For this we're going to use a GPT-3.5 Turbo 16k model; it's a fast model and a good general-purpose one.
  • 00:13:50
    All right, now that we've got our Azure OpenAI client, we can start working with it. First I need the prompt I want the model to complete. As I said before, let's pretend we're a travel agency, so we want contextual information about a city someone might want to visit. I'm going to ask it a question along the lines of "tell me about London". We then create a chat client using the GPT-3.5 deployment I provided, and ask the chat client to complete that chat. I'm just giving it a user message; I haven't provided any grounding information or a system prompt or anything. I'm just saying: here's the message, give me a completion based on that. So: tell me about London.
  • 00:14:43
    This sends the prompt up to our model, and because we haven't provided any additional information, no grounding, no fine-tuning of this model or anything like that, it's just going to tell us generic information the model knows about London. "London is the capital and largest city of England and the United Kingdom," and so on and so forth. You can see it's given us some good general-purpose knowledge about London, which is fine. But I run a travel agency: I want people to know about things they can go and see, and I want to recommend things they can do, through the brochures we have and the information inside our organizational knowledge base. So, Justin, how would we go about providing that knowledge here using Azure AI Search?
  • 00:15:32
    Justin: That's a really good question. When Aaron uses the large language model through Azure OpenAI, it returns generic information based on its training. If, as a travel agency, we want to provide more specific information, let me change the screen and I'll show you how to do that in Azure AI Search. Here I have a storage account, and in the storage browser there's a container to which I've already uploaded a few files: PDF files about cities, such as Dubai, Las Vegas, London, and so on. This is all the information we'll provide for the search.
  • 00:16:30
    I also have an AI Search resource. When we open up the AI Search instance, there's a menu item called "Import and vectorize data". Vectorizing means the imported data is converted into numbers, so that the large language model can work with it. So, here we go: there's the storage account and there's the container, so I can just use it. I give Azure AI Search the data source, then choose the large language model to use for text embedding, and just click Next; it's really simple. Now I'll give it a name, the Margie's Travel index, and create it. It takes a bit of time to import and vectorize all the data. So we've now provided all that information in the search instance, and with it Aaron will be improving what our AI Search-backed answers can do. Aaron, are you ready?
  • 00:18:09
    Aaron: Yep. What's important to note about the data Justin just uploaded to AI Search through the Azure portal is that those were PDF documents. AI Search extracts all of the text out of those PDFs and then, as you mentioned, vectorizes it, which essentially turns it into large series of numbers we can perform searches against. That's a step in the RAG pipeline we're going to go through: we try to find things that are semantically similar, based on those numbers, to the prompt we initially asked, and once we find them we attach that information to the overall prompt we send to the model to get our completion, and hopefully get more contextually relevant information about London or any of the other cities we could have searched for. And notice we've done all of that just by uploading PDFs. We haven't had to write a PDF parser or anything like that; we haven't even had to tell it the structure of the PDFs. We're able to use the power of Azure AI Search to extract that, because it understands how to work with PDFs.
  • 00:19:23
    So let's go back to my screen and look at what we'd change in the code now that we want AI Search as an additional component inside our AI request pipeline. I've got a new Polyglot notebook open here, and we'll go through the steps again: bring in that NuGet package, bring in our using statements, and set up my OpenAI key and so on. Let's find my key... whoops, where's my mouse... there it is. Paste in the key, paste in our endpoint, and then paste in our deployment name. Okay, I've got my AI client ready to go again. But because I want to connect it to that search index, I'm going to collect some additional information. In an actual production application this is the kind of thing you might pull from Key Vault, or you might use a managed identity to connect; obviously I'm just running a little scripting notebook here. This time I want the endpoint for my Azure AI Search, the key to connect to it, and lastly the index Justin just uploaded those documents into, where the vectors live. Let me grab those: where is my URL... there it is, got to click around. We'll get our endpoint, grab my key, and again this one's a password so you can't see it, and then lastly grab our index. There we go. Okay.
  • 00:21:15
    Now, the difference from when we created our chat request before is that back then we didn't provide any options with the request. This time I'm going to create a new set of chat completion options and add a data source to those options: the Azure Search chat data source. There are a couple of different data sources you can work with, it understands databases and things like that, but I'm using Azure Search because, well, that's what we're working with here. I provide it with the endpoint and the index, and I tell it how to authenticate. Again I'm using a key here, but you could use a managed identity, which we'd be more likely to use in a production environment. I'll note that this is a preview feature inside the SDK, so it produces a warning; I'm just disabling that warning, because otherwise it would bring up a whole bunch of squigglies and warnings here, and we want to hide those where we can. Let's run that and create our options.
  • 00:22:23
    Now, getting the completion back is very similar to what we originally had: we get the input from the user, create our chat client using the particular deployment we specified, and ask the client to complete the chat. We give it a user message, which is the text, but we pass those options as a second argument to the complete-chat request. Let's output what we get this time. We'll execute that and say: tell me about London.
  • 00:23:02
    This time, remember, we've provided those options, so it's going to connect to our data source and pull back additional information. Here we see a slightly shorter response from the large language model, and we'll also notice it contains citations like [doc1] and [doc2]. It's giving us a link to a travel website, which is our company's travel website, and some specific things: the River Thames, a particular hotel that we recommend, and so on. Now, those citations on their own aren't particularly useful; what does "doc 2" mean in the context of this chunk of text? Nothing, by itself. But what we can do is unpack some additional message context from the completion we got back, the response value of our complete-chat request. Using the Azure SDK, we can ask it for the Azure message context; again, this is a preview feature, so I'm disabling that warning. From this Azure message context we can inspect additional information about the RAG side of the request: any intents it determined from the request that went up, and, in particular, the citations. So we'll iterate through the citations and output them. Let's kick off a run of that.
  • 00:24:44
    I'm just outputting the content of each citation, but the citation object has a whole bunch of properties we can get; I should have brought those up before we looked at it. There's the chunk, which references that point in the document; the path to the file, so if we wanted people to be able to download the appropriate PDF we uploaded, we could give them that; the title of the PDF; a URL where they might get it from instead, if it was, say, a website we'd indexed rather than a document; and obviously the textual content. So here we are: these are the different bits of information found in the documents we uploaded that are relevant to London. We can see the various hotels it referenced: the Buckingham Hotel near Buckingham Palace, the City Hotel, the Kensington Hotel. There's our website, and additional information such as the URL of the blob that we can go and look at.
  • 00:25:48
    Whoops, I realize I've scrolled way past where I thought I was. There we are. All that information, like I said, has been extracted from the PDFs. And I accidentally scrolled much further down than I meant to before showing the results, because there is the intent: "tell me information about London", "information about London", "facts about London". These are the things it was attempting to find with the large language model, the problems it was trying to solve for us, and the citations are the things it used to complete the response. That's how we're able to get back relevant information such as particular document references and the particular hotels we recommend as an organization, and so on.
  • 00:26:35
    But how easy was that? We really only added a couple of lines of code and uploaded some PDFs to a search indexer, and all of a sudden we've gone from a very generic response about a city, with no reference to a particular hotel you might want to stay at or anything like that, to something very contextualized to our organization and the things we can provide through our services as a company. Like I said, pretty easy to get up and running, isn't it, Justin?
  • 00:27:10
    Justin: That's correct. I'm really impressed that within Azure OpenAI Service, when we send a very simple prompt, it automatically derives some sort of intent, does its analysis over the documents, and based on that returns a structured response. It's really good.
  • 00:27:34
    Aaron: Yeah, exactly. And like I said, this lets us avoid the complexity of retraining a model. We can always re-upload PDFs, or add more PDFs and supplementary documents about the locations we support as we expand into other cities. We just add them to the search index, and that knowledge immediately becomes available through the calls happening inside our application.
  • 00:28:04
    So what should people do next if they want to learn a bit more about this through our Learn module?
  • 00:28:16
    Justin: Let me share my screen. This is the entire learning path for Microsoft .NET focused on AI. The first module covers the Azure OpenAI Service, we are here now, and finally we have one more session left, which is responsible AI. That means using these AI models with more care, and it will be covered in the next session.
  • 00:28:52
    Aaron: Exactly, yeah. And that's a really important one: anyone looking to use these AI tools inside applications they're building should definitely join in for the responsible AI session, or catch the recording if it's not at an appropriate time of day for you, because that's a really critical part of building AI systems.
  • 00:29:13
    Justin: That's right. So, going back to the last slide... yes, this is the one. These are the most important things for everyone to get involved with. The first link: if you've missed any of our past live streams, that's the right place to go. The second one is our challenge, and the last one is the most important part: if you're confident in what you've learned and want to take the assessment, that's the one, and you'll get a digital badge so you can brag about your achievement on your social media.
  • 00:30:02
    Aaron: Sounds good. Well, if there are any questions, feel free to drop them into the chat for us, but that covers everything we were looking to go through today. The Learn module we have as part of today's session: you can go through it, and you can spin up a lab environment where you can actually play around with this with Azure OpenAI Service. If you've got an Azure account with the OpenAI service available in it, the lab environment will set up the codebase and provide you with all the documents and everything you need. You can also do it without the lab environment; the Learn module has the links if you want to work self-paced outside our environment, or use your own machine, or whatever the case may be. Essentially, what we've had here in these Polyglot notebooks is also available as a console application you can play around with, and the PDFs we used with Azure AI Search are available to download and upload to your own instances. Now, I see we have a question that's just come in on the chat.
  • 00:31:17
    There's a question about best practices for identity and access management with Azure OpenAI resources. Our recommendation is to use managed identities wherever possible to work with these resources. With the services we connected to today, we obviously went down the path of key-based authentication, but that was just for simplicity's sake, to avoid going through the process of setting up managed identity within this kind of time-constrained session. You can definitely do this with managed identity; in fact, that's normally how we run these demos, and if you check out some of our more complex sample applications, they'll be using managed identity. If you don't want to, or for whatever reason can't, use managed identities for access control to resources inside your Azure environment, I'd recommend using Azure Key Vault to store the credentials for the services you're connecting to: the API key, the endpoint value, and possibly even the model deployment name. Key Vault can be consumed very easily by apps deployed to Azure, whether directly through the Key Vault connections you can have in App Service and the like, or by bringing it into the .NET configuration pipeline. Managed identity would always be my first port of call, but if for whatever reason I need to fall back, I'd make sure I'm using Key Vault as the storage model for that kind of thing.
  • 00:33:03
    Justin: Yeah, that's really good. I think that's it from us.
  • 00:33:10
    Aaron: Yep, sounds good. Well, everyone, thanks for joining us this morning, this evening, or this afternoon, whatever it is for you. We look forward to seeing you on the next one of these episodes in two-ish days' time; again, it depends on your time zone exactly how close that is to two days.
  • 00:33:32
    Aaron: Thanks for joining us, and thank you, Justin, for joining me today.
  • 00:33:37
    Justin: Thanks, everyone, for watching. Bye!
Tags
  • Azure
  • AI development
  • RAG pattern
  • Azure OpenAI
  • Microsoft Learn
  • fine-tuning
  • Azure AI Search
  • AI integration
  • AI model training
  • managed identities