#11. Event-based triggers in Azure Data Factory | Azure Data Factory

00:11:14
https://www.youtube.com/watch?v=3IRmuPaauYU

Summary

TLDR: The video tutorial shows how to set up event-based triggers in Azure Data Factory (ADF) to run pipelines automatically in response to events, such as file operations in Azure Blob Storage. A practical example walks through uploading a CSV file to Azure storage, which triggers a pipeline that converts the CSV to JSON. The process covers creating a storage account and containers, building the pipeline in Azure Data Factory, and creating and configuring a trigger that responds to storage events. The tutorial emphasizes using wildcard file paths to process only specific file types, like CSV, and shows how to create, edit, and delete triggers. The topic is noted as a common subject in Azure Data Factory interviews.

Key Takeaways

  • 💡 Event-based triggers automate pipeline execution.
  • 🔄 Converts CSV to JSON in Azure Data Factory.
  • ⚙️ Key feature for automation and interviews.
  • 📂 Utilize Azure Blob Storage for events.
  • 🖇️ Linked service connects pipelines to storage.
  • 🔍 Use wildcard paths for file selection.
  • 📈 Crucial for efficient data workflows.
  • 🗂️ Set triggers for specific file events.
  • 🔧 Edit and manage triggers in Azure.
  • 📜 Step-by-step practical guide included.

Timeline

  • 00:00:00 - 00:05:00

    This segment focuses on understanding event-based triggers in Azure Data Factory, which are a frequent interview topic. An event-based trigger automatically executes a pipeline in response to a specific event, such as a file being uploaded to or deleted from blob storage. For instance, a file uploaded to blob storage generates an event that can trigger a pipeline converting the data from CSV to JSON. The video outlines the setup steps: naming the trigger, selecting its type, and specifying the storage-related information. It also explains how different events, like 'blob created' or 'blob deleted', can be used to activate the pipeline. (A code sketch of creating such a trigger programmatically follows this timeline.)

  • 00:05:00 - 00:11:14

    The video then moves to a practical demonstration of the setup in Azure Data Factory. This includes creating input and output containers and a copy data pipeline that transforms CSV files to JSON. A trigger is configured on the storage 'blob created' event so that the pipeline runs whenever a new file is uploaded. After publishing the configuration, a CSV file is uploaded to test the setup; the pipeline run is confirmed by the JSON output appearing in the destination container, validating the event-based trigger. (A sketch of the test upload also follows below.)
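
For readers who prefer code over the portal UI, here is a minimal sketch of what an equivalent storage-event trigger could look like using the Azure SDK for Python (azure-mgmt-datafactory). The video does everything through the ADF UI, so this is only an illustration under stated assumptions: the subscription, resource group, factory, storage account, pipeline, and trigger names below are all placeholders.

```python
# Minimal sketch: create a storage-event (BlobCreated) trigger with the
# Azure SDK for Python. All resource names are placeholders; the video
# configures the equivalent settings in the ADF UI instead.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    BlobEventsTrigger, TriggerResource,
    TriggerPipelineReference, PipelineReference,
)

SUB_ID = "<subscription-id>"       # placeholder
RG, FACTORY = "my-rg", "my-adf"    # placeholders

client = DataFactoryManagementClient(DefaultAzureCredential(), SUB_ID)

# Fire on "blob created" events for CSV files landing in the input container.
trigger = BlobEventsTrigger(
    scope=(f"/subscriptions/{SUB_ID}/resourceGroups/{RG}/providers/"
           "Microsoft.Storage/storageAccounts/mystorageacct"),  # placeholder
    events=["Microsoft.Storage.BlobCreated"],
    blob_path_begins_with="/inputcsv/blobs/",  # container name is a placeholder
    blob_path_ends_with=".csv",
    pipelines=[TriggerPipelineReference(
        pipeline_reference=PipelineReference(reference_name="CopyCsvToJson"),
    )],
)
client.triggers.create_or_update(RG, FACTORY, "csv-upload-trigger",
                                 TriggerResource(properties=trigger))

# Event triggers must be started before they fire (the UI's "publish" step
# covers this; in the SDK it is an explicit long-running operation).
client.triggers.begin_start(RG, FACTORY, "csv-upload-trigger").result()
```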
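And a sketch of the test step itself: uploading a CSV into the input container so the BlobCreated event fires and the pipeline runs. The video performs this upload through the portal; the connection string, container, and file names here are placeholders.

```python
# Sketch of the test upload that fires the storage event.
# Connection string and names are placeholders.
from azure.storage.blob import BlobServiceClient

conn_str = "<storage-connection-string>"  # placeholder
service = BlobServiceClient.from_connection_string(conn_str)

blob = service.get_blob_client(container="inputcsv", blob="employee1.csv")
with open("employee1.csv", "rb") as data:
    blob.upload_blob(data, overwrite=True)  # emits a BlobCreated event
```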

FAQ

  • What is an event-based trigger in Azure Data Factory?

    An event-based trigger allows you to automatically execute a pipeline in response to a specific event or condition.

  • What business scenario was used in the video?

    The scenario involves uploading a CSV file to Azure storage, triggering a pipeline that converts it to JSON.

  • How do I create a storage account in Azure?

    The video assumes prior knowledge of this; in short, you search for 'storage' in the Azure portal and create a storage account.

  • What is the purpose of a wildcard file path?

    It is used to specify which files to process; in this case, only CSV files (see the matching sketch after this FAQ).

  • How can I delete a trigger in Azure Data Factory?

    Go to 'Manage', select 'Triggers', and delete the desired trigger.

  • What file types are being converted in the example?

    CSV files are converted to JSON.

  • Why are triggers important in interviews?

    Triggers are often discussed in interviews as they are a key feature of automation in Azure Data Factory.

  • How can I ensure only certain types of files trigger the pipeline?

    By using a wildcard file path, such as '*.csv'.

  • What happens after a CSV file is uploaded in the example project?

    The upload triggers a pipeline that reads the CSV and converts it to JSON.

  • Can the same linked service be used for different data sets?

    Yes, if both data sets are within the same storage account.
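
To make the wildcard answer above concrete, here is a small illustration using Python's standard fnmatch module, which performs the same kind of glob-style matching as ADF's wildcard file path (treating the two as equivalent for simple patterns like *.csv is an assumption).

```python
# Glob-style matching comparable to ADF's wildcard file path.
# (Assumes ADF's simple *.csv wildcard behaves like standard globbing.)
from fnmatch import fnmatch

names = ["employee1.csv", "employee1.json", "notes.txt", "sales.CSV"]
matches = [n for n in names if fnmatch(n, "*.csv")]
print(matches)  # ['employee1.csv'] -- fnmatch is case-sensitive on Linux
```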

Transcript (en)

  • 00:00:00
    Hello everyone. In this video we are going to understand event-based triggers in Azure Data Factory. Triggers are very important from an interview perspective: if you go for any interview, you will definitely get asked at least one scenario-based question about triggers. So in this video we will understand what an event-based trigger is and how it works, and then, based on a business requirement, we will do a small project to see how an event-based trigger works. That is the agenda.
  • 00:00:36
    Let us first understand what an event-based trigger is in Azure Data Factory. An event trigger is a type of trigger that allows you to automatically execute a pipeline in response to a specific event or condition. At a high level: we use triggers to execute or automate pipelines, and if you want to execute a pipeline based on a certain event, you go for an event-based trigger.
  • 00:01:10
    Now let's understand what the events are. I'll give you an example. Suppose I have a blob storage and I am getting files into it from external sources. If any file gets uploaded into that blob, an event is generated; if anyone deletes a file that is available in the blob storage, that is also an event. If, based on any of these events, you want to run a pipeline, that is an event-based trigger. For example, if anyone uploads a file into the blob storage, you may want to run a copy data pipeline that converts the data from CSV to JSON. That is the kind of business use case we will look at.
  • 00:02:02
    Before the project, let us understand how we can set up a trigger in Azure Data Factory. First, go into Azure Data Factory and create a pipeline. Once you click on the pipeline you can see Add trigger, which gives two options: Trigger now and New/Edit. Click New/Edit and you will get the option to create a new trigger. Click the + New button, give the trigger a name, and for the type you can select Storage events or Custom events; these are the event-based triggers we have. If you select Storage events, it will ask for your storage-related information: you select your storage account, pass your container name, and then you can see the events. One is "blob created", which means a file has been uploaded into that particular container; another is "blob deleted", for a file being deleted from the blob. These are the events we have, and if you want to run a pipeline based on them, you use this event-based trigger. I hope you got how we can create an event-based trigger; now let's do a practical and go through all the steps one by one.
  • 00:03:28
    Let me discard this, and let us understand our business requirement. We have an Azure storage account, and if anyone uploads a file into it, I want to trigger one pipeline: a copy data pipeline. What will this copy data pipeline do? It will read the CSV file from the Azure storage and convert it into JSON. So we will create one pipeline that converts CSV to JSON, and whenever anyone uploads a file into the Azure storage, this pipeline should trigger automatically. That is the use case we are going to build.
  • 00:04:32
    First I will show you the storage account. I have already created one; you search for "storage" in the portal (how to create a storage account we have already covered). Click on it, and we will create two containers. One is the input container: if anyone uploads a CSV file into this input container, I will run my pipeline based on that. The other is the output JSON container. So if anyone uploads a file into the input CSV container, my Azure Data Factory pipeline will run, read the data from input CSV, convert it into JSON, and store it in the output JSON container. That is the practical we will do.
  • 00:05:33
    Now I'll go to the Data Factory. The first step is to create a pipeline: click on Pipeline, click on New pipeline, then open Move and transform and click on Copy data. After that we create a dataset. Select the source and click New; our source is Azure Blob Storage, so select that and continue. The file is a CSV file, so select that. After that, click on the linked service; we are going to create a linked service, so select the subscription, select our Azure storage account, and click Create. Then select the file path: I select input CSV, which is our source, click OK, and click OK again. Our source is ready.
  • 00:06:36
    We have given only the input container, but I want a condition: if anyone uploads a file, only CSV files should be converted into JSON. To read only the CSV files from the container, I will use a wildcard file path. In the wildcard file path you can see the input CSV path; there you can pass *.csv. A bare * means it will read all files; *.csv means it will read only the CSV files. So if anyone uploads a CSV file, that CSV file will get converted into JSON. That is what we will do.
  • 00:07:21
    Now our source is ready; go to the sink. For the sink we also create a dataset. Our target location is blob storage, so I select that, and since I want to convert into JSON, I select JSON. I'll use the same linked service: if you have the same blob storage account, you can reuse the linked service because we are connecting to the same storage account; only the file path has to be different. The target location is output JSON, so I select that, click OK, and click OK again. Now the source and the sink are ready, and my copy data pipeline is ready.
  • 00:08:04
    Now we will create the trigger: this pipeline should run whenever anyone uploads a file into the storage account. For that, click on Add trigger, click on New/Edit, and since we are creating a new trigger, click Choose trigger, then New. Select the type as Storage events. You have to select the subscription, the storage account, and the container name (the container names appear in a drop-down), so I select the input CSV container. I select Blob created: only if someone uploads a file into this particular container do I want to run the copy data pipeline, which is why I'm selecting Blob created. After that, just click Continue, Continue, OK.
  • 00:08:55
    Once you have done all that, click on Publish all, then Publish. You can see it publishing; once the publish is completed, we will try to upload one file into the Azure storage, and based on that our pipeline should run. We have scheduled our trigger, and you can see the publish is now completed.
  • 00:09:28
    Now I will show you the Monitor tab. If I refresh, you can see there is no pipeline run right now. I will go to the storage account, into input CSV, and try to upload a CSV file; if I upload a CSV file into this particular container, my pipeline should run. So I upload one of the CSV files, an employee file, and click Upload. Now that I have uploaded the file, I go back to the Data Factory and refresh. Our pipeline should run because we have uploaded a file, and you can see it is running; we can verify it was started by the trigger we created. It will copy the data from CSV to JSON and store it into the output container. Let's wait, and you can see it has succeeded. To verify whether it got converted into JSON, look in the output folder: you can see employee1.json. The CSV file got converted into JSON, which means our pipeline is working.
  • 00:10:48
    Now, if you want to delete the trigger, click on Manage; here we have the Triggers option. From here you can also create a new trigger, or delete one if you want, otherwise it will keep on running. After that, you have to click Publish all. I hope you got the idea of how we can create this pipeline. That's it for this video; thank you.
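
The walkthrough ends by deleting the trigger from the Manage hub. For reference, here is a sketch of the same cleanup done with the Python SDK rather than the UI; resource names are placeholders, matching the earlier sketch.

```python
# Sketch of the cleanup shown at the end of the video (Manage > Triggers),
# done with azure-mgmt-datafactory instead. Names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

client = DataFactoryManagementClient(DefaultAzureCredential(),
                                     "<subscription-id>")  # placeholder

# A running event trigger must be stopped before it can be deleted.
client.triggers.begin_stop("my-rg", "my-adf", "csv-upload-trigger").result()
client.triggers.delete("my-rg", "my-adf", "csv-upload-trigger")
```
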
Tags
  • Azure Data Factory
  • Event-based Trigger
  • Automation
  • Pipeline Execution
  • Blob Storage
  • File Conversion
  • CSV
  • JSON
  • Cloud Storage
  • Data Workflow