00:00:11
Hello, good morning. My name is Peggy Tsai from BigID, and today I will be talking about AI-powered data discovery for building a modern data governance program. Before I joined BigID a year ago, I was in your position: I helped run and operationalize data management programs, mainly in the financial services industry. My most recent role was at Morgan Stanley, where I helped operationalize their data governance program for wealth management. I also worked on other regulatory and business initiatives that focused on understanding critical data elements, ensuring data remediation, and monitoring data quality. I also recently co-authored The AI Book, which was published last year, in May 2020. So a lot of what I'll be talking about today is based on my experience as a data steward working in financial services, and the complexities I personally felt when building out and broadening our data governance program.
00:01:16
When we talk to customers today, and based on my own experience as well, the main challenges with data governance are finding your data, knowing where your data is, and being able to understand the quality of that data.
00:01:30
I like to picture an organization's data sources as an iceberg. The part of the iceberg above the water is normally much smaller than what's actually beneath it. Above the water are the known data sources in your organization, the ones that are inventoried and cataloged. That's mostly structured data, and it's what your technology teams and data management teams work with and are able to understand, because it's all inventoried and cataloged.
00:02:05
But what about the data that resides beneath the water? That's where a lot of the risk lies, because the data that's not known often poses the larger risk to your organization. This is most often unstructured data, dark data, data that hasn't been labeled, classified, or fully understood. It's difficult to understand simply because of the sheer quantity of this data and the resources it would take to find and identify it. But there's really an opportunity here, not just a challenge: the opportunity to break down the silos of the data, whether it sits within one part of the organization or simply hasn't been used, and to bring it all together into one cohesive data governance program, managing it for all business purposes.
00:03:12
Now, the way technology teams traditionally find data is by writing code or a query to extract it, and that's fairly simple when the data is known and in a structured format. But again, the challenges and the risks lie where the data is more in document form, or where it hasn't been fully classified yet.
00:03:47
When we talk to customers about finding their critical or sensitive data across all their data sources, they rely on traditional approaches. In addition to querying data in a straightforward manner, there are other techniques used in the marketplace today to find the data: matching on patterns, for example, since a credit card number has a specific pattern in its sequence of digits, and regular expressions, another way to find things like locations, such as the city, state, and zip code of an address.
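To make that concrete, here is a minimal sketch of pattern-based discovery in Python. The regexes and the Luhn checksum filter are illustrative simplifications, not the classifiers any particular product actually ships:

```python
import re

# Illustrative patterns only; production classifiers are far more robust.
PATTERNS = {
    # 13-16 digits, optionally separated by spaces or dashes
    "credit_card": re.compile(r"\b(?:\d[ -]?){12,15}\d\b"),
    # A simplistic "City, ST 12345" fragment of a US address
    "city_state_zip": re.compile(r"\b[A-Z][a-z]+, [A-Z]{2} \d{5}(?:-\d{4})?\b"),
}

def luhn_ok(number: str) -> bool:
    """Checksum that real card numbers satisfy; filters random digit runs."""
    digits = [int(d) for d in re.sub(r"\D", "", number)][::-1]
    total = sum(digits[0::2]) + sum(sum(divmod(2 * d, 10)) for d in digits[1::2])
    return total % 10 == 0

def scan(text: str) -> dict[str, list[str]]:
    """Find every pattern match; validate card candidates with Luhn."""
    hits = {name: rx.findall(text) for name, rx in PATTERNS.items()}
    hits["credit_card"] = [c for c in hits["credit_card"] if luhn_ok(c)]
    return hits

print(scan("Ship to Springfield, IL 62704. Card: 4111 1111 1111 1111."))
```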
00:04:37
Another approach, speaking as a former data steward, is building out a data dictionary that helps find the data. This involves manually tagging the data and applying labels to it. Again, these are very time-intensive, resource-intensive activities, all to attach the metadata that describes your actual data. There are various levels of categories with regard to the risk level, the usage level, and the actual content level, so these are very manual approaches to tagging the data.
00:05:24
Now, when I think about a data governance program, which is usually led by a chief data officer (and many organizations have a chief data officer today), the way they run their program, whether they're looking at their data quality, their business glossary assets, or their data issues, they normally have to go to different applications to see this information. It's disjointed, and it leads to an inability to really make a decision and understand the health of the entire data organization. So what's lacking today is what I call a control center: a comprehensive data dashboard that can really bring together the catalog, data quality, and data-issue remediation into one singular place, so that the chief data officer and the data teams can gain the right insights into their data and also take the right actions.
00:06:29
Now I'm going to talk about some of the parts of a data governance program that I myself have personally been involved in. First, the data catalog. People sometimes call this a business dictionary, a data dictionary, or a precursor to an inventory. It's really a single place, a listing of all the data assets an organization has across its landscape. This can include the business name and the definition, and it also identifies the actual location of the data. That's really important: as part of a data governance team, you need to know not only what you have in terms of your logical data assets, but also where they reside.
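A minimal sketch of what one such catalog entry might capture is below; the field names are illustrative, not any product's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class PhysicalLocation:
    """Where a logical asset is actually instantiated."""
    data_source: str          # e.g. "crm_postgres"
    table: str
    column: str

@dataclass
class CatalogEntry:
    """One logical data asset plus what a steward records about it."""
    business_name: str
    definition: str
    sensitivity: str          # e.g. "PII", "confidential", "public"
    locations: list[PhysicalLocation] = field(default_factory=list)

email = CatalogEntry(
    business_name="Customer Email Address",
    definition="Primary email used to contact a customer.",
    sensitivity="PII",
    locations=[
        PhysicalLocation("crm_postgres", "customers", "email"),
        PhysicalLocation("marketing_dw", "contacts", "email_addr"),
    ],
)
```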
00:07:24
Now, this single view of data that I think a lot of organizations are really looking to build involves collaboration. Mainly these are folks with the title of data steward, who work together to integrate and enrich these logical assets and really bring more value to the data. Because they're enriching the data, they're providing more context on how the data is actually used in the business sense. All of that information and knowledge should be collected and documented in a single place.
00:08:02
Another big component of a data governance program is data quality, and sometimes people see this as the most important component. The reason is that it's very measurable: it's the most visual way of seeing the progress of a data governance program. Whether it's trends, or seeing the data quality score increase, decrease, or change, it really gives the chief data officer and the data team something actionable to work on. So data quality, and being able to see it holistically, is very important.
00:08:38
Now, to begin with, in any data governance program, before you actually find the data and know your data, we focus a lot on data discovery. I think it's an important concept for having a strong data management program, because you really need to know where all your assets are in your organization. One of the biggest pain points chief data officers have shared with us, the kind of thing that keeps them up at night, is: what new data is being created or ingested in my organization that I'm just not aware of? Referencing back to the iceberg: what is the data below the water line that I'm not sure I have, that I'm not sure is part of my governance processes? They want to make sure they have a complete understanding of the coverage of the data within the organization.
00:09:41
And secondly, automated data discovery is really the ability to automate a lot of the manual tasks that go on today within a governance team. For example, not only do they have to know what their data is, they have to know where their data is: linking their logical assets to the physical objects, down to which tables and columns something like an email address is saved in. Your customer database, your sales and marketing systems, and your financial database may all have this information, and you need to be able to document all the instantiations of this data. But you don't want to do this manually, and oftentimes this is a joint manual effort by your teams. This is where automation can really help modernize and reduce a lot of these activities. And by bringing in automation, you're able to put in a process of continuous monitoring: you don't have to rely on a person to do the checking, curation, and validation of new data that may have popped up in your organization. Being able to show an auditor that you have an automated, continuous monitoring process certainly helps increase the overall maturity of your program.
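Here is a minimal sketch of that logical-to-physical linking, assuming for illustration a SQLite database and a simple email regex; a real scanner would cover many source types with far stronger matchers:

```python
import re
import sqlite3

EMAIL_RX = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def find_email_columns(conn: sqlite3.Connection, sample_rows: int = 100):
    """Sample every column of every table; report where emails live."""
    hits = []
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        cols = [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]
        for col in cols:
            sample = conn.execute(
                f'SELECT "{col}" FROM "{table}" LIMIT ?', (sample_rows,))
            if any(isinstance(v, str) and EMAIL_RX.search(v)
                   for (v,) in sample):
                hits.append((table, col))  # one physical instantiation
    return hits
```

Each (table, column) hit can then be attached to the logical "email address" asset in the catalog automatically.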
00:11:10
So now I'm going to talk about BigID data discovery itself. This will be the one or two slides talking more about the specific product, and you can use it as a comparison for how we do data discovery. First of all, the most important thing is extensible data coverage, meaning being able to connect to all data sources. You don't want to build a governance program that's very siloed, that only focuses on one or two data sources, because it's not going to be leverageable and scalable across your entire organization. So being able to connect to all of them, your Hadoop clusters, your SAP and Workday systems, Google Cloud if you're using it, all the possible data sources need to be connected, and the results produced in what we call a catalog. The catalog is really our ability not only to collect your metadata, your technical, business, and operational metadata, but also to bring your structured and unstructured data together into one view.
00:12:26
Next, classification. This is really important, because I remember as a data steward spending manual time looking through each of the data values and identifying the sensitivity level. Being able to automate the sensitivity level, the risk level, and the actual content, and to classify and label all of it through machine learning, is such a time saver today, and it's much more efficient. Leveraging machine learning to do this will help complete the classification of all the data in your organization, both structured and unstructured, including documents. That's really important because it provides faster time to value within your organization: the data can be consumed faster and earlier in the data life cycle by your analytics and data science teams. So we see really big benefits in terms of classification.
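As a minimal sketch of what ML-driven content labeling looks like, here is a toy scikit-learn text classifier; the training snippets and labels are invented for illustration, and real classifiers are trained on far richer corpora:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy labeled snippets; a real system learns from far more examples.
texts = [
    "patient diagnosis and medical record number",
    "employee home address and date of birth",
    "quarterly widget shipment totals by region",
    "office cafeteria menu for next week",
]
labels = ["sensitive", "sensitive", "internal", "public"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)

# Label a previously unseen data value or document snippet.
print(clf.predict(["customer date of birth on file"]))
```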
00:13:36
Cluster analysis is one of our patented methodologies for leveraging machine learning to understand groupings of your data that are probably duplicates or very similar. Right now we see it mostly in the unstructured world, where you want to know how many copies of a document or an Excel file you have saved. That's really important when you're talking about saving storage space: you want to cut down on how many duplicates you have. And if you're doing a data lake or cloud migration, you really want to keep only the golden copies of your data. Being able to do that analysis in a smart way is where cluster analysis can really come in.
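The patented method itself isn't spelled out here, so as a stand-in, this is a minimal sketch of grouping near-duplicate documents by TF-IDF cosine similarity with an arbitrary threshold:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = {
    "report_v1.txt": "Q3 revenue grew 4 percent, driven by retail.",
    "report_v1_copy.txt": "Q3 revenue grew 4 percent, driven by retail.",
    "memo.txt": "Reminder: submit travel expenses by Friday.",
}
names = list(docs)
sim = cosine_similarity(TfidfVectorizer().fit_transform(docs.values()))

# Greedily group documents whose pairwise similarity clears a threshold.
THRESHOLD = 0.9
groups, seen = [], set()
for i, a in enumerate(names):
    if a in seen:
        continue
    group = [a] + [b for j, b in enumerate(names)
                   if j > i and b not in seen and sim[i, j] >= THRESHOLD]
    seen.update(group)
    groups.append(group)

print(groups)  # the two identical reports land in one group
```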
00:14:33
And lastly, correlation. This is really critical for compliance with privacy regulations like GDPR and, in the United States, the California Consumer Privacy Act. It's really helpful because you not only have to identify what is personal information within your data organization; there's also the concept of personally identifiable information. Information ranging from your health and medical records to your cookie settings and IP records: these are identifiable pieces of information that may not necessarily be tied to an individual in the database where they're saved. The same information is probably saved in other databases too, but it still describes a single person or entity, and therefore it needs to be produced when a data subject access request comes in. That's where it's tricky for technology teams, because they can't find this information directly: these are indirect attributes that tie back to a person. So correlation has been really key in helping many organizations be compliant with privacy protection laws, and in being able to find and correlate all related information that's tied to a person.
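A minimal sketch of that correlation step, with invented in-memory "databases" standing in for real sources: indirect identifiers (a shared email, then a cookie ID) chain records that never mention the person's name back to one individual, which is exactly what a data subject access request needs:

```python
# Hypothetical data stores; in practice these are separate databases.
crm = {"cust-42": {"name": "A. Jones", "email": "a.jones@example.com"}}
web_logs = [  # no name here, only indirect identifiers
    {"ip": "203.0.113.7", "cookie_id": "abc123",
     "email": "a.jones@example.com"},
]
ad_profiles = [{"cookie_id": "abc123", "segments": ["runners", "travel"]}]

def dsar_bundle(customer_id: str) -> dict:
    """Gather everything correlated to one person via shared identifiers."""
    person = crm[customer_id]
    logs = [r for r in web_logs if r["email"] == person["email"]]
    cookies = {r["cookie_id"] for r in logs}
    profiles = [p for p in ad_profiles if p["cookie_id"] in cookies]
    return {"crm": person, "web_logs": logs, "ad_profiles": profiles}

print(dsar_bundle("cust-42"))
```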
00:16:01
Now, when chief data officers and data management programs have this foundation, a single source of truth built on automation, machine learning, and smarter insights, then on top of that foundation we can build very specific capabilities for privacy, for security, and, when it comes to governance, smarter data quality and data retention rules. Because we're already able to read through the files, we know when a file was last updated or opened, and we can compare that to the relevant policy. So things that are already being done today with a catalog or a data quality rule can run more efficiently and scale better with the AI discovery foundation.
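For instance, a retention rule built on that file-level knowledge could be as simple as this sketch, where the per-label age limits are a hypothetical policy:

```python
import time
from pathlib import Path

# Hypothetical retention policy: maximum age in days per data label.
RETENTION_DAYS = {"customer_records": 7 * 365, "web_logs": 90}

def retention_violations(root: str, label: str) -> list[Path]:
    """Files under `root` whose last modification exceeds the policy age."""
    cutoff = time.time() - RETENTION_DAYS[label] * 86400
    return [p for p in Path(root).rglob("*")
            if p.is_file() and p.stat().st_mtime < cutoff]

# e.g. retention_violations("/data/exports", "web_logs")
```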
00:17:02
So, just to sum up a bit: machine-learning- and AI-led discovery leverages machine learning to identify the content faster and smarter, and to label it using natural language processing. Enriching the data is automated, based on our ability to leverage these classifiers. And you're able to take action on your data in a smarter way, because you know which clusters are duplicates, and everything correlated or related is tied back to a person or entity. So it's really about intelligent classification.
00:17:46
Now, some of the benefits we've really seen from our customers and from the many people we've spoken to: what are the business benefits? In terms of a catalog, many organizations have to document their processes and their information. Instead of doing that manually, a catalog that can be automatically collected and updated reduces the work that goes into this documentation process. And you want a catalog that has not only structured information, but lets you see all your data assets holistically in one place. If that's possible, then the impact one change can have on other applications and other documents becomes much easier to see, because you have a full understanding of the landscape. There's also the ability, within your catalog, to see not only the classifiers but really that one dashboard view I spoke about, where chief data officers can see their assets, see the classifications of what the data is, and also see the profiling statistics within data quality, to understand the completeness of the data: is this a data value with missing information that I should be alerted to, so I can take further action on it? That's the value of this type of information.
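A completeness statistic of that kind is easy to picture; here is a minimal sketch over an invented three-row table:

```python
rows = [
    {"email": "a@example.com", "phone": "555-0100"},
    {"email": None,            "phone": "555-0101"},
    {"email": "c@example.com", "phone": None},
]

def completeness(rows: list[dict]) -> dict[str, float]:
    """Fraction of non-null values per column; 1.0 means fully populated."""
    return {col: sum(r[col] is not None for r in rows) / len(rows)
            for col in rows[0]}

print(completeness(rows))  # both columns are 2/3 complete here
```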
00:19:25
And again, classification. One of the biggest pain points, I think, as a data steward, is not really being able to understand the data. Taking the time to review the data values and then confirm them with a subject matter expert takes time, and imagine doing that for every single data set and data element; it's really time consuming. So you want to leverage AI to your benefit and have these labels applied earlier in life cycle management, so that the data can be consumed by your data science teams, your analytics teams, or any other business user, and everyone is clear on exactly what this data is and what it should be used for.
00:20:20
I spoke briefly about clustering, and it's certainly been used in our cloud migration use cases. Many organizations are beginning, or are already on, their journey to the cloud, and it's really not a simple lift-and-shift activity. It takes careful analysis to prioritize the data sets that should be migrated over, and then, within each data set, to figure out exactly what should go into the cloud. Before that journey can be successful, you first need a very complete inventory of what you have, where it is, and exactly where it's being migrated. You also want to take this opportunity to clean up your data: being able to see all the duplicates, reduce the redundancies, and fix any data quality issues before you do the migration. These are the types of activities that need to be done, but how can you do them in a smart way? Certainly a technique like clustering can help you reduce the risk and make sure you're adhering to your organization's data policies while you're doing the cloud migration. And then, once the data is migrated to the cloud, you need to connect that data to your other data sources and structures and make sure everything stays in alignment. So another consideration to think about is the maintenance part of your cloud journey, even after the migration is done.
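For the exact-copy side of that pre-migration cleanup, as distinct from the near-duplicate clustering sketched earlier, a content hash is often enough; a minimal sketch:

```python
import hashlib
from collections import defaultdict
from pathlib import Path

def exact_duplicates(root: str) -> list[list[Path]]:
    """Group byte-identical files by content hash before a migration."""
    by_hash = defaultdict(list)
    for p in Path(root).rglob("*"):
        if p.is_file():
            by_hash[hashlib.sha256(p.read_bytes()).hexdigest()].append(p)
    return [paths for paths in by_hash.values() if len(paths) > 1]

# Migrate one "golden copy" per group and drop the rest from the plan.
```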
00:22:00
So now I'm going to talk about some of the specific use cases we've seen. Many of the organizations we've worked with started with the need to do a privacy initiative or some other type of regulatory-driven initiative. The first use case is a global athletic-brand retail company with over 150 billion dollars in revenue, 73,000 employees, and over 1,200 data sources in scope, with a need to comply with GDPR and their local privacy laws. That meant being able to look across their entire landscape and identify what is sensitive, and being able to classify and label specifically the personal information tied to a customer or employee, and also the personally identifiable information. This need to classify comes up in use cases across privacy, security, and data governance, as well as in organizations that have a dedicated classification project. From there, the results go into a catalog that can display the classifications, bring together the different product lines in the retail company, and let them align and manage not only their products but also their customer data and employee data, making sure they are in compliance with privacy regulations. Over the last few years, this company has also looked into expanding their data governance program, because they've seen the benefits of the data catalog and how it helped them identify sensitive, private data. They're expanding the definition of sensitivity to other types of critical data, around their products and shipping their products, and other types of critical and confidential data they want to make sure are also being monitored. Being able to classify and tag that has been quite important for them.
00:24:21
the second use case is much larger in
00:24:24
scale it's a global marketing
00:24:26
and digital brand portfolio company so
00:24:28
within this portfolio company
00:24:31
they have assets in the finance and
00:24:35
healthcare
00:24:36
and retail so it's really a
00:24:37
multi-billion dollar company with
00:24:39
multiple brands
00:24:40
that they have to keep a siloed but also
00:24:43
being able to
00:24:44
understand their privacy initiatives as
00:24:47
well
00:24:48
because of their status as a global
00:24:51
global company
00:24:52
there's multiple regulations that they
00:24:55
have to be
00:24:56
compliant with so this is how do they
00:24:59
juggle and manage
00:25:00
the fact that they're sometimes um you
00:25:02
know differentiating and
00:25:04
distinct regulations in each local
00:25:06
region
00:25:07
that they have to be mindful of so being
00:25:09
able to comply
00:25:11
with hipaa in healthcare and also
00:25:14
finding their personal information and
00:25:17
bringing together
00:25:18
their structure data and unstructured
00:25:21
data
00:25:21
you know have multiple
00:25:24
data sources across the globe that they
00:25:27
have to be too mindful of so
00:25:29
in terms of you know federation being
00:25:32
able to see
00:25:33
a singular view but also having the the
00:25:36
local view
00:25:37
in which the globe the local offices can
00:25:40
manage their data as well
00:25:44
So, really in summary here: a modern data governance program starts with data discovery. Knowing your data leads to knowing where your data is, which leads to knowing the quality of your data, and all three of those basic components start with data discovery. I'm advocating for expanding beyond structured data as your source: it's really in unstructured data where we see a lot of the risk and compliance teams raising their hands and flagging their issues, in terms of being able to have a complete, holistic view of their data, being able to identify it through classification, and being able to take action on it based on clustering and correlation. This is really where we see machine learning and AI driving these analyses, helping chief data officers modernize, scale, and grow their data governance capabilities much faster. We see a lot of traction in this area, and we welcome the opportunity to talk with anyone interested in learning more about BigID, data discovery, or the use cases we've seen benefit financial services, insurance, healthcare, and retail companies. Thank you everyone for your time today, and be sure to visit BigID at our virtual booth. Thank you.