"RDF and the future of LLMs" by Luke VanderHart

00:41:53
https://www.youtube.com/watch?v=OxzUjpihIH4

Summary

TL;DR: This talk explores the intersection of RDF (the Resource Description Framework) and language models, and how the two can be integrated. The speaker opens with two opposing, deliberately controversial claims: language models are a genuine technological advance, and AI as our culture currently uses it has harmful societal impacts. The discussion then turns to RDF's approach to knowledge representation, linked data, and the ideals of the semantic web, acknowledging that RDF has historically been tied to complex formats and costly enterprise tooling. Language models, built on the Transformer architecture, revolutionized natural language processing by capturing the grammar, semantics, and pragmatics of language. Emphasizing how RDF's structure aligns with what language models capture, the speaker argues that putting RDF data into prompts supports precise data handling, validation of model output, and inferential logic. This approach belongs to the broader neurosymbolic AI agenda, which aims to combine neural networks with symbolic logic. The talk concludes with a call to use the technology ethically and well, so that its benefits outweigh its drawbacks.

Highlights

  • 📊 RDF tackles complex knowledge representation issues.
  • 🧠 Language models revolutionize natural language processing.
  • 📉 RDF perceived as outdated but still valuable.
  • 🔗 Semantic web overhyped, underdelivered due to effort and incentives.
  • 🛠️ Language models function by predicting the next token in sequences.
  • 🔍 RDF aids in precise data queries and reasoning.
  • 🌐 Integration of RDF and language models enhances data handling.
  • 💡 Neurosymbolic AI aims to merge neural and logical systems.
  • 🤔 Mixed feelings about AI's societal impacts.
  • 🔨 Ethical and beneficial use of AI remains crucial.

Timeline

  • 00:00:00 - 00:05:00

    The speaker introduces the talk by acknowledging that it has been previously given to a smaller group and was well received. They express their interest in explaining the basic concepts of RDF and language models and their integration, starting with controversial statements about the value and challenges of AI technology, particularly its societal impacts.

  • 00:05:00 - 00:10:00

    The speaker discusses the Resource Description Framework (RDF) as a solution for knowledge representation and reasoning, conceived within the W3C, the group responsible for the web's core specifications. They outline RDF's technical achievements and core components, despite its perceived obsolescence next to newer technologies.

  • 00:10:00 - 00:15:00

    In this segment, the speaker explores criticisms and challenges RDF faces, such as its association with outdated formats and costly enterprise solutions. However, they note RDF's continued utility in complex data modeling across various industries, and address the misconceptions stemming from the overhype of the semantic web.

  • 00:15:00 - 00:20:00

    The speaker describes the elemental RDF concepts—resources and their descriptions—and how unique identifiers (IRIs) help overcome natural language ambiguity. They extend the discussion by linking RDF's principles to philosophical concepts of meaning and semiotics, highlighting symbolic connections and precision offered by RDF.

  • 00:20:00 - 00:25:00

    RDF's use of triples is explained as architecturally simple yet capable of detailed and precise data representation, which can integrate with various database formats. The speaker also touches on the ability to compose datasets without losing meaning, exemplifying RDF's flexibility and utility in unifying disparate data sources.

  • 00:25:00 - 00:30:00

    Attention is turned to entailment in RDF, a way of reasoning over data to derive new insights and ensure data validity. The talk diverges into a historical context, linking RDF development to symbolic AI's history, and touches upon its aspirations to encompass logical reasoning within interconnected systems referencing standards like first-order logic.
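    A tiny sketch of that entailment idea, assuming Python's rdflib library (the vocabulary is invented, and one hand-rolled RDFS-style rule stands in for a full entailment profile):

        from rdflib import Graph, RDF, RDFS, URIRef

        g = Graph().parse(data="""
            @prefix ex:   <http://example.com/> .
            @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
            ex:Composer rdfs:subClassOf ex:Person .
            ex:mozart   a ex:Composer .
        """, format="turtle")

        # Entailment rule: if ?x is a ?c, and ?c is a subclass of ?d,
        # then ?x is also a ?d. Deriving such triples is entailment.
        for x, c in list(g.subject_objects(RDF.type)):
            for d in g.objects(c, RDFS.subClassOf):
                g.add((x, RDF.type, d))

        # The derived triple is now in the dataset:
        print((URIRef("http://example.com/mozart"), RDF.type,
               URIRef("http://example.com/Person")) in g)   # True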

  • 00:30:00 - 00:35:00

    The speaker transitions to language models, starting with the breakthrough 'Attention is All You Need' paper that enabled modern NLP. Language models' applications are briefly showcased, along with their operational simplicity and underlying complexity. The speaker encourages exploring their operation via educational resources for deeper understanding.

  • 00:35:00 - 00:41:53

    The talk concludes with a practical view on using language models, positing that all advancements revolve around refining model inputs to produce superior outputs. By integrating RDF data in prompts, models can potentially bridge the gap between structured data and language processing, ultimately enhancing utility and reasoning capabilities.


FAQ

  • What are the two controversial statements made by the speaker?

    The first statement is that language models are beneficial and innovative. The second is that AI and language models have negative societal impacts such as environmental harm and misinformation.

  • What is RDF?

    RDF stands for Resource Description Framework, a model to represent data about resources in the form of relations, using triples: subject, predicate, and object.
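    As a minimal sketch in code (Python's rdflib; all IRIs and names are made up for illustration):

        from rdflib import Graph, Literal, Namespace
        from rdflib.namespace import FOAF

        EX = Namespace("http://example.com/people/")   # a made-up naming authority

        g = Graph()
        g.bind("foaf", FOAF)                 # prefixes keep long IRIs readable

        # Each fact is one triple: subject, predicate, object.
        g.add((EX.joe, FOAF.name, Literal("Joe")))   # object can be a literal
        g.add((EX.joe, FOAF.knows, EX.luke))         # or another resource

        print(g.serialize(format="turtle"))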

  • Why does the speaker have mixed feelings about language models?

    The speaker appreciates their technological advancement but is concerned about their environmental impact, effect on art and labor, and the negative culture around AI.

  • What is the relation between RDF and language models?

    RDF provides a framework for structured data which can be useful when integrated with language models, facilitating precise queries and data handling.

  • What is a language model?

    A language model is a computational model that predicts the next token in a sequence, capturing grammar, syntax, semantics, and pragmatics of language.
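    A toy sketch of that idea in Python: a pure function from a token sequence to next-token probabilities, iterated greedily to generate text. The model below is a fake stand-in; real models have the same shape, just with billions of learned constants inside.

        def toy_model(tokens: list[str]) -> dict[str, float]:
            # Hypothetical hard-coded probabilities standing in for the real math.
            if tokens[-3:] == ["had", "a", "little"]:
                return {"lamb": 0.9, "sister": 0.05, "star": 0.05}
            return {".": 1.0}

        def generate(model, prompt: list[str], n: int) -> list[str]:
            tokens = list(prompt)
            for _ in range(n):
                probs = model(tokens)                     # tokens -> P(next token)
                tokens.append(max(probs, key=probs.get))  # greedy: most probable
            return tokens

        print(generate(toy_model, ["Mary", "had", "a", "little"], 1))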

  • What are the difficulties with RDF today?

    RDF is seen by some as outdated: it started with complex formats like RDF/XML, and it is associated with over-architected libraries and costly enterprise-grade tools.

  • How does the speaker propose to make AI technology beneficial?

    By being mindful of impacts, avoiding hasty discussions on platforms like social media, and striving to make the technology as good as possible within its trade-offs.

  • What is the 'semantic web'?

    The semantic web is the idea of linking data globally so it can be easily traversed and unified. It was overhyped and didn't gain traction because publishing and consuming linked data required heavy manual effort and offered no immediate benefits.
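    A small sketch of that linking idea, assuming Python's rdflib and made-up data: because an RDF dataset is just a set of globally identified triples, two independent datasets can be set-unioned and the result is still meaningful.

        from rdflib import Graph

        mine = Graph().parse(data="""
            @prefix ex: <http://example.com/> .
            ex:joe ex:worksAt ex:acme .
        """, format="turtle")

        yours = Graph().parse(data="""
            @prefix ex: <http://example.com/> .
            ex:acme ex:locatedIn ex:london .
        """, format="turtle")

        merged = mine + yours   # set union of triples: no schemas to reconcile
        print(len(merged))      # 2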

Transcript (en)
  • 00:00:04
    um thanks for coming everyone this is a
  • 00:00:05
    version of a talk that I gave to a much
  • 00:00:07
    smaller group um several months ago uh
  • 00:00:09
    was pretty well received um it's what
  • 00:00:11
    I'm working on right now and so I'm just
  • 00:00:14
hopefully explain some of the basic
  • 00:00:16
    concepts and ideas we'll go over what
  • 00:00:18
    rdf is uh what language models are um
  • 00:00:21
    and then why they go well together which
  • 00:00:23
    I think is unintuitive to most people
  • 00:00:25
    unless you've already been thinking
  • 00:00:26
    along these lines um yeah so we'll go
  • 00:00:28
    from there going to start with a
  • 00:00:30
    controversial statement
  • 00:00:33
    um and I'm going to make two
  • 00:00:34
    controversial statements that are
  • 00:00:35
    mutually exclusive so you'll find
  • 00:00:37
    yourself in agreement with one and in
  • 00:00:38
    disagreement with the other most likely
  • 00:00:40
    or maybe you're like me and you feel the
  • 00:00:41
    tension so the first one is uh language
  • 00:00:44
    models are pretty neat and I like them
  • 00:00:47
    uh we never imagine that we'd be able to
  • 00:00:49
    compute over unstructured text like this
  • 00:00:52
    uh and it really feels like a technical
  • 00:00:54
    advance and I like technical advances I
  • 00:00:56
    think that technology on net is a real
  • 00:00:59
    Force for good um especially if you can
  • 00:01:01
    get the societal structures aligned so
  • 00:01:03
    definitely go vote in a few days um but
  • 00:01:07
    the interest I have in this specific
  • 00:01:08
    Tech is also deeply personal um my
  • 00:01:11
    academic background is in philosophy and
  • 00:01:14
    Linguistics so and then all my work has
  • 00:01:17
    been in computers and data so having
  • 00:01:19
    them merge like this it's you know very
  • 00:01:22
    interesting personally we have a real
  • 00:01:24
Searle Chinese room in the room with us
  • 00:01:26
    right now that's wild uh for those of
  • 00:01:28
you who are familiar with philosophy uh but
  • 00:01:31
    the second controversial statement is
  • 00:01:34
    language models suck or rather AI sucks
  • 00:01:37
    and specifically the way our culture has
  • 00:01:39
    been using it um it's bad for the
  • 00:01:41
    environment um it's bad for art it's bad
  • 00:01:44
    for labor um or at least artists and um
  • 00:01:47
    labor are upset about many aspects of it
  • 00:01:49
    um we are drowning in slop and spam and
  • 00:01:52
    disinformation um and the ecosystem at
  • 00:01:54
    large has some good people in it but it
  • 00:01:55
    also attracts a lot of the absolute
  • 00:01:56
    worst sorts and so I have very mixed
  • 00:01:59
    feelings about working on this stuff um
  • 00:02:01
    so here's how I've decided to approach
  • 00:02:03
    it first of all just be mindful of the
  • 00:02:05
    impact I think we don't as technologists
  • 00:02:06
    have the luxury of doing something just
  • 00:02:08
    because it's cool or just because we can
  • 00:02:09
    you really have to think about how the
  • 00:02:11
    products we build are going to affect
  • 00:02:12
    the world um also do not talk about this
  • 00:02:15
    stuff on social media most for the most
  • 00:02:17
    part um nobody there has a reasoned well
  • 00:02:19
    thought out opinion um these are for for
  • 00:02:22
    long more thoughtful
  • 00:02:23
    conversations and the final thing what
  • 00:02:25
    this talk is about is how can we make
  • 00:02:26
    the tech actually good if it comes with
  • 00:02:28
    all these trade-offs if there's all
  • 00:02:29
    these negative externalities how good
  • 00:02:31
    does it have to be before it becomes net
  • 00:02:33
    good for the world in a positive force
  • 00:02:35
    and it seems like it's here people are
  • 00:02:37
    going to be using it so let's make it as
  • 00:02:39
    good as it can possibly be but how um
  • 00:02:44
technology is good language models are
  • 00:02:46
    technology how do we make language
  • 00:02:47
    models useful and my answer is the
  • 00:02:50
    resource description framework so
  • 00:02:53
    hopefully um at the end of this talk you
  • 00:02:55
    will sort of see what I see but first
  • 00:02:58
    let's talk a little bit about rdf what
  • 00:02:59
it is and where it came from so rdf is an
  • 00:03:02
    attempt to tackle some of the hardest
  • 00:03:04
    problems of knowledge representation and
  • 00:03:06
    reasoning um it came about just after
  • 00:03:08
    the internet um from the same group of
  • 00:03:10
    people that put together all the
  • 00:03:11
    internet specifications the w3c um and a
  • 00:03:14
    lot of people that came out of the
  • 00:03:15
    symbolic AI um boom of the 80s so we're
  • 00:03:18
    trying to you know reach AI through
  • 00:03:20
    logic
  • 00:03:22
    um these people are all giant nerds and
  • 00:03:24
    they're really trying to do the right
  • 00:03:26
    thing and to build like the ultimate
  • 00:03:28
    data framework and
  • 00:03:30
    you know it's got it pros and cons um
  • 00:03:32
    but I think to a large degree they
  • 00:03:34
    succeeded um this is the kind of
  • 00:03:37
    definitive document the the concepts an
  • 00:03:39
    abstract syntax because of course it's
  • 00:03:40
    abstract um I really love this document
  • 00:03:43
    it's quite readable um if you're enough
  • 00:03:45
    of a nerd you might find this document
  • 00:03:47
    sort of inspiring um and I think rdf in
  • 00:03:50
    general at its core is a pretty
  • 00:03:52
    brilliant technical achievement um for
  • 00:03:54
    clarity and vision around how we can or
  • 00:03:57
    how we might work with knowledge on
  • 00:04:00
    computer systems they're trying to solve
  • 00:04:02
    this at a very high level however rdf
  • 00:04:04
    today um and many of you may be
  • 00:04:06
    surprised to see this in a talk um at a
  • 00:04:09
    contemporary Tech conference a lot of
  • 00:04:10
people see it the same way they see SOAP
  • 00:04:12
or EJBs or Visual Basic or React um it's
  • 00:04:16
    one of these tactics you know it's just
  • 00:04:17
    not that cool anymore um and of course
  • 00:04:20
    it's gone through the whole hype cycle
  • 00:04:21
    and now it's very much not in the hype
  • 00:04:23
    cycle uh and there's a few reasons for
  • 00:04:25
    this that I do want to kind of go over
  • 00:04:26
    before we get more into the meat of it
  • 00:04:28
    uh one is rdf XML which is one of the
  • 00:04:31
    initial formats um I don't know how many
  • 00:04:33
    of you remember the early 2000s well um
  • 00:04:36
    but everyone was just doing lines of XML
  • 00:04:38
    at work all the time it
  • 00:04:41
    was there was a lot going on there um
  • 00:04:44
this is a verbose complex format it's
  • 00:04:46
    just honestly not great but for the
  • 00:04:48
    early days of rdf it kind of got really
  • 00:04:49
    strongly associated with this even
  • 00:04:51
    though this is only one of many many
  • 00:04:52
    formats you can use also a product of
  • 00:04:55
    its times that all the libraries that
  • 00:04:57
    were written to use it were in the early
  • 00:05:00
    2000s um and what are libraries in the
  • 00:05:02
    early 2000s like they're all massively
  • 00:05:05
    over architected object-oriented
  • 00:05:06
    programming um this is a historical
  • 00:05:08
    accident that all the library all the
  • 00:05:10
    rdf libraries look this way because it's
  • 00:05:14
    just that they co-occurred in history
  • 00:05:15
    there's nothing about rdf that makes it
  • 00:05:17
    a good fit for object-oriented
  • 00:05:18
programming and Clojure is actually a
  • 00:05:20
    way better fit um so I wish there was a
  • 00:05:21
good Clojure library for
  • 00:05:25
    this another reason is that there
  • 00:05:27
    actually are a lot of pretty good really
  • 00:05:29
    solid robust um Enterprise grade uh rdf
  • 00:05:32
    implementations out there and they all
  • 00:05:34
cost real money um you cannot just
  • 00:05:38
    start using these so there's a real Gap
  • 00:05:39
    in the market for Quality accessible
  • 00:05:41
    tools um but the existence of all these
  • 00:05:45
    really heavyweight Enterprise uh tools
  • 00:05:47
    for rdf does tell us something which is
  • 00:05:49
    there's actually a quite established
  • 00:05:51
    market for this stuff rdf is being used
  • 00:05:53
    really productively in science heavy
  • 00:05:55
    industry government anywhere you need to
  • 00:05:58
    really model super complex information
  • 00:05:59
with precision um so there's
  • 00:06:02
    a surprising number of domains where rdf
  • 00:06:04
    based standards are the standard uh for
  • 00:06:06
    modeling data another reason is that the
  • 00:06:09
    semantic web was quite simply overhyped
  • 00:06:11
    uh for the time it's a great idea we'll
  • 00:06:13
    link all our data we can Traverse all
  • 00:06:14
    the data everything will be unified and
  • 00:06:15
    we will live in this data Utopia um but
  • 00:06:18
    the problem was it required so much
  • 00:06:19
    manual effort to publish and consume
  • 00:06:21
    constantly um and all the benefits were
  • 00:06:24
    like abstract there was no immediate
  • 00:06:25
    financial incentive for anyone to
  • 00:06:27
    actually go Implement all their data as
  • 00:06:29
linked data and expose it in this way um
  • 00:06:31
    and so despite tons and tons of books
  • 00:06:33
    because it's a great idea everyone got
  • 00:06:34
    super excited about it and then nobody
  • 00:06:36
    actually did it and then people just
  • 00:06:37
    kind of came to the conclusion that this
  • 00:06:38
    is a bad
  • 00:06:40
    idea um so this is really what I want um
  • 00:06:44
    you know I wish this book got popular um
  • 00:06:47
    that I lived in that history uh but the
  • 00:06:50
    truth is rdf is all good parts because
  • 00:06:52
    it's all separate all the things that
  • 00:06:54
    are not good parts are not they're just
  • 00:06:56
    kind of the surrounding ecosystem which
  • 00:06:57
    we can replace or or not choose to use
  • 00:07:00
    um so hopefully I can convince you that
  • 00:07:04
    rdf is great and what is the elegant
  • 00:07:06
    core uh I'll describe what the actual
  • 00:07:08
    technology is and how you use it um it's
  • 00:07:11
    the resource description framework so
  • 00:07:13
    let's talk about resources first um what
  • 00:07:16
    is a resource a resource is anything in
  • 00:07:18
    the world that you can talk about people
  • 00:07:21
    fictional entities abstract Concepts
  • 00:07:23
    material objects um data anything that
  • 00:07:26
    can be the subject of language can be
  • 00:07:28
    the subject of rdf
  • 00:07:30
    how do we represent anything in a
  • 00:07:33
    computer computers can't represent
  • 00:07:35
    anything they have bits well literals
  • 00:07:37
    are easy because they are bits so
  • 00:07:38
    anything that is itself um can be a
  • 00:07:41
    resource you know then a computers do
  • 00:07:43
    know how to actually have the number
  • 00:07:44
    numbers and and strings as
  • 00:07:46
    resources um we can also represent
  • 00:07:48
    things kind of abstractly without
  • 00:07:50
    talking about what they are these are
  • 00:07:51
    like variables or or pronouns that you
  • 00:07:53
    can talk about something without ever
  • 00:07:55
    trying to say exactly what it is um so
  • 00:07:59
    we can talk about something without
  • 00:08:01
    using its name which can be useful
  • 00:08:03
    sometimes and then sometimes we do want
  • 00:08:04
    to name things and this is very this is
  • 00:08:07
    a lot of letters to say a unique
  • 00:08:09
    ID um and really that's all it is is it
  • 00:08:12
    needs to be unique ID so what are some
  • 00:08:14
    Iris resource identifiers um U IDs right
  • 00:08:17
    they're unique we know that um anything
  • 00:08:20
    that has a known naming
  • 00:08:23
    Authority so we know the ISBN system
  • 00:08:25
    this is guaranteed to be unique um most
  • 00:08:28
    commonly you'll see urls and URLs are
  • 00:08:30
    great for uniqueness because they
  • 00:08:31
    establish in their domain an authority
  • 00:08:33
    and whoever owns that domain takes on
  • 00:08:34
    responsibility and Authority for making
  • 00:08:36
    sure there's no duplications or um
  • 00:08:39
    ambiguity in the rest of the URL and
  • 00:08:42
    these uis can be resolvable you can
  • 00:08:43
    actually hit this in your browser and
  • 00:08:44
    it'll ask you what format you want to
  • 00:08:46
    download Abraham Lincoln's data in and
  • 00:08:48
    they can also be non-resolvable so for
  • 00:08:50
    example this is URI that I made just now
  • 00:08:53
    this morning and it defines my concept
  • 00:08:56
    of Excellence I haven't written this
  • 00:08:58
    down anywhere nobody knows what it is is
  • 00:08:59
    except for me um but I still can make a
  • 00:09:01
    U iri for it and talk about it um and
  • 00:09:04
    it's the name of a um something you
  • 00:09:07
    can't resolve this but I can still use
  • 00:09:09
    it to talk about that Concept in the rdf
  • 00:09:11
vocabulary and of course because IRIs
  • 00:09:13
    are pretty long and cumbersome you can
  • 00:09:15
    shorten them every syntax has shortened
  • 00:09:17
prefixes um kind of like Clojure
  • 00:09:18
keywords so when you see a bunch of IRIs
  • 00:09:21
around in practice usually it
  • 00:09:24
    looks a lot
  • 00:09:26
cleaner so why IRIs why do we have why
  • 00:09:28
    is unique identifier so important um why
  • 00:09:31
    do we put so much focus into making sure
  • 00:09:33
    resources are unique well it's because
  • 00:09:35
    in language context is everything take
  • 00:09:37
    the sentence my friend Joe what I just
  • 00:09:39
    start a sentence that way what do you
  • 00:09:41
    need to know in order to make sense of
  • 00:09:42
    that sentence you need to know who's
  • 00:09:45
    speaking you need to know uh do you know
  • 00:09:47
    Joe or not that could change the context
  • 00:09:50
    um Joe who I haven't said his last name
  • 00:09:52
    I just said Joe right so natural
  • 00:09:54
    language is really inherently ambiguous
  • 00:09:57
    and we rely a ton on context to fix it
  • 00:10:00
    and the problem is we do this with
  • 00:10:01
    programming too in most programming
  • 00:10:02
    systems when you get data it comes to
  • 00:10:04
    you like this you say I have a name and
  • 00:10:06
    I have a handle and I have an ID and now
  • 00:10:10
    I can process it but in order for me to
  • 00:10:12
    process as a programmer I need to supply
  • 00:10:13
    the context I need to understand the
  • 00:10:15
    system I need to understand where the
  • 00:10:16
    day is coming from I need to understand
  • 00:10:17
    what it means and then maybe I can write
  • 00:10:19
    code against
  • 00:10:20
it the goal of IRIs in RDF is that they
  • 00:10:24
    carry their context with them right um
  • 00:10:29
    my friend Joe his handle well that's his
  • 00:10:30
    LinkedIn handle and when I see that if
  • 00:10:32
    you just hand me a scrap of paper that
  • 00:10:34
    says oh well it's a LinkedIn handle and
  • 00:10:36
    oh that's his social security number
  • 00:10:38
    that's the ID it's not his real one
  • 00:10:41
    um
  • 00:10:42
    um you know it it carries its own
  • 00:10:45
    context the data brings its own
  • 00:10:48
    values and this also gets super
  • 00:10:50
    philosophical really quick which is
  • 00:10:51
    probably why I like it um Iris are very
  • 00:10:54
    closely related to the philosophical
  • 00:10:56
    field of semiotics which is really
  • 00:10:58
    important for logic philosophy
  • 00:11:00
    Linguistics and literature lotss of
  • 00:11:01
    fields use this um there's a ton of
  • 00:11:03
    thought about how a sign or a symbol can
  • 00:11:05
    be about something in the real world
  • 00:11:06
    that's what this famous painting is
  • 00:11:08
    about like is it a is it a pipe is it a
  • 00:11:09
    picture of a pipe is it me talking about
  • 00:11:11
    a picture of a pipe um what are the
  • 00:11:13
    layers of indirection how do you
  • 00:11:14
dereference a pointer in your brain to
  • 00:11:16
    Something in the real world that's the
  • 00:11:18
    field of semiotics and for rdf it's how
  • 00:11:21
do you dereference
  • 00:11:22
an identifier in a computer to Something
  • 00:11:25
    in the real world and of course you
  • 00:11:26
    don't actually reference it but you can
  • 00:11:27
    use it in systems with the understanding
  • 00:11:29
that it does dereference conceptually
  • 00:11:32
    something this is also very similar to
  • 00:11:34
the work of Ferdinand de Saussure he was uh one of
  • 00:11:36
    the foundational figures in modern
  • 00:11:38
    Linguistics he wrote A Course in general
  • 00:11:40
    Linguistics in 1916 his concept of
  • 00:11:42
    meaning was that everything is a
  • 00:11:44
    semantic Network every sign or every
  • 00:11:47
    word gains meaning by its opposition and
  • 00:11:49
    relation to every other sign in the
  • 00:11:52
    vocabulary um and that was his
  • 00:11:54
    definition of meaning he's like there's
  • 00:11:55
    really nothing much more to meaning than
  • 00:11:56
    that except it's this network and
  • 00:11:58
    everything is defined in relation into
  • 00:11:59
    everything else so it's like a densely
  • 00:12:02
    connected graph of all the words which
  • 00:12:03
    sounds kind of familiar um sounds a
  • 00:12:05
    little bit like link data Maybe and uh
  • 00:12:07
    it also maybe sounds like some other
  • 00:12:08
    things that we'll talk about later okay
  • 00:12:11
    so we have resources how do we have
  • 00:12:14
    descriptions um well we've got resources
  • 00:12:17
    in Iris so how do we express relations
  • 00:12:19
    between different resources I think
  • 00:12:21
    everyone in this room is pretty
  • 00:12:22
    comfortable with triples subject
  • 00:12:24
    predicate object entity attribute value
  • 00:12:26
    if you're in English it's subject verb
  • 00:12:28
    object it is a one of the most granular
  • 00:12:31
    ways of representing a singular piece of
  • 00:12:33
    information in kind of the you can't
  • 00:12:35
    really decompose it further you can have
  • 00:12:37
    a single resource but then it's just I
  • 00:12:39
    say a name but if I want to say anything
  • 00:12:40
    about that name this is about as small
  • 00:12:42
    as you can get
  • 00:12:45
    um yeah so here's a bunch of things I'm
  • 00:12:47
    saying about Joe and there are all
  • 00:12:49
    different things I can say his real name
  • 00:12:51
    I can say his name according to LinkedIn
  • 00:12:52
    I can say he knows me I can say he's the
  • 00:12:54
    same as this other we can just say a
  • 00:12:56
    bunch of stuff about jono
  • 00:12:59
    and it's important that both the subject
  • 00:13:00
    and the predicate and the object of an
  • 00:13:02
    rdf triple all three parts of it um are
  • 00:13:05
    very very precise we can be more precise
  • 00:13:07
    than English so sorry the text is a
  • 00:13:09
    little bit small but um there are two
  • 00:13:11
    statements here and one is Luke
  • 00:13:14
    loves their child and the other one is
  • 00:13:16
    Luke loves closure and normally in
  • 00:13:18
    English again semantic ambiguity but in
  • 00:13:21
    rdf I've actually have the iri of two
  • 00:13:23
    separate dictionary subheadings of the
  • 00:13:25
    word love so I can be very very precise
  • 00:13:27
    about what I mean and if one was not in
  • 00:13:29
    the dictionary I could go make up my own
  • 00:13:30
    iri that had the nuances that I wanted
  • 00:13:32
    to attach to that statement so we're
  • 00:13:35
    we're packing a lot of meaning into the
  • 00:13:36
    iris and we can be much much more
  • 00:13:38
    precise than English and actually much
  • 00:13:40
    much more precise than most other data
  • 00:13:44
    formats this generality of triples also
  • 00:13:47
    means that you can go back and forth
  • 00:13:50
    between almost any other data format
  • 00:13:51
    relational databases key value column
  • 00:13:53
    stores document based all of these can
  • 00:13:55
    be converted to triples and the
  • 00:13:57
    operation is actually conceptually
  • 00:13:58
    similar add the context so in example in
  • 00:14:01
    a relational database the context is
  • 00:14:02
    like oh what are all the tables and what
  • 00:14:04
    is the structure and where does the
  • 00:14:04
    database live you can kind of compress
  • 00:14:06
    that all that information into the
  • 00:14:09
semantics of the IRIs themselves and
  • 00:14:12
    then of course you can go back in the
  • 00:14:13
    opposite
  • 00:14:13
    direction what does this mean it means
  • 00:14:16
    that an rdf data set is a set in the
  • 00:14:19
    closure sense or the mathematical sense
  • 00:14:21
    it's just a bunch of triples that are
  • 00:14:22
    not duplicated because every iri and
  • 00:14:24
    every triple is guaranteed to be unique
  • 00:14:27
    and we don't need to know anything else
  • 00:14:28
    about what table there are or what
  • 00:14:30
    folders there are or what trees or
  • 00:14:31
    directory structures or anything we just
  • 00:14:34
    have
  • 00:14:35
    sets and that means that we can also
  • 00:14:37
    safely Union sets so this is where the
  • 00:14:39
    concept of the semantic web comes from I
  • 00:14:42
    can I can take your data I can take my
  • 00:14:43
    data I can just slam them together with
  • 00:14:44
    a set Union and it's still something
  • 00:14:46
    meaningful and intelligible which is not
  • 00:14:48
    true of most other database systems it's
  • 00:14:50
    Federation for
  • 00:14:53
    free
  • 00:14:55
    so we've described the core of rdf um
  • 00:14:59
    that's really it resources and triples
  • 00:15:03
    uh not nothing conceptually difficult
  • 00:15:04
    there but it's worth saying something
  • 00:15:06
else about what the designers envisioned
  • 00:15:07
    you know they call it a framework and
  • 00:15:08
    with with a framework they want you to
  • 00:15:10
    build things with a framework right and
  • 00:15:12
    we can describe data but what more is
  • 00:15:15
    there so this guy is actually the
  • 00:15:17
    primary contributor to the rdf standard
  • 00:15:19
    um Aristotle um I'm not even joking he
  • 00:15:24
    invented the word subject predicate and
  • 00:15:25
    object in the context in which we're
  • 00:15:27
    using them now um and this entire book
  • 00:15:30
    is about you know the first chapter of
  • 00:15:31
    this is listing all the types of things
  • 00:15:32
    that exist in the world according to
  • 00:15:33
    Aristotle and the rest of the book is
  • 00:15:35
    the foundation of modern Western logic
  • 00:15:39
    um he builds it all starting from here
  • 00:15:41
    what what what can you say about what
  • 00:15:42
    kinds of things um and rdf is really
  • 00:15:46
    bigger than just data or storing data
  • 00:15:49
    like we normally think of it's more than
  • 00:15:50
    just a spreadsheet or a table or a
  • 00:15:51
    bucket that I can put data it's about
  • 00:15:53
    representing knowledge and knowledge is
  • 00:15:56
    not limited to just things I've written
  • 00:15:58
down it's also the things I
  • 00:16:00
    know because of the things I wrote down
  • 00:16:02
    right it's basically rdf as a
  • 00:16:06
    system is designed to make it possible
  • 00:16:08
    to talk about all the things I know
  • 00:16:10
    whether I know them concretely or
  • 00:16:12
    abstractly or in theory but I never
  • 00:16:13
    bother to actually think about them in
  • 00:16:14
    calculate them the actual data that is
  • 00:16:16
    actually sitting on bits in a database
  • 00:16:17
    is largely incidental um to a lot of
  • 00:16:20
    uses of
  • 00:16:21
    rdf so this is all about entailment
  • 00:16:23
    entailment means that given a set of
  • 00:16:25
    triples I can derive other triples from
  • 00:16:28
them conceptually either lazily or
  • 00:16:30
    proactively doesn't matter um or if I
  • 00:16:32
    have a set of triples I can tell if it
  • 00:16:34
    is valid According to some definition of
  • 00:16:38
    validity and that can be really
  • 00:16:41
    useful because there's so many different
  • 00:16:42
    ways to do this the rdf uh Community has
  • 00:16:45
    a number of what they call different
  • 00:16:46
    entailment profiles um the best and kind
  • 00:16:49
    of gold standard for entailment profiles
  • 00:16:50
    is the entirety of first order logic
  • 00:16:52
which is beyond the scope of this talk to
  • 00:16:54
explain but those are the symbols on the
  • 00:16:55
    other sides the full Suite of if then
  • 00:16:57
    else not composed with any level of
  • 00:17:00
complexity first-order logic is
  • 00:17:02
    great um it's not great to use as a
  • 00:17:05
    programmer because it happens to be NP
  • 00:17:07
    complete in fact it is the NP complete
  • 00:17:08
    problem it is the definitional problem
  • 00:17:11
    for very very hard problems to solve um
  • 00:17:13
    efficiently in computer science so we
  • 00:17:15
    have a lot of other profiles that do
  • 00:17:17
    less um and are less expressive but are
  • 00:17:19
    also calculable over large data sets you
  • 00:17:22
    know before the heat death of the
  • 00:17:23
    universe and the most important thing we
  • 00:17:25
    use these for is to get back some sort
  • 00:17:27
    of level of a schema kind of tell what
  • 00:17:29
    sort of statements are meaningful and
  • 00:17:30
    what aren't you know my date of birth
  • 00:17:33
    cannot be the color purple so if I have
  • 00:17:35
    a data set that says my date of birth is
  • 00:17:36
    the color purple I can use entailment
  • 00:17:39
    over schema to say no it's not I don't
  • 00:17:41
    accept that data into my database um and
  • 00:17:44
    same as is important because it really
  • 00:17:46
    helps with Federation I can say hey
  • 00:17:47
    these two concepts they start out as
  • 00:17:48
    separate Concepts but now I'm bringing
  • 00:17:50
    after the fact a third uh statement
  • 00:17:53
    which is these are actually the same
  • 00:17:54
    concept and then that means that I can
  • 00:17:55
    now query across that and reason across
  • 00:17:57
    that really effectively
  • 00:18:00
    and really there's a large sense in
  • 00:18:01
    which rdf and and all the entailment and
  • 00:18:04
    logic associated with the rdf ecosystem
  • 00:18:07
    is the cumulation of 20th century AI
  • 00:18:10
    which was all about symbol manipulation
  • 00:18:12
    formal logic rules based expert systems
  • 00:18:14
you know you had Cyc trying to build a
  • 00:18:16
    database of every fact in the universe
  • 00:18:17
    and and make you know intelligence would
  • 00:18:19
    emerge and people were very optimistic
  • 00:18:21
    about that and all these things were
  • 00:18:22
    getting very funded uh and then it
  • 00:18:24
    turned out that that didn't actually
  • 00:18:25
    lead to general intelligence it's very
  • 00:18:27
    useful in programming systems l is
  • 00:18:29
    useful um but it's not it doesn't lead
  • 00:18:30
    to intelligence so these people all went
  • 00:18:32
    and built rdf
  • 00:18:35
    instead and they they brought these
  • 00:18:36
    Concepts specifically in a way that
  • 00:18:39
    works with the internet era where
  • 00:18:40
    everything is networked and everything
  • 00:18:41
    has the potential to be linked so that's
  • 00:18:44
    rdf 20 years later little paper out of
  • 00:18:47
    Google attention is all you need this
  • 00:18:50
    paper defines the Transformer
  • 00:18:52
    architecture which is the underlying
  • 00:18:53
    breakthrough that allows all the
  • 00:18:55
    language models to work um it has an
  • 00:18:57
attention mechanism which basically
  • 00:18:59
    allows it to train on tokens like taking
  • 00:19:03
    their position into consideration but
  • 00:19:05
    also independent of their position in a
  • 00:19:06
    sequence and once you can train with
  • 00:19:08
    that kind of flexibility it just unlocks
  • 00:19:10
    everything that language models can do
  • 00:19:12
today so natural language
  • 00:19:15
    processing as a discipline was
  • 00:19:16
    immediately revolutionized chat GPT came
  • 00:19:18
    out just five years later which is
  • 00:19:19
    lightning speed this was like on a tiny
  • 00:19:21
    little test demo data set and then they
  • 00:19:24
    built something uh giant off of it
  • 00:19:26
    really really fast and I don't need to
  • 00:19:28
    to describe how big a Mania it is right
  • 00:19:30
    now they're they're eating the world at
  • 00:19:31
    least from a hype point of view if not
  • 00:19:33
    from a actual productivity point of view
  • 00:19:35
    yet how do they work I can't tell you
  • 00:19:38
    well I can tell you but I cannot tell
  • 00:19:39
    you in the next 20 minutes so if you
  • 00:19:42
    want to do it go to this URL there's a
  • 00:19:43
really great Karpathy walkthrough about
  • 00:19:45
    16 hours of dense video where he live
  • 00:19:47
    codes a mini GPT I followed through I
  • 00:19:49
did it in Clojure you can too you will
  • 00:19:51
    deeply understand this when you're done
  • 00:19:53
    um yeah I'm not going to talk anymore
  • 00:19:56
    about the internals of how the model
  • 00:19:57
actually works out of scope um what I do care
  • 00:20:00
    about is like defining them and like how
  • 00:20:02
    do we use them and how should we think
  • 00:20:03
    about them as software
  • 00:20:04
    developers um
  • 00:20:07
so the etymology is actually very
  • 00:20:09
    straightforward you take a measure and
  • 00:20:11
    you have the diminutive form of it a
  • 00:20:13
    small measure a model is a small measure
  • 00:20:16
    of something
  • 00:20:18
    and this is actually really um important
  • 00:20:21
    for what these things are what is a
  • 00:20:23
    model it's a measurement of something
  • 00:20:26
    what we're doing is we're taking
  • 00:20:27
    language we're measuring it we're
  • 00:20:29
    analyzing every aspect of language and
  • 00:20:31
    we're quantifying it as much as we can
  • 00:20:33
    and we're specifying the distances
  • 00:20:34
    between all the different concepts we're
  • 00:20:35
    putting language on a bench and building
  • 00:20:37
    a small copy and measuring it along
  • 00:20:38
    every Dimension there turns out to be
  • 00:20:40
    about you know hundreds of billions of
  • 00:20:41
    Dimensions which is why there's hundreds
  • 00:20:42
    of billions of
  • 00:20:44
    parameters
  • 00:20:46
    um so the act of generating from a
  • 00:20:48
    generative language model is to create
  • 00:20:50
    replicas based on those measurements hey
  • 00:20:52
    let's emit some language but if it fits
  • 00:20:54
    up with these measurements that's kind
  • 00:20:56
    of what the real language is or then it
  • 00:20:57
    looks and and acts like language because
  • 00:20:59
    it's based off of the same measurements
  • 00:21:00
    we
  • 00:21:02
    took and interestingly an rdf data set
  • 00:21:04
    is also called a model um kind of fits
  • 00:21:06
    more in the second definition here um
  • 00:21:08
    but it's also a set of measurements
  • 00:21:10
    about the world or a set of things I've
  • 00:21:11
    chosen to say about the
  • 00:21:14
    world so what are we modeling is a model
  • 00:21:17
    what aspects of language are we
  • 00:21:18
    measuring what are we capturing we're
  • 00:21:19
    capturing grammar and syntax and we've
  • 00:21:21
    actually been modeling grammar and
  • 00:21:22
    syntax since long before we had
  • 00:21:24
    computers um you can build a simple
  • 00:21:26
    rule-based generative grammar we'll talk
  • 00:21:28
    about that more later um and and build a
  • 00:21:30
    model but language models absolutely do
  • 00:21:32
    capture the grammar and the syntax as
  • 00:21:33
    well they also capture a lot of the
  • 00:21:36
    semantics
  • 00:21:38
    um how the words stand in relation to
  • 00:21:40
    each other remember C it's almost like
  • 00:21:42
    we've built a model of that definition
  • 00:21:45
    of words with respect to each other
  • 00:21:48
    because it captures a lot of semantics
  • 00:21:49
    and actually the attention mechanism
  • 00:21:50
    lets you capture the semantics
  • 00:21:53
    contextually right which matters you
  • 00:21:55
    like it's not just defining the words
  • 00:21:56
    it's also if you have the semantics and
  • 00:21:58
    then the itics of the word in different
  • 00:21:59
    situations that's language that's um
  • 00:22:03
    what we're building a model of also the
  • 00:22:04
    pragmatics how you use it in practice
  • 00:22:06
    what the colloquialisms are how people
  • 00:22:07
    tend to
  • 00:22:08
    talk it also captures a lot of patterns
  • 00:22:11
    and this is where we can get in trouble
  • 00:22:13
    and I don't want to talk about too much
  • 00:22:14
    about this because it's a it's a bit of
  • 00:22:16
    a rabbit hole but it will pick up on
  • 00:22:18
    fact patterns uh if it sees a pattern
  • 00:22:21
    enough in the wild it will be able to
  • 00:22:22
    reproduce it pretty reliably but that's
  • 00:22:24
not the same thing as knowing a
  • 00:22:25
fact it's to be trained on a fact pattern
  • 00:22:28
    and has reasoning patterns if it sees
  • 00:22:29
    like a certain way of thinking enough in
  • 00:22:31
    its training data it can reproduce those
  • 00:22:33
    with some fair amount of accuracy or
  • 00:22:35
    even produce things by analogy or
  • 00:22:36
    extensions of them does that count as
  • 00:22:38
    true reasoning um not in the way
  • 00:22:41
somebody writing an inference engine would
  • 00:22:43
    think of it um and it's certainly not
  • 00:22:45
    100%
  • 00:22:47
    reliable for a programmer what's the API
  • 00:22:50
    I want to use them we have a model we
  • 00:22:52
    have all the measurements of language
  • 00:22:53
    how do I how do I take measurements how
  • 00:22:54
    do I get things out of here uh this is
  • 00:22:56
    the entire API of language model it's a
  • 00:22:58
    pure function that predicts the next
  • 00:23:00
token um all I've omitted is a bit
  • 00:23:02
    of high school level algebra and um in
  • 00:23:05
    those three dots there but then I get
  • 00:23:07
    the probabilities of the next token
  • 00:23:09
    given a sequence of all the previous
  • 00:23:11
    tokens um and that really works and if I
  • 00:23:13
    want to get many tokens I just iterate
  • 00:23:15
    over that and choose the most probable
  • 00:23:17
    one at each step simple recursive
  • 00:23:20
    function generating text you know what I
  • 00:23:22
    admitted is that it's a very true
  • 00:23:24
statement that I omitted some high
  • 00:23:26
    school level algebra in there um it
  • 00:23:28
happens to be a trillion-ish floating
  • 00:23:30
    Point operations and about uh you know
  • 00:23:32
    hundreds of billions of constants um so
  • 00:23:34
    it really doesn't fit on the slide but
  • 00:23:36
    that's all model is it's a pure function
  • 00:23:38
    and it's a pure function that does math
  • 00:23:39
    and has some constants in
  • 00:23:41
    it that's what training is is finding
  • 00:23:43
    the constants for the
  • 00:23:46
    function so how does it work I give it a
  • 00:23:48
    sequence I say Mary had a little and
  • 00:23:50
    because it sees in patterns all over the
  • 00:23:52
    Internet it says lamb that is by far the
  • 00:23:54
    most likely answer because that's a very
  • 00:23:56
    common little rhyme in the English
  • 00:23:58
    language
  • 00:24:01
I say Mozart had a
  • 00:24:03
    little well it's it's not lamb that
  • 00:24:05
    doesn't make sense it's a sister why
  • 00:24:08
    does it say sister I don't know could
  • 00:24:10
have been anything brother star those are
  • 00:24:12
    less common than sister but you know
  • 00:24:13
    they're very they're up there also star
  • 00:24:16
is up there because you know Mozart wrote
  • 00:24:18
    the music for Twinkle Twinkle Little
  • 00:24:20
    Star which is probably captured in the
  • 00:24:22
    internet it turns out that the reason
  • 00:24:23
    sister is up there and brother is not is
  • 00:24:25
that Mozart does have a sister and he did
  • 00:24:27
    not have a brother
  • 00:24:28
so you can see we're sort of capturing
  • 00:24:31
    fact patterns from the training data but
  • 00:24:33
    also not in a 100% reliable
  • 00:24:36
    way it's just kind of all in the
  • 00:24:39
    stats incidentally the sister was older
  • 00:24:41
    so this is an incorrect
  • 00:24:43
    fact but just because he had a sister
  • 00:24:46
    that that bumps up the probability of
  • 00:24:47
    that
  • 00:24:48
    word so I want I want to add this to my
  • 00:24:51
    toolbox I have a bunch of tools um it's
  • 00:24:53
    great that I can like use this model to
  • 00:24:56
    kind of academically understand language
  • 00:24:58
    um I have a lot of tools at my disposal
  • 00:25:00
    to do that and I want to add this tool
  • 00:25:01
    to my toolbox but I'm still trying to
  • 00:25:03
    figure out how to use it and how to get
  • 00:25:05
    it to do what I want it to do and do
  • 00:25:06
    something useful and be on a chatbot you
  • 00:25:08
    know chatbots are great I think we've
  • 00:25:10
fully explored the capabilities of ChatGPT
  • 00:25:13
    all on our own now we want to build more
  • 00:25:15
    interesting things that maybe provide a
  • 00:25:16
    little more societal
  • 00:25:17
    value well if I have a pure function and
  • 00:25:20
    I want to get different output what are
  • 00:25:22
    my
  • 00:25:23
    options I can either find a different
  • 00:25:25
    function but I don't have millions of
  • 00:25:26
    dollars to train a new function so my
  • 00:25:29
    options are I can change the input it's
  • 00:25:31
    literally mathematically the only thing
  • 00:25:33
    I can do to get different results or
  • 00:25:35
    better results out of a language model
  • 00:25:37
    and this is the entire field of um quote
  • 00:25:40
    unquote AI programming is putting the
  • 00:25:42
    right stuff into the model to try to get
  • 00:25:44
    it to get out the stuff that you
  • 00:25:47
    want where does this data come from you
  • 00:25:49
    can have human input like a chatbot or a
  • 00:25:51
    programmer manually putting an input you
  • 00:25:53
    can have an old fashioned program like a
  • 00:25:55
    regular program that builds a bunch of
  • 00:25:56
    strings and concat stuff and then sends
  • 00:25:58
    them off to the model you can have the
  • 00:26:00
    result of one model feed into another
  • 00:26:02
    model or the same model invoked
  • 00:26:03
    recursively and really any combination
  • 00:26:06
    of the above and all of AI programming
  • 00:26:09
    unless you're working on the models
  • 00:26:10
    themselves is some combination of these
  • 00:26:14
    altering the inputs of the function in
  • 00:26:16
    various ways and building programs to
  • 00:26:17
    programmatically alter the inputs of the
  • 00:26:19
    functions there's a bunch of patterns
  • 00:26:21
    for this um this is just descriptive
  • 00:26:23
    this is how people are using these out
  • 00:26:25
    in the world the simplest one is prompt
  • 00:26:26
    engineering if I have user input and
  • 00:26:28
    maybe a history of past messages how do
  • 00:26:30
I get the language model to
  • 00:26:34
    act different or emit different things I
  • 00:26:36
    give it a system prompt I may say Talk
  • 00:26:38
    Like a Pirate or emit Json right that's
  • 00:26:41
the system prompt. Prompt engineering: it's not
  • 00:26:43
    engineering it's just trying to find
  • 00:26:45
    stuff and experimenting around with the
  • 00:26:47
    model this one's really gaining a lot of
  • 00:26:49
    popularity which is um you know you want
  • 00:26:51
    to sometimes feed real information into
  • 00:26:53
    the model that maybe it can't reliably
  • 00:26:55
    get out of its internals so you take the
  • 00:26:58
    user input you pass it to a search
  • 00:26:59
    engine could be a keyword based search
  • 00:27:00
engine semantic search doesn't really
  • 00:27:02
matter and you pass the top-n results
  • 00:27:03
    along with your system prompt and the
  • 00:27:04
    user input into the model right and
  • 00:27:09
    assuming your search is good assuming
  • 00:27:10
    the data that you want to talk about or
  • 00:27:11
    you want the model to to uh relay to you
  • 00:27:14
    is assuming the data is there models do
  • 00:27:17
    a good job the attention mechanism is
  • 00:27:18
    actually pretty reliable at kind of
  • 00:27:20
zeroing in on the relevant parts of
  • 00:27:22
    the input data of course if your search
  • 00:27:24
    wasn't good um and you don't have good
  • 00:27:26
    recall and the answer you actually
  • 00:27:27
wanted is not in the top-n results the
  • 00:27:29
    model is back to bullshitting and it
  • 00:27:31
    will not be able to give you reliable
  • 00:27:34
information.
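    A bare-bones sketch of this retrieval pattern in Python, where complete is a hypothetical stand-in for a real model call and the search is a toy keyword ranker:

        def complete(prompt: str) -> str:
            return "..."   # stand-in for the pure next-token function

        def search(query: str, corpus: list[str], k: int = 3) -> list[str]:
            # Toy keyword search: rank documents by word overlap with the query.
            words = set(query.lower().split())
            return sorted(corpus, key=lambda d: -len(words & set(d.lower().split())))[:k]

        def answer(system_prompt: str, user_input: str, corpus: list[str]) -> str:
            docs = search(user_input, corpus)
            # Retrieved text, system prompt, question: all just model input.
            return complete("\n\n".join([system_prompt, *docs, user_input]))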
  • 00:27:36
Okay so an extension of that is we're going to invoke the model
  • 00:27:38
    twice we're going to invoke the model
  • 00:27:39
    and we're going to give it our database
  • 00:27:40
    schema and a question from the user
  • 00:27:42
    we're going to say write some SQL that
  • 00:27:44
    answers this
  • 00:27:45
    question and this actually works for
  • 00:27:47
    simple queries and simple databases um
  • 00:27:49
    you get the results from the database
  • 00:27:51
    then you call the model again with all
  • 00:27:52
    those results and uh the user input uh
  • 00:27:56
    and it works but it's completely reliant
  • 00:27:59
    on the ability of the model to generate
  • 00:28:00
    SQL code and to do so correctly and also
  • 00:28:02
    you can't code review it before it runs
  • 00:28:03
    if you're using it in
  • 00:28:05
    production so some issues with this but
  • 00:28:07
    people are using it effectively and and
  • 00:28:09
    if you can get your query simple enough
  • 00:28:10
    and your schema simple enough you can
  • 00:28:12
get some reliability up there.
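    A hedged sketch of that two-call text-to-SQL pattern, with complete again standing in for the model; everything hinges on the generated SQL being correct:

        import sqlite3

        def complete(prompt: str) -> str:
            return "SELECT count(*) FROM people"   # stand-in for the model call

        def sql_qa(db: sqlite3.Connection, schema: str, question: str) -> str:
            # Call 1: ask the model to write SQL for the question.
            sql = complete(f"Schema:\n{schema}\nWrite one SQL query answering: {question}")
            rows = db.execute(sql).fetchall()   # runs whatever the model produced
            # Call 2: ask the model to phrase the raw rows as an answer.
            return complete(f"Question: {question}\nResults: {rows}\nAnswer briefly:")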
  • 00:28:14
    another big thing open AI just released
  • 00:28:16
    this feature a few weeks ago is tool use
  • 00:28:18
    I can alter my
  • 00:28:20
    inputs uh such that the model can emit
  • 00:28:24
    Json that matches a certain pattern
  • 00:28:26
    which then I can can pass off to an API
  • 00:28:29
    which may go to the side effect in the
  • 00:28:30
    world it could order a pizza and then it
  • 00:28:32
    can go back and feed it into the model
  • 00:28:34
    again so you know people talk about tool
  • 00:28:37
    use as if the model is doing something
  • 00:28:38
    incredible but all all it is is telling
  • 00:28:40
    the model to emit API calls and then
  • 00:28:43
    some external system has to observe
  • 00:28:45
    those and actually execute
  • 00:28:48
them.
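    A minimal sketch of that tool-use loop: the model only ever emits text, and an external program decides whether that text is an API call to execute. All names here are invented.

        import json

        def complete(prompt: str) -> str:   # stand-in for the model call
            return '{"tool": "order_pizza", "args": {"size": "large"}}'

        TOOLS = {"order_pizza": lambda args: f"ordered a {args['size']} pizza"}

        def step(prompt: str) -> str:
            out = complete(prompt)
            try:
                call = json.loads(out)      # did the model emit a tool call?
            except json.JSONDecodeError:
                return out                  # plain text: nothing to execute
            # We, not the model, actually execute the call.
            return "Tool result: " + TOOLS[call["tool"]](call["args"])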
  • 00:28:50
And the other big thing that is really highly hyped these
  • 00:28:54
    days and there's a billion startups and
  • 00:28:56
    people are talking about this is you
  • 00:28:57
    know going to lead to AGI and whatnot um
  • 00:28:59
    is this concept of Agents all agents are
  • 00:29:01
    is arbitrary combinations of the above
  • 00:29:03
    patterns and invoking language models
  • 00:29:06
    recursively that's it at the end of the
  • 00:29:08
    day each model invocation is still a
  • 00:29:10
    pure function of input to output and
  • 00:29:11
    we're still just trying to Marshall up
  • 00:29:13
    the correct inputs at each phase and
  • 00:29:16
    this is actually I think closer building
  • 00:29:18
    a good Agent I think is closer to
  • 00:29:20
    traditional software engineering than it
  • 00:29:22
    is to you know magic AI programming
  • 00:29:28
    it is different from traditional
  • 00:29:29
    programming in one way did it work um
  • 00:29:31
    the model output is always going to be a
  • 00:29:32
bit opaque it is going to be deterministic
  • 00:29:35
but it will be opaque and it can be
  • 00:29:36
    non-deterministic if you decide to turn
  • 00:29:38
    up the randomness
  • 00:29:40
    um it's not like regular programming
  • 00:29:42
    where once a function works on a variety
  • 00:29:44
    of test cases you can be pretty sure it
  • 00:29:46
    works um it needs to work across all
  • 00:29:49
    test cases and the only way to validate
  • 00:29:51
that is statistical um you have to
  • 00:29:54
    apply experimental techniques to
  • 00:29:56
    actually give it a variety of inputs and
  • 00:29:57
    then see see what your uh result uh
  • 00:30:00
    success rate is and do datadriven
  • 00:30:02
    analysis of the results and you need to
  • 00:30:04
    know your problem domain like for some
  • 00:30:06
    problem domains 90% accuracy may be
  • 00:30:08
    great um for other domains you may need
  • 00:30:11
    five nines of accuracy um probably not
  • 00:30:13
    going to get that from language model
  • 00:30:15
    ever but you need to know what that
  • 00:30:17
    number
  • 00:30:20
is.
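    A sketch of that statistical validation in Python, with a fake pipeline standing in for a real model-backed system:

        import random

        def pipeline(inp: str) -> str:
            # Fake stand-in for a model-backed pipeline: right ~90% of the time.
            return inp.upper() if random.random() < 0.9 else "garbage"

        def success_rate(cases: list[tuple[str, str]]) -> float:
            ok = sum(1 for inp, want in cases if pipeline(inp) == want)
            return ok / len(cases)

        cases = [("hello", "HELLO")] * 1000
        print(f"{success_rate(cases):.1%} correct")   # compare to your domain's bar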
  • 00:30:24
All right, so that's the state of language model programming today: all
  • 00:30:28
the furor and activity and I can't keep up
  • 00:30:30
    with all of it but everything I have
  • 00:30:31
    kept up with and have read falls into
  • 00:30:32
    two categories it's either improving the
  • 00:30:34
    models themselves and like the the core
  • 00:30:36
    data science used to train them or it's
  • 00:30:38
    working on what are techniques for
  • 00:30:41
    giving the models better inputs so that
  • 00:30:43
    we can get better outputs and what kind
  • 00:30:44
    of programs can we write up as a
  • 00:30:46
    scaffolding around the models to to
  • 00:30:48
    formulate those
  • 00:30:52
prompts um mixed success this is a very
  • 00:30:55
    active field sometimes they work well
  • 00:30:56
    sometimes they don't
  • 00:31:00
    one problem
  • 00:31:02
    well one thing I do well in programming
  • 00:31:05
    we all do well here is uh data and logic
  • 00:31:07
    we've been working with data and Logic
  • 00:31:09
    for quite some time business logic data
  • 00:31:11
    databases we're all very comfortable
  • 00:31:12
    with those um and we write programs that
  • 00:31:15
    work between them a lot we also have
  • 00:31:18
    language um now we can now work with
  • 00:31:21
    language using the techniques I just
  • 00:31:23
    described but it's still how do I get my
  • 00:31:26
    data to meet my language
  • 00:31:28
    right I I can just shove it in the
  • 00:31:29
    prompt and I have to shove it in the
  • 00:31:31
    prompt the context the input to the pure
  • 00:31:34
    function that's the only thing I can do
  • 00:31:37
    there is no other way I can make my data
  • 00:31:40
    accessible to a system what what's the
  • 00:31:43
    best way to do
  • 00:31:45
    What possible technology lives at the intersection of data and logic and language, with a foot in each world, such that I can work with it in a very data-oriented way on the data side and in a very language-oriented way on the language side?
  • 00:31:59
    And obviously this is a leading question: it is RDF.
  • 00:32:06
    So we should be putting RDF data in our prompts, and when we are asking to get more structured data out of models, we should be asking for it in RDF format. And this works quite well.
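For instance, a minimal sketch of the first direction, putting Turtle in the prompt; the tiny family graph and the send() call are illustrative assumptions, not any particular API:

```python
# A minimal sketch of putting RDF in a prompt: paste Turtle into the context
# and then ask a question in natural language. The graph and the send() hook
# wrapping whatever model you use are assumptions for illustration.
TURTLE_DATA = """
@prefix : <http://example.org/family#> .
:luke :childOf :mary .
:mary :bornIn :philadelphia .
"""

prompt = (
    "Here is some RDF data in Turtle format:\n"
    + TURTLE_DATA
    + "\nUsing only these facts, where was Luke's mother born?"
)
# answer = send(prompt)   # hypothetical model call
```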
  • 00:32:17
    So, at a syntactic level. Well, let's step back and talk about Noam Chomsky; I always love to do that. Good old Noam, he's still kicking around. This book establishes the concept of a generative grammar, which is a language model: a simple language model that fits on a page. That math right there is his language model; it's a generative grammar of language and how it works. He built it by observing many languages and trying to figure out what the essence of language is, of particular languages or of all languages. Chomsky believed that there is a biological basis, that these rules are actually in human brains, that there was some mutation that gave us these rules, and that that's why humans are language-using creatures.
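As a toy illustration of what a generative grammar of this kind looks like; the productions and lexicon below are illustrative stand-ins, not Chomsky's actual rules:

```python
# A toy generative grammar: a few rewrite rules that expand S into
# subject-verb-object sentences, in the spirit of the formalism above.
import random

RULES = {
    "S":   [["NP", "VP"]],
    "NP":  [["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"], ["a"]],
    "N":   [["graph"], ["model"], ["statement"]],
    "V":   [["describes"], ["links"]],
}

def generate(symbol="S"):
    if symbol not in RULES:              # terminal word, emit it as-is
        return [symbol]
    expansion = random.choice(RULES[symbol])
    return [word for part in expansion for word in generate(part)]

print(" ".join(generate()))  # e.g. "the model links a statement"
```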
  • 00:33:01
    And this is different for every actual language, but one thing that is super foundational is subjects, predicates, and objects, or subjects, verbs, and objects. There are languages that stretch the definition in one way or another, or leave it a little more confusing, but there's something pretty fundamental to cognition in this, and that's what Chomsky is exploring in this book. And so when you go back to a language model: because the language model is trained on language, those concepts are also baked into the language model. They are captured; they are measured as part of the model-making process, the process of measuring language. And so it is really surprisingly easy to go back and forth between natural language and RDF.
  • 00:33:45
    paste it into chat GPT and say give the
  • 00:33:47
    facts here an rdf format and if you
  • 00:33:49
    wanted to do an even better job you can
  • 00:33:50
    say give me the facts and here's the the
  • 00:33:52
    predicates I really care about you know
  • 00:33:54
    given this list of predicates find
  • 00:33:56
    anything in here that could be to those
  • 00:33:58
    predicates and give it to me in rdf
  • 00:33:59
    format and the other way works well too
  • 00:34:01
    you can give it rdf data and then just
  • 00:34:02
    have a conversation with that data
  • 00:34:04
    really easily um and it works better in
  • 00:34:07
    my experience than you know trying to
  • 00:34:09
    upload csvs or spreadsheets or any of
  • 00:34:11
    the other ways you can get structured
  • 00:34:12
    data into a model because they're just
  • 00:34:14
    statements and the difference between a
  • 00:34:16
    statement in language and a statement in
  • 00:34:19
    rdf is not that big a conceptual
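A minimal sketch of that predicate-constrained extraction prompt; the schema.org predicate list and the wording are just one plausible choice:

```python
# A sketch of the extraction direction: give the model a predicate list and
# ask for Turtle back. The predicates below are real schema.org properties,
# but the list and prompt wording are illustrative assumptions.
PREDICATES = ["schema:parent", "schema:birthPlace", "schema:worksFor"]

def extraction_prompt(text: str) -> str:
    return (
        "Given this list of predicates:\n"
        + "\n".join(f"  {p}" for p in PREDICATES)
        + "\n\nExtract any facts in the following text that fit those "
        "predicates, and return them as RDF in Turtle format.\n\n"
        + text
    )

# prompt = extraction_prompt(article_text)  # then send to any chat model
```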
  • 00:34:23
    Tool use. It also does a good job with tool use. There are many things that people currently use tool use for. Say I have this question, "Who are Luke's parents?", and I want to ask it of the model, and I want it to use a variant of tool use, which is query generation. Say I want to do this with RDF: the model can emit an RDF query, "Luke is the child of whom?". It can convert that English question into this RDF question.
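A minimal sketch of that round trip using rdflib, a real Python RDF library; the :childOf vocabulary and the family graph are illustrative assumptions:

```python
# The model turns "Who are Luke's parents?" into a SPARQL query, and the
# program runs it against the graph. Data and vocabulary are made up.
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix : <http://example.org/family#> .
:luke :childOf :mary, :john .
""", format="turtle")

# The kind of query a model might emit for "Who are Luke's parents?"
query = """
PREFIX : <http://example.org/family#>
SELECT ?parent WHERE { :luke :childOf ?parent }
"""
for row in g.query(query):
    print(row.parent)   # ...#mary, ...#john
```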
  • 00:34:54
    And here's where RDF shines. So far this is not that different from SQL, right? It's just a different query format. But what if my RDF implementation supports reasoning?
  • 00:35:05
    Now the language model is asking a different question: "Who is Luke a descendant of?" It's a different question I can ask, but the language model doesn't know any different. To the language model, this is exactly the same sort of question, where we're querying about a property of Luke, even though under the hood there are probably a bunch of Datalog rules firing to answer this question and return the result set. But the key point is that all that complexity is abstracted out of the model, and if you did something like that in SQL, you would have to put all of it into the model and make its tool use much more complex. So RDF really simplifies tool use, at least as far as tool use involves computing over data.
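A sketch of the same idea; rdflib ships no Datalog engine, so a SPARQL 1.1 property path stands in for the reasoner here, and the point is that the query the model emits barely changes shape:

```python
# "Who is Luke a descendant of?" looks to the model like the parent query,
# while the transitive closure is computed underneath. Data is made up;
# in a real reasoning store, inference rules would do this work instead.
from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix : <http://example.org/family#> .
:luke :childOf :mary .
:mary :childOf :ada .
:ada  :childOf :rosa .
""", format="turtle")

query = """
PREFIX : <http://example.org/family#>
SELECT ?ancestor WHERE { :luke :childOf+ ?ancestor }
"""
for row in g.query(query):
    print(row.ancestor)  # mary, ada, rosa: the whole chain
```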
  • 00:35:44
    And you can even ask super complex, much more open-ended questions over structured data: "Who am I?", or "What is the relationship between Luke and Rembrandt?" That's a very open-ended question; a language model can't answer it. But if I have a full genealogical database and I have the correct inference rules in there, this query can answer it precisely: Luke is Rembrandt's 13-times-removed great-uncle, which is true; he really is way back there in the family tree. But that's structured data; that is a true fact, not a maybe-fact pattern that the model maybe saw somewhere on the internet. It's actual logic and reasoning that gives me that answer.
  • 00:36:25
    Let's talk about semantics. We know language models model semantics; how do they do it? It's hard to get into in our remaining time, but stated briefly: models have what's called a latent space, which is a high-dimensional vector space. Technically it's a semantic field with a distance metric, but it's basically a high-dimensional mathematical construct such that two points that are close in this mathematical space are also closely related somehow, conceptually. This is really abstract, though. The latent space of a model is pretty opaque. There's a lot of research on how to observe these spaces and how to interpret them, but they're mostly opaque to humans, even though they are a measurement of language. It's just a mathematical object with thousands and thousands of dimensions; the human brain doesn't easily wrap around it.
  • 00:37:23
    Well, how does RDF model data? RDF models data as a graph: concepts that are linked to other concepts. The human brain is pretty good at grasping data in this format.
  • 00:37:37
    So: a knowledge graph, linked resources; I have a bunch of information. What you can do conceptually, and I'm still working on the best ways to do this in practice, is project your RDF conceptual map into that conceptual space: you can embed your concrete logical symbols and concepts into the latent space. And this does a few things. It gives you interpretability of that conceptual latent space.
  • 00:38:07
    You can say: oh, this region of the space is where that fact landed; that tells me something about the topology of the space. I can overlay them on top of each other.
  • 00:38:18
    of each
  • 00:38:19
    other so it also gives me
  • 00:38:22
    insights I might say hey I had these two
  • 00:38:24
    entities that I embedded in the model
  • 00:38:27
    and landed pretty close I'd never
  • 00:38:29
    thought of them as close before maybe I
  • 00:38:30
    should explore that relationship kind of
  • 00:38:33
    a soft way right so if you're trying to
  • 00:38:36
    do exploratory information based on the
  • 00:38:38
    data you have that can be really
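A minimal sketch of that projection, treating each triple as a sentence to embed; sentence-transformers is a real library, but the facts, labels, and model choice are illustrative assumptions:

```python
# Project RDF facts into an embedding space and look at pairwise closeness.
# Facts that land near each other but are not linked in the graph are
# candidates for the "maybe I should explore that" relationships.
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

facts = [
    ":luke :livesIn :philadelphia",
    ":rembrandt :bornIn :leiden",
    ":luke :descendantOf :rembrandt",
]
model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(facts)        # one point in the space per fact
print(cosine_similarity(vectors))    # pairwise closeness of the facts
```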
  • 00:38:41
    The other thing you can do is soft inference. What's one of the reasons the semantic web failed? Or indeed, why did attempts to solve AI with rule-based systems and logic alone fail? It's because the world is more full of rules than anyone ever has the patience to write down. There are so many aspects of the world, so much common knowledge, so many implicit assumptions, that it is impossible for a human to enumerate them all. And people tried: go look up the Cyc project, C-Y-C; they really tried. But it's hard, and even if I did it, my RDF graph would soon grow intractably large.
  • 00:39:24
    But what the language model can do, for really implicit things that are obvious, is fill in the gaps: I can just ask the language model to give me RDF expressing the relationship between arbitrary objects, and it will just spit out a set of facts that are most likely true, because they're implicit in the world and they're implicit in what has been trained into the model.
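A minimal sketch of a soft-inference prompt; the wording and the example resources are assumptions:

```python
# Ask the model for Turtle triples relating two arbitrary resources; the
# plausible ones can then be fed to the hard reasoner alongside hand-written
# rules. The prompt wording and any client used to send it are assumptions.
def soft_inference_prompt(a: str, b: str) -> str:
    return (
        f"In RDF Turtle, using any common-sense predicates you like, "
        f"state the most likely relationships between {a} and {b}. "
        f"Output triples only."
    )

prompt = soft_inference_prompt("a kitchen", "a dishwasher")
# A model will typically emit something like
#   :kitchen :contains :dishwasher .
# which is probably, but not certainly, true: soft inference.
```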
  • 00:39:46
    They may not be 100% accurate; again, models are probabilistic, and that's why I call this soft inference. But it means we now have the hard inference of our reasoning system, our inference engine, and the soft inference of the language model, and if you combine those together you can do a lot of reasoning that you couldn't do with either one alone. I think that's pretty compelling.
  • 00:40:04
    This is the central insight behind what's called neurosymbolic AI. It's a small subfield of AI research: basically, any research that is trying to combine the abstract, fuzzy neural network with hard, concrete logical symbols. There are a bunch of different approaches to it. Some people are using Prolog to do this, in much the same way I described for RDF, just with Prolog instead of RDF. Other people are trying to encode symbols as items in the high-dimensional vector space and then use that for training the models. There are a lot of complicated things people are doing. But if you're interested in this: this approach of using RDF to interact with language models is a specific approach to neurosymbolic AI.
  • 00:40:53
    So, finally. The biggest problem I think we have with AI is that a lot of people are using it for the wrong things. I don't want it to do my writing for me, or my singing for me, or my music-playing for me. I don't even necessarily want it to do my coding for me; I find the code it produces kind of slop. But a lot of programming is dishes and laundry, the dishes and laundry of data, just like you were saying earlier. And particularly on the data side, I think this is a tool that can actually automate a lot of the dishes and the laundry of working with data. I'm trying to build it; this is what I'm working on now, and if anyone is interested in talking about that, I'd love to chat with you. So yeah: we're going to bring back the semantic web, with some AI under the hood. All right, thanks.
Tags
  • RDF
  • language models
  • AI ethics
  • technology
  • knowledge representation
  • semantic web
  • neurosymbolic AI
  • Transformer
  • data modeling