A conversation with OpenAI's CPO Kevin Weil, Anthropic's CPO Mike Krieger, and Sarah Guo

00:40:58
https://www.youtube.com/watch?v=IxkvVZua28k

Summary

TL;DR: In a panel discussion, Kevin and Mike, leaders in AI product development, share their insights on the rapid transformation of the AI landscape and its implications for product management. They articulate their excitement about AI's evolving capabilities and underscore the challenge of building AI products that meet user needs. They highlight the importance of user feedback in shaping product experiences and the evolving role of product managers in an AI-dominated environment. Key themes include the need for product managers to master evaluation skills, the opportunity to create proactive AI interactions, and the way users form emotional bonds with AI entities. The discussion reflects on real-world applications of AI, the adaptability of users, and forecasts for the future of AI integration in everyday tasks.

Takeaways

  • 👑 Kevin and Mike are excited about their new roles in AI product development.
  • 💡 AI capabilities are evolving rapidly, changing product experiences.
  • 🤝 User feedback is crucial for refining and enhancing AI products.
  • 🚀 Future interactions with AI may become more proactive and personalized.
  • 🛠️ Product managers need to develop new skills for working with AI technologies.
  • 📈 Observing how users adapt can inform better design strategies.
  • 🍕 A memorable anecdote involved an AI ordering pizza during internal testing.
  • 🧠 Users form emotional connections with AI, seeing them as entities with personality.
  • ⚙️ Effective evaluation methods can significantly enhance AI product quality.
  • 🌍 Future AI may facilitate international communication through real-time translation.
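The evaluation theme above can be made concrete with a small sketch. This is an illustrative harness, not code from either company: a list of test cases with success criteria, a pluggable grader (which, per Mike's remark later in the talk, could itself be a model call), and a pass rate that tells you whether a capability is at "60%" or "85%". All names and the stub model here are hypothetical.

```python
# Minimal eval-harness sketch: grade a model's answers against per-case
# success criteria and report the pass rate. The grader is a stand-in
# keyword check; in practice it could be another model acting as judge.
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    must_contain: list[str]  # success criteria for this case

def keyword_grader(answer: str, case: EvalCase) -> bool:
    """Pass if the answer mentions every required keyword."""
    return all(kw.lower() in answer.lower() for kw in case.must_contain)

def run_eval(model: Callable[[str], str],
             cases: list[EvalCase],
             grader: Callable[[str, EvalCase], bool] = keyword_grader) -> float:
    """Return the fraction of cases the model passes."""
    passed = sum(grader(model(c.prompt), c) for c in cases)
    return passed / len(cases)

# Stub "model" so the harness runs without any API access.
def stub_model(prompt: str) -> str:
    return "Paris is the capital of France." if "France" in prompt else "I don't know."

cases = [
    EvalCase("What is the capital of France?", ["Paris"]),
    EvalCase("What is the capital of Peru?", ["Lima"]),
]
print(run_eval(stub_model, cases))  # 0.5: one of two cases passes
```

The point of the pluggable grader is the one the panel keeps returning to: the quality of the feature ends up gated on how well the cases and grading criteria are written, not just on the model.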

Timeline

  • 00:00:00 - 00:05:00

    The discussion opened with Sarah expressing excitement to be with Kevin and Mike, both known for their expertise in AI and previous roles at Instagram. She proposed discussing new product ideas but settled for a casual exchange of insights.

  • 00:05:00 - 00:10:00

    Kevin shared that he finds his new role in AI product management both challenging and fascinating, as it involves constantly adapting to new technological capabilities. He described the experience as sleepless but rewarding and highlighted the rapid evolution of AI technology.

  • 00:10:00 - 00:15:00

    Mike reflected on the various reactions he received from peers upon joining the AI team. While some were supportive, others questioned his choice given his previous semi-retirement. He emphasized that his passion for innovation drove him back into the tech world.

  • 00:15:00 - 00:20:00

    Both Kevin and Mike discussed their transition to enterprise roles, noting the differences from their previous consumer-focused experiences. They expressed excitement about engagement with enterprise clients and receiving direct feedback on how products are used, which differs from consumer feedback.

  • 00:20:00 - 00:25:00

    Kevin pointed out the unique challenges in enterprise product management, such as aligning with buyer goals and predicting market needs. He emphasized that understanding the audience and specific use cases is critical for successful AI product development.

  • 00:25:00 - 00:30:00

    The conversation turned toward the unpredictable nature of AI products, with Mike noting that product managers must remain flexible as they await the outcomes of model training and research insights. They both acknowledged the challenge of navigating evolving AI capabilities.

  • 00:30:00 - 00:35:00

    Mike shared insights on the importance of evaluation skills for product managers, stressing that a deeper understanding of AI models and effective eval writing will be essential for successful project outcomes moving forward.

  • 00:35:00 - 00:40:58

    The discussion concluded with a glimpse into the future of AI products, focusing on more personalized and proactive interactions, improving user experience through intuitive features, and how innovations will rapidly reshape both consumer and enterprise landscapes.


Video Q&A

  • What is one major topic discussed in the panel?

    The challenges and excitement of working in AI product development.

  • How do Kevin and Mike view the evolution of AI capabilities?

    They see it as rapidly advancing, with AI models becoming more sophisticated and useful.

  • What is one way AI is expected to change user interactions?

    Through more proactive and personalized communication with users.

  • Why is user feedback important in AI product development?

    It helps improve the AI's understanding and capability to meet user needs.

  • What kind of skills should product managers develop according to the discussion?

    Skills in writing evaluations and prototyping with AI models.

  • How do Kevin and Mike suggest educating users about AI?

    By leveraging power users in organizations to teach and share their experiences.

  • What interesting anecdote is mentioned regarding internal AI usage?

    An AI model successfully ordered pizza for the office during beta testing.

  • What do the speakers say about user adaptation to AI?

    They express amazement at how quickly users adapt to new technologies like AI.

  • What key aspect of future AI interactions is discussed?

    The idea of AI being proactive and able to handle complex tasks asynchronously.

  • What reflections do Mike and Kevin have on the emotional relationship users form with AI?

    They discuss users developing empathy towards the AI and how they respond to its personality.
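Kevin's point about a model that "can come back to you and say I'm not sure about this" can be sketched as a confidence-gated loop. This is a hypothetical illustration, not any product's real API — the threshold, function names, and stubs are invented: run each step through the model, and escalate to a human whenever the model's self-reported confidence falls below a cutoff, so the combined human-plus-model success rate can beat the model alone.

```python
# Human-in-the-loop sketch: escalate low-confidence steps to a person
# instead of letting the model guess.
from typing import Callable

CONFIDENCE_THRESHOLD = 0.8  # illustrative cutoff, not a real product setting

def complete_task(steps: list[str],
                  model: Callable[[str], tuple[str, float]],
                  ask_human: Callable[[str], str]) -> list[str]:
    """Run each step with the model; fall back to a human when unsure."""
    results = []
    for step in steps:
        answer, confidence = model(step)
        if confidence < CONFIDENCE_THRESHOLD:
            # "I'm not sure about this -- can you actually help me with this?"
            answer = ask_human(step)
        results.append(answer)
    return results

# Stubs standing in for a real model call and a real escalation channel.
def stub_model(step: str) -> tuple[str, float]:
    return (f"model answer for {step!r}", 0.9 if "easy" in step else 0.4)

def stub_human(step: str) -> str:
    return f"human answer for {step!r}"

out = complete_task(["easy formatting", "tricky edge case"], stub_model, stub_human)
print(out)
```

Even with a model that only succeeds on a fraction of steps, this shape of design keeps the product useful — the user edits or answers only where the model flags uncertainty, echoing the GitHub Copilot example from the discussion.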

Subtitles

  • 00:00:18
    all right hello
  • 00:00:20
    everyone okay okay Sarah you're the
  • 00:00:23
    queen of AI investing this is a phrase
  • 00:00:25
    never ever to be used again but it's
  • 00:00:27
    great to be here with both of you um so
  • 00:00:30
    I had two different ideas for our final
  • 00:00:33
    discussion the first was a product off
  • 00:00:37
    because these two men have the
  • 00:00:40
    merge-to-prod button both of them and I was
  • 00:00:42
    like oh please just release everything
  • 00:00:43
    we know is coming over the next six or
  • 00:00:45
    12 months ignore all internal guidelines
  • 00:00:48
    um the second was we just redesigned
  • 00:00:50
    Instagram together since they've both
  • 00:00:52
    actually run Instagram
  • 00:00:54
    before uh both of these got fully shot
  • 00:00:56
    down and so instead I think we'll just
  • 00:00:59
    like trade notes among friends like lame
  • 00:01:01
    I know but uh but really excited to hear
  • 00:01:03
    from you both anyway uh so this is
  • 00:01:06
    actually a relatively new role for both
  • 00:01:09
    of you Kevin let's start with you like
  • 00:01:12
    you've done a bunch of really different
  • 00:01:13
    interesting things like what was the
  • 00:01:14
    reaction you got when you took the job
  • 00:01:16
    from friends and the
  • 00:01:18
    team uh generally excitement I mean uh
  • 00:01:21
    it's I think it's one of the most
  • 00:01:23
    interesting and impactful roles there's
  • 00:01:24
    so much to figure out um I've never had
  • 00:01:29
    such a challenging
  • 00:01:32
    interesting uh Sleepless product role in
  • 00:01:35
    my life uh it's it's got all the
  • 00:01:38
    challenges of a normal product role
  • 00:01:40
    where you're trying to figure out who
  • 00:01:41
    you're building for and what problems
  • 00:01:43
    you can solve and things like that but
  • 00:01:46
    normally when you're Building Product
  • 00:01:47
    You're Building off of kind of a fixed
  • 00:01:49
    technology base right you know what you
  • 00:01:50
    have to work with and uh and you're
  • 00:01:53
    trying to build the best product you can
  • 00:01:56
    here it's like every two months
  • 00:01:58
    computers can do something computers
  • 00:01:59
    have never been able to do before in the
  • 00:02:01
    history of the world and you're trying
  • 00:02:02
    to figure out how that changes your
  • 00:02:05
    product and the answer should probably
  • 00:02:07
    be a fair amount um and so it just it's
  • 00:02:11
    so
  • 00:02:12
    interesting uh and fascinating to see on
  • 00:02:15
    the inside as AI gets developed but uh
  • 00:02:17
    I've been having a
  • 00:02:18
    blast Mike what about you I I remember
  • 00:02:21
    hearing the news I was like oh I didn't
  • 00:02:23
    know you could convince the founder of
  • 00:02:24
    Instagram to go work on something that
  • 00:02:26
    existed already yeah my favorite three
  • 00:02:28
    reactions like people who know me were
  • 00:02:29
    like that makes sense like you're going
  • 00:02:31
    to like have fun there uh the middle
  • 00:02:33
    people were like why like you don't have
  • 00:02:35
    to work like why are you doing this and
  • 00:02:37
    like then if you knew me you know me
  • 00:02:38
    like I can't not and I think that like I
  • 00:02:40
    couldn't stop myself and the third was
  • 00:02:42
    like oh you could hire the founder of
  • 00:02:44
    Instagram which was also fun and it's
  • 00:02:45
    like I mean not many people could but
  • 00:02:47
    like there's like probably a list of
  • 00:02:49
    three companies that would have been
  • 00:02:50
    interesting um and so yeah there's like
  • 00:02:52
    a range of reactions depending on how
  • 00:02:54
    well you knew me and how like you've
  • 00:02:56
    seen me in my like uh semi-retired state
  • 00:02:59
    which lasted like six weeks and I was
  • 00:03:00
    like all right what are we doing
  • 00:03:02
    next um so we had dinner together with a
  • 00:03:05
    bunch of friends recently and you I was
  • 00:03:08
    like impressed by the childish Delight
  • 00:03:11
    that you had around like yeah I'm
  • 00:03:12
    learning about all this Enterprise stuff
  • 00:03:15
    like tell me if it's about serving
  • 00:03:18
    customers that are not all of us with
  • 00:03:19
    Instagram or uh just you know working in
  • 00:03:22
    a a an organization that's research
  • 00:03:24
    driven like what's the biggest surprise
  • 00:03:26
    so far I those are two I think both very
  • 00:03:29
    worthwhile pieces of this role that are
  • 00:03:31
    like very new to me as well like I um
  • 00:03:33
    when I was 18 I made this like very
  • 00:03:34
    18-year-old vow which was like every
  • 00:03:36
    year of my life I wanted to be different
  • 00:03:37
    like I don't have the same year twice
  • 00:03:39
    and I was like it's why like you know I
  • 00:03:41
    didn't there's been times where I was
  • 00:03:42
    like oh like another social product I'm
  • 00:03:44
    like doing that again first of all like
  • 00:03:46
    your bar is like really distorted and
  • 00:03:48
    second of all like just would feel like
  • 00:03:49
    too much of the same thing so yeah
  • 00:03:51
    Enterprise has been wild I'm really
  • 00:03:53
    curious about your experience with that
  • 00:03:54
    as well like you're like you know uh
  • 00:03:57
    your your feedback GL I actually imagine
  • 00:03:59
    it's a lot more like investing is far
  • 00:04:00
    longer right you're like you have that
  • 00:04:02
    initial convo and you're like I think
  • 00:04:03
    they like me and then you're like oh no
  • 00:04:05
    it's now in like some requisition State
  • 00:04:07
    and it's going to take like six months
  • 00:04:09
    before they like even get to deployment
  • 00:04:11
    before you know whether it's right and
  • 00:04:12
    so like getting used to that pace we're
  • 00:04:14
    like what why hasn't this shipped yet
  • 00:04:16
    they're like Mike you've been here two
  • 00:04:17
    months like this is like it's making its
  • 00:04:19
    way through the VPS like it's going to
  • 00:04:21
    get there eventually so like getting
  • 00:04:22
    used to different timelines for sure but
  • 00:04:24
    like the part that is fun is actually
  • 00:04:26
    getting the feedback in the kind of
  • 00:04:28
    Engagement where you're like once it
  • 00:04:29
    gets the deployed you have somebody that
  • 00:04:31
    can call you and you can call them and
  • 00:04:32
    be like how's it working for you like is
  • 00:04:34
    this good like whereas users like you're
  • 00:04:36
    doing like data science in aggregate
  • 00:04:37
    and like sure you can bring in like one
  • 00:04:39
    or two people but it's like they won't
  • 00:04:41
    they don't have enough financial
  • 00:04:42
    incentive riding on telling you where
  • 00:04:44
    you suck and where you're doing well and
  • 00:04:45
    like that's been a different but also
  • 00:04:47
    like rewarding side of that for sure ke
  • 00:04:50
    Kevin you've worked on such a wide range
  • 00:04:52
    of products before how much do your
  • 00:04:54
    instincts like apply yeah I was going to
  • 00:04:57
    add on to the Enterprise Point too um
  • 00:04:58
    and then I'll get to like the other
  • 00:05:01
    interesting thing about Enterprise is
  • 00:05:02
    it's not necessarily about the product
  • 00:05:04
    right there's a buyer and they have
  • 00:05:07
    goals and you could build the best
  • 00:05:09
    product in the world that all the people
  • 00:05:10
    at the company might be happy to use and
  • 00:05:12
    it still doesn't necessarily matter
  • 00:05:14
    exactly um I was in a a meeting with one
  • 00:05:18
    of our big Enterprise customers and they
  • 00:05:20
    were like um this is great we're really
  • 00:05:23
    happy da D you know the one thing we
  • 00:05:25
    need is we really need you to tell us 60
  • 00:05:29
    days before you launch
  • 00:05:31
    anything and I was like I also would
  • 00:05:34
    like to know 60
  • 00:05:37
    days so very very different actually and
  • 00:05:40
    it's interesting right because at open
  • 00:05:41
    AI we have a consumer product and we
  • 00:05:44
    have an Enterprise product and we have a
  • 00:05:46
    developer product so we're kind of doing
  • 00:05:48
    all at once um Instinct
  • 00:05:53
    wise in I'd say in like half the job it
  • 00:05:56
    works you know when you have when you
  • 00:05:59
    have a sense of the product you're
  • 00:06:00
    trying to build you know we're getting
  • 00:06:03
    towards uh you know the end of shipping
  • 00:06:06
    uh Advanced speech mode or something or
  • 00:06:08
    you're getting towards shipping canvas
  • 00:06:10
    and you're making final touches trying
  • 00:06:12
    to understand who you're building for
  • 00:06:14
    and like what you know exactly what
  • 00:06:15
    problems you're trying to solve it it
  • 00:06:18
    works then because it's that's a little
  • 00:06:19
    bit more like the tail end of it is
  • 00:06:21
    shipping a normal product but the
  • 00:06:23
    beginning of these things is nothing
  • 00:06:25
    like that
  • 00:06:27
    um so uh
  • 00:06:31
    like there will just be these
  • 00:06:33
    capabilities that we don't
  • 00:06:35
    know you you have some sense as you're
  • 00:06:38
    training some new model that it might
  • 00:06:41
    have capability X you don't really know
  • 00:06:44
    nor does the research team nor does
  • 00:06:46
    anybody right you're like I think this
  • 00:06:48
    might be possible and it's kind of like
  • 00:06:50
    peering through the mist but it's
  • 00:06:52
    this emergent property of a model and
  • 00:06:55
    you know so you don't know whether it's
  • 00:06:56
    going to really work and you don't know
  • 00:06:58
    whether it's going to be like 60% good
  • 00:07:01
    or 90% good or 99% good and the product
  • 00:07:06
    that you would build that would make
  • 00:07:08
    sense with something that works 60% of
  • 00:07:10
    the time is Super different than 90 or
  • 00:07:12
    99% of the time right so you're kind of
  • 00:07:14
    just waiting and you're you know at
  • 00:07:16
    least like I don't know if you feel this
  • 00:07:18
    checking in with the research team from
  • 00:07:20
    time to time like hey guys how's it
  • 00:07:22
    going how's that model training uh any
  • 00:07:25
    any insight on this and they're like
  • 00:07:27
    it's research we're working on it you
  • 00:07:29
    know it's uh we don't know either we're
  • 00:07:31
    we're we're we're working through this
  • 00:07:32
    at the same time and it's I mean it
  • 00:07:35
    makes it super fun because you're kind
  • 00:07:36
    of like discovering things together but
  • 00:07:39
    very sort of stochastic too it's the
  • 00:07:41
    thing it most reminds me of like from
  • 00:07:43
    the Instagram days where like apple like
  • 00:07:45
    wwc announcements you're like this could
  • 00:07:47
    either be awesome for us or could like
  • 00:07:49
    absolutely like cause chaos for it it's
  • 00:07:51
    like that but your own company is the
  • 00:07:53
    one kind of disrupting you from within
  • 00:07:55
    which is like a very like it's very cool
  • 00:07:57
    but also like oh this might totally upend
  • 00:07:59
    my product road map now
  • 00:08:02
    yeah uh what does that cycle look like
  • 00:08:05
    for both of you you described it as um
  • 00:08:08
    you know like peering through the Mist
  • 00:08:09
    trying to look at the next set of
  • 00:08:11
    capabilities I mean can you can you plan
  • 00:08:14
    if you don't know exactly what is coming
  • 00:08:16
    and what is the iteration cycle to
  • 00:08:17
    discover new things that should belong
  • 00:08:18
    in your product I think like on the
  • 00:08:20
    intelligence side you can sort of squint
  • 00:08:22
    and see like all right it's advancing
  • 00:08:24
    this way and so the kinds of things that
  • 00:08:26
    you'll want to do with the model and
  • 00:08:28
    start building the product around that
  • 00:08:29
    so I I there's three ways right
  • 00:08:31
    intelligence feels not predictable but
  • 00:08:33
    at least on like a slope that you can
  • 00:08:35
    kind of watch there's the capabilities
  • 00:08:37
    you decide to invest in from the product
  • 00:08:38
    side and then do fine-tuning with the
  • 00:08:40
    actual research teams um so something
  • 00:08:42
    like artifacts we spend a lot of time
  • 00:08:44
    between research I think the same was
  • 00:08:45
    true with canvas right like you're doing
  • 00:08:47
    a like co-design co- research co-
  • 00:08:49
    finetune and that's like I think a real
  • 00:08:51
    privilege of getting to work at this
  • 00:08:52
    company and getting to do design there
  • 00:08:53
    and then there's the capability front so
  • 00:08:55
    maybe speech mode for open a for us it's
  • 00:08:57
    the computer use uh work that we we
  • 00:08:59
    released this week you're like all right
  • 00:09:01
    60% all right got good yes all right
  • 00:09:04
    yeah and like so what we try to do is
  • 00:09:06
    embed designers early in the process but
  • 00:09:08
    knowing that like you're not placing a b
  • 00:09:10
    like the the experimentation talk was
  • 00:09:12
    saying like your um your output for
  • 00:09:14
    experiment should be learning not
  • 00:09:15
    necessarily like perfect products you're
  • 00:09:16
    going to ship every time I think the
  • 00:09:17
    same is true when you partner with
  • 00:09:18
    research like your outcome is hopefully
  • 00:09:20
    demos or informative things that like
  • 00:09:22
    could spark product ideas not like a
  • 00:09:24
    predictable product process where you're
  • 00:09:26
    like well it's this de-risked by now
  • 00:09:28
    which means it's going to look that way
  • 00:09:29
    when research comes along I've also one
  • 00:09:32
    thing that I've really enjoyed because
  • 00:09:34
    research is at least parts parts of
  • 00:09:36
    research are very product oriented
  • 00:09:38
    especially on the post-training side
  • 00:09:39
    like Mike was saying and then parts of
  • 00:09:41
    it are really like academic research at
  • 00:09:43
    some level and so you we'll just also
  • 00:09:47
    occasionally hear about some capability
  • 00:09:49
    and we'll be in a meeting and you'll be
  • 00:09:51
    like oh I really wish we could do this
  • 00:09:52
    thing and a researcher on the team will
  • 00:09:55
    be like oh no we can do that we've had
  • 00:09:57
    that for three months and we're like
  • 00:09:59
    really what does that like okay where do
  • 00:10:02
    I learn more and they're like oh well we
  • 00:10:03
    didn't think we didn't we didn't know it
  • 00:10:05
    was important so you know I'm working on
  • 00:10:07
    this other thing now um but you do just
  • 00:10:10
    get like magic happening sometimes
  • 00:10:13
    too uh one thing like we think a lot
  • 00:10:16
    about when we're investing is actually
  • 00:10:18
    like can you do anything with a model if
  • 00:10:21
    it is 60% successful at a task instead
  • 00:10:23
    of 99% and unlike lots of tasks that's
  • 00:10:26
    closer to 60 right but the task is
  • 00:10:28
    really important valuable like how do
  • 00:10:30
    you how do you think about that
  • 00:10:32
    internally in terms of evaluating
  • 00:10:34
    progression on a task and then what what
  • 00:10:37
    types of things you like put in sort of
  • 00:10:40
    the burden of product to make it
  • 00:10:42
    graceful failure or to like sort of
  • 00:10:44
    cross the last mile for the user versus you know
  • 00:10:47
    we just need to wait for the models to
  • 00:10:48
    get better I'd argue there are a lot of
  • 00:10:50
    things that you can actually do when
  • 00:10:52
    something is 60% right you just you just
  • 00:10:54
    need to really design for it um you have
  • 00:10:56
    to expect that there's a human in the
  • 00:10:58
    loop a lot more than there would be
  • 00:10:59
    otherwise like if you look at uh take
  • 00:11:02
    like GitHub co-pilot right that was kind
  • 00:11:04
    of the first AI product that really open
  • 00:11:07
    people's eyes to like this thing can be
  • 00:11:09
    useful not just as you know Q&A but for
  • 00:11:11
    really economically valuable work and
  • 00:11:15
    that launched I don't know exactly which
  • 00:11:16
    model that was built off of but I mean
  • 00:11:18
    it was multiple Generations ago so I
  • 00:11:20
    guarantee you that model wasn't perfect
  • 00:11:22
    at anything related to coding I think
  • 00:11:24
    it's gpt2 which is like pretty small so
  • 00:11:27
    yeah I mean and so but the fact that it
  • 00:11:29
    was still valuable for you cuz if it got
  • 00:11:31
    the code you know some significant
  • 00:11:34
    fraction of the way there that was still
  • 00:11:36
    stuff you didn't have to type yourself
  • 00:11:37
    and you could edit it and so there are
  • 00:11:39
    experiences like that that I think
  • 00:11:40
    totally work I think we'll see the same
  • 00:11:42
    kinds of things happening with um with
  • 00:11:45
    sort of the the shift towards uh agents
  • 00:11:48
    and longer form tasks where you know it
  • 00:11:51
    may not be perfect but if it can save
  • 00:11:54
    you five or 10 minutes that's still
  • 00:11:55
    valuable and even more if the model can
  • 00:11:57
    understand where it doesn't have
  • 00:12:00
    confidence and can come back to you and
  • 00:12:01
    say I'm not sure about this can you
  • 00:12:02
    actually help me with this then you know
  • 00:12:05
    the the combination of human and model
  • 00:12:07
    together can be much higher than 60% I
  • 00:12:09
    also find that 60% this magic 60% number
  • 00:12:12
    like it's kind of lumpy I made it up five
  • 00:12:14
    minutes ago that was a takeaway 60% that
  • 00:12:17
    is our new that's the Mendoza Line of uh
  • 00:12:19
    of AI like I think it's often very lumpy
  • 00:12:22
    where like it'll do very well on some
  • 00:12:23
    tasks and not well on others and I think
  • 00:12:25
    that also helps like uh whenever we run
  • 00:12:27
    like pilot programs with customers is
  • 00:12:29
    really interesting when we'll get like
  • 00:12:30
    the same day feedback from two different
  • 00:12:32
    companies one will be like it's solved our
  • 00:12:33
    whole problem like we've been trying to
  • 00:12:35
    do this for three months thank you the other will
  • 00:12:36
    be like it was way off it's like worse
  • 00:12:38
    than the other model and so like uh it's
  • 00:12:40
    also humbling to know that you have your
  • 00:12:42
    own internal evals but like the rubber
  • 00:12:45
    hitting the road and actually seeing the
  • 00:12:46
    model out in the world is where it's
  • 00:12:48
    kind of the equivalent of like you do all
  • 00:12:49
    this design and then like you put it in
  • 00:12:50
    front of one user and you're like oh wow
  • 00:12:52
    I was wrong uh the model has that
  • 00:12:54
    feeling as well we like we try as hard
  • 00:12:56
    as we can to like have a good sense but
  • 00:12:58
    then people have their own custom data
  • 00:13:00
    sets they have their own internal use
  • 00:13:01
    they've prompted it a certain way and
  • 00:13:03
    like uh so that belies that sort of
  • 00:13:05
    almost like bimodal nature of when you
  • 00:13:07
    actually put it out in the world I'm
  • 00:13:09
    curious if you feel
  • 00:13:10
    this I think there's a very real sense
  • 00:13:13
    in which models today are not
  • 00:13:15
    intelligence limited they're eval
  • 00:13:17
    limited yeah they can actually do much
  • 00:13:19
    more and be much more correct on a wider
  • 00:13:21
    range of things than they are today and
  • 00:13:23
    it's really about sort of teaching them
  • 00:13:25
    they have the intelligence you need to
  • 00:13:27
    teach them certain specific topics that
  • 00:13:29
    you know maybe weren't in their original
  • 00:13:31
    training set but they can do it if you
  • 00:13:32
    do it right yeah we've seen that all the
  • 00:13:33
    time where like um uh there was a lot of
  • 00:13:36
    like exciting AI deployments that
  • 00:13:38
    happened in like you know maybe three
  • 00:13:40
    years ago and now they're like we think
  • 00:13:41
    the new models are better but we never
  • 00:13:43
    did evals because all we were doing was
  • 00:13:44
    just shipping cool AI features three
  • 00:13:45
    years ago and like the hardest hump to
  • 00:13:47
    get people over is like let's step back
  • 00:13:49
    and like what does success actually look
  • 00:13:51
    like for you like what problem are you
  • 00:13:52
    solving like often the pm has rotated so
  • 00:13:55
    it's like somebody's inherited it and
  • 00:13:56
    then be like all right what does that
  • 00:13:58
    look like all right let's some
  • 00:13:59
    evaluations what we've learned is like
  • 00:14:01
    Claude is actually good at writing
  • 00:14:02
    evaluations and also grading them so
  • 00:14:04
    like we can automate a lot of this for
  • 00:14:05
    you but you have to tell us what success
  • 00:14:07
    looks like and then let's go and
  • 00:14:09
    actually iteratively improve our way
  • 00:14:11
    over there and like that is often like
  • 00:14:13
    the difference between like 60% of a
  • 00:14:15
    task and like 85% of task if you come
  • 00:14:17
    interview at anthropic which maybe you
  • 00:14:18
    should uh at some point maybe you're
  • 00:14:20
    happy in your role maybe not um uh you'll
  • 00:14:22
    see one of the things we do in our
  • 00:14:23
    interview process actually like make you
  • 00:14:25
    get uh a prompt from like crappy eval to
  • 00:14:28
    good and like just we want to see you
  • 00:14:30
    think but like not enough of that Talent
  • 00:14:31
    exists elsewhere so we're trying to get
  • 00:14:33
    that like if there's one thing we can
  • 00:14:35
    teach people that's probably the most
  • 00:14:36
    important thing writing evals I mean it's
  • 00:14:38
    I actually think it's going to become a
  • 00:14:40
    core skill for PMS we actually had this
  • 00:14:43
    and maybe this is like a little inside
  • 00:14:44
    baseball but I thought this was
  • 00:14:45
    interesting like internally we had our
  • 00:14:47
    research PMS who like work a lot on
  • 00:14:49
    model capabilities and model development
  • 00:14:51
    and then we had our like more like
  • 00:14:52
    product surface PMS or API PMs and we
  • 00:14:55
    ended up realizing that like the job of
  • 00:14:57
    a PM in 2024 2025 building AI powered
  • 00:15:00
    features looks is looking more and more
  • 00:15:02
    like the former than the latter in a lot
  • 00:15:04
    of cases like uh we launched uh like our
  • 00:15:07
    uh code analysis and like basically
  • 00:15:09
    Claude can analyze CSVs and write code
  • 00:15:10
    for you now and the PM there was like
  • 00:15:13
    getting it 80% of the way there and then
  • 00:15:14
    having to hand it over to the PM that
  • 00:15:15
    could write the evals then go to like
  • 00:15:17
    fine tune and like prompt I was like
  • 00:15:19
    that's actually the same role like the
  • 00:15:20
    quality of your feature is now gated on
  • 00:15:22
    how well you have done the evals and the
  • 00:15:24
    prompts and so like that PM like
  • 00:15:27
    definition is definitely just merged now
  • 00:15:29
    yeah absolutely I we we set up a boot
  • 00:15:31
    camp and like took every PM through uh
  • 00:15:35
    writing evals and like what it was like
  • 00:15:37
    difference between good and bad evals
  • 00:15:39
    and you know we're we're definitely not
  • 00:15:41
    done there we've got to keep iterating
  • 00:15:42
    and getting better on it but it is such
  • 00:15:44
    a critical part of of making a good
  • 00:15:46
    product with AI yeah as part of this
  • 00:15:49
    recruiting call for any of the people
  • 00:15:51
    who want to be good at building AI
  • 00:15:53
    product or research product in the
  • 00:15:55
    future um we can't come to your boot
  • 00:15:58
    camp Kevin so how do we develop some
  • 00:16:00
    intuition for getting good at this eval
  • 00:16:03
    and iteration loop I actually think it's
  • 00:16:06
    something you can you can use the models
  • 00:16:07
    themselves for like you were talking
  • 00:16:09
    about you can ask the models at this
  • 00:16:10
    point what makes a good eval give me you
  • 00:16:12
    know I want to do this can you write me
  • 00:16:13
    a sample eval and it will it will be
  • 00:16:16
    pretty good yeah I think that's like
  • 00:16:18
    that goes a long way I think there's also
  • 00:16:20
    this question of like it's and I if you
  • 00:16:23
    listen to like everybody from like
  • 00:16:25
    Andrej Karpathy to others who have like spent a
  • 00:16:27
    lot of time in the field like nothing
  • 00:16:28
    beats looking at at data and so like
  • 00:16:30
    people often get caught up um being like
  • 00:16:32
    well we already have these evaluations
  • 00:16:33
    and the new model is like 80% there
  • 00:16:35
    rather than 78% we can't or like
  • 00:16:37
    you know it's worse and I was like have
  • 00:16:38
    we looked at the cases where it fails
  • 00:16:40
    and you're like oh actually this was
  • 00:16:41
    better it's just our grader is not as
  • 00:16:43
    good you know or um it's funny like
  • 00:16:46
    again a little inside baseball you know
  • 00:16:47
    like every model release has the model
  • 00:16:49
    card and some of these model uh these
  • 00:16:51
    evals we've seen like even the golden
  • 00:16:53
    answer I'm like I'm not sure a human
  • 00:16:55
    would say it or like I think that math
  • 00:16:56
    is actually a little wrong like getting
  • 00:16:58
    100% is going to be really hard cuz even
  • 00:16:59
    just grading them is very challenging so
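Mike's point that the grader itself can be the thing that's wrong is easy to make concrete. A toy sketch (the cases and graders here are invented for illustration, not anyone's real eval harness): a strict exact-match grader fails a correct answer over formatting, while a normalizing grader accepts it.

```python
def exact_match(pred: str, gold: str) -> bool:
    # Strict grader: any formatting difference counts as a failure.
    return pred == gold

def normalized_match(pred: str, gold: str) -> bool:
    # Looser grader: compare only alphanumeric characters, case-insensitively.
    canon = lambda s: "".join(ch for ch in s.lower() if ch.isalnum())
    return canon(pred) == canon(gold)

gold = "56"
model_answer = "56."  # right answer, stray trailing period

print(exact_match(model_answer, gold))       # False: the eval says the model failed
print(normalized_match(model_answer, gold))  # True: the answer was fine, the grader was strict
```

Looking at the failing cases by hand, as Mike suggests, is what surfaces this kind of grader artifact before you conclude the new model got worse.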
  • 00:17:01
    like I'd encourage you to like the way
  • 00:17:03
    you build the intuition is go look at
  • 00:17:04
    the actual answers even to sample them
  • 00:17:07
    be like all right yeah maybe we should
  • 00:17:09
    evolve the evals or maybe like the vibes
  • 00:17:11
    are good even if the eval is like tied
  • 00:17:13
    so like getting real and getting like
  • 00:17:15
    deep on the data I think matters I also
  • 00:17:18
    think it'll be really interesting to see
  • 00:17:19
    how this evolves as we go towards longer
  • 00:17:21
    form more agentic tasks because it's one
  • 00:17:24
    thing when your evals are like I gave
  • 00:17:26
    you this math thing and you were able to
  • 00:17:28
    like add four-digit numbers and get to
  • 00:17:30
    the right answer you know it's easy to
  • 00:17:32
    know what what good looks like there as
  • 00:17:34
    the models start to do more uh long form
  • 00:17:38
    more ambiguous things go get me a hotel
  • 00:17:41
    in New York City you know what's what's
  • 00:17:44
    right there a lot of it will be about
  • 00:17:46
    personalization uh you know if you ask
  • 00:17:48
    any two humans who are perfectly
  • 00:17:50
    competent they're going to do two
  • 00:17:52
    different things so your grading becomes
  • 00:17:55
    much softer and it you know it'll just
  • 00:17:57
    be interesting we I think we'll have to
  • 00:17:58
    to evolve yet again like speaking of
  • 00:18:00
    having to reinvent stuff over and over
  • 00:18:02
    again I think a lot like when you think
  • 00:18:04
    about and I think both Labs have some
  • 00:18:06
    concept of like this is what
  • 00:18:08
    capabilities look like as things evolve
  • 00:18:09
    like it looks a little bit like a career
  • 00:18:11
    ladder like what bigger and longer
  • 00:18:12
    Horizon tasks are you taking and maybe
  • 00:18:14
    like eval start looking more like
  • 00:18:16
    performance review I'm in performance
  • 00:18:17
    review season so this is the metaphor
  • 00:18:18
    that's in my head sorry but it's like
  • 00:18:20
    you know like did the model like meet
  • 00:18:22
    your expectation of like what a
  • 00:18:23
    competent human have done did it exceed
  • 00:18:25
    it because it did it twice as fast or
  • 00:18:26
    like discovered some restaurant you
  • 00:18:28
    wouldn't have known it greatly exceed
  • 00:18:29
    meets most like it starts being like
  • 00:18:31
    more nuanced than just like right or
  • 00:18:33
    wrong let alone you have humans writing
  • 00:18:36
    these evals and the models are getting to
  • 00:18:38
    the point where they can often beat
  • 00:18:39
    humans at certain tasks like people
  • 00:18:41
    prefer the model's answers to a human's
  • 00:18:43
    answers and so if your humans writing
  • 00:18:44
    your evals like yeah you know so what
  • 00:18:47
    does that
  • 00:18:48
    mean uh what okay evals are clearly the
  • 00:18:52
    key um we're going to go spend a bunch
  • 00:18:54
    of time with these models teaching
  • 00:18:56
    ourselves to write evals what other
  • 00:18:58
    skills should product people be
  • 00:19:00
    learning now you you're both on that
  • 00:19:01
    learning path I think uh prototyping
  • 00:19:05
    with these models is a thing that is
  • 00:19:07
    underused like our best PMS do this
  • 00:19:09
    where we'll get into some long
  • 00:19:10
    conversation about like should the UI be
  • 00:19:12
    this or that and before our designers
  • 00:19:14
    have even like picked up their Figma
  • 00:19:16
    like our our often our PMS or sometimes
  • 00:19:19
    our Engineers will be like great I
  • 00:19:21
    prompted Claude I did like an AB
  • 00:19:22
    comparison of what these two UI could
  • 00:19:24
    look like let's try them and I'm like oh
  • 00:19:25
    this is really cool I'm like play that
  • 00:19:27
    out and like we'll be able to
  • 00:19:28
    prototype a far greater variety
  • 00:19:31
    and evaluate um like on a much faster
  • 00:19:34
    scale than before so like that skill of
  • 00:19:36
    like using these tools to actually be in
  • 00:19:39
    prototyping mode I think is a really
  • 00:19:41
    really useful one that's a good one I
  • 00:19:43
    would also I you you sort of said this
  • 00:19:46
    but I think it's also going to push PMS
  • 00:19:49
    to go deeper into the tech stack yeah um
  • 00:19:51
    because it's and maybe that changes over
  • 00:19:53
    the years like that if you were doing
  • 00:19:55
    like database Tech in I don't know 2005
  • 00:19:59
    maybe it required you to be able to go
  • 00:20:01
    really deep in a different way than it
  • 00:20:02
    would if you were doing database Tech
  • 00:20:04
    now like layers of abstraction get built
  • 00:20:06
    and you maybe don't need to know all the
  • 00:20:08
    fundamentals but it's not like every PM
  • 00:20:11
    needs to be a researcher by any means
  • 00:20:12
    but I think having an appreciation for
  • 00:20:14
    it spending time and learning the
  • 00:20:17
    language and gaining intuition for how
  • 00:20:19
    this stuff works a little bit I think
  • 00:20:21
    will go a long way I think the other
  • 00:20:22
    piece like you're dealing with this like
  • 00:20:24
    stochastic non-deterministic system
  • 00:20:26
    which like evals are our best attempt to
  • 00:20:28
    do it but like product design in a world
  • 00:20:30
    where like you're not in control of what
  • 00:20:33
    the model is going to say you can try
  • 00:20:35
    and so like what are the feedback
  • 00:20:36
    mechanisms that you need to close that
  • 00:20:38
    Loop like how do you decide when like
  • 00:20:40
    the model's gone astray how do you
  • 00:20:41
    collect that feedback in a in a rapid
  • 00:20:43
    way you know like what are the guard
  • 00:20:44
    rails you want to put in like how do you
  • 00:20:47
    even know what it's doing in aggregate
  • 00:20:48
    like it's a much more like you're
  • 00:20:51
    understanding like the output of this
  • 00:20:53
    intelligence across a lot of outputs
  • 00:20:55
    over a lot of people every single day it
  • 00:20:57
    just requires a very different skill set than
  • 00:20:59
    like oh the bug report is you clicked on
  • 00:21:01
    the button and didn't follow the user
  • 00:21:02
    it's like that's a pretty knowable kind
  • 00:21:04
    of problem right and and maybe this will
  • 00:21:05
    change you know 5 years from now when
  • 00:21:08
    people are used to it but I think we're
  • 00:21:09
    all still in the mode of adapting to
  • 00:21:11
    this sort of
  • 00:21:13
    non-deterministic user interface
  • 00:21:15
    ourselves uh and certainly people who
  • 00:21:17
    are not you know tech people here in
  • 00:21:19
    this room working on Tech products who
  • 00:21:21
    are using AI are are definitely not used
  • 00:21:24
    to it like it goes against all of the
  • 00:21:25
    intuition that we've built up for the
  • 00:21:27
    last like 25 years of using
  • 00:21:29
    computers uh and so like the idea that
  • 00:21:32
    you're going to put in the exact same
  • 00:21:33
    things normally if you put in the exact
  • 00:21:35
    same inputs computers give you the exact
  • 00:21:37
    same outputs and that is no longer true
  • 00:21:40
    uh and it it it's not just that we have
  • 00:21:43
    to adapt to it Building Products we have
  • 00:21:44
    to also put ourselves in the shoes of the
  • 00:21:47
    people who are using our products and
  • 00:21:49
    think about what this means for them and
  • 00:21:50
    there's like I mean there are downsides
  • 00:21:52
    to it there also really cool upsides and
  • 00:21:54
    so it's fun to kind of think about how
  • 00:21:56
    how you can use that to your advantage
  • 00:21:58
    in different ways I remember like we we
  • 00:22:00
    did like a lot of like rolling user
  • 00:22:02
    research at Instagram so we have like
  • 00:22:04
    the same like or researchers would bring
  • 00:22:06
    in different people every single week
  • 00:22:07
    whatever it was like prototype ready
  • 00:22:08
    would get put through it and we do the
  • 00:22:10
    same thing at anthropic but what's
  • 00:22:11
    interesting is like for those sessions
  • 00:22:13
    what would often surprise me is like how
  • 00:22:15
    users were using Instagram there's
  • 00:22:16
    something interesting about like their
  • 00:22:17
    use case or like their reaction to a new
  • 00:22:19
    feature and like now it's like half that
  • 00:22:21
    and half what the model did in that
  • 00:22:22
    situation you're like oh it did the
  • 00:22:24
    right thing this is great so there's
  • 00:22:25
    like this like very like almost like a
  • 00:22:28
    sense of Pride maybe of like when it
  • 00:22:29
    reacts well and you're like in a user
  • 00:22:31
    research environment and then like also
  • 00:22:33
    the like frustration you're like oh no
  • 00:22:35
    you misunderstood the intent and now
  • 00:22:36
    you're like 10 pages down into this
  • 00:22:38
    answer and so like it it's also like
  • 00:22:40
    maybe a little of getting Zen about like
  • 00:22:42
    letting go of control and you know
  • 00:22:44
    what's going to happen in those
  • 00:22:45
    environments yeah you have both worked
  • 00:22:48
    on these consumer experiences that
  • 00:22:49
    taught new behaviors to you know many
  • 00:22:52
    hundreds of millions of people uh
  • 00:22:55
    quickly uh these AI products are
  • 00:22:57
    happening actually faster than that
  • 00:22:59
    right and you know if if PMs and
  • 00:23:02
    Technical people don't have that much
  • 00:23:04
    intuition naturally for how to use them
  • 00:23:06
    how do you think about educating end
  • 00:23:08
    users at the scale you're both working
  • 00:23:10
    with on something that is so unintuitive
  • 00:23:13
    I mean it it is kind of amazing how fast
  • 00:23:16
    we all
  • 00:23:17
    adapt uh I was talking to somebody the
  • 00:23:20
    other day and they were telling me about
  • 00:23:21
    their first Waymo ride who's ridden in a
  • 00:23:24
    Waymo who rode one here yeah if you
  • 00:23:27
    haven't ridden in a Waymo you're in San
  • 00:23:29
    Francisco ride a Waymo to wherever you're
  • 00:23:31
    going when you leave here it's a magical
  • 00:23:34
    experience but they were like my first
  • 00:23:37
    30 seconds I was like oh my God watch
  • 00:23:40
    out for that
  • 00:23:40
    bicyclist right and then 5 minutes in it
  • 00:23:43
    was like oh my God I'm living in the
  • 00:23:46
    future and then 10 minutes later it was
  • 00:23:49
    like bored scrolling on your phone like
  • 00:23:52
    you know how quickly we become used to
  • 00:23:54
    something that is just absolute magic
  • 00:23:56
    yeah um and I think I mean ChatGPT is
  • 00:24:00
    less than 2 years old MH and it was
  • 00:24:03
    absolutely mind-blowing when it first
  • 00:24:05
    came out and now I
  • 00:24:07
    think if we had to go back and use the
  • 00:24:08
    original whatever it was GPT-3.5 I think
  • 00:24:12
    the horror yeah yeah like no everybody
  • 00:24:14
    be like ugh so dumb how could I possibly you
  • 00:24:18
    know and and you know the stuff that's
  • 00:24:20
    that's happening today that we're
  • 00:24:22
    working on that you guys are working on
  • 00:24:24
    it all feels like magic 12 months from
  • 00:24:26
    now we're going to be like can you can
  • 00:24:27
    you believe if we use that garbage
  • 00:24:30
    because it's going to I me that's how
  • 00:24:31
    fast this thing is moving but it's also
  • 00:24:33
    amazing to me how quickly people adapt
  • 00:24:35
    because I mean as much as we try and
  • 00:24:37
    bring people along like there are also
  • 00:24:39
    um there's just there's a lot of
  • 00:24:41
    excitement people understand that this
  • 00:24:43
    is like the world is moving in this
  • 00:24:45
    direction and um we've got to try and
  • 00:24:47
    make it the best possible move that we
  • 00:24:50
    can but uh it's it's happening and it's
  • 00:24:52
    happening fast one thing we're trying to
  • 00:24:53
    get better at and that's is also letting
  • 00:24:55
    the product be like educational in a
  • 00:24:57
    very literal way which is like a thing
  • 00:24:59
    we did not do early and now we're
  • 00:25:01
    changing is just tell Claude more about
  • 00:25:04
    itself which was like you know it's in
  • 00:25:05
    its training set that it's you know uh
  • 00:25:08
    artificial intelligence created by
  • 00:25:09
    anthropic whatever but now we're
  • 00:25:10
    literally like and here's how you use
  • 00:25:12
    this feature we ship because people would
  • 00:25:13
    ask and again this came from user
  • 00:25:14
    research because we'd be like they would
  • 00:25:16
    be like how do I use this thing and then
  • 00:25:18
    Claude would be like I don't know have
  • 00:25:19
    you tried like looking at it on the
  • 00:25:21
    internet you're like no that's
  • 00:25:22
    unhelpful and so like um uh like we're
  • 00:25:25
    really trying to ground it and then at
  • 00:25:26
    launch time we're like you know it's a
  • 00:25:28
    process we're improving but like it's
  • 00:25:29
    it's cool to now see like this is the
  • 00:25:31
    exact link to the documentation like
  • 00:25:32
    here's how you do it like I can help you
  • 00:25:34
    step by step oh you're stuck I can help
  • 00:25:36
    you here so uh these things are actually
  • 00:25:38
    very good at solving uh UI problems and
  • 00:25:41
    like user confusion and like we should
  • 00:25:43
    use them more for that yeah that's got
  • 00:25:45
    to be different when you are um you know
  • 00:25:47
    trying to do like change management in
  • 00:25:48
    an Enterprise though right because
  • 00:25:50
    there's a there's a status quo for how
  • 00:25:52
    are you doing things there's
  • 00:25:53
    organizational process like how do you
  • 00:25:55
    think about educating entire
  • 00:25:57
    organizations about productivity
  • 00:25:59
    improvements or whatever else can come I
  • 00:26:00
    think the Enterprise one is really
  • 00:26:01
    interesting because like even like these
  • 00:26:04
    products have like millions and millions
  • 00:26:05
    of users but like the power users are
  • 00:26:08
    very much I think still like early
  • 00:26:10
    adopters people who like technology and
  • 00:26:11
    then there's like a you know long tail
  • 00:26:13
    whereas when you go into Enterprise
  • 00:26:14
    you're deploying to like an organization
  • 00:26:16
    that is like often there's folks who are
  • 00:26:17
    like not very technical and like I
  • 00:26:19
    think that's really cool actually seeing
  • 00:26:22
    fairly non-technical users get exposed
  • 00:26:24
    to like a chat-powered LLM for the first
  • 00:26:27
    time and then getting to see it then you
  • 00:26:29
    have the luxury of like getting to run
  • 00:26:30
    like a session where you teach them
  • 00:26:32
    about it and like have educational
  • 00:26:33
    materials um and so I think we need to
  • 00:26:36
    learn from what happens in those and
  • 00:26:38
    then say like that's what we need to do
  • 00:26:39
    to teach the next 100 million people how
  • 00:26:41
    to use these UIs and they're
  • 00:26:43
    usually power users internally and
  • 00:26:45
    they're they're excited to teach the
  • 00:26:48
    rest of people and you know like with
  • 00:26:50
    OpenAI we have these custom GPTs that
  • 00:26:52
    you can make and organizations make
  • 00:26:54
    thousands of them often and it's a way
  • 00:26:56
    for the power users to make something
  • 00:26:58
    that makes AI easier and like
  • 00:27:01
    immediately valuable for the people that
  • 00:27:03
    might not know how to use it otherwise
  • 00:27:05
    um so like that's one cool thing you
  • 00:27:07
    find the pockets of power users and they
  • 00:27:09
    actually will sort of be
  • 00:27:11
    evangelists I I have to ask you then
  • 00:27:14
    because you you know your organizations
  • 00:27:16
    are both like all power users right so
  • 00:27:18
    you know you're living in your little
  • 00:27:19
    pocket of the future uh I'll ask about
  • 00:27:22
    one thing but feel free to redirect Mike
  • 00:27:24
    how am I supposed to use computer use
  • 00:27:26
    this is amazing like what are you guys
  • 00:27:27
    doing
  • 00:27:28
    yeah well internally like we're I mean
  • 00:27:31
    this to Kevin's earlier comment around
  • 00:27:32
    like when is it going to be ready all
  • 00:27:34
    right like go this like it was pretty
  • 00:27:36
    late breaking like we like had
  • 00:27:38
    conviction that it was like this is like
  • 00:27:40
    good and like we want to put this down
  • 00:27:41
    like it's early still and it's like
  • 00:27:43
    still going to make mistakes but like
  • 00:27:44
    how do we do this as well the funniest
  • 00:27:46
    use case like while we were beta testing
  • 00:27:47
    it was like somebody was like I wonder
  • 00:27:49
    if I can get it to order us a pizza and
  • 00:27:50
    like it did and they're like great
  • 00:27:51
    there's like the moment where
  • 00:27:53
    Domino's shows up at your office and it
  • 00:27:55
    was ordered entirely by AI is like a
  • 00:27:57
    was a very cool like seminal moment
  • 00:27:58
    and then we're like oh but it's Domino's
  • 00:28:00
    but like you know like but like it was
  • 00:28:02
    definitely like amazing yeah uh but it
  • 00:28:05
    was AI you know so it was all it was it
  • 00:28:06
    was it was good it also like ordered
  • 00:28:08
    quite a bit of pizza so it was like
  • 00:28:09
    maybe hungrier than intended uh some
  • 00:28:11
    early things that we're seeing that we
  • 00:28:12
    think are really interesting one is UI
  • 00:28:14
    testing which is like I was like at
  • 00:28:16
    Instagram we had basically no UI tests
  • 00:28:18
    because they're hard to write they're
  • 00:28:19
    like they're brittle um and they're like
  • 00:28:21
    often like a little bit like oh like we
  • 00:28:23
    moved this button around and like it
  • 00:28:24
    should still pass that was the point of
  • 00:28:26
    the PR but like now it's going to fail
  • 00:28:27
    we're going to have to like do this
  • 00:28:28
    whole other snapshot um and like early
  • 00:28:30
    signs are like computer use just works
  • 00:28:31
    really well for like hey does it work as
  • 00:28:33
    intended does it do the thing that you
  • 00:28:34
    want it to do and I think that's like
  • 00:28:36
    been been very very interesting and then
  • 00:28:38
    what we're starting to get into too is
  • 00:28:39
    like what are the agentic things that
  • 00:28:40
    just like involve a lot of like data
  • 00:28:43
    manipulation so we're looking at it with
  • 00:28:44
    our support teams and our finance teams
  • 00:28:46
    around like those PR forms are going to
  • 00:28:48
    fill themselves but like it's very
  • 00:28:50
    repetitive you sort of have data in one silo
  • 00:28:52
    you want to put it in a different silo
  • 00:28:53
    and it just requires like human time
  • 00:28:55
    like I keep using the word drudgery when
  • 00:28:57
    I talk about computer use like can we
  • 00:28:58
    automate the drudgery so you can focus
  • 00:29:00
    on the creative stuff and not like the
  • 00:29:02
    you know 30 clicks to do one single
  • 00:29:06
    thing Uh Kevin I I think we have a lot
  • 00:29:09
    of teams that are um experimenting with
  • 00:29:12
    o1 you can obviously do much more
  • 00:29:13
    sophisticated things you also can't use
  • 00:29:15
    it as a one-for-one replacement if you're
  • 00:29:17
    already using right one of the you know
  • 00:29:19
    GPT-4o models or whatever in uh in your
  • 00:29:22
    application like can you give us some
  • 00:29:24
    guidance what are you guys doing with it
  • 00:29:26
    internally so I think one thing
  • 00:29:29
    that people maybe don't realize that
  • 00:29:31
    actually a lot of the most sophisticated
  • 00:29:33
    customers of ours are doing and that
  • 00:29:35
    we're certainly doing internally is it's
  • 00:29:36
    not really about one model for any
  • 00:29:39
    particular thing you end up putting
  • 00:29:41
    together sort of workflows and
  • 00:29:43
    orchestration between models and so you
  • 00:29:45
    use them for what they're good at
  • 00:29:47
    o1's really good at reasoning but it
  • 00:29:48
    also takes a little bit of time to think
  • 00:29:50
    and it's not multimodal and you know has
  • 00:29:52
    other limitations you Define reasoning
  • 00:29:54
    for the group I realize it's a basic
  • 00:29:55
    question but yeah so uh we people are I
  • 00:29:59
    think pretty used to the concept the
  • 00:30:01
    like scaling pre-training concept you go
  • 00:30:04
    GPT-2, 3, 4, 5, whatever and you're
  • 00:30:07
    doing bigger and bigger runs on
  • 00:30:09
    pre-training these models are getting
  • 00:30:10
    you know smarter and smarter um like
  • 00:30:13
    they or rather maybe they know more and
  • 00:30:15
    more but they're kind of like system one
  • 00:30:18
    thinking right it's it's you ask it a
  • 00:30:20
    question you immediately get an answer
  • 00:30:22
    it's like text completion yeah sort of
  • 00:30:24
    like me asking you questions
  • 00:30:26
    right now and you just have to stream
  • 00:30:28
    one token at a time keep going don't think
  • 00:30:30
    it's amazing actually how much human
  • 00:30:33
    like your intuition about how other
  • 00:30:34
    humans work will often like help you in
  • 00:30:38
    intuiting about how these models work um
  • 00:30:40
    you know you asked me a question I got
  • 00:30:42
    off onto the wrong like sentence it's
  • 00:30:44
    hard to recover the models totally do
  • 00:30:46
    the same thing um but uh so you've got
  • 00:30:50
    that that sort of larger and larger
  • 00:30:52
    pre-training o1 is actually a different
  • 00:30:56
    way of scaling
  • 00:30:58
    intelligence by doing it at uh at query
  • 00:31:01
    time basically so instead of system one
  • 00:31:04
    thinking I ask you a question and
  • 00:31:05
    immediately tries to give you an answer
  • 00:31:07
    it'll pause same thing I would you know
  • 00:31:09
    you would do if I asked you a question I
  • 00:31:10
    said solve this Sudoku do this New York
  • 00:31:13
    Times connections puzzle you you would
  • 00:31:15
    start going okay these words how do they
  • 00:31:17
    group together okay these might be these
  • 00:31:19
    four well no I'm not sure could be you
  • 00:31:22
    know you're you're like forming
  • 00:31:23
    hypotheses using what you know to refute
  • 00:31:26
    these hypothesis or affirm them and then
  • 00:31:29
    from that continuing to reason on it's
  • 00:31:32
    how scientific breakthroughs
  • 00:31:34
    are made it's how we answer hard
  • 00:31:36
    questions um and so this is about
  • 00:31:38
    teaching the models to do it and right
  • 00:31:40
    now you know they'll think for 30 or 60
  • 00:31:43
    seconds before they answer imagine what
  • 00:31:45
    happens if they can think for five hours
  • 00:31:47
    or five days um so it's basically a new
  • 00:31:50
    way to scale intelligence and we feel
  • 00:31:53
    like we're just at the very beginning
  • 00:31:55
    you know we're at the like GPT-1 phase of
  • 00:31:58
    um of this new form of reasoning um but
  • 00:32:01
    in the same way it's not you don't use
  • 00:32:03
    it for everything right there are
  • 00:32:04
    sometimes when you ask me a question you
  • 00:32:05
    don't want me to wait 60 seconds you I
  • 00:32:07
    should just give you an answer um so we
  • 00:32:10
    end up using our models in a bunch of
  • 00:32:13
    different ways together so for example
  • 00:32:15
    like cyber security you would think not
  • 00:32:18
    really a use case for models they can
  • 00:32:20
    hallucinate that seems like a bad place
  • 00:32:21
    to hallucinate but you can like
  • 00:32:25
    fine-tune a model to be good at certain
  • 00:32:27
    tasks and then you can fine-tune models
  • 00:32:30
    to be very precise about the kinds of
  • 00:32:32
    inputs and outputs that they expect and
  • 00:32:34
    have these models start working in
  • 00:32:36
    concert together and you know models
  • 00:32:39
    that are checking the outputs of other
  • 00:32:40
    models realizing when something doesn't
  • 00:32:42
    make sense asking it to try again um and
  • 00:32:47
    uh so like that ends up being how we get
  • 00:32:50
    a ton of value out of our own models
  • 00:32:52
    internally it's like specific use cases
  • 00:32:56
    uh and or orchestrations of models
  • 00:32:59
    together designed sort of working in
  • 00:33:00
    concert to do specific tasks which again
  • 00:33:03
    going back to like reasoning about how
  • 00:33:04
    we work as humans how do we do complex
  • 00:33:07
    things as humans you have different
  • 00:33:08
    people who often have different skill
  • 00:33:10
    sets and they work together to
  • 00:33:11
    accomplish a hard
  • 00:33:13
    task I can't let you guys get away
  • 00:33:16
    without without telling us something
  • 00:33:18
    about the future and what's coming and
  • 00:33:20
    so um you don't have to give us release
  • 00:33:23
    dates I understand you don't know but uh
  • 00:33:26
    if you if you look out I I think the
  • 00:33:28
    furthest anyone can look out in AI right
  • 00:33:29
    now is like well tell me if you can see
  • 00:33:31
    the future but like let's say like 6
  • 00:33:33
    months 12 months like what's an
  • 00:33:35
    experience that you imagine is going to
  • 00:33:37
    be possible or prevalent I think a lot
  • 00:33:40
    about um well I
  • 00:33:43
    think a lot about this all the time but
  • 00:33:45
    like the um two maybe two words to be
  • 00:33:47
    like plant seeds in in everybody's mind
  • 00:33:50
    like one is proactivity like how do the
  • 00:33:51
    models become more proactive like once
  • 00:33:53
    they know about you and they're
  • 00:33:54
    monitoring like they're reading your
  • 00:33:56
    email in a good not creepy way and
  • 00:33:58
    they're like uh because you authorized
  • 00:33:59
    them to and then they like you know spot
  • 00:34:02
    an interesting Trend or you start your
  • 00:34:03
    day with something that's a like um like
  • 00:34:05
    a proactive like uh recap of what's
  • 00:34:08
    going on some conversations you're going
  • 00:34:09
    to have I prepped some research for you
  • 00:34:11
    hey your next meeting is coming up like
  • 00:34:13
    here's what you might want to talk about
  • 00:34:14
    I saw you have this like presentation
  • 00:34:16
    coming up here's the first draft that I
  • 00:34:17
    put together like that kind of
  • 00:34:18
    proactivity I think is going to be
  • 00:34:20
    really really powerful and then the
  • 00:34:21
    other part is being more asynchronous so
  • 00:34:23
    like uh I think o1 is like early UI in
  • 00:34:27
    this exploration which is like it's
  • 00:34:29
    going to do a lot and it's going to tell
  • 00:34:30
    you kind of what it's going to do along
  • 00:34:31
    the way and like you can sit there and
  • 00:34:33
    wait for it but you could also like be
  • 00:34:34
    like it's going to think for a while I'm
  • 00:34:35
    going to go like do something else maybe
  • 00:34:37
    tab back maybe it like can tell me when
  • 00:34:39
    it's done like yeah expanding the time
  • 00:34:41
    Horizon both in terms of like you didn't
  • 00:34:43
    ask a question it just told you
  • 00:34:44
    something I think that's going to be
  • 00:34:45
    interesting and then you did ask a
  • 00:34:47
    question and you're going to be like
  • 00:34:48
    great like I'm going to go reason about
  • 00:34:50
    it I'm going to go research it I might
  • 00:34:52
    have to ask another human about it like
  • 00:34:53
    and then I'm going to like maybe come up
  • 00:34:55
    with my first answer I'm going to vet
  • 00:34:56
    that answer you'll hear back from me in
  • 00:34:58
    like an hour like Breaking Free of those
  • 00:35:00
    like uh constraints of like expecting an
  • 00:35:02
    answer immediately I think will let you
  • 00:35:04
    do things like hey I have this like
  • 00:35:06
    whole like mini project plan like go
  • 00:35:08
    flesh it out or like not just like I
  • 00:35:10
    want you to like change this one thing
  • 00:35:11
    on the screen but like fix this bug for
  • 00:35:13
    me like take my PRD and like adapt it
  • 00:35:16
    for these new market conditions like
  • 00:35:18
    adapt it for these three different
  • 00:35:19
    market conditions that emerged like
  • 00:35:20
    being able to push those Dimensions I
  • 00:35:22
    think is what I'm personally most
  • 00:35:23
    excited about on the product side yeah I
  • 00:35:26
    completely agree with all of that that
  • 00:35:28
    um and it's the models are going to get
  • 00:35:31
    smarter at an accelerating rate I think
  • 00:35:33
    which is also part of how all of that uh
  • 00:35:35
    comes to pass another thing that will be
  • 00:35:38
    really exciting is seeing the models
  • 00:35:40
    able to interact in all the same ways
  • 00:35:42
    that we as humans interact you know
  • 00:35:44
    right now you mostly type to these
  • 00:35:46
    things and you know I mostly type to a
  • 00:35:48
    lot of my friends on WhatsApp and other
  • 00:35:49
    things but I also speak I also can see
  • 00:35:54
    and uh we just we launched this advanced
  • 00:35:57
    voice mode Rel relatively recently I was
  • 00:35:59
    in uh I was in Korea and
  • 00:36:02
    Japan having
  • 00:36:04
    conversations and I would just I would
  • 00:36:06
    often be with somebody with whom I had
  • 00:36:09
    no common language whatsoever before
  • 00:36:11
    this we could not have said a word to
  • 00:36:13
    each other and instead I was like Hey
  • 00:36:16
    chat gbt I want you to act as a
  • 00:36:17
    translator when I say something in
  • 00:36:19
    English I want you to say it in Korean
  • 00:36:21
    and when you hear something in Korean
  • 00:36:23
    say it back to me in English and all of
  • 00:36:24
    a sudden I had this Universal translator
  • 00:36:26
    and I was having business conversations
  • 00:36:29
    with another person uh and it was
  • 00:36:32
    magical and you think what that can do
  • 00:36:35
    like not just in a business context but
  • 00:36:36
    think about people's willingness to
  • 00:36:38
    travel to new places if you don't ever
  • 00:36:39
    have to be worried about not speaking
  • 00:36:41
    the language and you've got this like
  • 00:36:42
    Star Trek Universal translator in your
  • 00:36:44
    pocket you know and so experiences like
  • 00:36:47
    that I think it's going to become
  • 00:36:49
    commonplace fast but it's magical and
  • 00:36:51
    I'm excited about that in combination
  • 00:36:54
    with all the stuff Mike was just
  • 00:36:56
    saying oh one of my favorite pastimes
  • 00:37:00
    now just you know since uh voice mode
  • 00:37:03
    release is actually watching there's a
  • 00:37:05
    genre of Tik Tok of well this just
  • 00:37:07
    speaks to how old I am like there's a
  • 00:37:09
    genre of Tik Tok where you just like uh
  • 00:37:11
    it's just young people talking to voice
  • 00:37:13
    mode like pouring their heart out using
  • 00:37:15
    it all these ways where I'm like oh my
  • 00:37:17
    God like there's this old term being
  • 00:37:19
    like digitally native or mobile native
  • 00:37:21
    and I'm like I like pretty strongly
  • 00:37:24
    believe in this AI thing and I would not
  • 00:37:26
    think to interact in this way but people
  • 00:37:29
    who are 14 years old are like well I
  • 00:37:31
    expect the AI to be able to do that and
  • 00:37:33
    I love that have you ever given it to
  • 00:37:35
    your kids uh I haven't yet my kids are
  • 00:37:37
    like five and seven Kevin knows them so
  • 00:37:39
    we but we'll get there I mean mine are
  • 00:37:41
    eight and 10 but like on a car ride
  • 00:37:43
    they'll be like can I talk to chat GPT
  • 00:37:45
    yes and they will ask it the most
  • 00:37:47
    bizarre things they will just have
  • 00:37:49
    weirdo conversations with it but they're
  • 00:37:52
    perfectly happy talking to an AI yeah
  • 00:37:54
    actually one of my favorite experiences
  • 00:37:56
    and maybe we'll close and ask you for
  • 00:37:57
    like the most surprising Behavior kids
  • 00:37:59
    or not is uh um like when my parents
  • 00:38:04
    read to me like I got L I was lucky if I
  • 00:38:07
    got to choose the book and it wasn't my
  • 00:38:08
    dad being like we're going to read this
  • 00:38:10
    physics study I'm interested in right my
  • 00:38:13
    kids I don't know if it's just like
  • 00:38:14
    parenting in the Bay Area but my kids
  • 00:38:16
    are like okay Mom make the images right
  • 00:38:19
    I want to tell a story about the dragon
  • 00:38:22
    unicorn in this setting I'm going to
  • 00:38:23
    tell you exactly how it's going to
  • 00:38:25
    happen create it in real time and I'm
  • 00:38:27
    like like that's a big ask I'm glad you
  • 00:38:30
    believe and like know that's possible
  • 00:38:32
    but it's it's a wild way to like create
  • 00:38:34
    your own entertainment too what is the
  • 00:38:36
    um most surprising Behavior you've seen
  • 00:38:38
    in your own products
  • 00:38:41
    recently I think it's a behavior and a
  • 00:38:45
    relationship like people really start
  • 00:38:49
    understanding the Nuance of like what
  • 00:38:51
    Claud is we just have like a a new
  • 00:38:53
    revenge of the model and it's like they
  • 00:38:55
    get the Nuance like it's like I guess
  • 00:38:57
    the behavor behavior is like almost
  • 00:38:58
    befriending or like really like
  • 00:39:00
    developing a lot of like 2-way empathy
  • 00:39:02
    around what's happening and then like
  • 00:39:03
    the is like oh you know the new model
  • 00:39:05
    like felt like it was smarter but maybe
  • 00:39:07
    a little more distant but maybe you know
  • 00:39:09
    and it's like it's like that kind of
  • 00:39:10
    like Nuance which like you like I it's
  • 00:39:13
    it's given me as a product person a lot
  • 00:39:15
    more empathy around like you're not just
  • 00:39:16
    shipping a product you're shipping like
  • 00:39:19
    intelligence and intelligence and
  • 00:39:21
    empathy are like what makes like
  • 00:39:23
    interpersonal relationships important
  • 00:39:24
    and if somebody show up and they're like
  • 00:39:25
    I was upgraded like I say know I scored
  • 00:39:28
    2% higher on this math score but like
  • 00:39:30
    I'm Different in this way you'd be like
  • 00:39:31
    oh I got to adapt now and maybe you know
  • 00:39:33
    be a little worried about it so like
  • 00:39:35
    that that's been an interesting Journey
  • 00:39:37
    for me like understanding the mentality
  • 00:39:39
    for people that when they're using our
  • 00:39:40
    products yeah Model Behavior is
  • 00:39:43
    absolutely a product role like the the
  • 00:39:46
    personality of the model is is key and
  • 00:39:49
    there are interesting questions around
  • 00:39:50
    how much should it customize uh versus
  • 00:39:52
    how much should you know open AI have
  • 00:39:54
    one personality and Claude has some
  • 00:39:56
    distinct personality
  • 00:39:58
    and are people going to use one versus
  • 00:39:59
    the other because they happen to like it
  • 00:40:01
    I mean that's that's a very human thing
  • 00:40:03
    right we're friends with different
  • 00:40:04
    people because we happen to like
  • 00:40:05
    different people better than others and
  • 00:40:06
    it's um that's an interesting thing to
  • 00:40:09
    to think about we did something recently
  • 00:40:13
    um and it sort of went viral on Twitter
  • 00:40:16
    people started asking the model based on
  • 00:40:19
    everything you know about me based on
  • 00:40:20
    all of our past interactions you know
  • 00:40:22
    what what would you say about me and the
  • 00:40:25
    model will will respond and it will like
  • 00:40:27
    give you give it a description of what
  • 00:40:29
    it you know kind of thinks based on all
  • 00:40:31
    of your past
  • 00:40:32
    interactions and it is this sort of
  • 00:40:35
    you're you're starting to interact with
  • 00:40:36
    it almost like some sort of person or
  • 00:40:39
    entity in interesting ways and um
  • 00:40:42
    anyways it was fascinating to see
  • 00:40:43
    people's reaction to
  • 00:40:46
    that Kevin Mike thank you so much for
  • 00:40:48
    doing this and giving us a glimpse into
  • 00:40:50
    the future thank you so much
Tags
  • AI
  • Product Development
  • User Feedback
  • Proactivity
  • AI Capabilities
  • Product Managers
  • User Experience
  • Communication
  • Machine Learning
  • Future Technology