A conversation with OpenAI's CPO Kevin Weil, Anthropic's CPO Mike Krieger, and Sarah Guo
Summary
TLDR: In this panel discussion, Kevin and Mike, leaders in AI product development, share how quickly the AI landscape is changing and what that means for product management. They describe the excitement of building on rapidly evolving AI capabilities, the difficulty of developing AI products that meet user needs, and the central role of user feedback in shaping product experiences. Key themes include evaluation (eval) writing as a core new skill for product managers, the opportunity to make AI interactions more proactive, and the emotional bonds users are starting to form with AI. The discussion also covers real-world applications of AI, how quickly users adapt, and forecasts for AI integration in everyday tasks.
Takeaways
- 👑 Kevin and Mike are excited about their new roles in AI product development.
- 💡 AI capabilities are evolving rapidly, changing product experiences.
- 🤝 User feedback is crucial for refining and enhancing AI products.
- 🚀 Future interactions with AI may become more proactive and personalized.
- 🛠️ Product managers need to develop new skills for working with AI technologies.
- 📈 Observing how users adapt can inform better design strategies.
- 🍕 A memorable anecdote involved an AI ordering pizza during internal testing.
- 🧠 Users form emotional connections with AI, seeing them as entities with personality.
- ⚙️ Effective evaluation methods can significantly enhance AI product quality.
- 🌍 Future AI may facilitate international communication through real-time translation.
Timeline
- 00:00:00 - 00:05:00
The discussion opened with Sarah expressing excitement to be with Kevin and Mike, both known for their expertise in AI and previous roles at Instagram. She proposed discussing new product ideas but settled for a casual exchange of insights.
- 00:05:00 - 00:10:00
Kevin shared that he finds his new role in AI product management both challenging and fascinating, as it involves constantly adapting to new technological capabilities. He described the experience as sleepless but rewarding and highlighted the rapid evolution of AI technology.
- 00:10:00 - 00:15:00
Mike reflected on the various reactions he received from peers upon joining the AI team. While some were supportive, others questioned his choice given his previous semi-retirement. He emphasized that his passion for innovation drove him back into the tech world.
- 00:15:00 - 00:20:00
Both Kevin and Mike discussed their transition to enterprise roles, noting the differences from their previous consumer-focused experiences. They expressed excitement about engagement with enterprise clients and receiving direct feedback on how products are used, which differs from consumer feedback.
- 00:20:00 - 00:25:00
Kevin pointed out the unique challenges in enterprise product management, such as aligning with buyer goals and predicting market needs. He emphasized that understanding the audience and specific use cases is critical for successful AI product development.
- 00:25:00 - 00:30:00
The conversation turned toward the unpredictable nature of AI products, with Mike noting that product managers must remain flexible as they await the outcomes of model training and research insights. They both acknowledged the challenge of navigating evolving AI capabilities.
- 00:30:00 - 00:35:00
Mike shared insights on the importance of evaluation skills for product managers, stressing that a deeper understanding of AI models and effective eval writing will be essential for successful project outcomes moving forward.
- 00:35:00 - 00:40:58
The discussion concluded with a glimpse into the future of AI products, focusing on more personalized and proactive interactions, improving user experience through intuitive features, and how innovations will rapidly reshape both consumer and enterprise landscapes.
Video Q&A
What is one major topic discussed in the panel?
The challenges and excitement of working in AI product development.
How do Kevin and Mike view the evolution of AI capabilities?
They see it as rapidly advancing, with AI models becoming more sophisticated and useful.
What is one way AI is expected to change user interactions?
Through more proactive and personalized communication with users.
Why is user feedback important in AI product development?
It helps improve the AI's understanding and capability to meet user needs.
What kind of skills should product managers develop according to the discussion?
Skills in writing evaluations and prototyping with AI models.
How do Kevin and Mike suggest educating users about AI?
By leveraging power users in organizations to teach and share their experiences.
What interesting anecdote is mentioned regarding internal AI usage?
An AI model successfully ordered pizza for the office during beta testing.
What do the speakers say about user adaptation to AI?
They express amazement at how quickly users adapt to new technologies like AI.
What key aspect of future AI interactions is discussed?
The idea of AI being proactive and able to handle complex tasks asynchronously.
What reflections do Mike and Kevin have on the emotional relationship users form with AI?
They discuss users developing empathy towards the AI and how they respond to its personality.
Transcript
- 00:00:18all right hello
- 00:00:20everyone okay okay Sarah you're the
- 00:00:23queen of AI investing this is a phrase
- 00:00:25never ever to be used again but it's
- 00:00:27great to be here with both of you um so
- 00:00:30I had two different ideas for our final
- 00:00:33discussion the first was a product off
- 00:00:37because these two men have the merge to
- 00:00:40prod button both of them and I was
- 00:00:42like oh please just release everything
- 00:00:43we know is coming over the next six or
- 00:00:4512 months ignore all internal guidelines
- 00:00:48um the second was we just redesigned
- 00:00:50Instagram together since they've both
- 00:00:52actually run Instagram
- 00:00:54before uh both of these got fully shot
- 00:00:56down and so instead I think we'll just
- 00:00:59like trade notes among friends like lame
- 00:01:01I know but uh but really excited to hear
- 00:01:03from you both anyway uh so this is
- 00:01:06actually a relatively new role for both
- 00:01:09of you Kevin let's start with you like
- 00:01:12you've done a bunch of really different
- 00:01:13interesting things like what was the
- 00:01:14reaction you got when you took the job
- 00:01:16from friends and the
- 00:01:18team uh generally excitement I mean uh
- 00:01:21it's I think it's one of the most
- 00:01:23interesting and impactful roles there's
- 00:01:24so much to figure out um I've never had
- 00:01:29such a challenging
- 00:01:32interesting uh Sleepless product role in
- 00:01:35my life uh it's it's got all the
- 00:01:38challenges of a normal product role
- 00:01:40where you're trying to figure out who
- 00:01:41you're building for and what problems
- 00:01:43you can solve and things like that but
- 00:01:46normally when you're Building Product
- 00:01:47You're Building off of kind of a fixed
- 00:01:49technology base right you know what you
- 00:01:50have to work with and uh and you're
- 00:01:53trying to build the best product you can
- 00:01:56here it's like every two months
- 00:01:58computers can do something computers
- 00:01:59have never been able to do before in the
- 00:02:01history of the world and you're trying
- 00:02:02to figure out how that changes your
- 00:02:05product and the answer should probably
- 00:02:07be a fair amount um and so it just it's
- 00:02:11so
- 00:02:12interesting uh and fascinating to see on
- 00:02:15the inside as AI gets developed but uh
- 00:02:17I've been having a
- 00:02:18blast Mike what about you I I remember
- 00:02:21hearing the news I was like oh I didn't
- 00:02:23know you could convince the founder of
- 00:02:24Instagram to go work on something that
- 00:02:26existed already yeah my favorite three
- 00:02:28reactions like people who know me were
- 00:02:29like that makes sense like you're going
- 00:02:31to like have fun there uh the middle
- 00:02:33people were like why like you don't have
- 00:02:35to work like why are you doing this and
- 00:02:37like then if you knew me you know me
- 00:02:38like I can't not and I think that like I
- 00:02:40couldn't stop myself and the third was
- 00:02:42like oh you could hire the founder of
- 00:02:44Instagram which was also fun and it's
- 00:02:45like I mean not many people could but
- 00:02:47like there's like probably a list of
- 00:02:49three companies that would have been
- 00:02:50interesting um and so yeah there's like
- 00:02:52a range of reactions depending on how
- 00:02:54well you knew me and how like you've
- 00:02:56seen me in my like uh semi-retired state
- 00:02:59which lasted like six weeks and I was
- 00:03:00like all right what are we doing
- 00:03:02next um so we had dinner together with a
- 00:03:05bunch of friends recently and you I was
- 00:03:08like impressed by the childish Delight
- 00:03:11that you had around like yeah I'm
- 00:03:12learning about all this Enterprise stuff
- 00:03:15like tell me if it's about serving
- 00:03:18customers that are not all of us with
- 00:03:19Instagram or uh just you know working in
- 00:03:22a a an organization that's research
- 00:03:24driven like what's the biggest surprise
- 00:03:26so far I those are two I think both very
- 00:03:29worthwhile pieces of this role that are
- 00:03:31like very new to me as well like I um
- 00:03:33when I was 18 I made this like very
- 00:03:3418-year-old vow which was like every
- 00:03:36year of my life I wanted to be different
- 00:03:37like I don't have the same year twice
- 00:03:39and I was like it's why like you know I
- 00:03:41didn't there's been times where I was
- 00:03:42like oh like another social product I'm
- 00:03:44like doing that again first of all like
- 00:03:46your bar is like really distorted and
- 00:03:48second of all like just would feel like
- 00:03:49too much of the same thing so yeah
- 00:03:51Enterprise has been wild I'm really
- 00:03:53curious about your experience with that
- 00:03:54as well like you're like you know uh
- 00:03:57your feedback loop I actually imagine
- 00:03:59it's a lot more like investing is far
- 00:04:00longer right you're like you have that
- 00:04:02initial convo and you're like I think
- 00:04:03they like me and then you're like oh no
- 00:04:05it's now in like some requisition State
- 00:04:07and it's going to take like six months
- 00:04:09before they like even get to deployment
- 00:04:11before you know whether it's right and
- 00:04:12so like getting used to that pace we're
- 00:04:14like what why hasn't this shipped yet
- 00:04:16they're like Mike you've been here two
- 00:04:17months like this is like it's making its
- 00:04:19way through the VPs like it's going to
- 00:04:21get there eventually so like getting
- 00:04:22used to different timelines for sure but
- 00:04:24like the part that is fun is actually
- 00:04:26getting the feedback in the kind of
- 00:04:28Engagement where you're like once it
- 00:04:29gets the deployed you have somebody that
- 00:04:31can call you and you can call them and
- 00:04:32be like how's it working for you like is
- 00:04:34this good like whereas users like you're
- 00:04:36doing like data science and Aggregate
- 00:04:37and like sure you can bring in like one
- 00:04:39or two people but it's like they won't
- 00:04:41they don't have enough financial
- 00:04:42incentive writing on telling you where
- 00:04:44you suck and where you're doing well and
- 00:04:45like that's been a different but also
- 00:04:47like rewarding side of that for sure
- 00:04:50Kevin you've worked on such a wide range
- 00:04:52of products before how much do your
- 00:04:54instincts like apply yeah I was going to
- 00:04:57add on to the Enterprise Point too um
- 00:04:58and then I'll get to like the other
- 00:05:01interesting thing about Enterprise is
- 00:05:02it's not necessarily about the product
- 00:05:04right there's a buyer and they have
- 00:05:07goals and you could build the best
- 00:05:09product in the world that all the people
- 00:05:10at the company might be happy to use and
- 00:05:12it still doesn't necessarily matter
- 00:05:14exactly um I was in a a meeting with one
- 00:05:18of our big Enterprise customers and they
- 00:05:20were like um this is great we're really
- 00:05:23happy da D you know the one thing we
- 00:05:25need is we really need you to tell us 60
- 00:05:29days before you launch
- 00:05:31anything and I was like I also would
- 00:05:34like to know 60
- 00:05:37days so very very different actually and
- 00:05:40it's interesting right because at open
- 00:05:41AI we have a consumer product and we
- 00:05:44have an Enterprise product and we have a
- 00:05:46developer product so we're kind of doing
- 00:05:48all at once um Instinct
- 00:05:53wise in I'd say in like half the job it
- 00:05:56works you know when you have when you
- 00:05:59have a sense of the product you're
- 00:06:00trying to build you know we're getting
- 00:06:03towards uh you know the end of shipping
- 00:06:06uh Advanced speech mode or something or
- 00:06:08you're getting towards shipping canvas
- 00:06:10and you're making final touches trying
- 00:06:12to understand who you're building for
- 00:06:14and like what you know exactly what
- 00:06:15problems you're trying to solve it it
- 00:06:18works then because it's that's a little
- 00:06:19bit more like the tail end of it is
- 00:06:21shipping a normal product but the
- 00:06:23beginning of these things is nothing
- 00:06:25like that
- 00:06:27um so uh
- 00:06:31like there will just be these
- 00:06:33capabilities that we don't
- 00:06:35know you you have some sense as you're
- 00:06:38training some new model that it might
- 00:06:41have capability X you don't really know
- 00:06:44nor does the research team nor does
- 00:06:46anybody right you're like I think this
- 00:06:48might be possible and it's kind of like
- 00:06:50peering through the mist but it's
- 00:06:52this emergent property of a model and
- 00:06:55you know so you don't know whether it's
- 00:06:56going to really work and you don't know
- 00:06:58whether it's going to be like 60% good
- 00:07:01or 90% good or 99% good and the product
- 00:07:06that you would build that would make
- 00:07:08sense with something that works 60% of
- 00:07:10the time is Super different than 90 or
- 00:07:1299% of the time right so you're kind of
- 00:07:14just waiting and you're you know at
- 00:07:16least like I don't know if you feel this
- 00:07:18checking in with the research team from
- 00:07:20time to time like hey guys how's it
- 00:07:22going how's that model training uh any
- 00:07:25any insight on this and they're like
- 00:07:27it's research we're working on it you
- 00:07:29know it's uh we don't know either we're
- 00:07:31we're we're we're working through this
- 00:07:32at the same time and it's I mean it
- 00:07:35makes it super fun because you're kind
- 00:07:36of like discovering things together but
- 00:07:39very sort of stochastic too it's the
- 00:07:41thing it most reminds me of like from
- 00:07:43the Instagram days where like apple like
- 00:07:45wwc announcements you're like this could
- 00:07:47either be awesome for us or could like
- 00:07:49absolutely like cause chaos for it it's
- 00:07:51like that but your own company is the
- 00:07:53one kind of disrupting you from within
- 00:07:55which is like a very like it's very cool
- 00:07:57but also like oh this might totally upend
- 00:07:59my product road map now
- 00:08:02yeah uh what does that cycle look like
- 00:08:05for both of you you described it as um
- 00:08:08you know like peering through the Mist
- 00:08:09trying to look at the next set of
- 00:08:11capabilities I mean can you can you plan
- 00:08:14if you don't know exactly what is coming
- 00:08:16and what is the iteration cycle to
- 00:08:17discover new things that should belong
- 00:08:18in your product I think like on the
- 00:08:20intelligence side you can sort of squint
- 00:08:22and see like all right it's advancing
- 00:08:24this way and so the kinds of things that
- 00:08:26you'll want to do with the model and
- 00:08:28start building the product around that
- 00:08:29so I I there's three ways right
- 00:08:31intelligence feels not predictable but
- 00:08:33at least on like a slope that you can
- 00:08:35kind of watch there's the capabilities
- 00:08:37you decide to invest in from the product
- 00:08:38side and then do fine-tuning with the
- 00:08:40actual research teams um so something
- 00:08:42like artifacts we spend a lot of time
- 00:08:44between research I think the same was
- 00:08:45true with Canvas right like you're doing
- 00:08:47a like co-design co- research co-
- 00:08:49finetune and that's like I think a real
- 00:08:51privilege of getting to work at this
- 00:08:52company and getting to do design there
- 00:08:53and then there's the capability front so
- 00:08:55maybe speech mode for open a for us it's
- 00:08:57the computer use uh work that we we
- 00:08:59released this week you're like all right
- 00:09:0160% all right got good yes all right
- 00:09:04yeah and like so what we try to do is
- 00:09:06embed designers early in the process but
- 00:09:08knowing that like you're not placing a bet
- 00:09:10like the experimentation talk was
- 00:09:12saying like your um your output for
- 00:09:14experiment should be learning not
- 00:09:15necessarily like perfect products you're
- 00:09:16going to ship every time I think the
- 00:09:17same is true when you partner with
- 00:09:18research like your outcome is hopefully
- 00:09:20demos or informative things that like
- 00:09:22could spark product ideas not like a
- 00:09:24predictable product process where you're
- 00:09:26like well it's de-risked by now
- 00:09:28which means it's going to look that way
- 00:09:29when research comes along I've also one
- 00:09:32thing that I've really enjoyed because
- 00:09:34research is at least parts parts of
- 00:09:36research are very product oriented
- 00:09:38especially on the post-training side
- 00:09:39like Mike was saying and then parts of
- 00:09:41it are really like academic research at
- 00:09:43some level and so you we'll just also
- 00:09:47occasionally hear about some capability
- 00:09:49and we'll be in a meeting and you'll be
- 00:09:51like oh I really wish we could do this
- 00:09:52thing and a researcher on the team will
- 00:09:55be like oh no we can do that we've had
- 00:09:57that for three months and we're like
- 00:09:59really what does that like okay where do
- 00:10:02I learn more and they're like oh well we
- 00:10:03didn't think we didn't we didn't know it
- 00:10:05was important so you know I'm working on
- 00:10:07this other thing now um but you do just
- 00:10:10get like magic happening sometimes
- 00:10:13too uh one thing like we think a lot
- 00:10:16about when we're investing is actually
- 00:10:18like can you do anything with a model if
- 00:10:21it is 60% successful at a task instead
- 00:10:23of 99% and unlike lots of tasks that's
- 00:10:26closer to 60 right but the task is
- 00:10:28really important valuable like how do
- 00:10:30you how do you think about that
- 00:10:32internally in terms of evaluating
- 00:10:34progression on a task and then what what
- 00:10:37types of things you like put in sort of
- 00:10:40the burden of product to make it
- 00:10:42graceful failure or to like sort of
- 00:10:44cross the last mile for the user versus you know
- 00:10:47we just need to wait for the models to
- 00:10:48get better I'd argue there are a lot of
- 00:10:50things that you can actually do when
- 00:10:52something is 60% right you just you just
- 00:10:54need to really design for it um you have
- 00:10:56to expect that there's a human in the
- 00:10:58loop a lot more than there would be
- 00:10:59otherwise like if you look at uh take
- 00:11:02like GitHub Copilot right that was kind
- 00:11:04of the first AI product that really open
- 00:11:07people's eyes to like this thing can be
- 00:11:09useful not just as you know Q&A but for
- 00:11:11really economically valuable work and
- 00:11:15that launched I don't know exactly which
- 00:11:16model that was built off of but I mean
- 00:11:18it was multiple Generations ago so I
- 00:11:20guarantee you that model wasn't perfect
- 00:11:22at anything related to coding I think
- 00:11:24it's gpt2 which is like pretty small so
- 00:11:27yeah I mean and so but the fact that it
- 00:11:29was still valuable for you cuz if it got
- 00:11:31the code you know some significant
- 00:11:34fraction of the way there that was still
- 00:11:36stuff you didn't have to type yourself
- 00:11:37and you could edit it and so there are
- 00:11:39experiences like that that I think
- 00:11:40totally work I think we'll see the same
- 00:11:42kinds of things happening with um with
- 00:11:45sort of the the shift towards uh agents
- 00:11:48and longer form tasks where you know it
- 00:11:51may not be perfect but if it can save
- 00:11:54you five or 10 minutes that's still
- 00:11:55valuable and even more if the model can
- 00:11:57understand where it doesn't have
- 00:12:00confidence and can come back to you and
- 00:12:01say I'm not sure about this can you
- 00:12:02actually help me with this then you know
- 00:12:05the the combination of human and model
- 00:12:07together can be much higher than 60% I
- 00:12:09also find that 60% this magic 60% number
- 00:12:12like it's kind of lumpy I made it up five
- 00:12:14minutes ago that was a takeaway 60% that
- 00:12:17is our new that's the Mendoza Line of uh
- 00:12:19of AI like I think it's often very lumpy
- 00:12:22where like it'll do very well on some
- 00:12:23tasks and not well on others and I think
- 00:12:25that also helps like uh when we ever run
- 00:12:27like pilot programs with customers is
- 00:12:29really interesting when we'll get like
- 00:12:30the same day feedback from two different
- 00:12:32companies when be like it's solved our
- 00:12:33whole problem like we've been trying to
- 00:12:35do this for three months thank you other
- 00:12:36be like it was way off it's like worse
- 00:12:38than the other model and so like uh it's
- 00:12:40also humbling to know that you have your
- 00:12:42own internal evals but like the rubber
- 00:12:45hitting the road and actually seeing the
- 00:12:46model out in the world is where it's
- 00:12:48kind of the equivalent of like you do all
- 00:12:49this design and then like you put it in
- 00:12:50front of one user and you're like oh wow
- 00:12:52I was wrong uh the model has that
- 00:12:54feeling as well we like we try as hard
- 00:12:56as we can to like have a good sense but
- 00:12:58then people have their own custom data
- 00:13:00sets they have their own internal use
- 00:13:01they've prompted it a certain way and
- 00:13:03like uh so that underlies that sort of
- 00:13:05almost like bimodal nature of when you
- 00:13:07actually put it out in the world I'm
- 00:13:09curious if you feel
- 00:13:10this I think there's a very real sense
- 00:13:13in which models today are not
- 00:13:15intelligence limited they're eval
- 00:13:17limited yeah they can actually do much
- 00:13:19more and be much more correct on a wider
- 00:13:21range of things than they are today and
- 00:13:23it's really about sort of teaching them
- 00:13:25they have the intelligence you need to
- 00:13:27teach them certain specific topics that
- 00:13:29you know maybe weren't in their original
- 00:13:31training set but they can do it if you
- 00:13:32do it right yeah we've seen that all the
- 00:13:33time where like um uh there was a lot of
- 00:13:36like exciting AI deployments that
- 00:13:38happened in like you know maybe three
- 00:13:40years ago and now they're like we think
- 00:13:41the new models are better but we never
- 00:13:43did evals because all we were doing was
- 00:13:44just shipping cool AI features three
- 00:13:45years ago and like the hardest hump to
- 00:13:47get people over is like let's step back
- 00:13:49and like what does success actually look
- 00:13:51like for you like what problem are you
- 00:13:52solving like often the pm has rotated so
- 00:13:55it's like somebody's inherited it and
- 00:13:56then be like all right what does that
- 00:13:58look like all right let's write some
- 00:13:59evaluations what we've learned is like
- 00:14:01Claude is actually good at writing
- 00:14:02evaluations and also grading them so
- 00:14:04like we can automate a lot of this for
- 00:14:05you but you have to tell us what success
- 00:14:07looks like and then let's go and
- 00:14:09actually iteratively improve our way
- 00:14:11over there and like that is often like
- 00:14:13the difference between like 60% of a
- 00:14:15task and like 85% of task if you come
- 00:14:17interview at Anthropic which maybe you
- 00:14:18should uh at some point maybe you're
- 00:14:20happy in your role maybe not um uh you'll
- 00:14:22see one of the things we do in our
- 00:14:23interview process actually like make you
- 00:14:25get uh a prompt from like crappy eval to
- 00:14:28good and like just we want to see you
- 00:14:30think but like not enough of that Talent
- 00:14:31exists elsewhere so we're trying to get
- 00:14:33that like if there's one thing we can
- 00:14:35teach people that's probably the most
- 00:14:36important thing writing evals.
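As a concrete sketch of the LLM-graded eval loop Mike is describing here, something like the following — purely illustrative: `call_model` is a hypothetical stand-in for any provider SDK, and the cases and success criteria are invented.

```python
# Minimal sketch of an LLM-graded eval loop. call_model(prompt) -> str is a
# hypothetical stand-in for any provider SDK; the cases are invented examples.

CASES = [
    # Each case pairs an input with a plain-English statement of success --
    # the "what does success look like" step teams most often skip.
    {"input": "Summarize this ticket: 'App crashes on login since v2.3'",
     "success": "Mentions both the crash and the v2.3 regression."},
    {"input": "Extract the total from: 'Total due: $1,240.50 by June 1'",
     "success": "Returns exactly $1,240.50."},
]

GRADER = """You are grading a model output.
Task input: {input}
Model output: {output}
Success criterion: {success}
Reply PASS or FAIL, then one sentence of reasoning."""

def run_eval(call_model):
    passed = 0
    for case in CASES:
        output = call_model(case["input"])
        verdict = call_model(GRADER.format(output=output, **case))
        if verdict.strip().upper().startswith("PASS"):
            passed += 1
    return passed / len(CASES)  # e.g. 0.6 -> iterate on prompts, re-run
```

Writing the `success` line is the hard, human part; the grading itself is what the models can increasingly automate.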
- 00:14:38I actually think it's going to become a
- 00:14:40core skill for PMS we actually had this
- 00:14:43and maybe this is like a little inside
- 00:14:44baseball but I thought this was
- 00:14:45interesting like internally we had our
- 00:14:47research PMS who like work a lot on
- 00:14:49model capabilities and model development
- 00:14:51and then we had our like more like
- 00:14:52product surface PMS or API PMs and we
- 00:14:55ended up realizing that like the job of
- 00:14:57a PM in 2024-2025 building AI powered
- 00:15:00features looks is looking more and more
- 00:15:02like the former than the latter in a lot
- 00:15:04of cases like uh we launched uh like our
- 00:15:07uh code analysis and like basically
- 00:15:09Claude can analyze CSVs and write code
- 00:15:10for you now and the PM there was like
- 00:15:13getting it 80% of the way there and then
- 00:15:14having to hand it over to the PM that
- 00:15:15could write the evals then go to like
- 00:15:17fine tune and like prompt I was like
- 00:15:19that's actually the same role like the
- 00:15:20quality of your feature is now gated on
- 00:15:22how well you have done the evals and the
- 00:15:24prompts and so like that PM like
- 00:15:27definition is definitely just merged now
- 00:15:29yeah absolutely I we we set up a boot
- 00:15:31camp and like took every PM through uh
- 00:15:35writing evals and like what it was like
- 00:15:37difference between good and bad evals
- 00:15:39and you know we're we're definitely not
- 00:15:41done there we've got to keep iterating
- 00:15:42and getting better on it but it is such
- 00:15:44a critical part of of making a good
- 00:15:46product with AI yeah as part of this
- 00:15:49recruiting call for any of the people
- 00:15:51who want to be good at building AI
- 00:15:53product or research product in the
- 00:15:55future um we can't come to your boot
- 00:15:58camp Kevin so how do we develop some
- 00:16:00intuition for getting good at this eval
- 00:16:03and iteration L I actually think it's
- 00:16:06something you can you can use the models
- 00:16:07themselves for like you were talking
- 00:16:09about you can ask the models at this
- 00:16:10point what makes a good eval give me you
- 00:16:12know I want to do this can you write me
- 00:16:13a sample eval and it will it will be
- 00:16:16pretty good yeah I think that's like
- 00:16:18that goes a long way I think there's also
- 00:16:20this question of like it's and I if you
- 00:16:23listen to like everybody from like Andrej
- 00:16:25Karpathy to others who have like spent a
- 00:16:27lot of time in the field like nothing
- 00:16:28beats looking at at data and so like
- 00:16:30people often get caught up um being like
- 00:16:32well we already have these evaluations
- 00:16:33and the new model is like 80% there
- 00:16:35rather than 78% we can't or like
- 00:16:37you know it's worse and I was like have
- 00:16:38we looked at the cases where it fails
- 00:16:40and you're like oh actually this was
- 00:16:41better it's just our grader is not as
- 00:16:43good you know or um it's funny like
- 00:16:46again a little inside baseball you know
- 00:16:47like every model release has the model
- 00:16:49card and some of these model uh these
- 00:16:51evals we've seen like even the golden
- 00:16:53answer I'm like I'm not sure a human
- 00:16:55would say it or like I think that math
- 00:16:56is actually a little wrong like getting
- 00:16:58100% is going to be really hard cuz even
- 00:16:59just grading them is very challenging so
- 00:17:01like I'd encourage you to like the way
- 00:17:03you build the intuition is go look at
- 00:17:04the actual answers even to sample them
- 00:17:07be like all right yeah maybe we should
- 00:17:09evolve The evals or maybe like The Vibes
- 00:17:11are good even if the eval is like tied
- 00:17:13so like getting real and getting like
- 00:17:15deep on the data I think matters I also
- 00:17:18think it'll be really interesting to see
- 00:17:19how this evolves as we go towards longer
- 00:17:21form more agentic tasks because it's one
- 00:17:24thing when your evals are like I gave
- 00:17:26you this math thing and you were able to
- 00:17:28like add four-digit numbers and get to
- 00:17:30the right answer you know it's easy to
- 00:17:32know what what good looks like there as
- 00:17:34the models start to do more uh long form
- 00:17:38more ambiguous things go get me a hotel
- 00:17:41in New York City you know what's what's
- 00:17:44right there a lot of it will be about
- 00:17:46personalization uh you know if you ask
- 00:17:48any two humans who are perfectly
- 00:17:50competent they're going to do two
- 00:17:52different things so your grading becomes
- 00:17:55much softer and it you know it'll just
- 00:17:57be interesting we I think we'll have to
- 00:17:58to evolve yet again like speaking of
- 00:18:00having to reinvent stuff over and over
- 00:18:02again I think a lot like when you think
- 00:18:04about and I think both Labs have some
- 00:18:06concept of like this is what
- 00:18:08capabilities look like as things evolve
- 00:18:09like it looks a little bit like a career
- 00:18:11ladder like what bigger and longer
- 00:18:12Horizon tasks are you taking and maybe
- 00:18:14like eval start looking more like
- 00:18:16performance review I'm in performance
- 00:18:17review season so this is the metaphor
- 00:18:18that's in my head sorry but it's like
- 00:18:20you know like did the model like meet
- 00:18:22your expectation of like what a
- 00:18:23competent human have done did it exceed
- 00:18:25it because it did it twice as fast or
- 00:18:26like discovered some restaurant you
- 00:18:28wouldn't have known it greatly exceed
- 00:18:29meets most like it starts being like
- 00:18:31more nuanced than just like right or
- 00:18:33wrong let alone you have humans writing
- 00:18:36these evals and the models are getting to
- 00:18:38the point where they can often beat
- 00:18:39humans at certain tasks like people
- 00:18:41prefer the model's answers to a human's
- 00:18:43answers and so if your humans writing
- 00:18:44your evals like yeah you know so what
- 00:18:47does that
- 00:18:48mean uh what okay evals are clearly the
- 00:18:52key um we're going to go spend a bunch
- 00:18:54of time with these models teaching
- 00:18:56ourselves to write evals what other
- 00:18:58skills should product people be
- 00:19:00learning now you you're both on that
- 00:19:01learning path I think uh prototyping
- 00:19:05with these models is a thing that is
- 00:19:07underused like our best PMS do this
- 00:19:09where we'll get into some long
- 00:19:10conversation about like should the UI be
- 00:19:12this or that and before our designers
- 00:19:14have even like picked up their figma
- 00:19:16like our our often our PMS or sometimes
- 00:19:19our Engineers will be like great I
- 00:19:21prompted Claude I did like an AB
- 00:19:22comparison of what these two UI could
- 00:19:24look like let's try them and I'm like oh
- 00:19:25this is really cool I'm like play that
- 00:19:27out and like we'll be able to
- 00:19:28prototype a far greater variety
- 00:19:31and evaluate um like on a much faster
- 00:19:34scale than before so like that skill of
- 00:19:36like using these tools to actually be in
- 00:19:39prototyping mode I think is a really
- 00:19:41really useful one that's a good one.
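A concrete version of that prototyping flow might look like the sketch below: ask a model for two self-contained UI variants and open them side by side. `call_model`, the prompt, and the design directions are all hypothetical.

```python
# Sketch of model-driven A/B prototyping: generate two UI variants and
# compare them in a browser. call_model is a hypothetical provider wrapper.

PROMPT = """Generate one self-contained HTML file for a settings page.
Design direction: {direction}
Return only the HTML."""

DIRECTIONS = {
    "A": "group options into collapsible sections",
    "B": "show all options in one flat, searchable list",
}

def prototype(call_model):
    for variant, direction in DIRECTIONS.items():
        html = call_model(PROMPT.format(direction=direction))
        with open(f"settings_{variant}.html", "w") as f:
            f.write(html)
    # Open settings_A.html and settings_B.html side by side and compare.
```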
- 00:19:43I would also, you sort of said this,
- 00:19:46but I think it's also going to push PMS
- 00:19:49to go deeper into the tech stack yeah um
- 00:19:51because it's and maybe that changes over
- 00:19:53the years like that if you were doing
- 00:19:55like database Tech in I don't know 2005
- 00:19:59maybe it required you to be able to go
- 00:20:01really deep in a different way than it
- 00:20:02would if you were doing database Tech
- 00:20:04now like layers of abstraction get built
- 00:20:06and you maybe don't need to know all the
- 00:20:08fundamentals but it's not like every PM
- 00:20:11needs to be a researcher by any means
- 00:20:12but I think having an appreciation for
- 00:20:14it spending time and learning the
- 00:20:17language and gaining intuition for how
- 00:20:19this stuff works a little bit I think
- 00:20:21will go a long way I think the other
- 00:20:22piece like you're dealing with this like
- 00:20:24stochastic non-deterministic system
- 00:20:26which like evals are our best attempt to
- 00:20:28do it but like product design in a world
- 00:20:30where like you're not in control of what
- 00:20:33the model is going to say you can try
- 00:20:35and so like what are the feedback
- 00:20:36mechanisms that you need to close that
- 00:20:38Loop like how do you decide when like
- 00:20:40the model's gone astray how do you
- 00:20:41collect that feedback in a in a rapid
- 00:20:43way you know like what are the guard
- 00:20:44rails you want to put in like how do you
- 00:20:47even know what it's doing in aggregate
- 00:20:48like it's a much more like you're
- 00:20:51understanding like the output of this
- 00:20:53intelligence across a lot of outputs
- 00:20:55over a lot of people every single day it
- 00:20:57just requires a very different set that
- 00:20:59like oh the bug report is you clicked on
- 00:21:01the button and it didn't follow the user
- 00:21:02it's like that's a pretty knowable kind
- 00:21:04of problem right and and maybe this will
- 00:21:05change you know 5 years from now when
- 00:21:08people are used to it but I think we're
- 00:21:09all still in the mode of adapting to
- 00:21:11this sort of
- 00:21:13non-deterministic user interface
- 00:21:15ourselves uh and certainly people who
- 00:21:17are not you know tech people here in
- 00:21:19this room working on Tech products who
- 00:21:21are using AI are are definitely not used
- 00:21:24to it like it goes against all of the
- 00:21:25intuition that we've built up for the
- 00:21:27last like 25 years of using
- 00:21:29computers uh and so like the idea that
- 00:21:32you're going to put in the exact same
- 00:21:33things normally if you put in the exact
- 00:21:35same inputs computers give you the exact
- 00:21:37same outputs and that is no longer true
- 00:21:40uh and it it it's not just that we have
- 00:21:43to adapt to it Building Products we have
- 00:21:44to also put oursel in the shoes of the
- 00:21:47people who are using our products and
- 00:21:49think about what this means for them and
- 00:21:50there's like I mean there are downsides
- 00:21:52to it there also really cool upsides and
- 00:21:54so it's fun to kind of think about how
- 00:21:56how you can use that to your advantage
- 00:21:58in different ways I remember like we we
- 00:22:00did like a lot of like rolling user
- 00:22:02research at Instagram so we have like
- 00:22:04the same like or researchers would bring
- 00:22:06in different people every single week
- 00:22:07whatever it was like prototype ready
- 00:22:08would get put through it and we do the
- 00:22:10same thing at anthropic but what's
- 00:22:11interesting is like for those sessions
- 00:22:13what would often surprise me is like how
- 00:22:15users were using Instagram there's
- 00:22:16something interesting about like their
- 00:22:17use case or like their reaction to a new
- 00:22:19feature and like now it's like half that
- 00:22:21and half what the model did in that
- 00:22:22situation you're like oh it did the
- 00:22:24right thing this is great so there's
- 00:22:25like this like very like almost like a
- 00:22:28sense of Pride maybe of like when it
- 00:22:29reacts well and you're like in a user
- 00:22:31research environment and then like also
- 00:22:33the like frustration you're like oh no
- 00:22:35you misunderstood the intent and now
- 00:22:36you're like 10 pages down into this
- 00:22:38answer and so like it it's also like
- 00:22:40maybe a little of getting Zen about like
- 00:22:42letting go of control and you know
- 00:22:44what's going to happen in those
- 00:22:45environments yeah you have both worked
- 00:22:48on these consumer experiences that
- 00:22:49taught new behaviors to you know many
- 00:22:52hundreds of millions of people uh
- 00:22:55quickly uh these AI products are
- 00:22:57happening actually faster than that
- 00:22:59right and you know if if PMs and
- 00:23:02Technical people don't have that much
- 00:23:04intuition naturally for how to use them
- 00:23:06how do you think about educating end
- 00:23:08users at the scale you're both working
- 00:23:10with on something that is so unintuitive
- 00:23:13I mean it it is kind of amazing how fast
- 00:23:16we all
- 00:23:17adapt uh I was talking to somebody the
- 00:23:20other day and they were telling me about
- 00:23:21their first Waymo ride who's ridden in a
- 00:23:24Waymo who rode one here yeah if you
- 00:23:27haven't ridden in a Waymo you're in San
- 00:23:29Francisco ride a Waymo to wherever you're
- 00:23:31going when you leave here it's a magical
- 00:23:34experience but they were like my first
- 00:23:3730 seconds I was like oh my God watch
- 00:23:40out for that
- 00:23:40bicyclist right and then 5 minutes in it
- 00:23:43was like oh my God I'm living in the
- 00:23:46future and then 10 minutes later it was
- 00:23:49like bored scrolling on your phone like
- 00:23:52you know how quickly we become used to
- 00:23:54something that is just absolute magic
- 00:23:56yeah um and I think I mean ChatGPT is
- 00:24:00less than 2 years old and it was
- 00:24:03absolutely mind-blowing when it exist or
- 00:24:05when it when it first came and now I
- 00:24:07think if we had to go back and use the
- 00:24:08original whatever it was GPT 3.5 I think
- 00:24:12the horror yeah yeah like no everybody
- 00:24:14be like ugh it's so dumb how could I possibly you
- 00:24:18know and and you know the stuff that's
- 00:24:20that's happening today that we're
- 00:24:22working on that you guys are working on
- 00:24:24it all feels like magic 12 months from
- 00:24:26now we're going to be like can you can
- 00:24:27you believe if we use that garbage
- 00:24:30because it's going to I me that's how
- 00:24:31fast this thing is moving but it's also
- 00:24:33amazing to me how quickly people adapt
- 00:24:35because I mean as much as we try and
- 00:24:37bring people along like there are also
- 00:24:39um there's just there's a lot of
- 00:24:41excitement people understand that this
- 00:24:43is like the world is moving in this
- 00:24:45direction and um we've got to try and
- 00:24:47make it the best possible move that we
- 00:24:50can but uh it's it's happening and it's
- 00:24:52happening fast one thing we're trying to
- 00:24:53get better at and that's is also letting
- 00:24:55the product be like educational in a
- 00:24:57very little away which is like a thing
- 00:24:59we did not do early and now we're
- 00:25:01changing is just tell Claude more about
- 00:25:04itself which was like you know it's in
- 00:25:05its training set that it's you know uh
- 00:25:08artificial intelligence created by
- 00:25:09anthropic whatever but now we're
- 00:25:10literally like and here's how you use
- 00:25:12this feature we shipped because people would
- 00:25:13ask and again this came from user
- 00:25:14research because we'd be like they would
- 00:25:16be like how do I use this thing and then
- 00:25:18Cloud would be like I don't know have
- 00:25:19you tried like looking at it on the
- 00:25:21internet you're like no that's un
- 00:25:22helpful and So like um uh like we're
- 00:25:25really trying to ground it and then at
- 00:25:26launch time we're like you know it's a
- 00:25:28process were improving but like it's
- 00:25:29it's cool to now see like this is the
- 00:25:31exact link to the documentation like
- 00:25:32here's how you do it like I can help you
- 00:25:34step by step oh you're stuck I can help
- 00:25:36you here so uh these things are actually
- 00:25:38very good at solving uh UI problems and
- 00:25:41like user confusion and like we should
- 00:25:43use them more for that yeah that's got
- 00:25:45to be different when you are um you know
- 00:25:47trying to do like change management in
- 00:25:48an Enterprise though right because
- 00:25:50there's a there's a status quo for how
- 00:25:52are you doing things there's
- 00:25:53organizational process like how do you
- 00:25:55think about educating entire
- 00:25:57organizations about productivity
- 00:25:59improvements or whatever else can come I
- 00:26:00think the Enterprise one is really
- 00:26:01interesting because like even like these
- 00:26:04products have like millions and millions
- 00:26:05of users but like the power users are
- 00:26:08very much I think still like early
- 00:26:10adopters people who like technology and
- 00:26:11then there's like a you know long tail
- 00:26:13whereas when you go into Enterprise
- 00:26:14you're deploying to like an organization
- 00:26:16that is like often there's folks who are
- 00:26:17like not very non-technical and like I
- 00:26:19think that's really cool actually seeing
- 00:26:22fairly non-technical users get exposed
- 00:26:24to like a chat powered llm for the first
- 00:26:27time and then getting to see it then you
- 00:26:29have the luxury of like getting to run
- 00:26:30like a session where you teach them
- 00:26:32about it and like have educational
- 00:26:33materials um and so I think we need to
- 00:26:36learn from what happens in those and
- 00:26:38then say like that's what we need to do
- 00:26:39to teach the next 100 million people how
- 00:26:41to use these these these uis and they're
- 00:26:43usually power users internally and
- 00:26:45they're they're excited to teach the
- 00:26:48rest of people and you know like with
- 00:26:50open AI we have these custom gpts that
- 00:26:52you can make and organizations make
- 00:26:54thousands of them often and it's a way
- 00:26:56for the power users to make something
- 00:26:58that makes AI easier and like
- 00:27:01immediately valuable for the people that
- 00:27:03might not know how to use it otherwise
- 00:27:05um so like that's one cool thing you
- 00:27:07find the pockets of power users and they
- 00:27:09actually will sort of be
- 00:27:11evangelists I I have to ask you then
- 00:27:14because you you know your organizations
- 00:27:16are both like all power users right so
- 00:27:18you know you're living in your little
- 00:27:19pocket of the future uh I'll ask about
- 00:27:22one thing but feel free to redirect Mike
- 00:27:24how am I supposed to use computer use
- 00:27:26this is amazing like what are you guys
- 00:27:27doing
- 00:27:28yeah well internally like we're I mean
- 00:27:31this to Kevin's earlier comment around
- 00:27:32like when is it going to be ready all
- 00:27:34right like go this like it was pretty
- 00:27:36late breaking like we like had
- 00:27:38conviction that it was like this is like
- 00:27:40good and like we want to put this down
- 00:27:41like it's early still and it's like
- 00:27:43still going to make mistakes but like
- 00:27:44how do we do this as well the funniest
- 00:27:46use case like while we were beta testing
- 00:27:47it was like somebody was like I wonder
- 00:27:49if I can get it to order us a pizza and
- 00:27:50like it did and they're like great
- 00:27:51there's do like the the moment where
- 00:27:53Domino's shows up at your office and it
- 00:27:55was ordered entirely by AI is like a
- 00:27:57very was a very cool like seminal moment
- 00:27:58and then we're like oh but it's Domino's
- 00:28:00but like you know like but like it was
- 00:28:02definitely like amazing yeah uh but it
- 00:28:05was AI you know so it was all it was it
- 00:28:06was it was good it also like ordered
- 00:28:08quite a bit of pizza so it was like
- 00:28:09maybe hungrier than intended uh some
- 00:28:11early things that we're seeing that we
- 00:28:12think are really interesting one is UI
- 00:28:14testing which is like I was like at
- 00:28:16Instagram we had basically no UI tests
- 00:28:18because they're hard to write they're
- 00:28:19like they're brittle um and they're like
- 00:28:21often like a little bit like oh like we
- 00:28:23moved this button around and like it
- 00:28:24should still pass that was the point of
- 00:28:26the PR but like now it's going to fail
- 00:28:27we're going to have to like do this
- 00:28:28whole other snapshot um and like early
- 00:28:30signs are like computer use just works
- 00:28:31really well for like hey does it work as
- 00:28:33intended does it do the thing that you
- 00:28:34want it to do and I think that's like
- 00:28:36been very very interesting.
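A deliberately simplified sketch of that kind of intent-level UI check follows. It is not the actual computer-use API: `agent_step` and `take_screenshot` are hypothetical stubs for an agent that can look at the screen and act on it.

```python
# Hypothetical harness for "does the UI work as intended" checks.
# agent_step(instruction, screenshot) -> str and take_screenshot() are
# stubs for a computer-use-style agent; this is not a real SDK call.

def check_flow(agent_step, take_screenshot, actions, expected):
    for action in actions:
        # The agent inspects the current screen and performs one step;
        # a real harness would emit the resulting click/type events.
        agent_step(action, take_screenshot())
    verdict = agent_step(
        f"Does this screen satisfy: '{expected}'? Answer YES or NO.",
        take_screenshot(),
    )
    return verdict.strip().upper().startswith("YES")

# Usage: check_flow(agent_step, take_screenshot,
#                   ["open settings", "enable dark mode"],
#                   "the app is rendered in dark mode")
```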
- 00:28:38and then what we're starting to get into is
- 00:28:39like what are the agentic things that
- 00:28:40just like involve a lot of like data
- 00:28:43manipulation so we're looking at it with
- 00:28:44our support teams and our finance teams
- 00:28:46around like those PR forms are going to
- 00:28:48fill themselves but like it's very
- 00:28:50repetitive you of have data in one Silo
- 00:28:52you want to put it in a different Silo
- 00:28:53and it just requires like human time
- 00:28:55like I keep using the word drudgery when
- 00:28:57I talk about computer like can we
- 00:28:58automate the drudgery so you can focus
- 00:29:00on the creative stuff and not like the
- 00:29:02you know 30 clicks to do one single
- 00:29:06thing Uh Kevin I I think we have a lot
- 00:29:09of teams that are um experimenting with
- 00:29:12o1 you can obviously do much more
- 00:29:13sophisticated things you also can't use
- 00:29:15it as a one-for-one replacement if you're
- 00:29:17already using right one of the you know
- 00:29:19GPT-4o models or whatever in uh in your
- 00:29:22application like can you give us some
- 00:29:24guidance what are you guys doing with it
- 00:29:26internally so I think one thing
- 00:29:29that people maybe don't realize that
- 00:29:31actually a lot of the most sophisticated
- 00:29:33customers of ours are doing and that
- 00:29:35we're certainly doing internally is it's
- 00:29:36not really about one model for any
- 00:29:39particular thing you end up putting
- 00:29:41together sort of workflows and
- 00:29:43orchestration between models and so you
- 00:29:45use them for what they're good at
- 00:29:47o1's really good at reasoning but it
- 00:29:48also takes a little bit of time to think
- 00:29:50and it's not multimodal and you know has
- 00:29:52other limitations you Define reasoning
- 00:29:54for the group I realize it's a basic
- 00:29:55question but yeah so uh we people are I
- 00:29:59think pretty used to the concept the
- 00:30:01like scaling pre-training concept you go
- 00:30:04GPT-2, 3, 4, 5, whatever and you're
- 00:30:07doing bigger and bigger runs on
- 00:30:09pre-training these models are getting
- 00:30:10you know smarter and smarter um like
- 00:30:13they or rather maybe they know more and
- 00:30:15more but they're kind of like system one
- 00:30:18thinking right it's it's you ask it a
- 00:30:20question you immediately get an answer
- 00:30:22it's like text completion yeah sort of
- 00:30:24if I ask you me asking you questions
- 00:30:26right now and you just have to stream
- 00:30:28one token at a time keep going don't think
- 00:30:30it's amazing actually how much human
- 00:30:33like your intuition about how other
- 00:30:34humans work will often like help you in
- 00:30:38intuiting about how these models work um
- 00:30:40you know you asked me a question I got
- 00:30:42off onto the wrong like sentence it's
- 00:30:44hard to recover the models totally do
- 00:30:46the same thing um but uh so you've got
- 00:30:50that that sort of larger and larger
- 00:30:52pre-training 01 is actually a different
- 00:30:56way of scaling
- 00:30:58intelligence by doing it at uh at query
- 00:31:01time basically so instead of system one
- 00:31:04thinking I ask you a question and
- 00:31:05immediately tries to give you an answer
- 00:31:07it'll pause same thing I would you know
- 00:31:09you would do if I asked you a question I
- 00:31:10said solve this Sudoku do this New York
- 00:31:13Times connections puzzle you you would
- 00:31:15start going okay these words how do they
- 00:31:17group together okay these might be these
- 00:31:19four well no I'm not sure could be you
- 00:31:22know you're you're like forming
- 00:31:23hypotheses using what you know to refute
- 00:31:26these hypothesis or affirm them and then
- 00:31:29from that continuing to reason on it's
- 00:31:32how it's How scientific breakthroughs
- 00:31:34are made it's how we answer hard
- 00:31:36questions um and so this is about
- 00:31:38teaching the models to do it and right
- 00:31:40now you know they'll think for 30 or 60
- 00:31:43seconds before they answer imagine what
- 00:31:45happens if they can think for five hours
- 00:31:47or five days um so it's basically a new
- 00:31:50way to scale intelligence and we feel
- 00:31:53like we're just at the very beginning
- 00:31:55you know we're at the like GPT-1 phase
- 00:31:58of this new form of reasoning.
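For intuition only, a toy external version of that propose-and-check loop might look like the sketch below, with the big caveat that a reasoning model does this deliberation internally within a single request rather than through orchestrated round trips; `call_model` is a hypothetical provider wrapper.

```python
# Toy propose-critique-revise loop, as an analogy for spending more compute
# at query time instead of answering in one shot. call_model is hypothetical;
# real reasoning models do this internally within one request.

def deliberate(call_model, question, max_rounds=5):
    notes = []
    for _ in range(max_rounds):
        candidate = call_model(
            f"Question: {question}\nRejected so far: {notes}\n"
            "Propose one candidate answer."
        )
        critique = call_model(
            f"Question: {question}\nCandidate: {candidate}\n"
            "Find a concrete flaw, or reply SOUND if there is none."
        )
        if critique.strip().upper().startswith("SOUND"):
            return candidate
        notes.append(f"{candidate!r} rejected: {critique}")
    return call_model(f"Question: {question}\nBest effort given: {notes}")
```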
- 00:32:01but in the same way you don't use
- 00:32:03it for everything right there are
- 00:32:04sometimes when you ask me a question you
- 00:32:05don't want me to wait 60 seconds you I
- 00:32:07should just give you an answer um so we
- 00:32:10end up using our models in a bunch of
- 00:32:13different ways together so for example
- 00:32:15like cyber security you would think not
- 00:32:18really a use case for models they can
- 00:32:20hallucinate that seems like a bad place
- 00:32:21to hallucinate but you can like fine-
- 00:32:25tune a model to be good at certain
- 00:32:27tasks and then you can fine-tune models
- 00:32:30to be very precise about the kinds of
- 00:32:32inputs and outputs that they expect and
- 00:32:34have these models start working in in
- 00:32:36concert together and you know models
- 00:32:39that are checking the outputs of other
- 00:32:40models realizing when something doesn't
- 00:32:42make sense asking it to try again um and
- 00:32:47uh so like that ends up being how we get
- 00:32:50a ton of value out of our own models
- 00:32:52internally it's like specific use cases
- 00:32:56uh and or orchestrations of models
- 00:32:59together designed sort of working in
- 00:33:00concert to do specific tasks which again
- 00:33:03going back to like reasoning about how
- 00:33:04we work as humans how do we do complex
- 00:33:07things as humans you have different
- 00:33:08people who often have different skill
- 00:33:10sets and they work together to
- 00:33:11accomplish a hard
- 00:33:13task.
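A minimal sketch of that orchestration pattern, assuming nothing beyond two hypothetical model wrappers: a drafting model does the work, a checker model validates the output, and the pipeline retries with the failure reason attached.

```python
# Sketch of model orchestration with a checker: draft, validate, retry.
# drafter and checker are hypothetical wrappers around two model endpoints.

def run_task(drafter, checker, task, max_retries=3):
    for _ in range(max_retries):
        draft = drafter(task)
        verdict = checker(
            f"Task: {task}\nOutput: {draft}\n"
            "Does the output make sense for the task? "
            "Reply OK, or RETRY plus the reason."
        )
        if verdict.strip().upper().startswith("OK"):
            return draft
        # Feed the checker's objection back so the next draft improves.
        task = f"{task}\nA previous attempt failed because: {verdict}"
    raise RuntimeError("no acceptable output after retries")
```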
- 00:33:16I can't let you guys get away without telling us something
- 00:33:18about the future and what's coming and
- 00:33:20so um you don't have to give us release
- 00:33:23dates I understand you don't know but uh
- 00:33:26if you if you look out I I think the
- 00:33:28furthest anyone can look out in AI right
- 00:33:29now is like well tell me if you can see
- 00:33:31the future but like let's say like 6
- 00:33:33months 12 months like what's an
- 00:33:35experience that you imagine is going to
- 00:33:37be possible or prevalent I think a lot
- 00:33:40about um like to Breaking the well I
- 00:33:43think a lot about this all the time but
- 00:33:45like the um two maybe two words to be
- 00:33:47like plant seeds in in everybody's mind
- 00:33:50like one is proactivity like how do the
- 00:33:51models become more proactive like once
- 00:33:53they know about you and they're
- 00:33:54monitoring like they're reading your
- 00:33:56email in a good not creepy way and
- 00:33:58they're like uh because you authorized
- 00:33:59them to and then they like you know spot
- 00:34:02an interesting Trend or you start your
- 00:34:03day with something that's a like um like
- 00:34:05a proactive like uh recap of what's
- 00:34:08going on some conversations you're going
- 00:34:09to have I prepared some research for you
- 00:34:11hey your next meeting is coming up like
- 00:34:13here's what you might want to talk about
- 00:34:14I saw you have this like presentation
- 00:34:16coming up here's the first draft that I
- 00:34:17put together like that kind of
- 00:34:18proactivity I think is going to be
- 00:34:20really really powerful and then the
- 00:34:21other part is being more asynchronous so
- 00:34:23like uh I think o1 is like early UI in
- 00:34:27this exploration which is like it's
- 00:34:29going to do a lot and it's going to tell
- 00:34:30you kind of what it's going to do along
- 00:34:31the way and like you can sit there and
- 00:34:33wait for it but you could also like be
- 00:34:34like it's going to think for a while I'm
- 00:34:35going to go like do something else maybe
- 00:34:37tab back maybe it like can tell me when
- 00:34:39it's done like yeah I expanding the time
- 00:34:41Horizon both in terms of like you didn't
- 00:34:43ask a question it just told you
- 00:34:44something I think that's going to be
- 00:34:45interesting and then you did ask a
- 00:34:47question and you're going to be like
- 00:34:48great like I'm going to go reason about
- 00:34:50it I'm going to go research it I might
- 00:34:52have to ask another human about it like
- 00:34:53and then I'm going to like maybe come up
- 00:34:55with my first answer I'm going to vet
- 00:34:56that answer you'll hear back from me in
- 00:34:58like an hour like Breaking Free of those
- 00:35:00like uh constraints of like expecting an
- 00:35:02answer immediately I think will let you
- 00:35:04do things like hey I have this like
- 00:35:06whole like mini project plan like go
- 00:35:08flesh it out or like not just like I
- 00:35:10want you to like change this one thing
- 00:35:11on the screen but like fix this bug for
- 00:35:13me like take my PRD and like adapt it
- 00:35:16for these new market conditions like
- 00:35:18adapt it for these three different
- 00:35:19market conditions that emerge like
- 00:35:20being able to push those Dimensions I
- 00:35:22think is what I'm personally most
- 00:35:23excited about on the product side.
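A rough sketch of what that proactive recap could look like as a scheduled job; `fetch_calendar`, `fetch_unread`, and `call_model` are all hypothetical stand-ins for real integrations.

```python
# Hypothetical proactive "morning brief" job: summarize today's calendar and
# inbox before the user asks. All three callables are invented stand-ins.

def morning_brief(call_model, fetch_calendar, fetch_unread):
    events = fetch_calendar()   # e.g. [{"time": "10:00", "title": "..."}]
    emails = fetch_unread()     # e.g. [{"from": "...", "subject": "..."}]
    return call_model(
        "Prepare a proactive morning brief.\n"
        f"Meetings today: {events}\n"
        f"Unread email: {emails}\n"
        "Summarize what matters, flag anything urgent, and draft prep "
        "notes for the first meeting."
    )
```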
- 00:35:26yeah I completely agree with all of that
- 00:35:28um and it's the models are going to get
- 00:35:31smarter at an accelerating rate I think
- 00:35:33which is also part of how all of that uh
- 00:35:35comes to pass another thing that will be
- 00:35:38really exciting is seeing the models
- 00:35:40able to interact in all the same ways
- 00:35:42that we as humans interact you know
- 00:35:44right now you mostly type to these
- 00:35:46things and you know I mostly type to a
- 00:35:48lot of my friends on WhatsApp and other
- 00:35:49things but I also speak I also can see
- 00:35:54and uh we just we launched this advanced
- 00:35:57voice mode relatively recently I was
- 00:35:59in uh I was in Korea and
- 00:36:02Japan having
- 00:36:04conversations and I would just I would
- 00:36:06often be with somebody with whom I had
- 00:36:09no common language whatsoever before
- 00:36:11this we could not have said a word to
- 00:36:13each other and instead I was like Hey
- 00:36:16ChatGPT I want you to act as a
- 00:36:17translator when I say something in
- 00:36:19English I want you to say it in Korean
- 00:36:21and when you hear something in Korean
- 00:36:23say it back to me in English and all of
- 00:36:24a sudden I had this Universal translator
- 00:36:26and I was having business conversations
- 00:36:29with another person uh and it was
- 00:36:32magical and you think what that can do
- 00:36:35like not just in a business context but
- 00:36:36think about people's willingness to
- 00:36:38travel to new places if you don't ever
- 00:36:39have to be worried about not speaking
- 00:36:41the language and you've got this like
- 00:36:42Star Trek Universal translator in your
- 00:36:44pocket you know and so experiences like
- 00:36:47that I think it's going to become
- 00:36:49commonplace fast but it's magical and
- 00:36:51I'm excited about that in combination
- 00:36:54with all the stuff Mike was just
- 00:36:56saying.
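Reduced to text, the translator pattern is just one standing instruction plus a relay call; `call_model` is a hypothetical provider wrapper, and the language pair mirrors the anecdote.

```python
# The two-way translator prompt pattern, in text form. call_model is a
# hypothetical provider wrapper; the language pair mirrors the anecdote.

INSTRUCTION = (
    "Act as a translator. When given English, reply only with the Korean "
    "translation. When given Korean, reply only with the English translation."
)

def relay(call_model, utterance):
    return call_model(f"{INSTRUCTION}\n\n{utterance}")
```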
- 00:37:00oh one of my favorite pastimes now just you know since uh voice mode
- 00:37:03release is actually watching there's a
- 00:37:05genre of TikTok of well this just
- 00:37:07speaks to how old I am like there's a
- 00:37:09genre of TikTok where you just like uh
- 00:37:11it's just young people talking to voice
- 00:37:13mode like pouring their heart out using
- 00:37:15it all these ways where I'm like oh my
- 00:37:17God like there's this old term being
- 00:37:19like digitally native or mobile native
- 00:37:21and I'm like I like pretty strongly
- 00:37:24believe in this AI thing and I would not
- 00:37:26think to interact in this way but people
- 00:37:29who are 14 years old are like well I
- 00:37:31expect the AI to be able to do that and
- 00:37:33I love that have you ever given it to
- 00:37:35your kids uh I haven't yet my kids are
- 00:37:37like five and seven Kevin knows them so
- 00:37:39we but we'll get there I mean mine are
- 00:37:41eight and 10 but like on a car ride
- 00:37:43they'll be like can I talk to chat GPT
- 00:37:45yes and they will ask it the most
- 00:37:47bizarre things they will just have
- 00:37:49weirdo conversations with it but they're
- 00:37:52perfectly happy talking to an AI yeah
- 00:37:54actually one of my favorite experiences
- 00:37:56and maybe we'll close and ask you for
- 00:37:57like the most surprising Behavior kids
- 00:37:59or not is uh um like when my parents
- 00:38:04read to me like I was lucky if I
- 00:38:07got to choose the book and it wasn't my
- 00:38:08dad being like we're going to read this
- 00:38:10physics study I'm interested in right my
- 00:38:13kids I don't know if it's just like
- 00:38:14parenting in the Bay Area but my kids
- 00:38:16are like okay Mom make the images right
- 00:38:19I want to tell a story about the dragon
- 00:38:22unicorn in this setting I'm going to
- 00:38:23tell you exactly how it's going to
- 00:38:25happen create it in real time and I'm
- 00:38:27like like that's a big ask I'm glad you
- 00:38:30believe and like know that's possible
- 00:38:32but it's it's a wild way to like create
- 00:38:34your own entertainment too what is the
- 00:38:36um most surprising Behavior you've seen
- 00:38:38in your own products
- 00:38:41recently I think it's a behavior and a
- 00:38:45relationship like people really start
- 00:38:49understanding the Nuance of like what
- 00:38:51Claude is we just had like a new
- 00:38:53version of the model and it's like they
- 00:38:55get the Nuance like it's like I guess
- 00:38:57the behavior is like almost
- 00:38:58befriending or like really like
- 00:39:00developing a lot of like 2-way empathy
- 00:39:02around what's happening and then like
- 00:39:03the is like oh you know the new model
- 00:39:05like felt like it was smarter but maybe
- 00:39:07a little more distant but maybe you know
- 00:39:09and it's like it's like that kind of
- 00:39:10like Nuance which like you like I it's
- 00:39:13it's given me as a product person a lot
- 00:39:15more empathy around like you're not just
- 00:39:16shipping a product you're shipping like
- 00:39:19intelligence and intelligence and
- 00:39:21empathy are like what makes like
- 00:39:23interpersonal relationships important
- 00:39:24and if somebody show up and they're like
- 00:39:25I was upgraded like I don't know I scored
- 00:39:282% higher on this math score but like
- 00:39:30I'm Different in this way you'd be like
- 00:39:31oh I got to adapt now and maybe you know
- 00:39:33be a little worried about it so like
- 00:39:35that that's been an interesting Journey
- 00:39:37for me like understanding the mentality
- 00:39:39for people that when they're using our
- 00:39:40products yeah Model Behavior is
- 00:39:43absolutely a product role like the the
- 00:39:46personality of the model is is key and
- 00:39:49there are interesting questions around
- 00:39:50how much should it customize uh versus
- 00:39:52how much should you know open AI have
- 00:39:54one personality and Claude has some
- 00:39:56distinct personality
- 00:39:58and are people going to use one versus
- 00:39:59the other because they happen to like it
- 00:40:01I mean that's that's a very human thing
- 00:40:03right we're friends with different
- 00:40:04people because we happen to like
- 00:40:05different people better than others and
- 00:40:06it's um that's an interesting thing to
- 00:40:09to think about we did something recently
- 00:40:13um and it sort of went viral on Twitter
- 00:40:16people started asking the model based on
- 00:40:19everything you know about me based on
- 00:40:20all of our past interactions you know
- 00:40:22what what would you say about me and the
- 00:40:25model will will respond and it will like
- 00:40:27give you give it a description of what
- 00:40:29it you know kind of thinks based on all
- 00:40:31of your past
- 00:40:32interactions and it is this sort of
- 00:40:35you're you're starting to interact with
- 00:40:36it almost like some sort of person or
- 00:40:39entity in interesting ways and um
- 00:40:42anyways it was fascinating to see
- 00:40:43people's reaction to
- 00:40:46that Kevin Mike thank you so much for
- 00:40:48doing this and giving us a glimpse into
- 00:40:50the future thank you so much
- AI
- Product Development
- User Feedback
- Proactivity
- AI Capabilities
- Product Managers
- User Experience
- Communication
- Machine Learning
- Future Technology