What is the 12 Days of OpenAI?

It's an event where OpenAI introduces new features or models every weekday for 12 days.

What does the o1 model offer?

The o1 model is smarter, faster, and supports multimodal input. It improves on the o1-preview model, with stronger coding and problem-solving abilities.

What is ChatGPT Pro?

ChatGPT Pro is a premium tier of ChatGPT offering unlimited access to OpenAI's best models and features such as advanced voice mode, along with o1 pro mode.

How much does ChatGPT Pro cost?

ChatGPT Pro costs $200 a month.

What is o1 pro mode?

o1 pro mode lets the model use more compute to think harder, giving better performance on the most difficult problems.

Who should consider using ChatGPT Pro?

Power users who already push the models to their limits, for example on math, programming, and writing tasks, would benefit most from ChatGPT Pro.

What are the improvements in the o1 model over the preview?

Compared with o1-preview, the o1 model responds faster, reasons jointly over text and images, and makes major mistakes less often.

What is a unique feature of the o1 model?

Its ability to think and reason before responding, leading to more detailed and accurate answers.

What kind of problems is o1 pro mode best for?

o1 pro mode is most useful for hard problems in math, science, and programming.

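What developer features are being added?

OpenAI is working to bring o1 to the API with structured outputs, function calling, developer messages, and image understanding. Separately, tools such as web browsing and file uploads are planned for o1 in ChatGPT.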

OpenAI o1 and o1 pro mode in ChatGPT — 12 Days of OpenAI: Day 1

00:14:55

https://www.youtube.com/watch?v=iBfQTnA2n2s

Summary

TL;DR: OpenAI has launched two major updates as part of its '12 Days of OpenAI' initiative. The first is the full o1 model, which succeeds o1-preview and is faster, smarter, and multimodal. The model is strong at coding, math, and problem-solving, which makes it useful across scientific and technical fields. Alongside it, OpenAI introduced ChatGPT Pro, a premium tier costing $200 per month that provides unlimited access to OpenAI's models and adds o1 pro mode. Pro mode spends more compute on the hardest problems to give better and more reliable answers. Compared with o1-preview, o1 thinks about 50% faster, reasons jointly over text and images, and makes major mistakes about 34% less often in OpenAI's human evaluations. Both o1 and ChatGPT Pro are aimed particularly at power users in technical domains. For developers, OpenAI says it is working to bring o1 to the API with new features.

Key takeaways

🎉 OpenAI's '12 Days of OpenAI' event is underway, with a launch or demo every weekday.
🚀 The o1 model is smarter and faster than o1-preview, with stronger coding and math capabilities.
🔍 The o1 model now accepts multimodal input and reasons jointly over text and images.
💡 ChatGPT Pro offers unlimited model access, aimed at advanced technical users.
🔥 o1 pro mode uses more compute for complex problem-solving.
💻 Planned additions include web browsing and file uploads for o1 in ChatGPT, and structured outputs and function calling for developers in the API.
📈 In human evaluations, o1 made major mistakes about 34% less often than o1-preview while thinking about 50% faster.
🔬 o1 is aimed at scientific and engineering tasks and reaches state-of-the-art results on multimodal benchmarks such as MMMU and MathVista.
🧠 The model's thinking-before-responding ability improves response accuracy.
💡 Pro mode is most useful for hard science, math, and programming problems.
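
The "faster" percentages above are easy to misread, so here is a small arithmetic sketch. It uses the two timings reported in the live demo (about 14 seconds for o1 and about 33 seconds for o1-preview on the Roman emperors question). The function names are illustrative, and treating "X% faster" as "X% less thinking time" is an assumption about how the figure is meant.

```python
def time_reduction(new_seconds: float, old_seconds: float) -> float:
    """Fraction of thinking time saved by the new model (0.5 means half the time)."""
    return 1.0 - new_seconds / old_seconds


def speedup(new_seconds: float, old_seconds: float) -> float:
    """How many times faster the new model finished (2.0 means twice as fast)."""
    return old_seconds / new_seconds


if __name__ == "__main__":
    o1_seconds = 14.0          # demo timing for o1
    o1_preview_seconds = 33.0  # demo timing for o1-preview

    saved = time_reduction(o1_seconds, o1_preview_seconds)
    factor = speedup(o1_seconds, o1_preview_seconds)
    print(f"time saved: {saved:.0%}, speedup: {factor:.1f}x")
```

Read as time saved, the single demo run lands close to the "about 60% faster" figure the presenter quotes from offline testing; one run is of course not a benchmark.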

Timeline

00:00:00 - 00:05:00
OpenAI opens the "12 Days of OpenAI," in which it will launch or demo something new every weekday. On the first day, it announces two updates: the full version of the o1 model and a subscription tier called ChatGPT Pro. Compared with o1-preview, o1 is smarter, faster, and multimodal, and it follows instructions better; coding performance is singled out as an area of heavy use. ChatGPT Pro targets power users who want more compute than the $20-a-month plan provides. It includes unlimited access to OpenAI's models, advanced voice mode, and o1 pro mode, which does somewhat better and more reliably on the hardest problems. ChatGPT Pro costs $200 per month.
00:05:00 - 00:14:55
The o1 model, rolling out to Plus and Pro subscribers, is meant to handle everyday questions as well as hard ones: it answers simple prompts quickly and thinks longer on difficult ones. In OpenAI's human evaluations it made major mistakes about 34% less often than o1-preview while thinking about 50% faster. The team demonstrates this with a history question about second-century Roman emperors, where o1 thinks for about 14 seconds against about 33 seconds for o1-preview. o1 also accepts multimodal input and reasons jointly over text and images. In a hand-drawn problem about a 1 gigawatt data center in space, the model reads the power figure from the photo, notices that the cooling panel temperature was deliberately left unspecified, assumes roughly room temperature, and estimates a radiator area of 2.42 million square meters. o1 also reaches state-of-the-art results on the MMMU and MathVista benchmarks. Finally, o1 pro mode is shown solving a chemistry problem that o1-preview usually gets wrong, finishing in 53 seconds.
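
The radiator estimate from the demo can be checked with the Stefan-Boltzmann law, which gives the power radiated per unit area as emissivity × σ × T⁴. The sketch below is not the model's actual working, which the video only shows in passing. The emissivity of 0.9, the 300 K panel temperature, the single radiating side, and the roughly 121 km² land area for San Francisco are all assumptions, chosen because together they reproduce the quoted 2.42 million square meters. Absorbed sunlight and the background temperature of space are ignored, which is why the result is a lower bound.

```python
# Stefan-Boltzmann constant in W / (m^2 K^4)
SIGMA = 5.670374419e-8


def radiator_area(power_w: float, temp_k: float,
                  emissivity: float = 1.0, sides: int = 1) -> float:
    """Lower-bound radiator area (m^2) needed to reject power_w by radiation alone."""
    flux = emissivity * SIGMA * temp_k ** 4  # W radiated per m^2 per side
    return power_w / (flux * sides)


if __name__ == "__main__":
    power_w = 1e9            # 1 gigawatt, as drawn in the diagram
    panel_temp_k = 300.0     # assumption: roughly room temperature
    area_m2 = radiator_area(power_w, panel_temp_k, emissivity=0.9)

    sf_land_area_m2 = 121e6  # assumption: San Francisco land area of about 121 km^2
    print(f"radiator area: {area_m2 / 1e6:.2f} million m^2")
    print(f"share of San Francisco: {100 * area_m2 / sf_land_area_m2:.1f}%")
```

Because the area scales with 1/T⁴, the unspecified panel temperature dominates the answer: a hotter panel shrinks the radiator sharply, which is why the presenter treats the temperature choice as the critical assumption.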


Subtitles


00:00:02
[Music]
00:00:17
[Music]
00:00:21
hello welcome to the 12 days of OpenAI
00:00:24
we're going to try something that as far
00:00:25
as we know no tech company has done
00:00:27
before which is every day for the next
00:00:28
12 every weekday we are going to launch
00:00:31
or demo some new thing that we built and
00:00:33
we think we've got some great stuff for
00:00:35
you starting today we hope you'll really
00:00:36
love it and you know we'll try to make
00:00:39
this fun and fast and not take too long
00:00:41
but it'll be a way to show you what
00:00:42
we've been working on and a little
00:00:44
holiday present from us so we'll jump
00:00:46
right into this first day uh today we
00:00:47
actually have two things to launch the
00:00:49
first one is the full version of o1 we
00:00:51
have been very hard at work we've
00:00:53
listened to your feedback you want uh
00:00:54
you like o1-preview but you want it
00:00:56
to be smarter and faster and be
00:00:58
multimodal and be better in instruction
00:01:00
following a bunch of other things so
00:01:01
we've put a lot of work into this and
00:01:03
for scientists engineers coders we think
00:01:06
they will really love this new model uh
00:01:08
I'd like to show you quickly about how
00:01:10
it performs so you can see uh the jump
00:01:13
from GPT-4o to o1-preview across math
00:01:16
competition coding GPQA Diamond um and
00:01:20
you can see that o1 is a pretty big step
00:01:22
forward um it's also much better in a
00:01:24
lot of other ways but raw intelligence
00:01:26
is something that we care about coding
00:01:27
performance in particular is an area
00:01:29
where people are using the model
00:01:30
a lot so in just a minute uh these guys
00:01:33
will demo some things about o1
00:01:35
they'll show you how it does at speed
00:01:37
how it does at really hard problems how
00:01:39
it does with multimodality but first I
00:01:41
want to talk just for a minute about the
00:01:42
second thing we're launching today a lot
00:01:45
of people uh power users of ChatGPT at
00:01:47
this point they really use it a lot and
00:01:49
they want more compute than $20 a month
00:01:51
can buy so we're launching a new tier
00:01:53
ChatGPT Pro and Pro has unlimited
00:01:56
access to our models uh and also things
00:01:58
like advanced voice mode it also has a
00:02:01
uh a new thing called o1 pro mode so o1
00:02:04
is the smartest model in the world now
00:02:06
except for o1 being used in pro mode and
00:02:09
for the hardest problems that people
00:02:10
have uh o1 pro mode lets you do even a
00:02:13
little bit better um so you can see a
00:02:15
competition math you can see a GPQA
00:02:17
Diamond um and these boosts may look
00:02:19
small but in in complex workflows where
00:02:21
you're really pushing the limits of
00:02:22
these models it's pretty significant uh
00:02:25
I'll show you one more thing about Pro
00:02:27
about the pro mode so one thing that people
00:02:30
really have said they want is
00:02:31
reliability and here you can see how the
00:02:34
reliability of an answer from pro mode
00:02:36
compares to o1 and this is an even
00:02:37
stronger delta and again for our Pro
00:02:40
users we've heard a lot about how much
00:02:41
people want this ChatGPT Pro is $200 a
00:02:44
month uh launches today over the course
00:02:46
of these 12 days we have some other
00:02:48
things to add to it that we think you'll
00:02:50
also really love um but unlimited model
00:02:52
use and uh this new o1 pro mode so I
00:02:55
want to jump right in and we'll show
00:02:56
some of those demos that we talked about
00:02:59
uh and these are some of the guys that
00:03:00
helped build o1 uh with many other
00:03:03
people behind them on the team thanks
00:03:05
Sam hi um I'm Hyung Won I'm Jason and I'm
00:03:09
Max we're all research scientists who
00:03:10
worked on building o1 o1 is really
00:03:13
distinctive because it's the first model
00:03:14
we've trained that thinks before it
00:03:16
responds meaning it gives much better
00:03:18
and often more detailed and more correct
00:03:20
responses than other models you might
00:03:21
have tried o1 is being rolled out today
00:03:24
to all uh Plus and soon to be Pro
00:03:27
subscribers on ChatGPT replacing o1-preview
00:03:31
o1 model is uh faster and smarter than
00:03:34
the o1-preview model which we launched
00:03:36
in September after the launch many
00:03:38
people asked about the multimodal input
00:03:40
so we added that uh so now the o1 model
00:03:43
live today is able to reason through
00:03:46
both images and text
00:03:48
jointly as Sam mentioned today we're
00:03:50
also going to launch a new tier of
00:03:52
ChatGPT called ChatGPT Pro ChatGPT Pro offers
00:03:56
unlimited access to our best models like
00:03:59
o1 4o and advanced voice ChatGPT Pro also
00:04:03
has a special way of using o1 called o1
00:04:06
Pro mode with o1 Pro mode you can ask
00:04:09
the model to use even more compute to
00:04:11
think even harder on some of the most
00:04:13
difficult
00:04:14
problems we think the audience for
00:04:17
ChatGPT Pro will be the power users of
00:04:19
ChatGPT those who are already pushing the
00:04:21
models to the limits of their
00:04:22
capabilities on tasks like math
00:04:25
programming and writing it's been
00:04:26
amazing to see how much people are
00:04:28
pushing o1-preview how much people
00:04:30
who do technical work all day get out of
00:04:32
this and uh we're really excited to let
00:04:33
them push it further yeah sure we also
00:04:36
really think that o1 will be much better
00:04:37
for everyday use cases not necessarily
00:04:40
just really hard math and programming
00:04:42
problems in particular one piece of
00:04:43
feedback we received about o1 preview
00:04:45
constantly was that it was way too slow
00:04:47
it would think for 10 seconds if you
00:04:48
said hi to it and we fixed that was
00:04:50
really annoying it it was kind of funny
00:04:52
honestly it really thought it cared
00:04:55
really thought hard about saying hi back
00:04:56
yeah um and so we fixed that o1 will now
00:04:59
think much more intelligently if you ask
00:05:01
it a simple question it'll respond
00:05:03
really quickly and if you ask it a
00:05:04
really hard question it'll think for a
00:05:05
really long time uh we ran a pretty
00:05:08
detailed suite of human evaluations for
00:05:09
this model and what we found was that it
00:05:11
made major mistakes about 34% less often
00:05:14
than o1-preview while thinking fully
00:05:17
about 50% faster and we think this will
00:05:19
be a really really noticeable difference
00:05:21
for all of you so I really enjoy just
00:05:23
talking to these models I'm a big
00:05:25
history buff and I'll show you a really
00:05:26
quick demo of for example a sort of
00:05:28
question that I might ask one of these
00:05:30
models so uh right here I on the left I
00:05:33
have o1 on the right I have o1-preview
00:05:36
and I'm just asking it a really simple
00:05:37
history question list the Roman emperors of
00:05:39
the second century tell me about their
00:05:41
dates what they did um not hard but you
00:05:44
know GPT-4o actually gets this wrong a
00:05:46
reasonable fraction of the time um and
00:05:49
so I've asked o1 this I've asked o1
00:05:51
preview this I tested this offline a few
00:05:53
times and I found that o1 on average
00:05:55
responded about 60% faster than o1-preview
00:05:58
um this could be a little bit variable
00:05:59
because right now we're in the process
00:06:01
of swapping all our GPUs from
00:06:04
o1-preview to o1 so actually o1 thought for
00:06:07
about 14 seconds o1-preview still
00:06:11
going there's a lot of Roman emperors
00:06:13
there's a lot of Roman emperors yeah 4o
00:06:15
actually gets this wrong a lot of the
00:06:16
time there are a lot of folks who ruled
00:06:17
for like uh 6 days 12 days a month and
00:06:20
it sometimes forgets those can you do
00:06:22
them all from memory including the six
00:06:23
day people
00:06:25
no yep so here we go o1 thought for
00:06:28
about 14 seconds preview thought for
00:06:30
about 33 seconds these should both be
00:06:32
faster once we finish deploying but we
00:06:33
wanted this to go live right now exactly
00:06:35
um so yeah we we think you'll really
00:06:37
enjoy talking to this model we we found
00:06:39
that it gave great responses it thought
00:06:40
much faster it should just be a much
00:06:42
better user experience for everyone so
00:06:44
one other feature we know that people
00:06:45
really wanted for everyday use cases
00:06:47
that we've had requested a lot is
00:06:49
multimodal inputs and image
00:06:50
understanding and Hyung Won is going to
00:06:52
talk about that now yep to illustrate
00:06:54
the multimodal input and reasoning uh I
00:06:57
created this toy problem uh with some
00:07:00
hand-drawn diagrams and so on so here it
00:07:03
is it's hard to see so I already took a
00:07:05
photo of this and so let's look at this
00:07:08
photo on a laptop so once you upload the
00:07:11
image into ChatGPT you can click on
00:07:14
it and um to see the zoomed in version
00:07:17
so this is a system of a data center in
00:07:20
space so maybe um in the future we might
00:07:24
want to train AI models in the space uh
00:07:28
I think we should do that but the Power
00:07:30
number looks a little low one G okay but
00:07:33
the general idea rookie numbers in this
00:07:35
rookie numbers rookie okay yeah so uh we
00:07:38
have a sun right here uh taking in power
00:07:41
on this solar panel and then uh there's
00:07:44
a small data center here it's exactly
00:07:46
what they look like yeah GPU racks and
00:07:49
then pump nice pump here and one
00:07:52
interesting thing about um operation in
00:07:55
space is that on Earth we can do air
00:07:58
cooling water cooling to cool the GPUs
00:08:00
but in space there's nothing there so we
00:08:03
have to radiate this um heat into the
00:08:06
deep space and that's why we need this
00:08:09
uh giant radiator cooling panel and this
00:08:12
problem is about finding the lower bound
00:08:14
estimate of the cooling panel area
00:08:18
required to operate um this 1 gigawatt uh uh
00:08:22
data center probably going to be very
00:08:24
big yeah let's see how big is let's see
00:08:28
so that's the problem and going to this
00:08:30
prompt and uh yeah this is essentially
00:08:33
asking for that so let me uh hit go and
00:08:36
the model will think for
00:08:39
seconds by the way most people don't
00:08:41
know I've been working with Hyung Won for a
00:08:42
long time Hyung Won actually has a PhD in
00:08:46
thermodynamics which is totally
00:08:48
unrelated to AI and you always joke that
00:08:50
you haven't been able to use your PhD
00:08:52
work in your job until today so you can
00:08:55
you can trust Hyung Won on this analysis
00:08:57
finally finally uh thanks for hyping up
00:09:00
now I really have to get this right uh
00:09:03
okay so the model finished thinking only
00:09:06
10 seconds it's a simple problem so
00:09:08
let's see if how the model did it so
00:09:11
power input um so first of all this one
00:09:14
gigawatt that was only drawn on the paper
00:09:17
so the model was able to pick that up
00:09:19
nicely and then um radiative heat
00:09:21
transfer only that's the thing I
00:09:23
mentioned so in space nothing else and
00:09:25
then some simplifying um uh choices and
00:09:29
one critical thing is that I
00:09:30
intentionally made this problem
00:09:32
underspecified meaning that um the critical
00:09:36
parameter is a temperature of the
00:09:37
cooling panel uh I left it out so that
00:09:41
uh we can test out the model's ability
00:09:43
to handle um ambiguity and so on so the
00:09:47
model was able to recognize that this is
00:09:50
actually an unspecified but important
00:09:53
parameter and it actually picked the
00:09:55
right um range of param uh temperature
00:09:58
which is about the room temperature and
00:10:00
with that it continues to the analysis
00:10:03
and does a whole bunch of things and
00:10:05
then found out the area which is 2.42
00:10:09
million square meters just to get a
00:10:10
sense of how big this is this is about
00:10:13
2% of the uh land area of San Francisco
00:10:16
this is huge not that bad not that bad
00:10:19
yeah oh
00:10:20
okay um yeah so I guess this this uh
00:10:24
reasonable I'll skip through the rest of
00:10:26
the details but I think the model did a
00:10:28
great job um making nice consistent
00:10:33
assumptions that um you know make the
00:10:35
required area as little as possible and
00:10:38
so um yeah so this is the demonstration
00:10:42
of the multimodal reasoning and this is
00:10:45
a simple problem but o1 is actually very
00:10:48
strong and on standard benchmarks like
00:10:50
MMMU and MathVista o1 actually has
00:10:54
the state-of-the-art
00:10:55
performance now Jason will showcase
00:10:58
the pro mode
00:10:59
great so I want to give a short demo of
00:11:02
uh ChatGPT o1 pro mode um people will find uh
00:11:07
o1 pro mode the most useful for say
00:11:09
hard math science or programming
00:11:11
problems so here I have a pretty
00:11:13
challenging chemistry problem that
00:11:16
o1-preview usually gets incorrect and so I
00:11:19
will uh let the model start
00:11:22
thinking um one thing we've learned with
00:11:24
these models is that uh for these very
00:11:27
challenging problems the model can think
00:11:29
up to a few minutes I think for this
00:11:31
problem the model usually thinks
00:11:32
anywhere from 1 minute to up to 3
00:11:35
minutes um and so we have to provide
00:11:37
some entertainment for for people while
00:11:39
the model is thinking so I'll describe
00:11:41
the problem a little bit and then if the
00:11:43
model's still thinking when I'm done
00:11:45
I've prepared a dad joke for for us uh
00:11:48
to fill the rest of the time um so I
00:11:51
hope it thinks for a long
00:11:52
time you can see uh the problem asks for
00:11:56
a protein that fits a very specific
00:11:59
set of criteria so uh there are
00:12:01
six criteria and the challenge is each
00:12:04
of them ask for pretty chemistry domain
00:12:06
specific knowledge that the model would
00:12:08
have to
00:12:09
recall and the other thing to know about
00:12:11
this problem uh is that none of these
00:12:14
criteria actually give away what the
00:12:16
correct answer is so for any given
00:12:18
criteria there could be dozens of
00:12:20
proteins that might fit that criteria
00:12:23
and so the model has to think through
00:12:24
all the candidates and then check if
00:12:26
they fit all the
00:12:27
criteria okay so you could see the model
00:12:30
actually was faster this time uh so it
00:12:33
finished in 53 seconds you can click and
00:12:36
see some of the thought process that the
00:12:38
model went through to get the answer uh
00:12:40
you could see it's uh thinking about
00:12:42
different candidates like neuroligin
00:12:44
initially um and then it arrives at the
00:12:46
correct answer which is uh retinoschisin
00:12:49
uh which is
00:12:51
great um okay so to summarize um we saw
00:12:54
from Max that o1 is smarter and faster
00:12:59
than uh o1-preview we saw from Hyung Won
00:13:02
that o1 can now reason over both text
00:13:05
and images and then finally we saw with
00:13:08
ChatGPT pro mode uh you can use o1 to
00:13:11
think about uh to
00:13:15
reason about the hardest uh science and
00:13:17
math problems yep there's more to come
00:13:20
um for the ChatGPT Pro tier uh we're
00:13:23
working on even more compute-intensive
00:13:26
tasks to uh power longer and bigger
00:13:28
tasks for those who want to push the
00:13:31
model even further and we're still
00:13:34
working on adding tools to the o1 um
00:13:37
model such as web browsing file uploads
00:13:41
and things like that we're also hard at
00:13:43
work to bring o1 to to the API we're
00:13:45
going to be adding some new features for
00:13:47
developers structured outputs function
00:13:49
calling developer messages and API image
00:13:52
understanding which we think you'll
00:13:53
really enjoy we expect this to be a
00:13:55
great model for developers and really
00:13:57
unlock a whole new frontier of agentic
00:13:59
things you guys can build we hope you
00:14:00
love it as much as we
00:14:02
do that was great thank you guys so much
00:14:05
congratulations uh to you and the team
00:14:07
on on getting this done uh we we really
00:14:10
hope that you'll enjoy o1 and pro mode
00:14:13
uh or Pro tier uh we have a lot more
00:14:15
stuff to come tomorrow we'll be back
00:14:16
with something great for developers uh
00:14:19
and we'll keep going from there before
00:14:21
we wrap up can can we hear your joke yes
00:14:24
uh so um I made this joke this
00:14:27
morning the the joke is this so Santa
00:14:31
was trying to get his large language
00:14:33
model to do a math problem and he was
00:14:36
prompting it really hard but it wasn't
00:14:37
working how did he eventually fix
00:14:40
it no idea he used reindeer enforcement
00:14:48
learning thank you very much thank you