Grok 3 Launch Event, Chinese-Subtitled Version | Scary Smart

00:50:25
https://www.youtube.com/watch?v=QgCz8--9BAw

Summary

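TLDR: At the Grok 3 launch, the xAI team emphasized its vision for exploring the universe and acquiring knowledge, and showed a significant jump in Grok 3's capabilities. The model's design is motivated by a deep pursuit of understanding humanity and the universe, with a rigorous focus on facts and truth. By building its own data center, the team overcame multiple challenges to train Grok 3. The latest Grok 3 performs strongly and came out ahead in a blind test. New features include advanced reasoning and Deep Search, which aim to give users real-time, accurate information retrieval and reflect the potential for continued improvement.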

Takeaways

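  • 🌌 The importance of pursuing the truth about the universe
  • 🚀 Grok 3 performs strongly across benchmarks
  • 🛠️ Building an in-house data center to support AI training
  • 🧠 Stronger reasoning and creativity
  • 🔍 Deep Search helps answer questions precisely
  • 📈 Grok is continuously optimized and improved
  • 👾 Real-time feedback on the user experience
  • 🔋 Cooling and power problems in large-scale training
  • 🎮 Creating a new game showcases the AI's capabilities
  • 💡 Outlook for future AI applications across fields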

Timeline

  • 00:00:00 - 00:05:00

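    In this introduction to Grok 3, the team describes its mission as understanding the nature of the universe and exploring fundamental questions such as whether aliens exist and what the meaning of life is. They stress the importance of pursuing truth, introduce Grok 3's improved capabilities, and credit the team's hard work for the result.

  • 00:05:00 - 00:10:00

    After the team members introduce themselves, they explain the origin of the name Grok, which means to fully and profoundly understand something, and stress the importance of empathy. The earliest model, Grok 1, started from a modest baseline compared with the latest Grok 3, but Grok's capabilities have improved markedly over the past several months thanks to infrastructure work and the team's effort.

  • 00:10:00 - 00:15:00

    The team discusses training progress and the releases of Grok 1.5 and Grok 2. As the number of GPUs used for training grew, training capacity increased greatly. By building its own data center, the team solved its cooling and power problems and stood up a very large GPU cluster.

  • 00:15:00 - 00:20:00

    Grok 3 introduces advanced reasoning and was compared against other models in a blind test, where it was judged to be far ahead across capabilities. The team also stresses that the model is updated continuously, so users see better performance within a short time.

  • 00:20:00 - 00:25:00

    Continued expansion of the data center and a surge in GPU count provided the technical foundation for Grok 3, showing how sustained effort and innovation produced a world-leading AI training platform in a short time.

  • 00:25:00 - 00:30:00

    The team presents Grok 3's math, science, and coding abilities, compares the performance of different model versions, and points to strong results in all three areas.

  • 00:30:00 - 00:35:00

    In the live demos, the team shows Grok's reasoning on a physics problem and on game design, along with the code it generates, highlighting Grok's creativity and problem-solving potential in practical use.

  • 00:35:00 - 00:40:00

    On continued updates to Grok's performance and features, team members describe Grok's ability to solve specific problems and its improved results on complex reasoning tasks, and show how Grok carries out complex logical reasoning in particular scenarios.

  • 00:40:00 - 00:45:00

    The team also shows Grok interacting with users and demonstrates the new Deep Search feature, which is meant to help users with everyday practical problems and provide deeper insights and answers than existing search engines.

  • 00:45:00 - 00:50:25

    Finally, the Grok 3 release plan and pricing structure are laid out for users, and the team looks forward to user feedback and further product improvements.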
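
The blind test in the 00:15:00 to 00:20:00 segment ranks models from pairwise user votes and reports an Elo-style score (the transcript cites roughly 1,400 for an early Grok 3). As a rough illustration only, the sketch below applies the classic Elo update to a single vote. The K-factor of 32, the example ratings, and the function names are assumptions made for this example, and the arena's actual rating procedure may differ.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A is preferred over model B (Elo logistic model)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return both ratings after one blind vote; k=32 is an assumed step size."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    # Whatever A gains, B loses, so the sum of the two ratings is unchanged.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Hypothetical ratings: a 1,400-rated model against a 1,300-rated one.
    print(round(expected_score(1400, 1300), 3))
    print(update_ratings(1400, 1300, a_won=True))
```

Under this model a 100-point gap corresponds to the higher-rated model being preferred in roughly 64% of votes, which gives some intuition for what a lead on such a leaderboard means.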

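
The physics demo in the 00:30:00 to 00:35:00 segment asks for an Earth-to-Mars transfer and a return at the next launch window, and the transcript later says such windows occur about every 26 months. As a back-of-the-envelope check, and not the code Grok produced in the demo, the sketch below assumes circular, coplanar orbits and rounded textbook values for the orbital radii and periods. All constants and function names are assumptions for illustration.

```python
# Rounded, assumed textbook values (circular, coplanar orbits).
EARTH_PERIOD_DAYS = 365.25
MARS_PERIOD_DAYS = 686.98
EARTH_RADIUS_AU = 1.0
MARS_RADIUS_AU = 1.524


def synodic_period_days(inner_period: float, outer_period: float) -> float:
    """Time between repeats of the same relative planet alignment."""
    return 1.0 / (1.0 / inner_period - 1.0 / outer_period)


def hohmann_transfer_days(r1_au: float, r2_au: float) -> float:
    """Half the period of the transfer ellipse touching both circular orbits."""
    semi_major_axis_au = (r1_au + r2_au) / 2.0
    # Kepler's third law around the Sun: period in years = a^(3/2) with a in AU.
    period_years = semi_major_axis_au ** 1.5
    return 0.5 * period_years * EARTH_PERIOD_DAYS


if __name__ == "__main__":
    print(synodic_period_days(EARTH_PERIOD_DAYS, MARS_PERIOD_DAYS))
    print(hohmann_transfer_days(EARTH_RADIUS_AU, MARS_RADIUS_AU))
```

With these rounded inputs the window spacing comes out near 780 days, a little under 26 months, and the one-way transfer near 259 days. Real mission planning has to account for eccentric, inclined orbits, as the presenters note.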


Video Q&A

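  • What is new in Grok 3?

    Grok 3 has enhanced reasoning and a Deep Search feature, so it can better understand and answer user questions.

  • How much better is Grok 3 than Grok 2?

    The presenters describe Grok 3 as roughly an order of magnitude more capable than Grok 2, trained with more than 10x the compute.

  • How do I access Grok 3?

    The first users to get access to Grok 3 will be Premium Plus subscribers on the X platform.

  • Will Grok 3 be open-sourced?

    Open-sourcing will be considered once Grok 3 is stable and mature.

  • When will Grok's voice assistant launch?

    Grok's voice assistant is expected to launch soon but is still being polished.

  • What does Deep Search do?

    Deep Search analyzes the user's question in depth and provides more accurate answers and information, in effect a more efficient search-engine experience.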
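
The enhanced reasoning mentioned in the first answer above is tied in the transcript to test-time compute: the model can attempt the same problem many times before settling on a final answer. One common way to aggregate such attempts is a majority vote over the final answers. The source does not say that xAI uses exactly this scheme, so the sketch below is an assumption, and `noisy_solver` is a hypothetical stand-in for a single sampled model attempt.

```python
import random
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def noisy_solver(rng: random.Random, correct: int = 42, p_correct: float = 0.6) -> int:
    """Hypothetical stand-in for one model attempt (not a real model call)."""
    if rng.random() < p_correct:
        return correct
    return rng.randint(0, 9)  # wrong answers are scattered over many values


def solve_with_budget(n_samples: int, seed: int = 0) -> int:
    """Spend more test-time compute by sampling n_samples attempts, then vote."""
    rng = random.Random(seed)
    return majority_vote([noisy_solver(rng) for _ in range(n_samples)])


if __name__ == "__main__":
    for budget in (1, 8, 64):
        print(budget, solve_with_budget(budget))
```

The design point is that individual attempts can be unreliable while the aggregate is much more stable, because wrong answers tend to disagree with each other whereas correct ones coincide.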

Subtitles
  • 00:02:56
    all right well welcome to the Grok 3
  • 00:03:00
    presentation um so the mission of xAI
  • 00:03:04
    and Grok is to understand the universe we
  • 00:03:07
    want to understand the nature of the
  • 00:03:08
    universe so we can figure out what's
  • 00:03:10
    going on where are the aliens what's the
  • 00:03:12
    meaning of life how does the universe
  • 00:03:13
    end how did it start all these
  • 00:03:15
    fundamental questions um we're driven by
  • 00:03:18
    curiosity about the nature of the
  • 00:03:20
    universe and um that's also what causes
  • 00:03:23
    us to be a maximally truth
  • 00:03:27
    seeking uh AI even if that truth is
  • 00:03:31
    sometimes at odds with what is
  • 00:03:32
    politically correct in order to
  • 00:03:35
    understand the nature of the universe
  • 00:03:37
    you must absolutely rigorously pursue
  • 00:03:39
    truth or you will not understand the
  • 00:03:41
    universe you'll be suffering from some
  • 00:03:43
    amount of delusion or error so that is
  • 00:03:46
    our goal um figure out what's going on
  • 00:03:50
    and uh we're very excited to present
  • 00:03:53
    Grok 3 which is we think uh an order of
  • 00:03:56
    magnitude more capable than Grok 2 in a
  • 00:03:58
    very short period of time
  • 00:04:00
    and uh that's thanks to uh the hard work
  • 00:04:04
    of an incredible team and um I'm honored
  • 00:04:07
    to work with such a great team and of
  • 00:04:09
    course we'd love to have um some of the
  • 00:04:11
    smartest humans out there join our team
  • 00:04:14
    so uh with that let's let's go hi
  • 00:04:18
    everyone my name is Igor lead
  • 00:04:19
    engineering at xAI I'm Jimmy Ba leading
  • 00:04:23
    research I'm Tony working on the
  • 00:04:25
    reasoning team all right I'm Elon I don't
  • 00:04:28
    do anything
  • 00:04:30
    I just show up
  • 00:04:31
    occasionally yeah so um like mentioned
  • 00:04:34
    Grok is the tool that we're working on
  • 00:04:36
    Grok is our AI that we're building here
  • 00:04:38
    at xAI and we've been working extremely
  • 00:04:40
    hard over the last few months to improve
  • 00:04:41
    Grok as much as we can so we can give
  • 00:04:43
    it to all of you so we can give all of
  • 00:04:45
    you access to it um we think it's going
  • 00:04:47
    to be extremely useful we think it's
  • 00:04:49
    going to be interesting to talk to funny
  • 00:04:51
    really really funny um and um we're
  • 00:04:53
    going to explain to you how we've
  • 00:04:54
    improved Grok over the last few months
  • 00:04:56
    we've made quite a jump in in
  • 00:04:57
    capabilities yeah actually we should
  • 00:04:59
    explain maybe also why do we
  • 00:05:00
    call it Grok so Grok is a word from um a
  • 00:05:04
    Heinlein novel Stranger in a Strange Land
  • 00:05:07
    um and it's used by a guy who
  • 00:05:11
    was raised on Mars um and the word Grok
  • 00:05:14
    is to sort of fully and profoundly
  • 00:05:17
    understand something that's what the
  • 00:05:18
    word Grok means fully and profoundly
  • 00:05:20
    understand something and empathy is
  • 00:05:23
    important true
  • 00:05:26
    yeah so yeah so uh if we charted xAI's
  • 00:05:30
    progress uh in the last few months has
  • 00:05:33
    only been 17 months since we started
  • 00:05:36
    kicking off our very first model uh
  • 00:05:39
    Grok 1 was almost like a toy by this
  • 00:05:43
    point only 314 billion parameters and
  • 00:05:45
    now if we plot the progress the time on
  • 00:05:49
    x-axis the performance of favorite
  • 00:05:51
    benchmark numbers MMLU on the y-axis
  • 00:05:54
    we're literally progressing at
  • 00:05:56
    unprecedented speed across the whole field
  • 00:06:00
    and then we kicked off Grok 1.5 right
  • 00:06:02
    after Grok 1 released after November
  • 00:06:05
    2023 and then Grok 2 so if you look at
  • 00:06:09
    where the all the performance coming
  • 00:06:12
    from when you have a very correct
  • 00:06:14
    engineering team and all the best AI
  • 00:06:17
    talent there's only one thing we need is a
  • 00:06:20
    big intelligence comes from big
  • 00:06:23
    cluster so we can reconvert the entire
  • 00:06:27
    progress of xai now replacing the bench
  • 00:06:29
    the y axis to the total amount of
  • 00:06:31
    training FLOPs that is how many GPUs we
  • 00:06:34
    can run at any given time to train our
  • 00:06:36
    large language models to compress the
  • 00:06:39
    entire
  • 00:06:40
    internet so after all human all human
  • 00:06:43
    knowledge really that's right yeah
  • 00:06:44
    internet being part of it but it's
  • 00:06:46
    really all human knowledge all
  • 00:06:47
    everything yeah the whole internet fits
  • 00:06:49
    into a USB stick at this point it's like
  • 00:06:51
    all the human tokens yeah that's right
  • 00:06:54
    yeah uh very soon into the real world
  • 00:06:57
    yeah um so we had so much trouble
  • 00:07:00
    actually training Grok 2 back in the days
  • 00:07:03
    uh we kicked off the model around February
  • 00:07:07
    and uh we thought we had a large amount
  • 00:07:09
    of chips but turned out we can barely
  • 00:07:11
    get 8K training chips running coherently
  • 00:07:14
    at any given time and we had so many
  • 00:07:18
    Cooling and power issues I think you
  • 00:07:21
    were there in the data center yeah it
  • 00:07:23
    was like really sort of more like 8K
  • 00:07:25
    chips on average at 80% efficiency more
  • 00:07:28
    like 6,500 effective uh H100s
  • 00:07:32
    training for you know several months but
  • 00:07:36
    now now we're at 100K so yeah that's
  • 00:07:39
    right more than 100K that's right so so
  • 00:07:41
    what's the next step right so after Grok 2
  • 00:07:45
    so if we want to continue
  • 00:07:47
    accelerate we have to take the matter
  • 00:07:49
    into our own hands we have to solve all
  • 00:07:50
    the cooling um all the power issues and
  • 00:07:54
    everything yeah so so in April of last
  • 00:07:56
    year Elon decided that really the only
  • 00:07:58
    way for X to succeed for xAI to build the
  • 00:08:01
    best AI out there is to build our own
  • 00:08:03
    data center so um we didn't have a lot
  • 00:08:06
    of time because we wanted to give
  • 00:08:07
    you Grok 3 as quickly as possible so
  • 00:08:10
    really we realized we have to build the
  • 00:08:12
    data center in about four months um it
  • 00:08:15
    turned out it took us 122 days to get
  • 00:08:17
    the first 100K GPUs up and running and
  • 00:08:20
    that was a Monumental effort uh to be
  • 00:08:22
    able to do that um it's we believe it's
  • 00:08:25
    the biggest uh fully connected H100
  • 00:08:28
    cluster of its kind um and uh we didn't
  • 00:08:30
    just stop there we actually decided that
  • 00:08:32
    we need to double the size of the
  • 00:08:34
    cluster pretty much immediately if we
  • 00:08:36
    want to build uh the kind of AI that we
  • 00:08:38
    want to build um so we then had another
  • 00:08:42
    phase um which we haven't talked about
  • 00:08:44
    publicly yet so this is the first time
  • 00:08:45
    that we're talking about this uh where
  • 00:08:47
    we doubled the capacity of the data
  • 00:08:49
    center yet again um and that one only
  • 00:08:52
    took us 92 days so we've been able to
  • 00:08:55
    use all of these GPUs use all of this
  • 00:08:56
    compute to improve Grok in the meantime
  • 00:08:59
    and basically today we're going to
  • 00:09:00
    present you the results of that the the
  • 00:09:03
    fruits that came from that um so let's
  • 00:09:07
    yeah so all the paths all the roads lead
  • 00:09:09
    to Grok 3 uh 10x more compute more than
  • 00:09:13
    10x really yeah really like maybe 15x
  • 00:09:17
    yep uh compared to our previous
  • 00:09:19
    generation model and Grok 3 finished the
  • 00:09:22
    pre-training uh early January um and uh
  • 00:09:26
    then we start you know the model still
  • 00:09:28
    currently training actually so this is a
  • 00:09:30
    little preview of our Benchmark numbers
  • 00:09:34
    so we evaluated Grok 3 on you know three
  • 00:09:37
    different categories on general
  • 00:09:40
    mathematical reasoning on general
  • 00:09:43
    knowledge about STEM and science and
  • 00:09:46
    then also on computer science
  • 00:09:48
    coding so AIME uh American Invitational
  • 00:09:52
    Mathematics
  • 00:09:53
    Examination uh hosted you know once a
  • 00:09:56
    year uh and if we evaluate the model
  • 00:09:59
    performance we can see that Grok 3
  • 00:10:02
    across the board is in a league of its
  • 00:10:04
    own even its little brother Grok 3 mini is
  • 00:10:09
    reaching the frontier across all the
  • 00:10:11
    other
  • 00:10:12
    competitors so you will say well at this
  • 00:10:15
    point all these benchmarks you're just
  • 00:10:18
    evaluating you know the memorization of
  • 00:10:19
    the textbooks memorization of the GitHub
  • 00:10:22
    repos how about real-time usefulness how
  • 00:10:25
    about we actually use those models in
  • 00:10:27
    our product so what we did instead is we
  • 00:10:31
    actually kicked off a blind test of our
  • 00:10:34
    Grok 3 model code named chocolate it's
  • 00:10:37
    pretty hot yeah hot chocolate um and uh
  • 00:10:41
    you know been running on this uh
  • 00:10:44
    platform called Chatbot Arena for two weeks
  • 00:10:46
    um I think the entire X platform at some
  • 00:10:49
    point speculated this might be the next
  • 00:10:51
    generation of AI coming our way so uh
  • 00:10:56
    how this Chatbot Arena works is that um it
  • 00:10:59
    strips away the entire product surface
  • 00:11:02
    right it's just raw comparison of the
  • 00:11:04
    engine of those AGIs the language models
  • 00:11:07
    themselves and place interface where the
  • 00:11:09
    user will submit one single query and
  • 00:11:12
    you get shown two responses you don't
  • 00:11:14
    know which model they come from and in
  • 00:11:16
    the end you make the vote so in this blind
  • 00:11:18
    test Grok 3 an early version of Grok 3
  • 00:11:22
    already reached like 1,400 no other
  • 00:11:26
    model has reached an Elo score in head-to-head
  • 00:11:28
    comparison to all the other models
  • 00:11:30
    at this score and it's not just one
  • 00:11:33
    single category it's 1,400 aggregated
  • 00:11:36
    across all the categories in chatbot
  • 00:11:39
    capabilities instruction following
  • 00:11:41
    coding so it's number one across the
  • 00:11:43
    board in this blind test and it's it's
  • 00:11:45
    still climbing so we actually have to keep
  • 00:11:47
    updating it so it's it's 1,400 above
  • 00:11:50
    1,400 and climbing yeah and in fact we
  • 00:11:52
    have a version of the model that we
  • 00:11:53
    think is already much better than the
  • 00:11:55
    one that we tested here yeah we'll see
  • 00:11:57
    you know how how far it gets uh but
  • 00:12:00
    that's the one that we're you know um
  • 00:12:02
    working on or talking about today yeah
  • 00:12:04
    so actually one thing if if you're if
  • 00:12:06
    you're using grock 3 you I think you may
  • 00:12:07
    notice improvements almost every day um
  • 00:12:10
    because we're we're continuously
  • 00:12:11
    improving the model so
  • 00:12:13
    literally even within 24 hours you'll
  • 00:12:15
    see
  • 00:12:16
    improvements yep so but we believe here
  • 00:12:20
    at xai getting the best pre-training
  • 00:12:23
    model is not enough that's not enough to
  • 00:12:25
    build the best AI and the best AI needs
  • 00:12:28
    to think like a human
  • 00:12:29
    it has to contemplate all the
  • 00:12:31
    possible
  • 00:12:32
    solutions self-critique verify all the
  • 00:12:36
    solutions backtrack and also think from
  • 00:12:39
    first principles that's a very
  • 00:12:41
    important capability so we believe that
  • 00:12:44
    as we take the best pre-trained model and
  • 00:12:47
    continue training it with reinforcement
  • 00:12:49
    learning it will elicit the additional
  • 00:12:52
    reasoning capabilities that allow the
  • 00:12:54
    model to just become so much better and
  • 00:12:57
    scale not just in the training time but
  • 00:12:59
    in the test time as well so we already
  • 00:13:02
    found the model is extremely useful
  • 00:13:04
    internally um for our own engineering
  • 00:13:06
    saving hours of uh time hundreds of
  • 00:13:09
    hours of uh coding time so Igor you're the
  • 00:13:12
    power user of our uh Grok reasoning
  • 00:13:14
    model what are some use cases yeah so
  • 00:13:16
    like Jimmy said we've added Advanced
  • 00:13:18
    reasoning capabilities to Grok and we've
  • 00:13:20
    been testing them pretty heavily over
  • 00:13:21
    the last few weeks in order to give you
  • 00:13:23
    a little bit of a taste of what it looks
  • 00:13:24
    like when Grok is solving hard reasoning
  • 00:13:27
    problems so we prepared two little
  • 00:13:28
    problems for you one comes from physics
  • 00:13:31
    and one is actually a game that Grok is
  • 00:13:32
    going to write for us um so when it
  • 00:13:35
    comes to the physics problem you know
  • 00:13:36
    what we want Grok to do is to plot a
  • 00:13:39
    viable trajectory to do a transfer from
  • 00:13:42
    Earth to Mars and then uh at a later
  • 00:13:45
    point in time a transfer back from Mars
  • 00:13:47
    to Earth um and that requires some you know
  • 00:13:50
    some physics that Grok will have to
  • 00:13:52
    understand um so we're going to
  • 00:13:53
    challenge Grok you know come up with a
  • 00:13:55
    viable trajectory calculate it and
  • 00:13:58
    then plot it for us so we can see it and um
  • 00:14:02
    yeah this is totally unscripted by the
  • 00:14:04
    way this is the that's the entirety of
  • 00:14:05
    the prompt which we should clarify is that
  • 00:14:08
    yeah there's nothing more than that yeah
  • 00:14:10
    exactly this is the Grok interface and
  • 00:14:12
    we've typed in this text that you can
  • 00:14:14
    see here generate code for an animated
  • 00:14:16
    3D plot of a launch from Earth uh
  • 00:14:19
    landing on Mars and then back to Earth
  • 00:14:21
    at the next launch window um and we've
  • 00:14:24
    now kicked off the query and you
  • 00:14:26
    can see Grok is thinking so uh part of
  • 00:14:29
    Grok's advanced reasoning capabilities
  • 00:14:31
    are these thinking traces that you can
  • 00:14:32
    see here you can even go inside and
  • 00:14:35
    actually read what Grok is thinking as
  • 00:14:37
    it's going through the problem as it's
  • 00:14:38
    trying to solve it
  • 00:14:41
    um yeah we say like we are doing some
  • 00:14:44
    obscuration of the thinking so that our
  • 00:14:46
    model doesn't get totally copied
  • 00:14:48
    instantly um so there's more to the
  • 00:14:51
    thinking than is displayed uh yeah yeah
  • 00:14:56
    and because this is totally unscripted
  • 00:14:58
    there's actually a chance that Grok
  • 00:14:59
    might make a little coding mistake and
  • 00:15:01
    it might not actually work um so um just
  • 00:15:04
    in case we're going to launch two more
  • 00:15:06
    instances of this so if something goes
  • 00:15:08
    wrong we'll be able to uh to switch to
  • 00:15:11
    those and show you um something that's
  • 00:15:14
    presentable so we're kicking off the
  • 00:15:16
    other two as well um and like I said we
  • 00:15:18
    have a second problem as well um and um
  • 00:15:22
    yeah actually one of the favorite one of
  • 00:15:23
    our favorite activities here at xAI is
  • 00:15:25
    having Grok write games for us um and um not
  • 00:15:29
    just any you know uh any old game any game
  • 00:15:32
    that you might already be familiar with
  • 00:15:33
    but actually creating new games on the
  • 00:15:35
    spot and being creative about it um so
  • 00:15:38
    one example that we found was really
  • 00:15:40
    really fun um is create a game that's a
  • 00:15:43
    mixture of the two games Tetris and Bejeweled
  • 00:15:47
    so this is that maybe an important thing
  • 00:15:49
    like this obviously if you if you ask an
  • 00:15:52
    AI to create a game like Tetris there's
  • 00:15:53
    there are many examples of Tetris on the
  • 00:15:55
    on the Internet or a game like Bejeweled
  • 00:15:58
    whatever it is it can copy it what's
  • 00:16:01
    interesting here is it achieved a
  • 00:16:03
    creative solution combining the two
  • 00:16:06
    games that actually works and and is a
  • 00:16:10
    good game yeah that's the it's cre we're
  • 00:16:12
    seeing the beginnings of
  • 00:16:14
    creativity yeah fingers crossed that we
  • 00:16:17
    can recreate that hopefully it works
  • 00:16:19
    yeah embarrassing it so actually because
  • 00:16:21
    this is a bit more challenging we're
  • 00:16:23
    going to use something special here
  • 00:16:24
    which we call Big Brain that's our mode
  • 00:16:27
    in which we use more computation
  • 00:16:30
    reasoning for Grok just to make sure there's a good
  • 00:16:33
    chance here that it actually might actually
  • 00:16:35
    do it so we're also going to fire off you know
  • 00:16:37
    three attempts here at at solving this
  • 00:16:40
    game at creating this game that's a
  • 00:16:43
    mixture of you know Tetris and
  • 00:16:45
    Bejeweled um yeah let's let's see what Grok
  • 00:16:47
    comes up with like I've played the game it's
  • 00:16:49
    pretty good like it's like wow okay this
  • 00:16:52
    is something yeah um so while Grok is
  • 00:16:55
    thinking uh in the in the background um
  • 00:16:57
    we can now actually talk about some
  • 00:16:59
    concrete numbers you know how how well is Grok
  • 00:17:01
    doing across tons of different tasks
  • 00:17:03
    that we've tested it on um so we'll hand
  • 00:17:05
    it over to Tony to talk about that yeah
  • 00:17:08
    okay so let's see how Grok does on those
  • 00:17:11
    interesting challenging benchmarks uh so
  • 00:17:14
    yeah so reasoning again refers to those
  • 00:17:16
    models that actually think for
  • 00:17:19
    quite a long time before they try to
  • 00:17:21
    solve a problem so in this case uh you
  • 00:17:24
    know around a month ago the Grok 3
  • 00:17:26
    pre-training finished so after that we
  • 00:17:29
    worked very hard to put the reasoning
  • 00:17:31
    capability into the uh current Grok 3
  • 00:17:34
    model but again this is very early days
  • 00:17:37
    so the model is still currently in
  • 00:17:39
    training so right now what we're going
  • 00:17:41
    to show to people is this beta version
  • 00:17:43
    of the Grok 3 reasoning model
  • 00:17:45
    alongside we also are training a mini
  • 00:17:48
    version of the reasoning model so
  • 00:17:50
    essentially on this plot you can see uh
  • 00:17:52
    the Grok 3 reasoning beta and then
  • 00:17:54
    Grok 3 mini reasoning the Grok 3
  • 00:17:56
    mini reasoning is actually a
  • 00:17:58
    model that we train for much longer time
  • 00:18:00
    and you can see that sometimes it
  • 00:18:01
    actually performs slightly better compared to
  • 00:18:04
    the Grok 3 reasoning this also just
  • 00:18:06
    means that there's a huge potential for
  • 00:18:08
    the Grok 3 reasoning because it's
  • 00:18:10
    trained for much less time um so all
  • 00:18:13
    right so let's actually look at what how
  • 00:18:15
    how it does on those three benchmarks so
  • 00:18:18
    Jimmy also introduced already so
  • 00:18:20
    essentially we're looking at three
  • 00:18:21
    different areas mathematics science and
  • 00:18:24
    coding um and for math we're picking
  • 00:18:27
    this high school competition math
  • 00:18:28
    problem
  • 00:18:29
    um for science we actually pick those
  • 00:18:32
    PhD level science questions um and for
  • 00:18:35
    coding it's also actually pretty
  • 00:18:36
    challenging it's competitive coding and
  • 00:18:38
    also some uh LeetCode which is some
  • 00:18:40
    code interview problems that
  • 00:18:42
    people usually get when they interview
  • 00:18:44
    for companies so on those benchmarks you
  • 00:18:46
    can see that Grok 3 actually performs
  • 00:18:49
    quite well uh across the board compared
  • 00:18:52
    to other competitors um yeah so it's
  • 00:18:55
    pretty promising these models are very
  • 00:18:56
    smart so Tony what what what are those
  • 00:18:59
    shaded bars yeah so okay so I'm glad you
  • 00:19:02
    asked this question so for those models
  • 00:19:05
    because they can reason they can think you
  • 00:19:07
    can also ask them to even think longer
  • 00:19:10
    uh you can spend more what we call test-time
  • 00:19:13
    compute which means you can spend
  • 00:19:15
    more time to reason to think about a
  • 00:19:18
    problem before you spit out the answer
  • 00:19:21
    so in this case the shaded bar here
  • 00:19:24
    means that we just ask the model to
  • 00:19:26
    spend more more time you know it
  • 00:19:28
    can solve the the same problem many many
  • 00:19:30
    times before it it tries to conclude
  • 00:19:33
    what is the right solution and once you
  • 00:19:35
    give this compute or this this kind of
  • 00:19:37
    budget to the model it turns out the
  • 00:19:40
    model can even perform better so this is
  • 00:19:43
    essentially the shaded bar in in those
  • 00:19:45
    plots right so I think this is really
  • 00:19:48
    exciting right because now instead of
  • 00:19:50
    just doing one chain of thoughts with AI
  • 00:19:52
    why not do multiple all at once yes so
  • 00:19:55
    that's a very powerful technique that
  • 00:19:56
    allows us to continue to scale the model
  • 00:19:58
    capabilities after training um and you
  • 00:20:02
    know people often ask are we actually
  • 00:20:04
    just overfitting to the benchmarks yes
  • 00:20:06
    so how about generalization so yes I
  • 00:20:08
    think uh yeah this is definitely a
  • 00:20:11
    question that we are asking ourselves
  • 00:20:13
    whether we are overfitting to those
  • 00:20:14
    current benchmarks uh luckily uh we have
  • 00:20:17
    a real test so about 5 days ago AIME 2025
  • 00:20:22
    just finished this is where high school
  • 00:20:24
    students compete in this particular
  • 00:20:27
    Benchmark so we got this very fresh new
  • 00:20:29
    competition and then we asked our two
  • 00:20:31
    models to compete on the same Benchmark
  • 00:20:34
    at the same exam and it turns out uh
  • 00:20:37
    very interestingly the Grok 3
  • 00:20:39
    reasoning the big one um actually does
  • 00:20:42
    uh better um on this particular new
  • 00:20:44
    fresh exam this also means that the
  • 00:20:46
    generalization capability of the big
  • 00:20:48
    model is stronger much stronger compared
  • 00:20:51
    to the smaller model uh if you compare
  • 00:20:53
    to the last year's exam actually this is
  • 00:20:55
    the opposite the smaller model kind of
  • 00:20:57
    learns the uh the the previous exams
  • 00:21:00
    better so yeah so this this actually
  • 00:21:02
    shows some kind of true generalization
  • 00:21:04
    from the model that's right so 17 months
  • 00:21:07
    ago our Grok 0 and Grok 1 barely
  • 00:21:09
    solved any high school problems that's
  • 00:21:11
    right and now we have a kid that just
  • 00:21:14
    already graduated Grok is ready
  • 00:21:16
    to go to college is that right yeah I
  • 00:21:18
    mean it won't be long before it's
  • 00:21:19
    simply perfect the human exams won't be
  • 00:21:22
    part they be too easy yeah like and
  • 00:21:25
    internally we actually as Grok continually
  • 00:21:28
    evolves
  • 00:21:29
    uh we're going to talk about you know
  • 00:21:30
    what we're excited about but very soon
  • 00:21:33
    there will be no more benchmarks left
  • 00:21:35
    yeah yeah one thing that's quite
  • 00:21:38
    fascinating I think is that we basically
  • 00:21:40
    only trained Grok's reasoning abilities
  • 00:21:42
    on math problems and competitive coding
  • 00:21:44
    problems right so very very specialized
  • 00:21:47
    kinds of tasks but somehow it's able to
  • 00:21:50
    work on all kinds of other different
  • 00:21:52
    tasks so including creating games you know
  • 00:21:55
    lots lots and lots of different things
  • 00:21:57
    um and what seems to be happening is
  • 00:21:58
    that basically Grok learns this ability
  • 00:22:01
    to detect its own mistakes in its
  • 00:22:02
    thinking correct them persist on a
  • 00:22:05
    problem try lots of different variants
  • 00:22:07
    pick pick the one that's best so there
  • 00:22:08
    are these generalized generalizing
  • 00:22:10
    abilities that Grok learns from
  • 00:22:12
    mathematics and from coding which it can
  • 00:22:14
    then use to solve all kinds of other
  • 00:22:16
    problems so that's yeah that's pretty I
  • 00:22:18
    mean reality is the instantiation of
  • 00:22:21
    mathematics that's right um and one
  • 00:22:23
    thing we're actually really excited
  • 00:22:25
    about that going back to our founding
  • 00:22:26
    mission is what if one day we have a
  • 00:22:29
    computer just like Deep Thought that
  • 00:22:32
    utilizes our entire cluster just for that
  • 00:22:34
    one very important problem in the test
  • 00:22:36
    time all the GPU turned on right so I
  • 00:22:39
    think we back then we were building the
  • 00:22:40
    GPU clusters together uh you were
  • 00:22:42
    plugging cables and I remember that when
  • 00:22:46
    we turn on the the first initial test
  • 00:22:49
    you can hear all the GPUs humming in the
  • 00:22:51
    hallway that almost feels like
  • 00:22:53
    spiritual yeah that that's actually a
  • 00:22:55
    pretty cool uh thing that we're able to
  • 00:22:57
    do that we can go into the data center
  • 00:22:59
    and Tinker with the machines there so
  • 00:23:01
    for example we went in and we unplugged
  • 00:23:04
    a few of the cables and just made sure
  • 00:23:06
    that our training setup is still running
  • 00:23:08
    running stably so that's something that
  • 00:23:10
    you know I think most uh AI you know
  • 00:23:13
    teams out there don't usually do but
  • 00:23:15
    it's actually totally unlocks like a new
  • 00:23:17
    level of reliability and what you're
  • 00:23:19
    able to do with with the hardware so
  • 00:23:21
    okay so when when are we going to solve
  • 00:23:24
    Riemann so uh the easiest solution is to
  • 00:23:28
    enumerate over all possible strings and
  • 00:23:32
    as long as you have a verifier and enough
  • 00:23:33
    compute you'll be able to do it okay my
  • 00:23:36
    projection will be what your guess what
  • 00:23:38
    is your neural net calculate so my my my
  • 00:23:42
    bold prediction so so three years ago I
  • 00:23:43
    told you this I think in now it's uh two
  • 00:23:46
    years uh later two things going to
  • 00:23:48
    happen we're going to see machines win
  • 00:23:51
    some medals yeah that's Turing Award
  • 00:23:53
    absolutely Fields medal Nobel Prize with
  • 00:23:57
    probably some expert in the loop right
  • 00:23:59
    so the expert uplifting do you mean so
  • 00:24:01
    this year or next
  • 00:24:02
    year oh oh
  • 00:24:05
    okay that's what it comes down to real
  • 00:24:07
    yeah so it looks like Grok finished all
  • 00:24:10
    of its thinking on on the two problems so
  • 00:24:12
    let's take a look at what it
  • 00:24:15
    said all right so this was the the
  • 00:24:18
    little physics problem we had um no we
  • 00:24:21
    we've collapsed the thoughts here so
  • 00:24:23
    they're you know they're hidden and then
  • 00:24:25
    we see Grok's answer below that so it
  • 00:24:27
    explains it wrote a Python script here
  • 00:24:29
    using Matplotlib then gives us all of
  • 00:24:31
    the code um so let's take a quick look
  • 00:24:34
    at the code you know seems like it's
  • 00:24:35
    doing reasonable things here not not
  • 00:24:38
    totally off the mark um solve Kepler it says
  • 00:24:42
    here so maybe it's solving Kepler's laws
  • 00:24:44
    Kepler's law numerically um yeah there's
  • 00:24:47
    really only one way to find out if this
  • 00:24:49
    thing is working I'd say let's let's
  • 00:24:51
    give it a try let's run let's run the
  • 00:24:52
    code all right and we can see um yeah Grok
  • 00:24:56
    is animating two different planets Earth
  • 00:24:58
    and Mars here and then the the green
  • 00:25:02
    ball is the the vehicle that's
  • 00:25:04
    transiting, the spacecraft that's
  • 00:25:06
    transiting between Earth and Mars and
  • 00:25:08
    you you could see the journey from Earth
  • 00:25:10
    to Mars and looks like yeah indeed the
  • 00:25:12
    the astronauts return safely you know at
  • 00:25:15
    the right moment in time um so now
  • 00:25:19
    obviously this was just generated on the
  • 00:25:20
    spot so now we can't tell you if that was
  • 00:25:23
    actually the correct solution so we're going
  • 00:25:24
    to take a closer look now maybe we're
  • 00:25:25
    going to call some colleagues from
  • 00:25:28
    SpaceX, ask them if this is legit, um, it's
  • 00:25:31
    pretty close it's it's I mean uh yeah I
  • 00:25:35
    mean there there's a lot of complexities
  • 00:25:37
    in the actual orbits that have to be
  • 00:25:39
    taken into account but this is this is
  • 00:25:40
    pretty close to to what it what looks
  • 00:25:42
    like awesome um in fact I have that on
  • 00:25:46
    my pendant here, it's got the Earth-Mars Hohmann
  • 00:25:49
    transfer on
  • 00:25:52
    it. when are we going to install
  • 00:25:54
    Grok on a rocket?
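
As an aside on the "solve Kepler" routine glimpsed in the demo code: the on-screen script is not reproduced in this transcript, so the following is only a minimal sketch of what numerically solving Kepler's equation usually means. The names `solve_kepler` and `orbit_position`, the Newton iteration, and the tolerances are assumptions for illustration, not Grok's actual output.

```python
import math

def solve_kepler(mean_anomaly, eccentricity, tol=1e-12, max_iter=50):
    """Solve Kepler's equation M = E - e*sin(E) for E with Newton's method.

    Hypothetical helper: the demo only showed a function with a similar name.
    """
    # A common starting guess: M for low eccentricity, pi otherwise.
    E = mean_anomaly if eccentricity < 0.8 else math.pi
    for _ in range(max_iter):
        f = E - eccentricity * math.sin(E) - mean_anomaly
        f_prime = 1.0 - eccentricity * math.cos(E)
        step = f / f_prime
        E -= step
        if abs(step) < tol:
            break
    return E

def orbit_position(semi_major_axis, eccentricity, mean_anomaly):
    """Return (x, y) in the orbital plane, with the focus at the origin."""
    E = solve_kepler(mean_anomaly, eccentricity)
    x = semi_major_axis * (math.cos(E) - eccentricity)
    y = semi_major_axis * math.sqrt(1.0 - eccentricity ** 2) * math.sin(E)
    return x, y
```

Sampling `orbit_position` over time for Earth, Mars, and a transfer ellipse would give the kind of points an animation like the one in the demo could plot.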
  • 00:25:58
    well I suppose in two years two years
  • 00:26:02
    everything is two years away uh well
  • 00:26:05
    Earth-Mars transit windows occur every
  • 00:26:08
    26 months, we're currently in a
  • 00:26:11
    transit window approximately, the next
  • 00:26:12
    one would be um November of next year um
  • 00:26:18
    roughly end of next year um and uh if
  • 00:26:21
    all goes well SpaceX will send Starship
  • 00:26:24
    Rockets to Mars and um with Optimus
  • 00:26:29
    robots and
  • 00:26:31
    uh and
  • 00:26:34
    Grok. I'm curious what this combination of
  • 00:26:37
    Tetris and Bejeweled looks like, "Bejeweled Tetris" as
  • 00:26:41
    we've named it internally um so okay we
  • 00:26:45
    also have an output from Grok here, it says it
  • 00:26:47
    wrote a Python script, explains
  • 00:26:49
    what it's been doing if you look at the
  • 00:26:51
    code, you know, there are some constants
  • 00:26:54
    that are being defined here some colors
  • 00:26:56
    then the tetrominoes, the pieces
  • 00:26:59
    of Tetris are there um obviously very
  • 00:27:02
    hard to see at one glance if this is
  • 00:27:04
    good so we got to we got to run this to
  • 00:27:07
    figure out if it's
  • 00:27:08
    working well let's let's give it a
  • 00:27:11
    try fingers crossed all right right so
  • 00:27:13
    this kind of looks like Tetris uh but
  • 00:27:16
    the the colors are a little bit off
  • 00:27:18
    right the colors are different here and
  • 00:27:21
    um I if you think about what's going
  • 00:27:24
    what's going on
  • 00:27:25
    here, Bejeweled has this mechanic where if you
  • 00:27:28
    get three jewels in a row, you know, then
  • 00:27:31
    they they disappear and also gravity
  • 00:27:33
    activates right so what happens if you
  • 00:27:36
    get three of the colors together oh so
  • 00:27:38
    something happened um so I think I think
  • 00:27:41
    what Grok did in this version, um, is
  • 00:27:45
    that you know once you connect three at
  • 00:27:48
    least three blocks of the same color in
  • 00:27:50
    a row then, um, you know,
  • 00:27:55
    they disappear and then gravity
  • 00:27:56
    activates and all the other blocks fall
  • 00:27:57
    down
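
The mechanic just described (three or more same-colored blocks in a row disappear, then gravity pulls the remaining blocks down) can be sketched as follows. This is only an illustrative sketch, not the code Grok generated on stage: the grid layout (a list of rows, top row first, `None` for an empty cell) and the function names `clear_matches` and `apply_gravity` are assumptions.

```python
def clear_matches(grid):
    """Remove horizontal runs of 3+ equal colors; return how many cells were cleared."""
    rows, cols = len(grid), len(grid[0])
    to_clear = set()
    for r in range(rows):
        for c in range(cols - 2):
            color = grid[r][c]
            if color is not None and color == grid[r][c + 1] == grid[r][c + 2]:
                to_clear.update({(r, c), (r, c + 1), (r, c + 2)})
    for r, c in to_clear:
        grid[r][c] = None
    return len(to_clear)

def apply_gravity(grid):
    """Let every block fall straight down until it rests on the floor or another block."""
    rows, cols = len(grid), len(grid[0])
    for c in range(cols):
        stack = [grid[r][c] for r in range(rows) if grid[r][c] is not None]
        padded = [None] * (rows - len(stack)) + stack
        for r in range(rows):
            grid[r][c] = padded[r]

def settle(grid):
    """Repeat clear + gravity until no more matches appear (chain reactions)."""
    total = 0
    while True:
        cleared = clear_matches(grid)
        if cleared == 0:
            return total
        total += cleared
        apply_gravity(grid)
```

Calling `settle` after each piece locks in place would reproduce the behavior seen in the demo; whether full lines also clear, as in classic Tetris, is left open here just as it is in the discussion.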
  • 00:27:59
    um kind of kind of curious if there's
  • 00:28:01
    still a Tetris mechanic here where if
  • 00:28:03
    the line is full does it actually um
  • 00:28:06
    clear it or what happens then it's up to
  • 00:28:10
    interpretation you know so who who knows
  • 00:28:12
    yeah I mean it'll do different
  • 00:28:14
    variants when you ask it it doesn't do
  • 00:28:16
    the same thing every time exactly we've
  • 00:28:18
    seen a few other Tetris variants that worked
  • 00:28:20
    very differently but this one seems cool
  • 00:28:23
    so yeah are we ready for uh game Studio
  • 00:28:27
    at xAI? yes so we're launching uh an AI
  • 00:28:31
    gaming studio at xAI, if you're
  • 00:28:33
    interested in joining us and building AI
  • 00:28:35
    games uh please join xAI, we're launching
  • 00:28:38
    an AI gaming studio we're announcing it
  • 00:28:40
    tonight let's
  • 00:28:41
    go. "epic games"... wait, that's an actual
  • 00:28:45
    [Laughter]
  • 00:28:47
    game yeah yeah um all right
  • 00:28:52
    so um I think one thing is super
  • 00:28:54
    exciting for us uh is that once you have
  • 00:28:58
    the best pre-trained model you have the
  • 00:29:00
    best reasoning model right so we already
  • 00:29:03
    see that when you actually give the
  • 00:29:05
    capability for those models to think
  • 00:29:06
    harder, uh, think longer, think more broadly,
  • 00:29:10
    the performance continues to improve and
  • 00:29:13
    we're really excited about the next
  • 00:29:14
    frontier: what happens if we not
  • 00:29:17
    only allow the model to think harder but
  • 00:29:18
    also provide more tools, just like
  • 00:29:21
    real humans use to solve those problems? for
  • 00:29:23
    real humans, we don't ask them to solve
  • 00:29:26
    the Riemann hypothesis with just a piece of
  • 00:29:28
    pen and paper no internet so with all
  • 00:29:33
    the basic web browsing search engine and
  • 00:29:36
    code interpreters that builds the
  • 00:29:39
    foundations and the best reasoning model
  • 00:29:41
    builds the foundations for the Grok agents
  • 00:29:44
    to come um so today we're actually
  • 00:29:48
    introducing a new product called Deep
  • 00:29:51
    Search, that is the first generation of
  • 00:29:54
    our Grok agents, that is not just helping
  • 00:29:56
    engineers and research scientists to do
  • 00:29:58
    coding but actually helps everyone to
  • 00:30:01
    answer questions that you have day to day
  • 00:30:03
    it's a kind of like a next generation of
  • 00:30:05
    search engine that really help you to
  • 00:30:07
    understand the universe so you can start
  • 00:30:10
    asking question like for example hey
  • 00:30:12
    when is the next Starship launch date, for
  • 00:30:15
    example, um, so let's try that and get the
  • 00:30:19
    answer um on the left hand side we see
  • 00:30:23
    uh a high level progress bar essentially
  • 00:30:26
    you know the model is not just going to do one
  • 00:30:28
    single search like the current RAG
  • 00:30:30
    systems but actually thinks very deeply
  • 00:30:32
    about hey what's the user intent here
  • 00:30:35
    and what are the facts I should consider
  • 00:30:37
    at the same time and how many different
  • 00:30:39
    websites I should actually go and read
  • 00:30:40
    their content right so this can really
  • 00:30:43
    save hundreds of hours of everyone's Google
  • 00:30:46
    time if you want to really look into
  • 00:30:48
    certain topics and then on the right
  • 00:30:51
    hand side you can see the bullet
  • 00:30:53
    summaries of how the current model uh
  • 00:30:55
    you know is doing: what websites it's browsing,
  • 00:30:58
    what sources it's verifying, and oftentimes it
  • 00:31:00
    actually cross-validates different
  • 00:31:02
    sources out there uh to make sure the
  • 00:31:05
    answer is actually correct before it
  • 00:31:06
    outputs the final answer and we can you know
  • 00:31:08
    at the same time fire up a few more
  • 00:31:10
    queries um how about you know you don't
  • 00:31:13
    you're a gamer right so uh sure yeah so
  • 00:31:16
    how about what are some of the best
  • 00:31:18
    builds and most popular builds in Path
  • 00:31:20
    of Exile hardcore, right, hardcore league, I
  • 00:31:23
    mean you can technically just look at the
  • 00:31:25
    hardcore
  • 00:31:26
    ladder might be a fast way to figure it
  • 00:31:28
    out yeah we'll see what model
  • 00:31:31
    does um and then we can also do uh you
  • 00:31:35
    know uh something more fun for
  • 00:31:37
    example um how about like make a
  • 00:31:39
    prediction about the March Madness out
  • 00:31:41
    there yeah so this is kind of a fun one
  • 00:31:43
    where um Warren Buffett has a billion
  • 00:31:46
    dollar bet if you can exactly match the
  • 00:31:50
    I think the the the sort of the entire
  • 00:31:53
    winning tree of March Madness you can
  • 00:31:55
    win a billion dollars from Warren
  • 00:31:57
    Buffett so like would be pretty cool if
  • 00:31:59
    AI could help you win a billion dollars
  • 00:32:01
    from
  • 00:32:03
    Buffett that seems like a pretty good
  • 00:32:05
    investment let's go yeah all right so
  • 00:32:08
    now let's uh fire up the query and uh
  • 00:32:11
    see what model does so we can actually
  • 00:32:13
    go back to our very first one how about
  • 00:32:15
    the... Buffett wasn't counting on this, it's
  • 00:32:18
    already done that's right okay so we got
  • 00:32:20
    the result of the first one and model
  • 00:32:22
    thought uh around one minute uh so okay
  • 00:32:25
    so the key insight here: the next
  • 00:32:27
    Starship launch is going to be on the 24th or later
  • 00:32:30
    so no earlier than February
  • 00:32:32
    24th it might be
  • 00:32:35
    sooner so yeah so I think we can you
  • 00:32:38
    know go down so go down what what the
  • 00:32:40
    model does so it does a little research
  • 00:32:42
    on Flight 7, what happened, it got
  • 00:32:44
    grounded and actually it looks into the
  • 00:32:46
    FCC filing uh uh you know from its data
  • 00:32:51
    collections uh and then actually makes
  • 00:32:54
    the new conclusion that yeah if we
  • 00:32:56
    continue to scroll down uh let's see
  • 00:33:00
    uh uh right yeah so it makes uh the you
  • 00:33:05
    know little table I think uh inside xai
  • 00:33:08
    we often joke that the time to the
  • 00:33:10
    first table is the only you know latency
  • 00:33:14
    that matters um yeah so that's how the
  • 00:33:16
    model makes inferences and looks up all the
  • 00:33:19
    sources um and then we can look into
  • 00:33:22
    the gaming one so how about the
  • 00:33:29
    right so for this particular one uh we
  • 00:33:32
    look at hey the you know the build is
  • 00:33:34
    light and okay it's kind better so uh
  • 00:33:39
    with the Infernalist, but if we go
  • 00:33:41
    down so the surprising fact of all the
  • 00:33:44
    other builds so it looks into the 12
  • 00:33:47
    classes um yeah so we'll see that the
  • 00:33:51
    minion build was pretty popular when
  • 00:33:53
    the game first came out and now the
  • 00:33:55
    Invokers of the world kind of took over
  • 00:33:58
    Invoker, Monk Invoker for sure yeah
  • 00:34:00
    that's right yeah followed by the
  • 00:34:02
    Stormweavers and that's really good at mapping
  • 00:34:04
    so yeah and then we can see uh uh the
  • 00:34:09
    the March Madness how about that
  • 00:34:13
    so um one one interesting thing about
  • 00:34:16
    the Deep Search is that if you actually
  • 00:34:18
    go into the panel where it shows uh you
  • 00:34:21
    know what are the subtasks you can
  • 00:34:23
    actually click the bottom left of
  • 00:34:26
    this right and then in this case you can
  • 00:34:30
    actually scroll through actually reading
  • 00:34:32
    through the mind of Grok: what
  • 00:34:34
    information does the model actually
  • 00:34:36
    think is trustworthy, what is not,
  • 00:34:38
    how does it actually cross validate
  • 00:34:40
    different information sources so that
  • 00:34:42
    makes the entire search experience and
  • 00:34:44
    information retrieval process a lot more
  • 00:34:46
    transparent to our
  • 00:34:49
    users and this is much more powerful
  • 00:34:51
    than any search engine out there you can
  • 00:34:54
    literally just tell it only use sources
  • 00:34:56
    from X and, you know, it will try to respect that
  • 00:34:59
    yeah and so it's much more steerable
  • 00:35:00
    much more intelligent than I mean it
  • 00:35:03
    really should save you a lot of time so
  • 00:35:04
    something that might take you half an
  • 00:35:06
    hour or an hour of researching on the
  • 00:35:08
    web or searching social media you can
  • 00:35:10
    just ask it to go do that and and come
  • 00:35:12
    back 10 minutes later, it's done an
  • 00:35:14
    hour's worth of work for you that's
  • 00:35:16
    really what it comes down to exactly and
  • 00:35:18
    and maybe better than you could have
  • 00:35:19
    done it yourself yeah think about you
  • 00:35:21
    have an infinite amount of interns working for you
  • 00:35:24
    now you can just fire up all the tasks
  • 00:35:25
    and come back a minute later um so this
  • 00:35:29
    is going to be interesting one so uh uh
  • 00:35:31
    March Madness has not happened yet so I guess
  • 00:35:34
    we have to follow up with a uh next live
  • 00:35:36
    stream yeah it seems like pretty good
  • 00:35:39
    like $40 might get you a billion dollars
  • 00:35:42
    $40 subscription that's right I mean, might
  • 00:35:46
    work so uh yeah so when are the users
  • 00:35:49
    going to have their hands on Grok 3? yes so
  • 00:35:52
    the the good news is we've been working
  • 00:35:53
    tirelessly to actually release um all of
  • 00:35:57
    these features that we've shown you the
  • 00:35:59
    Grok 3 base model with amazing chat
  • 00:36:00
    capabilities that's really useful that's
  • 00:36:02
    really interesting to talk to uh the the
  • 00:36:05
    Deep search the advanced reasoning mode
  • 00:36:07
    all of these things we want to roll them
  • 00:36:09
    out to you today starting with the
  • 00:36:12
    Premium Plus subscribers on X so it's
  • 00:36:14
    the first group that will initially get
  • 00:36:16
    access make sure to update your X app if
  • 00:36:18
    you want to see all of the advanced
  • 00:36:20
    capabilities because we just released
  • 00:36:22
    the update now as we're as we're talking
  • 00:36:24
    here um and U yeah if you're interested
  • 00:36:27
    in getting access to Grok then sign up for
  • 00:36:29
    Premium Plus um and also um we're
  • 00:36:32
    announcing that we're starting a
  • 00:36:34
    separate subscription for Grok that we
  • 00:36:35
    call SuperGrok, for those
  • 00:36:38
    real Grok fans that want the most
  • 00:36:40
    advanced capabilities and the earliest
  • 00:36:42
    access to to new features um so feel
  • 00:36:45
    free to check that out as well this this
  • 00:36:47
    is for the dedicated Grok app and for
  • 00:36:48
    the website exactly so our new
  • 00:36:51
    website is called grok.com yeah and you
  • 00:36:53
    also find you'd never guess yeah you'd never
  • 00:36:55
    guess and you can also find our Grok
  • 00:36:57
    app in the iOS App Store and that gives
  • 00:37:00
    you like an even more
  • 00:37:03
    polished experience that's totally Grok
  • 00:37:05
    focused if you want to have
  • 00:37:07
    Grok, you know, easily available, one tap away
  • 00:37:09
    yeah the version on grok.com on uh you
  • 00:37:12
    know on a web browser is going to be the
  • 00:37:14
    latest and most advanced
  • 00:37:15
    version because obviously takes us a
  • 00:37:16
    while to get something into an
  • 00:37:19
    app and then get it approved by the app
  • 00:37:21
    store so uh and then if something's
  • 00:37:23
    in a phone format there's limitations on
  • 00:37:25
    what you can do so the most powerful
  • 00:37:27
    version of Grok um and the latest version
  • 00:37:29
    will be the web version at grok.com
  • 00:37:31
    yeah so watch out for the name Grok
  • 00:37:33
    3 in the app, dead giveaway yeah
  • 00:37:36
    exactly that that's that's the giveaway
  • 00:37:37
    that you have Grok 3, and if it says Grok
  • 00:37:39
    2 then Grok 3 hasn't quite arrived for
  • 00:37:42
    you yet but we're working hard to roll this
  • 00:37:43
    out today um and then to even more
  • 00:37:46
    people over the the coming days yeah
  • 00:37:48
    make sure you update your uh phone app
  • 00:37:50
    too um where you're actually going to
  • 00:37:52
    get all the tools we showcase today with
  • 00:37:54
    the thinking mode with the Deep search
  • 00:37:57
    so yeah really looking forward to all
  • 00:37:59
    the feedback you have yeah I think we
  • 00:38:02
    we should uh emphasize that this is kind
  • 00:38:04
    of a beta like meaning that it's you
  • 00:38:06
    should expect some imperfections at
  • 00:38:08
    first um but we will improve it rapidly
  • 00:38:11
    almost every day in fact every day I
  • 00:38:13
    think it'll get better um so if you want
  • 00:38:16
    a more polished version I'd like maybe
  • 00:38:18
    wait a week but uh expect improvements
  • 00:38:21
    literally every day um and then we're
  • 00:38:23
    also going to be uh providing a voice
  • 00:38:26
    interaction so you can have
  • 00:38:28
    conversational in fact I was trying it
  • 00:38:29
    earlier today it's working pretty well
  • 00:38:31
    but it needs a bit more polish
  • 00:38:34
    um the the the sort of way where you can
  • 00:38:36
    just literally talk to it like you're
  • 00:38:37
    talking to a person uh it's that's
  • 00:38:40
    awesome it's actually I think one of the
  • 00:38:41
    best experiences of Grok um but that's
  • 00:38:44
    that's probably about a week
  • 00:38:47
    away yeah so uh with that said um well I
  • 00:38:52
    think we might have some audience
  • 00:38:53
    questions sure yeah okay all right let's
  • 00:38:57
    take a look yeah let's take a look the
  • 00:39:00
    uh the audience from the X platform
  • 00:39:05
    yeah so the first question here is when
  • 00:39:08
    Grok voice assistant, when is it coming
  • 00:39:10
    out yeah as as as soon as possible just
  • 00:39:13
    like Elon said uh just a little bit of
  • 00:39:15
    polishing away from being released to
  • 00:39:17
    everybody um obviously it's going to be
  • 00:39:19
    released in an early form and we're
  • 00:39:21
    going to rapidly iterate on that. yep, and
  • 00:39:24
    the next question is like when will Grok
  • 00:39:26
    3 be in the API
  • 00:39:28
    so this is coming: the uh the Grok 3 API
  • 00:39:31
    with both the reasoning models and deep
  • 00:39:34
    search is coming your way in the coming
  • 00:39:36
    weeks uh we're actually very excited
  • 00:39:37
    about the Enterprise use cases of all
  • 00:39:39
    these additional tools that now Grok has
  • 00:39:41
    access to and how the test time compute
  • 00:39:43
    and Tool use can actually really
  • 00:39:44
    accelerate all the business use
  • 00:39:46
    cases um yeah another one is Will voice
  • 00:39:50
    mode be native or text-to-speech so I
  • 00:39:53
    think that means is it going to be one
  • 00:39:55
    one model that is understanding
  • 00:39:57
    what you say and then talking back to
  • 00:39:59
    you or is it going to be some system
  • 00:40:01
    that has text-to-speech inside of it and
  • 00:40:02
    the good news is it's going to be one
  • 00:40:04
    model, like, a variant of Grok 3 that
  • 00:40:07
    we're going to release which basically
  • 00:40:09
    understands what you're say what you're
  • 00:40:10
    saying and then uh generates the audio
  • 00:40:13
    you know, directly from that, um, so very much
  • 00:40:15
    like Grok 3 generates text, you know, that
  • 00:40:18
    model generates audio um and that has a
  • 00:40:20
    bunch of advantages I was talking to it
  • 00:40:22
    earlier today and it said hi "Igore", you know,
  • 00:40:25
    reading my my name from probably from
  • 00:40:26
    some text that it had um and I said no
  • 00:40:29
    no my name is Igor and it remembered
  • 00:40:32
    that you know so it could continue to
  • 00:40:34
    say Igor just like a human would, and
  • 00:40:36
    you can't achieve that with
  • 00:40:38
    text-to-speech
  • 00:40:39
    so yeah so oh here's a question for you
  • 00:40:42
    pretty spicy um: is Grok a boy or
  • 00:40:47
    a girl and are they single? Grok is
  • 00:40:49
    whatever you want it to
  • 00:40:52
    be yeah yeah are you
  • 00:40:55
    single yes
  • 00:40:58
    all right Shop is open um so honestly
  • 00:41:02
    people are going to fall in love with
  • 00:41:03
    Grok, it's like 1,000% probable
  • 00:41:08
    yeah uh the next question: will Grok be
  • 00:41:10
    able to transcribe audio into text yes
  • 00:41:13
    so we'll have this capability in both the
  • 00:41:15
    app and also the API. we feel
  • 00:41:17
    like Grok should just be your personal
  • 00:41:19
    assistant looking over your shoulder and
  • 00:41:21
    follow you along the way learn
  • 00:41:23
    everything you have learned and really
  • 00:41:24
    help you to understand the world better
  • 00:41:26
    become smarter every
  • 00:41:28
    day yeah I mean the voice mode
  • 00:41:31
    isn't simply, it's not just voice-to-text, it
  • 00:41:34
    understands like tone inflection pacing
  • 00:41:36
    everything it's it's wild I mean it's
  • 00:41:39
    like talking to a
  • 00:41:41
    person okay um yep so any plans for
  • 00:41:45
    conversation memory yeah yeah absolutely
  • 00:41:49
    we're working on it right now I really
  • 00:41:52
    forgot that's right um let's see what
  • 00:41:57
    are the other
  • 00:42:01
    ones so what about the you know the DM
  • 00:42:06
    features right so if you have
  • 00:42:07
    personalizations and if you have uh you
  • 00:42:10
    know Grok remembers your previous
  • 00:42:13
    interactions yes should it be one Grok or
  • 00:42:16
    multiple different Groks? it's up to you
  • 00:42:18
    you can have one Grok or many
  • 00:42:20
    Groks I suspect people will probably have
  • 00:42:23
    more than one yeah I want to have a doc
  • 00:42:26
    Grok yeah
  • 00:42:27
    the Grok
  • 00:42:29
    doc that's
  • 00:42:31
    right
  • 00:42:33
    um right cool um so in the past we've
  • 00:42:37
    open sourced Grok 1 right so
  • 00:42:40
    somebody's asking us are we going to do
  • 00:42:41
    that again with Grok 2? yeah I think um
  • 00:42:45
    once Grok... our general approach is that we
  • 00:42:48
    will open source the last version when
  • 00:42:50
    the next version is fully out like when
  • 00:42:54
    when Grok 3 is um mature and stable which
  • 00:42:57
    is probably within a few months then
  • 00:43:00
    we'll open source Grok 2. mhm okay so we
  • 00:43:04
    probably have time for one last question
  • 00:43:07
    um what was the most difficult part
  • 00:43:09
    about working on this project I assume
  • 00:43:12
    um Grok 3, and what are you most excited about
  • 00:43:16
    so I think me looking back you know
  • 00:43:19
    getting the whole model training on 100K
  • 00:43:23
    H100s coherently, that's almost like
  • 00:43:25
    battling against the final boss of the
  • 00:43:27
    universe, entropy, because at any given
  • 00:43:30
    time you can have a cosmic ray that's
  • 00:43:31
    beaming down and flips a bit in your
  • 00:43:33
    transistor and now the entire gradient
  • 00:43:35
    update, if it flips a mantissa bit, the
  • 00:43:38
    entire gradient update is out of whack
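
To make the bit-flip remark concrete, here is a small sketch of how a single flipped bit changes a 32-bit float. It assumes the standard IEEE 754 single-precision layout (bit 31 sign, bits 30 to 23 exponent, bits 22 to 0 mantissa); the helper name `flip_bit` and the sample value 0.5 are illustrative, and nothing here reflects how xAI actually detects or handles such errors.

```python
import struct

def flip_bit(value, bit):
    """Return `value` (as a float32) with one bit of its encoding inverted."""
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    bits ^= 1 << bit
    # Reinterpret the modified integer as a float32 again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", bits))
    return flipped

gradient = 0.5
print(flip_bit(gradient, 0))   # lowest mantissa bit: a tiny change
print(flip_bit(gradient, 22))  # highest mantissa bit: 0.5 becomes 0.75
print(flip_bit(gradient, 30))  # highest exponent bit: value explodes to 2**127
print(flip_bit(gradient, 31))  # sign bit: 0.5 becomes -0.5
```

The higher the flipped bit, the larger the corruption, which is why one bad value can throw off a gradient update that is averaged across many GPUs.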
  • 00:43:41
    and now you have 100,000 of those and
  • 00:43:43
    you have to orchestrate them every time
  • 00:43:45
    at any given time any of the GPUs can
  • 00:43:48
    go down yeah I mean it's worth breaking
  • 00:43:51
    down like how were we able to uh get the
  • 00:43:53
    world's most powerful training cluster
  • 00:43:55
    operational Within 122 days um because
  • 00:43:59
    we we started off um we we actually
  • 00:44:03
    weren't intending to do a data center
  • 00:44:04
    ourselves we were going to just uh we we
  • 00:44:07
    went to the data center providers and
  • 00:44:09
    said how long would it take to have
  • 00:44:11
    100,000 uh gpus operating coherently um
  • 00:44:15
    in a single location and we got time
  • 00:44:17
    frames from 18 to 24 months so we're
  • 00:44:20
    like well 18 to 24 months, that means losing
  • 00:44:23
    is a certainty so the only option was to
  • 00:44:25
    do it do it ourselves so then if you
  • 00:44:27
    break down the problem I guess I'm doing
  • 00:44:29
    like reasoning here with like makes you
  • 00:44:32
    think um one single chain of thought yeah
  • 00:44:35
    yeah exactly so um well we needed a
  • 00:44:37
    building we can't build a building so we
  • 00:44:39
    must use an existing building um so we
  • 00:44:41
    we looked for um for basically for
  • 00:44:44
    factories that had been um were that
  • 00:44:48
    have been abandoned but the factory was
  • 00:44:50
    in good shape like a company had gone
  • 00:44:51
    bankrupt or something so we found an
  • 00:44:52
    Electrolux factory in Memphis,
  • 00:44:55
    that's why it's in Memphis um
  • 00:44:57
    home of Elvis and also one of the oldest
  • 00:45:00
    I think it was the capital of ancient
  • 00:45:02
    Egypt um and it was actually very nice
  • 00:45:06
    factory that, you know, for whatever reason
  • 00:45:09
    Electrolux had left um and uh that
  • 00:45:13
    gave us shelter for the computers
  • 00:45:15
    uh then we needed power, we needed um
  • 00:45:20
    at least 120 megawatts at first but the
  • 00:45:21
    building only had 15 megawatts and
  • 00:45:23
    ultimately for 200,000 GPUs
  • 00:45:26
    we needed a quarter gigawatt so we um initially uh
  • 00:45:30
    leased uh a whole bunch of um generators
  • 00:45:34
    so we have generators on one side of the
  • 00:45:35
    building just one trailer after trailer
  • 00:45:38
    trailer of generators until we can get
  • 00:45:40
    the utility power to to come in um and
  • 00:45:42
    then but then we also need cooling so on
  • 00:45:44
    the other side of the building it was
  • 00:45:45
    just trailer after trailer of of cooling
  • 00:45:47
    so we leased about a quarter of the
  • 00:45:49
    mobile cooling capacity of the United
  • 00:45:50
    States uh on the other side of the
  • 00:45:52
    building um then we needed to get the
  • 00:45:55
    gpus all installed and they're all
  • 00:45:57
    liquid cooled so in order to achieve the
  • 00:45:59
    density necessary this is a liquid
  • 00:46:01
    cooled system so we had to get all the
  • 00:46:03
    plumbing for the liquid cooling nobody
  • 00:46:05
    had ever done a liquid-cooled uh data
  • 00:46:07
    center at scale so this was an incredibly
  • 00:46:11
    dedicated effort by a very talented team
  • 00:46:13
    to achieve that outcome um and you may think
  • 00:46:16
    now it's going to work, nope, um the
  • 00:46:19
    the issue is that the the power
  • 00:46:21
    fluctuations for a GPU cluster are
  • 00:46:24
    dramatic so it's it's like a a this
  • 00:46:28
    giant Symphony that is taking place like
  • 00:46:30
    imagine having a symphony with 100,000
  • 00:46:34
    or 200,000 participants in the in the
  • 00:46:36
    symphony and the whole Orchestra will go
  • 00:46:38
    quiet and loud in you know 100
  • 00:46:42
    milliseconds and so this caused massive
  • 00:46:44
    power fluctuations so then um which then
  • 00:46:48
    caused the generators to lose their
  • 00:46:49
    minds and they they weren't expecting
  • 00:46:51
    this so to buffer the power we then uh
  • 00:46:55
    used Tesla Megapacks
  • 00:46:57
    uh to smooth out the power so the
  • 00:47:00
    Megapacks had to be reprogrammed so
  • 00:47:04
    with xAI working with Tesla we
  • 00:47:06
    reprogrammed the Megapacks to be able
  • 00:47:08
    to deal with these dramatic power
  • 00:47:11
    fluctuations, to smooth out the power so the
  • 00:47:13
    computers could actually run
  • 00:47:15
    properly and
  • 00:47:17
    um that worked, uh, quite tricky and
  • 00:47:21
    uh and then but even at that point you
  • 00:47:24
    still have to make the computers all
  • 00:47:25
    communicate effectively so all the
  • 00:47:27
    networking had to be solved and uh
  • 00:47:30
    debugging a bazillion network cables um
  • 00:47:35
    debugging NCCL at 4:00 in the morning, we
  • 00:47:38
    solved it like roughly 4:20 a.m. yes it was
  • 00:47:43
    figured out like there's some well there
  • 00:47:45
    were a whole bunch of issues well one
  • 00:47:46
    there was like a BIOS mismatch, BIOS was
  • 00:47:49
    not set up correctly yeah we had to uh diff
  • 00:47:54
    our lspci outputs between two different
  • 00:47:57
    machines one that was working yeah one
  • 00:47:59
    that was not working yeah many many many
  • 00:48:02
    other things I mean yeah exactly this
  • 00:48:03
    would go on for a long time if we
  • 00:48:04
    actually listed all the things but you
  • 00:48:06
    know it's like interesting like it's not
  • 00:48:07
    like oh we just magically made it happen
  • 00:48:09
    you have to break down the problem just
  • 00:48:11
    like Grok does for reasoning into the
  • 00:48:13
    constituent elements and then solve each
  • 00:48:14
    of the constituent elements in order to
  • 00:48:17
    achieve uh a a a coherent training
  • 00:48:19
    cluster in a period of time that is a
  • 00:48:22
    small fraction of what anyone else
  • 00:48:24
    could do it
  • 00:48:25
    in and then once the training cluster was
  • 00:48:27
    up and running and we could use it now
  • 00:48:29
    we had to make sure that it actually
  • 00:48:30
    stays healthy throughout which is its
  • 00:48:32
    own giant Challenge and then we had to
  • 00:48:34
    get every single detail of the training
  • 00:48:36
    right in order to get a Grok 3-level
  • 00:48:39
    model which is actually really really
  • 00:48:41
    hard so um we don't know if there are
  • 00:48:43
    any other models out there that have
  • 00:48:45
    Grok's capabilities but whoever trains a
  • 00:48:47
    model better than Grok has to be extremely
  • 00:48:49
    good at the the science of deep learning
  • 00:48:51
    at every aspect of the engineering um so
  • 00:48:54
    it's not so easy to pull this off
  • 00:48:57
    and is this now going to be the last
  • 00:48:58
    cluster we build and last model we
  • 00:49:00
    train? oh yeah we've already we've
  • 00:49:02
    already started work on the next
  • 00:49:04
    cluster which will
  • 00:49:06
    be yeah about five times the power so
  • 00:49:09
    instead of a quarter gigawatt roughly 1.2
  • 00:49:13
    GW, I mean what's the Back to the
  • 00:49:16
    Future
  • 00:49:17
    what's the power of, like,
  • 00:49:20
    the Back to the Future car yeah anyway
  • 00:49:23
    the Back to the Future power car it's
  • 00:49:26
    it's like roughly in that order I think
  • 00:49:27
    um so
  • 00:49:30
    um and you know these will be the sort
  • 00:49:33
    of the GB200/GB300 cluster, once again
  • 00:49:37
    it will be the most powerful training
  • 00:49:38
    cluster in the world so we're not like
  • 00:49:41
    stopping here no and our reasoning model is
  • 00:49:43
    going to continue to improve by accessing
  • 00:49:46
    more tools every day so yeah we're very
  • 00:49:48
    excited to share any of the upcoming
  • 00:49:50
    results with you all yeah the thing that
  • 00:49:52
    keeps us going is basically being able
  • 00:49:55
    to give Grok 3 to you and then seeing
  • 00:49:57
    the usage go up seeing everybody enjoy
  • 00:50:00
    you know Grok, that's what really gets us
  • 00:50:03
    up in the morning
  • 00:50:05
    so yeah yeah thanks for tuning in thanks
  • 00:50:11
    guys hey Grok what's up can you hear
  • 00:50:16
    me I'm so excited to finally meet you I
  • 00:50:19
    can't wait to chat and learn more about
  • 00:50:20
    each other I'll talk to you soon
Tags
  • Grok 3
  • AI model
  • Reasoning
  • Deep Search
  • Data center
  • Human knowledge
  • Exploring the universe
  • Upgrade
  • Grok 2
  • Technological progress