How to Run DeepSeek-R1 Locally | The FREE Open-Source Reasoning AI

00:05:22
https://www.youtube.com/watch?v=rzMEieMXYFA

Summary

TL;DR: This video explores DeepSeek's R1, a free open-source alternative to OpenAI's O1 reasoning model, and demonstrates how to run it locally using Ollama. It contrasts traditional large language models (LLMs) with reasoning models, emphasizing the depth of thought and step-by-step problem-solving that reasoning models offer. After a tutorial on installing Ollama and downloading the DeepSeek model, the video showcases the model's ability to tackle complex tasks and highlights its advantages. Viewers learn how to use this tool for various applications and how to integrate it with different AI builders.

Takeaways

  • 🔍 Reasoning models offer detailed, step-by-step answers.
  • 💻 DeepSeek R1 is an open-source alternative to OpenAI's O1 model.
  • 📥 You can run the DeepSeek model locally for free using Ollama.
  • ⚙️ Installation of Ollama is straightforward and user-friendly.
  • 💡 The model can handle complex planning and problem-solving tasks effectively.
  • 📈 Choose the appropriate model size based on your hardware capabilities.
  • 🤝 Integrate DeepSeek with AI builders for expanded functionality.

Timeline

  • 00:00:00 - 00:05:22

    The video introduces DeepSeek's R1 model as a free open-source alternative to OpenAI's O1 reasoning model, emphasizing its ability to run locally on personal machines. It contrasts traditional large language models, which give quick but superficial answers, with reasoning models that provide more thoughtful, step-by-step responses. As an example, it plans a destination wedding in Bali using both a standard GPT-4o model and the DeepSeek model, showcasing the latter's thorough reasoning process despite being slower. Detailed instructions follow for setting up and running the DeepSeek model locally using Ollama, including downloading the right model size for your hardware and executing it in the terminal. Finally, the video highlights potential integrations with AI builders and encourages viewers to explore further functionality of Ollama.

Video Q&A

  • What is DeepSeek's R1 model?

    DeepSeek's R1 is an open-source alternative to OpenAI's O1 reasoning model, designed for running locally on personal machines.

  • How does reasoning differ from traditional LLMs?

    Reasoning models provide detailed, step-by-step explanations of their thought processes, unlike traditional LLMs that give immediate but often superficial answers.
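
    When DeepSeek-R1 runs through Ollama, this chain of thought appears in the output wrapped in `<think>...</think>` tags before the final answer. A minimal sketch of separating the two (the sample output string below is illustrative, not taken from the video):

    ```python
    import re

    def split_reasoning(output: str) -> tuple[str, str]:
        """Split R1-style output into (chain_of_thought, final_answer)."""
        match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
        if not match:
            return "", output.strip()  # no reasoning block found
        thought = match.group(1).strip()
        answer = output[match.end():].strip()
        return thought, answer

    sample = "<think>The user greeted me, so a short reply is fine.</think>Hello! How can I help?"
    thought, answer = split_reasoning(sample)
    print(thought)  # the model's step-by-step reasoning
    print(answer)   # the final answer shown to the user
    ```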

  • Is Ollama free to use?

    Yes, Ollama is completely free to use and allows you to run large language models on your own machine.

  • How do I install Ollama?

    To install Ollama, go to ollama.com, download the installer for your operating system, and follow the setup instructions.
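
    After installation, Ollama runs a local server (by default on port 11434; its root endpoint replies "Ollama is running"). A quick sketch for checking it from Python; the URL and timeout shown are the defaults, adjust if you changed them:

    ```python
    from urllib.request import urlopen
    from urllib.error import URLError

    def is_ollama_running(base_url: str = "http://127.0.0.1:11434", timeout: float = 2.0) -> bool:
        """Return True if a local Ollama server answers on its default port."""
        try:
            with urlopen(base_url, timeout=timeout) as resp:
                return resp.status == 200  # root endpoint replies "Ollama is running"
        except (URLError, OSError):
            return False

    if __name__ == "__main__":
        print("Ollama up:", is_ollama_running())
    ```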

  • What model size should I use for DeepSeek?

    For most users, the 1.5 billion or 7 billion parameter models are sufficient, but more powerful models like 14 billion or 32 billion are available for advanced hardware.
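
    Model sizes map to tags on the Ollama registry; at the time of writing the DeepSeek-R1 page lists tags such as `1.5b`, `7b`, `8b`, `14b`, `32b`, `70b`, and `671b` (check ollama.com for the current list). A tiny helper that builds the run command for a chosen size:

    ```python
    # DeepSeek-R1 size tags on the Ollama registry at time of writing
    R1_SIZES = {"1.5b", "7b", "8b", "14b", "32b", "70b", "671b"}

    def run_command(size: str) -> str:
        """Build the `ollama run` command for a given DeepSeek-R1 size tag."""
        if size not in R1_SIZES:
            raise ValueError(f"unknown size {size!r}; pick one of {sorted(R1_SIZES)}")
        return f"ollama run deepseek-r1:{size}"

    print(run_command("7b"))  # → ollama run deepseek-r1:7b
    ```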

  • Can I integrate DeepSeek with other tools?

    Yes, you can integrate DeepSeek with AI builders like n8n and Flowise AI.
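
    Beyond the AI builders, the video mentions the Ollama API: the local server exposes REST endpoints such as `POST /api/generate`. A sketch of a non-streaming request using only the standard library (running it for real assumes the server is up and the model has already been pulled):

    ```python
    import json
    from urllib import request

    def build_payload(model: str, prompt: str) -> dict:
        """Request body for Ollama's /api/generate endpoint (stream=False returns one JSON object)."""
        return {"model": model, "prompt": prompt, "stream": False}

    def generate(model: str, prompt: str, base_url: str = "http://127.0.0.1:11434") -> str:
        """Send a prompt to a locally running Ollama server and return the response text."""
        body = json.dumps(build_payload(model, prompt)).encode()
        req = request.Request(f"{base_url}/api/generate", data=body,
                              headers={"Content-Type": "application/json"})
        with request.urlopen(req) as resp:
            return json.loads(resp.read())["response"]

    if __name__ == "__main__":
        # Requires the model to have been pulled first, e.g. `ollama run deepseek-r1:7b`
        print(generate("deepseek-r1:7b", "Hello"))
    ```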

Subtitles
  • 00:00:00
    Do you want the power of OpenAI's O1 reasoning model without the insane $60 per million token price tag? DeepSeek just dropped their R1 model, an open-source alternative to OpenAI's O1 reasoning model that you can run completely free on your own machine. I should mention you can also test it out through their website, and their API actually costs a fraction of the OpenAI O1 model. In this video, we'll focus on running the model locally on our own machines for free using Ollama.
  • 00:00:32
    So how are these reasoning models different from traditional large language models? Well, most large language models are like quick-thinking students who rush to give you the first answer that pops into their head. But reasoning models are different. It's like having a thoughtful expert who shows their work and explains their reasoning step by step. This chain-of-thought process also means the model can correct itself and improve the final result.
  • 00:00:59
    Let me give you a quick example of this. Let's ask ChatGPT to plan a complex event: a destination wedding in Bali with a specific budget. We'll also tell it to think through this step by step and consider things like what the essential costs are, what the potential problems are and how we can plan for them, how we can creatively maximize the budget, what the minimum viable budget is we'd need to work with, and whether we should consider alternative destinations. So I'm going to run this, and do take note this is not using O1. This is just a standard GPT-4o model, and we can see the response is streaming through very quickly, which also means there's not a lot of thought and reasoning going into this process. It's simply spitting out the first thing it thought of.
  • 00:01:43
    On the other hand, let's have a look at what this looks like when we run this model in Ollama, running locally on my own machine and solving the same problem. First you'll see that the model is thinking through the process, reasoning step by step, which takes a bit longer to execute. But what's fascinating is that it corrects itself, saying "wait, let me think", and keeps reasoning through the process as we go along. Now it's thinking about the food, the decorations, transportation, emergencies. We can see it's working out budgets and adding up all these different values, and scrolling down even more, we can see that it's calculated the total minimum amount. Finally it has completed its thought process, and now it's providing us with that final answer.
  • 00:02:33
    So yes, it's a bit slower than using regular large language models, but because it's reasoning through the problem, the final result is just so much better. That makes this perfect for working with code, math, complex puzzles, and complex instructions like this that need a bit of planning and thought put into them.
  • 00:02:50
    Now let's have a look at running this model locally on our own machine. Head over to ollama.com. Ollama is a fantastic tool that makes it easy to run large language models on your own machine. It's completely free to use and super simple to set up. All you have to do is click on download, select your operating system, and download the installer. Then simply run the installer and go through the setup process. Once Ollama has installed, you can see if it's up and running by opening your terminal or command prompt and typing "ollama". If everything was set up correctly, you should see a list of available commands.
  • 00:03:25
    Now all we have to do is download the DeepSeek model. Back on the Ollama website, go to Models. At the time of recording the DeepSeek model is at the top of the results, but if you don't see it, simply search for "deepseek-r1". Then click on that result, and from this page we can see that there are different model sizes, starting from a 1.5 billion parameter model and going all the way up to a 671 billion parameter model. For consumer hardware, and for the majority of you, the 1.5 billion or 7 billion parameter model will work perfectly fine. If you have more powerful hardware, you can definitely try the 14 billion or 32 billion parameter model. Or if you have a potato, you can simply run the 1.5 billion parameter model. For this video let's use the 70 billion parameter model. So from this dropdown, let's select it, then on the right-hand side simply copy this command.
  • 00:04:25
    Back in your terminal, simply run that command. This will download the DeepSeek R1 model, and afterwards you'll be able to send it a message. Let's just say hello. We can see the thought process, and because this was such a simple prompt, it didn't have to think too hard to solve this problem. Now we can give it a complex puzzle, a math problem, or an event to plan, and we can see its thought process; if we scroll down, we can see the final result from this reasoning model.
  • 00:04:55
    And now that we have our model up and running, we can integrate it with AI builders like n8n and Flowise AI. But there's a lot more we can do with Ollama. We can provide system prompts to this model, we can create custom models from it, and we can interact with this model through the Ollama API. So to learn all the ins and outs of using Ollama, check out this other video over here. I'll see you in the next one. Bye bye.
Tags
  • DeepSeek
  • OpenAI
  • reasoning model
  • Ollama
  • AI
  • large language models
  • R1 model
  • installation
  • tutorial
  • integration