Everything I'm about to cover in this video is the basic information and terminology, like a glossary, of what you need to know as an author who wants to use AI. Even a lot of experienced people who write with AI don't understand many of the terms I'm going to share with you today. So I want to make sure you have all of these systems and terms in mind so that you can understand how AI works and how you can help it best serve you as an author. So let's dive in.
The first thing I wish more people understood is the difference between a large language model, or LLM, and a tool like a chatbot or what we call a wrapper tool. To help with this, I like to use the analogy of electricity and appliances. You can use electricity in a variety of ways. If you're in your kitchen and you have a food processor, a microwave, and a blender, all of them use the same electricity, but they use it in different ways. In this analogy, the large language model is the electricity: it's the raw power that creates the results you want. A tool is like the appliance that uses that power in its own way, and one tool might be more useful for a writer than for a coder, or vice versa. So you might have a wrapper tool built specifically around a coding workflow, and another that uses the same large language model for writing, or whatever the case may be.

Even chatbots, which we often think of as synonymous with large language models, things like ChatGPT or Claude, are actually wrapper tools themselves. They're really simple wrapper tools: all they have is a chat interface and maybe a few other bells and whistles. If I go to ChatGPT, you can see up in the corner that I can select between different large language models. ChatGPT is not the same as a large language model; it's simply a tool that incorporates a large language model, the way an appliance uses electricity.

Now, where this metaphor breaks down a little is that there are multiple LLMs, and as far as I know, there aren't multiple types of electricity. Each large language model has its own strengths and weaknesses, but by itself, a large language model needs some kind of tool built around it for you to interface with it effectively. Otherwise, it's just a simple prompt and a response: it doesn't have memory, and it doesn't have many other features.
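To make that concrete, here's a minimal sketch of what a chatbot wrapper actually does around the raw model. It assumes the OpenAI Python SDK with an OPENAI_API_KEY in your environment, and the model name is just a placeholder. The point is that the "memory" is nothing more than a list of messages the wrapper re-sends on every turn.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in your environment
messages = []      # this list IS the chatbot's "memory"; the raw model has none

while True:
    user_input = input("You: ")
    if user_input.lower() in {"quit", "exit"}:
        break
    messages.append({"role": "user", "content": user_input})
    # Placeholder model name; swap in whichever LLM your wrapper "plugs into"
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    reply = resp.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    print("AI:", reply)
```

Strip away ChatGPT's interface and this loop is roughly what's left: the model itself only ever sees one prompt and returns one response.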
All right, the second thing I want to get across is the difference between regular models and reasoning models. This is a relatively new development, at least new in the world of AI: reasoning models think before they give you an answer, and by doing so they usually give you higher-quality answers. They're particularly good for any kind of task that involves reasoning, anything a human would actually spend time thinking about.

In the writing world, the best cases for reasoning models are things like editing. Before, a large language model wouldn't necessarily give you good advice about your book. If you gave it your book and said, "Hey, please edit this book for me," it wouldn't necessarily do a good job. It would give you stuff that sounds like what an editor would say, but it wasn't actually able to analyze, think, and make decisions based on the book it was reading; it just gave you stuff that sounded correct. A thinking model, while not perfect in that regard, is much better at actually working through what may or may not be a problem with your book, and it's able to give you much better answers. Reasoning models are also usually good for brainstorming and outlining; anything that requires heavier thought, they're particularly good at.

But there are still really good use cases for non-reasoning models. In particular, for some reason, the actual writing of prose tends to be better with them, or at least the same quality at a cheaper price than the reasoning models. So something I recommend for pretty much all authors is to test out which large language models do what, because not only are they all different with different strengths, but you also have reasoning versus non-reasoning. For every prompt you use regularly, you should definitely test which kind of model handles it better.
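Here's a minimal sketch of that kind of side-by-side test, again using the OpenAI SDK. The two model IDs are placeholders standing in for a non-reasoning model and a reasoning model; swap in whatever pair you actually want to compare, then judge the outputs (and the prices) yourself.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

prompt = "Suggest three plot twists for a small-town mystery novel."

# Placeholder IDs: the first stands in for a non-reasoning model,
# the second for a reasoning model.
for model in ("gpt-4o-mini", "o3-mini"):
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {model} ---")
    print(resp.choices[0].message.content)
```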
All right, the next thing I want to be clear on is what a context window and a token are. These are terms you'll hear a lot around AI, so let me explain them a little.

First, if we come into OpenRouter, a tool you'll see me use (it's not really important for this video, but at openrouter.ai you can look at pretty much all of the large language models that are publicly available on the market; not everything, but the vast majority). If we look at one of them, whatever's here, you'll see it has a 96k context window. If you look at some of the big ones, Llama 4 Maverick has a 1.05 million token context window. Now, it just so happens that with Llama 4, I wouldn't trust that number based on reports I'm hearing, but technically it has that big of a context window. What that means is that it can process, in this case, a little over one million tokens in its prompt. You can give it an enormous prompt with a million tokens, and in theory it will be able to read and understand everything within that prompt. A one million token context window is actually really large. In theory, you could have it read all of your organization's documents, all of your books, whatever, and it would be able to understand all of it.

Now, in practice, that isn't always the case. There's this thing called the needle-in-the-haystack problem: if you give a model a whole bunch of stuff, it actually gets worse at identifying small bits of information, kind of like a human would, honestly. But regardless, you should be able to give it a massive amount of context and have it understand that.

But first, we have to understand what exactly a token is, because a token is not the same as a word; a million-token window is not the same as a million words it can understand. A token is a specific unit that an AI uses to read text, and not all large language models use tokens in the same way. If we go to this tokenizer tool developed by OpenAI, you can see how the OpenAI models look at tokens. If I copy this text and paste it in, we can see exactly what these tokens look like. A lot of them do translate to single words: in "language models process text using tokens," those individual words are each processed as one token. But the comma is also a single token; you'll see punctuation is often its own token. You also have words like "OpenAI's," which is actually three tokens: "Open," "AI," and the apostrophe-s. Other words sometimes get split up too; over here, "tokenized" is two tokens. So that's just the way it works: sometimes it splits words up, sometimes a single word is one token, sometimes punctuation functions as a single token, and so on. The rule of thumb that's pretty widely used in AI circles is that 100 tokens is roughly equivalent to 75 words.
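If you want to see this for yourself outside of OpenAI's web tool, their open-source tiktoken library does the same thing. A short sketch; the encoding name here matches many recent OpenAI models, but it's a detail worth checking for whichever model you use.

```python
import tiktoken  # pip install tiktoken (OpenAI's open-source tokenizer)

enc = tiktoken.get_encoding("cl100k_base")  # encoding used by many recent OpenAI models

text = "OpenAI's language models process text using tokens."
token_ids = enc.encode(text)

print(len(token_ids), "tokens")
# Decode each token individually to see how the text was split:
# whole words, word pieces, and punctuation each get their own token.
print([enc.decode([t]) for t in token_ids])

# Rule of thumb: 100 tokens is roughly 75 words
print("approx. words in a 1M-token context window:", int(1_000_000 * 0.75))
```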
So if you look at that million-token context window, you can assume you can fit roughly 750,000 words into it. That's enough for several books that you could put in there and have it read. Now, once again, the needle-in-the-haystack problem means that if you give it too much context, it actually performs worse in general. I'm sure that will get better over time; it already has gotten better over time. A standard context window that you'll see on a lot of models is about 200,000 tokens. That is plenty for any of the use cases I have. Usually my prompts do not exceed 15,000 to 20,000 words, so all I would need is a good, I don't know, 50,000-token context window to feel safe. I definitely don't need a million-token context window for my needs as a writer.

And if you are putting entire books into an AI to read, I would be skeptical of the results you're going to get from that. You will get better results by having AI summarize each chapter individually and then using that summary of your book in your context, because it'll be fewer words but still get the point of your book across if you want the model to understand the context. So say you're writing a series and you're working on book two, and you want to make sure it understands what happened in book one. Don't put your entire book one into your context window. Not only will that dilute the prompt a little, it will also cost you a whole lot more money. Instead, summarize each chapter of that book individually and provide that summary in the context. Little tricks like that make it very easy.
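Here's a rough sketch of that chapter-by-chapter summarization trick, again with the OpenAI SDK and a placeholder model name. The file layout is purely illustrative, assuming you've saved each chapter of book one as its own text file.

```python
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def summarize_chapter(chapter_text: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable model works
        messages=[
            {"role": "system", "content": "Summarize the chapter you are given in one paragraph."},
            {"role": "user", "content": chapter_text},
        ],
    )
    return resp.choices[0].message.content

# Hypothetical layout: book one's chapters saved as book1/chapter_01.txt, etc.
chapters = sorted(Path("book1").glob("chapter_*.txt"))
summaries = [summarize_chapter(p.read_text()) for p in chapters]

# This compact summary, not the full manuscript, goes into your book-two context.
book_one_summary = "\n\n".join(summaries)
```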
All right, the next thing I want to talk about is temperature and other AI parameters. If I'm in the OpenAI playground, this is something you cannot do inside of ChatGPT or most other chatbots; you have to do it through the provider's API, which you don't really need to understand right now. I will do more videos about APIs later, but you can think of an API as the cord that connects the appliance to the plug and the electricity behind it. So we have ChatGPT, but then we also have OpenAI's playground. One of the benefits of the playground is that you can select a bunch of other models that are not available in ChatGPT. If we select one and then open these little settings right here, you'll see it gives us temperature, max tokens, and top P. If we go back to OpenRouter and set up a chat (let's open a new chat, pull in Gemini 2.5 Pro, click these three dots, and go to sampling parameters), you'll see even more: max tokens, temperature, top P, top K, frequency penalty, presence penalty, repetition penalty, min P, and top A.

Most of these you do not need to worry about. They don't really have much of an effect, or at least a desirable effect, on your words. But there are some you should probably play around with, especially if you're not getting the results you want out of the prompts you're giving AI.

First of all, max tokens. This one is the most straightforward: it caps how many tokens the model will generate in a single response. Say your model has a million-token window, and in this case it does, because Gemini 2.5 Pro has a million-token context window, but you don't really need responses anywhere near that long. You can bring the cap down to, say, 100,000 or 75,000; you can mess with that.

Chat memory: you won't see this everywhere, but here in OpenRouter it controls how far back the chat goes. This will save you money if you are having really, really long chats, but it will result in the AI forgetting some of the older messages if you set it too low. You can set it all the way up to 420, which is the top here, so your chat can go as far back as 420 responses before it starts to forget.
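Wrapper tools handle that trimming for you, but if you ever build your own, the idea is simple. Here's a sketch of a hypothetical trim_history helper (my own illustration, not any library's API): keep the system prompt, drop the oldest turns.

```python
def trim_history(messages: list[dict], keep_last: int = 40) -> list[dict]:
    """Keep the system prompt plus only the most recent turns.

    Older user/assistant messages are dropped, which saves money on long
    chats at the cost of the model "forgetting" them. OpenRouter's chat
    memory setting does essentially this for you.
    """
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    return system + turns[-keep_last:]
```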
Then we have temperature. This is probably the most important one you'll look at. A lot of people call temperature the creativity meter. That's not an exactly fair comparison, because it's a little more nuanced than that, but you can essentially think of it that way: the more we turn this up, the more creative it gets, and the more we turn it down, the less creative and more predictable it gets. Now, sometimes you want predictability. If you have a certain automation and you want the results to always be the same kind of response, you can turn the temperature down and it will get more predictable in its responses. But sometimes, especially as authors, when we're trying to write something, we want it to be a little more creative, and so tweaking this up just a little can be worth testing, to see whether it actually writes better when you do that.

The problem is that if you raise it too high, the output starts to become gibberish. What's happening is that large language models are predictive models: they use probability to determine what word should come next, and the higher you push this temperature dial, the less predictable those choices become. It will start to throw in words that weren't necessarily the most logical ones, and in some cases you might want that. But if you turn it up too high, it just turns into gibberish, and you don't want that either. So it's something to play around with. Most of the time, I see people keeping it within about 0.3 of the default. Most large language models default to 1, so you can bring it down to around 0.7 or up to around 1.3 and stay more or less safe without it turning into gibberish or anything like that.
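An easy way to get a feel for temperature is to run the same prompt at several values and read the results side by side. A small sketch, with the model name again a placeholder:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

prompt = "Describe a lighthouse at dusk in two sentences."

# Low values give safe, predictable prose; high values get looser,
# and past roughly 1.3 many models drift toward gibberish.
for temp in (0.2, 0.7, 1.0, 1.3):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": prompt}],
        temperature=temp,
    )
    print(f"--- temperature={temp} ---")
    print(resp.choices[0].message.content)
```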
Top P is similar. If you lower top P, the responses get a little more predictable, because it's actually restricting the pool of tokens the model can choose from. If I brought it down to 0.5, that means the least likely tokens, the ones making up the bottom half of the probability, are no longer allowed. Usually I keep this at the top; maybe you'd bring it down just a little so you start avoiding some of those really flowery, overused words in favor of slightly more predictable ones. But that's just another thing to experiment with. Top K I don't really deal with. Frequency, presence, and repetition penalty are all somewhat useful, because they determine how frequently your AI will reuse certain words or ideas, so playing around with them can help if you want to reduce repetition. But some of these are very subtle, and I wouldn't play around with them too much unless you're really an expert at AI and know what you're doing. The one you're most likely to use is temperature, and I do recommend you play around with temperature just to see which setting works best.
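For reference, here's how those sampling parameters look when they're all set explicitly in a single API call. This uses the OpenAI SDK; the model name and the values are placeholders to experiment from, not recommendations.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

resp = client.chat.completions.create(
    model="gpt-4o-mini",    # placeholder model
    messages=[{"role": "user", "content": "Write a paragraph of a cozy mystery scene."}],
    temperature=1.0,        # roughly 0.7 to 1.3 is the "safe" creative range discussed above
    top_p=0.95,             # trims only the least likely tokens
    max_tokens=800,         # cap on the length of this response
    frequency_penalty=0.3,  # discourages repeating the same words
    presence_penalty=0.0,   # discourages reusing the same topics
)
print(resp.choices[0].message.content)
```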
All right, last but not least, I want to talk about the difference between a system prompt, an AI response, and a user input. We often talk about prompts; for instance, in my StoryHacker Silver group, I have a bunch of prompts that I've given people, and most of them are a single prompt: you enter it into a chatbot and it gives you a response. There's actually a lot more nuance to prompting than that, and sometimes your prompts can get a little more complex. Among the prompts in my Silver group (which, by the way, you can check out below) are some for Novelcrafter, and Novelcrafter splits things up into multiple parts. So it's not just one prompt; it's one prompt split into three groups: system prompt, AI response, and user prompt.

To show you that here in OpenRouter, let's pull up Gemini 2.5 Pro again. When you click on these three dots, you'll get this little box that says system prompt. Likewise, if we go to OpenAI's dashboard, you can see there's a box for the system message there. And if you're using ChatGPT or other chatbots like it, there's a feature in most of them that gives you similar results to a system prompt: you can create a custom GPT, or you can go to "Customize ChatGPT" and enter information where it asks what traits ChatGPT should have. You can put style information and other things in there, and it functions much the same as a system prompt.

But let's go back to OpenRouter so I can show this off. A system prompt is essentially the things you want the model to always know. If you have a regular task that you want AI to perform, you put it in here. For instance, a really simple version of a good system prompt that I use all the time is just: when I give you text, summarize it in one sentence (or one paragraph, or whatever). In fact, let's do that right now. I enter "When I give you text, summarize it in one sentence" and click out of the box. Now, anytime I give it text, it will automatically do what I asked, because it always remembers that this is the way it must behave. The system prompt establishes the parameters it must always follow.

Now, let's go to my website, pick one of the articles, and copy the entire thing, a nice long article, and paste the whole article into the chat. Because it has the system prompt, it knows it now needs to summarize this in one sentence. It so happens that Gemini 2.5 Pro is a reasoning model, so it did a little bit of reasoning here that you can look through, but here it is, the single-sentence summary: "This guide provides writers with a detailed four-act beat sheet for plotting cozy mysteries, covering essential story structure, character development, and genre conventions to craft a compelling whodunit." So it did what I asked, right? That's the system prompt, and it's probably the most important on this list.
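In API terms, that demo is just a two-message conversation. A sketch, with the model name a placeholder and the article loading hypothetical:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

article_text = open("article.txt").read()  # hypothetical file holding the pasted article

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; in the OpenRouter demo this was Gemini 2.5 Pro
    messages=[
        # The system prompt: what the model must always do
        {"role": "system", "content": "When I give you text, summarize it in one sentence."},
        # The user prompt: the text for this particular request
        {"role": "user", "content": article_text},
    ],
)
print(resp.choices[0].message.content)  # the one-sentence summary
```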
However, there are two other prompt components. Let's get rid of this system prompt for a minute. Say we don't have a specific task that we want the AI to do all of the time; we just want to ask it a simple question and have it answer us in a simple way. So there's no system prompt at work here, and we just say, "Give me the lyrics to Mary Had a Little Lamb." What I've just entered is the user prompt, and it's the kind of prompting we're most familiar with: the thing we ask it to do, where we give it a task. A lot of the time we put all of our data and all of our info into that single prompt, when it might be better to split portions of it out into the system prompt. For instance, if I'm writing a book and I want the style to be relatively consistent throughout everything I do, I put the style prompt, and maybe examples of the type of writing I want, into the system prompt, because that's the stuff I want it to remember all the time. Then, in the user prompt, I give it specific instructions for the particular part of the scene that I want it to write next.
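As a sketch, that split looks like this. The style rules, sample passage, and scene details are all made-up placeholders standing in for your own material.

```python
messages = [
    # System prompt: the stuff the model should remember for EVERY request,
    # i.e. style rules plus an example of the voice you want.
    {
        "role": "system",
        "content": (
            "You are drafting scenes for a cozy mystery novel. Write in first person, "
            "past tense, with a warm, wry voice. Match the style of this sample:\n\n"
            "The bakery smelled of cinnamon and secrets, and I intended to get to "
            "the bottom of both."
        ),
    },
    # User prompt: instructions for just this part of the book.
    {
        "role": "user",
        "content": "Write the scene where the protagonist finds the second clue behind the flour bins.",
    },
]
```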
So that's the user prompt. It's the most straightforward; I think we all understand how that works. And now the response it gave me: this is the third component of a prompt, the AI response. Now, you might say, "That's not a prompt, Jason, that's a response." And while that's true, in some cases, for instance here in OpenRouter, I can actually go through and edit it. Say I don't like the style of the response it gave me; I can rewrite it myself, and after I hit save, the model will think that my edited version is what it said. So one of the unique things a lot of people don't take advantage of with AI is that if you put data into the AI response, or edit the data from an AI response, then what it gives you next, say when you're continuing on with the next part of your scene, will better match what you saw in its first response. So let's say that rather than asking for the lyrics of Mary Had a Little Lamb, I asked it to write part of my scene, but there were some bits in there I didn't like. I changed the wording; I edited it pretty heavily to make sure it sounded the way I wanted it to sound. Then I asked it to write the next part of the scene after that. It will look at what it wrote in the past, and because it thinks it wrote that edited version, that helps it be more effective at writing the next bit. That's just something to keep in mind.
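In the API, this trick is just placing your edited text in an assistant-role message before your next request. A sketch with placeholder content:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# My heavily edited version of what the model originally wrote.
edited_opening = "Rain needled the bakery windows as I weighed the note in my hand..."

messages = [
    {"role": "user", "content": "Write the opening paragraph of the scene."},
    # The model treats this assistant turn as its OWN earlier output,
    # so it will match this edited style when it continues.
    {"role": "assistant", "content": edited_opening},
    {"role": "user", "content": "Continue the scene from there."},
]

resp = client.chat.completions.create(model="gpt-4o-mini", messages=messages)  # placeholder model
print(resp.choices[0].message.content)
```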
Anyway, I hope these tips have been super useful for you. Let me know in the comments if you want to see anything else like this, any other things about AI that confuse you, and I'll be sure to take a look at those. In the meantime, go ahead and check out my groups down below. My Silver group is really low cost, a one-time fee; you get all my prompts and all my frameworks in there, plus access to a really thriving community with thousands of members at this point. And then there's also my Gold group down below, which is on a waitlist right now, but I'll be opening that up pretty soon.