AI tools for software engineers, but without the hype – with Simon Willison (Co-Creator of Django)

01:12:43
https://www.youtube.com/watch?v=uRuLgar5XZw

Summary

TL;DR: In this episode of the Pragmatic Engineering Podcast, Simon Willison, an experienced software engineer and open-source contributor, discusses the impact and use of AI tools in programming. Tools like ChatGPT's Code Interpreter mode can write and execute code, posing both opportunities and challenges for programmers. Although AI can perform basic coding tasks rapidly, prompting reflection on the role of the programmer, Simon highlights how these tools can significantly boost efficiency and enable more ambitious projects. He describes his experimental approach to understanding them, comparing their impact to previous major advances like open-source software and Firebug. Despite the existential dread some programmers feel, Simon argues that combining human expertise with AI yields an overwhelming competitive advantage. He advises both experienced and new software engineers to experiment with AI tools and integrate them into their practice. The episode also debunks common misconceptions, such as the belief that AI will soon replace programmers entirely, emphasizing instead that AI works best as an augmentation of human effort.

Takeaways

  • 🤖 AI tools like ChatGPT can significantly boost programming productivity.
  • 💡 Large language models pose both opportunities and existential challenges for developers.
  • 🔧 Simon advocates using AI as a tool to enhance existing programming skills.
  • 📈 AI allows developers to tackle more ambitious projects by handling trivial tasks.
  • 🧠 Understanding AI tools requires experimentation and building an intuition for their use.
  • 🛠️ The key to effective AI use is knowing what tasks AI excels at and where it falls short.
  • 🧰 Integrating AI with personal programming knowledge results in a strong competitive advantage.
  • 🌍 Open-source has been a major factor in improving software development efficiency.
  • 📚 Developers should continue experimenting with AI to discover new ways to enhance their workflows.
  • ⚖️ Ethical considerations in AI use are important and should not be overlooked.

Timeline

  • 00:00:00 - 00:05:00

    The introduction discusses the experience of using AI models for coding, particularly after the introduction of Code Interpreter mode in ChatGPT. Simon describes a feeling of existential dread upon realizing how efficiently AI could solve problems he believed were integral to his professional identity. He balances this concern with an optimistic view: by leveraging his programming knowledge alongside AI tools, he can outpace those unfamiliar with coding.

  • 00:05:00 - 00:10:00

    The podcast, Pragmatic Engineering, focuses on software engineering insights from big tech and startups, featuring experienced engineers and their lessons. The host discusses their conversation with Simon Willison, an independent researcher with extensive experience using large language models (LLMs) for personal productivity. Plans for an in-depth conversation with Simon about practical applications of AI in development workflows and misconceptions about LLMs are introduced. Simon’s background, involving significant contributions to open source and transitioning from startups to large companies, sets the stage for insights into using AI in programming.

  • 00:10:00 - 00:15:00

    Simon discusses his early exposure to machine learning and LLMs, beginning with GPT-2 and evolving into heavy use of GPT-3. Initially unimpressed by the early models, Simon found GPT-3 significantly more capable. He began using it for coding tasks, like writing jq programs, and noted improved handling of more complex tasks with the advent of ChatGPT. The shift from cumbersome completion prompts to an intuitive chat interface marked a major advancement, making interactions more user-friendly and efficient for solving coding issues.
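
    The contrast between completion prompts and the chat interface can be sketched as follows; the prompt wording and the jq example are illustrative, not Simon's actual prompts:

```python
# Sketch of GPT-3-era "completion" prompting versus the later chat style.
# Prompt wording and the jq example are illustrative assumptions.

def completion_prompt(input_json: str, output_json: str) -> str:
    """Write the start of a sentence and let the model finish it."""
    return (
        f"The jq program needed to turn this input:\n{input_json}\n"
        f"into this output:\n{output_json}\nis:"
    )

def chat_prompt(input_json: str, output_json: str) -> str:
    """The same request phrased as a question for a chat interface."""
    return (
        f"What jq program turns this input:\n{input_json}\n"
        f"into this output:\n{output_json}?"
    )

# A completion model "finishes the sentence" after "is:";
# a chat model answers the question directly.
print(completion_prompt('{"a": 1}', '[1]'))
```

    The awkwardness of having to phrase every request as a sentence to complete is, as noted above, a likely reason few people engaged with GPT-3 before the chat interface arrived.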

  • 00:15:00 - 00:20:00

    Simon recounts the transformative impact of ChatGPT’s Code Interpreter, which executed SQL queries efficiently on datasets, illustrating the tool’s capabilities compared to his own software. This experience led Simon to reconsider how AI could enhance his projects, integrating AI features like SQL query generation in his software. He reflects on this technological leap as both humbling and motivating, propelling him to innovate beyond AI capabilities by blending traditional software with AI advancements.
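
    The round trip Code Interpreter performs, composing a SQL query and executing it with Python's sqlite3 module, can be sketched like this; the table and rows are invented for illustration:

```python
import sqlite3

# Minimal sketch of what Code Interpreter does behind the scenes: compose a
# SQL query, execute it with Python's sqlite3 module, and read the result
# back to continue answering. The table and rows here are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE talks (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO talks VALUES (?, ?)",
    [("Datasette intro", 2021), ("LLMs for coding", 2023), ("jq tips", 2023)],
)

# A question like "how many talks were given in 2023?" becomes a query:
query = "SELECT COUNT(*) FROM talks WHERE year = ?"
(count,) = conn.execute(query, (2023,)).fetchone()
print(count)  # 2
```

    The model writes both the query and the surrounding Python, runs it, and reads the result back before replying.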

  • 00:20:00 - 00:25:00

    The podcast host plugs a sponsor, Codeium, an AI tool aimed at enterprise-ready software development. Simon elaborates on his exploration of various LLM tools, from local model setups to mainstream platforms like ChatGPT and Copilot, and how they evolved his coding stack. The conversation covers the challenges and learning curves of becoming proficient with AI tools, emphasizing the importance of understanding each model’s strengths and weaknesses to achieve enhanced productivity.

  • 00:25:00 - 00:30:00

    Simon elaborates on how LLMs have changed his coding approach after three years of using these tools alongside traditional programming. He explains that efficient use of AI tools requires understanding their potential and limits, such as their language preferences (e.g., Python and JavaScript) and weaknesses (e.g., current events and math problems). Daily exploration of different AI models, combined with patience and learning from trial and error, allowed him to integrate AI into his workflow effectively.

  • 00:30:00 - 00:35:00

    Simon explores the concept of fine-tuning LLMs, detailing the challenges and limited effectiveness of adapting models with custom training data. He introduces RAG (Retrieval-Augmented Generation), a simpler alternative to fine-tuning that retrieves relevant existing data at query time and includes it in the prompt. This approach shows how non-specialists can adapt AI to specific needs. The discussion highlights the intricacies of model training and the practical reality of using AI in real-world applications.
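
    The RAG pattern mentioned here can be sketched in a few lines: retrieve the most relevant documents, then paste them into the prompt. Naive keyword overlap stands in for a real embedding-based search, and the documents are invented for illustration:

```python
# Minimal sketch of Retrieval-Augmented Generation (RAG): rather than
# fine-tuning the model on your data, retrieve relevant documents at
# question time and include them in the prompt. Keyword overlap stands in
# for a real embedding search; the documents are invented.
documents = [
    "Datasette publishes SQLite databases with a web interface and JSON API",
    "Django is a Python web framework for rapid development",
    "jq is a command-line JSON processor",
]

def retrieve(question, docs, k=1):
    q_words = set(question.lower().split())
    # Rank documents by how many question words they share.
    return sorted(docs, key=lambda d: -len(q_words & set(d.lower().split())))[:k]

def build_prompt(question):
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("What does Datasette publish"))
```

    The model never sees the whole corpus, only the retrieved snippets, which is why RAG is so much cheaper and simpler than retraining.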

  • 00:35:00 - 00:40:00

    The conversation moves to Simon's day-to-day use of AI tools, highlighting Claude 3.5 Sonnet and GPT-4 for coding. Simon finds Claude 3.5 Sonnet particularly effective thanks to its quick, adaptive coding and problem-solving capabilities. He describes how these tools fit into his daily development environment, significantly boosting coding efficiency and freeing time for higher-level problem-solving, redefining his coding practices and expanding what projects are possible.

  • 00:40:00 - 00:45:00

    The discussion turns to Simon's personal strategies for using advanced AI tools like Copilot and Claude. He emphasizes building intuition and harnessing contextual AI to navigate complex coding challenges. Practical examples illustrate how he uses AI to work in specific programming languages, leveraging them to tackle larger, more ambitious projects. He shares insights on increasing code efficiency and exploring diverse languages by outsourcing trivial syntax details to AI, broadening the scope of potential projects.

  • 00:45:00 - 00:50:00

    Simon reflects on transformative technologies in his programming career, such as Firebug’s impact on web development productivity, likening AI advancements to previous innovations. He underscores open source as a major career booster, with platforms like GitHub revolutionizing collaboration and code reuse. While AI brings a similar revolutionary potential, it necessitates dedication and understanding to leverage effectively. The conversation hints at parallels between current AI tools and historical tech trends that redefined engineering practices.

  • 00:50:00 - 00:55:00

    Simon discusses his enhanced productivity through AI tools, estimating them to make him two to three times more efficient in coding tasks even though coding is a fraction of overall work. LLMs primarily enhance his ability to tackle a wider range of languages and projects, making complex or unfamiliar tasks feasible without in-depth prior knowledge of the syntax. The conversation explores broader implications, such as how AI might alter demand for developers or expand possible project scopes, aligning with historical tech advancements.

  • 00:55:00 - 01:00:00

    Exploring AI adoption resistance, Simon examines ethical, economic, and professional implications of AI integration. He respects the ethical stance against AI due to data privacy concerns while advocating for embracing AI to stay competitive. Simon suggests that fear of job displacement can be mitigated by mastering AI to complement existing skills. He debates potential misconceptions around AI stagnation and technological plateaus, highlighting ongoing incremental improvements in AI that could redefine competitive edge.

  • 01:00:00 - 01:05:00

    Simon gives a realistic perspective on the potential and limitations of AI tools, noting that techniques like chain-of-thought prompting enhance problem-solving by structuring the model's reasoning. He disputes notions of AI-driven job replacement, advocating for AI as an auxiliary that improves existing roles without usurping them. By augmenting human decision-making with AI assistance, developers can tackle more complex challenges. Simon emphasizes that mastery of these tools, combined with human intellect, prolongs professional relevance amid tech advancement.
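
    Chain-of-thought prompting simply means asking the model to lay out intermediate reasoning before answering; a sketch, with illustrative wording:

```python
# Sketch of chain-of-thought prompting: the prompt asks the model to show
# intermediate reasoning before the final answer, which tends to improve
# multi-step problem solving. The wording is an illustrative assumption.
def chain_of_thought(question):
    return (
        f"Question: {question}\n"
        "Think step by step, showing each intermediate result, "
        "then give the final answer on a line starting with 'Answer:'."
    )

print(chain_of_thought("How many days are in three non-leap years?"))
```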

  • 01:05:00 - 01:12:43

    Simon outlines suggestions for developers, promoting ongoing learning and experimentation with AI tools through personal projects. He advocates offloading mundane, repetitive tasks to AI, freeing developers to focus on innovation. By engaging with various AI tools, developers can sharpen their intuition, align AI applications with their objectives, and push project boundaries. Practical tips include integrating AI tools incrementally into daily tasks while staying current on AI developments to remain competitive and grow one's career.


Frequently asked questions

  • Who is Simon Willison?

    Simon Willison is an experienced software engineer known for co-creating the Django framework and other open-source contributions.

  • What is the main topic of the podcast?

    The main topic is the use of AI tools, specifically large language models, in coding and their impact on software engineering.

  • How did Simon Willison's career benefit from AI tools?

    Simon uses AI tools to enhance productivity by handling coding tasks more efficiently and taking on more ambitious projects.

  • What is the Code Interpreter Mode in ChatGPT?

    This feature allows ChatGPT to write and execute Python code to answer questions, effectively querying databases or processing data files.

  • What challenges does AI pose to programmers?

    AI tools challenge programmers by performing coding tasks faster, prompting an existential reflection on their role.

  • What is Simon Willison's perspective on using AI in programming?

    Simon sees AI as a tool to enhance productivity and plans to combine his programming knowledge with AI for better results.

  • What are some misconceptions about AI tools in software engineering?

    A common misconception is that AI tools will completely replace human programmers, but they are more effectively used as assistants.

  • What impact have large language models had on software engineering according to the podcast?

    Large language models have significantly improved productivity by automating routine coding tasks and expanding the scope of projects engineers can tackle.

  • How has open source influenced software engineering productivity?

    Open source has drastically reduced development costs and increased the availability of reusable software components.

Transcript

  • 00:00:00
    every programmer who works with these
  • 00:00:02
    models the first time it spits out like
  • 00:00:04
    20 lines of actually good code that
  • 00:00:06
    solves your problem and does it faster
  • 00:00:07
    than you would there's that moment when
  • 00:00:09
    you're like hang on a second what am I
  • 00:00:10
    even for and then I tried this new
  • 00:00:12
    feature of chat GPT that they launched
  • 00:00:14
    last year called code interpreter mode
  • 00:00:16
    and I asked a question and it flawlessly
  • 00:00:19
    answered it by composing the right SQL
  • 00:00:20
    query running that using the Python
  • 00:00:23
    SQLite library and spitting out the
  • 00:00:24
    answer what am I even for like I thought
  • 00:00:26
    my life's purpose was to solve this
  • 00:00:28
    problem that was a little bit
  • 00:00:29
    existential dread it is scary when you think
  • 00:00:32
    okay I earn a very good salary because I
  • 00:00:35
    have worked through the trivia of
  • 00:00:37
    understanding Python and JavaScript and
  • 00:00:38
    I'm better at that trivia than most
  • 00:00:39
    other people and now you've got this
  • 00:00:41
    machine that comes along and it's better
  • 00:00:43
    at the trivia than I am I feel like
  • 00:00:45
    there's a pessimistic in an optimistic
  • 00:00:46
    way the optimistic version I can use
  • 00:00:48
    these tools better than anyone else for
  • 00:00:51
    programming I can take my existing
  • 00:00:52
    program knowledge and when I combine it
  • 00:00:54
    with these tools I will run circles
  • 00:00:56
    around somebody who's never written a
  • 00:00:58
    code line of code in their life I can
  • 00:01:00
    just do the Step better welcome to the
  • 00:01:02
    pragmatic engineering podcast in this
  • 00:01:05
    show we cover software engineering at
  • 00:01:06
    Big Tech and startups from the inside
  • 00:01:09
    you'll get deep dives with experienced
  • 00:01:11
    engineers and Tech professionals who
  • 00:01:12
    share their hard-earned lessons
  • 00:01:14
    interesting stories and practical advice
  • 00:01:16
    that they have on building software
  • 00:01:18
    after each episode you'll walk away with
  • 00:01:20
    pragmatic approaches you can use to
  • 00:01:21
    build stuff whether you're a software
  • 00:01:23
    engineer or a manager of
  • 00:01:25
    engineers in this first episode we go
  • 00:01:27
    into a really timely topic using gen AI
  • 00:01:30
    for coding now there's no shortage of AI
  • 00:01:32
    companies hyping up their capabilities
  • 00:01:34
    but we'll size up all of that I turned
  • 00:01:37
    to longtime software engineer Simon
  • 00:01:38
    Willison who is safe to refer to as an
  • 00:01:41
    independent investigator of large
  • 00:01:42
    language models because he's been using
  • 00:01:44
    them so much to improve his personal
  • 00:01:46
    productivity for the last four years
  • 00:01:48
    with Simon we have a refreshingly honest
  • 00:01:50
    conversation on how these tools actually
  • 00:01:53
    work for us developers as of now we talk
  • 00:01:56
    about common llm use cases like
  • 00:01:58
    fine-tuning and RAG
  • 00:02:00
    Simon's day-to-day large language model
  • 00:02:02
    stack and misconceptions about large
  • 00:02:04
    language models this is the first
  • 00:02:06
    episode of many such deep Dives to come
  • 00:02:08
    subscribe to get notified of when new
  • 00:02:10
    episodes are out so Simon welcome to the
  • 00:02:13
    podcast hey it's really great to be here
  • 00:02:16
    so it's great to have you here you're an
  • 00:02:18
    experienced software engineer and you've
  • 00:02:20
    definitely been around the block so some
  • 00:02:22
    people will know you from your prolific
  • 00:02:24
    open source contributions co-creating
  • 00:02:26
    the Django framework uh which is a rapid
  • 00:02:28
    web development tool written in Python
  • 00:02:31
    uh you're also the creator of Datasette a
  • 00:02:34
    tool for exploring and publishing data
  • 00:02:36
    and then you're also a startup founder
  • 00:02:38
    right so I remember you were the the
  • 00:02:41
    founder of Lanyrd a conference directory
  • 00:02:44
    site which was funded by Y Combinator
  • 00:02:47
    acquired by Eventbrite and then you
  • 00:02:48
    were there for six years as an engineer
  • 00:02:50
    as a manager so you've really done all
  • 00:02:52
    all of the things open source founder
  • 00:02:54
    working at a large company yeah I got to
  • 00:02:57
    um I got to do the the the startup to
  • 00:02:59
    large company thing is is particularly
  • 00:03:01
    interesting you know like moving from
  • 00:03:03
    moving at the speed of a startup to
  • 00:03:05
    moving at the speed of a much larger
  • 00:03:06
    company where bugs matter and people
  • 00:03:08
    lose money if your software breaks when
  • 00:03:11
    I started
  • 00:03:12
    to notice you more is when around the
  • 00:03:15
    time when chat GPT came out and you were
  • 00:03:17
    very Hands-On in trying out what this
  • 00:03:20
    works for your development workflow you
  • 00:03:21
    shared a lot of things on your blog and
  • 00:03:25
    really this is what we're going to talk
  • 00:03:26
    about today uh your firsthand learnings
  • 00:03:28
    about how this AI development helps your
  • 00:03:31
    specific workflow where it doesn't help
  • 00:03:33
    and and what you've learned through this
  • 00:03:36
    how many years has it been two 3 years
  • 00:03:39
    of well um so I was on GPT 3 before chat
  • 00:03:42
    GPT came out so I'm at about I'm verging
  • 00:03:45
    on three years of using this stuff
  • 00:03:47
    frequently um it got exciting when chat
  • 00:03:50
    GPT came out gpt3 was interesting but
  • 00:03:52
    chat GPT that's when the whole world
  • 00:03:54
    started paying attention to it to kick
  • 00:03:57
    off I'm I'm interested
  • 00:04:00
    in how you got started with with these
  • 00:04:03
    large language model tools what what was
  • 00:04:05
    the you know first time you came across
  • 00:04:07
    them man and you're like all right let
  • 00:04:09
    me get as a goal so I've been paying
  • 00:04:11
    attention to the field of machine
  • 00:04:13
    learning on a sort of as a sort of like
  • 00:04:15
    side side interest for five or six years
  • 00:04:18
    I did the um the fast AI course Jeremy
  • 00:04:20
    Howard's course back in I think
  • 00:04:22
    2018 and then um when and then gpt2 came
  • 00:04:27
    out in was that 2019 20 yeah it's 2019
  • 00:04:31
    gpt2 was happening which was the first
  • 00:04:34
    of these models that you could see there
  • 00:04:36
    was something interesting there but it
  • 00:04:38
    was not very good like it could you
  • 00:04:40
    could give it text to sort of complete a
  • 00:04:42
    sentence and sometimes it would be
  • 00:04:43
    useful and I did an experiment back then
  • 00:04:45
    where I tried to generate New York Times
  • 00:04:48
    headlines for different decades by
  • 00:04:50
    feeding in say all the New York Times
  • 00:04:52
    headlines in the 1950s then the 1960s
  • 00:04:54
    and 1970s and then giving it stories to
  • 00:04:56
    complete now and I poked around for it
  • 00:05:00
    the the results were not exactly super
  • 00:05:02
    exciting um and I kind of lost interest
  • 00:05:05
    at that point to be honest and then gpt3
  • 00:05:08
    which came out in
  • 00:05:09
    2020 um but sort of began to be more
  • 00:05:12
    available in
  • 00:05:13
    2021 that's when things started getting
  • 00:05:15
    super interesting because GPT was the
  • 00:05:17
    first of these models that was large
  • 00:05:19
    enough that it could actually do useful
  • 00:05:21
    things and um one of the earliest code
  • 00:05:24
    things I was using it for was um I think
  • 00:05:26
    I was using it for jq the little JSON
  • 00:05:29
    programming language which I've
  • 00:05:31
    always found really difficult um it just
  • 00:05:33
    doesn't quite fit in my head and I was
  • 00:05:35
    finding that gpt3 if I prompted it in
  • 00:05:37
    the right way and this was a model where
  • 00:05:39
    you had to do the um the completion
  • 00:05:41
    prompt so you don't ask it a question
  • 00:05:42
    get an answer you say the JQ needed to
  • 00:05:44
    turn this into this is and then you stop
  • 00:05:47
    and you run that in the model and it
  • 00:05:49
    finishes the sentence which I think is
  • 00:05:51
    the reason most people weren't playing
  • 00:05:52
    with it it's a weird way of interacting
  • 00:05:54
    with something like in many ways the big
  • 00:05:57
    innovation of chat GPT was they had talk
  • 00:05:59
    they added a chat interface on top of
  • 00:06:01
    this model and so now you could you
  • 00:06:03
    didn't have to think in terms of
  • 00:06:05
    completions you could ask it a question
  • 00:06:06
    and get an answer back but yeah so it
  • 00:06:08
    was very clear back then sort of um and
  • 00:06:12
    that was running it for about 12 months
  • 00:06:13
    before chat GT came along there was
  • 00:06:15
    something really interesting about this
  • 00:06:17
    model and what it could do and that was
  • 00:06:19
    also the the point where it became clear
  • 00:06:21
    that code was actually something it was
  • 00:06:23
    surprisingly good at and this um I
  • 00:06:25
    talked to somebody open AI I asked them
  • 00:06:27
    it's like were you expecting it to be
  • 00:06:28
    good at code and they said you know we
  • 00:06:30
    thought maybe but it wasn't one of our
  • 00:06:32
    original goals like the original goals
  • 00:06:34
    of these models were much more things
  • 00:06:35
    like translation from one human language
  • 00:06:37
    to another which um which they do
  • 00:06:39
    incredibly well um but when you think
  • 00:06:41
    about it the fact that they can write
  • 00:06:43
    code well isn't that surprising because
  • 00:06:45
    code is so much simpler than like
  • 00:06:47
    English or Chinese or German like we put
  • 00:06:50
    it together what we know I I think it's
  • 00:06:53
    it's it's pretty obvious and I think you
  • 00:06:54
    know we'll talk about implications but
  • 00:06:57
    let's just jump a little bit ahead so I
  • 00:06:59
    think like I personally had a wow this
  • 00:07:01
    is amazing moment with uh llms and then
  • 00:07:04
    I've also had a bit of a like scared
  • 00:07:06
    moment of like is this could this
  • 00:07:10
    actually replace part of what I do or
  • 00:07:13
    not and you had a really interesting
  • 00:07:15
    story with that a proper like this is
  • 00:07:17
    scary moment can can you talk about that
  • 00:07:20
    I mean I've definitely I've had a few of
  • 00:07:22
    those I think every every programmer who
  • 00:07:24
    works with these models the first time
  • 00:07:26
    it spits out like 20 lines of actually
  • 00:07:28
    good code that solves your problem and
  • 00:07:30
    does it faster than you would there's
  • 00:07:32
    that moment when you're like hang on a
  • 00:07:33
    second what am I even for but I had a a
  • 00:07:36
    bigger version of that with um actually
  • 00:07:38
    with my my main open source project so I
  • 00:07:40
    I built this tool called Datasette which
  • 00:07:42
    is a uh it's a interface for querying
  • 00:07:45
    databases and um like analyzing data
  • 00:07:47
    creating Json apis on top of data all of
  • 00:07:50
    that kind of stuff and the thing I've
  • 00:07:51
    always been trying to solve with that is
  • 00:07:53
    I feel like every human being should be
  • 00:07:54
    able to ask questions of databases like
  • 00:07:57
    it's absurd that everyone's got all of
  • 00:07:59
    this data about them but we don't give
  • 00:08:00
    them tools that let them actually you
  • 00:08:02
    know dig in and explore it and and
  • 00:08:04
    filter it and try and answer questions
  • 00:08:05
    that way and then I tried this new
  • 00:08:07
    feature of um chat GPT that they
  • 00:08:10
    launched last year called code
  • 00:08:11
    interpreter mode this is the thing where
  • 00:08:14
    chat GPT you can ask a question it could
  • 00:08:16
    write some python code and then it can
  • 00:08:18
    execute that python code for you and use
  • 00:08:20
    the result to continue answering your
  • 00:08:22
    question and code interpreter mode has a
  • 00:08:25
    feature where you can upload files to it
  • 00:08:27
    so I uploaded a sqlite database file to
  • 00:08:29
    it like just the same database files
  • 00:08:31
    that I use in my own software and I
  • 00:08:33
    asked it the question and it flawlessly
  • 00:08:35
    answered it by composing the right SQL
  • 00:08:37
    query running that using the Python
  • 00:08:39
    SQLite library and spitting out the
  • 00:08:41
    answer and I sat there looking at this
  • 00:08:42
    thinking on the one hand this is the
  • 00:08:44
    most incredible example of like being
  • 00:08:46
    able to ask questions of your data that
  • 00:08:49
    I've ever seen but on the other hand
  • 00:08:51
    what am I even for like I thought my
  • 00:08:52
    life's purpose was to solve this problem
  • 00:08:55
    and this thing this new tool is solving
  • 00:08:57
    my problem without even really thinking
  • 00:08:59
    about it like they didn't mention oh it
  • 00:09:01
    could do sqlite SQL queries as part of
  • 00:09:03
    what it does it's just like python um
  • 00:09:06
    and that was fun and well no that was a
  • 00:09:08
    little bit existential dread and the way
  • 00:09:11
    I've been coping with that is thinking
  • 00:09:12
    okay well my software needs to be better
  • 00:09:15
    than chat GPT code interpreter at this
  • 00:09:17
    particular problem if I mix AI features
  • 00:09:19
    into it so I've started exploring what
  • 00:09:21
    the plugins for my software look like
  • 00:09:23
    that add large language model based like
  • 00:09:26
    build run a SQL query against this
  • 00:09:27
    schema all of that kind of stuff but
  • 00:09:29
    it's interesting like it did very much
  • 00:09:31
    change my mental model of the problem
  • 00:09:33
    that I was trying to solve because it
  • 00:09:35
    took such a big bite out of that problem
  • 00:09:37
    this episode is brought to you by Codeium
  • 00:09:40
    the AI tool of choice for professional
  • 00:09:42
    software developers that is
  • 00:09:46
    Codeium Codeium removes tedium from your
  • 00:09:48
    development through a suite of
  • 00:09:49
    state-of-the-art AI capabilities
  • 00:09:52
    available via extensions and all of your
  • 00:09:53
    favorite IDEs such as VS Code JetBrains
  • 00:09:56
    Visual Studio Eclipse Xcode Neovim
  • 00:09:59
    Jupyter notebooks and more uniquely
  • 00:10:02
    Codeium is fully enterprise ready as
  • 00:10:04
    proof it has multiple regulated Fortune
  • 00:10:06
    500 companies counted within its
  • 00:10:08
    thousands of enterprise customers join
  • 00:10:11
    the 700,000 developers using Codeium's
  • 00:10:13
    individual free plan and ask your
  • 00:10:15
    companies to consider a free trial of
  • 00:10:16
    the Enterprise plan to learn more about
  • 00:10:19
    Codeium visit
  • 00:10:21
    codeium.com that is
  • 00:10:24
    codeium.com/
  • 00:10:27
    pragmatic and what I noticed is you have
  • 00:10:29
    been experimenting a lot with trying out
  • 00:10:31
    how different llms will work you've been
  • 00:10:34
    running models locally you've been
  • 00:10:36
    obviously trying a lot of like you know
  • 00:10:37
    there's the usual suspect tools but but
  • 00:10:39
    even beyond that c can you share a
  • 00:10:41
    little bit on how your initial
  • 00:10:44
    Impressions were because you you were
  • 00:10:46
    already on the early versions of the
  • 00:10:47
    tool from from chat GPT to co-pilot to
  • 00:10:50
    some other things and how your stack has
  • 00:10:52
    changed or refined to actually make you
  • 00:10:55
    more productive because it sounds like
  • 00:10:57
    you are more productive now yes very
  • 00:11:00
    much so I mean yeah so I've I've been
  • 00:11:01
    calling myself an independent researcher
  • 00:11:04
    when when it comes to this kind of stuff
  • 00:11:06
    because I've got the time to to I can
  • 00:11:08
    dig into these things I write a lot like
  • 00:11:10
    I've been blogging about this since when
  • 00:11:13
    since when I first started investigating
  • 00:11:14
    it and yeah I mean um like I said gpt3 I
  • 00:11:17
    was basically using it through their
  • 00:11:19
    playground interface which still exists
  • 00:11:21
    today it's the the the the API debugging
  • 00:11:23
    tool for this stuff um and it was fine
  • 00:11:26
    like and I was using it to solve I
  • 00:11:29
    experimented with having it like write
  • 00:11:31
    documentation but I've always felt a bit
  • 00:11:33
    funny about publishing words that I
  • 00:11:34
    didn't write because I because I do so
  • 00:11:36
    much writing myself um and little bits
  • 00:11:39
    and pieces of code but I didn't really
  • 00:11:40
    get into the coding side until after
  • 00:11:43
    chat GPT came out and I did the Advent
  • 00:11:46
    of code that December and the sort of
  • 00:11:48
    monthlong programming challenge this was
  • 00:11:50
    2022 December right yes
  • 00:11:53
    November 30th is when chat GPT came out and
  • 00:11:56
    so I spent December trying to learn rust
  • 00:11:58
    with it as an
  • 00:11:59
    assistant which didn't it was
  • 00:12:03
    interesting I got a reasonably Long Way
  • 00:12:05
    rust is actually I still don't know rust
  • 00:12:07
    rust the the memory management in Rust
  • 00:12:10
    is just difficult enough that language
  • 00:12:12
    models still have trouble with it like
  • 00:12:14
    one of my test of a new language model
  • 00:12:16
    is okay can it explain the rust rust
  • 00:12:18
    borrowing to me and they're getting to a
  • 00:12:21
    point where I'm almost understanding it
  • 00:12:23
    but it's it's an interesting sort of
  • 00:12:24
    stress test for this whereas if you use
  • 00:12:26
    these models for JavaScript and python
  • 00:12:28
    they're pH Nally good there's so much
  • 00:12:30
    more training data about JavaScript and
  • 00:12:32
    python out there than there is for for a
  • 00:12:33
    language like rust that honestly they
  • 00:12:36
    they they just completely sing and
  • 00:12:37
    that's great for me because the code the
  • 00:12:40
    the languages I use every day are Python
  • 00:12:42
    and JavaScript and SQL and those are the
  • 00:12:45
    three languages that language models are
  • 00:12:47
    best at so I'm perfectly positioned to
  • 00:12:50
    have these things be be useful and
  • 00:12:51
    helpful for me and I've also got an I I
  • 00:12:55
  • 00:12:57
    I also tend to pick, like I said, boring technology, like Django, which the language models know already. If you stick with Django, they're going to be able to do pretty much anything you ask of them. But yeah, I tried learning Rust, and that was a really good exercise in trying these things out every day and seeing what could happen.
  • 00:13:16
    One of the key things I've learned, which I think people don't necessarily acknowledge, is that these things are really difficult to use. It's not just skill; there's a lot of intuition you have to build up in order to use them effectively. If you just sit down and ask a question the way you'd ask on Stack Overflow, you'll probably not get a great response, and a lot of people do that and then write the whole thing off: it didn't give me what I wanted, this is all hype, there's no value here.
  • 00:13:44
    The trick is, firstly, you have to learn how to prompt them, and, more importantly, you have to learn what kinds of things they're good at and what kinds of things they're bad at. I know, because I've spent so much time with them, that they're great at Python and JavaScript and not quite as good at Rust yet. I know you shouldn't ask them about current events, because they have a training cutoff in terms of what they understand. I know they're terrible at math and logic puzzles; don't ask them to count anything. Which is bizarre, because computers are really good at maths and counting and looking things up, and for language models those are the three things they're not good at, and they're supposedly our most advanced computers. So you have to build this quite intricate mental model of what these things can do and how to get them to do those things. If you build that mental model, if you put the work in, you can really scream along with them: you can work so quickly at solving specific problems, because you can say, oh, this is the kind of thing a language model can do, and then you just outsource it to what I sometimes call my weird intern. Whereas for other things you're like, okay, it's not even worth trying this on a language model, because I know from past experience that it won't do a good job with it.
  • 00:14:51
    As software engineers we do have a bit of an engineering mindset, and when you see a new technology, and clearly this one is here, it's not going away, there are two ways you can look at it. One is the way you described: you start playing with it, you start stress-testing it, you see where it works and where it doesn't. The other is you start from theory: you understand how it's built, how it works, what's behind the scenes, and then you start probing. I think this is a little bit like the way computer science is taught. When I went to university for computer science, we started with algebra and formal methods and languages, and coding came a little later; we got there by the end, and you think, well, I guess I now know what happens underneath the compiler. But obviously there's the other route as well. In your case, it sounds like you jumped straight into "let me see how this actually works" and didn't overthink the theory, which at the time was a bit unclear, right?
  • 00:15:51
    Now, if you start with the theory, it will hold you back. With this specific technology it's weirdly harmful to spend too much time trying to understand how they actually work before you start playing with them, which is very unintuitive. I have friends who say that if you're a machine learning researcher, if you've been training models and such for years, you're actually at a disadvantage when you start using these tools compared to someone who comes in completely fresh, because they're very weird: they don't react like other machine learning models. Machine learning people always jump straight to fine-tuning, and fine-tuning on these things is mostly a waste of time. It takes people a long time to get to the point of realizing, you know what, there's no point in fine-tuning my own custom version of this, because next month a better model will come out anyway.
  • 00:16:42
    Just to break down fine-tuning, because we hear this word a lot: by fine-tuning, you mean that you take the model and then you add more training, you run more training cycles? It's a very confusing term.
  • 00:16:59
    Yeah. The idea with fine-tuning is that you take an existing model, maybe one of the openly licensed models, or, I think Claude has this now, and OpenAI has APIs where you can upload a CSV file of a million examples, spend a lot of money with them, and they will give you a model fine-tuned on that. It sounds so tempting; everyone thinks, wow, I could have a model that's perfectly attuned to my specific needs. But it's really difficult to do, it's really expensive, and for most of the things people want to do it turns out it doesn't actually solve the problem. Lots of people think: I want the model to know about my company's internal documentation so it can answer questions about it, so surely I fine-tune a model to solve that. It turns out that just plain doesn't work, because the weight of all the existing knowledge the model has completely overwhelms anything you try to add with fine-tuning. The models actually hallucinate more on questions about things if you've done that extra fine-tuning step to add knowledge, which is surprising.
  • 00:18:00
    Where fine-tuning does work is for tasks. If you want a model that's just really good at SQL, you can give it 10,000 examples of "here's a human question and a SQL schema, and here's the SQL query," and that will give you a model that is stronger at that kind of activity. But for adding new facts into the model, it just doesn't work, which confuses people.
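The task-style training data described here is usually a JSONL file of example pairs. Below is a small sketch of what that might look like; the field names ("prompt"/"completion") and the schema-in-prompt layout are illustrative assumptions, not any specific vendor's format.

```python
# Illustrative sketch of task-style fine-tuning data for SQL generation:
# thousands of (schema + question) -> query pairs, one JSON object per
# line. Field names here are assumptions; each provider defines its own
# training-file format.
import json

examples = [
    {
        "prompt": (
            "Schema: CREATE TABLE users (id INTEGER, name TEXT, signed_up DATE);\n"
            "Question: How many users signed up in 2023?"
        ),
        "completion": (
            "SELECT COUNT(*) FROM users "
            "WHERE signed_up BETWEEN '2023-01-01' AND '2023-12-31';"
        ),
    },
]

def to_jsonl(rows):
    # One JSON object per line, the usual fine-tuning upload format.
    return "\n".join(json.dumps(r) for r in rows)
```

In practice you would write thousands of such pairs to a file and upload it to whichever fine-tuning API you use.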
  • 00:18:20
    And so then you have to look at other techniques for solving that problem. There's a thing called RAG, which is a very fancy acronym for a very simple trick. It stands for retrieval-augmented generation, and all it means is: the user asks a question, you search your documentation for things that might be relevant to that question, you copy and paste the whole lot into the model (these models can take quite a lot of input now), and then you put the user's question at the end. That's it. Super simple.
  • 00:18:48
    It's so simple. I actually wrote an article about it, and one of the people who guest-wrote for me built an open-source tool to do your own RAG setup, and you could plug in ChatGPT. I tried it, I understood the code, and the code itself was very simple. I was like, is this all there is to it? You break the documents up into chunks, you get embeddings so you can figure out what the search should return, and then you just add in that extra context. Obviously you can go down the rabbit hole, but for simple RAG you mostly just decide on the context window size. I was amazed at how well it worked. It seemed so simple that I looked at the code and thought, well, I'm not expecting much, and when I tried it out it worked really well. It's one of those counterintuitive things.
  • 00:19:36
    Yeah, so RAG is the "hello world" of building software on top of LLMs: you don't get it to print hello world, you get it to answer questions about your documentation. I've implemented it in about 30 lines of Python, and I've got one version that's about two dozen lines of bash.
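A 30-line version of the pattern can be sketched like this. This is a minimal illustration, not Simon's actual code: naive keyword overlap stands in for real search or embeddings, and the final call to a model is left out.

```python
# Minimal RAG sketch: chunk the docs, score chunks against the question,
# and assemble a prompt. Keyword overlap is a stand-in for real search
# or embeddings; the assembled prompt is what you would send to an LLM.

def chunk(text, size=200, overlap=20):
    """Split a document into overlapping character chunks."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def score(passage, question):
    """Crude relevance score: how many question words appear in the passage."""
    q = set(question.lower().split())
    return sum(1 for w in passage.lower().split() if w in q)

def build_prompt(docs, question, top_n=2):
    passages = [c for d in docs for c in chunk(d)]
    best = sorted(passages, key=lambda p: score(p, question), reverse=True)[:top_n]
    return "Context:\n" + "\n\n".join(best) + "\n\nQuestion: " + question
```

Swapping `score` for embedding similarity or full-text search is where real systems start to differ, and where most of the production effort goes.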
  • 00:19:50
    It's very easy to get the basic version working, but getting good RAG working is incredibly difficult. The problem is that if you built the system and you know how it works, you're naturally going to ask questions of it in the right kind of format. The moment you expose it to real human beings, they will come up with an infinite quantity of weird ways to ask questions. And so the art of building good RAG systems, the reason it can take six months to actually get one production-ready, is figuring out all the different ways it can go wrong. The key trick in RAG is always: how do we fill that context? How do we pick the information most relevant to what the user is asking? That's really hard. It's an information retrieval problem; it's what search engineers have been trying to figure out for 30 years, and there's a lot of depth to that field. So RAG, just like everything else in language models, is fractally interesting and complicated: it's simple at the top, and then each little aspect of it gets more and more involved the further you look.
    more involved the further you look one
  • 00:20:48
    of my favorite difficult problems in
  • 00:20:50
    this is um what's called in the industry
  • 00:20:52
    evals right automated evaluations
  • 00:20:54
    because when you're writing software we
  • 00:20:56
    write automated tests we write unit
  • 00:20:57
    tests and they intive our software works
  • 00:20:59
    and that's great you can't do that with
  • 00:21:01
    language models because they're
  • 00:21:03
    non-deterministic like they they they
  • 00:21:06
    very rarely return exactly the same
  • 00:21:07
    answer so we don't even have unit
  • 00:21:09
    testing but with with things like rag we
  • 00:21:12
    need to have automated tests that can
  • 00:21:14
    tell us okay we tweaked our algorithm
  • 00:21:16
    for picking content is it better like
  • 00:21:19
    does that do a better job of answering
  • 00:21:21
    questions it's really difficult I'm
  • 00:21:23
    still trying to figure out the right
  • 00:21:24
    path this myself and I I talk with
  • 00:21:26
  • 00:21:28
    I talked with someone who's working at an AI company, and the weird thing, the thing that breaks everything we know, is that they have this eval test suite that runs against their model whenever they make a change, and she told me it costs them $50 to run it every single time. Wow. This is just something I don't think we've been used to. As a software engineer I run my unit tests and my integration tests and I know how much time it costs me, but suddenly, because they're using different vendors' APIs, there's a real dollar cost. This clearly used to be a thing before my time, back when there were servers or mainframes and computing time was expensive, but suddenly it's yet another interesting variable.
  • 00:22:13
    Yeah, so you don't want to run those on every commit to your repository; that would bankrupt you pretty quickly. It's also funny that with evals, one of the most common techniques is what's called "LLM as a judge." Say you're building a summarizer: here's an article I want summarized, and here's the summary. How can you write tests against a summary to check that it's actually good? What a lot of people do is outsource that to another model: they produce two summaries and then ask GPT-4, which of these two summaries is best? And I find that so uncomfortable. This stuff is all so weird and difficult to evaluate already, and now we're throwing in another layer of weird language models to score our previous language models. But those are the kinds of options we're exploring at the moment.
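The LLM-as-a-judge setup can be sketched as follows. The prompt wording is an illustrative assumption, and the judge is passed in as a callable so that a real model API call can be substituted; this sketch makes no API calls itself.

```python
# Sketch of "LLM as a judge": ask a second model to pick the better of
# two candidate summaries. judge is any callable that takes a prompt
# string and returns "A" or "B" (in practice, a real model API call).

def make_judge_prompt(article, summary_a, summary_b):
    return (
        "Compare two summaries of the same article.\n\n"
        "Article:\n" + article + "\n\n"
        "Summary A:\n" + summary_a + "\n\n"
        "Summary B:\n" + summary_b + "\n\n"
        "Answer with exactly one letter, A or B, for the better summary."
    )

def win_rate(cases, judge):
    """Fraction of (article, ours, baseline) cases where the judge picks ours."""
    wins = sum(
        1 for article, ours, baseline in cases
        if judge(make_judge_prompt(article, ours, baseline)).strip() == "A"
    )
    return wins / len(cases)
```

One known pitfall: judge models can favor whichever summary appears first, so real harnesses often run each comparison twice with the order swapped and average the results.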
  • 00:22:58
    Yeah, it's interesting. Speaking of options: you've experimented a lot with different tools, including building your own, and obviously Copilot and other models; I saw you mention Claude, for example. What is your current LLM stack, and day to day, how do you use it for actually coding on Datasette or your other projects?
  • 00:23:23
    My default model right now is Claude 3.5 Sonnet, which is brand new; it came out maybe three weeks ago.
  • 00:23:29
    I heard it's amazing for coding.
  • 00:23:31
    It's amazing for everything. It's the first time somebody other than OpenAI has had the clearly best model; it's just better than OpenAI's best available models at the moment. The company behind it, Anthropic, is actually a splinter group from OpenAI. They split a couple of years ago, and apparently it's because they tried to get Sam Altman fired, which you can't do; we saw that happen publicly six months ago. But they were early: two and a half years ago they tried to get Sam Altman fired, it didn't work, they quit and spun up their own company, and they were some of the people who built GPT-4. So it's essentially the original GPT-4 team. But anyway, Claude 3.5 Sonnet is unbelievably good, and it's my default for most of the work that I'm doing.
  • 00:24:21
    I still use GPT-4o, which is probably OpenAI's best available model, mainly for two features. It's got Code Interpreter mode, this thing where it can write Python code and then execute that Python, so sometimes I'll throw a fiddly problem at it and watch it try five or six times until it works; I just sit there and watch it go through the motions. I use that a lot. And then ChatGPT has the voice mode, which I use when I'm walking my dog, because you can stick in a pair of AirPods, go for an hour-long walk with the dog, and talk to this weird AI assistant and have it write you code, because it can do coding, and it can look things up on the internet and such. So you can have a very productive hour-long conversation while you're walking the dog on the beach.
  • 00:25:08
    This I was not expecting, I'll be honest. That is the most dystopian sci-fi-future thing, the voice mode.
  • 00:25:16
    And this isn't the fancy new voice mode they demoed a few weeks ago; this is the one they've had for about six months. It's so good. The intonation, the voice... it's like having a conversation with an intern who can go look things up for you.
  • 00:25:32
    So you mentioned the stack, but if I imagine your day, you've got your terminal, your code editor...
  • 00:25:38
    There's more to my stack. Those are the ones I'm using in my browser and on my phone. I use GitHub Copilot; I've always got that turned on. And I've been building this open-source tool called llm, which is a command-line tool.
  • 00:25:50
    Just a question on Copilot, since it now has competing features: it has a chat window if you want to use that, and it has autocomplete. Which do you find most useful for your use cases?
  • 00:26:00
    Mostly autocomplete, like old-school Copilot. I've recently started using the thing where you can select some lines of code, click the little sparkly icon, and give it a prompt to run against those lines, and it'll do that. I don't use the chat window at all; I use Claude in the browser for what I would use that for. And it's great.
  • 00:26:23
    Copilot is another interesting one where you hear from people who say, "I turned it on and it just gave me a bunch of junk, so I turned it off again, because it's clearly not useful." Again, with Copilot you have to learn how to use it. There's no manual for any of this stuff, especially not for Copilot. You have to learn things like: if you type out the start of a function name and give it clearly named parameters with their types, or type annotations, it will complete the function for you. And if you add a comment, you essentially prompt it through the comments that you write.
  • 00:26:53
    I've actually started to use that. Again, no one tells you this, but once you figure it out it can be powerful, because that's how you can generate either a small part (for me, usually just a small part) or a whole function; it just gets it. And, not surprisingly, the more context you give in the comment, the more it will do what you want, if you're lucky.
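The comment-plus-signature technique looks something like this. The function name and parameters are illustrative, and the body is written out by hand here as the kind of completion you would hope the model produces from the comment and signature alone.

```python
# Comment-as-prompt illustration: a descriptive comment plus a typed
# signature with clearly named parameters gives the model enough
# context to complete the body. The body below is a hand-written
# example of a plausible completion, not model output.

# Return the orders whose total exceeds threshold, sorted by total
# descending.
def filter_large_orders(orders: list, threshold: float) -> list:
    large = [o for o in orders if o["total"] > threshold]
    return sorted(large, key=lambda o: o["total"], reverse=True)
```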
  • 00:27:14
    The other thing to know about Copilot is that it's actually running RAG. It's got an incredibly sophisticated retrieval mechanism where, every time it does a completion for you, it tries to include context from nearby in your file, but it also looks for other files in your project that have similar keywords in them. That's why sometimes your tests...
  • 00:27:38
    That's really interesting, because, and we're going to get to the misconceptions, we've been running an AI survey, and one of the things people really complain about is: "I use Copilot because it's the easiest one to turn on in your IDE, but it only uses my files; I wish it would look at, or understand, the whole project." I think a lot of people don't realize that it is trying to do this in smart ways; a lot of people assume it only looks at whatever you're seeing on the screen.
  • 00:28:06
    No, it is looking at bits of other files, but it's undocumented, and it's weird, and it's trying to do semantic similarity and all of that.
  • 00:28:15
    What I do a lot of is copy and paste a chunk of one file into a comment in another, so that it's definitely visible to Copilot. That's great for things like writing tests: you can literally copy the code that you're testing into your test.py and then start typing.
  • 00:28:31
    So I'm now starting to understand what you meant when you said you need to learn how to use it. It sounds like you're coming at it from the other direction, instead of trying it out once and saying yea or nay. And because you work for yourself, it makes sense that you want to make yourself productive: you figure out how these things can actually make you more productive, right?
  • 00:28:48
    Absolutely. And it's so much work. I think the biggest misconception about all of this is that you'll get the tool and it'll make you productive on day one, and it absolutely won't. You have to put in so much effort to explore it, experiment, and learn how to use it, and there's no guidance. Like I said, Copilot doesn't have a manual, which is crazy. Claude, to its credit, is the only one of these that actually has really good documentation: if you want to learn how to prompt LLMs, the Anthropic prompting guide is the best thing I've seen anywhere. OpenAI have almost nothing. And there's so much hype, blogs and tweets and LinkedIn posts full of junk advice, all those things like "always tell it that you are the world's greatest expert in X before you ask," which is mostly rubbish. But there's so much superstition, because this stuff isn't documented, and even the people who created the models don't fully understand how they do what they do, so it's very easy to form superstitions. You try the "you're the world's greatest expert in Python" thing and you get a good answer, so you're like, okay, I'll do that from now on. It's kind of like, if your dog finds a hamburger in a bush, every time you walk past that bush for the next two years they will check for a hamburger, because dogs are very superstitious. And it's that, but for software engineering.
  • 00:30:09
    And then, going back to your stack: a couple of tools so far, but there are a few more?
  • 00:30:15
    There are a few more. I talked about Code Interpreter. One of my favorite Claude features, again from a few weeks ago, is called Artifacts, which is this thing where Claude can write HTML and CSS and JavaScript and then show it to you in a little secure iframe, so it can build you tools and interfaces and prototypes on demand. It's quite limited: it can't make API calls from in there, and it can't actually see the results, so it doesn't have that debug loop that Code Interpreter has. But still, it's amazing. I've redesigned pages on my blog by pasting in a screenshot of my blog and saying, "suggest a better color scheme for this and show me a prototype as an artifact," and it did. So cool. So I'm doing a lot more front-end stuff now, because I can get Claude to build me little interactive prototypes along the way to help speed that up. I'm spending a lot of time with that.
    lot of time with that I have my my
  • 00:31:09
    command line tool llm lets you run
  • 00:31:10
    prompts from the command line and the
  • 00:31:12
    key feature of that is that you can pipe
  • 00:31:14
    things into it so I can like cat a file
  • 00:31:17
    into that and say llm write the tests
  • 00:31:20
    and it will output test for that and
  • 00:31:22
    then just understand you just build like
  • 00:31:24
    it's a command line are you running a
  • 00:31:25
    local model or somewhere a model ser
  • 00:31:28
  • 00:31:31
    llm, the tool, is based around plugins, and it can talk to over a hundred different models. It's an open-source tool; it's my big open-source language-model command-line project.
  • 00:31:40
    We'll link it in the show notes as well.
  • 00:31:42
    It's plugin-based: originally it could just do OpenAI, then I added plugins, and now it can run local models and talk to other models too. I mainly use it with Claude, because that's the best available model, but I've also run Microsoft's Phi-3 and Llama and Mistral and others. I can run those locally, which, to be honest, I don't do on a day-to-day basis, because they're just not as good. The local models are very impressive, but the best of the best models run circles around them, so when I'm trying to be productive I'm mostly working with the best available models. I love running the local models for research and for playing around, though, and they're also a great way to learn more about how language models actually work and what they can do.
    um people talk about hallucination a lot
  • 00:32:31
    I think it's really useful to have a
  • 00:32:33
    model hallucinate at you early because
  • 00:32:35
    it helps you get that better mental
  • 00:32:37
    model of of of what it can do and the
  • 00:32:39
    local models hallucinate wildly so if
  • 00:32:41
    you really want to learn more about
  • 00:32:43
    language models running a tiny little
  • 00:32:45
    like some of them are like two or three
  • 00:32:47
    gigabyte files that you can run on a
  • 00:32:48
    laptop I've got one that runs on my
  • 00:32:50
    phone which is actually really surprising
  • 00:32:52
    yeah um there's an app called MLC
  • 00:32:56
    Chat and it can run Microsoft's Phi-3 and
  • 00:33:01
    Google's Gemma and it's got Mistral 7B
  • 00:33:04
    these are very good models like if you
  • 00:33:06
    ask them who is Simon
  • 00:33:08
    Willison they will make up things that's
  • 00:33:10
    a great one I use ego searches
  • 00:33:13
    to basically see how much they
  • 00:33:15
    hallucinate they'll say he was
  • 00:33:16
    the CTO of GitHub and I'm like well I
  • 00:33:18
    really wasn't but I do use GitHub um but
  • 00:33:22
    like I've used these on planes
  • 00:33:24
    they're good enough at Python that I can
  • 00:33:25
    use them to look up little bits of
  • 00:33:27
    API documentation I can't remember and
  • 00:33:29
    things like that um and it runs on your
  • 00:33:31
    phone it's really fun yeah awesome so
  • 00:33:35
    like looking back you've now been coding
  • 00:33:37
    for more than 20 years right I mean
  • 00:33:40
    professionally people have
  • 00:33:42
    been paying me for 20 years at this
  • 00:33:43
    point people have been paying you for 20 years so
  • 00:33:45
    throughout this time you know we
  • 00:33:46
    have seen some increases in
  • 00:33:50
    productivity whether that be Firebug
  • 00:33:51
    coming out for developers or other
  • 00:33:54
    things could you talk
  • 00:33:56
    through what the kind of bumps were
  • 00:33:58
    when you became more productive as a
  • 00:34:00
    developer and then when we get to LLMs
  • 00:34:02
    how this bump compares to
  • 00:34:04
    those ones I love that you mentioned
  • 00:34:06
    Firebug because that was a big bump
  • 00:34:08
    right yeah um Firebug was
  • 00:34:12
    the Chrome Dev tools before browsers had
  • 00:34:15
    them built in it was an extension for
  • 00:34:16
    Firefox that added essentially what you
  • 00:34:18
    recognize as the developer tools now
  • 00:34:20
    and that was an absolute revelation when
  • 00:34:22
    it came out especially for me because
  • 00:34:25
    I've spent most of my career as a python
  • 00:34:27
    programmer my favorite feature of python
  • 00:34:29
    is the interactive prompt I love being
  • 00:34:31
    able to code by writing a line of code
  • 00:34:34
    and hitting enter and seeing what it
  • 00:34:35
    does and then you end up copying and
  • 00:34:36
    pasting a bunch of those explorations
  • 00:34:38
    into a file but you know that it's going
  • 00:34:40
    to work because you worked on it
  • 00:34:41
    interactively Firebug instantly brought
  • 00:34:43
    that to JavaScript like suddenly you
  • 00:34:45
    could interactively code against a live
  • 00:34:47
    web page and figure things out that way
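The Python workflow Simon describes can be sketched like this (an illustrative session, assumed rather than taken from the episode): each line is tried at the interactive prompt first, then pasted into a file once it is known to work.

```python
# Each of these lines was first tried at the interactive python3 prompt,
# then pasted here once it visibly did the right thing.
text = "the quick brown fox"
words = text.split()           # checked at the prompt: ['the', 'quick', 'brown', 'fox']
longest = max(words, key=len)  # checked at the prompt: 'quick'
print(longest)                 # prints: quick
```

Firebug brought exactly this explore-then-commit loop to JavaScript by putting a console next to a live page.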
  • 00:34:48
    so that was a big one um I think the
  • 00:34:51
    biggest yeah I think just as a reminder
  • 00:34:54
    because some listeners were
  • 00:34:55
    not necessarily around but before Firebug I
  • 00:34:57
    was doing web development and the way
  • 00:34:59
    you debugged your JavaScript
  • 00:35:00
    applications which were pretty simple at
  • 00:35:02
    the time was you did alerts to show values
  • 00:35:05
    we didn't even have
  • 00:35:07
    console.log console.log was invented by Firebug
  • 00:35:10
    yeah so it was just really painful and
  • 00:35:12
    really hard to debug and you also
  • 00:35:14
    couldn't really inspect the elements so
  • 00:35:15
    you were changing it it was like doing
  • 00:35:17
    it in the dark and as you say it
  • 00:35:19
    was a game changer and now these days
  • 00:35:21
    Chrome developer tools is better than
  • 00:35:23
    what Firebug used to be but Firebug was
  • 00:35:25
    almost as good as the Chrome
  • 00:35:27
    developer tools are
  • 00:35:28
    today in my memory at least so it was
  • 00:35:30
    this huge leap and I think for
  • 00:35:32
    frontend developers it's hard to tell
  • 00:35:34
    how much more but I'm sure at least you
  • 00:35:36
    know twice the productivity I'll just
  • 00:35:38
    say that because it took so
  • 00:35:39
    much longer to fix things or to
  • 00:35:41
    understand why things were happening so
  • 00:35:43
    yeah that was a big jump so
  • 00:35:45
    Firebug's a good one the biggest
  • 00:35:46
    productivity boost of my entire career is
  • 00:35:48
    just open source generally like it
  • 00:35:50
    turns out 25 years ago you had to really
  • 00:35:54
    fight to use anything open source at all
  • 00:35:56
    like a lot of companies had blanket bans
  • 00:35:58
    on open source code like Microsoft
  • 00:36:02
    were making the
  • 00:36:05
    case that this is a very risky thing for
  • 00:36:07
    you to even try that's completely gone
  • 00:36:08
    out of the window I don't think there's
  • 00:36:10
    a company left on Earth that can have
  • 00:36:11
    that policy now because how are you
  • 00:36:13
    going to write any front end code
  • 00:36:14
    without npm you know that's
  • 00:36:17
    all gone but um so it was
  • 00:36:19
    open source as a concept and I was very
  • 00:36:21
    early on in open source you know Django
  • 00:36:23
    we open-sourced Django in 2005
  • 00:36:26
    Python and PHP and so forth all came out
  • 00:36:28
    of the open source community and that
  • 00:36:31
    was huge because prior to open source
  • 00:36:33
    the way you wrote software is you sat
  • 00:36:35
    down and you implemented the same thing
  • 00:36:37
    that everyone else had already built or
  • 00:36:39
    if you had the money you bought
  • 00:36:40
    something from a vendor but good luck
  • 00:36:43
    buying a decent thing and then of course
  • 00:36:44
    you can't customize it because it's
  • 00:36:46
    proprietary so that was open source and
  • 00:36:49
    then on top of open source as a
  • 00:36:51
    concept it really was GitHub coming
  • 00:36:54
    along massively accelerated open source
  • 00:36:56
    because prior to that it was
  • 00:36:58
    SourceForge and mailing lists and CVS and
  • 00:37:01
    Subversion and just starting a new
  • 00:37:04
    project was a lot of work I started open
  • 00:37:05
    source projects where I had to start by
  • 00:37:07
    installing Trac which meant I needed to
  • 00:37:09
    run a virtual private server and then
  • 00:37:11
    get Linux secured and then install
  • 00:37:14
    the open source alternative to what GitHub
  • 00:37:15
    became it was great software but it
  • 00:37:17
    was not exactly a one-click experience um
  • 00:37:21
    so open source was absolutely huge and
  • 00:37:23
    then you had GitHub making open source
  • 00:37:25
    way more productive and accessible
  • 00:37:27
    massively accelerating then the package
  • 00:37:29
    managers so PyPI for Python and npm
  • 00:37:32
    for JavaScript and I mean the OG of
  • 00:37:35
    that was CPAN for Perl which was
  • 00:37:38
    up and running in the late 90s we
  • 00:37:40
    owe so much to CPAN and
  • 00:37:43
    sort of how it made that kind of thing
  • 00:37:45
    happen you know today the productivity
  • 00:37:48
    boost you get from just being able to
  • 00:37:49
    pip install or npm install a thing that
  • 00:37:51
    solves your problem I think my hunch
  • 00:37:54
    is that developers who grew up with
  • 00:37:56
    that already being there have no idea how much of
  • 00:37:59
    a difference that makes like when I did
  • 00:38:01
    my software engineering degree 20
  • 00:38:03
    years ago um one of the big
  • 00:38:06
    challenges everyone talked about was
  • 00:38:08
    software reusability right like why
  • 00:38:10
    are we writing the same software over
  • 00:38:12
    and over again and at the time people
  • 00:38:14
    thought OOP was the answer they're like
  • 00:38:16
    oh if we do everything as classes in
  • 00:38:18
    Java then we can subclass those classes
  • 00:38:20
    and that's how we'll solve reusable
  • 00:38:22
    software with hindsight that wasn't the fix
  • 00:38:24
    the fix was open source the fix was
  • 00:38:26
    having a diverse and vibrant open source
  • 00:38:29
    community releasing software that's
  • 00:38:31
    documented and you can package and
  • 00:38:32
    install and all of those kinds of things
  • 00:38:34
    that's been incredible like
  • 00:38:36
    the cost of building
  • 00:38:39
    software today is a fraction of what it
  • 00:38:41
    was 20 years ago purely thanks to open
  • 00:38:44
    source it's interesting because like
  • 00:38:46
    when we talk about developer
  • 00:38:47
    productivity like it's it's a topic that
  • 00:38:49
    will come back and obviously it's very
  • 00:38:51
    popular very important for people in
  • 00:38:53
    leadership positions you know who are
  • 00:38:55
    hiring a certain number of people and
  • 00:38:57
    their CEOs will ask how are
  • 00:39:01
    these people being used and right now there's
  • 00:39:03
    a big push to say that gen AI
  • 00:39:06
    is adding this and this much
  • 00:39:08
    productivity but two things are
  • 00:39:10
    interesting one is that we don't really
  • 00:39:12
    talk about how much just having open
  • 00:39:13
    source and not having to build it all ourselves adds
  • 00:39:15
    we just I guess take it for granted
  • 00:39:18
    and the other thing I want to ask
  • 00:39:19
    you is how much more
  • 00:39:20
    productive do you think you are with this
  • 00:39:21
    current workflow you have which is
  • 00:39:23
    pretty advanced it sounds like you're
  • 00:39:24
    using a bunch of different tools you
  • 00:39:26
    spend a lot of time tweaking it so I'm
  • 00:39:28
    going to assume you're one of the
  • 00:39:31
    software engineers who are using it more
  • 00:39:33
    efficiently so for your own personal
  • 00:39:35
    productivity how do you feel how
  • 00:39:38
    much more productive this makes you and
  • 00:39:40
    you know there's a caveat here
  • 00:39:41
    obviously it's hard to be
  • 00:39:44
    honest about yourself but right now the
  • 00:39:46
    good thing is we don't have any
  • 00:39:48
    reliable polls AI vendors will obviously have a bias
  • 00:39:51
    to say that it's helping
  • 00:39:53
    more and you know people who might not
  • 00:39:56
    like these tools might have a bias to
  • 00:39:57
    say ah it's not even helping me
  • 00:39:59
    so I think the best answer we
  • 00:40:01
    can probably get right now is just from
  • 00:40:02
    people like you looking honestly at
  • 00:40:04
    yourself okay so I think I've
  • 00:40:07
    got two answers to this um it's
  • 00:40:10
    difficult to quantify but um
  • 00:40:13
    my guess for a while has been that I've
  • 00:40:15
    had a giant productivity boost in the
  • 00:40:17
    portion of my job which is typing code
  • 00:40:20
    at a computer and I would
  • 00:40:22
    estimate I am two to three times
  • 00:40:24
    faster at turning thoughts
  • 00:40:27
    into working code than I was before but
  • 00:40:30
    that's only 10% of my job like as a
  • 00:40:31
    software engineer once you're
  • 00:40:33
    sort of a more senior software
  • 00:40:35
    engineer the typing in the code bit is
  • 00:40:36
    not nearly as big you spend way more time
  • 00:40:38
    researching and figuring out what the
  • 00:40:40
    requirements for the thing are and all
  • 00:40:42
    of those other activities um so huge
  • 00:40:45
    boost for typing
  • 00:40:48
    code and it
  • 00:40:52
    does speed up a lot of the other
  • 00:40:53
    activities the research activity in
  • 00:40:55
    particular like if I need a little
  • 00:40:58
    JavaScript library to solve a particular
  • 00:41:00
    problem because I have a
  • 00:41:02
    bias towards boring technology anyway
  • 00:41:04
    if I ask Claude or GPT-4 I always ask
  • 00:41:07
    for options I always say give me options
  • 00:41:09
    for solving this problem and it spits
  • 00:41:12
    out three or four and then I can go and
  • 00:41:13
    look at those so I'm effectively using it
  • 00:41:15
    as a slightly better slightly faster and
  • 00:41:17
    more productive Google search because
  • 00:41:19
    you can say things to it like okay now
  • 00:41:20
    show me example code that uses
  • 00:41:23
    that option if you're using Claude Sonnet
  • 00:41:25
    you can say show me an interactive
  • 00:41:26
    prototype of that option
  • 00:41:28
    um all of that so that that research
  • 00:41:31
    stuff happens more quickly for me um
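That two-step research workflow can be sketched as a pair of prompt templates (the wording here is illustrative, not Simon's exact prompts):

```python
# Step 1: always ask for options rather than a single answer.
def options_prompt(problem: str) -> str:
    return (
        f"Give me options for solving this problem: {problem}. "
        "List three or four libraries with their trade-offs."
    )

# Step 2: drill into one candidate, the way Simon follows up.
def follow_up(option: str) -> str:
    return f"Now show me example code that uses {option}."

print(options_prompt("render a sortable HTML table in the browser"))
print(follow_up("the second option"))
```

Asking for options first keeps the model in a comparison role, which makes it easier to spot a hallucinated library before committing to one.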
  • 00:41:34
    there's a whole bunch of those sort of
  • 00:41:35
    smaller productivity boosts the bigger
  • 00:41:37
    one the more interesting one for me is
  • 00:41:40
    um I can take on much more
  • 00:41:44
    ambitious projects because I'm no longer
  • 00:41:44
    limited to the things that I already
  • 00:41:46
    know all of the trivia about and I feel
  • 00:41:49
    like this is one of the most important
  • 00:41:51
    aspects of all of this is if you want to
  • 00:41:53
    program in Python or JavaScript or go or
  • 00:41:56
    bash or whatever there's a baseline of
  • 00:41:58
    trivia that you need to have at the
  • 00:42:00
    front of your mind you need to know how
  • 00:42:01
    for loops work and how conditionals work
  • 00:42:03
    and all of that kind of stuff and so I
  • 00:42:05
    think there is a limit on the number of
  • 00:42:07
    programming languages most people can
  • 00:42:09
    work in like I've found personally I
  • 00:42:11
    capped out at about four or five
  • 00:42:13
    programming languages and if I want to
  • 00:42:15
    start using another one there's
  • 00:42:17
    potentially a month-long spin-up
  • 00:42:19
    for me to get started and that means I
  • 00:42:21
    won't do it right why would I use Go to
  • 00:42:24
    solve a problem if I have to spend a
  • 00:42:26
    month spinning up on Go when I could
  • 00:42:27
    solve it with Python today that is gone
  • 00:42:30
    like I am using a much wider range of
  • 00:42:33
    programming languages and tools right
  • 00:42:35
    now because I don't need to know how for
  • 00:42:37
    loops in Go work I need to understand
  • 00:42:39
    the sort of higher level concepts of Go
  • 00:42:41
    like memory management and
  • 00:42:43
    goroutines and all of that kind of stuff
  • 00:42:45
    but I don't have to memorize the trivia
  • 00:42:47
    given that I've actually shipped Go
  • 00:42:50
    code to production despite not being a
  • 00:42:52
    Go programmer just sort of six months
  • 00:42:54
    ago that's been running happily every
  • 00:42:56
    day and it has unit tests and it has
  • 00:42:57
    continuous integration and continuous
  • 00:42:59
    deployment and all of the stuff that I
  • 00:43:01
    think is important for code and I could
  • 00:43:04
    do that because the language model could
  • 00:43:06
    fill in all of those little sort of
  • 00:43:07
    trivia bits for me this episode is
  • 00:43:09
    sponsored by TLDR TLDR is a free daily
  • 00:43:12
    newsletter covering the most interesting
  • 00:43:14
    stories in tech startups and programming
  • 00:43:16
    join more than 1 million readers and
  • 00:43:18
    sign up at tldr.tech that is
  • 00:43:21
    tldr.tech I sometimes dread going back
  • 00:43:24
    to certain side projects where it takes
  • 00:43:26
    me a while to spin up and remember and
  • 00:43:28
    it's in a language or an outdated uh
  • 00:43:31
    framework that I just don't want to
  • 00:43:32
    touch and like what you said the
  • 00:43:35
    confidence is higher and I can
  • 00:43:36
    actually just paste parts into ChatGPT
  • 00:43:39
    or turn on GitHub Copilot and
  • 00:43:41
    I know what good looks like so I think
  • 00:43:43
    when you know what good looks like even in a different language
  • 00:43:45
    you need to have that experience like if
  • 00:43:48
    I was a brand new programmer I don't
  • 00:43:50
    think I'd be using it to write
  • 00:43:52
    Go despite not knowing Go but I've got
  • 00:43:54
    20 years of experience I
  • 00:43:57
    can read code that it's written in a
  • 00:43:58
    language that I don't know very well and
  • 00:44:00
    I can still make a pretty good
  • 00:44:02
    evaluation of whether that's doing what I
  • 00:44:04
    needed it to do and whether that looks like it's
  • 00:44:06
    good um I guess there's an important
  • 00:44:09
    disclaimer right that the more you look
  • 00:44:10
    at languages as long as it's an
  • 00:44:12
    imperative language you can read it
  • 00:44:15
    right I think it will be a bit different
  • 00:44:16
    with some
  • 00:44:18
    languages that are not as popular like
  • 00:44:20
    Prolog and SML and some of these I wouldn't
  • 00:44:22
    really trust myself yeah I would not
  • 00:44:25
    trust myself to just look at Prolog code
  • 00:44:27
    that it had written for me and make a
  • 00:44:30
    judgment as to whether that was good
  • 00:44:31
    Prolog code but I feel like I can do
  • 00:44:33
    that with languages like Go
  • 00:44:34
    and Rust you know yeah so with
  • 00:44:37
    that I think it's good by the way thanks
  • 00:44:40
    for sharing I think it's great to see
  • 00:44:41
    that you are getting productivity out of it
  • 00:44:43
    but it also took a lot of work I think
  • 00:44:45
    a big takeaway for me would be for
  • 00:44:48
    anyone who's trying it out put
  • 00:44:50
    in the work and experiment to figure out
  • 00:44:52
    what workflow works for yourself
  • 00:44:54
    and there are just no easy answers I mean
  • 00:44:55
    I think you've been
  • 00:44:57
    experimenting a lot more than most
  • 00:44:58
    people have and it still sounds like
  • 00:45:01
    it's a work in progress oh with
  • 00:45:03
    this I really want to touch on
  • 00:45:06
    misconceptions and doubts they might
  • 00:45:08
    not be misconceptions they're doubts and
  • 00:45:10
    questions that a lot of people have
  • 00:45:11
    about these tools let's talk about
  • 00:45:14
    resistance a little bit because I feel
  • 00:45:15
    like I see so
  • 00:45:17
    much resistance to this and it's a very
  • 00:45:19
    natural and very understandable thing
  • 00:45:20
    this stuff is really weird you know it's
  • 00:45:23
    weird and it is uncomfortable and the
  • 00:45:25
    ethics around it are so murky like these
  • 00:45:27
    models were trained on vast quantities
  • 00:45:29
    of unlicensed copyrighted data and
  • 00:45:32
    whether or not that's legal I'm
  • 00:45:34
    not a lawyer I'm not going to go into
  • 00:45:36
    that but the morality the ethics of that
  • 00:45:39
    like especially when you look at things
  • 00:45:40
    like image models like Stable
  • 00:45:42
    Diffusion which are now being
  • 00:45:45
    used where you would have commissioned an
  • 00:45:47
    artist instead and they were trained on
  • 00:45:49
    that artist's work like I don't
  • 00:45:51
    care if that's legal that's blatantly
  • 00:45:52
    unfair right if something trained on
  • 00:45:54
    your work there's a person
  • 00:45:57
    who wrote just this that they tried
  • 00:45:57
    it out it didn't work that well plus they
  • 00:45:59
    don't want to use it because they
  • 00:46:00
    disagree fundamentally with it and
  • 00:46:04
    honestly I respect that position I
  • 00:46:04
    think I've compared
  • 00:46:06
    it to being vegan in the past right
  • 00:46:08
    with veganism I think there's a very strong
  • 00:46:10
    argument for why you should
  • 00:46:12
    be a vegan and I understand that
  • 00:46:14
    argument and I'm not a vegan so I have
  • 00:46:16
    made that sort of personal ethical
  • 00:46:18
    choice and all of this stuff does come
  • 00:46:22
    down to personal ethical choices if you
  • 00:46:24
    say I am not going to use these models
  • 00:46:27
    until somebody produces one that was
  • 00:46:28
    trained entirely on licensed
  • 00:46:30
    data I absolutely respect that I think
  • 00:46:33
    that's a very valid position I've not made that
  • 00:46:35
    decision myself um and you know for the
  • 00:46:38
    code stuff um it's basically
  • 00:46:40
    trained on every piece of open source
  • 00:46:42
    code they could get hold of but it is
  • 00:46:44
    ignoring the license terms you know the
  • 00:46:45
    GPL licenses that say attribution is
  • 00:46:48
    important you can't attribute what comes
  • 00:46:49
    out of a model because it's been
  • 00:46:50
    scrambled with everything else so yeah
  • 00:46:52
    the ethical concerns I completely
  • 00:46:54
    respect um but then also it's
  • 00:46:58
    scary right it is scary when you think
  • 00:47:00
    okay I earn a very good salary because I
  • 00:47:03
    have worked through the trivia of
  • 00:47:05
    understanding Python and JavaScript and
  • 00:47:06
    I'm better at that trivia than most
  • 00:47:07
    other people and
  • 00:47:09
    now you've got this machine that comes
  • 00:47:11
    along and it's better at the trivia than
  • 00:47:13
    I am like it knows the things that I
  • 00:47:15
    know it I mean knows in scare quotes um
  • 00:47:19
    that is disconcerting and um
  • 00:47:22
    I feel like there's a pessimistic
  • 00:47:23
    and an optimistic way of taking this on the
  • 00:47:25
    pessimistic way is saying
  • 00:47:27
    okay I need to go
  • 00:47:30
    into the trades I need to learn plumbing
  • 00:47:31
    because my job is not going to exist in
  • 00:47:33
    a few years time yeah um the optimistic
  • 00:47:35
    version the version I take is I can
  • 00:47:38
    use these tools better than anyone else
  • 00:47:40
    I know for programming I can take my
  • 00:47:42
    existing programming knowledge and when
  • 00:47:44
    I combine it with these tools I will run
  • 00:47:46
    circles around somebody who's never
  • 00:47:49
    written a line of code in their
  • 00:47:50
    life and is trying to build an iPhone
  • 00:47:52
    app using ChatGPT I can just do this
  • 00:47:54
    stuff better so we've essentially got
  • 00:47:56
    these
  • 00:47:57
    tools that are actually power
  • 00:47:59
    user tools right you have to put a lot
  • 00:48:01
    of work into mastering them and when
  • 00:48:03
    you've got that when you combine
  • 00:48:06
    expertise in using tools with expertise
  • 00:48:07
    in a subject matter you can operate so
  • 00:48:10
    far above other people and like the
  • 00:48:13
    competitive advantage you get is
  • 00:48:14
    enormous that's something that actually
  • 00:48:16
    does worry me most about the resistance
  • 00:48:18
    is I like people who are resisting this
  • 00:48:20
    stuff right I like that they're not
  • 00:48:22
    falling for the hype I like that they
  • 00:48:24
    care about the ethics of it I like that
  • 00:48:26
    they're questioning it
  • 00:48:27
    it would upset me if that
  • 00:48:29
    put them at a serious professional
  • 00:48:31
    disadvantage over the next few years as
  • 00:48:33
    other people who don't share their
  • 00:48:35
    ethics start being able to churn out
  • 00:48:37
    more stuff because they've got this
  • 00:48:38
    additional edge it's like if you were to
  • 00:48:40
    say I don't like search
  • 00:48:44
    engines I'm never going to search for an
  • 00:48:45
    answer to my programming problem that
  • 00:48:47
    would set you back enormously right now
  • 00:48:49
    and I feel like it's in a
  • 00:48:50
    similar kind of space to that yeah and
  • 00:48:54
    so another I guess
  • 00:48:57
    contrarian opinion I hear a lot is well it
  • 00:49:00
    seems like this whole technology is
  • 00:49:02
    plateauing like if we look at the past 18
  • 00:49:04
    months GPT-4 is okay Claude might be
  • 00:49:08
    a little bit better Sonnet okay cool but
  • 00:49:10
    you know let's ignore that for just
  • 00:49:11
    a second GitHub Copilot hasn't changed
  • 00:49:13
    all that much so I do see a sense
  • 00:49:16
    especially for people who are
  • 00:49:17
    managing engineers and are also
  • 00:49:20
    playing with this tool saying well
  • 00:49:21
    it sounds like this is what it's
  • 00:49:23
    going to be you know we just use it
  • 00:49:26
    as is is this all there is you're
  • 00:49:29
    more in the weeds do you see
  • 00:49:32
    improvements or drastic improvements or
  • 00:49:33
    little
  • 00:49:34
    improvements that's a really interesting
  • 00:49:36
    question I mean from my perspective I'd
  • 00:49:39
    kind of welcome a plateau at this point
  • 00:49:41
    it's been a bit exhausting keeping up
  • 00:49:42
    with the stuff over the last two years
  • 00:49:44
    um I feel like if there were no
  • 00:49:46
    improvement if we if what we have today
  • 00:49:48
    is what we're stuck with for the next
  • 00:49:50
    two years it would still get better
  • 00:49:52
    because we'd all figure out better ways
  • 00:49:53
    to use it you know one of
  • 00:49:56
    my favorite advances in
  • 00:49:58
    language models is this thing called
  • 00:49:59
    Chain of Thought prompting right this is
  • 00:50:02
    this thing where if you say to a
  • 00:50:03
    language model solve this puzzle it'll
  • 00:50:06
    often get it wrong and if you say solve
  • 00:50:07
    this puzzle think step by step and it'll
  • 00:50:10
    then say okay step one this step two
  • 00:50:12
    step three and often it'll get it
  • 00:50:14
    right and the wild thing about Chain of
  • 00:50:17
    Thought prompting is that it was
  • 00:50:19
    discovered against GPT-3 about a year
  • 00:50:22
    after GPT-3 came out it was an independent
  • 00:50:24
    research paper that was put out saying
  • 00:50:25
    hey it turns out if you
  • 00:50:27
    take this model and say think step by
  • 00:50:28
    step it gets better at all of
  • 00:50:30
    this stuff nobody knew that right the
  • 00:50:32
    people who built GPT-3 didn't know that
  • 00:50:33
    it was an independent discovery we've
  • 00:50:35
    had quite a few examples like this and
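The chain-of-thought trick Simon describes amounts to a one-line change to the prompt (illustrative wording, not the exact phrasing from the research paper):

```python
# Chain-of-thought prompting only changes the prompt text: appending
# "think step by step" nudges the model into writing out intermediate
# steps, which often improves its final answer.
def plain_prompt(puzzle: str) -> str:
    return f"Solve this puzzle: {puzzle}"

def chain_of_thought_prompt(puzzle: str) -> str:
    return f"Solve this puzzle: {puzzle}\n\nThink step by step."

print(plain_prompt("What is 17 * 24?"))
print(chain_of_thought_prompt("What is 17 * 24?"))
```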
  • 00:50:38
    so if we are in a
  • 00:50:40
    plateau then I think we'll still get
  • 00:50:42
    lots of advances from just people
  • 00:50:44
    figuring out better ways to use the
  • 00:50:46
    tooling a lot of this also comes down
  • 00:50:49
    to whether or not you buy into the whole
  • 00:50:50
    AGI thing right like so much of the
  • 00:50:54
    hype is in the room here and so
  • 00:50:57
    much of it it's kind of like Tesla
  • 00:50:59
    self-driving cars right you've got
  • 00:51:01
    the CEOs of these companies going out and
  • 00:51:03
    saying we're going to have AGI
  • 00:51:05
    in two years time it's coming nobody
  • 00:51:06
    will ever work again which helps you
  • 00:51:09
    raise a lot of money but it also
  • 00:51:12
    scares I mean it scares me like
  • 00:51:15
    I'm not convinced that human economies
  • 00:51:16
    will work if all knowledge work is
  • 00:51:18
    replaced by AI and it also gives a very
  • 00:51:21
    unrealistic idea of what these things
  • 00:51:23
    can do because don't forget it's also
  • 00:51:25
    happening with software engineers right
  • 00:51:26
    there are companies out there whose
  • 00:51:28
    pitch is we will replace software
  • 00:51:30
    engineers with AI engineers which is a
  • 00:51:32
    very direct claim although I'm now starting
  • 00:51:36
    to see a pattern of how this is really
  • 00:51:38
    good for fundraising because it means a
  • 00:51:40
    lot of potential market and don't forget
  • 00:51:42
    that that's who they're talking to and
  • 00:51:44
    once they raise the money you know
  • 00:51:46
    they have that money they can then
  • 00:51:48
    operate and often like in this case
  • 00:51:50
    you know with Cognition AI their claims
  • 00:51:52
    are toned down to the point of it's
  • 00:51:54
    pretty much Copilot so but you
  • 00:51:58
    it is scary because you
  • 00:51:59
    see it in the mainstream media
  • 00:52:01
    everywhere this claim that
  • 00:52:02
    I think someone said
  • 00:52:05
    we are replacing our own jobs as
  • 00:52:06
    software engineers and as you said it's
  • 00:52:08
    right I think it's the first time I've
  • 00:52:10
    seen that written in the press
  • 00:52:11
    maybe this happened before I was
  • 00:52:13
    born but not recently it's funny isn't
  • 00:52:16
    it it's like um who would have
  • 00:52:18
    thought that AI would come for the
  • 00:52:20
    lawyers and the software engineers and
  • 00:52:22
    the illustrators and all of these things
  • 00:52:24
    that you don't normally think
  • 00:52:26
    of as being automatable um but yeah so the
  • 00:52:29
    AGI thing leads to lots of
  • 00:52:31
    disappointment people are like yeah well
  • 00:52:32
    I asked it this dumb logic
  • 00:52:35
    puzzle and it got it wrong how is
  • 00:52:37
    this AGI but it also ties into science
  • 00:52:39
    fiction everyone thinks about the
  • 00:52:41
    Matrix and Terminator and all of that
  • 00:52:42
    kind of stuff um especially honestly the
  • 00:52:45
    sort of the key problem here is these
  • 00:52:48
    things can talk now right they
  • 00:52:50
    can imitate
  • 00:52:52
    human speech and throughout human
  • 00:52:54
    society being able to write well
  • 00:52:56
    convincingly has always been how we
  • 00:52:58
    evaluate intelligence but these things
  • 00:53:00
    are not intelligent at all but they can
  • 00:53:02
    write really well they can produce very
  • 00:53:04
    convincing text um which which kind of
  • 00:53:06
    throws everyone off so so yeah if you're
  • 00:53:09
    in if you're captured by the AGI hype
  • 00:53:12
    you're going to be disappointed then I think I
  • 00:53:13
    think we're going to have a plateau I'd
  • 00:53:15
    be very surprised if we had anything
  • 00:53:16
    that was AGI like um I'd also be like I
  • 00:53:20
    said I have not been sold that this
  • 00:53:22
    is a net Win For Humanity I don't know
  • 00:53:24
    how how society would cope with that but
  • 00:53:27
    what we are seeing is incremental
  • 00:53:29
    improvements like Claude 3.5 Sonnet
  • 00:53:32
    is a substantial incremental improvement
  • 00:53:35
    over GPT-4o and Claude 3 Opus um the
  • 00:53:38
    Anthropic um the interesting thing about
  • 00:53:40
    Claude 3.5 Sonnet is that it's named
  • 00:53:43
    Sonnet because their previous Claude 3
  • 00:53:46
    had three levels there was Haiku Sonnet
  • 00:53:47
    and Opus Haiku was the cheap one Sonnet
  • 00:53:50
    in the middle Opus was the really fancy
  • 00:53:51
    one they have said they're
  • 00:53:54
    going to release Haiku 3.5 which will be
  • 00:53:56
    cheap and amazing and Opus 3.5 which is
  • 00:53:59
    going to be a step up from Sonnet those I
  • 00:54:02
    I try to ignore the it's coming soon
  • 00:54:04
    those ones I am excited about in terms
  • 00:54:05
    of it's coming soon um but yeah so if
  • 00:54:08
    you're buying into the AGI stuff then I
  • 00:54:12
    I don't buy into it I I don't think you
  • 00:54:14
    get to AGI from autocompleting
  • 00:54:15
    sentences no matter how good you are at
  • 00:54:17
    autocompleting sentences um and
  • 00:54:20
    then yeah in terms of the
  • 00:54:24
    plateau I'm just saying incremental
  • 00:54:27
    improvements are enough for me like I
  • 00:54:28
    want models like right now I want them
  • 00:54:31
    cheap and faster yeah if you look
  • 00:54:35
    back through history like I'm I'm a
  • 00:54:37
    little bit skeptical to to believe that
  • 00:54:39
    suddenly like fundamental things would
  • 00:54:42
    change in the software industry you
  • 00:54:45
    know there's always this um um people
  • 00:54:49
    sometimes you know project that this
  • 00:54:51
    time it will be very different and and
  • 00:54:53
    again there's always Innovation but
  • 00:54:54
    looking back we've always had innovation
  • 00:54:56
    we've had some new technologies and then
  • 00:54:58
    incremental improvements so like pattern
  • 00:55:01
    matching that would be logical obviously
  • 00:55:03
    there's Black Swan events right like
  • 00:55:05
    who could have seen COVID come
  • 00:55:07
    or this is also a breakthrough but I
  • 00:55:10
    I think there there's that part of like
  • 00:55:12
    we're not just in a vacuum
  • 00:55:14
    there's not just this one event and AI
  • 00:55:16
    has been predicted to be around the
  • 00:55:18
    corner by different people since since
  • 00:55:20
    the start of computing really to be
  • 00:55:23
    fair but I think the other
  • 00:55:26
    something I think about a lot is um the
  • 00:55:28
    impact of TikTok and YouTube on
  • 00:55:31
    professional video creation right like
  • 00:55:33
    the iPhone is a really
  • 00:55:36
    great video camera and TikTok and
  • 00:55:38
    YouTube have meant that you can now
  • 00:55:39
    publish videos to the entire world and
  • 00:55:42
    that has not killed off professional
  • 00:55:44
    video um like people who work
  • 00:55:46
    professionally in that industry they're
  • 00:55:48
    doing fine you know what's happened is
  • 00:55:51
    is millions of people who would never
  • 00:55:53
    have even dreamed of trying to learn to
  • 00:55:55
    stand in front of a camera or to operate
  • 00:55:56
    that equipment are now publishing
  • 00:55:58
    different kinds of content online and I
  • 00:56:01
    that's kind of my ideal version
  • 00:56:04
    of the sort of AI programming thing is I
  • 00:56:06
    want the number of people who can do
  • 00:56:08
    basic programming to go up by an order
  • 00:56:10
    of magnitude I I think every human being
  • 00:56:14
    deserves to be able to automate dull
  • 00:56:16
    things in their lives with a computer
  • 00:56:18
    and today you almost need a computer
  • 00:56:19
    science degree just to automate a dull
  • 00:56:21
    thing in your life with a computer
  • 00:56:23
    that's the thing which language models I
  • 00:56:25
    think are taking a huge bite out of and
  • 00:56:27
    then maybe so there is a version of that
  • 00:56:30
    where the demand for professional
  • 00:56:32
    software Engineers goes down because the
  • 00:56:34
    more basic stuff can be done by other
  • 00:56:36
    things the alternative version of that
  • 00:56:38
    is the thing where because a
  • 00:56:41
    professional software engineer can now
  • 00:56:42
    do five times the work they used to do
  • 00:56:44
    maybe two times five times whatever it
  • 00:56:46
    is that means that companies that
  • 00:56:48
    wouldn't have built custom software now
  • 00:56:50
    do which means that the number of jobs
  • 00:56:52
    of software Engineers goes up right a
  • 00:56:54
    company that would never have built its
  • 00:56:55
    own custom CRM for their industry
  • 00:56:57
    because you'd have to hire 20 people and
  • 00:56:58
    wait 6 months can now do it with five
  • 00:57:00
    people and two months and that means
  • 00:57:03
    that that's now feasible for them
  • 00:57:05
    and those five people are still
  • 00:57:07
    getting paid very well it's just that
  • 00:57:09
    the value that they provide to
  • 00:57:11
    companies has gone up so despite the
  • 00:57:14
    sort of so that's that's the demand
  • 00:57:15
    curve that I'd like to see well and also
  • 00:57:18
    don't don't forget like one thing that
  • 00:57:20
    we do talk about or I think it's kind of
  • 00:57:22
    common knowledge correct me if it's
  • 00:57:23
    wrong but code equals liability the more
  • 00:57:26
    code you have the more liability you
  • 00:57:28
    have and and one thing just what we're
  • 00:57:30
    seeing is more code will be generated
  • 00:57:32
    and at some point I I just think about
  • 00:57:34
    this thing have you worked at a
  • 00:57:35
    company or a team where you just had
  • 00:57:38
    like less experienced developers one or
  • 00:57:40
    two years of experience and you leave
  • 00:57:41
    them for a while you might have seen and
  • 00:57:44
    then and then what happens right like
  • 00:57:46
    fast forward to two years you don't add
  • 00:57:47
    anyone with more
  • 00:57:49
    experience you know usually
  • 00:57:52
    my observation is you get
  • 00:57:54
    spaghetti code it's a mess it's it's
  • 00:57:55
    hard to work with and then you pull in some people
  • 00:57:57
    with more experience who look around
  • 00:57:59
    they point out some seemingly simple
  • 00:58:01
    changes that are are you know not that
  • 00:58:04
    simple for the people they simplify
  • 00:58:06
    things you might delete a lot of code
  • 00:58:08
    and then all will be good in the world
  • 00:58:09
    or or those people get more experienced
  • 00:58:12
    but I I do think about this part where
  • 00:58:14
    you know a year in everything still
  • 00:58:16
    seems to be fine right like the CEO of
  • 00:58:19
    of the company is like oh this team is
  • 00:58:21
    shipping quickly people are enthusiastic
  • 00:58:23
    and my sense is that there will be there
  • 00:58:26
    should be a demand and again
  • 00:58:28
    I'm curious to hear your thoughts on
  • 00:58:30
    this but Engineers who can go into the
  • 00:58:32
    generated code and for example explain
  • 00:58:34
    reason uh even when the machine fails to
  • 00:58:37
    to explain this complicated mumbo
  • 00:58:39
    jumbo or just say we're going to delete
  • 00:58:41
    all of this and it makes sense I'm
  • 00:58:42
    confident I can tell you why I'm doing
  • 00:58:45
    this right and that's what I expect
  • 00:58:48
    that's the skill that you need like
  • 00:58:50
    turns out the typing code and
  • 00:58:51
    remembering how for loops work that's
  • 00:58:53
    the piece of our jobs that has been
  • 00:58:55
    devalued right remembering that sort of
  • 00:58:57
    trivia and and typing really quickly
  • 00:58:59
    nobody cares if you can type faster than
  • 00:59:00
    anyone else anymore that's that's not a
  • 00:59:02
    thing but the systems thinking and
  • 00:59:05
    evaluating a skill that I think is
  • 00:59:07
    really important right now is QA like
  • 00:59:09
    in terms of just the old fashioned manual
  • 00:59:11
    testing being able to take some code and
  • 00:59:13
    really hammer away at it and make sure
  • 00:59:15
    that it does exactly what it needs to do
  • 00:59:16
    combined with automated testing combined
  • 00:59:19
    with um like system design and
  • 00:59:21
    prioritization there's so much to what
  • 00:59:24
    we do that isn't just typing code on the
  • 00:59:26
    keyboard and those are the skills which
  • 00:59:28
    the thing is language models can do a
  • 00:59:30
    lot of this stuff but only if
  • 00:59:32
    you ask the right questions of them
  • 00:59:34
    right like if you if you ask a language
  • 00:59:35
    model to write five paragraphs on what
  • 00:59:39
    like how you should refactor your
  • 00:59:40
    microservices maybe it'll do an okay job
  • 00:59:42
    but who's going to know to even pose
  • 00:59:44
    that question and who's going to know
  • 00:59:46
    how to evaluate what it says so those
  • 00:59:48
    decisions these things I don't think you
  • 00:59:50
    should ever have them make decisions for
  • 00:59:52
    you I think you should use them as
  • 00:59:54
    like tools to support
  • 00:59:56
    the decisions that you're making it's
  • 00:59:57
    one of the reasons I love saying give me
  • 00:59:58
    options for x and that's what we become
  • 01:00:01
    software Engineers we are the people
  • 01:00:04
    making the high level design
  • 01:00:06
    decisions the people evaluating what's
  • 01:00:07
    going on I don't think you should ever
  • 01:00:09
    commit a line of code that a language
  • 01:00:10
    model wrote If you don't understand it
  • 01:00:12
    yourself that's sort of my personal line
  • 01:00:14
    that I draw um and yeah so I do not
  • 01:00:17
    feel threatened as a software engineer
  • 01:00:19
    and honestly partly as a software
  • 01:00:21
    engineer who's got good with this stuff
  • 01:00:22
    I really don't feel threatened by it um
  • 01:00:25
    but just generally like I I don't I
  • 01:00:28
    think the bits of my job that these
  • 01:00:30
    tools will accelerate there are a whole
  • 01:00:33
    bunch of bits of the job that they
  • 01:00:34
    accelerate some of which are a bit
  • 01:00:35
    tedious some of which are kind of
  • 01:00:36
    interesting but it gives me so much more
  • 01:00:39
    scope to take on more exciting problems
  • 01:00:41
    overall I I love it and if if you can
  • 01:00:45
    offer advice to two different
  • 01:00:48
    groups of people so two separate pieces but
  • 01:00:50
    experienced Engineers like yourself in
  • 01:00:52
    in terms of like you know like put in
  • 01:00:54
    the years worked across different stacks
  • 01:00:56
    and also to less experienced Engineers
  • 01:00:58
    who are like coming into they're already
  • 01:00:59
    working inside the industry but
  • 01:01:01
    obviously they're not at the level uh
  • 01:01:03
    just what would you suggest to them to
  • 01:01:05
    make the most out of these tools or to
  • 01:01:08
    make themselves more future-proof
  • 01:01:10
    If you will I
  • 01:01:12
    mean my Universal advice is always to
  • 01:01:14
    have side projects on the go which
  • 01:01:16
    doesn't necessarily work for everyone
  • 01:01:18
    you know if you've got like a family and
  • 01:01:21
    and a demanding job and so forth it can
  • 01:01:23
    be difficult to carve those out a trick
  • 01:01:25
    I used at companies in the past I love
  • 01:01:27
    um advocating for internal hack days you
  • 01:01:29
    know saying let's once a quarter have
  • 01:01:32
    everyone spend a day or two days
  • 01:01:35
    working on their own projects that kind
  • 01:01:36
    of stuff can be great good employers
  • 01:01:38
    should always be able to leave a little
  • 01:01:40
    bit of wiggle room for for you know for
  • 01:01:43
    that sort of exploratory programming but
  • 01:01:45
    some employers don't but if you can get
  • 01:01:47
    that that's amazing um if you're earlier
  • 01:01:49
    in your career like uh people in their
  • 01:01:51
    20s can normally get away with a lot of
  • 01:01:53
    side projects because they have a lot
  • 01:01:54
    less going on it's like when I'm
  • 01:01:56
    managing people I don't like people
  • 01:01:59
    working super long hours and all of that
  • 01:02:01
    it's hard to talk a 22-year-old
  • 01:02:03
    out of that that's just sort of how
  • 01:02:05
    people are wired earlier in their
  • 01:02:06
    careers so take advantage of that while
  • 01:02:08
    you can but yeah I feel like um like I'm
  • 01:02:10
    doing my personal blog um I'm
  • 01:02:13
    using all sorts of weird AI tools to
  • 01:02:16
    hack on that because the stakes could
  • 01:02:18
    not be lower right worst case it'll break
  • 01:02:21
    a page and I'll fix it so that's where
  • 01:02:23
    I've been using um there's a thing
  • 01:02:24
    called GitHub Copilot Workspace that
  • 01:02:27
    they've just started trialing it's you're
  • 01:02:29
    in the beta yeah and I've added four or
  • 01:02:32
    five features to my blog using that some
  • 01:02:33
    of them in live like in meetings with
  • 01:02:35
    people as a demo I'm like oh let's show
  • 01:02:37
    you this tool I'm going to add
  • 01:02:39
    autocomplete to the tags on my blog and
  • 01:02:41
    I did that last week and so I'm I'm
  • 01:02:43
    using my blog as a sort of fun
  • 01:02:45
    exploration space for some of that kind
  • 01:02:47
    of thing but yeah um so if you can
  • 01:02:50
    afford to do a side project with these
  • 01:02:52
    tools and like set yourself a challenge
  • 01:02:54
    to write every line of code with these
  • 01:02:56
    have these tools write that code for you
  • 01:02:57
    I think that's a great thing you can do
  • 01:02:59
    if you can't afford side projects just
  • 01:03:01
    use them like get an account with um I
  • 01:03:04
    mean both of the best models are now
  • 01:03:06
    free like GPT-4o from OpenAI and Claude 3.5 Sonnet
  • 01:03:10
    now you have to log in you might have to
  • 01:03:12
    give them a phone number but they're you
  • 01:03:14
    can use a free account with them use
  • 01:03:16
    those and just throw questions at
  • 01:03:19
    them sometimes have a question where you
  • 01:03:21
    think it definitely won't get this and
  • 01:03:23
    throw that in because that's useful
  • 01:03:24
    information throw in basic things just
  • 01:03:27
    work work with them that way I think
  • 01:03:28
    that's that's definitely worthwhile and
  • 01:03:30
    play with the Claude 3.5 artifacts thing
  • 01:03:32
    is just so much fun like the other day I
  • 01:03:36
    wanted to add a box Shadow to a thing on
  • 01:03:38
    a page and I'm like what I really need
  • 01:03:41
    is I need a sort of very light sort of
  • 01:03:43
    subtle box Shadow and then I was halfway
  • 01:03:45
    through prompting Claude to do that and
  • 01:03:46
    said actually you know build me a tool
  • 01:03:48
    build me a little tool where I
  • 01:03:49
    think I said where I can twiddle with
  • 01:03:51
    the settings that's my prompt let me
  • 01:03:53
    twiddle with the settings in the Box
  • 01:03:54
    Shadow and it built me this little
  • 01:03:56
    interactive thing with a box Shadow and
  • 01:03:58
    sliders for the different settings and a
  • 01:03:59
    copy and paste CSS thing and if I'd
  • 01:04:02
    spent an extra 15 seconds on it I could
  • 01:04:04
    have found a tool that existed on Google
  • 01:04:06
    but it was faster to get Claude to build
  • 01:04:09
    me a custom tool on demand than to
  • 01:04:11
    because if you're on a Google search you
  • 01:04:12
    have to evaluate the answers you get
  • 01:04:14
    back and then you click
  • 01:04:15
    through and all that like no I know I
  • 01:04:16
    know what I want so do that right that's
  • 01:04:19
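(Editor's aside: the tool Simon describes was an HTML page with sliders; as a rough illustration of the kind of CSS it lets you copy out, here is a small Python sketch. The parameter names and defaults are made up for this example.)

```python
def box_shadow_css(x=0, y=4, blur=12, spread=0, alpha=0.15):
    """Format a CSS box-shadow declaration from slider-style values.

    x/y are the offset in pixels, blur and spread control softness,
    and alpha sets how subtle the shadow is (0 = invisible, 1 = solid).
    """
    return f"box-shadow: {x}px {y}px {blur}px {spread}px rgba(0, 0, 0, {alpha});"

# A light, subtle shadow like the one described:
print(box_shadow_css())  # box-shadow: 0px 4px 12px 0px rgba(0, 0, 0, 0.15);
```

Twiddling the "sliders" just means calling it with different arguments and pasting the result into a stylesheet.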
    just wild I feel this is what
  • 01:04:22
    you're saying as like yeah I mean it's
  • 01:04:24
    it's easier said than done but
  • 01:04:26
    experimenting and I think your blog
  • 01:04:27
    which we're going to link in the show
  • 01:04:29
    notes is just a really good example like
  • 01:04:31
    I did find myself a little bit
  • 01:04:33
    reenergized reading how much weird stuff
  • 01:04:36
    you're doing that's the
  • 01:04:37
    other thing it's got to be fun right
  • 01:04:40
    that one of the things people I can see
  • 01:04:42
    that you're having fun with it like and
  • 01:04:44
    and again thanks for sharing because
  • 01:04:45
    because you put it out there I think you
  • 01:04:47
    know that that's another thing that but
  • 01:04:48
    honestly with these tools it's a bit
  • 01:04:50
    easier to write it up as well so I think
  • 01:04:52
    I think that's it's helpful advice fun
  • 01:04:55
    like this is a crucial thing these
  • 01:04:56
    things are absolutely hilarious and it's
  • 01:04:59
    not that they can't sometimes
  • 01:05:00
    write a joke that's good but that's not
  • 01:05:02
    what makes them funny it's trying out
  • 01:05:04
    weird dystopian things trying something
  • 01:05:07
    you didn't think would work and having
  • 01:05:08
    it work I get them to do I use um the
  • 01:05:10
    voice mode I use it to do prank phone
  • 01:05:12
    calls to my dog so I'll be like hey chat
  • 01:05:15
    GPT I need to give my dog a pill covered
  • 01:05:18
    in peanut butter I need you to pretend
  • 01:05:20
    to be from the government Department of
  • 01:05:21
    peanut butter and make up an elaborate
  • 01:05:23
    story about why she has to have it now
  • 01:05:25
    go and it does it and it speaks and I
  • 01:05:27
    hold the speaker up to my dog it's
  • 01:05:29
    just really really amusing so stuff like
  • 01:05:32
    that is is so much fun I for a while I
  • 01:05:36
    was always trying to throw twists into my
  • 01:05:38
    prompts I'm like answer this and then at
  • 01:05:39
    the bottom I'd say oh and pretend you're
  • 01:05:40
    a golden eagle and use Golden Eagle
  • 01:05:42
    analogies and it would say well if you're
  • 01:05:44
    soaring above the competition stupid
  • 01:05:46
    things like that right just you can you
  • 01:05:49
    can get it to rap kind of and it's awful
  • 01:05:52
    like really absolutely appalling but
  • 01:05:54
    with the voice mode you can say now
  • 01:05:56
    rap now do a rap about that answer and
  • 01:05:58
    just it's cringeworthy it is kind
  • 01:06:01
    of wild how I don't really remember
  • 01:06:03
    having a tool that we're talking about
  • 01:06:05
    programming here but you can get it to
  • 01:06:07
    do all these things within a you know
  • 01:06:10
    potentially even in the work context
  • 01:06:12
    just throw it in there it's it's kind of
  • 01:06:14
    as you said it is fun so I
  • 01:06:16
    like to look at that part of it so thank
  • 01:06:20
    you for the insight and let's end with
  • 01:06:22
    some rapid questions in the end if
  • 01:06:23
    you're up for it so these are questions
  • 01:06:25
    I'm just going to ask and you just throw
  • 01:06:27
    out whatever comes up uh could you
  • 01:06:30
    recommend two or three books uh to
  • 01:06:32
    people that you enjoyed reading Martin
  • 01:06:34
    Kleppmann's book Designing Data-Intensive
  • 01:06:36
    Applications is here it's on
  • 01:06:38
    my shelf actually absolutely incredible
  • 01:06:41
    the Bluesky team told me Martin Kleppmann
  • 01:06:43
    advises them that this is the book they
  • 01:06:45
    have all on their shelf because this
  • 01:06:47
    describes everything you need to know to
  • 01:06:48
    build Bluesky it's kind of amazing at
  • 01:06:50
    um at Eventbrite we had a book club and one
  • 01:06:53
    of the things we did with the book club
  • 01:06:54
    is because nobody reads the book for
  • 01:06:56
    book clubs right it turns out that just
  • 01:06:57
    doesn't work so what you do instead is
  • 01:06:59
    you assign chapters to different people
  • 01:07:02
    and they have to provide a summary of
  • 01:07:03
    the chapter at the book club so it's
  • 01:07:05
    almost like you
  • 01:07:07
    parallelized the act of of reading the
  • 01:07:09
    book that worked so well and that was
  • 01:07:11
    that was I think that was the best book
  • 01:07:12
    that we that we did for that one um and
  • 01:07:16
    is there maybe a fiction book that
  • 01:07:17
    you can
  • 01:07:18
    recommend so my favorite genre of fiction
  • 01:07:21
    is British Wizards Tangled Up in Old
  • 01:07:24
    School British bureaucracy so I like um
  • 01:07:27
    there's um Charles Stross who
  • 01:07:29
    does the Laundry Files series which is
  • 01:07:31
    about sort of secret like MI5 style
  • 01:07:33
    wizards there's the Rivers of London
  • 01:07:35
    series by Ben Aaronovitch about a
  • 01:07:37
    metropolitan police officer who gets
  • 01:07:39
    Tangled Up In Magic I really enjoy those
  • 01:07:41
    oh nice what's your favorite programming
  • 01:07:43
    language and framework and you cannot
  • 01:07:45
    say Django and Python really putting me
  • 01:07:47
    on the spot with this one oh yeah um
  • 01:07:50
    okay um JavaScript and no framework at
  • 01:07:52
    all I love doing the vanilla JavaScript
  • 01:07:54
    thing um basically because so I used to
  • 01:07:56
    love jQuery and now document.querySelectorAll
  • 01:08:00
    and array.map and stuff
  • 01:08:02
    jQuery is built into browsers now you
  • 01:08:04
    don't need an extra Library it it is
  • 01:08:05
    kind of wild yeah I remember that one
  • 01:08:07
    when I used to use I'm surprised nice
  • 01:08:11
    what's an exciting company uh that you
  • 01:08:14
    uh that you're interested in and
  • 01:08:16
    why so I'm going to plug fly.io here the
  • 01:08:19
    hosting company because um partly
  • 01:08:22
    because they sponsor some of my work but
  • 01:08:24
    no actually completely independently of
  • 01:08:25
    their sponsorship I picked them to build
  • 01:08:27
    my Datasette Cloud SaaS platform on
  • 01:08:30
    because they're a hosting company that
  • 01:08:31
    makes it incredibly easy to spin up
  • 01:08:34
    secure containers as part of
  • 01:08:36
    your infrastructure basically I was
  • 01:08:38
    trying to build this stuff on top of
  • 01:08:39
    Kubernetes which is not easy to use oh
  • 01:08:41
    and then I realized that fly.io their
  • 01:08:43
    machines layer is effectively what you
  • 01:08:45
    can do with Kubernetes but with an API
  • 01:08:47
    that actually makes sense and pricing
  • 01:08:48
    that makes sense so I'm able to build
  • 01:08:50
    out this SaaS platform where every one of
  • 01:08:52
    my paying customers gets a private
  • 01:08:54
    separate container running my software
  • 01:08:56
    with its own encrypted volumes and all
  • 01:08:57
    of that kind of thing and um so I don't
  • 01:08:59
    have to worry about data leaking from
  • 01:09:01
    one container to another and it scales
  • 01:09:03
    to zero in
  • 01:09:05
    between the requests and all of that
  • 01:09:06
    kind of stuff so yeah I'm really excited
  • 01:09:08
    about fly as a platform for specifically
  • 01:09:11
    building that thing where you've got an
  • 01:09:13
    open source project and you want to run
  • 01:09:14
    it for your customers um like paid
  • 01:09:17
    hosting of open source I feel like Fly is a
  • 01:09:19
    really great platform for that awesome
  • 01:09:21
    well well thanks very much it was great
  • 01:09:23
    having you cool this has been really fun
  • 01:09:25
    thanks a lot thanks a lot to Simon for
  • 01:09:27
    this if you'd like to find Simon online
  • 01:09:30
    you can do so on his blog Simon
  • 01:09:33
    willison.net it's all in the show notes
  • 01:09:36
    below you can also check out his open
  • 01:09:38
    source projects Datasette and LLM which
  • 01:09:40
    are also in the notes as closing here
  • 01:09:42
    are my top three takeaways from this
  • 01:09:44
    episode takeaway number one if you're
  • 01:09:47
    not using LLMs for your software engineering
  • 01:09:49
    workflow you are falling behind so use
  • 01:09:52
    them Simon outlined a bunch of reasons
  • 01:09:55
    that hold back many devs from using
  • 01:09:56
    these tools from ethical concerns to energy
  • 01:09:59
    concerns but llm tools are here to stay
  • 01:10:01
    and those who use them get more
  • 01:10:03
    productive so give yourself a chance
  • 01:10:05
    with these takeaway number two it takes a
  • 01:10:08
    ton of effort to learn how to use these
  • 01:10:10
    tools efficiently as Simon put it you
  • 01:10:12
    have to put in so much effort to learn
  • 01:10:14
    explore and experiment on how to use
  • 01:10:16
    them and just there's no guidance so you
  • 01:10:18
    really need to put in time and
  • 01:10:21
    experimentation by the way in a survey I
  • 01:10:23
    ran in the pragmatic engineer about AI
  • 01:10:25
    tools with about 200 software Engineers
  • 01:10:27
    responding we saw some similar evidence
  • 01:10:30
    those who have not used AI tools for 6
  • 01:10:32
    months were more likely to be negative
  • 01:10:34
    in their perception of these in fact
  • 01:10:36
    the very common feedback from Engineers
  • 01:10:38
    not using these tools was that they used
  • 01:10:40
    it a few times but it just didn't live
  • 01:10:41
    up to their expectations and they just
  • 01:10:43
    stopped using them I asked Simon how
  • 01:10:45
    long it took him to get good at these
  • 01:10:47
    tools and he told me it just took a lot
  • 01:10:49
    of time he couldn't put an exact number
  • 01:10:50
    of months on it but it just took a bunch
  • 01:10:53
    of time and experimentation and
  • 01:10:55
    figuring out if it works my third and
  • 01:10:57
    final takeaway is that using local
  • 01:11:00
    models to learn more about large
  • 01:11:01
    language models is a smart strategy
  • 01:11:04
    running local models has two big
  • 01:11:06
    benefits number one you figure out how
  • 01:11:08
    to just do this how to run models
  • 01:11:10
    locally it's actually less complicated
  • 01:11:13
    than one would think thanks to tools
  • 01:11:14
    like hugging face that make downloading
  • 01:11:16
    and running models a lot easier so just
  • 01:11:18
    go and play around with them and see what
  • 01:11:21
    a smaller model feels like the second
  • 01:11:23
    benefit is that you learn a lot more
  • 01:11:25
    about how large language models work
  • 01:11:27
    because local models are just less
  • 01:11:29
    capable so they feel less magical Simon
  • 01:11:32
    said how it's really useful to have a
  • 01:11:34
    model hallucinate at you early because
  • 01:11:36
    it helps you get better at the mental
  • 01:11:38
    model of what it can do and the local
  • 01:11:40
    models do hallucinate wildly you'll also
  • 01:11:43
    find some additional resources in the
  • 01:11:44
    pragmatic engineer one of them is about
  • 01:11:47
    RAG retrieval augmented generation
  • 01:11:50
    this is an approach that Simon talked
  • 01:11:52
    about in this episode it's a common
  • 01:11:54
    building block for applications we did a
  • 01:11:56
    deep dive in the Pragmatic Engineer about
  • 01:11:58
    this approach and this is linked in the
  • 01:11:59
    show notes below also in the pragmatic
  • 01:12:02
    engineer we did a three-part series on
  • 01:12:04
    AI tooling for software Engineers
  • 01:12:05
    reality check we looked at how Engineers
  • 01:12:08
    are using these tools what their
  • 01:12:10
    perception is what advice they have to
  • 01:12:12
    use these tools more efficiently
  • 01:12:14
    personally I cannot remember any
  • 01:12:15
    developer tool or development approach
  • 01:12:17
    that has been adopted so quickly by the
  • 01:12:19
    majority of backend and frontend
  • 01:12:20
    developers in the first two years of its
  • 01:12:22
    release like large language models have
  • 01:12:25
    done so since
  • 01:12:27
    2022 so it's a good idea to not sleep on
  • 01:12:29
    this topic and this marks the end of the
  • 01:12:32
    first episode of the Pragmatic Engineer
  • 01:12:33
    podcast thanks a lot for listening and
  • 01:12:35
    watching if you enjoyed the episode I'd
  • 01:12:38
    greatly appreciate if you subscribed and
  • 01:12:39
    left a review thanks and see you in the
  • 01:12:42
    next one
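(Editor's appendix: the retrieval augmented generation pattern mentioned in the resources above can be sketched minimally in Python. Everything here is illustrative — the sample documents, the naive word-overlap retrieval, and the prompt template are made up; production RAG systems retrieve with embeddings and a vector index instead.)

```python
def retrieve(query, documents, k=2):
    """Rank documents by naive word overlap with the query and keep the top k.
    Real RAG systems use embedding similarity rather than word overlap."""
    q = set(query.lower().split())
    return sorted(documents,
                  key=lambda d: len(q & set(d.lower().split())),
                  reverse=True)[:k]

def build_prompt(query, documents):
    """Assemble the augmented prompt that would be sent to the language model."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

# Toy corpus standing in for a real document store:
docs = [
    "datasette is a tool for exploring and publishing data",
    "llm is a command line tool for talking to language models",
    "django is a python web framework",
]
print(build_prompt("what is datasette", docs))
```

The model never sees the whole corpus, only the retrieved snippets, which is what makes the approach a common building block for applications.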
Etiquetas
  • AI in coding
  • Large language models
  • ChatGPT
  • Software engineering
  • Programming productivity
  • Open source
  • Code Interpreter Mode
  • Ethics of AI
  • Productivity tools
  • Simon Willison