AI is changing the world: how we work, how we use the internet, and even how we write code. But what does this really mean for us as software developers? Is AI here to replace us, or will it redefine what it means to program altogether? Today I want to explore a radical idea: what if acceptance testing isn't just a way to verify software, but is really the best next step in the evolution of how we program computers? Stick around; I think this might change how you think about coding forever.

[Music]

Hi, I'm Dave Farley of Continuous Delivery, and welcome to my channel. If you haven't been here before, please do hit subscribe, and if you enjoy the content here today, hit like as well.
Let's start with this: programming languages exist not because that's how computers understand the world, but as tools to help us think at the right level of detail.
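As a small illustration (my own example, not from the video): both functions below produce the same result, but the second states the intent directly and leaves the steps to the language.

```python
# Low-level: spelling out every step of a selection sort by hand.
def sort_names_stepwise(names):
    result = list(names)
    for i in range(len(result)):
        smallest = i
        for j in range(i + 1, len(result)):
            if result[j] < result[smallest]:
                smallest = j
        result[i], result[smallest] = result[smallest], result[i]
    return result

# Higher-level: stating what we want and letting the language handle how.
def sort_names_declarative(names):
    return sorted(names)
```

Both are specifications of the same goal; the second simply sits at a more useful level of detail.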
Programs are specifications of what we want our system to do. They often do this by defining in detail the steps that we think will achieve our goals, but really the underlying message to the computer is something different: it's a definition of what we want the computer to achieve. Over the decades, programming languages have evolved to be clear, expressive, and precise in doing this, making it easier for us to tell the machines what we want them to do when we use them. This is distinct from natural human languages, which are much more ambiguous, and so less useful as a tool with which to achieve precision when communicating, particularly with machines.

It seems to me a fairly common mistake, by all of us, technical or not, to assume that the hard part of developing software is writing the code, when in reality the hard part is coming up with a sufficiently detailed, accurate description of what we want, so that a computer can do it, whatever it is that that takes. But now we have AI stepping into the picture, and everyone thinks that the way forward is to adopt woolly, imprecise, overly verbose natural language as the best means of achieving that precise description. Is this really the case?

With tools like ChatGPT or Copilot, the process of software development certainly feels very different: you describe what you want, and the AI generates code to do it. Sounds magical, right? But fundamentally, this is still just programming. Your prompt is the program, and the AI acts as an incredibly advanced compiler, taking what you specify and ultimately changing that into executable code.

We're extremely fortunate to be sponsored by Equal Experts, TransFICC, Tuple, Honeycomb, and RAD Studio.
TransFICC is a fintech company applying advanced continuous delivery techniques to deliver low-latency trade-routing services to some of the biggest financial institutions in the world. All of these companies offer products and services that are extremely well aligned with the topics that we discuss on this channel every week, so if you're looking for excellence in continuous delivery and software engineering, please do click on the links in the description below and check them out. It's through their support that we can keep this channel going, so please do check them out and say thanks to them.

Programming, at its core, is about three things: understanding the problem well enough to explain it clearly, translating that explanation into something that a computer can execute, and verifying that what we've built actually works and addresses the problem that we started out with. These principles don't change when we're using AI, but the mechanism of going from the description to the executable thing does: instead of coding line by line, we now collaborate with an AI. And while the AI can handle some of the grunt work, it still leaves the hardest, most creative parts to us: defining the problem in precise, unambiguous terms.

There's a challenge, though, and it's a big one: reproducibility. Traditional programming is deterministic: if you write the same code and run it twice, you expect to see the same results each time. A compiler is a program too, so if I write some code and compile it twice, I expect to get the same result, the same program, each time. But with AI code generation, not so much: the output can change based on the model, randomness settings, or even updates to the AI itself. So how do we address all of this variation?

There are two questions here for us to answer. How can we specify what we want with enough clarity, enough detail, that the AI is going to get it right? And how can we verify that we, and the AI, did in fact get it right: that the code works, even after changes to the AI or to our requirements for the system?
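One way to picture that second question: if the same behavioural tests pass against two differently generated implementations, we have evidence that the regenerated code still does what we specified. A minimal sketch of the idea (all names hypothetical, my own illustration, not from the video):

```python
# Two hypothetical AI-generated versions of the same function, as might
# come back from two separate runs of the same prompt.
def total_price_v1(items):
    total = 0
    for price, quantity in items:
        total += price * quantity
    return total

def total_price_v2(items):
    return sum(price * quantity for price, quantity in items)

# The behavioural test doesn't care which implementation it gets:
# if both pass, the regenerated code still behaves as specified.
def check_total_price(total_price):
    assert total_price([]) == 0
    assert total_price([(10, 2)]) == 20
    assert total_price([(10, 2), (5, 3)]) == 35

for implementation in (total_price_v1, total_price_v2):
    check_total_price(implementation)
```

The tests, not the generated source, are what stay stable across regenerations.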
This brings me to the role of acceptance testing in all this. For years I've used automated acceptance tests, in the form of executable specifications, as tools to describe, from a user's perspective, exactly how a system should behave.
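To make that concrete, here is a sketch of what such an executable specification can look like (the domain, names, and tiny DSL here are my own invention for illustration, not code from the video): the test speaks in the user's terms, and the "how" is hidden behind a small DSL layer.

```python
class ShoppingDsl:
    """Tiny illustrative DSL layer: the test speaks in user terms, and this
    class hides how the system under test is actually driven. Here it is a
    trivial in-memory fake standing in for a real system."""

    def __init__(self):
        self._cart = {}
        self._prices_in_pence = {"apple": 30, "banana": 20}

    def add_to_cart(self, item, quantity=1):
        self._cart[item] = self._cart.get(item, 0) + quantity

    def assert_cart_total_is(self, expected_pence):
        total = sum(self._prices_in_pence[i] * q for i, q in self._cart.items())
        assert total == expected_pence, f"expected {expected_pence}, got {total}"


def test_customer_buys_fruit():
    # The specification reads as a user scenario, not as implementation steps.
    shopping = ShoppingDsl()
    shopping.add_to_cart("apple", 2)
    shopping.add_to_cart("banana")
    shopping.assert_cart_total_is(80)


test_customer_buys_fruit()
```

Specifications written at this level say nothing about databases, endpoints, or UI, which is exactly what makes them stable as the implementation changes underneath.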
They're clear, easy to write, and, when we get the level of abstraction right, precise and reproducible, which makes them perfect for solving both of the challenges that programming with AI presents to us.

Imagine this: instead of writing code, we write these detailed specifications. Each one tells the AI what the system should do in a specific scenario. The AI handles the implementation and generates the detail of the test that we need to verify our specifications. Now we can write the code more easily, by having an accurate way to specify what it is that we actually want to achieve, and we can verify the AI's work by running the tests. This isn't just testing: it's programming at a higher level of abstraction.

Let's break this down. Acceptance testing solves ambiguity: programming languages are great because they're precise, and by treating tests as specifications, we give the AI unambiguous instructions. It also ensures reproducibility: even if the AI changes the underlying implementation, the tests act as a safeguard to verify that the second version of the implementation works in the same way as the first. They confirm that the system still behaves the way that we want.

This approach shifts programming from writing solutions to specifying behaviour. It's like the leap from assembly language to a high-level language: it represents a new level of abstraction that allows us to focus on what we really want from the system, not on the mechanics of how we achieve it.

Is this idea of raising the level of abstraction for programming really new, though? Well, no. The idea really represents the whole history of programming, but the next step in abstraction has, until now at least, always been somewhat problematic.
Techniques like model-driven development and low-code platforms have tried hard to raise the level of abstraction for programming, and sometimes work quite well for some narrowly constrained problems, but they've struggled as general solutions to programming, particularly with issues like maintainability, flexibility, and reproducibility being left unresolved. Acceptance testing, however, is different. It's practical: it already works extremely well for human programmers, who are part of the translation process in modern software development as it stands, even though not all of them currently use acceptance testing. This means that it integrates seamlessly into a modern development workflow.

I think that our current position, at the dawn of AI programming, makes it a very good time to rethink programming itself. And to do that, it makes sense to me to focus on the problem that we're really trying to solve with programming, that of specifying what we want the computer to achieve, rather than on the mechanism that we're currently going to use to specify it. What if the specifications, the acceptance tests, were the program? What if our job as developers shifted from writing a solution to more clearly specifying the problem that we want solved, and so writing these detailed, executable examples and letting AI handle everything else? It's not just an idea; it might be the next step in the evolution of our profession as software engineers.
So I thought that I'd give it a try, to see how close I could get to making this work now, with the current generations of AI systems, and actually we were rather closer than I thought we were. I started with OpenAI's 4o model and did pretty well, but it took me a while to prompt it so that the result was a working Flask application, so I switched over to the more capable o1 model. In both cases I began by explaining the idea of programming using acceptance tests; I actually used the script for this video and fed it in, to explain what I was trying to recommend. Then I described my preferred four-layer model of acceptance testing to the AI. In both cases they were rather eager to rush ahead and write all of the code, tests and code, and certainly o1 did pretty well at that, but my aim was a bit different. I wanted to program the AI with acceptance tests, so I worked to get a structure in place for a single test, and then I added another test and got the AI to implement the code to make both of the tests pass. It worked surprisingly well.

Here are my tests. The AI assistant generated some of these initially, but I reworked them a reasonable amount to make them a bit clearer and a better fit with my preferred style for writing acceptance tests. The AI, though, generated all of the actual code: I shaped the tests, but merely cut and pasted the code that the AI generated into my IDE. While I was working with it, I was consciously trying to avoid changing the application as far as I could. I did make a few minor corrections when the code wouldn't run, but this was mostly due to my configuration being slightly different from what the AI assumed in my development environment. When I did change the code, rather than the tests, as far as I could I did it by prompting ChatGPT, and barely touched the non-test code myself.
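To give a flavour of what that test-and-code pairing looks like (this is a stdlib-only sketch of my own; the experiment itself used Flask, and none of its actual code appears here): the test specifies behaviour from the outside, and can be run unchanged against whatever implementation the AI produces.

```python
import json

# A stand-in for the AI-generated application: a bare WSGI callable with
# a single JSON endpoint (hypothetical; the real experiment used Flask).
def app(environ, start_response):
    if environ["PATH_INFO"] == "/health":
        start_response("200 OK", [("Content-Type", "application/json")])
        return [json.dumps({"status": "ok"}).encode()]
    start_response("404 Not Found", [("Content-Type", "application/json")])
    return [json.dumps({"error": "not found"}).encode()]

def get(path):
    """Minimal test harness: drive the app the way an HTTP client would.
    The environ dict here is cut down to just what this app reads."""
    captured = {}
    def start_response(status, headers):
        captured["status"] = status
    environ = {"PATH_INFO": path, "REQUEST_METHOD": "GET"}
    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)

# The acceptance test: specify the behaviour we want, then run it against
# whatever code the AI handed back.
status, body = get("/health")
assert status == "200 OK"
assert body == {"status": "ok"}
```

The key property is that the assertions describe the system's observable behaviour, so regenerating the implementation doesn't invalidate them.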
There are, as a result, still some gaps here: things that I would generally strongly recommend for acceptance tests that I haven't done here, like the functional and temporal isolation that I describe in a bit more detail in this video, and that are covered much more extensively in my acceptance-test-driven development training courses. This means that these tests are a little more untidy than I would usually prefer, and a little more fragile too as a result, but this exercise wasn't really about that. It was meant to be an experiment, and so I didn't spend lots of time making it production-level test code.

One of the interesting side effects of this experiment, which I think is common to many experienced programmers when using AI, is that, although I was very impressed with how well o1 in particular was doing at times, I still believe that I spent more time writing the code with the AI than I would have if I'd written it on my own from scratch. Now, this may just be misplaced optimism on my part, but I certainly spent a few hours fiddling with the code and with prompts to the AI to achieve the results that I ended up with and that I wanted. But I do know that I could have achieved something very similar in similar timescales, because I've written code like this many times before. I'm also convinced that I would be more satisfied with the results if I'd written it myself than I am with this code: I spent a fair amount of time on mistakes, in the tests and the test infrastructure, that I'm certain I wouldn't have introduced. I am, though, pretty convinced that, given some better prompting of the AI, I could do this a lot more quickly in future.

There were certainly times when the AI sped things up for me, but also times when it wrote code that I wasn't satisfied with, and at those times it reminded me of tinkering with and tidying legacy code more than anything else. Too often it generated code that didn't work at all, although it was close enough to what I was looking for to make me press on and correct it rather than rewrite it from scratch. But at times it made bigger mistakes, missing out things that should have been there.
For example, it generated a REST API web-based application, but generated a test and test infrastructure for an application with a web UI. They simply didn't work together; they weren't compatible until I worked to fix them, some of it with prompting, some with editing the code. However, the target of this exercise, getting the AI to generate code from an executable specification, certainly worked at some level, once I got past these teething troubles anyway. And I'm pretty sure that, given more time and more refinement of the techniques, maybe better prompts for what I wanted, this is a viable strategy for raising the level of abstraction in programming, in a way that simultaneously allows us to accurately define what we want the code to do and then confirm that the code has done what we wanted it to.

The part that I'm less sure of, at least with the current generation of AI assistants, is how well this will work without the very strong sense of guidance to do the right things that I applied as an experienced programmer.
But I will try this again when I have more time. I'd like to be able to get the AI, when given an executable specification, to more reliably generate the test infrastructure to support the test, and the code to make that test pass, with a bit less help from me. If it can do that, though, then this would genuinely be a working model for programming more effectively with artificial intelligence. I saw o1 do both of these things, in fact all of the things that are necessary to make this stuff work, but with a lot of virtual handholding and coaching on my part to guide it in the direction that I wanted it to go.

If this is to really work, we'd also need to solve a very hard problem: how to prevent the AI from cheating the tests, which it may well do. This is not at all an easy problem to solve, but we will have that same problem in some form however we specify what it is that we want to the AI.

If this concept resonates with you at all, do let me know in the comments; I'd love to hear your thoughts on how AI is going to reshape our field. And if you're curious about acceptance testing, do check out my free tutorial, linked below, and explore my self-study course on BDD and ATDD. Thank you very much for watching, and if you enjoy what we're doing here, please consider joining our Patreon community; your support helps us to keep exploring ideas like these. So see you next time, and thanks to our existing patrons. Bye-bye.

[Music]