00:00:00
- [Will] That's hot.
(bright upbeat music)
00:00:02
- All right, so this is
simultaneously really impressive
00:00:07
and really frightening at the same time,
00:00:10
and it's hitting me in ways
that I didn't really expect.
00:00:12
So do you remember Will
Smith eating spaghetti?
00:00:15
Do you remember when this
00:00:16
was what AI generated videos looked like?
00:00:18
Remember when we said, "Okay,
this AI stuff is cool and all
00:00:20
"but clearly there's a long way to go
00:00:22
"before there's any need for concern."
00:00:25
Well, welcome to the future, people,
00:00:27
because this is also
an AI generated video.
00:00:32
And so is this, completely
synthesized out of thin air
00:00:35
by computers.
00:00:36
This one too, this is not real.
00:00:39
Absolutely ridiculous
00:00:40
how far we've come in literally one year.
00:00:43
This does feel like another
ChatGPT, DALL-E moment for AI.
00:00:49
And maybe I'm overreacting
00:00:50
because, okay, I am a video creator,
00:00:53
so an AI that's actually doing my job,
00:00:56
maybe that feels a little more threatening
00:00:58
so I'm particularly impressed by it.
00:00:59
But also this stuff is really good.
00:01:02
So today, Sam Altman and
OpenAI announced a new model
00:01:06
called Sora and it can generate
00:01:08
full up to one minute video
clips from just text input.
00:01:13
So the same way DALL-E was able
to understand our text input
00:01:17
and turn it into a photorealistic
00:01:19
or stylized image or whatever you want,
00:01:22
same thing with Sora but
now since it's videos,
00:01:25
it also needs to understand
00:01:28
how all these things like
reflections and textures
00:01:31
and materials and physics
all interact with each other
00:01:35
over time to make a
reasonable looking video.
00:01:37
And of course, right away,
there's a bunch of examples
00:01:39
on their website that are crazy.
00:01:41
Now, before I show you these,
00:01:42
I just need you to keep this in mind,
00:01:44
you're about to watch a
bunch of AI generated videos
00:01:47
and you know that you're about to watch
00:01:49
a bunch of AI generated content.
00:01:51
So your brain, you're already
looking for this stuff
00:01:53
and it's not perfect, you
will find imperfections,
00:01:57
but not everybody who
sees AI generated content
00:02:00
on the internet knows
to be looking for that.
00:02:04
So also keep that in mind.
00:02:06
This is also the worst
that this technology
00:02:08
is going to be from here on out.
00:02:10
So, okay, here's one of the videos.
00:02:12
There's no audio to any of these clips,
00:02:13
but the prompt for this one
00:02:15
is a stylish woman walks
down a Tokyo street
00:02:18
filled with warm glowing
neon and animated city signage.
00:02:22
She wears a black leather jacket,
00:02:23
a long red dress and black boots.
00:02:26
This video is already miles
ahead of where we were.
00:02:30
It has accurate lighting,
it has materials,
00:02:33
it has skin tones, movements,
00:02:36
even has reflections all over the place.
00:02:38
Now, of course, if you look at it
00:02:39
for more than about 10
seconds, very closely,
00:02:42
there are lots of giveaways.
00:02:43
Like this dude in the background
00:02:45
kinda looks like he's
gliding in a weird way.
00:02:48
The frame rates of the
reflections in the water
00:02:50
are for some reason lower
than the rest of the video.
00:02:52
The camera movement overall
is just a bit inconsistent
00:02:54
and it just, I don't know,
00:02:56
it just kinda feels a little bit off.
00:02:58
But then again, this is
where we were one year ago.
00:03:02
So just keep that in the back
of your head for all this.
00:03:05
Okay, how about this one?
00:03:06
This is another one
which has a long prompt
00:03:08
about a camera following
behind a white vintage SUV
00:03:12
with a black roof rack as it
speeds up a steep dirt road.
00:03:16
This is also, again, really good.
00:03:20
It kinda looks a little more video gamey
00:03:21
because of how rock solid
the drone footage is,
00:03:24
but clearly very usable.
00:03:27
Here's another one,
00:03:28
a litter of golden retriever
puppies playing in the snow.
00:03:31
Their heads pop in and out
of the snow covered in it,
00:03:34
it's so good.
00:03:35
It feels like the physics
of the fur and the ears
00:03:37
and everything with the snow
flying around in slow motion
00:03:41
is incredible.
00:03:42
I've looked through all
of the sample videos
00:03:43
on OpenAI's website,
00:03:45
and clearly these are
the handpicked best ones
00:03:48
that they chose to share
00:03:49
where they just put in some text
00:03:50
and then get a video and don't modify it.
00:03:52
But there's really
impressive stuff in there.
00:03:54
Some of it has humans, some of it doesn't.
00:03:56
Some of it is more realistic feeling
00:03:58
like the truck driving one,
00:03:59
but some of them are more
video gamey or more stylized.
00:04:02
A lot of it is slow motion,
00:04:03
I just have to say how insanely fast
00:04:06
these models are improving is genuinely,
00:04:09
like that's the shocking part.
00:04:11
Like I remember not even that
many months ago, DALL-E 3,
00:04:15
really, really high end,
00:04:16
and you could always still
find something off about it.
00:04:19
Like especially if you ask it
00:04:20
for something like a
photorealistic image of a human,
00:04:25
something about like the hands or the ears
00:04:28
would always just be a little bit off,
00:04:30
never mind the physics.
00:04:31
But even this video here
is crazy at first glance.
00:04:36
The prompt for this AI generated video
00:04:37
is a young man in his 20s
00:04:39
is sitting on a piece of a
cloud in the sky reading a book.
00:04:43
This one feels like 90%
of the way there for me.
00:04:47
Like it's beyond the uncanny valley
00:04:49
of like Apple's Personas,
00:04:51
which are actually based on humans.
00:04:53
This is a made up person.
00:04:54
I mean, his eyes are kinda weird,
00:04:56
and the motion of the pages
in the book are kinda odd.
00:04:59
And yeah, obviously, he's in
a cloud and that's a giveaway
00:05:01
but like, the lighting and
the shadows and the skin tones
00:05:05
and then all the realism of
the textures on the shirt
00:05:07
and the way the shirt and
the pants move and the hair,
00:05:09
they're all really impressive.
00:05:11
And then for this one,
00:05:12
they typed in a movie trailer
00:05:15
featuring the adventures
of the 30-year-old spaceman
00:05:19
wearing a red wool knitted
motorcycle helmet, blue sky,
00:05:23
salt desert, cinematic style,
shot on 35 millimeter film.
00:05:28
And the closeups of his face,
the fabrics on the helmet,
00:05:33
the film grain through every
shot and the cinematic style,
00:05:36
this is one of the most
convincing AI generated videos
00:05:40
I've ever seen, minus
maybe the weird physics
00:05:42
of that dude walking
kind of in fast motion.
00:05:45
So Sam Altman, if you
follow him on Twitter,
00:05:46
he's going through a whole bunch more
00:05:47
of like people's requests
00:05:49
and posting a bunch more generated videos.
00:05:51
And so if you wanna check out his profile,
00:05:53
you can see those.
00:05:53
But here's the thing about
these AI generated videos now,
00:05:58
as good as they've gotten to this point,
00:06:01
they can and will pass as real videos
00:06:06
to people who are not looking
for AI generated videos.
00:06:09
Now that is obviously insanely sketchy
00:06:13
during an election year in the U.S.
00:06:14
and also terrifying
00:06:15
for a bunch of other
internet related reasons
00:06:17
but it's also perfect for stock footage.
00:06:22
Like there are already
all kinds of presentations
00:06:25
and advertisements and then PowerPoints
00:06:28
that are in need of oddly
specific stock videos.
00:06:33
And these AI generated videos
are already good enough
00:06:38
to 100% pass for that purpose.
00:06:40
Like look at this one,
this one with the waves
00:06:42
at Big Sur, this drone shot.
00:06:45
Honestly, if I saw this on Twitter,
00:06:47
I wouldn't even think twice.
00:06:48
I'd be like, "Oh, nice drone shot, dude."
00:06:50
Wouldn't even think about
AI if I wasn't pixel peeping
00:06:53
at like the way the water was moving.
00:06:55
Like this is a totally
usable video in an ad
00:06:59
for some California based product.
00:07:01
And that has all sorts of
implications for the drone pilot
00:07:05
that no longer needs to be hired,
00:07:07
for all the photographers
and videographers
00:07:09
whose footage no longer
needs to be licensed
00:07:12
to show up in that ad that's being made.
00:07:14
It's already that good.
00:07:16
There's other stuff like this wall of TVs,
00:07:19
which would be a totally
expensive and difficult thing
00:07:21
to shoot with a camera and
all these old expensive props,
00:07:25
but if you can just generate
it this well with reflections
00:07:29
and the environment and
everything else around it,
00:07:32
I mean, why do it any other way?
00:07:33
It's also very capable of
historical themed footage.
00:07:36
So this is supposed to be
California during the Gold Rush.
00:07:40
It's AI generated but
it could totally pass
00:07:43
for the opening scene in an old Western
00:07:45
with the right music over it.
00:07:46
How long until an entire ad,
00:07:47
every single shot is
completely generated with AI?
00:07:51
Or what about an entire YouTube
video or an entire movie?
00:07:54
I'm tempted to say like we're
a long way away from that
00:07:56
because you know, this
still has flaws clearly
00:07:59
and there's no sound,
00:08:00
and there's a long way to go
with the prompt engineering
00:08:02
to iron these things out.
00:08:04
But then again, the spaghetti
was like a year ago.
00:08:09
Now, I actually like that
OpenAI, on their website,
00:08:11
they show some of the downfalls too
00:08:13
of this particular model.
00:08:15
And because who would know better
00:08:16
than the people who have been using it?
00:08:18
This is a very private
tool, by the way, right now.
00:08:20
It's in super limited access,
00:08:21
so it's in the hands of red teamers,
00:08:23
which basically means people
testing it, pushing the limits,
00:08:26
trying to break it, and
a few trusted creators.
00:08:29
But they have found plenty
of weird edge stuff.
00:08:33
Like this clip here of a
bunch of gray wolf pups
00:08:36
looks normal at first
00:08:37
but then it's pretty clear
that something's kinda off
00:08:39
with the way they're just
kinda appearing out of nowhere
00:08:42
and walking through each other.
00:08:44
That's kinda weird.
00:08:45
Or this clip of a guy
running on a treadmill,
00:08:47
which I mean, I don't
really have to say much more
00:08:49
about why this one is weird.
00:08:50
But this is my favorite one, again,
00:08:52
so again, just try to put
yourself in the mind of someone
00:08:54
who's not expecting AI.
00:08:56
You're just scrolling through Facebook
00:08:59
or Twitter or something, right?
00:09:01
So you just see this video.
00:09:02
So first I just want
you to watch this clip
00:09:03
as if it's just a stock video you found
00:09:05
of a grandma celebrating her birthday.
00:09:07
And just try to think like,
00:09:09
I wonder what birthday
she's celebrating, right?
00:09:11
I don't know, how old do you think she is?
00:09:12
60?
00:09:13
65?
00:09:14
Maybe it's the big 70.
00:09:15
She seems to really like that cake.
00:09:17
Now, did you see it?
00:09:19
Did you catch that?
00:09:20
I'm gonna play it again,
00:09:22
but this time, watch the video
00:09:24
knowing that AI generated
photos and videos
00:09:27
have trouble accurately doing hands.
00:09:32
I'll play it again.
00:09:33
And now it feels super obvious
like every time you watch it,
00:09:37
watch a different set of hands,
it gets weirder and weirder.
00:09:40
You can watch it like five
times and there's dead giveaway
00:09:42
after dead giveaway,
00:09:43
not even mentioning the
weird inconsistencies
00:09:45
with the direction of
the wind on the candles.
00:09:48
But even as I'm saying all that,
00:09:50
even as it's coming outta my mouth,
00:09:51
I can't help but remember
that 12 months ago
00:09:54
we were critiquing this.
00:09:56
(Will laughing)
00:09:58
So what does this all mean?
00:10:00
Well, I mean, there's what it means now
00:10:02
and there's what it means for the future.
00:10:05
Now, Sora, this thing that they've made
00:10:08
is clearly a really impressive
video generation AI tool
00:10:12
that is both going to fool
people and also be very useful.
00:10:18
There's also a watermark
in the bottom corner
00:10:21
of every video generated by it.
00:10:22
So if you see one of those videos
00:10:23
and ideally it hasn't been cropped out,
00:10:26
then that's at least a
pretty clear indicator
00:10:27
that it's AI generated.
00:10:28
It's a Sora video.
00:10:29
But also, I do think they're
gonna have to be very careful
00:10:33
with this, they're gonna have
a whole bunch of safety stuff
00:10:35
to keep in mind.
00:10:36
I think they'll probably
have to be even more safe
00:10:37
than DALL-E.
00:10:38
Like you shouldn't be able to
generate people's likenesses.
00:10:42
Like you shouldn't be
able to make a politician
00:10:43
look like they're doing
something on video,
00:10:45
especially this year.
00:10:47
You probably won't be able to make
00:10:48
Will Smith eating spaghetti,
00:10:50
but it also definitely
means stock video generation
00:10:56
is absolutely going to take a
dent out of video licensing.
00:11:01
Like I can basically guarantee that.
00:11:02
Like logistically, why would
anyone making something
00:11:05
pay for footage of a house in the cliffs
00:11:07
when they can generate one for free
00:11:10
or for a small subscription price?
00:11:11
Like that is the real scary
part of what this tool implies.
00:11:15
But in the future, it gets
pretty existential, man.
00:11:19
I mean, okay, if this
is trained on all videos
00:11:23
that have ever been made by humans,
00:11:25
then surely it can't be
innovative or creative
00:11:27
in ways that humans haven't
already been, right?
00:11:33
I don't know.
00:11:34
Either way, I'll have all the links below
00:11:35
for all the Sora stuff, for OpenAI stuff,
00:11:38
and I guess I'll talk to you next year
00:11:40
when we look back and go,
00:11:41
"Remember that first version of Sora
00:11:44
"and how bad those wolf pups looked
00:11:46
"when they spawned out of nowhere?"
00:11:48
Just remember, this is the
worst that this technology
00:11:50
is going to be from here on out.
00:11:53
Thanks for watching.
00:11:55
Catch you in the next one, peace.
00:11:57
(bright upbeat music)