NVIDIA REFUSED To Send Us This - NVIDIA A100

00:23:46
https://www.youtube.com/watch?v=zBAxiQi2nPc

Summary

TLDR: The video from Linus Tech Tips reviews and tears down the Nvidia A100, a high-performance GPU designed for AI and data-intensive analytics. Nvidia does not seed this roughly $10,000 card to reviewers, so Linus acquired it through a fan. The A100 is compared with other GPUs like the RTX 3090, showcasing its strengths in machine learning and its power efficiency, though it lacks graphics capabilities such as ray tracing cores and DirectX support. Linus also explores the A100's passive cooling design. Tests reveal superior performance and efficiency in AI tasks but slower rendering than consumer GPUs. The video also touches on expected advancements in future GPUs built on more efficient manufacturing processes.

Key Takeaways

  • 🎥 Linus reviews the Nvidia A100, a premier AI GPU.
  • 💰 The A100 costs about $10,000 and wasn't officially provided by Nvidia.
  • 🤖 It excels in AI and data analytics tasks but lacks gaming features like ray tracing.
  • 🔍 Linus compares the A100's performance to an RTX 3090, showcasing its efficiency in specific tasks.
  • 🌬️ The A100 features a unique passive cooling system, suitable for server environments.
  • 🔌 The A100 is more power-efficient than consumer GPUs like the RTX 3090.
  • 🖥️ OpenGL is supported on the A100, but DirectX and Vulkan are not.
  • ⚙️ It offers far higher memory bandwidth (roughly 1.5 TB/s) than consumer GPUs; see the arithmetic sketch after this list.
  • 🔍 Future Nvidia GPUs are expected to be even more powerful, built on smaller processes.
  • 📊 The Nvidia A100 reviewed here carries 40 GB of memory for handling extensive data computations.
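
The ~1.5 TB/s bandwidth figure mentioned above follows from the card's 5,120-bit memory bus. Below is a minimal back-of-the-envelope check in Python; the ~2.43 Gbps per-pin HBM2 data rate is an assumed value used for illustration, not a figure stated in the video.

    # Rough check of the ~1.5 TB/s memory bandwidth quoted for the A100 40GB.
    # The 5,120-bit bus width is stated in the video; the per-pin HBM2 data
    # rate below is an assumption used only for illustration.
    bus_width_bits = 5120
    data_rate_gbps_per_pin = 2.43            # assumed per-pin rate

    bandwidth_gbits_per_s = bus_width_bits * data_rate_gbps_per_pin
    bandwidth_gbytes_per_s = bandwidth_gbits_per_s / 8

    print(f"{bandwidth_gbytes_per_s:.0f} GB/s")   # ~1555 GB/s, i.e. about 1.5 TB/s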

Timeline

  • 00:00:00 - 00:05:00

    The video introduces various high-end GPUs that the speaker has previously reviewed but focuses on the Nvidia A100, which is highlighted as an AI enterprise GPU. The speaker explains that Nvidia refused to send them this card, but they acquired it through a fan. The Nvidia A100 has impressive specifications with 40GB of memory, and the video promises a teardown and testing, including Ethereum mining comparisons with similar cards like the CMP 170 HX and the RTX 3090.

  • 00:05:00 - 00:10:00

    The teardown of the Nvidia A100 reveals hardware similar to other high-performance cards, but without video outputs or accessible NVLink connectors. It uses GA100 silicon with enterprise-specific modifications, resulting in high efficiency. The speakers humorously discuss its power connector and internal design quirks, such as a unique unclipping mechanism. They compare these details with the CMP 170 HX and highlight Nvidia's pattern of reusing silicon across different product lines, adjusting capabilities through drivers or hardware modifications.

  • 00:10:00 - 00:15:00

    In testing, the A100 is slightly slower than the RTX 3090 in CUDA rendering, and in OptiX rendering the 3090 pulls far ahead. The speakers explain that the A100 lacks ray tracing cores, so OptiX on it only uses the Tensor cores for AI denoising and falls back to CUDA for everything else, while the 3090's RT and Tensor cores give it a large lead. Power efficiency is emphasized instead, with further exploration of the A100 in mining and folding demonstrating remarkable efficiency and performance per watt compared to the RTX 3090.

  • 00:15:00 - 00:23:46

    The Nvidia A100 further proves its worth in machine learning benchmarks, achieving significantly higher performance than the RTX 3090 while consuming less power. The speaker discusses the advantages of the 7nm manufacturing process used for the A100, sparking excitement for future Nvidia GPUs rumored to be built on even more advanced processes. The card's potential in gaming remains untested because it has no DirectX support, but the efficiency gains are highlighted as promising for next-generation products. The video concludes with sponsorship messages and promotional content. (A rough sketch of a comparable ResNet-50 throughput run follows this timeline.)
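
To make the benchmark point concrete, here is a minimal sketch of a comparable ResNet-50 throughput probe in TensorFlow 2, using the settings mentioned in the video (fp16, XLA, batch size 512) on synthetic data. It is an illustrative approximation rather than the exact NVIDIA container benchmark run in the video, and the batch size will likely need reducing on GPUs with less memory.

    import time
    import tensorflow as tf

    # Settings mentioned in the video: fp16 compute, XLA enabled, batch size 512.
    tf.keras.mixed_precision.set_global_policy("mixed_float16")
    tf.config.optimizer.set_jit(True)

    BATCH = 512          # reduce if the GPU runs out of memory
    STEPS = 20

    # Randomly initialized ResNet-50; only throughput matters here, not accuracy.
    model = tf.keras.applications.ResNet50(weights=None)
    model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")

    # Synthetic ImageNet-sized batches, so no dataset download is needed.
    images = tf.random.uniform((BATCH, 224, 224, 3))
    labels = tf.random.uniform((BATCH,), maxval=1000, dtype=tf.int32)
    dataset = tf.data.Dataset.from_tensors((images, labels)).repeat(STEPS)

    start = time.time()
    model.fit(dataset, epochs=1, verbose=0)   # first step includes XLA compile overhead
    elapsed = time.time() - start
    print(f"~{BATCH * STEPS / elapsed:.0f} images/sec")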

FAQ

  • What is the Nvidia A100 used for?

    The Nvidia A100 is used for enterprise AI, high-performance computing, big data analytics, and machine learning workloads.

  • How was the Nvidia A100 acquired by Linus?

    Linus obtained the Nvidia A100 through a fan who helped procure it, despite Nvidia not providing it for review.

  • Why doesn't Nvidia seed the A100 to reviewers?

    Nvidia generally doesn't provide the A100 to reviewers because it is a specialized, expensive piece of hardware, priced around $10,000.

  • What is unique about the A100's cooling system?

    The A100 uses a passive cooling system with a large heat sink and airflow through a server chassis.

  • How does the A100 compare to the RTX 3090 in rendering?

    The A100 is more power efficient but slightly slower than the RTX 3090 in CUDA rendering, although it excels in specific AI tasks.

  • What is the power consumption difference between the A100 and RTX 3090?

    The A100 consumes less power, around 250 watts, while the RTX 3090 can draw more than 400 watts during heavy tasks.
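
The power figures above come from Nvidia's own monitoring tools (nvidia-smi in the video). As a minimal sketch, the same counter can be polled from Python by wrapping the nvidia-smi CLI; the helper below is illustrative and not something shown in the video.

    import subprocess
    import time

    def gpu_power_draw():
        """Return (index, name, watts) for each GPU reported by nvidia-smi."""
        output = subprocess.check_output(
            ["nvidia-smi",
             "--query-gpu=index,name,power.draw",
             "--format=csv,noheader,nounits"],
            text=True,
        )
        readings = []
        for line in output.strip().splitlines():
            index, name, watts = [field.strip() for field in line.split(",")]
            readings.append((int(index), name, float(watts)))
        return readings

    # Poll every couple of seconds while a benchmark runs in another window.
    for _ in range(5):
        print(gpu_power_draw())
        time.sleep(2)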

Transcript (en)
  • 00:00:00
    - We have looked at a lot of ballin' GPUs over the years,
  • 00:00:03
    whether it's the six Titan Vs we had
  • 00:00:05
    for the six editor's project,
  • 00:00:06
    three GV100 Quadros for 12K ultrawide gaming,
  • 00:00:10
    or even this unreleased mining GPU,
  • 00:00:13
    the CMP 170 HX.
  • 00:00:15
    There are not a lot of cards out there
  • 00:00:17
    that we have not been able to get our hands on
  • 00:00:19
    in one way or another,
  • 00:00:20
    except for one.
  • 00:00:22
    Until now,
  • 00:00:23
    the Nvidia A100,
  • 00:00:25
    this is their absolute top dog,
  • 00:00:28
    AI enterprise high performance compute,
  • 00:00:31
    big data analytics monster,
  • 00:00:33
    and they refused to send it to me.
  • 00:00:37
    Well, I got one anyway Nvidia,
  • 00:00:39
    so, deal with it.
  • 00:00:42
    Just like everyone's gotta deal with my segues,
  • 00:00:44
    SmartDeploy provides out-of-the-box Windows imaging support
  • 00:00:47
    for over 1,500 computer models.
  • 00:00:49
    You can deploy one Windows image to any hardware model
  • 00:00:52
    with ease and you can get free licenses worth over $500
  • 00:00:55
    at smartdeploy.com/linus.
  • 00:00:57
    (upbeat music)
  • 00:01:07
    The first two questions on your mind,
  • 00:01:08
    are probably why we weren't able to get one of these.
  • 00:01:10
    and what ultimately changed
  • 00:01:12
    that resulted in me holding one in my hands right now.
  • 00:01:15
    The answer to the first one
  • 00:01:16
    is that Nvidia just plain
  • 00:01:17
    doesn't seed these things to reviewers
  • 00:01:19
    and at a cost of about $10,000.
  • 00:01:23
    It's not the sort of thing that I would just,
  • 00:01:25
    you know, buy.
  • 00:01:27
    'Cause I got that swagger.
  • 00:01:29
    You know what I'm saying?
  • 00:01:30
    As for how we got one,
  • 00:01:32
    I can't tell you.
  • 00:01:33
    And in fact,
  • 00:01:34
    we even blacked out the serial number
  • 00:01:36
    to prevent the fan who reached out offering to get us one,
  • 00:01:40
    from getting identified.
  • 00:01:42
    This individual agreed to let us
  • 00:01:43
    do anything we want with it.
  • 00:01:45
    So you can bet your butt,
  • 00:01:46
    we're gonna be taking it apart.
  • 00:01:48
    And all we had to offer in return
  • 00:01:50
    was that we would test Ethereum mining on it,
  • 00:01:53
    send a shroud, that'll allow 'em to actually cool the thing
  • 00:01:56
    and reassemble it before we return it.
  • 00:01:58
    So let's compare it really quickly to the CMP 170 HX,
  • 00:02:02
    which it is the most similar card that we have.
  • 00:02:05
    It's the silver metal
  • 00:02:06
    and it's not ripped for my pleasure.
  • 00:02:09
    Regrettable. - [Jake] Alright.
  • 00:02:11
    - And we actually have one other point of comparison.
  • 00:02:13
    This isn't a perfect one.
  • 00:02:14
    This is an RTX 3090.
  • 00:02:16
    And what would've been maybe more apt
  • 00:02:19
    is the Quadro, or rather, they dropped the Quadro branding.
  • 00:02:22
    But the A6000.
  • 00:02:23
    Unfortunately that's another really expensive card
  • 00:02:26
    that I don't have a legitimate reason to buy
  • 00:02:29
    and Nvidia wouldn't send one of those
  • 00:02:30
    for the comparison either.
  • 00:02:32
    So the specs on this are pretty similar.
  • 00:02:34
    We're gonna use it as a standin'
  • 00:02:35
    since we're not really looking
  • 00:02:36
    at any workstation loads anyway.
  • 00:02:38
    So the A100.
  • 00:02:40
    This is a 40 gigabyte card.
  • 00:02:43
    I'm gonna let that sink in for a second.
  • 00:02:45
    And the craziest part,
  • 00:02:47
    is that 40 gigs is not even enough
  • 00:02:49
    for the kinds of workloads
  • 00:02:51
    that these cards are used to crunch through.
  • 00:02:53
    We're talking enormous data sets
  • 00:02:55
    to the point where this 40 gig model,
  • 00:02:57
    is actually obsolete now,
  • 00:02:59
    replaced by an 80 gig model.
  • 00:03:00
    And these NVLink bridge
  • 00:03:02
    connectors on the top here,
  • 00:03:04
    let's go ahead and pull these off.
  • 00:03:05
    These, there we go,
  • 00:03:07
    are used to link up multiples of these cards
  • 00:03:10
    so they can all pull memory
  • 00:03:13
    and work on even larger data sets.
  • 00:03:15
    Now the die at the center of it,
  • 00:03:17
    is a seven nanometer TSMC manufactured GPU
  • 00:03:20
    called the GA 100.
  • 00:03:21
    We're gonna pop this shroud off.
  • 00:03:22
    We're gonna take a look at it.
  • 00:03:24
    It has a base clock of just 765 megahertz,
  • 00:03:27
    but it'll boost up to 1410 megahertz.
  • 00:03:30
    That memory runs
  • 00:03:31
    at a whopping one and a half terabytes a second
  • 00:03:34
    of bandwidth on a massive
  • 00:03:37
    5,120 bit bus.
  • 00:03:40
    It's got 6,912 CUDA cores
  • 00:03:44
    and, what is it?
  • 00:03:45
    250 watt TDP.
  • 00:03:48
    Woooh.
  • 00:03:50
    She's packing.
  • 00:03:51
    - [Jake] Oh, you're just going right for it.
  • 00:03:52
    - I'm going right for it.
  • 00:03:53
    - [Jake] Oh geez.
  • 00:03:54
    - This is Linus Tech Tips.
  • 00:03:55
    - [Jake] And basically every part of this
  • 00:03:57
    is identical to the CMP card.
  • 00:04:00
    - It kinda looks that way.
  • 00:04:01
    I mean the color's obviously different.
  • 00:04:02
    - Yeah, but it looks like the clamshell
  • 00:04:04
    is two pieces in the same manner.
  • 00:04:06
    There's no display outputs.
  • 00:04:08
    The fins look the same.
  • 00:04:09
    - Now here's something the CMP card specifically
  • 00:04:12
    didn't even contain the hardware for video encode,
  • 00:04:16
    if I recall correctly.
  • 00:04:17
    - Yeah, this doesn't have anything.
  • 00:04:18
    - Okay, so it's not that it was fused off.
  • 00:04:19
    It's just plain not on the chip.
  • 00:04:21
    - Not on GA 100, yeah.
  • 00:04:23
    - Okay but,
  • 00:04:24
    - GA102, which is like 3090.
  • 00:04:27
    - Yes.
  • 00:04:27
    - Does have it. - Ooh.
  • 00:04:28
    - And A6000.
  • 00:04:30
    - Okay, you ready?
  • 00:04:31
    - Oh God!
  • 00:04:34
    So yeah. - Hey.
  • 00:04:35
    - It's like exactly the same on the inside.
  • 00:04:36
    Same junk power connector.
  • 00:04:38
    - Wow.
  • 00:04:39
    That is super junk,
  • 00:04:41
    check this out guys.
  • 00:04:41
    It uses a single eight pin EPS power connector,
  • 00:04:45
    which you might think is a PCIE power connector.
  • 00:04:49
    So here, look, I'll show you guys.
  • 00:04:51
    This is an eight pin,
  • 00:04:53
    like normal GPU connector,
  • 00:04:54
    but watch, cannot go in.
  • 00:04:57
    But if we take the connector
  • 00:04:58
    out of our CPU socket on the motherboard,
  • 00:05:02
    There you go although,
  • 00:05:03
    the clips are interfering a little bit.
  • 00:05:05
    I mean, what the heck is going on here,
  • 00:05:07
    ladies and gentlemen?
  • 00:05:08
    - You need more power.
  • 00:05:09
    - Yeah exactly.
  • 00:05:10
    - So you can combine two PCIE connectors into that.
  • 00:05:14
    - [Andy] Can't remember how to get it outta here.
  • 00:05:15
    I see the fingerprint of the technician
  • 00:05:17
    who assembled the card though.
  • 00:05:18
    - I think we have to unclip this part first.
  • 00:05:21
    Oh, there's a little screw, right?
  • 00:05:22
    - There's a little screw.
  • 00:05:23
    - Haha, third type of screws.
  • 00:05:26
    - [Andy] Yourself. - Didn't see that one, nerd.
  • 00:05:28
    - [Andy] You're a nerd.
  • 00:05:29
    - [Jake] Your face is a nerd.
  • 00:05:30
    - [Andy] Your but nerd.
  • 00:05:33
    - [Jake] Whoa.
  • 00:05:34
    - It's not coming off, Jake.
  • 00:05:36
    - What? You gotta like tilt it out, buddy.
  • 00:05:38
    Whoa, whoa, whoa.
  • 00:05:39
    Don't pull the cooler off.
  • 00:05:40
    - See?
  • 00:05:41
    It's like it's caught back here.
  • 00:05:43
    - Hey ho.
  • 00:05:44
    Hey, how you doing?
  • 00:05:45
    - Jesus.
  • 00:05:48
    - Stressful.
  • 00:05:49
    Look, maybe if we break it,
  • 00:05:52
    you'll actually have to buy one.
  • 00:05:53
    - I don't wanna buy one.
  • 00:05:54
    That's not the goal. - What?
  • 00:05:55
    - I thought you put your hand up for a high five.
  • 00:05:57
    I was like, "well, what are you talking about?
  • 00:05:59
    I don't want to buy one."
  • 00:06:00
    - Why not?
  • 00:06:01
    Whoa, what is going on here?
  • 00:06:02
    You see that?
  • 00:06:03
    - It looks like there was a thermal pad there or something,
  • 00:06:05
    but there isn't, its like greasy.
  • 00:06:07
    - It actually,
  • 00:06:08
    no, look at it closer.
  • 00:06:09
    It's not greasy.
  • 00:06:10
    It's, you see how this is brushed almost.
  • 00:06:12
    Or looks like somebody sandblasted it.
  • 00:06:15
    - That part's not.
  • 00:06:17
    I don't remember that on this card.
  • 00:06:18
    - Alright, so the spring loading mechanism
  • 00:06:21
    is just from the bend of the back plate, that's kinda cool.
  • 00:06:23
    - [Jake] So I checked the CMP thing.
  • 00:06:26
    Doesn't look like it.
  • 00:06:27
    - [Andy] I wonder why they wouldn't have like a map.
  • 00:06:28
    - [Jake] This doesn't look brushed at all.
  • 00:06:31
    What did we, last time we twisted?
  • 00:06:33
    - [Andy] No, I don't think we did.
  • 00:06:34
    - Yeah we did.
  • 00:06:35
    - [Andy] I'm pretty sure I just rimmed on it.
  • 00:06:37
    - [Jake] Oh God! No.
  • 00:06:38
    You were against rimming on it.
  • 00:06:39
    And then we were like, just twist a little.
  • 00:06:41
    - [Jake] Oh.
  • 00:06:42
    God.
  • 00:06:43
    Ah.
  • 00:06:44
    It has an IHS.
  • 00:06:45
    It looks basically the same.
  • 00:06:47
    - [Andy] Yeah.
  • 00:06:48
    - [Jake] We're gonna have to clean that off and see
  • 00:06:51
    there's not much alcohol.
  • 00:06:53
    - [Jake] No, I like to go in dry first.
  • 00:06:55
    So yep, that's the same thing, alright.
  • 00:06:58
    I mean, this isn't the first time Nvidia
  • 00:07:00
    has used the same Silicon in two different products
  • 00:07:03
    with two different capabilities.
  • 00:07:05
    We see the same thing
  • 00:07:06
    with their Quadro lineup versus their GeForce lineup
  • 00:07:09
    where things will just be disabled
  • 00:07:10
    through drivers or fusing off different functional units
  • 00:07:13
    on the chip.
  • 00:07:14
    What I wanna know then
  • 00:07:15
    is besides the lack of NV Link connectors on this one.
  • 00:07:17
    - Well, they are in there.
  • 00:07:19
    They're just not accessible and they probably don't work.
  • 00:07:21
    - Right.
  • 00:07:22
    What is the actual difference
  • 00:07:23
    in function between them?
  • 00:07:25
    (Jake sighs)
  • 00:07:26
    - Well, this one doesn't have full PCIE 16X,
  • 00:07:29
    - Right? - It has less memory.
  • 00:07:32
    I think it has way less transistors,
  • 00:07:33
    but it is still a GA100.
  • 00:07:35
    - Yeah, so the transistors are there.
  • 00:07:37
    - Yeah, they're probably just not functional.
  • 00:07:40
    Let me see what the chip number is on that one.
  • 00:07:42
    - Yeah, 'cause were we not even able
  • 00:07:43
    to find a proper Nvidia.com reference to this one anyway.
  • 00:07:47
    So we're just relying on someone else's spec sheet.
  • 00:07:49
    So the transistor count could just be wrong.
  • 00:07:51
    - Okay, so this is so the CMP card was a GA.
  • 00:07:55
    - Look at this guy?
  • 00:07:56
    - Yeah.
  • 00:07:57
    What a weirdo.
  • 00:07:58
    GA 100-105F.
  • 00:08:00
    And this is a GA100-833.
  • 00:08:04
    - If it's a GA100,
  • 00:08:06
    I guess it could be a different GA100.
  • 00:08:07
    I don't know.
  • 00:08:08
    - I mean, it used to be back in the day,
  • 00:08:09
    you would assume that it's just using the same Silicon
  • 00:08:11
    as the GeForce cards because Nvidia's data center business
  • 00:08:14
    hadn't gotten that big yet,
  • 00:08:15
    but nowadays, they can totally justify,
  • 00:08:17
    an individual, like, new die design
  • 00:08:20
    for a particular lineup
  • 00:08:21
    of enterprise product.
  • 00:08:22
    - And interestingly enough,
  • 00:08:23
    the SXM version doesn't have an IHS
  • 00:08:26
    at least it seems that way.
  • 00:08:28
    But the SXM version is also like 400 Watts.
  • 00:08:31
    And this is like 250.
  • 00:08:33
    - [Andy] Yeah, totally different classes
  • 00:08:34
    of capabilities, alright?
  • 00:08:36
    Let's put it back together then, shall we?
  • 00:08:38
    - I got your new goop. - Goop me.
  • 00:08:39
    - I brought two goops.
  • 00:08:40
    - We're going for the no look catch.
  • 00:08:46
    - Oh yeah baby. - Yes.
  • 00:08:49
    X marks the spot, baby.
  • 00:08:52
    My finest work.
  • 00:08:53
    - Maybe it'll perform better now.
  • 00:08:54
    - [Andy] Probably not.
  • 00:08:55
    (Jake laughs)
  • 00:08:56
    (Andy truck signals)
  • 00:09:00
    We're backing it up.
  • 00:09:01
    (Jake chuckles)
  • 00:09:03
    - [Jake] Cool story, bro.
  • 00:09:04
    - [Andy] Thanks.
  • 00:09:05
    Thanks bro.
  • 00:09:06
    - Where's our back plate.
  • 00:09:08
    Did you take it?
  • 00:09:09
    Oh shoot.
  • 00:09:10
    - Yes. - Black.
  • 00:09:11
    I thought it was gold.
  • 00:09:12
    I was looking for gold.
  • 00:09:13
    (Jake laughs)
  • 00:09:14
    - [Jake] Aren't we all. - I don't know about you,
  • 00:09:16
    but I found my gold.
  • 00:09:17
    - What's that?
  • 00:09:19
    - Yvonne.
  • 00:09:20
    - Shut up (chuckles)
  • 00:09:22
    - Alright.
  • 00:09:22
    Alright.
  • 00:09:23
    Let's get going here.
  • 00:09:24
    Which one do you wanna put on the bench first?
  • 00:09:26
    - What do you mean?
  • 00:09:26
    We're not gonna compare to that thing.
  • 00:09:27
    It doesn't do anything.
  • 00:09:30
    We don't need this thing.
  • 00:09:31
    - But here we go, boys.
  • 00:09:32
    - We can't put this in the first slot.
  • 00:09:33
    'Cause we don't have a display output.
  • 00:09:35
    - You like the bottom one? - Yeah,
  • 00:09:36
    - You're a bottom?
  • 00:09:38
    - Sure.
  • 00:09:39
    - This, okay.
  • 00:09:41
    This is how you flex IT style.
  • 00:09:43
    Now you might have noticed
  • 00:09:44
    at some point that the A100
  • 00:09:46
    doesn't have any sort of cooling fan.
  • 00:09:47
    It's just one big fat, long heat sink
  • 00:09:50
    with a giant vapor chamber under it to spread the heat
  • 00:09:53
    from that massive GPU.
  • 00:09:55
    So Jake actually designed
  • 00:09:57
    what we call the shroud donator.
  • 00:09:59
    It allows us to take these two screws
  • 00:10:01
    that are on the back of the card
  • 00:10:02
    for securing it in a server chassis,
  • 00:10:03
    because that's how it's designed to be used.
  • 00:10:05
    So it's passive,
  • 00:10:06
    but there's lots of airflow going through the chassis,
  • 00:10:09
    and then lets us take those screw holes,
  • 00:10:12
    and mount a fan to the back of the card.
  • 00:10:14
    It's frankly not amazing.
  • 00:10:17
    (Jake chuckles)
  • 00:10:18
    - What? No.
  • 00:10:20
    That is aerodynamics at its peak.
  • 00:10:22
    You should hire me to work on F1 cars, okay?
  • 00:10:25
    - Yeah.
  • 00:10:26
    Not so much.
  • 00:10:27
    - Yeah.
  • 00:10:27
    It only blows probably more air out this end
  • 00:10:30
    from the back pressure than it does on this end. (laughs)
  • 00:10:32
    But it's enough to cool it, I swear.
  • 00:10:34
    - It is. - Yeah.
  • 00:10:35
    - Let's go ahead and turn on the computer, shall we?
  • 00:10:39
    - Oh yeah, so a couple interesting points here.
  • 00:10:41
    It wouldn't boot right off the bat.
  • 00:10:43
    You have to enable Above 4G decoding.
  • 00:10:45
    And then I also had to go in and I think it's called like
  • 00:10:47
    4G MMIO or something like that.
  • 00:10:50
    I had to set that to 42.
  • 00:10:52
    - Okay.
  • 00:10:53
    - The answer to the universe.
  • 00:10:55
    - Yes.
  • 00:10:56
    Thank you.
  • 00:10:57
    And they are both here.
  • 00:10:57
    A100 PCIE 40 fricking gigabytes.
  • 00:11:02
    - I installed the like game ready driver for the 3090,
  • 00:11:05
    and then I installed the data center driver,
  • 00:11:07
    and I think it overwrote it,
  • 00:11:08
    but the game ready driver,
  • 00:11:10
    it still showed as like active
  • 00:11:11
    and you could do stuff with the A100 and vice versa.
  • 00:11:14
    So it's probably fine.
  • 00:11:16
    - Now, interestingly,
  • 00:11:17
    the A100 doesn't show up in task manager at all.
  • 00:11:20
    - [Jake] Did the CMP, I can't,
  • 00:11:22
    - [Andy] remember. - No, no.
  • 00:11:23
    I don't think it did actually, anyways.
  • 00:11:24
    - What do you wanna do in Blender,
  • 00:11:26
    classroom?
  • 00:11:26
    BMW?
  • 00:11:27
    BMW's probably too short.
  • 00:11:28
    - Yeah.
  • 00:11:29
    Let's do classroom.
  • 00:11:30
    I think BMW on a 3090 is like 15 seconds
  • 00:11:32
    or something like that anyway so.
  • 00:11:35
    - That's also like the spiciest 3090.
  • 00:11:37
    - [Jake] That you can get. Yeah, pretty much.
  • 00:11:39
    It's just so thick.
  • 00:11:40
    Why would you ever use it?
  • 00:11:43
    - Because you wanted,
  • 00:11:43
    - Is it even doing anything like (chuckles)
  • 00:11:45
    - Here's one reason,
  • 00:11:46
    'cause you can do classroom renders
  • 00:11:48
    in a minute and 18 seconds, that's why?
  • 00:11:51
    - Okay.
  • 00:11:52
    Well, what about the A100?
  • 00:11:52
    You didn't plug the fan in, hey.
  • 00:11:54
    - Oh whoops.
  • 00:11:55
    How hot is this?
  • 00:11:56
    - Probably warm.
  • 00:11:57
    - Fortunately it hasn't been doing anything.
  • 00:11:59
    Time to beat is a minute and 18 seconds.
  • 00:12:02
    So let's go ahead and see how it does.
  • 00:12:05
    - It feels like this is the intake.
  • 00:12:08
    I mean it's hot.
  • 00:12:09
    So like, - Oh yeah.
  • 00:12:10
    But it's going.
  • 00:12:11
    It's going Jake.
  • 00:12:12
    It's going.
  • 00:12:13
    You did good.
  • 00:12:14
    - It works enough.
  • 00:12:15
    This should be like, this is all.
  • 00:12:16
    - This should be way faster. - Way huge GPU, right?
  • 00:12:19
    - [Andy] It's actually slower.
  • 00:12:20
    - [Jake] How much?
  • 00:12:21
    Not by much.
  • 00:12:22
    - It's like a few seconds, but it's slower.
  • 00:12:25
    - So it's worse in CUDA.
  • 00:12:26
    What about Optixs?
  • 00:12:28
    So the interesting thing
  • 00:12:30
    is this card doesn't have Ray Tracing cores.
  • 00:12:33
    The 3090 does,
  • 00:12:35
    see, you'd think that OptiX
  • 00:12:37
    would only work on the 3090, right?
  • 00:12:38
    - Do you want me to just try the A100?
  • 00:12:40
    - Yeah, sure.
  • 00:12:41
    It's still GPU compute.
  • 00:12:43
    - I mean you gotta give it to it in terms of efficiency.
  • 00:12:47
    For real though, even running two renders to the 3090's one,
  • 00:12:51
    the average power consumption here is still lower.
  • 00:12:54
    - [Jake] Yeah well, and looking at while it's running,
  • 00:12:56
    it's like 150 Watts.
  • 00:12:58
    - Yeah.
  • 00:12:59
    - [Jake] Versus 350 or whatever it was on the 1390.
  • 00:13:02
    - Yeah, ready to go again?
  • 00:13:04
    - [Jake] Yep.
  • 00:13:05
    - Okay. - [Jake] Oh my God.
  • 00:13:07
    - Man, this thing is fast.
  • 00:13:08
    - What's the power consumption?
  • 00:13:10
    - [Andy] Holy bananas.
  • 00:13:10
    - [Jake] 353.
  • 00:13:13
    Still like just,
  • 00:13:15
    I want one of these.
  • 00:13:17
    This thing is sick.
  • 00:13:18
    (Jake laughs)
  • 00:13:19
    It's way faster.
  • 00:13:19
    - Yeah.
  • 00:13:20
    There's no question.
  • 00:13:21
    We don't even need to.
  • 00:13:22
    - It's gonna be like thirty seconds.
  • 00:13:23
    - Yeah.
  • 00:13:24
    Not even close.
  • 00:13:25
    - So do you wanna know why?
  • 00:13:27
    - I would love to know why.
  • 00:13:28
    - You said it earlier.
  • 00:13:29
    You just weren't really thinking about it.
  • 00:13:31
    This has half the CUDA cores of a 3090,
  • 00:13:34
    it's like seven-thousand-ish, I think.
  • 00:13:36
    - Right, so it's just full of like machine learning stuff.
  • 00:13:38
    - Yeah, so it has basically half the CUDA cores.
  • 00:13:42
    So the fact that it is even close
  • 00:13:44
    is kind of crazy in CUDA mode.
  • 00:13:45
    But in OptiX, what I found out
  • 00:13:47
    is OptiX will use the Tensor cores
  • 00:13:50
    for like AI Denoising,
  • 00:13:52
    - [Andy] But nothing else.
  • 00:13:53
    - Which you'll see in there.
  • 00:13:54
    So I think it's falling back to CUDA for the other stuff.
  • 00:13:57
    - [Andy] Got it.
  • 00:13:58
    - But the 3090 has Ray Tracing and Tensor cores so.
  • 00:14:02
    - Right.
  • 00:14:02
    - It just demolishes (chuckles)
  • 00:14:05
    - Where's the thing where you can select apps
  • 00:14:08
    and then tell it which GPU to use.
  • 00:14:10
    Yeah, here we go.
  • 00:14:12
    No, so it'll not allow you to select the A100 to run games,
  • 00:14:15
    even if we could pipe it through our onboard,
  • 00:14:18
    or through a different graphics card like we did with that.
  • 00:14:21
    - [Jake] It doesn't have DirectX Ray
  • 00:14:22
    - Mining card ages ago.
  • 00:14:22
    No DirectX support whatsoever.
  • 00:14:25
    - [Jake] Let's check it in GPU-Z.
  • 00:14:26
    - So way fewer CUDA cores.
  • 00:14:28
    You can see that
  • 00:14:29
    we go from over 10,000,
  • 00:14:31
    to a lot less than 10,000.
  • 00:14:35
    Pixel fillrates are actually higher.
  • 00:14:36
    I guess that's your HBM2 memory talking.
  • 00:14:40
    - [Jake] One point five Gigabytes per second.
  • 00:14:43
    - What's a 3090,
  • 00:14:43
    One point five terabytes per second.
  • 00:14:45
    It's like
  • 00:14:47
    - [Jake] 50% or more
  • 00:14:48
    - 60% almost.
  • 00:14:50
    - Holy banana.
  • 00:14:51
    - But what about the supported tech?
  • 00:14:54
    Yeah, so we can do CUDA, OpenCL,
  • 00:14:57
    - [Jake] PhysX (laughing)
  • 00:14:59
    - Sure.
  • 00:14:59
    - [Jake] We should set it as the PhysX card.
  • 00:15:01
    - Dedicated PhysX card.
  • 00:15:03
    All the rag dolls everywhere.
  • 00:15:06
    - [Jake] And OpenGL but not Direct anything or Vulkan even.
  • 00:15:09
    - OpenGL.
  • 00:15:11
    Now that's interesting.
  • 00:15:13
    - [Jake] Go to the advanced tab.
  • 00:15:14
    'Cause you can select
  • 00:15:15
    like a specific DirectX version
  • 00:15:17
    at the top under General.
  • 00:15:19
    Like well, the DX 12.
  • 00:15:21
    What does it say?
  • 00:15:22
    Device not found.
  • 00:15:22
    It's the same as the mining card.
  • 00:15:25
    It'll do OpenCL.
  • 00:15:27
    So we can't mine on it (chuckles)
  • 00:15:30
    - Alright. I mean, should we try that?
  • 00:15:32
    - [Jake] Yeah, we could do mining or folding or.
  • 00:15:34
    - Sure, I have a feeling that's gonna kind of suck
  • 00:15:36
    for that too.
  • 00:15:37
    - There's not, like, AI in mining.
  • 00:15:39
    - I don't think so.
  • 00:15:40
    It's still a big GPU dude.
  • 00:15:42
    - So you can't.
  • 00:15:44
    - Well suck is relative, right?
  • 00:15:45
    Like for the price you'd never buy.
  • 00:15:46
    - I think it might be better than the CMP card though.
  • 00:15:49
    Just a little bit. - Shut up.
  • 00:15:51
    - I think so.
  • 00:15:52
    So the only thing you can adjust,
  • 00:15:54
    I think this is the same with the CMP card
  • 00:15:56
    is the core clock and the power limit.
  • 00:15:58
    You can't mess with the memory speed.
  • 00:15:59
    - [Andy] And you can move the power limit only down
  • 00:16:01
    it looks like.
  • 00:16:02
    - [Jake] Yeah.
  • 00:16:03
    Top is the 3090,
  • 00:16:03
    bottom is the A100.
  • 00:16:04
    - [Andy] Wow.
  • 00:16:05
    That is a crap ton faster than a 3090.
  • 00:16:08
    - [Jake] It's pretty much the same as the CMP,
  • 00:16:10
    but look at the efficiency.
  • 00:16:11
    - 714 kilo hash per watt.
  • 00:16:15
    - [Jake] And I bet you if we lower the power limit
  • 00:16:17
    to like 80,
  • 00:16:19
    it's a little bit lower speed.
  • 00:16:20
    Maybe we can go, I don't know.
  • 00:16:22
    We probably don't have to tinker with this too much.
  • 00:16:25
    I mean, it doesn't draw that much power to begin with,
  • 00:16:27
    I guess. - Yeah.
  • 00:16:28
    I think it's pretty fricking efficient
  • 00:16:30
    right outta the box.
  • 00:16:31
    - I mean the efficiency is better.
  • 00:16:33
    It's a little bit better,
  • 00:16:34
    but before it was doing 175 mega hash
  • 00:16:36
    roughly at 250 Watts,
  • 00:16:38
    so it's pretty pretty good.
  • 00:16:41
    3090, you can probably do like 300 Watts
  • 00:16:44
    with 120 mega hash.
  • 00:16:45
    We're running the folding client now.
  • 00:16:48
    I've had it running for a few minutes,
  • 00:16:49
    and it's kind of hard to say.
  • 00:16:52
    The thing with folding is,
  • 00:16:53
    based on whatever project you're running,
  • 00:16:55
    which is whatever job the server has sent you to process,
  • 00:16:59
    your points per day will be higher or lower.
  • 00:17:01
    So it's possible that the A100 got a job
  • 00:17:03
    that rewards less points than the 3090 did.
  • 00:17:07
    It does look like it's a bit higher,
  • 00:17:08
    but you can see our 3090.
  • 00:17:10
    This is like a little,
  • 00:17:11
    like comparison app thing
  • 00:17:13
    is 31% lower than the average.
  • 00:17:16
    So it's probably just that this job
  • 00:17:17
    doesn't give you that many points.
  • 00:17:19
    - Got it.
  • 00:17:20
    - The interesting part is
  • 00:17:21
    the 3090's drawing.
  • 00:17:24
    400 watt.
  • 00:17:25
    - [Both] 400.
  • 00:17:26
    - Holy shnikes.
  • 00:17:27
    - [Jake] A100 is drawing.
  • 00:17:28
    - 240.
  • 00:17:29
    (Jake laughing)
  • 00:17:30
    Man, that's efficient
  • 00:17:32
    and performance per watt.
  • 00:17:33
    Maybe gamers don't care that much.
  • 00:17:35
    Actually we know for a fact,
  • 00:17:36
    gamers don't care that much.
  • 00:17:37
    In the data center, that's everything,
  • 00:17:40
    because the cost of the card,
  • 00:17:42
    is trivial compared to the cost of power delivery,
  • 00:17:45
    and cooling on a data center scale.
  • 00:17:48
    - Especially when you have eight of these
  • 00:17:49
    with a 400 watt power budget,
  • 00:17:51
    like you would get on the SXM cards in a single chassis,
  • 00:17:54
    times 50 chassis,
  • 00:17:56
    like that's a lot of power (chuckles)
  • 00:18:00
    - Let's try something, machine learning.
  • 00:18:03
    - Unfortunately for obvious reasons,
  • 00:18:05
    most machine learning or deep learning,
  • 00:18:07
    whatever you want to call it, benchmarks,
  • 00:18:09
    don't run on windows.
  • 00:18:10
    So instead I've switched over to Ubuntu
  • 00:18:12
    and we've set up the CUDA Toolkit,
  • 00:18:14
    which is gonna include our GPU drivers
  • 00:18:15
    that we need to even run the thing
  • 00:18:17
    as well as Docker and the Nvidia Docker Container,
  • 00:18:20
    which will allow us to run the benchmark.
  • 00:18:21
    We're gonna be running the ResNet-50 benchmark,
  • 00:18:24
    which runs within TensorFlow 2.
  • 00:18:26
    This is a really, really common benchmark
  • 00:18:28
    for big data clusters and stuff.
  • 00:18:30
    Except our cluster is just one GPU.
  • 00:18:34
    In a separate window, I've got Nvidia SMI running.
  • 00:18:36
    It's kind of like the Linux version of MSI Afterburner,
  • 00:18:39
    but it's made by Nvidia, so not quite,
  • 00:18:42
    but what it's good for,
  • 00:18:43
    is at least telling us our power and the memory usage,
  • 00:18:46
    which we should see spike a lot
  • 00:18:47
    when we run this benchmark,
  • 00:18:49
    I took the liberty of pre-creating a command
  • 00:18:51
    to run the benchmark.
  • 00:18:52
    So we're gonna be running with XLA on
  • 00:18:53
    to hopefully bump the numbers a bit.
  • 00:18:55
    We will do that for the A100 as well.
  • 00:18:57
    So no worries there.
  • 00:18:58
    It should be the same
  • 00:18:59
    as well as using, what do you want?
  • 00:19:01
    Look, he left cause he didn't have time for this.
  • 00:19:03
    And now he's back.
  • 00:19:04
    This is the world's most expensive lint roller.
  • 00:19:06
    (Andy chuckles)
  • 00:19:07
    I even don't remember what I was saying, damn it.
  • 00:19:10
    Distractions aside, we're gonna be running with XLA on.
  • 00:19:13
    That'll probably give us a bit higher number
  • 00:19:15
    than you would normally,
  • 00:19:16
    but it is still accurate
  • 00:19:18
    and we're gonna be running the same settings
  • 00:19:19
    on the A100 as well.
  • 00:19:20
    So no concerns there.
  • 00:19:21
    We'll also be using a batch size of 512
  • 00:19:24
    as well as fp16 rather than fp32.
  • 00:19:27
    So if you wanna re-create these tests yourself,
  • 00:19:29
    you totally can.
  • 00:19:30
    Let's see what our 3090 can do.
  • 00:19:33
    Look at that 24 gigs of VRAM completely used.
  • 00:19:39
    God, I don't know if there's any application
  • 00:19:41
    aside from like Premier that will use all that VRAM.
  • 00:19:44
    I'm sure Andy can attest to that (strained laugh)
  • 00:19:47
    Okay, 1,400 images a second.
  • 00:19:49
    That's pretty respectable.
  • 00:19:51
    I think like a V100,
  • 00:19:53
    which is the predecessor to the A100
  • 00:19:55
    does like less than 1000.
  • 00:19:58
    So the fact that a 3090,
  • 00:19:59
    which is a consumer gaming card
  • 00:20:01
    can pull off those kind of numbers is huge.
  • 00:20:04
    Mind you, the wattage, 412 Watts.
  • 00:20:08
    That's a lot of power.
  • 00:20:11
    It'll be interesting to see how much more efficient
  • 00:20:13
    the A100 is when we try that after.
  • 00:20:15
    The test is done now,
  • 00:20:16
    and the average total images per second
  • 00:20:18
    is 1,435.
  • 00:20:21
    It's pretty good.
  • 00:20:22
    I've gone ahead and added our A100
  • 00:20:24
    so we can run the benchmarks on that instead.
  • 00:20:25
    And I'm expecting,
  • 00:20:27
    this is gonna be substantially more performant.
  • 00:20:30
    So it's the same test.
  • 00:20:31
    I'm just gonna run the command here.
  • 00:20:33
    Gonna wait a few seconds.
  • 00:20:35
    We got Nvidia SMI up again.
  • 00:20:37
    You can see that it's just running on the A100.
  • 00:20:40
    The RAM on the 3090 is not getting filled.
  • 00:20:42
    We're just using that as a display output.
  • 00:20:44
    See, all 40 gigabytes used.
  • 00:20:46
    That's crazy.
  • 00:20:48
    (Jake laughing)
  • 00:20:50
    If we thought the 3090 was fast.
  • 00:20:53
    Look at that Andy.
  • 00:20:54
    That's like a full 1000 images more,
  • 00:20:56
    we're getting like 2400
  • 00:20:58
    instead of 1400
  • 00:20:59
    and the icing on the cake.
  • 00:21:01
    If you look at Nvidia SMI,
  • 00:21:03
    we're using like 250 Watts
  • 00:21:06
    instead of 400,
  • 00:21:07
    while getting like almost double the performance.
  • 00:21:10
    That is nuts.
  • 00:21:12
    - Probably the coolest thing
  • 00:21:13
    about this whole experience though,
  • 00:21:15
    is seeing the Ampere architecture
  • 00:21:17
    on a seven nanometer manufacturing process.
  • 00:21:19
    'cause you gotta remember
  • 00:21:20
    while none of this is applicable to our daily business.
  • 00:21:22
    What this card does do,
  • 00:21:24
    is excite me for the next generation of Nvidia GPUs.
  • 00:21:27
    Because even though the word on the street
  • 00:21:29
    is that the upcoming Ada Lovelace architecture,
  • 00:21:32
    is not going to be that different from Ampere.
  • 00:21:35
    Consider this, Nvidia's gaming lineup
  • 00:21:38
    is built on Samsung's eight nanometer node,
  • 00:21:40
    while the A100 is built on TSMC's seven nanometer node.
  • 00:21:44
    Now we've talked a fair bit about how nanometers,
  • 00:21:47
    from one fab to another,
  • 00:21:49
    can't really be directly compared in that way.
  • 00:21:52
    But what we can do, is say that it is rumored,
  • 00:21:55
    that Nvidia will be building
  • 00:21:56
    the newer Ada Lovelace gaming GPUs
  • 00:21:59
    on TSMC's five nanometer node,
  • 00:22:02
    which should perform even better
  • 00:22:04
    than their seven nanometer node.
  • 00:22:05
    And if the efficiency improvements
  • 00:22:07
    are anything like what we're seeing here,
  • 00:22:09
    we are expecting those cards
  • 00:22:10
    to be absolute freaking monsters.
  • 00:22:13
    So good luck buying one.
  • 00:22:16
    (Jake laughing)
  • 00:22:17
    Hey, at least you can buy one of these.
  • 00:22:19
    We've got new pillows, that's right.
  • 00:22:22
    This is the, what are we calling it?
  • 00:22:24
    - [Jake] Couch ripper.
  • 00:22:25
    - The couch ripper the couch rip.
  • 00:22:26
    It's an AMD themed version
  • 00:22:28
    of our CPU pillow with alpaca and regular filling blend.
  • 00:22:31
    And this video is brought to you by our sponsor,
  • 00:22:34
    ID Agent.
  • 00:22:35
    90% of data breaches start with a phishing email.
  • 00:22:39
    So you can reduce your organization's chance
  • 00:22:41
    of experiencing a cybersecurity disaster
  • 00:22:43
    by up to 70% with security awareness training.
  • 00:22:46
    That includes phishing simulation,
  • 00:22:48
    BullPhish ID by ID Agent is a phishing simulation platform
  • 00:22:52
    that transforms your biggest attack surface,
  • 00:22:55
    into your biggest defensive asset.
  • 00:22:56
    You can add every employee to your security team
  • 00:22:59
    with security awareness training
  • 00:23:00
    that empowers them to spot and stop phishing threats.
  • 00:23:03
    You can automate training campaigns
  • 00:23:04
    and reporting for stress free,
  • 00:23:06
    consistent training that gets results.
  • 00:23:08
    Choose from a rich set of
  • 00:23:10
    Plug and Play Phishing campaign kits
  • 00:23:11
    and video lessons accompanied by short quizzes,
  • 00:23:13
    or you can create your own phishing campaigns
  • 00:23:16
    and training materials easily.
  • 00:23:17
    BullPhish ID provides effective, affordable, one-stop
  • 00:23:20
    phishing resistance training
  • 00:23:21
    that fits any business and budget.
  • 00:23:23
    Get two months for free and 50% off setup
  • 00:23:25
    at bullphishid@it.idagent.com/linus
  • 00:23:30
    If you guys enjoyed this video,
  • 00:23:32
    maybe go check out our previous video,
  • 00:23:33
    looking in more depth at the CMP 170 HX.
  • 00:23:39
    - [Jake] I like this silver better.
  • 00:23:40
    - If we were smart,
  • 00:23:41
    we'd be mining on this,
  • 00:23:42
    but we're not that smart.
  • 00:23:43
    - [Jake] Well, you know, mining is bad.
Tags
  • Nvidia A100
  • GPU Comparison
  • AI
  • Machine Learning
  • Tech Review
  • RTX 3090
  • Power Efficiency
  • GPU Mining
  • Cooling System
  • Data Analytics