Memory Wars! Apple vs Ryzen - Is Unified Memory Faster than Shared GPU Memory?

00:13:09
https://www.youtube.com/watch?v=Cn_nKxl8KE4

Summary

TLDR: In this video, Dave Plummer compares the memory architectures of Apple's M2 Ultra Mac Pro and GMKtec's Ryzen mini PC built around the AMD 8060S APU. He highlights the differences in memory access, bandwidth, and architecture, emphasizing Apple's unified memory architecture (UMA), which provides high bandwidth and low latency. The M2 Ultra's memory bandwidth reaches up to 800 GB/s, while the Ryzen 8060S manages around 80 GB/s. The video weighs the advantages of each system: the M2 Ultra is ideal for professional content creation, while the Ryzen system offers flexibility and upgradability. Dave also touches on how software optimization affects performance, concluding that the choice between the two systems depends on the user's specific needs and preferences.
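
As a quick sanity check on those bandwidth figures, peak memory bandwidth is just the bus width in bytes multiplied by the transfer rate. Here is a minimal sketch using the numbers quoted in the video (a 1024-bit LPDDR5 interface at roughly 6,400 MT/s for the M2 Ultra, and a 128-bit dual-channel DDR5 interface at roughly 5,600 MT/s for the Ryzen box); these are theoretical peaks, so sustained throughput will be lower:

```python
# Peak memory bandwidth = (bus width in bytes) x (transfers per second).
# The bus widths and transfer rates below are the ones quoted in the video,
# not measured values.

def peak_bandwidth_gb_per_s(bus_width_bits: int, megatransfers_per_s: float) -> float:
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * megatransfers_per_s * 1e6 / 1e9

print(peak_bandwidth_gb_per_s(1024, 6400))  # ~819 GB/s -> Apple's "up to 800 GB/s"
print(peak_bandwidth_gb_per_s(128, 5600))   # ~90 GB/s  -> the "around 80 GB/s" class
```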

Takeaways

  • 💻 Dave Plummer compares the M2 Ultra and the Ryzen 8060S.
  • 📊 M2 Ultra features unified memory architecture (UMA).
  • ⚡ M2 Ultra offers up to 800 GB/s bandwidth.
  • 🔄 Ryzen 8060S provides flexibility and upgradability.
  • 🖥️ M2 Ultra excels in professional content creation.
  • 🔧 Ryzen system is better for users who like to tinker.
  • 📈 Software optimization impacts performance significantly.
  • 🔗 Cache coherency reduces latency in M2 Ultra.
  • ⚖️ Trade-off between performance and modularity exists.
  • 🔍 Choose the system based on specific needs.

Timeline

  • 00:00:00 - 00:05:00

    Dave Plummer introduces the topic of modern memory architecture, emphasizing the importance of memory access and sharing in system performance. He compares two platforms: Apple's M2 Ultra Mac Pro with unified memory and GMKtec's Ryzen 8060S APU with shared DDR5 memory. The discussion highlights the architectural differences, focusing on bandwidth, bus width, and cache coherency, setting the stage for a detailed comparison of their performance in creative and technical workflows.

  • 00:05:00 - 00:13:09

    The M2 Ultra's unified memory architecture allows for high bandwidth and low latency, enabling efficient data sharing between CPU, GPU, and other components. In contrast, the Ryzen's shared memory model leads to contention and slower performance due to its narrower memory bus and the need for data copying. While the M2 Ultra excels in high-performance tasks like video editing and machine learning, the Ryzen offers flexibility and upgradability, making it suitable for general use and tinkering. Ultimately, the choice between the two depends on specific user needs and workloads.
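
The narrower-bus point above can be made concrete with a rough transfer count. This is only an order-of-magnitude sketch: the uncompressed 8K RGBA frame size is an assumed example, and real memory controllers move data in bursts, so treat the absolute counts as illustrative rather than measured:

```python
# How many bus-width-sized transfers does one uncompressed 8K RGBA frame need?
# Frame dimensions and 4 bytes per pixel are assumptions for illustration only.
frame_bytes = 7680 * 4320 * 4  # about 132.7 MB per frame

for bus_bits in (1024, 128):
    bytes_per_transfer = bus_bits // 8            # 128 bytes vs. 16 bytes per transfer
    transfers = frame_bytes // bytes_per_transfer
    print(f"{bus_bits:4d}-bit bus: {transfers:,} transfers per frame")
```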

Video Q&A

  • What is the main focus of the video?

    The video compares the memory architectures of Apple's M2 Ultra Mac Pro and GMKtec's Ryzen system built around the AMD 8060S APU.

  • What is unified memory architecture (UMA)?

    UMA is a memory architecture where the CPU, GPU, and other components share the same pool of memory, allowing for faster data access and reduced latency.

  • How does the M2 Ultra's memory bandwidth compare to the Ryzen 8060S?

    The M2 Ultra has a memory bandwidth of up to 800 GB/s, significantly higher than the Ryzen 8060S, which manages around 80 GB/s.

  • What are the advantages of the Ryzen system?

    The Ryzen system offers flexibility and upgradability, allowing users to swap out RAM and customize their setup.

  • Which system is better for professional content creation?

    The M2 Ultra is better for high-bandwidth professional content creation tasks due to its architectural advantages.

  • What is the trade-off between the two systems?

    The M2 Ultra excels in performance and efficiency, while the Ryzen system provides more user choice and modularity.

  • How does software optimization affect performance?

    Software optimized for Apple's architecture can leverage its memory management more effectively, leading to better performance.

  • What is the key takeaway regarding system choice?

    Choosing between the two systems depends on the user's specific needs, such as performance requirements and upgrade flexibility.

  • What is the significance of cache coherency?

    Cache coherency allows the CPU and GPU to access shared data without needing to copy it, reducing latency and improving performance (see the toy zero-copy sketch after this Q&A list).

  • What does Dave suggest for users who like to tinker with their systems?

    Dave suggests that users who enjoy tinkering should consider the Ryzen system for its modularity and upgrade options.
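
The cache-coherency answer above can be illustrated with a toy host-memory analogy. To be clear, this is not Metal, ROCm, or any real GPU interface: it only contrasts handing a second processor its own copy of a buffer (the shared-memory-with-copies model) with letting both sides use a zero-copy view of the same allocation (the unified-memory idea), using nothing but the Python standard library:

```python
# Toy contrast between "copy the buffer for the other processor" and
# "share one coherent buffer" -- host memory only, no real GPU involved.
import time

frame = bytearray(256 * 1024 * 1024)   # pretend this is 256 MB of pixel data

t0 = time.perf_counter()
gpu_copy = bytes(frame)                # copy model: every byte moves again
t1 = time.perf_counter()
shared_view = memoryview(frame)        # unified model: zero-copy view, no bytes move
t2 = time.perf_counter()

print(f"explicit copy:  {t1 - t0:.4f} s")
print(f"zero-copy view: {(t2 - t1) * 1e6:.1f} microseconds")
```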

Subtitles
  • 00:00:00
    Hey, I'm Dave. Welcome to my shop. I'm
  • 00:00:03
    Dave Plummer, a retired software engineer
  • 00:00:04
    from Microsoft, going back to the MS-DOS
  • 00:00:06
    and Windows 95 days. And today, we're
  • 00:00:09
    going to venture into the world of
  • 00:00:10
    modern memory architecture, but with a
  • 00:00:12
    twist. Because while everybody's busy
  • 00:00:14
    talking about raw CPU core counts and
  • 00:00:16
    GPU teraflops, there's something even
  • 00:00:18
    more foundational lurking under the hood
  • 00:00:20
    that makes or breaks your system's real
  • 00:00:22
    world performance, especially in
  • 00:00:24
    mixed-use creative or technical
  • 00:00:25
    workflows. And that something is memory.
  • 00:00:28
    how it's accessed, how it's shared, how
  • 00:00:30
    fast it is, and who gets to use how much
  • 00:00:32
    of it at a time. And to make that
  • 00:00:34
    exploration interesting, we're going to
  • 00:00:35
    do what I'm known for doing, pitting two
  • 00:00:37
    radically different platforms
  • 00:00:39
    head-to-head. And then I'll share some
  • 00:00:40
    comparison benchmarks towards the end
  • 00:00:42
    once we understand the platforms better.
  • 00:00:45
    On the one side, we've got the sleek,
  • 00:00:46
    streamlined M2 Ultra Mac Pro from Apple,
  • 00:00:49
    featuring 128 GB of what Apple calls
  • 00:00:52
    unified memory. On the other, we've got
  • 00:00:54
    GMKtec NucBox, a sleek 16-core Ryzen
  • 00:00:57
    desktop equipped with the AMD 8060S APU,
  • 00:01:01
    also with integrated graphics, but
  • 00:01:03
    running on a shared DDR5 system. They
  • 00:01:05
    both use integrated graphics. They can
  • 00:01:07
    both edit video. They both run
  • 00:01:09
    productivity apps, but beyond those
  • 00:01:11
    surface similarities, their memory
  • 00:01:12
    systems may as well come from two
  • 00:01:14
    different planets, and that's what we're
  • 00:01:15
    going to explore today. So, buckle up
  • 00:01:17
    because this is going to be a deep dive
  • 00:01:19
    into bandwidth, bus width, cache
  • 00:01:21
    coherency, and a bit of silicon
  • 00:01:22
    wizardry. Let's start with the Apple
  • 00:01:24
    side of the fence. Apple's M2 Ultra and
  • 00:01:27
    their Pro systems, and indeed, most of
  • 00:01:29
    Apple silicon is built around something
  • 00:01:30
    called a unified memory architecture, or
  • 00:01:32
    UMA. The M2 Ultra is a behemoth of a
  • 00:01:35
    chip. Up to 24 cores, 60 GPU cores, and
  • 00:01:38
    a neural engine to boot. But what sets
  • 00:01:41
    it apart isn't just what's on the chip.
  • 00:01:42
    It's how the memory is arranged around
  • 00:01:44
    it. Instead of slapping in some SO-DIMMs
  • 00:01:46
    on a motherboard and calling it a day,
  • 00:01:48
    Apple took the bold step of integrating
  • 00:01:50
    the memory directly onto the chip
  • 00:01:51
    package using a silicon interposer. That
  • 00:01:54
    means the LPDDR5 memory modules aren't
  • 00:01:56
    somewhere off in the weeds or on the
  • 00:01:58
    bus. They're right next to the SoC. And
  • 00:02:01
    not in a close enough kind of way, but
  • 00:02:03
    in a shared substrate with a thousand
  • 00:02:05
    pin connection kind of way. We're
  • 00:02:06
    talking about a 1024-bit memory bus
  • 00:02:09
    capable of delivering up to 800 GB per
  • 00:02:11
    second of bandwidth. That's not a typo.
  • 00:02:13
    800 gigabytes per second. If that number
  • 00:02:16
    doesn't make your jaw drop, let me put
  • 00:02:18
    it this way. That's 8 to 10 times the
  • 00:02:20
    bandwidth you'd typically find in a
  • 00:02:22
    modern Ryzen desktop running dual
  • 00:02:23
    channel
  • 00:02:24
    DDR5. So, what does all the extra
  • 00:02:26
    bandwidth and proximity actually buy
  • 00:02:28
    you? Well, in Apple's case, all the
  • 00:02:30
    components, the CPU, the GPU, the NPU,
  • 00:02:33
    and even the image signal processor
  • 00:02:35
    share access to the same pool of memory
  • 00:02:37
    in a cache coherent fashion. That means
  • 00:02:39
    if the GPU writes to a memory address,
  • 00:02:41
    the CPU can read that exact data without
  • 00:02:43
    needing to copy it out to another buffer
  • 00:02:45
    or go through explicit synchronization.
  • 00:02:47
    And this is a big deal because on a
  • 00:02:49
    traditional PC, things aren't nearly as
  • 00:02:51
    cooperative. Enter the AMD 8060S in the
  • 00:02:54
    GMKtec NucBox. The 8060S lives inside
  • 00:02:58
    a Ryzen APU, and like many of its x86
  • 00:03:01
    siblings, it runs in what's called a
  • 00:03:02
    shared memory model. That's a much older
  • 00:03:05
    approach where the CPU and the GPU
  • 00:03:07
    technically share the same pool of RAM,
  • 00:03:09
    but functionally they don't share it
  • 00:03:10
    well. Instead, a portion of your
  • 00:03:12
    system's main memory, say 2 GB or 4 GB
  • 00:03:15
    or 16 or 32, is carved out and reserved
  • 00:03:17
    as VRAM for the GPU. This reservation is
  • 00:03:20
    handled by the firmware or the BIOS, and
  • 00:03:22
    the operating system treats it as off
  • 00:03:24
    limits for everything else. So, yes, the
  • 00:03:26
    memory is shared, but that's more of a
  • 00:03:27
    logistical arrangement than a true
  • 00:03:29
    architectural unification. data still
  • 00:03:31
    has to move and buffers are still copied
  • 00:03:33
    and everything still travels through a
  • 00:03:35
    memory controller located on the APU die
  • 00:03:37
    which then talks to your RAM modules
  • 00:03:39
    through a relatively narrow pipe: a
  • 00:03:41
    128-bit or 256-bit dual-channel DDR5 memory
  • 00:03:45
    interface and you're not getting 256
  • 00:03:47
    unless you're quad channel and I don't
  • 00:03:48
    think they do that on the Ryzen desktop
  • 00:03:50
    yet. But compared to the 1024-bit beast
  • 00:03:53
    on the M2 Ultra, that's a bit like
  • 00:03:54
    trying to hydrate a stadium using a
  • 00:03:56
    garden hose. But now let's talk speed
  • 00:03:58
    cuz that's what we care about. Apple's
  • 00:04:00
    LPDDR5 memory is not only wide, it's
  • 00:04:02
    also fast. Running at around 6,400
  • 00:04:05
    megatransfers per second, each module can move
  • 00:04:07
    data very quickly. And when multiplied
  • 00:04:09
    across 1024 bits of access width, you
  • 00:04:11
    can start to see where that 800 GB a
  • 00:04:14
    second number comes from. And all of
  • 00:04:15
    this happens right on package, meaning
  • 00:04:17
    there are no long PCB traces, no
  • 00:04:20
    motherboard routes, no DIMM slots, no
  • 00:04:22
    latency inducing connectors. Data moves
  • 00:04:24
    quickly and efficiently. On the Ryzen
  • 00:04:26
    side, DDR5 might also clock in at around
  • 00:04:29
    5200 or 5600 megatransfers per second.
  • 00:04:32
    But because the memory bus is narrower,
  • 00:04:34
    the total bandwidth is limited to
  • 00:04:35
    somewhere in the 80 GB range, depending
  • 00:04:37
    on configuration. Not bad, but once
  • 00:04:40
    again, about 1/8 of what the M2 Ultra
  • 00:04:42
    can do. And that's assuming that the CPU
  • 00:04:44
    and GPU aren't stepping on each other's
  • 00:04:46
    toes. In reality, contention can further
  • 00:04:48
    reduce effective bandwidth during mixed
  • 00:04:50
    workloads. So, when you're editing 8K
  • 00:04:52
    video or training a neural network and
  • 00:04:54
    both the CPU and the GPU want to chew on
  • 00:04:56
    the same data set, Apple's architecture
  • 00:04:59
    can serve as both without blinking. With
  • 00:05:01
    the Ryzen, it's got to mediate who goes
  • 00:05:03
    next. Now, let's talk bit width because
  • 00:05:05
    this is one of those classic size
  • 00:05:07
    matters situations. On the M2 Ultra,
  • 00:05:10
    that memory interface we identified was
  • 00:05:11
    1024 bits wide. That means the CPU, the
  • 00:05:15
    GPU, or the neural engine can request
  • 00:05:17
    huge chunks of data in a single
  • 00:05:18
    transaction. That's great for tasks like
  • 00:05:21
    high resolution video rendering where
  • 00:05:22
    you're moving gigabytes of raw pixel
  • 00:05:24
    data around per second. The Ryzen APU,
  • 00:05:26
    by contrast, is working with a 128 bit
  • 00:05:29
    bus. And that smaller highway means more
  • 00:05:31
    memory transactions are required to move
  • 00:05:34
    the same amount of data. Not only does
  • 00:05:36
    that slow things down, it also consumes
  • 00:05:38
    more power and can increase memory
  • 00:05:39
    contention when multiple agents are
  • 00:05:41
    requesting access. And speaking of
  • 00:05:43
    contention, let's move on to latency and
  • 00:05:45
    cache coherency. Apple's on package
  • 00:05:48
    memory and system level cache design
  • 00:05:49
    mean that the CPU and GPU can both
  • 00:05:52
    access the same data without having to
  • 00:05:54
    make redundant copies. This is
  • 00:05:56
    especially powerful for things like
  • 00:05:57
    metal accelerated machine learning where
  • 00:05:59
    data sets can live in shared memory and
  • 00:06:01
    be updated in place by whichever engine
  • 00:06:03
    is working on them. In contrast, the
  • 00:06:06
    Ryzen's architecture requires more
  • 00:06:07
    fencing and mapping. The GPU might have
  • 00:06:10
    its own view of a buffer and when the
  • 00:06:11
    CPU wants to read or write to it, a copy
  • 00:06:14
    operation or at least a synchronization
  • 00:06:16
    operation is often required. That adds
  • 00:06:18
    latency and burns power. And while Ryzen
  • 00:06:21
    does have a shared L3 cache across its
  • 00:06:24
    CPU cores, usually in the 16 to 32
  • 00:06:26
    megabyte range, it doesn't extend that
  • 00:06:28
    cache to the GPU in a unified fashion.
  • 00:06:31
    Apple on the other hand includes a
  • 00:06:32
    massive 64 megabyte system level cache
  • 00:06:35
    that is accessible and usable by all the
  • 00:06:37
    cores in the SOC. That means hot data
  • 00:06:39
    can be kept very close to all the
  • 00:06:41
    engines, reducing latency and boosting
  • 00:06:43
    throughput, which also brings us now to
  • 00:06:45
    power efficiency. Now, I'm not saying
  • 00:06:47
    the Apple silicon is magic, but if you
  • 00:06:49
    squint hard enough, it's starting to
  • 00:06:50
    feel that way. The M2 Ultra's tight
  • 00:06:52
    coupling of compute and memory, coupled
  • 00:06:54
    with the inherently lower power draw of
  • 00:06:56
    LPDDR5 versus desktop DDR5, means it can
  • 00:06:59
    deliver incredible performance per watt.
  • 00:07:02
    There are very few data copies, less
  • 00:07:04
    movement across physical interconnects,
  • 00:07:06
    and much lower idle and leakage power.
  • 00:07:08
    Ryzen, meanwhile, has to move data from
  • 00:07:10
    the APU die to external DIMMs and across
  • 00:07:12
    the traces of your motherboard. That not
  • 00:07:14
    only takes more energy, but it also
  • 00:07:16
    means you've got the signal integrity
  • 00:07:17
    issues, timing coordination, and more
  • 00:07:19
    memory power spent on the controller
  • 00:07:21
    overhead. And sure, desktop DDR5
  • 00:07:24
    supports things like power down modes,
  • 00:07:25
    but nothing beats the efficiency of an
  • 00:07:27
    SOC where everything lives under one
  • 00:07:29
    digital roof. So, here's a philosophical
  • 00:07:31
    question. What's more important, raw
  • 00:07:33
    performance or flexibility? Because
  • 00:07:35
    while the M2 Ultra absolutely crushes
  • 00:07:37
    the Ryzen 8060S in terms of architectural
  • 00:07:40
    elegance and performance per watt,
  • 00:07:42
    there's one area where the Ryzen system
  • 00:07:44
    still has a clear advantage.
  • 00:07:46
    Upgradability. On the Apple side, what
  • 00:07:48
    you buy is what you live with. If you
  • 00:07:50
    get the 128 GB of memory, great, but
  • 00:07:52
    it's expensive and it's soldered into
  • 00:07:54
    the SOC package and there's no going
  • 00:07:55
    back. That's fine for video editors or
  • 00:07:58
    machine learning engineers who know
  • 00:07:59
    their memory footprint. But for the rest
  • 00:08:01
    of us, especially those who like to
  • 00:08:02
    tinker or to buy small and scale up down
  • 00:08:05
    the road, it can be a hard stop. The
  • 00:08:07
    Ryzen system, on the other hand, uses
  • 00:08:09
    industry-standard DDR5 DIMMs in socketed
  • 00:08:11
    slots. If you want to swap in more RAM,
  • 00:08:13
    upgrade it to 128 GB, use faster memory,
  • 00:08:16
    or even run mixed mode configurations,
  • 00:08:18
    you can do that. And while that doesn't
  • 00:08:20
    help your integrated GPU performance, it
  • 00:08:22
    does offer system level flexibility that
  • 00:08:24
    Apple simply doesn't. If you're doing
  • 00:08:26
    pro-level video editing, machine
  • 00:08:27
    learning, or any kind of high resolution
  • 00:08:29
    media work, the M2 Ultra's unified
  • 00:08:31
    memory setup is hands down the better
  • 00:08:33
    tool. You get massive bandwidth, zero
  • 00:08:35
    copy data sharing between compute units,
  • 00:08:37
    and incredibly low latency. But if
  • 00:08:39
    you're gaming, browsing, running
  • 00:08:41
    spreadsheets with the occasional bit of
  • 00:08:42
    Photoshop thrown in, then the Ryzen APU
  • 00:08:45
    with shared memory is a fine choice. You
  • 00:08:47
    might not get the absolute best
  • 00:08:48
    performance per watt or the flashiest
  • 00:08:50
    benchmarks, but you get a solid
  • 00:08:52
    capability at a fraction of the price,
  • 00:08:54
    and you can upgrade or tinker to your
  • 00:08:55
    heart's content. What we're looking at
  • 00:08:57
    here isn't just two different chips, but
  • 00:08:58
    two different design philosophies. Apple
  • 00:09:01
    is betting everything on tight
  • 00:09:02
    integration, shared resources, and
  • 00:09:04
    vertical control of their hardware. AMD
  • 00:09:07
    and the broader x86 world is still built
  • 00:09:09
    around flexibility, modularity, and user
  • 00:09:11
    choice, even if it comes at a cost,
  • 00:09:13
    performance, and efficiency. And you
  • 00:09:15
    know what? There's room for both in this
  • 00:09:17
    world. There's one more subtle but
  • 00:09:19
    incredibly important angle that we need
  • 00:09:20
    to cover, and it's all about real world
  • 00:09:22
    optimization. Because it's easy to get
  • 00:09:24
    swept up in the specs, gigabytes per
  • 00:09:26
    second or cache sizes and bus widths.
  • 00:09:29
    But the rubber meets the road when you
  • 00:09:31
    ask a very simple question. How well
  • 00:09:33
    does your software stack actually use
  • 00:09:34
    your hardware? And here's where Apple's
  • 00:09:36
    approach really shines, especially if
  • 00:09:38
    you're inside their walled garden. Let
  • 00:09:40
    me explain. When you run Final Cut on an
  • 00:09:43
    M2 Ultra, you're running a program
  • 00:09:44
    tailor made to leverage everything
  • 00:09:46
    Apple's architecture has to offer. It
  • 00:09:48
    can stream data straight from SSD to RAM
  • 00:09:50
    to GPU to display, all without
  • 00:09:52
    translation layers, driver issues, or
  • 00:09:54
    copying buffers back and forth. Metal,
  • 00:09:57
    Apple's graphics and compute API, which
  • 00:09:59
    is kind of like CUDA, was built from the
  • 00:10:01
    ground up to play nice with unified
  • 00:10:03
    memory. And that means real gains
  • 00:10:04
    because projects that used to need
  • 00:10:06
    intermediate render passes on disk can
  • 00:10:08
    be computed on the fly. Machine learning
  • 00:10:10
    effects can tap into the neural engine
  • 00:10:12
    without exporting model data across
  • 00:10:14
    buses or dealing with interop headaches.
  • 00:10:16
    The OS, the apps, and the hardware all
  • 00:10:18
    speak the same language, and it's a
  • 00:10:20
    private dialect. Contrast that with the
  • 00:10:22
    Ryzen system. Yes, you can run Resolve
  • 00:10:24
    or Blender or PyTorch, but now you're
  • 00:10:26
    relying on drivers from AMD, OpenCL, or
  • 00:10:29
    Vulkan interop layers, and a dance of
  • 00:10:31
    memory synchronization going on between
  • 00:10:33
    the CPU and GPU buffers. You can still
  • 00:10:35
    get good results, but you're doing it with
  • 00:10:36
    a lot more legwork behind the scenes.
  • 00:10:38
    Now, this doesn't mean that the PC is
  • 00:10:40
    inferior. It just means that open
  • 00:10:42
    platforms carry with them the burden of
  • 00:10:44
    interoperability. Every layer adds
  • 00:10:46
    flexibility for sure, but also friction.
  • 00:10:49
    And nowhere does that show up more
  • 00:10:50
    clearly than how memory is used and
  • 00:10:51
    managed across components. There's one
  • 00:10:54
    last philosophical contrast I want to
  • 00:10:56
    touch on. When you look at the Apple M2
  • 00:10:58
    Ultra, it's clear that Apple is chasing
  • 00:10:59
    a specific vision, a monolithic, highly
  • 00:11:02
    integrated compute engine where
  • 00:11:04
    specialization is internal, not
  • 00:11:05
    external. Everything's on the SOC.
  • 00:11:08
    Memory is shared. Caches are unified.
  • 00:11:10
    The user doesn't manage resources. The
  • 00:11:12
    system does. It's elegant, but it's also
  • 00:11:14
    very rigid. You're buying into a fixed
  • 00:11:16
    future. The Ryzen desktop, by contrast?
  • 00:11:19
    It's almost modular by design. You get
  • 00:11:21
    to pick your CPU, your RAM, your GPU if
  • 00:11:23
    you want one, and you can add more later
  • 00:11:25
    and change cooling systems, tune
  • 00:11:26
    voltages. It's messy, sure, but it's
  • 00:11:28
    yours. And that openness is why the PC
  • 00:11:31
    ecosystem has survived for decades and
  • 00:11:33
    adapted to workloads that Apple could
  • 00:11:34
    never have imagined. So, which is
  • 00:11:37
    better? Well, if you're building a
  • 00:11:38
    system for a tightly scoped, high
  • 00:11:40
    bandwidth professional content creation
  • 00:11:42
    task, especially video, photography, or
  • 00:11:44
    machine learning pipelines, then Apple's
  • 00:11:46
    unified memory architecture can offer a
  • 00:11:48
    level of performance and simplicity
  • 00:11:50
    that's hard to match. So, if like me,
  • 00:11:52
    your main reason for having a Mac is to
  • 00:11:54
    run Final Cut, it's almost perfect for
  • 00:11:56
    that task. But if your needs are more
  • 00:11:58
    general, or if you value upgrade paths,
  • 00:12:00
    flexibility, or just being able to
  • 00:12:02
    tinker and learn, then a Ryzen-based
  • 00:12:04
    system with shared memory is not only
  • 00:12:05
    good enough, it might actually be better
  • 00:12:06
    for you as a long-term investment. If
  • 00:12:09
    you're doing AI workloads, the ability
  • 00:12:10
    to run the larger models is appreciated
  • 00:12:12
    on both systems, but the Macs run them
  • 00:12:14
    significantly faster than the Ryzen
  • 00:12:16
    APU. But if you're running a more CPU
  • 00:12:18
    friendly task like solving prime
  • 00:12:20
    numbers, the 16 high-speed cores of the
  • 00:12:23
    Ryzen actually put the Mac to shame,
  • 00:12:25
    turning in nearly double the
  • 00:12:26
    performance. In fact, the NucBox is
  • 00:12:28
    the fastest single core chip I've ever
  • 00:12:30
    tested, faster than both the M2 Mac Pro
  • 00:12:32
    Ultra and the Ryzen Threadripper 7995WX
  • 00:12:35
    on single core workloads. And the CPU is
  • 00:12:38
    fast enough that it even beats my older
  • 00:12:39
    32-core Threadripper 3970X on
  • 00:12:42
    multi-core tests despite having only
  • 00:12:44
    half the core count. So, the key is
  • 00:12:46
    knowing what kind of work you do and
  • 00:12:47
    then choosing the tool that best matches
  • 00:12:49
    that profile. If you found today's look
  • 00:12:52
    at memory architecture to be any
  • 00:12:53
    combination of informative or
  • 00:12:55
    entertaining, remember that I'm mostly
  • 00:12:56
    in this for the subs and likes. So, I'd
  • 00:12:58
    be honored if you consider leaving me
  • 00:12:59
    one of each before you go today. And if
  • 00:13:01
    you're already subscribed to the
  • 00:13:02
    channel, thank you. In the meantime, and
  • 00:13:04
    in between time, I hope to see you next
  • 00:13:06
    time right here in Dave's Garage.
Tags
  • memory architecture
  • M2 Ultra
  • Ryzen 8060S
  • unified memory
  • bandwidth
  • cache coherency
  • performance
  • upgradability
  • software optimization
  • content creation