[Tom's Hardware] AMD 3D V-Cache enables RAM disk to hit 182 GB/s speeds - over 12X faster than the fastest PCIe 5.0 SSDs

RTcore@alien.top · 2 years ago

[Tom's Hardware] AMD 3D V-Cache enables RAM disk to hit 182 GB/s speeds - over 12X faster than the fastest PCIe 5.0 SSDs

Ruffelz@alien.top · 2 years ago

no way, cache and memory is faster than storage???

DktheDarkKnight@alien.top · 2 years ago

Imagine 8GB 3D V-cache. That would be glorious.

BurtMackl@alien.top · 2 years ago

Latency enters the chat

FlygonBreloom@alien.top · 2 years ago

L4 cache is still much faster than RAM.

3G6A5W338E@alien.top · 2 years ago

It’s not that simple.

Having L4 at all increases latency to actual RAM.

AgeOk2348@alien.top · 2 years ago

while true, at least for gaming intel already proved it increaes performance

CentralComputer@alien.top · 2 years ago

What use cases would this be good for?

KnownDairyAcolyte@alien.top · 2 years ago

What use cases would this be good for?

Yes

BookPlacementProblem@alien.top · 2 years ago

Your entire game could be in the next floor up from the CPU cores, instead of in a metaphorical different city.

RedTuesdayMusic@alien.top · 2 years ago

Looking at Star Citizen install size now I just need a 14-socket motherboard

gnivriboy@alien.top · 2 years ago

Reminds me of RAM drives, but people mostly moved on from that since SSDs have gotten so incredibly fast and cheap in the past couple of years.

14u2c@alien.top · 2 years ago

I seriously dubt that main memory latency and bandwidth is the performance bottleneck for many games. Even for loading it wouldn’t be particularly useful compared to storing the game in RAM because now you’d be limited by PCIe bandwidth. Maybe with horribly optimized games that do a lot of random random reads during load it would help, but that’s pushing it. Now the GPU side on the other hand could be interesting.

friendlyshrek@alien.top · 2 years ago

What. No.

Quatro_Leches@alien.top · 2 years ago

Game Dev in 2047: “why don’t I just decompress all these textures I won’t use for a while here”

emfloured@alien.top · 2 years ago

PCIe rebar currently uses system RAM to VRAM communication. 3dvcache to VRAM via DMA could be made possible without even accessing RAM, this would completely eliminate 50+ ns of RAM access latency (of course the necessary data needs to be already available in 3dvcache from system RAM before any of this fancy stuff happens).

7silverlights@alien.top · 2 years ago

Different amounts of stacked cache will be the next SKU differentiator and price gouger? Seems perfect for it.

ManicChad@alien.top · 2 years ago

It would be awesome if Simone could invent SRAM speed disks that weren’t volatile. It would be a huge step forward for PCs for many things and we would stare at CPU/GPU makers as the bottle necks.

Z3r0sama2017@alien.top · 2 years ago

I remember when I tried ramdisking my modded Skyrim. It was the only way to remove cell transistion stutter, even though I had a 5950x, 64gb ram, 980pro and a 3090.

200gb+ v cache when AMD?

jigsaw1024@alien.top · 2 years ago

I waiting for us to get GB amounts of cache on consumer chips like is starting to show up in enterprise/server chips.

That will be useful.

jaadumantar@alien.top · 2 years ago

that breaks the purpose of a cache, you want a cache to reduce service times, larger caches take longer to process, it’s diminishing results after a point, they also take A LOT MORE area on the die

XenonJFt@alien.top · 2 years ago

That needs a very big die.very big for cutrent mobo platforms. But I can see it being tried.making big dies and putting cache to outer borders of CPU walls. And no vchace is vertical cache it son top of die cover and chip itself. Not around it

F9-0021@alien.top · 2 years ago

Cache surrounding the die is probably better than stacking it on top of the die anyway. Would solve a good few of the current limitations of the X3D CPUs.

Malygos_Spellweaver@alien.top · 2 years ago

I would love to see the results of a 3D chip with a powerful iGPU. Not sure if it would work, but if it is possible, why is AMD not doing it? Would it cannibalise 100-200 eur GPU (they are already nonexistent anyway).

AnimeAlt44@alien.top · 2 years ago

There is very little demand for a powerful iGPU desktop chip, so the ones that exist are derivatives of laptop chips and thus monolithic. So far there has not been a stacked cache monolithic die chip.

Tired8281@alien.top · 2 years ago

It’s easy to say there no demand for something that doesn’t exist. Sales are zero.

kaisersolo@alien.top · 2 years ago

Igpu is not apu

AnimeAlt44@alien.top · 2 years ago

The laptop based desktop chips exist they are literally a thing and have been for a while. Both AMD and Intel have not seen high demand for those. Also even if that wasn’t the case, your argument is not really an argument at all since it can just be used to justify literally anything that hasn’t been tried.

Tired8281@alien.top · 2 years ago

lol they aren’t powerful, in this universe or any other

Falconman21@alien.top · 2 years ago

I do think this will change quickly if Qualcomm’s ARM chips are as fast as the M2 Max like they claim. And there’s reason to believe it, as they’ve bought/hired Apple’s head of processor development.

Considering the M2 Max GPU is roughly equivalent to a 3080 mobile or a desktop 3060ti at significantly better efficiency, I think the demand for monolithic could explode practically overnight.

Assuming some x86 to ARM translation gets most things running.

TwelveSilverSwords@alien.top · 2 years ago

Maybe Qualcomm would do so in the future, but as things stand now, it’s not the case.

The iGPU in the Snapdragon X Elite is on the same ballpark as the regular M2. Not the Pro or Max variant.

In 3DMARK wildlife extreme, the X Elite GPU is 50% faster than Radeon 780M.

https://youtu.be/03eY7BSMc_c?si=HbhQPDt-AN_PP_TS

Still, that’s nowhere near 3080 tier.

Qualcomm still needs to work on their Windows GPU drivers. Currently the only API the X Elite supports is DirectX12.

Some speculate that Qualcomm will eventually create Windows Vulkan driver for Adreno. And then use DXVK to support older DirectX versions, and use Zink to support OpenGL.

MMyRRedditAAccount@alien.top · 2 years ago

They already have a vulkan driver. The 3dmark runs were on vulkan

TwelveSilverSwords@alien.top · 2 years ago

Are you talking about the Snapdragon X Elite? Sure, their mobile chips do have Vulkan drivers.

If you go to the Snapdragon X Elite Product Brief, you can see the only supported API is DX12.

INITMalcanis@alien.top · 2 years ago

There is very little demand for a powerful iGPU desktop chip

There was little demand. Things change.

kif88@alien.top · 2 years ago

AMD is planning on the mi300 technically but that’s for enterprise and will cost tens of thousands.

Irisena@alien.top · 2 years ago

We’re still very far from that. Even mobile phones don’t stack GPU, they only stack RAM and NAND. RAM and cache are far simpler to stack since they are simple things in nature. While GPU is unbelievably complicated compared to those 2. Maybe Intel’s tile / AMD’s chiplet system is closer to what we want, but it’s still not as good as stacking.

THE_MUNDO_TRAIN@alien.top · 2 years ago

Time to play some old Playstation 1 RPGs with horrendous loading times all entirely stored on the L3 cache.

chronocapybara@alien.top · 2 years ago

Imagine Factorio stored entirely in CPU

Eitan189@alien.top · 2 years ago

The Xeon Max CPUs contain 64GB of HBM2e, which can be configured to act as a cache. You could run a lot of games entirely on the HBM!

ICC-u@alien.top · 2 years ago

Xeon isn’t AMD 3D VCache

Yaris_Fan@alien.top · 2 years ago

The 7995WX already has 384 MB L3 cache.

I wouldn’t be surprised if the next gen Thread ripper has 1GB L3.

Lycaa@alien.top · 2 years ago

Until you notice that the insane loading and save times are built into the engine and no SSD can ever change that.

I’m looking at you, Digimon World 2003.

Feniksrises@alien.top · 2 years ago

When I’m playing old games I sometimes wonder how we ever had the patience for it. Couldn’t play them today if it wasn’t for save state’s.

THE_MUNDO_TRAIN@alien.top · 2 years ago

That is a crime worthy of a chair with a power current flowing.

Marha01@alien.top · 2 years ago

https://en.wikipedia.org/wiki/Tiny_Core_Linux

There are <100MB Linux distributions. Is it theoretically possible to run an entire operating system without RAM, purely in CPU cache?

froop@alien.top · 2 years ago

Xeon Max will boot without any RAM installed at all. Though I’m not sure it counts, considering it has 64gb built into the cpu.

ShunyaAtma@alien.top · 2 years ago

That’s exactly what is done during bring up of new SoCs. Memory controllers are either non-functional in early prototypes or a miniature design is put into a bunch of FPGAs with only a single core and caches. The cache lines and TLB entries are primed and pinned with all relevant code and data pages before booting up a kernel.

loser7500000@alien.top · 2 years ago

I think this is also what happens at boot on most systems before RAM is initialized, so maybe boot times could be faster if they took advantage of caches getting larger?

ShunyaAtma@alien.top · 2 years ago

Not sure if you meant to point out something else but initramfs or ramdisks are loaded on to RAM itself which is already up and running at that point. RAM initialization is usually initiated by early boot firmware and information about the physical address map is eventually passed on to the OS kernel which later sets up paging (virtual memory).

VegetableNatural@alien.top · 2 years ago

On coreboot this boot method is called CAR, Cache as RAM, pretty interesting usage of cache to be honest, no need to add separate SRAM if you already have some

Yaris_Fan@alien.top · 2 years ago

The 7995WX has 384 MB L3 cache.

Imagine what you could do with that!

AgeOk2348@alien.top · 2 years ago

not only possible people have done it. dunno if theyve done it on amd but some soc have.

Srslyairbag@alien.top · 2 years ago

182GB/s, for up to 32MB of data. It’s an interesting study in misusing the tech, but it’s ultimately a bit meaningless.

What we really need is for someone to modify the ramdisk driver to appear as usb storage and make it so it runs under Vista, so we can use it for ReadyBoost.

evilgeniustodd@alien.top · 2 years ago

buy a copy of primocache. it’s a great piece of software that adds multi-level cache to windows

ShaidarHaran2@alien.top · 2 years ago

What we really need is for someone to modify the ramdisk driver to appear as usb storage and make it so it runs under Vista, so we can use it for ReadyBoost.

Use the RAM used as a ramdisk mimicking a disk drive as USB storage for Readyboost which uses a USB drive as…quasi-RAM?

This sounds like a circular way to do what RAM caching is already supposed to do haha, all modern operating systems do this already, used to call it Superfetch but now it’s just commonplace and assumed, as well as not dumping things you close out of RAM immediately in case some parts of it get reused

Ketorunner69@alien.top · 2 years ago

Why would it only be 32MB? This is the V-cache, not the L3.

Srslyairbag@alien.top · 2 years ago

32MB is what they tested on the article.

To clarify a little on what’s happening here, they’re not using the v-cache as a memory space and making the volume there as you might create a partition on a conventional disk drive, but rather, they’re accessing the ramdisk in such a way as to trick the system into keeping that it in cache. It’s almost completely impractical in real terms, but it’s a fun way to exploit the cache algorithm to get some silly numbers out of it.

[Tom's Hardware] AMD 3D V-Cache enables RAM disk to hit 182 GB/s speeds - over 12X faster than the fastest PCIe 5.0 SSDs

[Tom's Hardware] AMD 3D V-Cache enables RAM disk to hit 182 GB/s speeds - over 12X faster than the fastest PCIe 5.0 SSDs

AMD 3D V-Cache enables RAM disk to hit 182 GB/s speeds — over 12X faster than the fastest PCIe 5.0 SSDs