• blueredscreen@alien.topB
    10 months ago

    > He’s only right in the short term when the technology isn’t stable and the AI software architectures are constantly changing.

    > Once things stabilize, we’re most likely switching to either analog compute-in-memory or silicon photonics, both of which will be far less generic than a GPU, but with such a massive power, performance, and cost advantage that GPUs simply cannot compete.

    That’s what they said. Nothing about AI is going to stabilize; the pace of innovation makes that impossible. I’m sure things were happy at SambaNova too, until they went bye-bye and Nvidia itself hired their lead architect.

    • theQuandary@alien.topB
      10 months ago

      I heard this same stuff in the 90s about GPUs. “GPUs are too specialized and don’t have the flexibility of CPUs”.

      Startups failing doesn’t prove anything. There are dozens of startups and there will only be 2-4 winners. Of course MOST are going to fail. Moving in too early before things have settled down or too late after your competitors are too established are both guaranteed ways to fail.

      In any case, algorithms and languages have a symbiotic relationship with hardware.

      C is considered fast, but did you know that it SUCKS for old CISC ISAs? They are too irregular and make a lot of assumptions that don’t mesh well with the compute model of C. C plus x86 is where things changed. x86 could be adapted to run C code well. C compilers then adapted to be fast on x86, then x86 adapted to run that compiled C code better, and the loop goes round and round.
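      To make that loop concrete, here is a minimal sketch (nothing specific to any product discussed here): C assumes a flat, byte-addressable address space where array indexing is just pointer arithmetic, and x86 grew addressing modes that match that model almost one-to-one.

      ```c
      #include <stdio.h>

      /* C defines arr[i] as *(arr + i): a base pointer plus a scaled index.
       * That pattern maps straight onto an x86-64 addressing mode, e.g.
       * something like  mov eax, [rdi + rsi*4]  (exact output varies by
       * compiler and flags); one small instance of the C/x86 feedback loop. */
      static int sum(const int *arr, int n) {
          int total = 0;
          for (int i = 0; i < n; i++) {
              total += arr[i];
          }
          return total;
      }

      int main(void) {
          int data[] = {1, 2, 3, 4};
          printf("%d\n", sum(data, 4)); /* prints 10 */
          return 0;
      }
      ```

      On a segmented or word-addressed machine of that earlier era, the compiler has far more work to do for that same line, which is the point about the loop only really getting started once C and x86 met.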

      This is true for GPUs too. Apple’s M1/M2 GPU design isn’t fundamentally bad, but it is different from AMD/Nvidia, so the programmer’s hardware assumptions and normal optimizations aren’t effective. The same applies to some extent to Intel Xe, where they’ve been spending huge amounts to “optimize” various games (most likely literally replacing the original game code with versions optimized for their ISA).

      The same will happen to AI.

      Imagine that one of those startups gets compute-in-SSD working. Now you can do compute on models that would require terabytes of RAM on a GPU. You could get massive amounts of TOPS on massive working sets using just a few watts of power on a device costing a few hundred dollars. That is in stark contrast to a GPU that costs tens of thousands of dollars and a fortune in power to run, yet can’t even work on a model that big because the memory hierarchy is too slow.
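      Rough numbers show why (a minimal sketch with illustrative assumptions only: the 2 TB model size and both bandwidth figures below are made up for the example, not measurements of any real product):

      ```c
      #include <stdio.h>

      int main(void) {
          /* Illustrative assumptions, not measurements of any real system. */
          const double model_bytes = 2e12;   /* hypothetical 2 TB of weights          */
          const double pcie_bps    = 25e9;   /* ballpark PCIe 4.0 x16 throughput, B/s */
          const double in_ssd_bps  = 400e9;  /* assumed aggregate bandwidth across the
                                                NAND dies inside one drive, B/s       */

          /* If every weight must cross the bus to the GPU once per pass,
           * the bus alone costs about 80 seconds per pass over the model. */
          printf("streaming over PCIe: %.0f s per full pass\n", model_bytes / pcie_bps);

          /* Compute placed next to the flash never moves the bulk of the data,
           * so the same pass is bounded by the drive's internal bandwidth. */
          printf("compute-in-storage:  %.0f s per full pass\n", model_bytes / in_ssd_bps);
          return 0;
      }
      ```

      The exact figures don’t matter; the gap between “move terabytes across a bus every pass” and “compute where the data already sits” is the part that doesn’t go away.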

      Such a technology would warp the algorithms around it. You’ll simply be told to “make it work”, and creative people will find a way to harness that compute power, especially as it is already naturally tuned to AI needs. Once that loop gets started in earnest, the cost of switching algorithms and running them on a GPU will be far too high. Over time it will be not just cost, but also ecosystem lock-in.

      I’m not saying that compute-in-memory will be the winner, but I’m quite certain that the GPU is not, because literally ALL of the prominent algorithms run faster and at lower power on their own specific ASIC accelerators.

      Even if we accept the worst-case scenario and 2-4 approaches rise to the top and each requires a separate ASIC, the situation STILL favors the ASIC approach. We can support dozens of ISAs for dozens of purposes. We can certainly support 2-4 ISAs with 1-3 competitors for each.

      • blueredscreen@alien.topB
        10 months ago

        > I heard this same stuff in the 90s about GPUs. “GPUs are too specialized and don’t have the flexibility of CPUs”.

        > Startups failing doesn’t prove anything. There are dozens of startups and there will only be 2-4 winners. Of course MOST are going to fail. Moving in too early before things have settled down or too late after your competitors are too established are both guaranteed ways to fail.

        It’s relatively convenient to blame your failure on being too smart too early instead of facing the genuine lack of demand for your product.

        > C is considered fast, but did you know that it SUCKS for old CISC ISAs? They are too irregular and make a lot of assumptions that don’t mesh well with the compute model of C. C plus x86 is where things changed. x86 could be adapted to run C code well. C compilers then adapted to be fast on x86, then x86 adapted to run that compiled C code better, and the loop goes round and round.

        Nothing about modern x86 architectures constitutes any classic model of “CISC” under the hood; the silicon decodes the instructions into internal micro-ops that, for all intents and purposes, could be generated from just about any ISA.

        > This is true for GPUs too. Apple’s M1/M2 GPU design isn’t fundamentally bad, but it is different from AMD/Nvidia, so the programmer’s hardware assumptions and normal optimizations aren’t effective. The same applies to some extent to Intel Xe, where they’ve been spending huge amounts to “optimize” various games (most likely literally replacing the original game code with versions optimized for their ISA).

        What?

        > Even if we accept the worst-case scenario and 2-4 approaches rise to the top and each requires a separate ASIC, the situation STILL favors the ASIC approach. We can support dozens of ISAs for dozens of purposes. We can certainly support 2-4 ISAs with 1-3 competitors for each.

        Again, they all said that before you, and look where they are now. (hint hint: Nvidia)