I am trying to understand how current GPUs work.

I saw that the Apple M2 Max, for example, has 30 GPU cores. I was surprised because I had heard that GPUs have hundreds of cores. So I did a bit of research and found this:

On Apple silicon, each core is made up of 16 execution units, each of which has 8 distinct compute units (ALUs).

That makes more sense now. 30 cores is actually 30*16*8=3840 ALUs.
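
To double-check that multiplication, here is a quick sketch in Python. The 16 and 8 figures are only what I found while researching, not something I have confirmed from Apple documentation, and the constant and function names are just made up for illustration:

```python
# Figures quoted in the post (assumptions from my research, not verified specs).
GPU_CORES = 30            # GPU core count of the M2 Max mentioned above
EXEC_UNITS_PER_CORE = 16  # execution units per core
ALUS_PER_EXEC_UNIT = 8    # ALUs per execution unit


def total_alus(cores, exec_units_per_core, alus_per_exec_unit):
    """Multiply out the three levels of the hierarchy."""
    return cores * exec_units_per_core * alus_per_exec_unit


alus_per_core = EXEC_UNITS_PER_CORE * ALUS_PER_EXEC_UNIT
total = total_alus(GPU_CORES, EXEC_UNITS_PER_CORE, ALUS_PER_EXEC_UNIT)

print(alus_per_core)  # 128
print(total)          # 3840
```

So each "core" would hold 128 ALUs, if those figures are right.
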
But then why group them into “cores” like this? I heard that GPUs, unlike CPUs, can have all their units working on the SAME task.
Why split into 16 and then into 8 rather than one full grid? I don’t get it.
Does it have practical implications, or is it purely marketing?