But if the PS4 is rated at 176 GB/s (10^9 bytes), or 5.5 GHz x 32 bytes, then as long as the PS4 can reach 5.5 GHz and can move 32 bytes in any one cycle, the theoretical max bandwidth can actually be achieved.
Measured over a single cycle (or a few), yes; measured over a second, no.
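To put numbers on that distinction, here is a rough Python sketch. The 5.5 GHz rate and 32 bytes per transfer come from the quote; the function names and the idle fractions are made up for illustration and are not measurements of anything.

```python
TRANSFER_RATE_HZ = 5.5e9   # effective transfer rate, from the quoted post
BYTES_PER_TRANSFER = 32    # bytes moved per transfer, from the quoted post


def peak_bandwidth(rate_hz, bytes_per_transfer):
    """Nameplate bandwidth: every transfer slot in the second carries data."""
    return rate_hz * bytes_per_transfer


def sustained_bandwidth(rate_hz, bytes_per_transfer, idle_fraction):
    """Bandwidth when some fraction of transfer slots sit idle (stalled)."""
    busy_slots = rate_hz * (1.0 - idle_fraction)
    return busy_slots * bytes_per_transfer


peak = peak_bandwidth(TRANSFER_RATE_HZ, BYTES_PER_TRANSFER)
print(f"peak: {peak / 1e9:.0f} GB/s")

# Hypothetical idle fractions, chosen only to show the effect.
for idle in (0.0, 0.1, 0.3):
    bw = sustained_bandwidth(TRANSFER_RATE_HZ, BYTES_PER_TRANSFER, idle)
    print(f"idle {idle:.0%}: {bw / 1e9:.1f} GB/s")
```

The peak figure only holds if not one of the 5.5 billion transfer slots in the second goes unused; any stalled slot comes straight off the total.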
The PS4's GPU has 18 CUs, and each CU has four 16-wide SIMD execution units. Each of these execution units can issue a load or store per cycle. That's 72 memory transactions per cycle. The memory system consists of eight 1 GB x32 modules, each with 16 banks. That means you have a total of 128 banks. If two memory transactions hit the same bank, one will have to wait for the other to complete. The memory controller tries to reorder memory transactions to resolve bank conflicts (as well as optimize for open pages in a given bank), but there will always be conflicts that induce stalls. That means you will never ever reach the nameplate bandwidth outside of specially engineered Mickey Mouse benchmarks.
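A toy simulation gives a feel for how often 72 transactions collide on 128 banks. This is only a sketch: it assumes every transaction picks a bank uniformly at random and it ignores reordering, open pages, and timing, none of which holds for real hardware. The function name and parameters are my own.

```python
import random

SIMD_UNITS = 18 * 4   # 72 load/store issuers per cycle, from the post
BANKS = 8 * 16        # 128 banks in total, from the post


def conflicted_fraction(requests, banks, trials=10_000, seed=0):
    """Average fraction of requests that land on a bank already claimed
    by an earlier request in the same batch (uniform random toy model)."""
    rng = random.Random(seed)
    conflicted = 0
    for _ in range(trials):
        claimed = set()
        for _ in range(requests):
            bank = rng.randrange(banks)
            if bank in claimed:
                conflicted += 1   # this request would have to wait
            else:
                claimed.add(bank)
    return conflicted / (trials * requests)


frac = conflicted_fraction(SIMD_UNITS, BANKS)
print(f"about {frac:.0%} of requests collide in this toy model")
```

Under these assumptions, somewhere around a quarter of the requests in a batch find their bank already taken. Real access patterns are not random, so treat the number as an illustration of the mechanism, not an estimate of PS4 utilization.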
I'm guessing CPU transactions aren't reordered along with GPU transactions: the GPU can handle memory latency quite well; a CPU, not so much. Raw latency of GDDR5 is on the order of 50-something ns, but the GPU normally sees ~200 ns latency for memory operations because of the heavy buffering going on. CPU memory transactions throw a spanner into the bank/open-page optimization strategy, and memory bandwidth utilization drops.
HBM2 has 8 channels per stack, two pseudo channels per channel, and 16 banks per pseudo channel, for 256 banks per stack, and there are normally multiple stacks. With more banks you have fewer conflicts and higher utilization.
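The effect of bank count can be sketched with the standard balls-in-bins expectation. This assumes uniformly random bank selection, keeps the batch at the PS4's 72 requests purely for comparison, and uses a stack count of 4 as an arbitrary example; the function name is my own.

```python
def expected_conflict_fraction(requests, banks):
    """Expected fraction of requests that share a bank with an earlier
    request, assuming each request picks a bank uniformly at random."""
    distinct_banks_hit = banks * (1 - (1 - 1 / banks) ** requests)
    return 1 - distinct_banks_hit / requests


HBM2_BANKS_PER_STACK = 8 * 2 * 16  # channels x pseudo channels x banks

configs = [
    ("GDDR5, 8 modules", 8 * 16),
    ("HBM2, 1 stack", HBM2_BANKS_PER_STACK),
    ("HBM2, 4 stacks (assumed count)", 4 * HBM2_BANKS_PER_STACK),
]

REQUESTS = 72  # held constant across configs, for comparison only

for label, banks in configs:
    frac = expected_conflict_fraction(REQUESTS, banks)
    print(f"{label:32s} {banks:5d} banks  conflict fraction {frac:.1%}")
```

In this model the conflict fraction falls steadily as the bank count grows, which is the "more banks, fewer conflicts" point in rough numerical form.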
Cheers