Can you apply the same scenario with other benchmarks? Try LuxMark or the V-Ray Benchmark.
"Looks like AMD made Zen with such a wide pipeline mostly for SMT gains, not so much for single-thread performance."
Wider pipes can provide better single-thread performance through ILP. I'm personally still waiting to see how Zen works with APUs that may take on part of the work. The Zen design choices might still lack context.
"But sure, APUs will provide more information. I'm not quite sure how the context is that different, though. Anything you expect we could be looking for beyond changes in caches & Infinity Fabric?"
Software changes for HSA and ROCm. Any indication of tightly integrated CPU and GPU: a connection via IF with higher bandwidth and less latency/distance would be a good start. Adding a stack of HBM2 would be nice, but not absolutely necessary. Any indication AMD is attempting to accelerate AVX-512 and similar instructions with a GPU.
"LuxMark C++ 1 core + 1 SMT"
Single core + HT: 693, on my Broadwell.
At this point, I blame the cross-CCX penalty, since LuxRender, like any path-tracing (PT) renderer, is rather sensitive to thread sync latencies.
"it should not pass the other CCX though,"
The data set could still spill over the local CCX.
The two L3 clusters are virtually addressed as a single uniform space, regardless of thread allocation.
Also, the non-inclusive relation to the L2 cache makes the L3 a bit slower, due to the extra write cycle when a cache line is evicted.
This test might have been done using a single SO-DIMM.
"Dell Inspiron 7570 https://browser.geekbench.com/v4/cpu/3999148
Differences between single channel and dual channel aren't very high. But on the other hand there are no results of RR APUs with dual channel."
Single channel is good enough for a fast dual-core or a low-clocked quad-core CPU. But it certainly isn't good enough for a fast iGPU.
An 11 CU Raven Ridge iGPU is roughly equivalent to the Xbox One in GPU performance. The Xbox One has a quad-channel memory controller with 68 GB/s of main memory bandwidth, plus 32 MB of ESRAM to reduce the main memory bandwidth bottleneck. It is still often memory bandwidth bound. I would guess that an 11 CU Raven Ridge with single-channel memory is crippled by lack of memory bandwidth in games. Even dual channel should still be memory bandwidth bound. You would need at least quad-channel memory (like the Xbox One) to avoid the memory bandwidth bottleneck. This is of course assuming that the 11 CU Raven Ridge has a high enough TDP to run the GPU at ~900 MHz in games.
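For reference, here is a quick back-of-the-envelope sketch of the theoretical peak numbers behind this comparison. It only multiplies bus width by transfer rate, so real-world throughput will be lower. The DDR4-2400 speed for the Raven Ridge laptop is an assumption (the thread doesn't state it), and the Xbox One line assumes DDR3-2133 on four 64-bit channels, which lines up with the 68 GB/s figure. The function name `peak_bandwidth_gbs` is made up for illustration.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_sec: float,
                       bus_bits_per_channel: int = 64) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    # Bytes moved per transfer across all channels combined.
    bytes_per_transfer = channels * bus_bits_per_channel / 8
    return bytes_per_transfer * mega_transfers_per_sec * 1e6 / 1e9


# Assumed configurations; the DDR4-2400 speed is a guess, not from the thread.
configs = {
    "DDR4-2400 single channel": peak_bandwidth_gbs(1, 2400),
    "DDR4-2400 dual channel": peak_bandwidth_gbs(2, 2400),
    "Xbox One DDR3-2133, 4 x 64-bit": peak_bandwidth_gbs(4, 2133),
}

for name, gbs in configs.items():
    print(f"{name}: {gbs:.1f} GB/s")
# DDR4-2400 single channel: 19.2 GB/s
# DDR4-2400 dual channel: 38.4 GB/s
# Xbox One DDR3-2133, 4 x 64-bit: 68.3 GB/s
```

Under these assumptions, even dual channel gives the APU a bit over half of the Xbox One's main memory bandwidth, and that has to be shared with the CPU cores, with no ESRAM to help.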