AMD RyZen CPU Architecture for 2017

I'd actually say 5+ times cheaper, since the 1700 can be overclocked to the same frequency and costs 330 bucks. Taking cheaper boards into account as well, the price of the whole system drops drastically.
Professional developers do not overclock their CPUs. Data corruption and crashes are the worst thing for productivity. But the Ryzen 1700 is perfect for indies on a low budget. It compiles much faster than the alternatives (a quad-core i7).
 
Sebbi (or whoever), I have a question:
With games, roughly how many cores could a game make use of?
(For example, if I want a piece of toast, there's no point hiring 5 chefs.)
 
Professional developers do not overclock their CPUs. Data corruption and crashes are the worst thing for productivity. But the Ryzen 1700 is perfect for indies on a low budget. It compiles much faster than the alternatives (a quad-core i7).
Really? But they could save a lot of money just by going into the BIOS and changing some settings; it's the same CPU after all, just lower-quality silicon.
 
Sebbi (or whoever), I have a question:
With games, roughly how many cores could a game make use of?
(For example, if I want a piece of toast, there's no point hiring 5 chefs.)
That depends on many factors. If you want something that can play at 1080p/60, even a 2500K or a G4560 can do the job, but some games use more cores and some use fewer, and games will use more and more cores over time, so you will see performance drop drastically with each new generation. Also, you don't just game: you turn on your PC, compress and decompress data, use several apps at the same time, etc.

With that said, for games you should buy the best graphics card your wallet can afford and then look at a CPU, but I would suggest at least a 4-core/8-thread CPU, which will age much better than a 4/4 or 2/4.
 
What I really mean is: imagine you were coding a game and everyone had CPUs with 50,000 cores.
How many could you make use of? (Yes, I know all games are different.)
Maximum occupancy on Fury X needs 64 CUs * 40 waves/CU * 64 threads/wave = 163,840 threads. Most games can fill it easily (at least for some part of the frame).

"Just" write your whole game code with ISPC and it will scale: https://ispc.github.io/. Much easier said than done :)
 
Sebbi, Fury X is a graphics card; I'm talking about CPU cores.
Could you have written your last game to make use of any number of CPU cores,
or is there a limit?
For example, I've written small programs, and even if I could have written them to use 8 cores it would have been pointless.
 
Sebbi, Fury X is a graphics card; I'm talking about CPU cores.
Could you have written your last game to make use of any number of CPU cores,
or is there a limit?
For example, I've written small programs, and even if I could have written them to use 8 cores it would have been pointless.

I think this is a good presentation on the subject, half on the CPU and half on the GPU.
 
Thanks for that. Just watched it,
and it appears they can max out 4 threads, but they can't max out 16, though they want to. They don't mention whether they achieved it or how many cores they could max out.
(Unless they did and I just didn't understand it.)
 
Sebbi, Fury X is a graphics card; I'm talking about CPU cores.
Could you have written your last game to make use of any number of CPU cores,
or is there a limit?
For example, I've written small programs, and even if I could have written them to use 8 cores it would have been pointless.
I have been spending a lot of time over the past year programming a GPGPU physics simulator. If it were running on a CPU (ISPC code instead of DirectCompute), it would scale to very high core counts.

People tend to say that things like AI don't scale to multiple cores, but that's not true at all. If you need to build an AI that handles tens of thousands of units, you can definitely write it in a way that scales up to a large number of CPU cores, or even run it on the GPU. There are no serial dependencies between multiple humans in real life either: everybody monitors the world state themselves and makes their own decisions. To avoid synchronization in a computer implementation, everybody first monitors the world state (read only, no hazards), then everybody modifies their own state (writes to different memory locations, no hazards). Expensive serial operations can often be replaced with something better suited to parallel execution: for example, pathfinding with A* can be replaced by vector fields that lead to all important locations. Combine that with simple local pathfinding and you have an O(1) solution for each unit. Obviously you need to update the vector fields when the environment changes, but that cost doesn't scale with the unit count (and it's a highly parallel operation in itself).
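
To make the two-phase idea concrete, here is a minimal C++ sketch (the Agent, FlowField and updateAgents names are just illustrative, not from any real engine). Phase 1 only reads the shared world state and writes decisions to a separate buffer; phase 2 only writes each agent's own slot, so neither phase has write hazards and both are plain parallel loops:

```cpp
#include <algorithm>
#include <cstddef>
#include <execution>
#include <vector>

struct Vec2 { float x = 0.0f, y = 0.0f; };
struct Agent { Vec2 pos; Vec2 vel; };

// Precomputed vector field leading to an important location: one steering
// direction per grid cell, so each agent's global path query is O(1).
struct FlowField {
    int width = 0, height = 0;
    std::vector<Vec2> dirs;
    Vec2 sample(Vec2 p) const {
        int cx = std::clamp(static_cast<int>(p.x), 0, width - 1);
        int cy = std::clamp(static_cast<int>(p.y), 0, height - 1);
        return dirs[static_cast<std::size_t>(cy) * width + cx];
    }
};

void updateAgents(std::vector<Agent>& agents, const FlowField& field, float dt) {
    // Phase 1: everyone reads the immutable world state and decides.
    // Decisions land in a separate buffer: read-only pass, no hazards.
    std::vector<Vec2> newVel(agents.size());
    std::transform(std::execution::par, agents.begin(), agents.end(), newVel.begin(),
        [&field](const Agent& a) {
            const Vec2 steer = field.sample(a.pos);  // O(1) global path query
            return Vec2{a.vel.x * 0.9f + steer.x * 0.1f,
                        a.vel.y * 0.9f + steer.y * 0.1f};
        });

    // Phase 2: everyone writes only their own state: disjoint writes, no locks.
    std::for_each(std::execution::par, agents.begin(), agents.end(),
        [&](Agent& a) {
            const std::size_t i = static_cast<std::size_t>(&a - agents.data());
            a.vel = newVel[i];
            a.pos.x += a.vel.x * dt;
            a.pos.y += a.vel.y * dt;
        });
}
```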

Writing parallel programs just needs a mindset change. You never do operations on a single object at a time, always on everything in the same class. An object-oriented mindset isn't well suited to programming like this. You aren't concerned with objects; you are concerned with processing steps and their memory access patterns: which steps need which inputs and outputs, and how to solve common parallel problems in the most efficient way (dependencies, reductions, compaction, prefix sums, etc.).

This isn't mainstream game development, obviously. But many of the same ideas are applied in so-called "data-oriented design".
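
As a toy illustration of that data-oriented mindset (hypothetical names, serial loops for brevity), state lives in one array per field and each processing step streams through whole arrays, ending with the kind of compaction step mentioned above:

```cpp
#include <cstddef>
#include <vector>

// Structure-of-arrays: one array per field, not one object per particle.
struct Particles {
    std::vector<float> posX, posY;
    std::vector<float> velX, velY;
    std::vector<float> life;
};

// Step 1: integrate everyone's position. Linear memory access, trivially
// vectorizable and splittable across cores by index range.
void integrate(Particles& p, float dt) {
    for (std::size_t i = 0; i < p.posX.size(); ++i) {
        p.posX[i] += p.velX[i] * dt;
        p.posY[i] += p.velY[i] * dt;
    }
}

// Step 2: age everyone and compact the survivors to the front of the arrays
// (a classic parallel primitive, usually built on a prefix sum).
void age(Particles& p, float dt) {
    std::size_t alive = 0;
    for (std::size_t i = 0; i < p.life.size(); ++i) {
        if (p.life[i] - dt <= 0.0f) continue;  // expired: drop it
        p.posX[alive] = p.posX[i]; p.posY[alive] = p.posY[i];
        p.velX[alive] = p.velX[i]; p.velY[alive] = p.velY[i];
        p.life[alive] = p.life[i] - dt;
        ++alive;
    }
    p.posX.resize(alive); p.posY.resize(alive);
    p.velX.resize(alive); p.velY.resize(alive);
    p.life.resize(alive);
}
```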
 
In terms of PCI-E lanes able to be saturated concurrently:

Broadwell-E: 40 (some SKUs are locked down to 28 lanes)
Summit Ridge: 24
Kaby Lake: 20 (as far as I can tell)
Bristol Ridge: 16 (but it is an outdated, budget-oriented product; hopefully the upcoming Zen-based APU will have more)

All of them then use chipsets to multiplex 4 of those lanes out into various storage, LAN, etc. connections, but those will all share the 4-lane bandwidth.

For Intel you need to differentiate between the native-to-CPU x16/x8/x4 configuration, as seen with GPUs (and also direct PCIe-attached SSDs on those connections, à la Intel's 750 SSD), and the HSIO lanes that combine the rest (PCIe/SATA/USB/LAN/etc.) through the contended, shared DMI 3.0 link (equivalent to x4 PCIe lanes).
M.2 is also routed through the PCH/DMI in Intel motherboard designs.


For those interested, here is how the HSIO lanes break down on the latest Kaby Lake (the first 6 are reserved, as in the past), separate from the 16 native PCIe Gen 3 lanes direct to the CPU on consumer parts.
So you have 16 PCIe Gen 3 lanes native and direct to the CPU, plus 'up to' 24 PCIe Gen 3 lanes via the HSIO on more recent consumer Intel CPUs such as Kaby Lake.

[Image: Kaby Lake HSIO lane breakdown]

This is one benefit of HEDT: you get a serious number of PCIe lanes that are no longer shared and contended.

With AMD it seems everything goes through the IOHub. Does anyone have any further details on the performance/behaviour of the IOHub and its relationship with the Data Fabric, beyond the handy document and diagram provided by itsmydamnation and hoom on the previous page?
It would be nice to know exactly the limits of the IOHub -> Data Fabric link and everything that is shared across the IOHub (of course not all devices will be in use at the same time), relative to what Intel has done with the PCH/DMI.
Cheers
 
Sebbi (or whoever), I have a question:
With games, roughly how many cores could a game make use of?
(For example, if I want a piece of toast, there's no point hiring 5 chefs.)
Are we talking about a game that simulates a big city, with lots of people on the streets, lots of cars, trucks and buses (and people inside them)? Kids playing in the fully packed park nearby (with physically simulated sand, of course). It's autumn, and there are thousands of falling tree leaves; wind simulation carries those leaves everywhere. Birds flying with flocking AI and reacting to the kids nearby. Maybe it's a rainy day and we also need robust water simulation: water slides down the roofs and drops into the sewers and the river crossing the city. Boats crossing the river, all properly simulated, etc. A big city ecosystem...

Would you believe me if I said that 4 cores is enough for all of this, now and 50 years in the future? How about eight? 16? 32? Brand new Xeon Phi with 64 cores / 256 threads and 16-wide AVX-512?
 
Would you believe me if I said that 4 cores is enough for all of this, now and 50 years in the future? How about eight? 16? 32?
I don't know, that's why I asked: not just is it technically possible, but is it doable from a cost/time point of view?
I guess what I'm asking is: now and in the near future, are devs writing games that will use as many cores as are available, or are they coded to use up to X number of cores?
E.g. Trials Fusion (that's your game, yes?): if it were run on a CPU with thousands of cores, would it use all of them or only X number?
 
Trials Fusion (that's your game, yes?): if it were run on a CPU with thousands of cores, would it use all of them or only X number?
Trials Fusion was a cross-gen game (last gen + current gen). Its technology was originally designed for CPUs with 3 cores and 6 hardware threads (Xbox 360).

I have made two tech interviews about that engine with Digital Foundry:
http://www.eurogamer.net/articles/digitalfoundry-trials-evolution-tech-interview
http://www.eurogamer.net/articles/digitalfoundry-2014-trials-fusion-tech-interview

There are some mentions of the core mappings and of what we are doing with CPUs.

Obviously, if you program a current-gen game engine from scratch, you will use a task-based system that scales automatically to any number of cores. However, a task system alone is not enough to actually saturate all those cores with meaningful work. The game needs to be really CPU-heavy to fill 16 threads at 3.6 GHz (8-core Ryzen 1800X) with work 100% of the time. Obviously you could just do more stuff, like adding hundreds of thousands of physics objects to the level, a huge amount of destruction, etc. But it might be hard to actually render all those objects. DX12 and Vulkan help a lot, but you will end up GPU-bound pretty easily if you render too many independent small objects. But if you had those 50,000 CPU cores, you could simply ray trace everything on the CPU side. No need for a GPU :)
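
For the curious, here is a bare-bones sketch of what "a task-based system that scales automatically" means at its simplest: a parallel_for that splits an index range over however many hardware threads exist. This is illustrative only; a real engine job system uses a persistent worker pool, work stealing and dependency graphs instead of spawning threads per call:

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Split [0, count) across the hardware threads. Each worker owns a disjoint
// index range, so the body only has to be safe for independent items.
void parallel_for(std::size_t count, const std::function<void(std::size_t)>& body) {
    const std::size_t workers =
        std::max<std::size_t>(1, std::thread::hardware_concurrency());
    const std::size_t chunk = (count + workers - 1) / workers;
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            const std::size_t begin = w * chunk;
            const std::size_t end = std::min(begin + chunk, count);
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& t : pool) t.join();  // core count never appears in calling code
}
```

Called as parallel_for(entities.size(), [&](std::size_t i) { updateEntity(i); });, the same code runs on 4 cores or 64; the hard part, as said above, is having enough meaningful work to feed it.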
 
I have been spending a lot of time over the past year programming a GPGPU physics simulator. If it were running on a CPU (ISPC code instead of DirectCompute), it would scale to very high core counts.

People tend to say that things like AI don't scale to multiple cores, but that's not true at all. If you need to build an AI that handles tens of thousands of units, you can definitely write it in a way that scales up to a large number of CPU cores, or even run it on the GPU. There are no serial dependencies between multiple humans in real life either: everybody monitors the world state themselves and makes their own decisions. To avoid synchronization in a computer implementation, everybody first monitors the world state (read only, no hazards), then everybody modifies their own state (writes to different memory locations, no hazards). Expensive serial operations can often be replaced with something better suited to parallel execution: for example, pathfinding with A* can be replaced by vector fields that lead to all important locations. Combine that with simple local pathfinding and you have an O(1) solution for each unit. Obviously you need to update the vector fields when the environment changes, but that cost doesn't scale with the unit count (and it's a highly parallel operation in itself).

Writing parallel programs just needs a mindset change. You never do operations on a single object at a time, always on everything in the same class. An object-oriented mindset isn't well suited to programming like this. You aren't concerned with objects; you are concerned with processing steps and their memory access patterns: which steps need which inputs and outputs, and how to solve common parallel problems in the most efficient way (dependencies, reductions, compaction, prefix sums, etc.).

This isn't mainstream game development, obviously. But many of the same ideas are applied in so-called "data-oriented design".

Is physics through the GPU now the future?

http://wccftech.com/microsoft-trademarks-direct-physics-dx12/

Btw, I always thought physics was serial because of how objects interact with each other.
 