Could Sony create fixed function pipelines for the PS4 even after release?

onQ · Oct 18, 2013

I know this sound silly but it seems like it's exactly what Sony is planning to do with the 8 ACE's.

It's a few things that I have read over the last year or so that's leading me to believe this is what they are doing I'll try to go back & find all the quotes later but for now I have a question.

If Sony was to config the 64 command queues to make the pipelines emulate real fixed function pipelines could they work just as efficient as real fixed function hardware?

3dilettante · Oct 18, 2013

A fixed-function pipeline is distinguished by it's being fixed.
It has a function, and that is the task it performs without being able to be significantly modified.

In the context of this console, how does one create a pipeline that can--through the act of changing its function--become a pipeline that cannot change its function?

patsu · Oct 18, 2013

I am surprised you didn't lose it, 3dilettante. Great patience.

onQ · Oct 18, 2013

3dilettante said:
A fixed-function pipeline is distinguished by it's being fixed.
It has a function, and that is the task it performs without being able to be significantly modified.

In the context of this console, how does one create a pipeline that can--through the act of changing its function--become a pipeline that cannot change its function?

By creating the fixed function pipelines at the driver level once you figure out just what fixed functions you want the pipelines to be used for.

patsu · Oct 18, 2013

In the usual terminology, I believe fixed function = non-programmable. If the ACE and the associated pipeline and CUs are programmable, then they cannot be called fixed functions by definition.

May want to rephrase your thoughts.

3dilettante · Oct 18, 2013

onQ said:
By creating the fixed function pipelines at the driver level once you figure out just what fixed functions you want the pipelines to be used for.

There is a physical component inherent to the defining whether something is fixed-function, and it is also this physical component from which it derives its utility.

A fixed-function pipeline or unit can strip out a lot of logic necessary for problems outside of its target workload.
The means by which it can be controlled--be it the control logic or ability to receive software commands that could retarget it--can also be fixed or mostly stripped away.

Software instructions don't create circuits that aren't there. Barring something more exotic, it's a limitation of conventional silicon integration and true for the SoCs described so far.
If they can, it's something like a programmable gate array. A one-time programmable gate array is almost what you are describing, but it would be a bad idea because it is one-time.
A FPGA can be reprogrammed, but then as a whole the device is not fixed-function.

onQ · Oct 18, 2013

patsu said:
In the usual terminology, I believe fixed function = non-programmable. If the ACE and the associated pipeline and CUs are programmable, then they cannot be called fixed functions by definition.

May want to rephrase your thoughts.

I said create fixed function pipelines.

3dilettante said:
There is a physical component inherent to the defining whether something is fixed-function, and it is also this physical component from which it derives its utility.

A fixed-function pipeline or unit can strip out a lot of logic necessary for problems outside of its target workload.
The means by which it can be controlled--be it the control logic or ability to receive software commands that could retarget it--can also be fixed or mostly stripped away.

Software instructions don't create circuits that aren't there. Barring something more exotic, it's a limitation of conventional silicon integration and true for the SoCs described so far.
If they can, it's something like a programmable gate array. A one-time programmable gate array is almost what you are describing, but it would be a bad idea because it is one-time.
A FPGA can be reprogrammed, but then as a whole the device is not fixed-function.

Kinda but what I read about was actually config-able pipelines instead of FPGA to get about the same result. I'll search for it.

Also this is one of the articles that triggered this thought.

http://www.gamasutra.com/view/feature/191007/inside_the_playstation_4_with_mark_.php?print=1

Familiar Architecture, Future-Proofed

So what does Cerny really think the console will gain from this design approach? Longevity.

Cerny is convinced that in the coming years, developers will want to use the GPU for more than pushing graphics -- and believes he has determined a flexible and powerful solution to giving that to them. "The vision is using the GPU for graphics and compute simultaneously," he said. "Our belief is that by the middle of the PlayStation 4 console lifetime, asynchronous compute is a very large and important part of games technology."

Cerny envisions "a dozen programs running simultaneously on that GPU" -- using it to "perform physics computations, to perform collision calculations, to do ray tracing for audio."

But that vision created a major challenge: "Once we have this vision of asynchronous compute in the middle of the console lifecycle, the question then becomes, 'How do we create hardware to support it?'"

One barrier to this in a traditional PC hardware environment, he said, is communication between the CPU, GPU, and RAM. The PS4 architecture is designed to address that problem.

"A typical PC GPU has two buses," said Cerny. "There’s a bus the GPU uses to access VRAM, and there is a second bus that goes over the PCI Express that the GPU uses to access system memory. But whichever bus is used, the internal caches of the GPU become a significant barrier to CPU/GPU communication -- any time the GPU wants to read information the CPU wrote, or the GPU wants to write information so that the CPU can see it, time-consuming flushes of the GPU internal caches are required."

Enabling the Vision: How Sony Modified the Hardware

The three "major modifications" Sony did to the architecture to support this vision are as follows, in Cerny's words:

"First, we added another bus to the GPU that allows it to read directly from system memory or write directly to system memory, bypassing its own L1 and L2 caches. As a result, if the data that's being passed back and forth between CPU and GPU is small, you don't have issues with synchronization between them anymore. And by small, I just mean small in next-gen terms. We can pass almost 20 gigabytes a second down that bus. That's not very small in today’s terms -- it’s larger than the PCIe on most PCs!
"Next, to support the case where you want to use the GPU L2 cache simultaneously for both graphics processing and asynchronous compute, we have added a bit in the tags of the cache lines, we call it the 'volatile' bit. You can then selectively mark all accesses by compute as 'volatile,' and when it's time for compute to read from system memory, it can invalidate, selectively, the lines it uses in the L2. When it comes time to write back the results, it can write back selectively the lines that it uses. This innovation allows compute to use the GPU L2 cache and perform the required operations without significantly impacting the graphics operations going on at the same time -- in other words, it radically reduces the overhead of running compute and graphics together on the GPU."
Thirdly, said Cerny, "The original AMD GCN architecture allowed for one source of graphics commands, and two sources of compute commands. For PS4, we’ve worked with AMD to increase the limit to 64 sources of compute commands -- the idea is if you have some asynchronous compute you want to perform, you put commands in one of these 64 queues, and then there are multiple levels of arbitration in the hardware to determine what runs, how it runs, and when it runs, alongside the graphics that's in the system."
"The reason so many sources of compute work are needed is that it isn’t just game systems that will be using compute -- middleware will have a need for compute as well. And the middleware requests for work on the GPU will need to be properly blended with game requests, and then finally properly prioritized relative to the graphics on a moment-by-moment basis."

This concept grew out of the software Sony created, called SPURS, to help programmers juggle tasks on the CELL's SPUs -- but on the PS4, it's being accomplished in hardware.

The team, to put it mildly, had to think ahead. "The time frame when we were designing these features was 2009, 2010. And the timeframe in which people will use these features fully is 2015? 2017?" said Cerny.

"Our overall approach was to put in a very large number of controls about how to mix compute and graphics, and let the development community figure out which ones they want to use when they get around to the point where they're doing a lot of asynchronous compute."

Cerny expects developers to run middleware -- such as physics, for example -- on the GPU. Using the system he describes above, you can run at peak efficiency, he said.

"If you look at the portion of the GPU available to compute throughout the frame, it varies dramatically from instant to instant. For example, something like opaque shadow map rendering doesn't even use a pixel shader, it’s entirely done by vertex shaders and the rasterization hardware -- so graphics aren't using most of the 1.8 teraflops of ALU available in the CUs. Times like that during the game frame are an opportunity to say, 'Okay, all that compute you wanted to do, turn it up to 11 now.'"

Sounds great -- but how do you handle doing that? "There are some very simple controls where on the graphics side, from the graphics command buffer, you can crank up or down the compute," Cerny said. "The question becomes, looking at each phase of rendering and the load it places on the various GPU units, what amount and style of compute can be run efficiently during that phase?"

Brad Grenz · Oct 18, 2013

Your question doesn't make sense. Sony can create specialized libraries that leverage the hardware in certain ways for game developers to use, just like they did throughout the life of the PS3, but you wouldn't call them "fixed function pipelines" because they'd be software, and developers would be free to modify them for their needs.

3dilettante · Oct 18, 2013

onQ said:
Kinda but what I read about was actually config-able pipelines instead of FPGA to get about the same result. I'll search for it.

The more configurable or alterable something is, the less it can be described as fixed.
If there are specific areas of silicon that you are describing as being able to be configured or programmed to perform different and changing tasks past the point of manufacturing, fixed-function isn't the term for them.

Pete · Oct 18, 2013

onQ said:
If Sony was to config the 64 command queues to make the pipelines emulate real fixed function pipelines could they work just as efficient as real fixed function hardware?

Efficient in what sense? Transistor count / area / power / manufacturing cost? Or do you mean in terms of giving developers ideal or optimum use cases for the hardware? Isn't that what Cerny & co. did for the PS3 (mostly Cell)? And isn't that the point of flexible hardware, to accommodate better algorithms down the line rather than fixing you to one? Isn't your very question (Sony doesn't know what to bake into the GPU at launch) an argument against fixed functionality? Who's to say Sony doesn't discover even better "fixed functionality" down the road, in which case they've wasted die space if PS4's GPU had a bunch of fixed functionality to begin with?

Do you mean simplifying development by offering optimal paths to achieving certain effects? Don't devs get there with experience / libraries / APIs / middleware?

Or are you talking about something like a PhysX or 3D sound "pipeline?" (That'd take me back to xp / libs / APIs / mw, but what do I know.)

What advantages do you see from fixed function pipes? Efficiency is a nebulous term until you say what it is you want the GPU to be efficient at.

onQ · Oct 18, 2013

I found the PDF that I read a few months ago http://www.retarget.com/resources/pdfs/cox-multicoreexpo10.pdf

Brad Grenz said:
Your question doesn't make sense. Sony can create specialized libraries that leverage the hardware in certain ways for game developers to use, just like they did throughout the life of the PS3, but you wouldn't call them "fixed function pipelines" because they'd be software, and developers would be free to modify them for their needs.

I'm not saying that the hardware would be fixed function I'm saying create what would look like a fixed function pipeline so that the software would see it as if it was running on fixed function hardware.

3dilettante said:
The more configurable or alterable something is, the less it can be described as fixed.
If there are specific areas of silicon that you are describing as being able to be configured or programmed to perform different and changing tasks past the point of manufacturing, fixed-function isn't the term for them.

OK this is what I'm saying create fixed function pipelines so it would be like this 'Pipeline 1 will run the physics code & Pipeline 2 will run the lighting code' because the pipelines are designed to look as if they are actually hardware created for Physics & Lighting.

The fixed function pipelines haven't been created yet but in a few years the devs & Sony will chose how the data paths should be laid out to create the fixed function pipelines.

3dilettante · Oct 18, 2013

onQ said:
OK this is what I'm saying create fixed function pipelines so it would be like this Pipeline 1 will run the physics code & Pipeline 2 will run the lighting code because the pipelines are designed to look as if they are actually hardware created for Physics & Lighting.

The fixed function pipelines haven't been created yet but in a few years the devs & Sony will chose how the data paths should be laid out to create the fixed function pipelines.

It sounds like you're asking if they can create a software service that performs certain functions, and then put it behind an interface that presents a certain set of inputs and outputs to the programmer trying to use the service.

The answer to that would be yes, but that wouldn't be unique. Abstracting away nitty-gritty details would come with middleware or modularization done by the engine programmers.
On the other side, fixed-function hardware can have more software or hardware added to massage away some of the rough edges so that software can more readily use it.

The interface wouldn't normally create the impression that there's a specific fixed-function hardware pipeline behind it unless it's actually there for the sake of emulation.
Having ultra-specific timings, working around bugs, and juggling random control registers doesn't produce any benefit unless you had games like those on ancient consoles that specifically took advantage of things like cycle timings and scan line progression for different effects.

Grall · Oct 18, 2013

onQ said:
I'm not saying that the hardware would be fixed function I'm saying create what would look like a fixed function pipeline so that the software would see it as if it was running on fixed function hardware.

What you're saying doesn't make sense, it is not possible, and even if it was, there'd be no benefit to it as even if you could dedicate certain compute queues to certain tasks it would not speed anything up in any way.

'Pipeline 1 will run the physics code & Pipeline 2 will run the lighting code' because the pipelines are designed to look as if they are actually hardware created for Physics & Lighting.

Well, it isn't. All the pipes are functionally identical, from what we know (and from what logic would dictate.)

The fixed function pipelines haven't been created yet but in a few years the devs & Sony will chose how the data paths should be laid out to create the fixed function pipelines.

Reality just changes to fit whatever ideas you come up with, is that what you're essentially saying here...?

NO. Doesn't work like that!

Laa-Yosh · Oct 18, 2013

What is this I don't even

BoardBonobo · Oct 18, 2013

If the entire SoC is based on an fPGA then it could potentially be reconfigured to be fixed function or whatever.

As it is, it's just not going to happen.

onQ · Oct 18, 2013

It didn't make sense to any of you when I posted about the PS4's GPU being able to run Graphics & Compute together neither.

Grall · Oct 18, 2013

What, you mean the ability which sony explicitly said was there, and which we knew it had all along?

I don't get it. You didn't reveal anything new then, and you're making something up out of nothing now. What's the deal here, you seeking attention and approval, or what?

patsu · Oct 18, 2013

onQ said:
It didn't make sense to any of you when I posted about the PS4's GPU being able to run Graphics & Compute together neither.

Correction. Some of the folks here do know way before you posted your question. But they were not sure if you knew exactly what you typed.

onQ · Oct 18, 2013

Grall said:
What, you mean the ability which sony explicitly said was there, and which we knew it had all along?

I don't get it. You didn't reveal anything new then, and you're making something up out of nothing now. What's the deal here, you seeking attention and approval, or what?

I'm not seeking attention or approval.

patsu said:
Correction. Some of the folks here do know way before you posted your question. But they were not sure if you knew exactly what you typed.

Some people was flat out saying that it couldn't be done & trying to make it seem as if I was saying that the GPU had 2X as much power or something.

Anyway I'll explain again.

Instead of creating 3 or 4 fixed function hardware chips you will use one general purpose chip but creating 3 or 4 fixed function pipelines.

Shifty Geezer · Oct 18, 2013

onQ said:
Some people was flat out saying that it couldn't be done & trying to make it seem as if I was saying that the GPU had 2X as much power or something.

No-one ever said concurrent GPU + compute was impossible. Everyone said you could not have compute in addition to 1.84 TF of GPU work which is what you said. Let me quote...

You said:
so you can use the full 1.84TFLOP for graphics & still run physics & other compute tasks on the GPGPU as long as the tasks are not blocking one another.

If you never meant 1.84 TF of graphics and extra compute on top of that, you failed to express yourself clearly. Similarly, here you talk of fixed-function pipelines in flexible hardware, which is an odd concept and one everyone's getting muddled with fixed-function hardware. You do have a habit of asking questions or presenting ideas in a way that conflicts with the general board's interpretations and leads to rather flustered discussions.

Instead of creating 3 or 4 fixed function hardware chips you will use one general purpose chip but creating 3 or 4 fixed function pipelines.

Why would you? You have a thread on this board that you started where you were asking if PS4 would result in better software renderers, precisely because of flexibility. There's no reason in walling off parts of the hardware to enforce use except in the case of doing so at the hardware level where the restricted flexibility is balanced out with greater relative performance. 'Fixed-function pipelines' can be implemented at the middleware level, providing developers with hooks to do the basic chores of rendering, but the hardware remains open to be exploited however the developers choose. It's the only sensible way to do it. One could, in theory, provide a forced software interface for various functions that can then be implemented in dedicated hardware in future machines, but that'd be a crazy step backwards, like implementing GC's fixed functions in shaders and then replacing your shader based GPU with a fixed-function one.

Could Sony create fixed function pipelines for the PS4 even after release?

onQ

3dilettante

patsu

onQ

patsu

3dilettante

onQ

Brad Grenz

Philosopher & Poet

3dilettante

Pete

Moderate Nuisance

onQ

3dilettante

Grall

Invisible Member

Laa-Yosh

I can has custom title?

BoardBonobo

My hat is white(ish)!

onQ

Grall

Invisible Member

patsu

onQ

Shifty Geezer

uber-Troll!

Similar threads